RFC Dependencies #888

Closed
arlimus opened this Issue Aug 8, 2016 · 18 comments

5 participants
@arlimus
Contributor

arlimus commented Aug 8, 2016

Follow-up to #798 on dependency resolution.

Design

(1) dependencies are written in inspec.yml:

depends:
  - name: hello
  [ ... ]

(2) dependencies are based on semantic versioning: http://semver.org/ . Pre-release and build identifiers (items 9 and 10 of the semver spec) are not supported (see comment)
(3) dependency version constraints are specified with:

- version: [op] [version] [...]

with op being <, >, <=, >=, =, != or ~>. Multiple constraints may be specified. Whitespace around operators is optional. Whitespace inside version numbers is not supported.
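
To make the operator semantics concrete, here is a minimal sketch that checks a version against a constraint string. It borrows RubyGems' Gem::Requirement, which happens to support the same seven operators; whether InSpec ends up using that class is not decided here. The helper name satisfies? and the comma as separator between multiple constraints are assumptions for illustration only.

```ruby
require 'rubygems' # provides Gem::Version and Gem::Requirement

# Hypothetical helper: checks a version against a constraint string such as
# ">= 1.2, < 2.0". Splitting on commas is an assumption; the RFC only says
# that multiple constraints may be specified.
def satisfies?(constraint, version)
  parts = constraint.split(',').map(&:strip)
  Gem::Requirement.new(*parts).satisfied_by?(Gem::Version.new(version))
end

puts satisfies?('>= 1.2, < 2.0', '1.5.0') # true
puts satisfies?('~>1.2', '2.0.0')         # false: ~> 1.2 means >= 1.2 and < 2.0
```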

(4) each dependency has one source, with the default pointing to supermarket. sources can be specified via:

- path: ../relative_path/to/profile
- path: /absolute/path
- supermarket: owner/name
- compliance: owner/name
- github: owner/name
- url: http://sth...

(5) upon resolution, a lockfile is created in inspec.lock. If this lockfile exists, dependencies are not resolved but taken from this file instead.
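
The lockfile rule can be sketched roughly as follows. The lockfile format is not defined in this RFC, so the YAML list of name/version entries and the helper name load_or_resolve are placeholders, not a proposal for the real format.

```ruby
require 'tmpdir'
require 'yaml'

# Hypothetical helper: reuse the lockfile when present; otherwise run the
# resolver block and record its result.
def load_or_resolve(lockfile_path)
  return YAML.load_file(lockfile_path) if File.exist?(lockfile_path)

  resolved = yield
  File.write(lockfile_path, resolved.to_yaml)
  resolved
end

Dir.mktmpdir do |dir|
  lock = File.join(dir, 'inspec.lock')
  first = load_or_resolve(lock) { [{ 'name' => 'hello', 'version' => '1.0.0' }] }
  # The second call must not invoke the resolver, because the lockfile exists.
  second = load_or_resolve(lock) { raise 'resolver should not run' }
  puts first == second
end
```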

(6) dependencies are vendored to a local cache in ${inspec-home}/cache. Users may specify a custom vendor location.

(7) dependencies provide their library functions to the profile (without additional specification). For example: resources that are defined in a dependency's libraries are available to the profile that requires it.

(8) dependencies may provide controls via include_controls (all) or require_controls (selective). If none of these are used, dependencies will not execute or report on any controls.

(9) dependencies are scoped with their name. scoping applies to all aspects, including resources, controls, and attributes. resources are added to the global space if there is no conflict.

... and discuss 😁

scoping

  • All profiles have a simple name. Names are short, to the point (and should not contain spaces). Example: ssh, cis-centos6-lvl1

  • When a profile is pulled in via dependencies, its name may be overwritten. This allows the inclusion of 2 profiles with the same name. Example: my ssh becomes my-ssh while upstream ssh becomes upstream-ssh. It's done via the name field:

    depends:
    - name: my-ssh
      url: go.to/my/ssh
    
  • All profile resources, controls, attributes, and other future artifacts are scoped under this name. Controls and attributes must always follow the naming convention; resources may be added to a shared global scope, but may then overwrite an existing resource with the same name.

sources

The following types are supported:

path

Profile which is located in a folder on disk. This should only be used for development and debugging.

Does not support version constraints. The folder must exist. If it doesn't, throw an error.

depends:
- name: my-profile
  path: /absolute/path
- name: another
  path: ../relative/path

url

Fixed HTTP/HTTPS-based URL which contains a profile. To retrieve the profile, use an HTTP GET operation. The profile is provided in either zip, tar, or tar.gz format. If the download fails or doesn't provide the expected format, throw an error.

depends:
- name: my-profile
  url: https://my.domain/path/to/profile.tgz
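
A rough sketch of the format check for url sources, assuming the format is inferred from the file extension in the URL (the RFC does not say how it is detected; inspecting the downloaded bytes would be an alternative). The names ARCHIVE_FORMATS and archive_format are made up for this example, and .tgz is treated as tar.gz because the example above uses it.

```ruby
require 'uri'

# Assumed mapping from file extension to archive format.
ARCHIVE_FORMATS = {
  '.zip' => :zip,
  '.tar' => :tar,
  '.tar.gz' => :tar_gz,
  '.tgz' => :tar_gz
}.freeze

# Returns the archive format for a profile URL, or raises if the extension
# is not one of the supported formats.
def archive_format(url)
  path = URI.parse(url).path.to_s.downcase
  suffix, kind = ARCHIVE_FORMATS.find { |ext, _| path.end_with?(ext) }
  raise ArgumentError, "unsupported profile archive: #{url}" unless suffix

  kind
end

puts archive_format('https://my.domain/path/to/profile.tgz') # tar_gz
```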

supermarket, git, and github

These sources are translated into a URL upon resolution. All support version indexing. For versions to be indexed, they must be provided via semantic versioning as git tags.

Git is the basic mechanism, which supports an optional branch, tag, commit, or version specification. Version specs are resolved via tags matching semantic versioning patterns. If a version constraint cannot be resolved, an error is thrown.

depends:
- name: git-profile
  git: http://url/to/repo
  [branch:  desired_branch]
  [tag:     desired_version]
  [commit:  pinned_commit]
  [version: semver_via_tags]
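
Resolving a version spec against git tags could look roughly like this. It is a sketch under assumptions: tags count as versions only if they are plain MAJOR.MINOR.PATCH with an optional leading v (the v prefix is my assumption), pre-release tags are skipped per (2), and RubyGems' Gem::Requirement stands in for the constraint matcher. resolve_tag is a hypothetical name.

```ruby
require 'rubygems' # provides Gem::Version and Gem::Requirement

# Tags that count as versions: MAJOR.MINOR.PATCH, optionally prefixed with "v".
SEMVER_TAG = /\Av?(\d+\.\d+\.\d+)\z/

# Returns the tag with the highest version that satisfies the constraint,
# or raises if no tag matches.
def resolve_tag(tags, constraint)
  requirement = Gem::Requirement.new(*constraint.split(',').map(&:strip))
  candidates = tags.filter_map do |tag|
    match = SEMVER_TAG.match(tag)
    [Gem::Version.new(match[1]), tag] if match
  end
  matching = candidates.select { |version, _| requirement.satisfied_by?(version) }
  _version, tag = matching.max_by(&:first)
  raise ArgumentError, "no tag satisfies #{constraint}" unless tag

  tag
end

tags = ['v1.0.0', '1.2.3', 'v2.0.0', 'nightly', '1.3.0-beta']
puts resolve_tag(tags, '~> 1.0') # 1.2.3
```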

Github and supermarket build on this source and support all git options:

depends:
- name: gh-profile
  github: username/project
- name: super-profile
  supermarket: username/profile

MVP

  • Specify a dependency in inspec.yml
  • Dependency is path-based and does not require version resolution
  • No need for vendoring yet
  • Ability to require controls from said profile via include_controls and require_controls
  • Create a Lockfile

Slices

MVP

  • MVP path #891
  • Transitive dependencies (A pulls B pulls C) #915
  • Create and load a Lockfile for dependencies after resolution #950
  • Introduce scoping to the ProfileContext which has a view of all of its dependencies #958
    • create an up-front profile context for every profile that is pulled in (e.g. as dependencies)
    • attach profile contexts to the dependency tree
    • runner loads content in the context of profile inside the dep tree
  • Vendor URL dependencies so that they don't conflict on disk and load them (ignore conflicts)
  • Vendor Github and Supermarket dependencies #959
  • Design UX for scoping of attributes and resources #1057
  • All resources are scoped #1058

@arlimus arlimus modified the milestones: 1.0.0, 0.30.0 Aug 8, 2016

Contributor

alexpop commented Aug 8, 2016

🍒 | 🍎 | 🍌

🍉 | 🍒 | 🍓

🍏 | 🍉 | 🍒

arlimus added a commit that referenced this issue Aug 8, 2016

introduce dependency resolution
This commit is the foundation of the dependency resolution as described in #888 .

It currently only works with local dependencies, as seen in the example inheritance profile.

Tests and full resolution are coming next on the path to an MVP implementation.


vinyar commented Aug 9, 2016

As you guys are working on the feature, I would like to know how the default supermarket URL will be resolved. I've seen bugs where the default URL is hardcoded to the public supermarket (supermarket.chef.io), which blows up inside a firewalled environment where the customer runs their own supermarket.

Please consider how and where this URL will be fetched from, and how it can be overridden to point to an internal source by default.

vinyar commented Aug 9, 2016

Is there a page with words that describes (9) ?

Member

chris-rock commented Aug 9, 2016

Adding to @vinyar's point: the same challenge will arise with our compliance:// prefix. It therefore applies to both:

- supermarket: owner/name
- compliance: owner/name

One solution could be to use the compliance and supermarket url from inspec compliance login and a new inspec supermarket login.


@arlimus arlimus modified the milestones: 1.0.0, 0.30.0 Aug 9, 2016

Contributor

arlimus commented Aug 9, 2016

@vinyar @chris-rock fun story, this might even happen to github prefix ;)

To address it, we have 4 options afaics

  1. Let the user inspec [sth] login [url] to the service. This is great, if you have a login. But what if you don't? How about an internal supermarket that doesn't require a login?
  2. Specify the provider's url via the target: supermarket://my.server/owner/profile. This also has a number of problems, eg there is no way to specify http vs https
  3. Specify the provider's url via an additional parameter: inspec exec profile --supermarket-url https://my.server. This may get quite wordy.
  4. This ^^ could also be split in 2 stages: inspec [supermarket|compliance|github] login/url https://my.server and then inspec exec profile (i.e. not just login but also just specify via url). The login information is either kept in a local(ish) file or in env (recommended imho)
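
For options 3 and 4, the lookup order might be sketched like this: an explicit option wins, then an environment variable, then the public default. The variable name INSPEC_SUPERMARKET_URL and the helper supermarket_url are invented for illustration; nothing like them exists yet.

```ruby
# Public default, as mentioned earlier in this thread.
DEFAULT_SUPERMARKET_URL = 'https://supermarket.chef.io'

# Hypothetical lookup order: explicit option, then environment, then default.
def supermarket_url(option: nil, env: ENV)
  option || env['INSPEC_SUPERMARKET_URL'] || DEFAULT_SUPERMARKET_URL
end

puts supermarket_url(env: {})                                             # https://supermarket.chef.io
puts supermarket_url(env: { 'INSPEC_SUPERMARKET_URL' => 'https://my.server' }) # https://my.server
```

Keeping the URL out of the target string avoids the http vs https ambiguity of option 2.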

Contributor

arlimus commented Aug 9, 2016

@vinyar added a lot of text around scoping

Contributor

alexpop commented Aug 10, 2016

Proposing a git source in addition to, or instead of, github

Member

chris-rock commented Aug 10, 2016

@alexpop great idea to support git directly

Contributor

alexpop commented Aug 10, 2016

(A) Is the url source going to point to an archive?


(B) Is ${inspec-home} the installation path for inspec? If so, what happens when inspec is upgraded?


(C) I like the idea of defining a source per dependency. But this creates the need to be able to override these locations for environments that don't have direct access to them or have a policy to review and store all dependencies before using. Let's take this for example, where gitlab is a local git repository:

hippa-profile (gitlab)
│
├─────> oracle-db (gitlab) ─────────> oracle (github)
│                                     └─────────────────────> inspec-sugar (bitbucket)
│          
└─────> docker-engine (private gitlab) ─────> cis-docker-benchmark (public supermarket)

(C1) Passing overriding options to inspec can get wordy very fast and lacks granularity.

(C2) Using a file, similar to a Berksfile/Policyfile, is not wordy for inspec and is very granular: you can define a default source and an individual source for each dependency.

(C3) Like (C2), but without another file: use inspec.yml to override the sources of dependencies.


(D) Should probably design the sources to receive parameters. So, instead of:

depends:
  - name: profile
    github: owner/name

have something like this:

depends:
  - name: profile1
    source:
      github: docker/security
      branch: stable
      rel: 'inspec-profiles/profile1'
  - name: profile2
    source:
      s3: profiles-bucket/inspec/profile2.tgz
      aws_access_key: x
      aws_secret_access_key: y

same way berkshelf is doing it.
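
If both shapes had to be accepted for a while, a loader could normalize them into one structure. This is only a sketch: SOURCE_KEYS lists the source types named in this thread so far (s3 from my example is left out), and normalize_dependency and its output keys are invented names.

```ruby
# Source types named in the RFC so far (assumed list).
SOURCE_KEYS = %w[path url git github supermarket compliance].freeze

# Accepts either the flat form (github: owner/name next to name:) or the
# nested form (source: { github: ..., branch: ... }) and returns one shape.
def normalize_dependency(entry)
  source = entry['source'] || entry.reject { |key, _| key == 'name' }
  type = SOURCE_KEYS.find { |key| source.key?(key) }
  raise ArgumentError, "no known source for #{entry['name']}" unless type

  { 'name' => entry['name'], 'type' => type, 'target' => source[type],
    'options' => source.reject { |key, _| key == type } }
end

flat = { 'name' => 'profile', 'github' => 'owner/name' }
nested = { 'name' => 'profile1',
           'source' => { 'github' => 'docker/security', 'branch' => 'stable' } }
p normalize_dependency(flat)
p normalize_dependency(nested)
```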

Contributor

alexpop commented Aug 10, 2016

Dom, for (8): isn't it include_controls (all) and require_controls (selective)?
Link this as example for the two: https://github.com/chef/inspec/blob/master/docs/profiles.rst#profile-inheritance

vinyar commented Aug 10, 2016

@arlimus there is another problem outside the realm of compliance: supermarket presently does not support publishing profiles via the command line or even curl.
(chef/supermarket#1166)

Contributor

stevendanna commented Aug 10, 2016

Re (2) dependencies are based on semantic versioning: http://semver.org/. Do we intend to support alpha/pre-release identifiers or build identifiers?

My vote would be that we do not support these parts of semver.

chris-rock added a commit that referenced this issue Aug 10, 2016

introduce dependency resolution
This commit is the foundation of the dependency resolution as described in #888 .

It currently only works with local dependencies, as seen in the example inheritance profile.

Tests and full resolution are coming next on the path to an MVP implementation.

Contributor

arlimus commented Aug 11, 2016

@alexpop thank you, fixed 👍 :)
somehow github edit reverted my last set of edits to the first post, will have to do it again...

Contributor

arlimus commented Aug 11, 2016

@stevendanna We don't have a use-case for these yet, so happy to keep the feature-set small. 👍

Contributor

stevendanna commented Aug 11, 2016

I'm having trouble understanding how some of this dependency RFC will work. It mixes some ideas from dependency systems that do minimal version management and others that do full dependency resolution, usually using a central index of the universe of available packages to speed up the process.

Here is a more basic question that I think we should spell out more explicitly:

Consider: A depends on B and C with no version constraints. B and C depend on D with no version constraints, but from different sources. (You can create other examples with competing versions.) It seems in this case we have a few options:

  • Raise an error and instruct the user they need to add D to their top level inspec.yml and specify a source that will take precedence.
  • Develop features inside of inspec that would allow both versions of D to be used simultaneously, with B and C each having access to the version of D that they required. Perhaps by building the "source" into the internal identifier that one requires.
  • Not consider "source" information from our transitive dependencies and assume transitive deps will be available in some default source.

Or am I missing an option? Or more generally: for a given dependency "X", are we going to try to find a single version of X to load into the app, or will we allow multiple versions?
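
Whichever option we pick, the first one at least needs a way to detect the situation. A minimal sketch, assuming each profile's manifest has already been read into a hash of name to dependency entries (the manifests structure and source_conflicts are hypothetical):

```ruby
# manifests maps a profile name to its list of dependency entries, each with
# a 'name' and a 'source' string. Returns every dependency that is requested
# from more than one source anywhere below root.
def source_conflicts(manifests, root)
  seen = Hash.new { |hash, key| hash[key] = [] }
  queue = [root]
  visited = []
  until queue.empty?
    current = queue.shift
    next if visited.include?(current)

    visited << current
    manifests.fetch(current, []).each do |dep|
      seen[dep['name']] |= [dep['source']]
      queue << dep['name']
    end
  end
  seen.select { |_, sources| sources.size > 1 }
end

manifests = {
  'A' => [{ 'name' => 'B', 'source' => 'github: org/b' },
          { 'name' => 'C', 'source' => 'github: org/c' }],
  'B' => [{ 'name' => 'D', 'source' => 'github: one/d' }],
  'C' => [{ 'name' => 'D', 'source' => 'url: https://example.com/d.tgz' }]
}
p source_conflicts(manifests, 'A')
```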

Contributor

stevendanna commented Aug 11, 2016

I'd also like to see us spec out some of the UX of how you will interact with this feature. It seems to me the core operations of a tool like this are:

  • Resolve & Fetch all dependencies
  • Update all dependencies
  • Update a single dependency
  • Show the "activated" list of dependencies
  • Show the "activated" list of dependencies in tree form to better understand your dependencies

Contributor

stevendanna commented Aug 11, 2016

@arlimus After discussing this with @chris-rock a bit, I think an important point to discuss and make a bit more explicit in this RFC is how dependencies will be scoped and the impact that has on the features we need in the dependency resolution.

From our discussion, I believe one of the goals of this work is to allow two different versions of the same dependency to be loaded at the same time. For example, you might have a dependency tree that looks like:

A
|
+-->B-->D@1.0.0
|
+-->C-->D@2.0.0

This proposes that we change inspec such that code inside B can reference code inside D and be sure it is getting the 1.0.0 version of the code, while code inside C can reference code inside D and be sure it is getting the 2.0.0 version of the code. Is this a correct characterization of part of the feature being proposed?

If so, it leads to a follow-on thought:

Is there utility in including the version operators at all? In the dependency management systems I've used (I'm by no means an expert), one of the major reasons to include the version operators is that the runtime can only load a single version of a given dependency. Packages use version constraints to ensure that the single version that is loaded is within the range of versions they support. The package manager solves the constraint problem after getting version constraint information from every source to ensure it has the full universe of dependencies.

However, in the world described above, it seems that the version constraints would only be used to limit which version is fetched in the absence of a lockfile. Moreover, most of the proposed sources don't have a standard API for exposing information about the available versions.
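
To check my understanding of the scoping idea, here is a toy model in which each profile only sees the dependencies it declared itself, so B and C can each hold their own D. ScopedProfile is an invented class for this sketch and is not meant to describe inspec's internals.

```ruby
# Toy model: a profile resolves dependency names only through its own
# declared dependencies, never through a global table.
class ScopedProfile
  attr_reader :name, :version

  def initialize(name, version)
    @name = name
    @version = version
    @dependencies = {}
  end

  def add_dependency(profile)
    @dependencies[profile.name] = profile
    self
  end

  def dependency(name)
    @dependencies.fetch(name) { raise KeyError, "#{@name} does not depend on #{name}" }
  end
end

b = ScopedProfile.new('B', '1.0.0').add_dependency(ScopedProfile.new('D', '1.0.0'))
c = ScopedProfile.new('C', '1.0.0').add_dependency(ScopedProfile.new('D', '2.0.0'))
a = ScopedProfile.new('A', '1.0.0').add_dependency(b).add_dependency(c)

puts a.dependency('B').dependency('D').version # 1.0.0
puts a.dependency('C').dependency('D').version # 2.0.0
```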

Contributor

arlimus commented Aug 11, 2016

Great discussion with @stevendanna and @chris-rock :

Which sources do we have that provide version information in an index?

  • url, path, compliance: no index information
  • github, supermarket: index information without dependency/version info
  • for all indexes, we find the optimal item that matches the user's specification

Ideas:

  • We still build the full dependency tree for the final pinned state; We fetch the version that matches the specification for the sources that support it.

I'll update the spec:

  • When a Profile X executes, it will have access to its dependencies D' and their dependencies D". Resources have flat exposure, e.g. if D" creates d_conf, X will have access to it by calling d_conf
  • When a Profile X executes, its dependencies' attributes are accessible via the mapped names of dependencies D' and D", e.g. for profile postgres you get access to its attribute user via postgres/user
  • If a Profile X pulls in dependencies with the same namespace but different versions D1 and D2, X gets access to the latest version of D in terms of library contents and attributes.
  • Users may address attributes, libraries, and controls within X for two identically namespaced child profiles D1 and D2 by giving a full access path X/Y/D1/control...
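
A rough sketch of the attribute rules above, assuming each resolved dependency is a hash with name, version, and attributes (that structure, build_scopes, and attribute are made up for this example). Where two versions share a name, the latest one wins, and attributes are then addressed as name/attribute.

```ruby
require 'rubygems' # provides Gem::Version for comparing versions

# Builds a map of dependency name => attributes, keeping the attributes of
# the highest version when the same name appears more than once.
def build_scopes(dependencies)
  dependencies.group_by { |dep| dep['name'] }.transform_values do |versions|
    versions.max_by { |dep| Gem::Version.new(dep['version']) }['attributes']
  end
end

# Looks up an attribute by its scoped path, e.g. "postgres/user".
def attribute(scopes, path)
  scope, name = path.split('/', 2)
  scopes.fetch(scope).fetch(name)
end

deps = [
  { 'name' => 'postgres', 'version' => '1.0.0', 'attributes' => { 'user' => 'pg' } },
  { 'name' => 'postgres', 'version' => '2.0.0', 'attributes' => { 'user' => 'postgres' } }
]
puts attribute(build_scopes(deps), 'postgres/user') # postgres
```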