Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lockfiles update #236

Closed
chrmarti opened this issue May 9, 2023 · 9 comments
Closed

Lockfiles update #236

chrmarti opened this issue May 9, 2023 · 9 comments
Assignees
Labels
proposal Still under discussion, collecting feedback

Comments

@chrmarti
Copy link
Contributor

chrmarti commented May 9, 2023

Continuing from #101.

Default Behavior

Stated goals in the proposal: Reproduceablity, security, cachability.

The security goal is to protect the community from malicious code being injected after the lockfile has been written. For this the lockfile needs to be enabled by default.

The lockfile will prevent new bugs and new malicious code from propagating, but it will also prevent bug fixes and security updates from propagating. We could expand "disallowed features" to also support "deprecated/disapproved feature versions".

Supporting dependsOn

Notes

NPM's package-lock.json and Yarn's yarn.lock are similar in their approaches. Both support having multiple versions of the same package (at runtime each version can run separately). Notable difference:

  • package-lock.json v2 lays out the node_modules folder structure in the lockfile including checksums.
  • package-lock.json v1 refers to dependencies with "requires" which resolves to a single version. If a different version is required an additional "dependencies" object is added alongside the package's "requires" property.
  • yarn.lock uses a list of references as the top-level keys, each package lists dependencies refering to these references. If a different version is required, an additional top-level entry is added.

Conclusion

The current proposal can be used for features with dependencies. One limitation is that if two feature references are resolved with the same feature version, there will be two entries in the lockfile referring to the same version. E.g.:

{
    "features": {
        "ghcr.io/devcontainers/features/node:1": {
            "resolved": "oci:ghcr.io/devcontainers/features/node:1.2.4",
            "version": "1.2.4",
            "integrity": "sha256:567d704b3f4d3eca3acee51ded7c460a8395436d135d53d1175fb565daff42b8"
        },
        "ghcr.io/devcontainers/features/node:1.2": {
            "resolved": "oci:ghcr.io/devcontainers/features/node:1.2.4",
            "version": "1.2.4",
            "integrity": "sha256:567d704b3f4d3eca3acee51ded7c460a8395436d135d53d1175fb565daff42b8"
        }
    }
}

The proposal could be updated to avoid the redundancy, but that is not strictly necessary because the lockfile is managed by the tooling.

@chrmarti chrmarti added the proposal Still under discussion, collecting feedback label May 9, 2023
@chrmarti chrmarti self-assigned this May 9, 2023
@chrmarti
Copy link
Contributor Author

It might make sense to copy the "dependsOn" references (without options) from the feature manifest to the lockfile. package-lock.json and yarn.lock both do this and it makes them easier to understand (if one has to).

package-lock.json v2:

        "node_modules/@azure/core-http/node_modules/tough-cookie": {
            "version": "4.0.0",
            "resolved": "https://registry.npmjs.org/tough-cookie/-/tough-cookie-4.0.0.tgz",
            "integrity": "sha512-tHdtEpQCMrc1YLrMaqXXcj6AxhYi/xgit6mZu1+EDWUn+qhUf8wMQoFIy9NXuq23zAwtcB0t/MjACGR18pcRbg==",
            "dependencies": {
                "psl": "^1.1.33",
                "punycode": "^2.1.1",
                "universalify": "^0.1.2"
            },
            "engines": {
                "node": ">=6"
            }
        },

yarn.lock:

anymatch@~3.1.2:
  version "3.1.3"
  resolved "https://registry.yarnpkg.com/anymatch/-/anymatch-3.1.3.tgz#790c58b19ba1720a84205b57c618d5ad8524973e"
  integrity sha512-KMReFUr0B4t+D+OBkjR3KYqvocp2XaSzO55UcB6mgQMd3KbcE+mWTyvVV7D/zsdEbNnV6acZUutkiHQXvTr1Rw==
  dependencies:
    normalize-path "^3.0.0"
    picomatch "^2.0.4"

@joshspicer
Copy link
Member

One note on the structure. I think i'd prefer change to omit the oci: prefix from the resolved property, perhaps moving the sourceInformation "type" to another property?

That way the 'resolved' property is something that can actually be placed into a devcontainer.json.

ie

{
    "features": {
        "ghcr.io/devcontainers/features/node:1": {
            "resolved": "ghcr.io/devcontainers/features/node:1.2.4",
            "sourceType": "oci",
            "version": "1.2.4",
            "integrity": "sha256:567d704b3f4d3eca3acee51ded7c460a8395436d135d53d1175fb565daff42b8"
        },
        "ghcr.io/devcontainers/features/node:1.2": {
            "resolved": "ghcr.io/devcontainers/features/node:1.2.4",
            "sourceType": "oci",
            "version": "1.2.4",
            "integrity": "sha256:567d704b3f4d3eca3acee51ded7c460a8395436d135d53d1175fb565daff42b8"
        }
    }
}

@jkeech
Copy link
Contributor

jkeech commented May 17, 2023

I think the resolved property should point to the exact SHA rather that the major.minor.patch semver label. While the patch version should be immutable, nothing prevents it from being mutated. The lockfile should point to the exact SHA.

Instead of node:1 resolving to node:1.2.4 with a SHA of 567d7... in the example below, node:1 should directly resolve to SHA 567d7.... That way if node:1.2.4 gets republished in the future, the lockfile still points to the original one and is unaffected. We still have the version property for easy reference to see what the version of the feature is, but the main pointer in the lockfile should directly point to the immutable SHA.

Before:

{
    "features": {
        "ghcr.io/devcontainers/features/node:1": {
            "resolved": "ghcr.io/devcontainers/features/node:1.2.4",
            "sourceType": "oci",
            "version": "1.2.4",
            "integrity": "sha256:567d704b3f4d3eca3acee51ded7c460a8395436d135d53d1175fb565daff42b8"
        }
    }
}

After:

{
    "features": {
        "ghcr.io/devcontainers/features/node:1": {
            "resolved": "ghcr.io/devcontainers/features/node@sha256:567d704b3f4d3eca3acee51ded7c460a8395436d135d53d1175fb565daff42b8",
            "sourceType": "oci",
            "version": "1.2.4",
            "integrity": "sha256:567d704b3f4d3eca3acee51ded7c460a8395436d135d53d1175fb565daff42b8"
        }
    }
}

chrmarti added a commit to devcontainers/cli that referenced this issue May 17, 2023
@chrmarti
Copy link
Contributor Author

I was using an outdated example, sorry about that. We are already using the hash in "resolved" and there is no type prefix. We just use the same syntax we use in the devcontainer.json's "features" keys with the hash.

To add dependencies for a devcontainer.json, e.g.:

{
	"image": "mcr.microsoft.com/devcontainers/base:ubuntu",
	"features": {
		"ghcr.io/codspace/dependson/C": true
	}
}

We could just list them without options, but including any version information. E.g.:

{
  "features": {
    "ghcr.io/codspace/dependson/A": {
      "version": "2.0.1",
      "resolved": "ghcr.io/codspace/dependson/a@sha256:932027ef71da186210e6ceb3294c3459caaf6b548d2b547d5d26be3fc4b2264a",
      "integrity": "sha256:932027ef71da186210e6ceb3294c3459caaf6b548d2b547d5d26be3fc4b2264a",
      "dependsOn": [
        "ghcr.io/codspace/dependson/E"
      ]
    },
    "ghcr.io/codspace/dependson/C": {
      "version": "2.0.0",
      "resolved": "ghcr.io/codspace/dependson/c@sha256:db651708398b6d7af48f184c358728eaaf959606637133413cb4107b8454a868",
      "integrity": "sha256:db651708398b6d7af48f184c358728eaaf959606637133413cb4107b8454a868",
      "dependsOn": [
        "ghcr.io/codspace/dependson/A",
        "ghcr.io/codspace/dependson/E"
      ]
    },
    "ghcr.io/codspace/dependson/E": {
      "version": "2.0.0",
      "resolved": "ghcr.io/codspace/dependson/e@sha256:9f36f159c70f8bebff57f341904b030733adb17ef12a5d58d4b3d89b2a6c7d5a",
      "integrity": "sha256:9f36f159c70f8bebff57f341904b030733adb17ef12a5d58d4b3d89b2a6c7d5a"
    }
  }
}

Each entry in the lockfile's "dependsOn" arrays is also part of the keys in the top-level "features" object. This way the lockfile applies to all installed features including dependencies and it contains a short-form of the dependency graph (ignoring differences in options).

@joshspicer
Copy link
Member

joshspicer commented Jun 28, 2023

In this example, how would the lockfile pin a dependency at that is both (1) in the user's devcontainer.json and (2) a dependency of another Feature, and also a different version?

Say ghcr.io/codspace/dependson/{a,c} depends on ghcr.io/codspace/dependson/e:2 and the user's devcontainer.json requires ghcr.io/codspace/dependson/e:3. That isn't captured in the lockfile, and the dependency may float to a later version. That's probably not what the user intends if they enable lock files.

Given the following `devcontainer.json

{
	"image": "mcr.microsoft.com/devcontainers/base:ubuntu",
	"features": {
		"ghcr.io/codspace/dependson/c": {},
                "ghcr.io/codespace/dependson/e:3": {}
	}
}

That will result in the following dependency graph:

flowchart
3992d6[ghcr.io/codspace/dependson/c]
aaaaaa[ghcr.io/codspace/dependson/e:3]
3992d6[ghcr.io/codspace/dependson/c] --> cbd9ef[ghcr.io/codspace/dependson/a]
cbd9ef[ghcr.io/codspace/dependson/a] --> 58daac[ghcr.io/codspace/dependson/e:2]
3992d6[ghcr.io/codspace/dependson/c] --> 58daac[ghcr.io/codspace/dependson/e:2]
Loading

I think we need to take the output of the dependency algorithm and write an entry for each Feature (perhaps deduping the ones that only vary by options).

@jkeech
Copy link
Contributor

jkeech commented Jun 28, 2023

I think we need to take the output of the dependency algorithm and write an entry for each Feature (perhaps deduping the ones that only vary by options).

Agreed. The graph does not need to be captured, but the transitive set of all features that were installed (including multiple versions of the same feature if multiple were installed) need to be recorded in the lock file as a flat list. The proposed syntax in #236 (comment) doesn't work for this since you'd have a collision, but the syntax in #236 (comment) where the version is included in the key would work.

@chrmarti
Copy link
Contributor Author

I think we agree (my example wasn't good). The feature identifiers in the "features" keys and the "dependson" elements can (and often will) include a version. So @joshspicer's example would look like:

{
  "features": {
    "ghcr.io/codspace/dependson/a": {
      "version": "1.2.1",
      "resolved": "ghcr.io/codspace/dependson/a@sha256:...",
      "integrity": "sha256:...",
      "dependsOn": [
        "ghcr.io/codspace/dependson/e:2"
      ]
    },
    "ghcr.io/codspace/dependson/c": {
      "version": "2.0.1",
      "resolved": "ghcr.io/codspace/dependson/c@sha256:...",
      "integrity": "sha256:...",
      "dependsOn": [
        "ghcr.io/codspace/dependson/a",
        "ghcr.io/codspace/dependson/e:2"
      ]
    },
    "ghcr.io/codspace/dependson/e:2": {
      "version": "2.3.4",
      "resolved": "ghcr.io/codspace/dependson/e@sha256:...",
      "integrity": "sha256:..."
    },
    "ghcr.io/codspace/dependson/e:3": {
      "version": "3.2.4",
      "resolved": "ghcr.io/codspace/dependson/e@sha256:...",
      "integrity": "sha256:..."
    }
  }
}

E.g., "ghcr.io/codspace/dependson/e:2" resolves to version 2.3.4 and ""ghcr.io/codspace/dependson/e:3" resolves to version 3.2.4.

@chrmarti
Copy link
Contributor Author

I think the OCI identifier and the tarball URL are both case-insensitive. Since the casing can vary between devcontainer.json and devcontainer-lock.json's dependsOn, I suggest we specify that the lockfile always uses the lowercased identifier as the key in the features map.

@watoone28
Copy link

watoone28 commented Jul 20, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal Still under discussion, collecting feedback
Projects
None yet
Development

No branches or pull requests

4 participants