Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock the digests by adding sha256 values for installed packages. #301

Merged
merged 19 commits into from
Sep 30, 2022

Conversation

nmahendru
Copy link
Contributor

@nmahendru nmahendru commented Aug 11, 2022

Populate sha256sums in a specified manifest file
Usage:
hermit manifest add-digests <manifest.hcl>

It should populate(in place) all the sha256sums for all the version/platform combinations in the original manifest file.

Testing:

  • An integration test has been added to verify the digest created.
  • An integration test has been added to verify that hermit install fails
    on wrong digest.

@nmahendru nmahendru force-pushed the nmahendru/lock_digests_embedded branch 2 times, most recently from 8e46bf1 to 394f460 Compare September 26, 2022 22:56
@nmahendru nmahendru marked this pull request as ready for review September 26, 2022 22:56
Copy link
Collaborator

@alecthomas alecthomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, but it needs some reworking. Please let me know if you'd like to pair on it.

app/manifest_lock_digests_cmd.go Outdated Show resolved Hide resolved
return errors.WithStack(err)
}
if len(installed) == 0 {
fmt.Printf("There are no packages installed. Nothing to do here.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use Printf directly, use ui.UI.Infof() and friends.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

app/manifest_lock_digests_cmd.go Outdated Show resolved Hide resolved
app/manifest_lock_digests_cmd.go Outdated Show resolved Hide resolved
env.go Outdated Show resolved Hide resolved
app/manifest_lock_digests_cmd.go Outdated Show resolved Hide resolved
value, _ := hcl.Marshal(localmanifest.Manifest)

//Now just write down fresh manifest files in a subdirectory in the `bin`.
os.MkdirAll(filepath.Join(env.BinDir(), "lockedDigests"), os.ModePerm)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename lockedDigests to hermit.lock.

Also, check for errors here and elsewhere. You can use errors.Is(err, fs.ErrExist) to ignore "already exists" errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

cache/git.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@quad quad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this PR introduces three unique terms for the same concept:

  1. "digests" / "Digests"
  2. sha256sums
  3. checksums / "actual checksums"

Please consider picking and using one term.

(I prefer "digests"; "sha256sums" is implementation specific and "checksums" is inaccurate.)

@@ -79,6 +79,7 @@ func AutoVersion(httpClient *http.Client, ghClient GitHubClient, path string) (l
// "hclBlock" is the block containing the auto-version information. Its "Labels" field should be updated with the new versions, if any.
func parseVersionBlockFromManifest(manifest []byte) (ast *hcl.AST, hclBlock *hcl.Block, autoVersionBlock *hmanifest.AutoVersionBlock, err error) {
ast, err = hcl.ParseBytes(manifest)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

env.go Outdated
@@ -675,7 +675,7 @@ func (e *Env) Upgrade(l *ui.UI, pkg *manifest.Package) (*shell.Changes, error) {
//
// Link chains are in the form
//
// <binary> -> <pkg>-<version>.pkg -> hermit
// <binary> -> <pkg>-<version>.pkg -> hermit
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: was this intentional?

manifest/resolver.go Outdated Show resolved Hide resolved
state/state.go Outdated
@@ -290,7 +292,6 @@ func (s *State) CacheAndUnpack(b *ui.Task, p *manifest.Package) error {
return errors.WithStack(err)
}
defer lock.Release(b)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

nit: intentional removal?

`
}
func (e *lockDigestsCmd) Run(l *ui.UI, env *hermit.Env, cache *cache.Cache, state *state.State) error {
//time.Sleep(time.Second * 7)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//time.Sleep(time.Second * 7)

@quad
Copy link
Collaborator

quad commented Sep 27, 2022

Out of curiosity, why a unique .hcl file per package vs. a single hermit-lock.hcl?

app/commands.go Outdated
Exec execCmd `cmd:"" help:"Directly execute a binary in a package." group:"env"`
Env envCmd `cmd:"" help:"Manage environment variables." group:"env"`
Validate activatedValidateCmd `cmd:"" help:"Hermit validation." group:"global"`
LockDigests lockDigestsCmd `cmd:"" help:"Lock the Digests for installed packages using sha256sums." group:"global"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LockDigests lockDigestsCmd `cmd:"" help:"Lock the Digests for installed packages using sha256sums." group:"global"`
LockDigests lockDigestsCmd `cmd:"" help:"Lock the manifest for installed packages." group:"global"`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are really locking in the digests and not the manifest. also again sha256sums is an existing attribute so that information is helpful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not particularly fussed about the verbage; more that they should they be the same.

@@ -4,4 +4,5 @@ type manifestCmd struct {
Validate validateSourceCmd `cmd:"" help:"Check a package manifest source for errors." group:"global"`
AutoVersion autoVersionCmd `cmd:"" help:"Upgrade manifest versions automatically where possible." group:"global"`
Create manifestCreateCmd `cmd:"" help:"Create a new manifest from an existing package artefact URL." group:"global"`
LockDigests lockDigestsCmd `cmd:"" help:"Update the manifest files for installed packages with sha256sums" group:"global"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LockDigests lockDigestsCmd `cmd:"" help:"Update the manifest files for installed packages with sha256sums" group:"global"`
LockDigests lockDigestsCmd `cmd:"" help:"Lock the manifest for installed packages." group:"global"`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above


func (e *lockDigestsCmd) Help() string {
return `
This command will build a list of installed packages and their checksums(SHA256) values.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This command will build a list of installed packages and their checksums(SHA256) values.
This command will build a list of installed packages and their digests (SHA256).


That file can be manually reviewed by the project owner and checked in to the repo.

On subsequent uses of the hermit environment, the installed dependencies will have a reference checksum
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
On subsequent uses of the hermit environment, the installed dependencies will have a reference checksum
On subsequent uses of the hermit environment, the installed dependencies will have a reference digest

app/manifest_lock_digests_cmd.go Outdated Show resolved Hide resolved
Comment on lines 63 to 65
for _, mc := range localmanifest.Manifest.Versions {
for _, p := range platform.Core {
ref = manifest.ParseReference(ref.Name + "-" + mc.Version[0])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for _, mc := range localmanifest.Manifest.Versions {
for _, p := range platform.Core {
ref = manifest.ParseReference(ref.Name + "-" + mc.Version[0])
for _, mc := range localmanifest.Manifest.Versions {
ref = manifest.ParseReference(ref.Name + "-" + mc.Version[0])
for _, p := range platform.Core {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good optimization. Done!

Comment on lines 73 to 76
if err != nil {
l.Debugf("Continuing with the next platform tuple. Current %s: %s", p.OS, p.Arch)
continue
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we don't want to log the error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every package might not have every platform combo. We try to digest what we can resolve. Refer:
#301 (comment)

app/manifest_lock_digests_cmd.go Outdated Show resolved Hide resolved
Comment on lines 67 to 71
if p.OS == "Darwin" {
config = manifest.DarwinConfig(res.GetConfig(), p.Arch)
} else {
config = manifest.LinuxConfig(res.GetConfig())
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Is the OS value always passed through?
  2. How does this handle digests for multiple architectures? (e.g. Linux / arm64)

Copy link
Contributor Author

@nmahendru nmahendru Sep 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the OS value always passed through?

yes. But it also helps decide what to call, Linux or DarwinConfig

How does this handle digests for multiple architectures? (e.g. Linux / arm64)

the if here is handling linux/mac and there is one check inside DarwinConfig that handles arm/amd64

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Both DarwinConfig and LinuxConfig leave the OS` value unchanged.
  • What about Linux / arm64? It seems LinuxConfig ignores that combination?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AIUI that's the minimum. @alecthomas?

@nmahendru
Copy link
Contributor Author

nit: this PR introduces three unique terms for the same concept:

  1. "digests" / "Digests"
  2. sha256sums
  3. checksums / "actual checksums"

Please consider picking and using one term.

(I prefer "digests"; "sha256sums" is implementation specific and "checksums" is inaccurate.)
Good point.

sha256sums is an existing term(part of the manifest). I have renamed checksum to digest in the new command. Thinking more about it, I see that the terminology checksum was present prior and that's what I think prompted me to use it where I saw it.

state/state.go Outdated
}
actualDigest = ch
} else {
path := s.cache.Path(p.SHA256, p.Source)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is p.SHA256 already the sha256 value? Can we use that instead of re-calculating?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, At the moment manifest files have sha256 and now this command is called locked-digests so these are the only two.

Then would it be more consistent to use "sha256" everywhere, not introducing the concepts of "checksum" or "digest"?

I have removed checksum. sha256 in the manifest file tells the user about what algorithm is used and that is essential information. lock-digests is a generic enough name that if we later want to use sha512 then it would still be applicable. wdyt ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not clear why we can't use p.SHA256?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what @quad is suggesting is that if p.SHA256 != "", just return that immediately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are all good points. AFAIK the p.SHA256 value is coming from a user and it was manually calculated using maybe something like:

sha256sum <file>

I am thinking if we should check the value here at least ?
The reason I am with that is we plan to use this tooling to automatically update digests in our sources so we should be trusting this more than user inputs ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made the change and skipping calculation if already present.

state/state.go Outdated Show resolved Hide resolved
@quad
Copy link
Collaborator

quad commented Sep 27, 2022

Good point.

sha256sums is an existing term(part of the manifest). I have renamed checksum to digest in the new command. Thinking more about it, I see that the terminology checksum was present prior and that's what I think prompted me to use it where I saw it.

Some concepts are in the code and some concepts are shown to the user.

Which ones are already shown to the user?

@nmahendru
Copy link
Contributor Author

Out of curiosity, why a unique .hcl file per package vs. a single hermit-lock.hcl?

So that we can just put it inside hermit-packages alongside every file. The existing checks will just work transparently then.

@nmahendru
Copy link
Contributor Author

Good point.
sha256sums is an existing term(part of the manifest). I have renamed checksum to digest in the new command. Thinking more about it, I see that the terminology checksum was present prior and that's what I think prompted me to use it where I saw it.

Some concepts are in the code and some concepts are shown to the user.

Which ones are already shown to the user?

AFAIK, At the moment manifest files have sha256 and now this command is called locked-digests so these are the only two.

@quad
Copy link
Collaborator

quad commented Sep 27, 2022

AFAIK, At the moment manifest files have sha256 and now this command is called locked-digests so these are the only two.

Then would it be more consistent to use "sha256" everywhere, not introducing the concepts of "checksum" or "digest"?

cache/http.go Outdated
@@ -82,17 +83,17 @@ func (s *httpSource) Validate() error {
return nil
}

func downloadHTTP(b *ui.Task, response *http.Response, checksum string, uri string, cachePath string) (path string, etag string, err error) {
func downloadHTTP(b *ui.Task, response *http.Response, checksum string, uri string, cachePath string, cache *Cache) (path string, etag string, returnChecksum string, err error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is cache used?

state/state.go Outdated Show resolved Hide resolved
@nmahendru nmahendru force-pushed the nmahendru/lock_digests_embedded branch from d92d216 to 09e5173 Compare September 29, 2022 04:27
Copy link
Collaborator

@quad quad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is soooo nice

Comment on lines +324 to +326
func NewPackage(manifest *AnnotatedManifest, platform2 platform.Platform, ref Reference) (*Package, error) {
config := Config{}
config.Platform = platform2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

platform2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is goland doing it for me and I did not bother :)

Comment on lines +66 to +67
// If the file is a directory then no checksum is required
if info.IsDir() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If the file is a directory then no checksum is required
if info.IsDir() {
// If the file isn't a regular file then no checksum is required
if !info.IsRegular() {

state/state.go Outdated
}
actualDigest = ch
} else {
path := s.cache.Path(p.SHA256, p.Source)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not clear why we can't use p.SHA256?

Copy link
Collaborator

@alecthomas alecthomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM!

}
err = os.WriteFile(f, value, os.ModePerm)
if err != nil {
l.Errorf("Could not write the manifest file %s", f)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be returned in the error below using errors.Wrapf()

cache/source.go Outdated
return s.path, "", "", nil
}
var data []byte
data, err = os.ReadFile(s.path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost never a good idea with arbitrary inputs. Instead, do:

r, err := os.Open(s.path)
h := sha256.New()
_, err = io.Copy(h, r)

@@ -255,6 +255,23 @@ Describe "Hermit"
End
End

Describe "Running the add-digests command"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

state/state.go Outdated
} else {
// if the artifact is cached then just calculate the digest.
path := s.cache.Path(p.SHA256, p.Source)
data, err := os.ReadFile(path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, change this to use os.Open() + io.Copy()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

state/state.go Outdated
}
actualDigest = ch
} else {
path := s.cache.Path(p.SHA256, p.Source)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what @quad is suggesting is that if p.SHA256 != "", just return that immediately.

nmahendru and others added 11 commits September 29, 2022 15:39
Populate sha256sums in a specified manifest file
Usage:
hermit manifest add-digests <manifest.hcl>

It should populate(in place) all the sha256sums for all the version/platform combinations in the original manifest file.

Testing:
An integration test has been added to verify the digest created.
An integration test has been added to verify that hermit install fails
on wrong digest.
1. Removed duplicate code for manifest parsing.
2. Iteration on platform.core instead of manually indexing all possible
   combinations.
3. Logging improvements.
4. optimizations.
Renaming checksum to digests. and some formatting cleanups.
@nmahendru nmahendru force-pushed the nmahendru/lock_digests_embedded branch from 09e5173 to 938d8e6 Compare September 29, 2022 23:01
@alecthomas alecthomas merged commit 2d416f6 into cashapp:master Sep 30, 2022
@alecthomas
Copy link
Collaborator

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants