Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRESOLVER-269] [MRESOLVER-275] Trusted checksums source and more compact format backed source #199

Merged

Conversation

cstamas
Copy link
Member

@cstamas cstamas commented Sep 21, 2022

This PR implements several improvement issues:

  • introduces "trusted" checksums source (and adapts them for provided transport checksums)
  • pushed/moved existing implementation as "trusted" source and introduces "compact" (file) based source
  • cleanup re naming: before it was "file" source, now there is "file-sparse" (old) and "file-compact" (new)
  • also adds minor "cleanup" in AetherModule (stylistic, renames provider private vars and makes them unmodifiable). This does not affects consumer of this library that use sisu (like Maven).

Reason: existing ProvidedChecksumSource is all about transport (see API). Uses transport related classes and is meant -- as it's name and package says -- to provide expected checksums for transport related checks (it provides ChecksumKind#PROVIDED, uses RemoteRepository and transfer related classes). OTOH, there may be requirement to "provide" checksums for operations totally unrelated to transport. Hence, the introduced trusted checksums (using non transport related API) is exactly that, provides checksums for given Artifact (optionally factoring in origin as well in form of ArtifactRepository). This clearly separates "transport realm" and rest of the things.

Along with existing (moved) source, new trusted checksums implementation added that uses more compact format: a "summary" file that contains list of Artifact IDs and checksum per one line. This format is more VCS friendly, and also easier to handle then sparse directories.

By default, new "trusted" checksum sources are adapted to "provided" checksum sources (see TrustedToProvidedChecksumsSourceAdapter), so no functionality loss happens.

Perfomed cleanup around trusted checksum sources as well, old one was in wrong place and wrongly named, dropped it (as it was final class), and now we have two sources:

  • sparse-directory -- behaves exactly same as dropped one, expects provided checksums in "local repo"-like sparse layout
  • summary-file -- is the new format, where one file checksums.${checksumExt} is expected to contain Artifact ID and checksum for given algorithm per line (separated by space)

Both source are able to be "origin aware" when it factors in origin repository ID as well (so one could get checksums-central.sha1 with all the known trusted checksums for use).

Sources are DISABLED by default, as even if file is present (check possible only for file-compact) it does not mean user want to use it in every project. Enabling them is possible via usual means (-D... or by config in .mvn directory to make it per-project persistent). All configuration is sourced from repo system session, no system properties used.

Based on work done in #192

Co-authored-by: @raphw rafael.wth@gmail.com


https://issues.apache.org/jira/browse/MRESOLVER-275 -- Introduce trusted checksums source
https://issues.apache.org/jira/browse/MRESOLVER-269 -- Allow more compact storage of provided checksums

New provided checksums implementation that uses more compact
format: a "summary" file that contains list of Artifact IDs
and provided checksum per one line.

This file is more VCS friendly, and also easier to handle
then sparse directories.
@cstamas
Copy link
Member Author

cstamas commented Sep 21, 2022

This PR obsoletes #192 as it provides 100% same change as in that PR but this PR also cleans up in some areas.

@raphw
Copy link

raphw commented Sep 21, 2022

This looks really good, thanks!

@cstamas cstamas changed the title [MRESOLVER-269] Compact format for provided checksums [MRESOLVER-269] [MRESOLVER-275] Compact format for provided checksums Sep 23, 2022
@cstamas cstamas changed the title [MRESOLVER-269] [MRESOLVER-275] Compact format for provided checksums [MRESOLVER-269] [MRESOLVER-275] Trusted checksums source and more compact format backed source Sep 23, 2022
Copy link
Member

@michael-o michael-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cstamas
Copy link
Member Author

cstamas commented Sep 28, 2022

  • Since the LocalPathComposer is used throughout the docs of the sources should mention that this substructure is also reflected in .checksums/

The javadoc of sparse source says:

 * Sparse file {@link FileTrustedChecksumsSourceSupport} implementation that use specified directory as base
 * directory, where it expects artifacts checksums on standard Maven2 "local" layout. This implementation uses Artifact
 * coordinates solely to form path from basedir, pretty much as Maven local repository does.

IMHO no, as BSD checksum file contains filename, here, we operate with ArtifactId ("file format is artifact ID and checksum separated by space per line"), see https://github.com/apache/maven-resolver/blob/master/maven-resolver-util/src/main/java/org/eclipse/aether/util/artifact/ArtifactIdUtils.java#L45

Cache should share lifecycle with session, and not with a singleton
component, that may survive several sessions (like in mvnd).
In this loop check for file existence, instead to "blindly" try to
read it and "expect" FileNotFoundEx. Not only is cheaper, but is
simpler as well.
@michael-o
Copy link
Member

  • Since the LocalPathComposer is used throughout the docs of the sources should mention that this substructure is also reflected in .checksums/

The javadoc of sparse source says:

 * Sparse file {@link FileTrustedChecksumsSourceSupport} implementation that use specified directory as base
 * directory, where it expects artifacts checksums on standard Maven2 "local" layout. This implementation uses Artifact
 * coordinates solely to form path from basedir, pretty much as Maven local repository does.

I don't see how this answers my question... ?!!

IMHO no, as BSD checksum file contains filename, here, we operate with ArtifactId ("file format is artifact ID and checksum separated by space per line"), see https://github.com/apache/maven-resolver/blob/master/maven-resolver-util/src/main/java/org/eclipse/aether/util/artifact/ArtifactIdUtils.java#L45

hm...we basically use our current format, but compile multiple files and key them by artifact id. Agree.

@cstamas
Copy link
Member Author

cstamas commented Sep 30, 2022

I don't see how this answers my question... ?!!

I might missed the real intent of your question then...

@michael-o
Copy link
Member

Going through again...

Copy link
Member

@michael-o michael-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: the origin aware. We have now aether.enhancedLocalRepository.splitRemoteRepository. To make it consistent, why not name it instead of originalAware splitRepository?
This may be obsolete if the LocalPathPrefixComposer is used.

@michael-o
Copy link
Member

I don't see how this answers my question... ?!!

I might missed the real intent of your question then...

Forget it, I understand now. No further explanation necessary.

@michael-o
Copy link
Member

michael-o commented Oct 1, 2022

Yet another question: It is worthwhile to enable it by default and add checksums on the fly when they are present?

Copy link
Member

@michael-o michael-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet another issue with the .enabled: I think this is again inconsistent with our existing boolean properties: https://maven.apache.org/resolver/configuration.html
They use the name as boolean property with a default value. So to apply this appoach .enabled just can be dropped and its parent used.

@cstamas
Copy link
Member Author

cstamas commented Oct 3, 2022

Yet another question: It is worthwhile to enable it by default and add checksums on the fly when they are present?

This could be similar as "record" functionality is for remote repository filter feature. Yes, it could be possible, but unsure is it worth it: the whole idea of this is to get checksums for other trusted source than remote...

@cstamas
Copy link
Member Author

cstamas commented Oct 3, 2022

Yet another issue with the .enabled: I think this is again inconsistent with our existing boolean properties: https://maven.apache.org/resolver/configuration.html They use the name as boolean property with a default value. So to apply this appoach .enabled just can be dropped and its parent used.

This is fixed now.

@michael-o
Copy link
Member

Yet another question: It is worthwhile to enable it by default and add checksums on the fly when they are present?

This could be similar as "record" functionality is for remote repository filter feature. Yes, it could be possible, but unsure is it worth it: the whole idea of this is to get checksums for other trusted source than remote...

ОК, let's put this aside.

Copy link
Member

@michael-o michael-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically good/done, but https://github.com/apache/maven-resolver/blob/master/src/site/markdown/configuration.md needs to be updated for this.

@michael-o michael-o self-requested a review October 3, 2022 18:01
Copy link
Member

@michael-o michael-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good to merge.

@cstamas cstamas merged commit 8b0d143 into apache:master Oct 3, 2022
@cstamas cstamas deleted the MRESOLVER-269-compact-format-provided-checksums branch October 3, 2022 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants