Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MNG-8026] Maven drives regarding scopes #1391

Merged
merged 17 commits into from Feb 5, 2024
Merged

Conversation

cstamas
Copy link
Member

@cstamas cstamas commented Jan 24, 2024

Changes:

Step1: basically moved (copied, as resolver bits are deprecated) all the classes dealing with scopes (and their interpretation) to Maven.
Step 2: Properties moved to Artifacts from Dependencies, Snapshot check delegated to Aether Artifact, New types for build path, build path scope, dependency scope, language..., Changes propagated.


https://issues.apache.org/jira/browse/MNG-8026

@cstamas cstamas requested a review from gnodet January 24, 2024 11:34
@cstamas cstamas self-assigned this Jan 24, 2024
@cstamas
Copy link
Member Author

cstamas commented Jan 24, 2024

The most important: someApiCall(project + buildPathScope) == buildPath but buildPath class is not done fully yet (just there as placeholder).

The from it, classpath = buildPath(classpathFilter), and modulepath= buildPath(modulepathFilter) and so on, and classpath filter should filter out this based on information from #1378

* @return the artifact properties, never {@code null}
*/
@Nonnull
ArtifactProperties getArtifactProperties();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this change. An Artifact is obtained from coordinates and as no knowledge on how it will be used. The same pointer could point to the same artifact, but with different Type, hence different properties. I think those do belong to the resolved. Dependency

Copy link
Member Author

@cstamas cstamas Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, happens similarly as in Resolver:

  • IF "ad-hoc" resolution happened (i.e. plugin used resolve service) it is plugin that knows why it did that
  • IF artifact entered via project (as dependency) then you have dependency -> type -> artifact so properties are set.

We may need something more for this, still refining.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, then they should be kept on the Dependency

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree here: Dependency is what you have in POMs (either project or transitive deps). So, when you resolve, you end up with bunch of Artifacts, NOT Dependencyies. Basically, that is why this is needed on Artifact and not Dependency (where is also needed, as properties from type should and could be still augmented).

Copy link
Member Author

@cstamas cstamas Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or in other words: see this PR apache/maven-resolver#411
Basically Resolver can be now asked to:

  • create Artifact as new Artifact("g:a:v", artifactTypeRegistry.get("annotation-processor")). (and this ctor is not new, in a way, you could do the same even with existing ctors, this was just a shorthand to make possible to use type with string "coords").
  • the to resolve it to have it's full transitive hull
  • and you again end up with "bunch of Artifacts", but you still want to sift thru them, filter them by properties

No dependency in scope, but you are STILL interested in Artifact properties....

Or similarly for ANY artifact, as if artifact you are resolving may have type="modular-jar", then the needed info will be still present in artifact.properties. Again, no dependencies involved, etc.

Am pretty much confident that this change is correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...we already have some properties (is added to classpath etc) no?

@rmannibucau

That method is one of the main sources of confusion 😄 This is what happens:

Typical example: the POM packaging, see definition here:
https://github.com/apache/maven/blob/master/maven-core/src/main/java/org/apache/maven/internal/impl/types/PomTypeProvider.java

Long time ago, before BOMs were a thing, people used it to "group dependencies" (see for example this blog: https://blog.sonatype.com/2009/10/maven-tips-and-tricks-grouping-dependencies/).

In essence, you create a POM that has your grouped dependencies and deploy that. So you have:

G:A:pom:V
  +- D1
  +- D2
  +- D3

On consuming side, when you declare in your project dependency (yes, a dependency, not import as today you do with BOMs) like G:A:pom:V, hence as:

<dependency>
  <groupId>G</groupId>
  <artifactId>A</artifactId>
  <version>V</version>
  <type>pom</type>
</dependency>

On resolving (for @gnodet collect+resolve) this project "compile" build path scope, assuming is the only dependency you have, you would end up with this list of artifacts: [D1, D2, D3] despite resolution started as P (root) where P had 1 dependency G:A:pom:V.

Simply put, the Type definition (lack of "constitutes build path") causes that intermediate "grouping POM" be completely omitted (left out) from the results, as it is not part of the build path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: the long time ago pattern of doing pom for deps is still accurate today and bom created more mess than helped there - but unrelated to the PR, just wanted to highlight this is still used for goods and not legacy.

Ok but it means we need a dependency handler as we have an artifact handler (think it is a bit complex).
Ultimately, do we need dependency outside the model (literally ) or can it not just be artifacts?
I understand what you mean but it seems like a lot of work for not much gain, no? Can't we merge it somehow in new api?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: the long time ago pattern of doing pom for deps is still accurate today and bom created more mess than helped there - but unrelated to the PR, just wanted to highlight this is still used for goods and not legacy.

Ok but it means we need a dependency handler as we have an artifact handler (think it is a bit complex). Ultimately, do we need dependency outside the model (literally ) or can it not just be artifacts? I understand what you mean but it seems like a lot of work for not much gain, no? Can't we merge it somehow in new api?

There's no ArtifactHandler in the v4 api. It's called Type and it represents the usage of an Artifact as a Dependency. There's nothing to handle afaik on an Artifact: it's just a pointer to a file in a repository. How it's used needs to be taken care of and that's done in the Dependency which is associated to a Type and Scope and DependencyProperties. If we move things to the Artifact or merge Artifact and Dependency, I think that's a source of confusion.

Copy link
Member Author

@cstamas cstamas Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolver also encodes "type" in Artifact#properties. It also exposed since eons ctors for DefaultArtifact where you could pass in ArtifactType. The type you declare (i.e. if in project, you have packaging -> type) or if "ad-hoc" by root dependency node type (where callar knows what it does).

Or in short:

  • artifact is a file decorated with coordinates and properties
  • dependency is an artifact having extras like scope, type, optionality using which you can "derive" an artifact as above
  • true, in Resolver "dependency" is ONLY a thing that comes from POM (ArtifactDescriptor) but it is being distilled to artifact (as in second bullet -> first bullet).

Basically "everything is an artifact", and "dependency is an artifact that enters the game as dependency". This is how Resolver abstracts.

Doing collect+resolve operation and ending up with list of sole Dependencies is what you say?

When you collect+resolve, you are interested in all, so even root (again, this is for "ad-hoc" operations, like javadoc may tell: "resolve me this taglet and its dependencies", in that case, javadoc plugin KNOWS root is taglet, but will still need to handle the rest of transitive hull, and place them or use that as needed).


@gnodet if I understand you correctly, sans "root", you consider everything else (the transitive hull) as "dependency"? Which kinda makes sense, as they DID enter the result collection as "dependencies of root"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolver also encodes "type" in Artifact#properties. It also exposed since eons ctors for DefaultArtifact where you could pass in ArtifactType. The type you declare (i.e. if in project, you have packaging -> type) or if "ad-hoc" by root dependency node type (where callar knows what it does).

Or in short:

  • artifact is a file decorated with coordinates and properties
  • dependency is an artifact having extras like scope, type, optionality using which you can "derive" an artifact as above
  • true, in Resolver "dependency" is ONLY a thing that comes from POM (ArtifactDescriptor) but it is being distilled to artifact (as in second bullet -> first bullet).

Basically "everything is an artifact", and "dependency is an artifact that enters the game as dependency". This is how Resolver abstracts.

Yes, and I think that's where some confusion comes from. Everything is an artifact means that when you have an Artifact object, you don't really know what it is: it can be 5 different things:

  • a pointer to a file (with a version range): ArtifactCoordinate
  • a resolved artifact in a repository (a definite version): Artifact
  • a pointer to a dependency (version range + scope + type)
  • a resolved dependency: the resolver actually has org.eclipse.aether.graph.Dependency and that's what the v4 api wraps
  • an artifact generated by the project

Doing collect+resolve operation and ending up with list of sole Dependencies is what you say?

This is also a confusing part. It's conceptually a 3 step process: collect + flatten + resolve. The collect grabs the input and computes the graph of Node. Flattens will output a list of Node. Then the resolution step becomes the same as the artifact resolution because each node can be turned into an artifact coordinate, and we then just download the actual artifacts.

In the v4 api, we have ArtifactResolver which gives an Artifact. The DependencyCollector returns the graph with Nodes. The DependencyResolver performs both flatten which gives an ordered list of Nodes, and resolve which returns an ordered Map<Dependency, Path> which I think is what is needed to compute the build paths.

When you collect+resolve, you are interested in all, so even root (again, this is for "ad-hoc" operations, like javadoc may tell: "resolve me this taglet and its dependencies", in that case, javadoc plugin KNOWS root is taglet, but will still need to handle the rest of transitive hull, and place them or use that as needed).

@gnodet if I understand you correctly, sans "root", you consider everything else (the transitive hull) as "dependency"? Which kinda makes sense, as they DID enter the result collection as "dependencies of root"?

Yes, everything is a dependency. It's only the root which can be a project or artifact, but when you collect the dependencies for a project or artifact, you actually resolve all its dependencies.

I think there may be a small piece missing, which is the ability to access the graph from a dependency, i.e. adding a Node getNode() on the org.apache.maven.api.Dependency to get back to the graph.

At some point, I did envision to call it ResolvedDependency which would avoid the confusion with the model dependency, which is is translated into a DependencyCoordinate, the collected / flattened / resolved.

To sum up: the v3 api lacks a bunch of things, that's why plugins have to use the resolver to get fine grain access to the resolution output. The v4 api aims to expose those bits, but those aren't really new, the org.apache.maven.api.Dependency simply wraps org.eclipse.aether.graph.Dependency. That was clearly a missing piece in the v3 api.

@@ -95,7 +97,12 @@ public String getClassifier() {

@Override
public boolean isSnapshot() {
return DefaultModelVersionParser.checkSnapshot(artifact.getVersion());
return artifact.isSnapshot();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had this discussion weeks ago, I think we agreed the snapshot knowledge was not on the artifact. We now have a pluggable VersionScheme, I think it belongs there.

Anyway, I think those changes related to snapshots are unrelated and would be better in a separate PR to ease the discussion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

My point is to have one "source of truth". So if we change this once, we change that one spot. Currently, resolver Artifact.isSnapshot should be it IMHO. But agree, let's move it out from this PR, does not belong here.

@cstamas
Copy link
Member Author

cstamas commented Jan 24, 2024

There is one maven-core UT failing as DefaultPackagingManager seems not set somewhere.

@rmannibucau
Copy link
Contributor

@cstamas I see the 1-1 with what exists but I have some hard time - blame the hour maybe ;) - to see how we can make it consistent with the type work which is pending, typically build path will not be a build path but more a resolved artifact set totally unrelated to any path now, no? So for me scope becomes just a resolution stage more than anything else. Am I missing something?

@cstamas
Copy link
Member Author

cstamas commented Jan 24, 2024

It was always "build path" but Maven internally called it wrongly "classpath" 😉 (and/or was what was inherited from Maven2)

Basically what we do here is laying down new API blocks for dep scopes and build path scopes.

The other PR is about "filling in the properties".

And ultimately, you as for "main-compile" or "test-compile" build path, Maven calls into Resolver and it resolves those for you, and you end up with a bunch of artifacts (that are coming via project, so are dependency, hence, had type).

Then, those collections are "split", filtered by various properties present in their artifact.properties that makes them "fall" into proper bucket, like classpath, modulepath, etc

import org.apache.maven.api.annotations.Nonnull;

/**
* Build path. Build path is calculated for given {@link Project} and {@link BuildPathScope}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be noted what the build path is and what's the purpose. It's not a common java notion, unlike class path and module path.
Especially the "build" term could lead to think it's role is build time, but that's not the case and it can be used at build time or run time.

}

@Nonnull
Collection<Artifact> getArtifacts();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Node would be required.
From that one, we can derive a List<Dependency> and eventually a List<Artifact>. But that's related to my remark around artifact properties which I think are still dependency properties.

* @return the artifact properties, never {@code null}
*/
@Nonnull
ArtifactProperties getArtifactProperties();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, then they should be kept on the Dependency

@rmannibucau
Copy link
Contributor

@cstamas does it mean we should fix the naming for mvn 4 and align it on what it is? Resolverpaths or alike since it is unrelated to any actual build and becomes even more ambiguous with jpms?

@gnodet
Copy link
Contributor

gnodet commented Jan 25, 2024

@cstamas does it mean we should fix the naming for mvn 4 and align it on what it is? Resolverpaths or alike since it is unrelated to any actual build and becomes even more ambiguous with jpms?

I also find the term build path somewhat misleading...

@cstamas
Copy link
Member Author

cstamas commented Jan 25, 2024

I also find the term build path somewhat misleading...

Am completely fine to keep "resolution path" as well, but did not want to disturb existing classes...

@cstamas
Copy link
Member Author

cstamas commented Jan 25, 2024

@gnodet and this is why I think that "step 1" should have been merged. This discussion will take some, and this PR will just potentially drift away from master (unless we freeze master, which I don't want).

@gnodet
Copy link
Contributor

gnodet commented Jan 25, 2024

@gnodet and this is why I think that "step 1" should have been merged. This discussion will take some, and this PR will just potentially drift away from master (unless we freeze master, which I don't want).

Maybe, but there's no point in the first PR if we don't go further...

@cstamas
Copy link
Member Author

cstamas commented Jan 25, 2024

Maybe, but there's no point in the first PR if we don't go further...

Um, the point of first commit is to move off from all the deprecated classes in resolver (see apache/maven-resolver#412 and related JIRA), as currently, boundary of "scope interpretation and transformation" is completely blurred and doubled between Resolver and Maven Core. Things like adding new scopes is not trivial at all, as it requires changes in both projects, whereis Resolver should be really oblivious about scopes.

The first commit is really not related to 2nd step, it just does the right thing (moves all scope related logic to Maven Core), that in turn also effectively clears grounds for step2.

@rmannibucau
Copy link
Contributor

I'm more than fine doing it by "blocks" - in particular the last part Tamas explained - but how can we ensure we don't get this PR without next one in terms of release to avoid to expose - even in an alpha - something we know we'll not keep like that and we'll break?

@cstamas
Copy link
Member Author

cstamas commented Jan 25, 2024

Just a general remark: Am fine if we "redefine" here everything, from ground up, but IMHO "safest" would be to go "resolver way" (adhere to its abstractions closest as possible) as it is proven and works. And then improve it from there (consider "resolver way" as starting point), instead to try to come up with something completely new and different, and risk the problem of something (some use case, edge case) being forgotten, and then go thru pain of "fixing it" in upcoming API versions...

As Guillaume is right here: I am somewhat pushing for resolver-like way of working (and interfaces/services).

@cstamas
Copy link
Member Author

cstamas commented Jan 25, 2024

First "block" or "step" is here #1392

Changes:
* Properties moved to Artifacts from Dependencies
* snapshot check delegated to Aether Artifact
* new types for build path, build path scope, dependency scope, language
* changes propagated
@gnodet
Copy link
Contributor

gnodet commented Jan 31, 2024

@cstamas I've removed BuildPath which will be provided by #1378 (inside DependencyResolverResult) and renamed BuildPathScope to PathScope to align with the other PR which defines PathType.
Let me know what you think about the current state.

NONE("none", false),

/**
* Empty scope.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe explain the different between "empty" and "none"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We saw that "empty" is present in old codebase (3.x), maybe will go away, but we left it here just to be on safe side until we clear up things.

The "none" on the other hand is clearly defined IMHO: merely added just to affect the reactor sorting.

COMPILE_ONLY("compile-only", false),

/**
* Compile.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume that we mean "compile, test and runtime"? (for clarifying the difference with COMPILE_ONLY).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be worth to edit the javadoc accordingly (same for TEST).

TEST_ONLY("test-only", false),

/**
* Test.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume that we mean "test compilation and execution"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

* Implementation must have {@code equals()} and {@code hashCode()} implemented, so implementations of this interface
* can be used as keys.
*/
public interface ExtensibleEnum {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for information: the International Organization for Standardization (ISO) uses the CodeList term for extensible enumerations, at least in the ISO 19xxx series of international standards (geospatial information). But I did not verified if CodeList was consistently used in all other standards as well.

@gnodet gnodet marked this pull request as ready for review February 5, 2024 08:00
@gnodet gnodet merged commit 3f9fec2 into apache:master Feb 5, 2024
18 checks passed
@cstamas
Copy link
Member Author

cstamas commented Feb 5, 2024

Looks good! Sorry to be late to party, but we can improve all this "on the go" anyway...

@cstamas cstamas deleted the MNG-8026 branch February 5, 2024 10:11
Copy link

@Jircs1 Jircs1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

api/maven-api-core/src/main/java/org/apache/maven/api/Language.java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants