Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes MRESOLVER-7, third try #30

Closed
wants to merge 7 commits into from

Conversation

hwellmann
Copy link
Contributor

Rebased PR #10 on master and added a small change which reverts dependency management to its former somewhat broken state, see MRESOLVER-9.

This fixes failing tests in the maven and maven-integration-testing projects.

hwellmann and others added 3 commits March 22, 2019 17:43
Dependencies are now processed asynchronously by an Executor

Cherry-pick commit from PR#2 Nov 1, 2016
This restores legacy behaviour, but means that dependency management
is broken, as noted in MRESOLVER-9.
Indicates the number of threads used for parallel resolution of
artifact descriptors. Default is 5. Set to 1 to revert to serial
resolution.

maven.artifact.descriptor.threads can be used as alias.
@slachiewicz
Copy link
Member

During verification, I found that one change is being rolled out - a245b56 I do not think it was intentional.
You can also freely squash my commit with WorkerThreadFactory

...which got lost in rebasing this branch
@hwellmann
Copy link
Contributor Author

I've reintroduced the change made in a245b56 that got lost during rebase - thanks for spotting this!

What exactly did you mean regarding the "commit with WorkerThreadFactory"?

@@ -666,46 +679,4 @@ private static boolean isLackingDescriptor( Artifact artifact )
}
return repositories;
}

private static List<? extends Version> filterVersions( Dependency dependency, VersionRangeResult rangeResult,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why has this method been removed in this commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been moved to DependencyCollectionUtils.

@@ -89,6 +90,10 @@

static final String CONFIG_PROP_MAX_CYCLES = "aether.dependencyCollector.maxCycles";

private static final String CONFIG_PROP_NUM_ARTIFACT_DESCRIPTOR_THREADS = "aether.artifact.descriptor.threads";
private static final String CONFIG_PROP_NUM_ARTIFACT_DESCRIPTOR_THREADS_ALT = "maven.artifact.descriptor.threads";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a need for an alternative. As soon as the package names will change we will likely rename properties too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, removed.

@@ -35,18 +42,39 @@

final Boolean premanagedOptional;

/**
* @since 1.1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to 1.4.0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

final Collection<Exclusion> premanagedExclusions;

/**
* @since 1.1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to 1.4.0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@michael-o michael-o left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good in general. Please take care of the comments. I will test against Core ITs. If everything is fine, I will merge for 1.4.0.

@michael-o
Copy link
Member

All ITs pass for me. As soon as the requested changes are performed I will merge into master.

@michael-o
Copy link
Member

I have now pushed a branch with minimal changes. Everyone's invited to have look. If I hear no objections, I'll merge in a couple of days.

@asfgit asfgit closed this in 9d3b324 Apr 12, 2019
@nilson-aguiar
Copy link

@michael-o this is not going to be merged?

@michael-o
Copy link
Member

??? This has been merged two months ago.

@nilson-aguiar
Copy link

Sorry then, saw it as closed instead of merged then i asked. Thanks for the reply.

sschuberth added a commit to oss-review-toolkit/ort that referenced this pull request Jan 15, 2020
The parallel download of POMs [1] introduced in version 1.4.0 seems to
cause probelms with our "jgnash-core dependencies are detected correctly"
test in "MavenTest". To reproduce, do

    $ rm -fr ~/.m2/repository/org/apache/poi/poi-ooxml-schemas/3.14/
    $ ./gradlew :analyzer:funTest --tests com.here.ort.analyzer.managers.MavenTest

[1] apache/maven-resolver#30

Signed-off-by: Sebastian Schuberth <sebastian.schuberth@bosch.io>
sschuberth added a commit to oss-review-toolkit/ort that referenced this pull request Jan 15, 2020
The parallel download of POMs [1] introduced in version 1.4.0 seems to
cause probelms with our "jgnash-core dependencies are detected correctly"
test in "MavenTest". To reproduce, do

    $ rm -fr ~/.m2/repository/org/apache/poi/poi-ooxml-schemas/3.14/
    $ ./gradlew :analyzer:funTest --tests com.here.ort.analyzer.managers.MavenTest

which makes the test fail as none of the dependencies of "poi-ooxml-schemas"
are found. Interestingly, at the time ORT creates the dependency nodes,
the POM for "poi-ooxml-schemas" is not downloaded yet, but it *does* get
downloaded afterwards.

As a quick solution to the problem, simply revert to the most recent
version of maven-resolver that does not include the parallel download of
POMs. Later, we should probably inspect whether this is a user error on
our side, or a bug in maven-resolver.

[1] apache/maven-resolver#30

Signed-off-by: Sebastian Schuberth <sebastian.schuberth@bosch.io>
sschuberth added a commit to oss-review-toolkit/ort that referenced this pull request Jan 15, 2020
The parallel download of POMs [1] introduced in version 1.4.0 seems to
cause problems with our "jgnash-core dependencies are detected correctly"
test in "MavenTest". To reproduce, do

    $ rm -fr ~/.m2/repository/org/apache/poi/poi-ooxml-schemas/3.14/
    $ ./gradlew :analyzer:funTest --tests com.here.ort.analyzer.managers.MavenTest

which makes the test fail as none of the dependencies of "poi-ooxml-schemas"
are found. Interestingly, at the time ORT creates the dependency nodes,
the POM for "poi-ooxml-schemas" is not downloaded yet, but it *does* get
downloaded afterwards.

As a quick solution to the problem, simply revert to the most recent
version of maven-resolver that does not include the parallel download of
POMs. Later, we should probably inspect whether this is a user error on
our side, or a bug in maven-resolver.

[1] apache/maven-resolver#30

Signed-off-by: Sebastian Schuberth <sebastian.schuberth@bosch.io>
@sschuberth
Copy link

Does this new feature imply some API usage changes? Because in our OSS Review Toolkit (which analyzes project dependencies) we actually see some POMs only being downloaded after we expect them to be present. Downgrading to version 1.3.3 resolves the issue. For details please see oss-review-toolkit/ort#2113.

@michael-o
Copy link
Member

@sschuberth This has been reverted because it suffered from race conditions.

@sschuberth
Copy link

Thanks @michael-o, I was just reading about that in MRESOLVER-92 which is mentioned in the 1.4.1 release announcement. However, interestingly the issue we're seeing occurs with both versions 1.4.0 and 1.4.1. So maybe it's something that's not caused by the parallel download of POMs after all.

@michael-o
Copy link
Member

@sschuberth If you see some other error, please report.

@sschuberth
Copy link

@michael-o, in our case, the root cause turned out to be a side effect of https://www.alphabot.com/security/blog/2020/java/Your-Java-builds-might-break-starting-January-13th.html.

@Eskibear
Copy link

Any update? Will you guys re-enable this improvement to download pom files in parallel?

@michael-o
Copy link
Member

Any update? Will you guys re-enable this improvement to download pom files in parallel?

No, unless someone is willing to work on the race conditions.

@Eskibear
Copy link

Eskibear commented Mar 5, 2020

All I can find about the race conditions is @Tibor17 's comments in MRESOLVER-7 . And it seems that details were discussed in the ASF slack but it is internal. So it might be hard for someone to work on it without knowing the details.

@hwellmann @slachiewicz May I ask that, are you still working on the fix, or some other ASF guys will take it over?

@michael-o
Copy link
Member

@Eskibear, no, no one is working on it. We prefer stability over performance.

@Eskibear
Copy link

Eskibear commented Mar 5, 2020

@michael-o thanks for the information. BTW, could you share some details (or point out the link) about the failed cases, race conditions mentioned above. My point is, if someone from community is willing to fix that, your previous discussion would help a lot.

@michael-o
Copy link
Member

Basically, we have frequent race conditions in the Maven ITs. @Tibor17 Do you remember the discussion the mailing list? I guess it was on Slack only :-(

@Eskibear. I think you can reproduce it yourself. Fetch the first broken Resolver release and run Maven ITs multiple times. I think you should see failures not related to Maven itself.

@Tibor17
Copy link
Contributor

Tibor17 commented May 27, 2020

@michael-o
@Eskibear
Hello guys. You'r right! We should restart the work.
Can we start again during the weekend?

@Eskibear
Copy link

Any news or follow-ups?

@michael-o
Copy link
Member

@Eskibear, since we have other serious issues with concurency this isn't happening soon.

@Tibor17
Copy link
Contributor

Tibor17 commented Jun 16, 2020

@michael-o @Eskibear
If you have time these days or weeks we can start analysing the code @hwellmann was working on.
We should define some order of tasks.
Lets start with the first analysis where we will find the parts where the downloads can be made parallel.

@michael-o
Copy link
Member

@michael-o No, I won't. Looking at https://issues.apache.org/jira/browse/MRESOLVER-114 will make things worse because we do not have decent concurrency support. Even if we solve this one, the synchronization will trick us anyway.

@Eskibear
Copy link

@michael-o I just glanced at MRESOLVER-114, and I don't think it's related with MRESOLVER-7.
Learning for the comments, MRESOLVER-114 is about resolving artifacts in parallel.
While here in MRESOLVER-7, the problem is, when the list of dependencies is already known, it performs downloading of the corresponding pom files in a single thread. Given you guys have more context information, correct me if I'm wrong.
What I see is, networking is the bottleneck. It would be a great performance boost even if we just make the downloading parallel.

Lets start with the first analysis where we will find the parts where the downloads can be made parallel.

@Tibor17 Agree with you that we should do some first analysis. Below is what I find in master branch, where downloading dependencies is mono-threaded. As far as I learn, basic idea of this PR was to make below logic parallel.

for ( Dependency dependency : dependencies )
{
processDependency( args, results, repositories, depSelector, depManager, depTraverser, verFilter,
dependency );
}

@michael-o
Copy link
Member

Please move discussion to the ticket because this PR has been closed.

@Tibor17
Copy link
Contributor

Tibor17 commented Jun 17, 2020

@Eskibear
We have to find the place where the files are physically downloaded, because it is the place which should run in parallel. The caller methods should not go to the thread pool because Maven's implementations of objects are not thread safe. I do not want to make al of them immutable. And why to call the Java mess in forked thread. It is not a performance penalty. The penatly we pay is the physical download.

@Tibor17
Copy link
Contributor

Tibor17 commented Jun 17, 2020

@Eskibear
If I am right, here is one of the low level places. I do not like shared objects on instance and here is the shared state.

@michael-o
Copy link
Member

Maven does not use the HTTP Transporter. We use Resolver via Wagon only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants