fix(java): use Maven mirror, remove unnecessary plugins, other CI-stability improvements #2253
| <artifactId>nexus-staging-maven-plugin</artifactId> | ||
| <version>1.6.7</version> | ||
| <groupId>org.sonatype.central</groupId> | ||
| <artifactId>central-publishing-maven-plugin</artifactId> |
There was a problem hiding this comment.
Seeing a similar error:
[INFO] Scanning for projects...
Downloading from central: https://repo.maven.apache.org/maven2/org/sonatype/central/central-publishing-maven-plugin/0.9.0/central-publishing-maven-plugin-0.9.0.pom
Error: ] Some problems were encountered while processing the POMs:
Error: Unresolveable build extension: Plugin org.sonatype.central:central-publishing-maven-plugin:0.9.0 or one of its dependencies could not be resolved:
Failed to read artifact descriptor for org.sonatype.central:central-publishing-maven-plugin:jar:0.9.0
@
@
Error: The build could not read 1 project -> [Help 1]
Error:
Error: The project com.nvidia.cuvs:cuvs-java:26.08.0 (/__w/cuvs/cuvs/java/cuvs-java/pom.xml) has 1 error
Error: Unresolveable build extension: Plugin org.sonatype.central:central-publishing-maven-plugin:0.9.0 or one of its dependencies could not be resolved:
Error: Failed to read artifact descriptor for org.sonatype.central:central-publishing-maven-plugin:jar:0.9.0
Error: -> [Help 2]
Error:
Error: To see the full stack trace of the errors, re-run Maven with the -e switch.
Error: Re-run Maven using the -X switch to enable full debug logging.
Error:
Error: For more information about the errors and possible solutions, please read the following articles:
Error: [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
Error: [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/PluginManagerException
https://github.com/rapidsai/cuvs/actions/runs/27970909678/job/82782494961?pr=2253
Makes me think that maybe there's something else happening here, like maybe an SSL error downloading stuff from one of the package repositories or something. Will try re-running with more verbose logs.
There was a problem hiding this comment.
Debug logs show the real problem... the package repository is responding with 429s. We're getting rate-limited.
Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: The following artifacts could not be resolved: org.sonatype.central:central-publishing-maven-plugin:pom:0.9.0 (absent): Could not transfer artifact org.sonatype.central:central-publishing-maven-plugin:pom:0.9.0 from/to central (https://repo.maven.apache.org/maven2): status code: 429, reason phrase: Too Many Requests (429)
at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve (DefaultArtifactResolver.java:474)
https://github.com/rapidsai/cuvs/actions/runs/27974250285/job/82792307295?pr=2253#step:13:5175
This would also explain why we OFTEN but not ALWAYS see this issue in cuvs / cuvs-lucene CI!
We should probably keep the -e option to mvn so errors like this aren't swallowed in the future. Seeing that traceback would have saved a lot of time here.
| <id>ossrh</id> | ||
| <url>https://oss.sonatype.org/content/repositories/snapshots</url> | ||
| </snapshotRepository> | ||
| </distributionManagement> |
There was a problem hiding this comment.
This project doesn't distribute any snapshots... you'll only see releases at https://central.sonatype.com/artifact/com.nvidia.cuvs/cuvs-java/versions
And it doesn't rely on downloading any snapshots of dependencies.
So this configuration is totally unnecessary. As more evidence of that... it references a repo (https://oss.sonatype.org/) that doesn't even exist any more.
$ curl -I https://oss.sonatype.org
HTTP/2 404
There was a problem hiding this comment.
This project doesn't distribute any snapshots... you'll only see releases at
We are actually going to be deploying nightly snapshots shortly. I know this doesn't change much in this PR (and we still need to ditch oss.sonatype.org), but I want to point this out because we have several customers asking us to distribute snapshots (some of them cannot build our code).
There was a problem hiding this comment.
Yep makes sense, whatever configuration is needed can be added as part of the work of publishing snapshots.
| </execution> | ||
| </executions> | ||
| </plugin> | ||
| <plugin> |
There was a problem hiding this comment.
mvn verify attempts to download any dependencies involved in the build "lifecycle" up to but not including deploy (https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html#usual-command-line-calls).
It will also try to download and resolve any plugin declared with <extensions>true</extensions>, regardless of which lifecycle phase that plugin is used in.
That's why the original build failures we saw mentioned this nexus-staging-maven-plugin package.
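To illustrate the point above: a plugin declared roughly like this is treated as a build extension, so Maven has to resolve it while reading the POM, before any lifecycle phase runs. This is a minimal sketch, not the exact block removed in this PR. The artifactId and version come from the diff; the groupId and the surrounding `<build>` layout are not shown there and are assumed here.

```xml
<!-- Sketch only: groupId and surrounding layout are assumed, not copied from the PR. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.sonatype.plugins</groupId>
      <artifactId>nexus-staging-maven-plugin</artifactId>
      <version>1.6.7</version>
      <!-- This flag makes Maven load the plugin as a build extension when the
           project model is built, so even "mvn verify" must download it. -->
      <extensions>true</extensions>
    </plugin>
  </plugins>
</build>
```

If the download of such a plugin fails (for example because of a 429 response), the build stops with an "Unresolveable build extension" error like the one quoted earlier in this thread.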
We can just remove this... the release process for cuvs-java doesn't use mvn deploy, it directly hits the Sonatype API with a command similar to this:
curl \
-XPOST \
-F bundle=@com.nvidia.cuvs:cuvs-java:26.06.0.zip \
https://central.sonatype.com/api/v1/publisher/upload
As described at https://central.sonatype.org/publish/publish-portal-api/#uploading-a-deployment-bundle
(publishing code is private, but can link offline for reviewers)
| <repository> | ||
| <id>gcs-maven-central-mirror</id> | ||
| <name>GCS Maven Central mirror</name> | ||
| <url>https://maven-central.storage-download.googleapis.com/maven2/</url> |
There was a problem hiding this comment.
This read-only mirror of Maven Central faces much higher (or no?) rate limits.
Its docs at https://storage-download.googleapis.com/maven-central/index.html say "This is not an officially supported Google product.", but I think we can trust it... it was first set up back in 2015 by the creator of Maven (http://takari.io/2015/10/28/google-maven-central.html) with enough official Google backing that it was announced on GCP's tech blog: https://cloudplatform.googleblog.com/2015/11/faster-builds-for-Java-developers-with-Maven-Central-mirror.html
It is used by big, high-profile Java projects like Apache Beam, Gluten, Lucene, Orc, and Spark: https://github.com/search?q=org%3Aapache%20%22maven-central.storage-download.googleapis.com%22&type=code
And was recently adopted by cuDF: rapidsai/cudf#22875
| </repository> | ||
| </repositories> | ||
| <pluginRepositories> |
There was a problem hiding this comment.
This configuration has to be duplicated, once for regular dependencies and once for plugins.
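For reference, the duplication looks roughly like the sketch below. The id, name, and URL are taken from the diff above; reusing the same id in both blocks and the enclosing `<project>` skeleton are assumptions for illustration.

```xml
<!-- Sketch only: the <project> wrapper is a placeholder for the real pom.xml. -->
<project>
  <!-- Used when resolving regular dependencies. -->
  <repositories>
    <repository>
      <id>gcs-maven-central-mirror</id>
      <name>GCS Maven Central mirror</name>
      <url>https://maven-central.storage-download.googleapis.com/maven2/</url>
    </repository>
  </repositories>

  <!-- Used when resolving plugins and build extensions. Maven does not
       consult <repositories> for these, hence the second block. -->
  <pluginRepositories>
    <pluginRepository>
      <id>gcs-maven-central-mirror</id>
      <name>GCS Maven Central mirror</name>
      <url>https://maven-central.storage-download.googleapis.com/maven2/</url>
    </pluginRepository>
  </pluginRepositories>
</project>
```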
| SPDX-License-Identifier: Apache-2.0 | ||
| --> | ||
| <?xml version="1.0" encoding="UTF-8"?> |
There was a problem hiding this comment.
The check-xml pre-commit hook flagged this.
check xml................................................................Failed
- hook id: check-xml
- exit code: 1
java/examples/pom.xml: Failed to xml parse (java/examples/pom.xml:5:0: XML or text declaration not at start of entity)
If it's included, I think a <?xml type of block needs to be the first line of the file.
But it can just be omitted:
- the other pom.xml files in this repo don't have it and are working fine
- the UTF-8 encoding is declared elsewhere
- most XML parsers assume XML 1.0, I think it's safe to omit it
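If the declaration were kept instead of removed, a minimal sketch of the ordering the parser expects would look like this. The license text matches the header visible in the diff; the `<project>` skeleton is a placeholder and not the real contents of java/examples/pom.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  SPDX-License-Identifier: Apache-2.0
-->
<!-- The declaration above must be the very first thing in the file.
     Comments such as the license header may only come after it. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
</project>
```

Placing the license comment before the `<?xml ... ?>` line is what produces the "XML or text declaration not at start of entity" error reported by the hook.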
| This maven project contains JMH benchmarks for the CAGRA Java API. | ||
| ## Prerequisites | ||
There was a problem hiding this comment.
All of these whitespace changes are intentional. I was reading these docs and found that all the lines being smooshed together made it slightly harder to find what I was looking for.
This can also prevent some renderers from formatting them correctly, see https://yihui.org/en/2021/06/markdown-breath/
Happy to revert these style changes if reviewers disagree with them.
| -Daether.connector.basic.downstreamThreads=1 | ||
| -Daether.transport.http.retryHandler.count=5 | ||
| -Daether.transport.http.retryHandler.interval=10000 | ||
| -Dmaven.wagon.http.retryHandler.count=5 |
There was a problem hiding this comment.
This will affect every mvn {something} call from the cuvs-java directory. There are several, so using a shared config file seemed preferable to supplying these as CLI arguments.
In short, this is basically "loudly report errors, and make requests to package repositories more slowly"
Explanations:
- -e: print stacktrace on failures (this would have helped us identify the rate-limiting issue much sooner!)
- -B: "batch" mode, basically "don't print thousands of lines of progress bars"
- -Daether.connector.basic.downstreamThreads=1: only download 1 package at a time
- -Daether.transport.http.retryHandler.interval=10000: wait 10 seconds between retries (default is 5 seconds)
- -Daether.transport.http.retryHandler.count=5 and -Dmaven.wagon.http.retryHandler.count=5: retry retriable failures (like timeouts) 5 times (default is 3)
  - Aether is the HTTP transport library used by newer Maven versions, Wagon is used by older Maven versions ... configuring both to ensure this isn't sensitive to maven version
  - see https://maven.apache.org/guides/mini/guide-resolver-transport.html for details
| </dependency> | ||
| <dependency> | ||
| <groupId>com.diffplug.spotless</groupId> |
There was a problem hiding this comment.
This is a build-time plugin, already declared in the file. Having it in <dependencies> too is unnecessary.
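As a rough sketch of the distinction: a build-time tool like this belongs under `<build><plugins>`, where Maven runs it, and not under `<dependencies>`, where it would be put on the project's classpath. The groupId comes from the diff; the artifactId shown is the standard Spotless Maven plugin, and the version is deliberately left out here because it is not visible in this thread.

```xml
<!-- Sketch only: layout is illustrative, not copied from the PR. -->
<build>
  <plugins>
    <plugin>
      <groupId>com.diffplug.spotless</groupId>
      <artifactId>spotless-maven-plugin</artifactId>
      <!-- <version> elided: keep whatever version the file already pins. -->
    </plugin>
  </plugins>
</build>
```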
| <version>3.13.0</version> | ||
| </plugin> | ||
| <plugin> | ||
| <artifactId>maven-surefire-plugin</artifactId> |
There was a problem hiding this comment.
We don't run tests, build javadocs, or publish jars of the code in examples/. It doesn't need entries for plugins used for those purposes.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@java/examples/README.md`:
- Around line 16-18: All fenced code blocks in the README.md file that contain
shell commands are missing language specifications. For each of the three code
blocks containing mvn package commands (the ones for CagraExample, HnswExample,
and BruteForceExample), change the opening fence from ``` to ```sh to properly
declare the shell language. This will enable proper syntax highlighting and
comply with the markdownlint MD040 rule.
bdice
left a comment
There was a problem hiding this comment.
Excellent. All makes sense to me.
cjnolet
left a comment
There was a problem hiding this comment.
LGTM, thanks so much @jameslamb for figuring out a fix!
| <distributionManagement> | ||
| <snapshotRepository> | ||
| <id>ossrh</id> | ||
| <url>https://oss.sonatype.org/content/repositories/snapshots</url> |
There was a problem hiding this comment.
I suppose deploying/distributing the cuvs-java JARs does not depend on this part? If so, it's good to remove it.
There was a problem hiding this comment.
That's right, as of now we don't pull or publish snapshots.
As Corey pointed out in #2253 (comment) we may do that in the future, but for now this is unused.
NvTimLiu
left a comment
There was a problem hiding this comment.
LGTM for pom.xml from CI point of view
Talked with @jolorunyomi offline and it seems like this should not impact our release process for
Thanks for the help and reviews everyone! This was educational for me 😊
/merge
Contributes to rapidsai/build-planning#297
Follow-up to #22875

Java CI on projects using NVIDIA's hosted GitHub Actions runners has been getting rate-limited by Maven Central. Similar to NVIDIA/cuvs#2253, this proposes fixing that by using the same read-only Maven mirror that Apache Orc, Lucene, Spark and others use in their builds (https://storage-download.googleapis.com/maven-central/index.html).

#22875 did that by adding a custom `settings.xml` and passing it to `mvn -s`. This moves those settings into the project's `pom.xml` so it'll affect all `mvn` invocations, not just that script.

Other changes:
* adding default `mvn` options to request packages more slowly and wait longer between retries
* adding `check-xml` pre-commit hook to validate that `pom.xml` is valid XML
* other minor `pom.xml` cleanup (see comments)

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Nghia Truong (https://github.com/ttnghia)

URL: #22969
Contributes to rapidsai/build-planning#297

RAPIDS projects using NVIDIA-hosted GitHub Actions runners have been getting rate-limited by Maven Central. Similar to NVIDIA/cuvs#2253, this proposes fixing that by using the same read-only Maven mirror that Apache Orc, Lucene, Spark and others use in their builds (https://storage-download.googleapis.com/maven-central/index.html).

Other changes:
* adding default `mvn` options to request packages more slowly and wait longer between retries
* updating RAPIDS `pre-commit` hooks
* adding `check-xml` hook to validate that `pom.xml` files are valid XML docs
* reformatting READMEs per https://yihui.org/en/2021/06/markdown-breath/
* other minor `pom.xml` maintenance (see comments)

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Gil Forsyth (https://github.com/gforsyth)

URL: #992
…#166)

Fixes #145
Contributes to rapidsai/build-planning#297

The root cause of #145 appears to be that we were getting rate-limited by Maven Central. Similar to NVIDIA/cuvs#2253, this proposes fixing that by using the same read-only Maven mirror that Apache Orc, Lucene, Spark and others use in their builds (https://storage-download.googleapis.com/maven-central/index.html).

Other changes:
* fixes microbenchmarks version (was still 26.02 because it used the cuVS pattern for version replacement)
* adding default `mvn` options to request packages more slowly and wait longer between retries
* updating all `pre-commit` hooks w/ `pre-commit autoupdate`
* adding `check-xml` hook to validate that `pom.xml` files are valid XML docs
* reformatting READMEs per https://yihui.org/en/2021/06/markdown-breath/

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
- https://github.com/jakirkham
- Bradley Dice (https://github.com/bdice)

URL: #166
Contributes to NVIDIA/cuvs-lucene#145
Java builds have been failing like this:
This is hiding the true issue ... we're getting rate-limited by Maven Central, with requests hitting 429 (Too Many Requests).
This proposes the following changes to fix that:
- pom.xml
- changed-files rules to reduce how often Java CI runs on PRs
And these other changes relevant to the Java builds:
- pre-commit hooks (this did catch a syntax error in the Java examples' pom.xml!)
See inline comments for more details.
Notes for Reviewers
Impact
I don't believe there are any breaking changes in this PR. Our existing publishing process for cuvs-java should be unaffected and CI should become faster and more reliable.
If anything breaks because of it, assume that was unintentional.
AI use
I'm not very familiar with Java packaging, so heavily relied on back-and-forth with an agent for help with this.
I wrote this PR description, code comments, and all inline review comments here myself.
My motivation is to unblock CI here and in cuvs-lucene. If a more qualified reviewer wants to close this and implement a different fix, please do.