fix(java): use Maven mirror, remove unnecessary plugins, other CI-stability improvements #2253

Merged: rapids-bot[bot] merged 15 commits into main from fix/sonatype on Jun 24, 2026

@jameslamb jameslamb commented Jun 22, 2026

Member

Contributes to NVIDIA/cuvs-lucene#145

Java builds have been failing like this:

Error:  Unresolveable build extension: Plugin org.sonatype.plugins:nexus-staging-maven-plugin:1.6.7 or one of its dependencies could not be resolved:
	Failed to read artifact descriptor for org.sonatype.plugins:nexus-staging-maven-plugin:jar:1.6.7

This is hiding the true issue ... we're getting rate-limited by Maven Central, with requests hitting 429 (Too Many Requests).

This proposes the following changes to fix that:

  • switching to the unofficial Maven Central read-only mirror on GCS
  • removing unnecessary dependencies from pom.xml
  • configuring Maven to download more slowly and wait longer between retries
  • fine-tuning changed-files rules to reduce how often Java CI runs on PRs

And these other changes relevant to the Java builds:

  • validating XML in pre-commit hooks (this did catch a syntax error in the Java examples' pom.xml!)
  • removing unnecessary configuration for snapshots (this project doesn't publish or download snapshots)
  • reformatting Java READMEs per https://yihui.org/en/2021/06/markdown-breath/

See inline comments for more details.

Notes for Reviewers

Impact

I don't believe there are any breaking changes in this PR. Our existing publishing process for cuvs-java should be unaffected and CI should become faster and more reliable.

If anything breaks because of it, assume that was unintentional.

AI use

I'm not very familiar with Java packaging, so I relied heavily on back-and-forth with an agent for help with this.

I wrote this PR description, code comments, and all inline review comments here myself.

My motivation is to unblock CI here and in cuvs-lucene. If a more qualified reviewer wants to close this and implement a different fix, please do.

@jameslamb jameslamb added the improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change), and Java labels Jun 22, 2026
@jameslamb jameslamb changed the title from "WIP: fix(java): update publishing plugin" to "WIP: fix(java): switch publishing plugin to 'central-publishing-maven-plugin'" Jun 22, 2026
Comment thread java/cuvs-java/pom.xml Outdated
<artifactId>nexus-staging-maven-plugin</artifactId>
<version>1.6.7</version>
<groupId>org.sonatype.central</groupId>
<artifactId>central-publishing-maven-plugin</artifactId>

Member Author

Seeing a similar error:

[INFO] Scanning for projects...
Downloading from central: https://repo.maven.apache.org/maven2/org/sonatype/central/central-publishing-maven-plugin/0.9.0/central-publishing-maven-plugin-0.9.0.pom
Error: ] Some problems were encountered while processing the POMs:
Error:  Unresolveable build extension: Plugin org.sonatype.central:central-publishing-maven-plugin:0.9.0 or one of its dependencies could not be resolved:
	Failed to read artifact descriptor for org.sonatype.central:central-publishing-maven-plugin:jar:0.9.0
 @ 
 @ 
Error:  The build could not read 1 project -> [Help 1]
Error:    
Error:    The project com.nvidia.cuvs:cuvs-java:26.08.0 (/__w/cuvs/cuvs/java/cuvs-java/pom.xml) has 1 error
Error:      Unresolveable build extension: Plugin org.sonatype.central:central-publishing-maven-plugin:0.9.0 or one of its dependencies could not be resolved:
Error:      	Failed to read artifact descriptor for org.sonatype.central:central-publishing-maven-plugin:jar:0.9.0
Error:      -> [Help 2]
Error:  
Error:  To see the full stack trace of the errors, re-run Maven with the -e switch.
Error:  Re-run Maven using the -X switch to enable full debug logging.
Error:  
Error:  For more information about the errors and possible solutions, please read the following articles:
Error:  [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
Error:  [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/PluginManagerException

https://github.com/rapidsai/cuvs/actions/runs/27970909678/job/82782494961?pr=2253

Makes me think that maybe there's something else happening here, like maybe an SSL error downloading stuff from one of the package repositories or something. Will try re-running with more verbose logs.

Member Author

Debug logs show the real problem... the package repository is responding with 429s. We're getting rate-limited.

Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: The following artifacts could not be resolved: org.sonatype.central:central-publishing-maven-plugin:pom:0.9.0 (absent): Could not transfer artifact org.sonatype.central:central-publishing-maven-plugin:pom:0.9.0 from/to central (https://repo.maven.apache.org/maven2): status code: 429, reason phrase: Too Many Requests (429)
at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve (DefaultArtifactResolver.java:474)

https://github.com/rapidsai/cuvs/actions/runs/27974250285/job/82792307295?pr=2253#step:13:5175

This would also explain why we OFTEN but not ALWAYS see this issue in cuvs / cuvs-lucene CI!

We should probably keep the -e option to mvn so errors like this aren't swallowed in the future. Seeing that traceback would have saved a lot of time here.

@NVIDIA NVIDIA deleted a comment from copy-pr-bot Bot Jun 23, 2026
Comment thread java/cuvs-java/pom.xml
<id>ossrh</id>
<url>https://oss.sonatype.org/content/repositories/snapshots</url>
</snapshotRepository>
</distributionManagement>

Member Author

This project doesn't distribute any snapshots... you'll only see releases at https://central.sonatype.com/artifact/com.nvidia.cuvs/cuvs-java/versions

And it doesn't rely on downloading any snapshots of dependencies.

So this configuration is totally unnecessary. As more evidence of that... it references a repo (https://oss.sonatype.org/) that doesn't even exist any more.

$ curl -I https://oss.sonatype.org
HTTP/2 404

Contributor

> This project doesn't distribute any snapshots... you'll only see releases at

We are actually going to start deploying nightly snapshots shortly. I know this doesn't change much in this PR (and we still need to ditch oss.sonatype.org), but I want to point this out because we have several customers asking us to distribute snapshots (some of them cannot build our code).

Member Author

Yep makes sense, whatever configuration is needed can be added as part of the work of publishing snapshots.

Comment thread java/cuvs-java/pom.xml
</execution>
</executions>
</plugin>
<plugin>

Member Author

mvn verify attempts to download any dependencies involved in the build "lifecycle" up to but not including deploy (https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html#usual-command-line-calls).

It will also try to download and check any plugins, for any stage, that are declared with <extensions>true</extensions>.

That's why the original build failures we saw mentioned this nexus-staging-maven-plugin package.

We can just remove this... the release process for cuvs-java doesn't use mvn deploy, it directly hits the Sonatype API with a command similar to this:

curl \
  -XPOST \
  -F bundle=@com.nvidia.cuvs:cuvs-java:26.06.0.zip \
  https://central.sonatype.com/api/v1/publisher/upload

As described at https://central.sonatype.org/publish/publish-portal-api/#uploading-a-deployment-bundle

(publishing code is private, but can link offline for reviewers)
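
For reference, a minimal sketch of the kind of plugin declaration described above. The coordinates come from the original build error and the `<extensions>` flag from the explanation above; any nested configuration the real entry had is not shown in this thread and is left out here.

```xml
<!-- Sketch only: sits inside <build><plugins> in pom.xml. -->
<!-- Because of <extensions>true</extensions>, Maven has to resolve this plugin -->
<!-- while loading the project, even for goals that never reach "deploy". -->
<plugin>
  <groupId>org.sonatype.plugins</groupId>
  <artifactId>nexus-staging-maven-plugin</artifactId>
  <version>1.6.7</version>
  <extensions>true</extensions>
</plugin>
```

Removing the whole `<plugin>` entry means `mvn verify` no longer has to fetch it, which is one fewer set of requests to the package repository.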

Comment thread java/cuvs-java/pom.xml
<repository>
<id>gcs-maven-central-mirror</id>
<name>GCS Maven Central mirror</name>
<url>https://maven-central.storage-download.googleapis.com/maven2/</url>

Member Author

This read-only mirror of Maven Central faces much higher (or no?) rate limits.

Its docs at https://storage-download.googleapis.com/maven-central/index.html say "This is not an officially supported Google product.", but I think we can trust it... it was first set up back in 2015 by the creator of Maven (http://takari.io/2015/10/28/google-maven-central.html) with enough official Google backing that it was announced on GCP's tech blog: https://cloudplatform.googleblog.com/2015/11/faster-builds-for-Java-developers-with-Maven-Central-mirror.html

It is used by big, high-profile Java projects like Apache Beam, Gluten, Lucene, Orc, and Spark: https://github.com/search?q=org%3Aapache%20%22maven-central.storage-download.googleapis.com%22&type=code

And was recently adopted by cuDF: rapidsai/cudf#22875

Comment thread java/cuvs-java/pom.xml
</repository>
</repositories>

<pluginRepositories>

Member Author

This configuration has to be duplicated, once for regular dependencies and once for plugins.

See https://maven.apache.org/pom.html#plugin-repositories
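
A minimal sketch of what that duplication could look like, placed inside `<project>`. The mirror `id`, `name`, and `url` are the ones shown in this PR; the `central` fallback entries and their ordering are assumptions based on the review summary's mention of a "Central fallback", not a copy of the merged file.

```xml
<!-- Used when resolving regular dependencies. -->
<repositories>
  <repository>
    <id>gcs-maven-central-mirror</id>
    <name>GCS Maven Central mirror</name>
    <url>https://maven-central.storage-download.googleapis.com/maven2/</url>
  </repository>
  <!-- Assumed fallback: Maven Central itself, listed after the mirror. -->
  <repository>
    <id>central</id>
    <url>https://repo.maven.apache.org/maven2</url>
  </repository>
</repositories>

<!-- Plugins are resolved from a separate list, so the entries are repeated. -->
<pluginRepositories>
  <pluginRepository>
    <id>gcs-maven-central-mirror</id>
    <name>GCS Maven Central mirror</name>
    <url>https://maven-central.storage-download.googleapis.com/maven2/</url>
  </pluginRepository>
  <!-- Assumed fallback, mirroring the <repositories> block above. -->
  <pluginRepository>
    <id>central</id>
    <url>https://repo.maven.apache.org/maven2</url>
  </pluginRepository>
</pluginRepositories>
```

Maven consults repositories in the order they are declared, so listing the mirror first should send most requests there.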

Comment thread java/examples/pom.xml
SPDX-License-Identifier: Apache-2.0
-->

<?xml version="1.0" encoding="UTF-8"?>

Member Author

The check-xml pre-commit hook flagged this.

check xml................................................................Failed
- hook id: check-xml
- exit code: 1

java/examples/pom.xml: Failed to xml parse (java/examples/pom.xml:5:0: XML or text declaration not at start of entity)

If it's included, I think the <?xml ... ?> declaration needs to be the very first thing in the file, before the license comment.

But it can just be omitted:

  • the other pom.xml files in this repo don't have it and are working fine
  • the UTF-8 encoding is declared elsewhere
  • most XML parsers assume XML 1.0, so I think it's safe to omit it
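
For comparison, a sketch of the ordering that should also parse if the declaration were kept instead of removed: the declaration first, then the license comment. The `groupId`, `artifactId`, and `version` values are hypothetical placeholders, not the real coordinates of the examples project.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
SPDX-License-Identifier: Apache-2.0
-->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- Hypothetical placeholder coordinates, for illustration only. -->
  <groupId>com.example</groupId>
  <artifactId>example</artifactId>
  <version>0.0.0</version>
</project>
```

This PR takes the simpler route of dropping the declaration, which keeps the file consistent with the other pom.xml files in the repo.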

@NVIDIA NVIDIA deleted a comment from copy-pr-bot Bot Jun 23, 2026
Comment thread java/benchmarks/README.md
This maven project contains JMH benchmarks for the CAGRA Java API.

## Prerequisites

Member Author

All of these whitespace changes are intentional. I was reading these docs and found that all the lines being smooshed together made it slightly harder to find what I was looking for.

This can also prevent some renderers from formatting them correctly, see https://yihui.org/en/2021/06/markdown-breath/

Happy to revert these style changes if reviewers disagree with them.

Comment thread java/cuvs-java/.mvn/maven.config
-Daether.connector.basic.downstreamThreads=1
-Daether.transport.http.retryHandler.count=5
-Daether.transport.http.retryHandler.interval=10000
-Dmaven.wagon.http.retryHandler.count=5

Member Author

This will affect every mvn {something} call from the cuvs-java directory. There are several, so using a shared config file seemed preferable to supplying these as CLI arguments.

In short, this is basically "loudly report errors, and make requests to package repositories more slowly"

Explanations:

  • -e
    • print stacktrace on failures (this would have helped us identify the rate-limiting issue much sooner!)
  • -B
    • "batch" mode, basically "don't print thousands of lines of progress bars"
  • -Daether.connector.basic.downstreamThreads=1
    • only download 1 package at a time
  • -Daether.transport.http.retryHandler.interval=10000
    • wait 10000 ms (10 seconds) between retries
  • -Daether.transport.http.retryHandler.count=5 and -Dmaven.wagon.http.retryHandler.count=5
    • retry a failed request up to 5 times

Comment thread java/examples/pom.xml
</dependency>

<dependency>
<groupId>com.diffplug.spotless</groupId>

Member Author

This is a build-time plugin, already declared in the file. Having it in <dependencies> too is unnecessary.
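
A rough sketch of the single declaration that remains, inside `<build><plugins>`. The `groupId` and `artifactId` match the ones named in this PR; the `${spotless.version}` property is a hypothetical stand-in, since the real version is not shown in this thread.

```xml
<build>
  <plugins>
    <!-- Build-time formatter: declared as a plugin, not under <dependencies>. -->
    <plugin>
      <groupId>com.diffplug.spotless</groupId>
      <artifactId>spotless-maven-plugin</artifactId>
      <!-- Hypothetical property; substitute the version the project pins. -->
      <version>${spotless.version}</version>
    </plugin>
  </plugins>
</build>
```

Entries under `<dependencies>` end up on the project's classpath, which a formatting plugin does not need.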

Comment thread java/examples/pom.xml
<version>3.13.0</version>
</plugin>
<plugin>
<artifactId>maven-surefire-plugin</artifactId>

Member Author

We don't run tests, build javadocs, or publish jars of the code in examples/. It doesn't need entries for plugins used for those purposes.

@jameslamb jameslamb changed the title from "WIP: fix(java): switch publishing plugin to 'central-publishing-maven-plugin'" to "fix(java): use Maven mirror, remove unnecessary plugins, other CI-stability improvements" Jun 23, 2026
@jameslamb jameslamb marked this pull request as ready for review June 23, 2026 19:26
@jameslamb jameslamb requested review from a team as code owners June 23, 2026 19:26
@jameslamb jameslamb requested a review from gforsyth June 23, 2026 19:26
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Documentation

    • Improved formatting and structure of Java setup and benchmark guides for better readability.
  • Chores

    • Enhanced CI/CD pipeline configuration for improved change detection and testing.
    • Updated build system configuration and pre-commit code quality checks.

Walkthrough

The PR updates .github/workflows/pr.yaml changed-files exclusion patterns to narrow job triggers, bumps pre-commit-hooks to v6.0.0 and adds check-xml, migrates all three Java subproject POMs from Sonatype Nexus staging to a GCS Maven Central mirror with Central fallback, introduces a maven.config with retry/batch settings, and reformats several Java READMEs.

Changes

CI and Tooling Updates

| Layer / File(s) | Summary |
|---|---|
| pr.yaml changed-files exclusion pattern updates (.github/workflows/pr.yaml) | Adds CODEOWNERS, Dockerfile, README.md, SECURITY.md, `conda/*`, `examples/**`, and pyproject.toml negations to several files_yaml exclusion groups, and expands the workflow YAML exclusion sub-list to cover C-ABI, labeler, and baseline workflows. |
| pre-commit-hooks v6.0.0 bump and check-xml hook (.pre-commit-config.yaml) | Updates pre-commit/pre-commit-hooks revision from v5.0.0 to v6.0.0 and enables the check-xml hook alongside the existing hooks. |

Java Maven Build Infrastructure and Docs

| Layer / File(s) | Summary |
|---|---|
| cuvs-java POM: GCS mirror repos, Nexus removal, SPDX header (java/cuvs-java/pom.xml) | Replaces the distributionManagement block and nexus-staging-maven-plugin with repositories/pluginRepositories pointing to a GCS Maven Central mirror with a Central fallback; updates the license header to SPDX format. |
| New maven.config with batch and retry settings (java/cuvs-java/.mvn/maven.config) | Creates a new Maven config file enabling -e and -B flags and setting Aether/Wagon HTTP retry counts to 5, retry interval to 10000ms, and downstream thread count to 1. |
| benchmarks and examples POMs: GCS mirror repos and plugin cleanup (java/benchmarks/pom.xml, java/examples/pom.xml) | Adds GCS mirror repositories/pluginRepositories to both POMs; removes managed deploy/install/javadoc/site/source/surefire plugins from benchmarks; removes spotless-maven-plugin dependency, maven-surefire-plugin entry, and XML declaration from examples. |
| Java README formatting updates (java/README.md, java/benchmarks/README.md, java/examples/README.md) | Adds fenced sh code blocks to the Testing section in java/README.md, introduces a "Prerequisites" subsection in benchmarks/README.md, and adjusts blank-line spacing in examples/README.md; command text is unchanged. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main changes: switching to Maven mirror, removing unnecessary plugins, and making CI-stability improvements in Java builds. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the root cause of CI failures, proposing specific solutions, and listing all the changes made across multiple files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@java/examples/README.md`:
- Around line 16-18: All fenced code blocks in the README.md file that contain
shell commands are missing language specifications. For each of the three code
blocks containing mvn package commands (the ones for CagraExample, HnswExample,
and BruteForceExample), change the opening fence from ``` to ```sh to properly
declare the shell language. This will enable proper syntax highlighting and
comply with the markdownlint MD040 rule.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e12c9642-8539-4f29-b57a-19cad4f754ab

📥 Commits

Reviewing files that changed from the base of the PR and between 71306ec and 67d7633.

📒 Files selected for processing (9)
  • .github/workflows/pr.yaml
  • .pre-commit-config.yaml
  • java/README.md
  • java/benchmarks/README.md
  • java/benchmarks/pom.xml
  • java/cuvs-java/.mvn/maven.config
  • java/cuvs-java/pom.xml
  • java/examples/README.md
  • java/examples/pom.xml

Comment thread java/examples/README.md

@bdice bdice left a comment

Contributor

Excellent. All makes sense to me.

@cjnolet cjnolet left a comment

Contributor

LGTM, thanks so much @jameslamb for figuring out a fix!

Comment thread java/cuvs-java/pom.xml
<distributionManagement>
<snapshotRepository>
<id>ossrh</id>
<url>https://oss.sonatype.org/content/repositories/snapshots</url>

I assume deploying/distributing the cuvs-java JARs doesn't depend on this part? If so, it's good to remove it.

Member Author

That's right, as of now we don't pull or publish snapshots.

As Corey pointed out in #2253 (comment) we may do that in the future, but for now this is unused.

@NvTimLiu NvTimLiu left a comment

LGTM for pom.xml from CI point of view

@jameslamb

Member Author

Talked with @jolorunyomi offline and it seems like this should not impact our release process for cuvs-java. I'll merge this and go make similar changes in the other repos with Java CI.

Thanks for the help and reviews everyone! This was educational for me 😊

@jameslamb

Member Author

/merge

@rapids-bot rapids-bot Bot merged commit c854ea3 into main Jun 24, 2026
164 of 166 checks passed
@jameslamb jameslamb deleted the fix/sonatype branch June 24, 2026 15:46
rapids-bot Bot pushed a commit to rapidsai/cudf that referenced this pull request Jun 24, 2026
Contributes to rapidsai/build-planning#297

Follow-up to #22875

Java CI on projects using NVIDIA's hosted GitHub Actions runners has been getting rate-limited by Maven Central. Similar to NVIDIA/cuvs#2253, this proposes fixing that by using the same read-only Maven mirror that Apache Orc, Lucene, Spark and others use in their builds (https://storage-download.googleapis.com/maven-central/index.html).

#22875 did that by adding a custom `settings.xml` and passing it to `mvn -s`. This moves those settings into the project's `pom.xml` so it'll affect all `mvn` invocations, not just that script.

Other changes:

* adding default `mvn` options to request packages more slowly and wait longer between retries
* adds `check-xml` pre-commit hook to validate that `pom.xml` is valid XML
* other minor `pom.xml` cleanup (see comments)

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Nghia Truong (https://github.com/ttnghia)

URL: #22969
rapids-bot Bot pushed a commit to rapidsai/kvikio that referenced this pull request Jun 24, 2026
Contributes to rapidsai/build-planning#297

RAPIDS projects using NVIDIA-hosted GitHub Actions runners have been getting rate-limited by Maven Central. Similar to NVIDIA/cuvs#2253, this proposes fixing that by using the same read-only Maven mirror that Apache Orc, Lucene, Spark and others use in their builds (https://storage-download.googleapis.com/maven-central/index.html).

Other changes:

* adding default `mvn` options to request packages more slowly and wait longer between retries
* updating RAPIDS `pre-commit` hooks
* adding `check-xml` hook to validate that `pom.xml` files are valid XML docs
* reformatting READMEs per https://yihui.org/en/2021/06/markdown-breath/
* other minor `pom.xml` maintenance (see comments)

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Gil Forsyth (https://github.com/gforsyth)

URL: #992
rapids-bot Bot pushed a commit to NVIDIA/cuvs-lucene that referenced this pull request Jun 26, 2026
…#166)

Fixes #145 

Contributes to rapidsai/build-planning#297

The root cause of #145 appears to be that we were getting rate-limited by Maven Central. Similar to NVIDIA/cuvs#2253, this proposes fixing that by using the same read-only Maven mirror that Apache Orc, Lucene, Spark and others use in their builds (https://storage-download.googleapis.com/maven-central/index.html).

Other changes:

* fixes microbenchmarks version (was still 26.02 because it used the cuVS pattern for version replacement)
* adding default `mvn` options to request packages more slowly and wait longer between retries
* updating all `pre-commit` hooks w/ `pre-commit autoupdate`
* adding `check-xml` hook to validate that `pom.xml` files are valid XML docs
* reformatting READMEs per https://yihui.org/en/2021/06/markdown-breath/

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - https://github.com/jakirkham
  - Bradley Dice (https://github.com/bdice)

URL: #166