Make Voyager index and names filenames configurable#5914

Merged
clairemcginty merged 6 commits intospotify:mainfrom
dylanrb123:dbannon/voyager-configurable-filenames
Apr 1, 2026

@dylanrb123
Contributor

Currently, the Voyager integration assumes that the index file and names file are always named index.hnsw and names.json, respectively. This is not always the case (e.g. sharded indices), so we want to support user-provided filenames. This PR adds that support: users can now specify the filenames of the index file and names file located at the storage URI they already provide.
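
To make the intent concrete, here is a minimal sketch of how default and custom filenames could resolve against a storage URI. `IndexLocation`, its `resolve` helper, the bucket path, and the shard filenames are hypothetical and exist only for illustration; this is not the actual scio `VoyagerUri`. Only the parameter names `indexFile` and `namesFile` and their default values come from this PR.

```scala
// Hypothetical stand-in for illustration only; this is NOT scio's VoyagerUri.
// Only the parameter names and default filenames are taken from the PR text.
final case class IndexLocation(
  path: String,
  indexFile: String = "index.hnsw",
  namesFile: String = "names.json"
) {
  // Join the base path and a filename with exactly one separating slash.
  private def resolve(file: String): String =
    s"${path.stripSuffix("/")}/$file"

  def indexPath: String = resolve(indexFile)
  def namesPath: String = resolve(namesFile)
}

object IndexLocationExample {
  // Call sites that pass only a path still resolve to the default filenames.
  val default: IndexLocation = IndexLocation("gs://my-bucket/voyager")

  // A sharded layout can name its files explicitly (shard names are made up).
  val shard: IndexLocation = IndexLocation(
    "gs://my-bucket/voyager/",
    indexFile = "index-00001.hnsw",
    namesFile = "names-00001.json"
  )
}
```

Because both new parameters have defaults, source code that passes only the path should keep compiling unchanged, while sharded layouts opt in by naming their files.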

dylanrb123 and others added 5 commits March 31, 2026 15:03
Add optional indexFile and namesFile parameters to VoyagerUri and
asVoyagerSideInput, defaulting to the existing "index.hnsw" and
"names.json" values. This allows users to specify custom filenames
when the index files at a given path don't use the default names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verifies end-to-end write and read via sc.voyagerSideInput using
custom index and names filenames on VoyagerUri.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verifies that the voyagerSideInput(uri) overload (which reads index
settings from v2 metadata) works correctly with custom filenames.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@codecov

codecov bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 63.63636% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.70%. Comparing base (d59fbba) to head (d497624).
⚠️ Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../scio/extra/voyager/syntax/SCollectionSyntax.scala | 40.00% | 3 Missing ⚠️ |
| ...scala/com/spotify/scio/extra/voyager/Voyager.scala | 83.33% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5914      +/-   ##
==========================================
+ Coverage   61.54%   61.70%   +0.15%     
==========================================
  Files         317      317              
  Lines       11653    11654       +1     
  Branches      822      815       -7     
==========================================
+ Hits         7172     7191      +19     
+ Misses       4481     4463      -18     

Add binary compatibility filters for the 46 incompatibilities caused by
VoyagerUri changing from AnyVal to a regular case class. Verified against
0.15.3. Also adds a test for the no-settings voyagerSideInput overload
with custom filenames.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
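
A likely reason the class could not stay an `AnyVal`: a Scala value class may wrap exactly one field, so adding `indexFile` and `namesFile` requires a regular case class. The sketch below illustrates that constraint with hypothetical types (`UriBefore`, `UriAfter`, and `describe` are not the real scio definitions).

```scala
object ValueClassSketch {
  // Before: a value class, which must have exactly one field.
  final case class UriBefore(path: String) extends AnyVal

  // Keeping `extends AnyVal` while adding fields is rejected by the compiler:
  // final case class Broken(path: String, indexFile: String) extends AnyVal

  // After: a regular case class that can carry the extra, defaulted fields.
  final case class UriAfter(
    path: String,
    indexFile: String = "index.hnsw",
    namesFile: String = "names.json"
  )

  // At the JVM level this method takes a UriAfter object. A method taking
  // UriBefore would instead be compiled to take the underlying String.
  def describe(uri: UriAfter): String =
    s"${uri.path} [${uri.indexFile}, ${uri.namesFile}]"
}
```

Since value classes are erased to their underlying type in method signatures, changing the class kind alters the compiled signatures of methods that accept or return it. That is consistent with the change surfacing as many binary incompatibilities even though source-level usage with a single path argument looks the same.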
@clairemcginty clairemcginty merged commit d88b8e9 into spotify:main Apr 1, 2026
11 checks passed