fix: allow NAR URLs with dashes and underscores and improve NAR path handling #820

Closed
kalbasit wants to merge 7 commits into main from
02-09-fix_nar.parseurl_should_allow_urls_with_dashes_and_underscores

Conversation

@kalbasit
Owner

@kalbasit kalbasit commented Feb 10, 2026

NAR URLs and paths can contain dashes and underscores, but the previous regex was too restrictive. This change updates the regex in pkg/nar/url.go to correctly support these characters.

Key changes:

  • Updated the NAR URL regex to allow '-' and '_' in the filename portion.
  • Implemented NAR URL normalization to handle URLs with embedded narinfo hash prefixes, ensuring consistent lookups.
  • Refactored pkg/cache/cache.go to use normalized URLs for NAR operations.
  • Updated testdata/nar7.go and related tests to use the correct NAR path structure.
  • Improved error handling and logging around NAR URL parsing and retrieval.
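
A minimal sketch of the regex change described above. The pattern and the `matchesNarURL` helper are hypothetical stand-ins, not the actual `narRegexp` from pkg/nar/url.go; they only illustrate a character class that admits '-' and '_' in the filename portion.

```go
package main

import (
	"fmt"
	"regexp"
)

// narURLPattern is a hypothetical pattern, NOT the project's narRegexp.
// It accepts lowercase letters, digits, '-' and '_' in the filename portion,
// followed by ".nar" and an optional compression extension such as ".xz".
var narURLPattern = regexp.MustCompile(`^nar/([a-z0-9_-]+)\.nar(\.[a-z0-9]+)?$`)

// matchesNarURL reports whether u looks like a NAR URL under the pattern above.
func matchesNarURL(u string) bool {
	return narURLPattern.MatchString(u)
}

func main() {
	for _, u := range []string{
		"nar/abc123.nar.xz",     // plain hash
		"nar/abc123-def456.nar", // dash in the filename portion
		"nar/abc_123.nar.zst",   // underscore in the filename portion
		"nar/ABC!.nar",          // characters outside the class
	} {
		fmt.Printf("%-24s %v\n", u, matchesNarURL(u))
	}
}
```

Placing the dash last in the character class keeps it a literal rather than a range operator.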

fixes #806

@dosubot dosubot bot added the size:S label ("This PR changes 10-29 lines, ignoring generated files.") Feb 10, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @kalbasit, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where the nar.ParseURL() function failed to correctly parse valid NAR URLs containing dashes or underscores due to an overly strict regular expression. The changes update the parsing logic to accommodate these characters and introduce new test cases to confirm the fix, enhancing the robustness and correctness of NAR URL handling.

Highlights

  • Regular Expression Update: The narRegexp in pkg/nar/url.go has been updated to correctly parse NAR URLs that include dashes (-) and underscores (_) in their hash component.
  • Test Coverage Expansion: New test cases have been added to pkg/nar/url_test.go to ensure the nar.ParseURL() function correctly handles URLs containing dashes and underscores, validating the regex change.

Changelog

  • pkg/nar/url.go
    • Modified the narRegexp to allow dashes and underscores in the hash part of the URL.
  • pkg/nar/url_test.go
    • Added two new test cases to TestParseURL to validate NAR URLs with dashes and underscores.

@dosubot dosubot bot added the bug ("Something isn't working") and go ("Pull requests that update go code") labels Feb 10, 2026
@dosubot

dosubot bot commented Feb 10, 2026

Related Documentation

Checked 7 published document(s) in 1 knowledge base(s). No updates required.


Owner Author

kalbasit commented Feb 10, 2026

@kalbasit kalbasit added the backport release-0.8 label ("Backport PR automatically") Feb 10, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request successfully updates the narRegexp to allow dashes and underscores in NAR URLs, addressing the issue of overly strict parsing. The added test cases in url_test.go effectively validate this change. However, there's a potential inconsistency with the helper.ValidateHash function, which uses a stricter regex and might cause nar.ToFilePath to fail for URLs successfully parsed by this updated regex. Resolving it would align with the project's emphasis on code consistency.
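
To illustrate the kind of mismatch described here, the following sketch pairs a permissive parsing pattern with a stricter validation pattern. Both patterns and the function names are hypothetical; they are not the project's `narRegexp` or `helper.ValidateHash`.

```go
package main

import (
	"fmt"
	"regexp"
)

// Both patterns are hypothetical stand-ins, not the project's actual regexes.
var (
	// parsePattern is permissive: it allows '-' and '_' (parsing step).
	parsePattern = regexp.MustCompile(`^[a-z0-9_-]+$`)
	// validatePattern is strict: letters and digits only (validation step).
	validatePattern = regexp.MustCompile(`^[a-z0-9]+$`)
)

// parses reports whether the hash is accepted by the parsing step.
func parses(hash string) bool { return parsePattern.MatchString(hash) }

// validates reports whether the hash is accepted by the stricter validation step.
func validates(hash string) bool { return validatePattern.MatchString(hash) }

func main() {
	for _, h := range []string{"abc123", "abc-123", "abc_123"} {
		// A hash that parses but does not validate would fail later in the pipeline.
		fmt.Printf("%-8s parses=%v validates=%v\n", h, parses(h), validates(h))
	}
}
```

Sharing a single pattern between both steps is one way to rule out this class of divergence.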

@codecov

codecov bot commented Feb 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.50%. Comparing base (184373f) to head (b8bb677).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #820      +/-   ##
==========================================
+ Coverage   85.41%   85.50%   +0.09%     
==========================================
  Files           2        2              
  Lines         480      483       +3     
==========================================
+ Hits          410      413       +3     
  Misses         65       65              
  Partials        5        5              


@kalbasit kalbasit force-pushed the 02-09-fix_nar.parseurl_should_allow_urls_with_dashes_and_underscores branch from 564809c to 8022194 Compare February 10, 2026 00:53
@dosubot dosubot bot added the size:L label ("This PR changes 100-499 lines, ignoring generated files.") and removed the size:S label ("This PR changes 10-29 lines, ignoring generated files.") Feb 10, 2026
@kalbasit
Owner Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly fixes an issue where NAR URLs with dashes and underscores were not being parsed correctly by updating the regular expression for NAR hashes. The addition of test cases for these scenarios is a great way to ensure the fix is working as expected.

Furthermore, the pull request includes a significant and beneficial refactoring. Moving NAR and NarInfo related helper functions from the generic pkg/helper into their own dedicated packages (pkg/nar and pkg/narinfo) greatly improves the project's structure, modularity, and maintainability. All usages have been updated accordingly, which is excellent.

I have one minor suggestion to improve the clarity of an error message.

@kalbasit kalbasit enabled auto-merge (squash) February 10, 2026 00:56
@kalbasit kalbasit disabled auto-merge February 10, 2026 00:57
@kalbasit kalbasit changed the base branch from main to graphite-base/820 February 10, 2026 05:00
@kalbasit kalbasit force-pushed the 02-09-fix_nar.parseurl_should_allow_urls_with_dashes_and_underscores branch from 1343346 to e3e1852 Compare February 10, 2026 05:00
@dosubot dosubot bot added the size:S label ("This PR changes 10-29 lines, ignoring generated files.") and removed the size:L label ("This PR changes 100-499 lines, ignoring generated files.") Feb 10, 2026
@kalbasit kalbasit changed the base branch from graphite-base/820 to add-support-for-narinfo-without-filesize-filehash February 10, 2026 05:00
@kalbasit kalbasit force-pushed the add-support-for-narinfo-without-filesize-filehash branch from 0e3ffd1 to 9f40b7f Compare February 10, 2026 05:10
@kalbasit kalbasit force-pushed the 02-09-fix_nar.parseurl_should_allow_urls_with_dashes_and_underscores branch 2 times, most recently from 8ba2a15 to 9c987a4 Compare February 10, 2026 05:21
@kalbasit kalbasit force-pushed the add-support-for-narinfo-without-filesize-filehash branch 2 times, most recently from b620ca6 to 9ffcfa0 Compare February 10, 2026 05:24
auto-merge was automatically disabled February 10, 2026 19:29

Pull request was closed

kalbasit added a commit that referenced this pull request Feb 10, 2026
Nix-serve sometimes serves the NarURL with the narinfo hash included as
a prefix to the NAR hash (e.g., "nar/<hash>-<narhash>.nar"). This causes
issues for ncps which expects the NarURL to point directly to the NAR
file.

Modified GetNarInfo in the upstream cache to remove the narinfo hash and
the subsequent hyphen from the URL using strings.ReplaceAll.
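
A minimal sketch of that stripping step, assuming the narinfo hash is known to the caller. `stripNarInfoHash` and the sample hashes are hypothetical and do not reproduce the actual `GetNarInfo` code.

```go
package main

import (
	"fmt"
	"strings"
)

// stripNarInfoHash removes "<narInfoHash>-" from a NAR URL so that the URL
// points directly at the NAR file. Hypothetical helper for illustration only.
func stripNarInfoHash(narURL, narInfoHash string) string {
	return strings.ReplaceAll(narURL, narInfoHash+"-", "")
}

func main() {
	// Made-up, shortened hashes; real ones are much longer.
	fmt.Println(stripNarInfoHash("nar/aaa111-bbb222.nar", "aaa111"))
	fmt.Println(stripNarInfoHash("nar/bbb222.nar", "aaa111"))
}
```

A URL without the prefix passes through unchanged, so the call is safe to apply to every upstream response.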

Added new test entries (Nar7 and Nar8) to testdata to cover this
scenario and updated the upstream cache tests to verify that the narinfo
hash is correctly removed from the URL.

Also updated internal cache tests to better log sizes and handle
transparent zstd compression when calculating expected sizes during LRU
testing.

fixes #806
closes #820

(cherry picked from commit 3beaceb)
@kalbasit kalbasit reopened this Feb 10, 2026
kalbasit added a commit that referenced this pull request Feb 10, 2026
…831] (#833)

@kalbasit kalbasit force-pushed the 02-09-fix_nar.parseurl_should_allow_urls_with_dashes_and_underscores branch from 3c75973 to faae2ba Compare February 10, 2026 21:18
@dosubot dosubot bot added the size:L label ("This PR changes 100-499 lines, ignoring generated files.") and removed the size:S label ("This PR changes 10-29 lines, ignoring generated files.") Feb 10, 2026
@kalbasit kalbasit changed the title from "fix: nar.ParseURL() should allow urls with dashes and underscores" to "fix: allow NAR URLs with dashes and underscores and improve NAR path handling" Feb 10, 2026
@kalbasit kalbasit enabled auto-merge (squash) February 10, 2026 21:21
@kalbasit kalbasit disabled auto-merge February 10, 2026 23:24
@kalbasit kalbasit force-pushed the 02-09-fix_nar.parseurl_should_allow_urls_with_dashes_and_underscores branch from da8ed06 to da19700 Compare February 11, 2026 04:33
@dosubot dosubot bot added the size:XL label ("This PR changes 500-999 lines, ignoring generated files.") and removed the size:L label ("This PR changes 100-499 lines, ignoring generated files.") Feb 11, 2026
…handling

NAR URLs and paths can contain dashes and underscores, but the previous regex was too restrictive. This change updates the regex in pkg/nar/url.go to correctly support these characters.

Key changes:
- Updated the NAR URL regex to allow '-' and '_' in the filename portion.
- Implements NAR URL normalization to handle URLs with embedded narinfo hash prefixes, ensuring consistent lookups.
- Refactored pkg/cache/cache.go to use normalized URLs for NAR operations.
- Updated testdata/nar7.go and related tests to use the correct NAR path structure.
- Improved error handling and logging around NAR URL parsing and retrieval.
@kalbasit kalbasit force-pushed the 02-09-fix_nar.parseurl_should_allow_urls_with_dashes_and_underscores branch from da19700 to f70c4e0 Compare February 11, 2026 05:13
@dosubot dosubot bot added the size:L label ("This PR changes 100-499 lines, ignoring generated files.") and removed the size:XL label ("This PR changes 500-999 lines, ignoring generated files.") Feb 11, 2026
When a NAR has no compression (compression type is 'none'), the
ToFileExtension() method returns an empty string. The path construction
was incorrectly concatenating '.' + '' = '.', resulting in paths like
'/nar/hash.nar.' instead of '/nar/hash.nar'. This caused the test
server to return 404 when serving NARs without compression.

Fixed by only appending the extension if it's not empty. Also added
support for fetching normalized (prefix-stripped) NAR hashes from the
test server.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
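
The extension fix in the commit message above can be sketched as follows. `narPath` is a hypothetical helper, and the extension is passed in as a plain string rather than obtained from `ToFileExtension()`.

```go
package main

import "fmt"

// narPath builds the request path for a NAR. The extension is appended only
// when it is non-empty, so an uncompressed NAR yields "/nar/<hash>.nar"
// rather than "/nar/<hash>.nar." with a trailing dot.
// Hypothetical helper; the project's path construction may differ.
func narPath(hash, ext string) string {
	p := "/nar/" + hash + ".nar"
	if ext != "" {
		p += "." + ext
	}
	return p
}

func main() {
	fmt.Println(narPath("hash", ""))   // compression "none": no extension
	fmt.Println(narPath("hash", "xz")) // compressed: extension appended
}
```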
@kalbasit kalbasit force-pushed the 02-09-fix_nar.parseurl_should_allow_urls_with_dashes_and_underscores branch from f70c4e0 to 4b20d85 Compare February 11, 2026 06:02
kalbasit and others added 5 commits February 10, 2026 22:59
- Fix GenerateEntry to create valid narinfo entries that can be parsed
  - Remove invalid fake signature (test-cache:1:fakesignature==)
  - Use proper References format (narInfoHash-generated-test)
  - Add comment explaining why no signatures are used

- Fix distributed tests to not use public key verification for generated entries
  - Generated entries cannot be signed (we don't have the private key)
  - Skip public key verification in all distributed tests that use generated entries
  - Includes: DownloadDeduplication, ConcurrentReads, LargeNARConcurrentDownload, CDCProgressiveStreamingDuringChunking

These changes fix the failing distributed tests where GetNar was unable to
fetch narinfo for generated test entries due to invalid narinfo format.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Normalize NAR URLs in both local and S3 storage implementations
to handle URLs with embedded narinfo hash prefixes. This ensures
consistent storage and retrieval regardless of whether the NAR
URL contains a prefix.

Changes:
- pkg/storage/local/local.go: Add URL normalization to HasNar, GetNar, PutNar, DeleteNar
- pkg/storage/s3/s3.go: Add URL normalization to narPath method

This is part of fixing issues with NAR URLs that contain prefixed hashes
(e.g., "narinfo-hash-actual-hash" format).

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
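
One way to picture normalization at the storage layer is an in-memory store whose operations all derive the key from the normalized hash. `memStore` and `normalizeHash` are hypothetical, and the "drop everything up to and including the first separator" rule is an assumed reading of the commit messages, not the code in pkg/storage.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeHash drops an embedded prefix from a NAR hash.
// Assumption: the prefix ends at the first '-' or '_' and the separator
// itself is dropped. The project's real rule may be stricter.
func normalizeHash(hash string) string {
	if i := strings.IndexAny(hash, "-_"); i >= 0 {
		return hash[i+1:]
	}
	return hash
}

// memStore is a hypothetical in-memory stand-in for a NAR storage backend.
type memStore struct {
	nars map[string][]byte
}

func newMemStore() *memStore {
	return &memStore{nars: make(map[string][]byte)}
}

// Every operation normalizes the hash first, so prefixed and unprefixed
// forms of the same NAR resolve to the same entry.
func (s *memStore) PutNar(hash string, data []byte) { s.nars[normalizeHash(hash)] = data }

func (s *memStore) HasNar(hash string) bool {
	_, ok := s.nars[normalizeHash(hash)]
	return ok
}

func (s *memStore) GetNar(hash string) ([]byte, bool) {
	data, ok := s.nars[normalizeHash(hash)]
	return data, ok
}

func (s *memStore) DeleteNar(hash string) { delete(s.nars, normalizeHash(hash)) }

func main() {
	s := newMemStore()
	s.PutNar("prefix-actualhash", []byte("nar bytes"))
	fmt.Println(s.HasNar("actualhash"))        // true: stored under the normalized key
	fmt.Println(s.HasNar("prefix-actualhash")) // true: same key after normalization
	s.DeleteNar("actualhash")
	fmt.Println(s.HasNar("prefix-actualhash")) // false: the single entry is gone
}
```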
…layer

- Modified prePullNar to pass normalized NAR URL to pullNarIntoStore instead of original
- Added support for fetching narinfo by normalized NAR hash in testdata server
- Fixes RunLRU test failures when handling NARs with prefixed hashes like 'prefix-actualhash'

The issue was that getNarFromUpstream was receiving the original (unnormalized) URL and
failing to find the NAR when it was stored using the normalized hash.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Apply hash normalization to all database queries in the RunLRU tests to handle
NAR URLs with embedded narinfo hash prefixes (format: prefix-actualhash).

The fixes ensure database queries use the normalized hash, which is consistent
with how NARs are stored after normalization in the cache layer.

This fixes failures in:
- testRunLRU: First nar_file lookup query
- testRunLRU: Entries loop verification after LRU
- testRunLRU: Last entry verification after LRU
- testRunLRUCleanupInconsistentNarInfoState: All entries loop verification

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
kalbasit added a commit that referenced this pull request Feb 11, 2026
…es in hashes (#836)

Add Normalize() method to nar.URL to handle NAR URLs with embedded narinfo hash
prefixes. Some upstream caches (like nix-serve) serve NAR URLs with the narinfo
hash as a prefix (e.g., "narinfo-hash-actual-hash"). This method strips the prefix
by identifying the first separator (dash or underscore) and removing everything
before it.

Changes:
- pkg/nar/url.go: Add Normalize() method and supporting logic
- pkg/nar/hash.go: Update narHashPattern to allow dashes and underscores ([a-z0-9_-]+)
- pkg/nar/url_test.go: Add TestNormalize with three test cases (no prefix, dash-separated, underscore-separated)
- testdata/nar7.go: Update to use prefixed hash format for testing

This enables parsing and normalizing NAR URLs from sources that include hash
prefixes, which will be applied consistently across cache, storage, and test
layers in subsequent commits.

Part of #806
Closes #820 

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
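
A rough sketch of what such a `Normalize()` method could look like, exercised with the three cases named for TestNormalize. The `URL` type here is a simplified, hypothetical stand-in for `nar.URL`, and the rule (drop everything up to and including the first dash or underscore) is an assumed reading of the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// URL is a simplified, hypothetical stand-in for nar.URL.
type URL struct {
	Hash        string
	Compression string
}

// Normalize returns a copy of u whose Hash has any leading prefix removed.
// Assumption: the prefix ends at the first '-' or '_', and the separator
// itself is dropped along with the prefix.
func (u URL) Normalize() URL {
	if i := strings.IndexAny(u.Hash, "-_"); i >= 0 {
		u.Hash = u.Hash[i+1:]
	}
	return u
}

func main() {
	cases := []struct{ name, in string }{
		{"no prefix", "actualhash"},
		{"dash-separated", "prefix-actualhash"},
		{"underscore-separated", "prefix_actualhash"},
	}
	for _, c := range cases {
		got := URL{Hash: c.in}.Normalize().Hash
		fmt.Printf("%-22s %s -> %s\n", c.name, c.in, got)
	}
}
```

Under this reading, a hash that legitimately contained a separator would also be trimmed, so the real method may apply additional checks.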
kalbasit added a commit that referenced this pull request Feb 12, 2026
…es in hashes [backport #836]

Add Normalize() method to nar.URL to handle NAR URLs with embedded narinfo hash
prefixes. Some upstream caches (like nix-serve) serve NAR URLs with the narinfo
hash as a prefix (e.g., "narinfo-hash-actual-hash"). This method strips the prefix
by identifying the first separator (dash or underscore) and removing everything
before it.

Changes:
- pkg/nar/url.go: Add Normalize() method and supporting logic
- pkg/nar/hash.go: Update narHashPattern to allow dashes and underscores ([a-z0-9_-]+)
- pkg/nar/url_test.go: Add TestNormalize with three test cases (no prefix, dash-separated, underscore-separated)
- testdata/nar7.go: Update to use prefixed hash format for testing

This enables parsing and normalizing NAR URLs from sources that include hash
prefixes, which will be applied consistently across cache, storage, and test
layers in subsequent commits.

Part of #806
Closes #820

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
(cherry picked from commit 7e1c28e)
Labels

backport release-0.8 (Backport PR automatically)
bug (Something isn't working)
go (Pull requests that update go code)
size:L (This PR changes 100-499 lines, ignoring generated files.)

Development

Successfully merging this pull request may close these issues.

Add nix-serve upstream support

1 participant