
Support resuming from binary caches that don't support ranged requests #445

Merged
edolstra merged 1 commit into main from resume-unranged-downloads
May 6, 2026
Conversation

@edolstra (Collaborator) commented May 4, 2026

Motivation

We just start over and discard the data we've already received.

Context

This fixes resuming NAR downloads from binary caches that don't support ranged requests.

Summary by CodeRabbit

  • Bug Fixes
    • Prevented duplicate data when resuming transfers by tracking per-response progress and discarding bytes already written.
    • Reset per-response progress on new responses to ensure accurate resume behavior.
    • Only allow resuming when server byte-range support is confirmed.
    • Disable transparent decompression for partial-range uploads/downloads.
    • Disable retries when streaming handlers are used with non-identity content encodings.

@coderabbitai Bot commented May 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 63bf9a07-8ca9-4041-9fa6-48e442c747d7

📥 Commits

Reviewing files that changed from the base of the PR and between dbd6993 and 6839d85.

📒 Files selected for processing (1)
  • src/libstore/filetransfer.cc

📝 Walkthrough


Adds per-response state to TransferItem (requestRange flag, bytesReceived counter), resets per-response fields on new HTTP responses/requests, updates sink handling to discard duplicate data across retries, tightens retry gating for range-resume and compressed responses, and conditions transparent decompression and CURLOPT_RESUME_FROM_LARGE on requestRange.
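The duplicate-discard idea described above can be sketched as a pure helper. This is a hypothetical illustration, not the actual code: the real logic lives inside TransferItem's sink callback, and the function name `partToWrite` is invented here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical sketch: given how many bytes this response has delivered so
// far (bytesReceived) and how many bytes earlier attempts already wrote to
// the sink (writtenToSink), return the part of `data` that still needs to
// be written, trimming any prefix that would duplicate earlier output.
std::string partToWrite(uint64_t bytesReceived, uint64_t writtenToSink, const std::string & data)
{
    uint64_t end = bytesReceived + data.size();
    if (end <= writtenToSink)
        return ""; // entire chunk was already written in a previous attempt
    uint64_t skip = writtenToSink > bytesReceived ? writtenToSink - bytesReceived : 0;
    return data.substr(skip);
}
```

With `writtenToSink = 5` and a fresh response starting at `bytesReceived = 0`, the first five bytes of the new body are dropped and only the remainder reaches the sink.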

Changes

Transfer/resume / per-response state

  • Data Shape (src/libstore/filetransfer.cc): Adds requestRange flag and bytesReceived counter to TransferItem and initializes/resets per-request and per-response fields.
  • Core Implementation (src/libstore/filetransfer.cc): Resets bytesReceived on a new HTTP status line and on request init; updates finalSink handling to trim/ignore bytes that would duplicate previously written data and to update bytesReceived on successful writes.
  • Wiring / Options (src/libstore/filetransfer.cc): Enables transparent decompression only when not using range requests and not uploading (!requestRange && !request.data); applies CURLOPT_RESUME_FROM_LARGE only when requestRange is true.
  • Retry Logic (src/libstore/filetransfer.cc): Tightens retry gating: disallows retries when dataCallback is active with a non-identity Content-Encoding; when retrying with non-zero writtenToSink, sets requestRange only if acceptRanges is true, so resumption occurs only when range requests are active.
  • Tests / Documentation: No test or doc files changed in this diff.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant TransferItem
    participant Server
    participant Sink

    Client->>TransferItem: start request (may set requestRange)
    TransferItem->>Server: send HTTP request (with Range if requestRange)
    Server-->>TransferItem: HTTP response (status, headers, body)
    TransferItem->>TransferItem: parse status line -> reset bytesReceived, acceptRanges, hasContentEncoding
    alt hasContentEncoding && dataCallback active
        TransferItem->>Client: block retries (no retry)
    end
    TransferItem->>Sink: finalSink writes bytes
    Sink-->>TransferItem: writtenToSink update
    TransferItem->>TransferItem: bytesReceived += bytesWritten
    alt error / retry needed
        TransferItem->>TransferItem: if writtenToSink>0 and acceptRanges set requestRange=true
        TransferItem->>Server: retry (with Range if requestRange)
        Note right of Server: New response may overlap previous written bytes
        TransferItem->>TransferItem: discard/trim duplicate bytes based on bytesReceived vs writtenToSink
        TransferItem->>Sink: continue writing remaining bytes
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

"I nibbled bytes and tracked each hop,
Reset my paws when headers pop,
Skip the crumbs the last attempt left,
Resume with ranges — tidy and deft.
🐇✨"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check (✅ Passed): The title directly and specifically describes the main change: adding support for resuming downloads from binary caches without range request support, which aligns with the code changes adding per-response state tracking and retry logic.
  • Linked Issues check (✅ Passed): Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): Check skipped because no linked issues were found for this pull request.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai Bot commented May 4, 2026

📝 Walkthrough


The file transfer implementation now tracks bytes received across retry attempts and introduces a request-range flag. It prevents duplicate data from being written by comparing previously and currently received bytes, resets counters on new HTTP responses, and ties decompression behavior to range-request state rather than first-attempt checks. Retry eligibility is tightened to block retries when dataCallback is combined with a non-identity Content-Encoding, and range requests are only set on retry if the server supports them.

Changes

HTTP Range Request and Deduplication Logic

  • State Tracking (src/libstore/filetransfer.cc, lines 126–139): TransferItem adds a bytesReceived counter and requestRange flag to coordinate deduplication and range-request behavior across retries.
  • Response Reset (src/libstore/filetransfer.cc, lines 317–320): bytesReceived is reset to zero when a new HTTP response status line arrives, clearing per-response state.
  • Deduplication Logic (src/libstore/filetransfer.cc, lines 184–195): The sink write path increments bytesReceived and discards data already written in prior attempts by comparing against writtenToSink, trimming or skipping the chunk to prevent duplicates.
  • Decompression Condition (src/libstore/filetransfer.cc, lines 510–513): Transparent decompression is now gated on requestRange and download-only state instead of checking writtenToSink == 0 on the first attempt.
  • Retry Eligibility & Range Control (src/libstore/filetransfer.cc, lines 767–774): Retries are blocked when dataCallback coexists with a non-identity Content-Encoding; range requests are set on retry only if acceptRanges is true, preventing inappropriate range requests.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A rabbit hops through ranges true,
Deduplication sees what's new,
No bytes repeat, no data lost,
Smart retries win without the cost!


@github-actions Bot temporarily deployed to pull request May 4, 2026 15:39 (Inactive)
@coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/libstore/filetransfer.cc`:
- Around line 185-194: The duplicate-trimming logic incorrectly compares the
response-local bytesReceived (which resets on a resumed 206 response) against
the global writtenToSink, causing valid resumed bytes to be dropped; change the
comparison to work in absolute file-offset space by adding the response's
starting offset to prevReceived (i.e., compute absolutePrev = responseStart +
prevReceived or use the existing resume/expected offset variable) and only trim
data when absolutePrev < writtenToSink, calculating the trim amount as
writtenToSink - absolutePrev and slicing data accordingly; update the block that
references bytesReceived, prevReceived, writtenToSink, and data to use this
absolute-offset logic so ranged (206) retries do not discard bytes.
- Around line 510-513: The condition that sets CURLOPT_ACCEPT_ENCODING is
inverted: currently it enables transparent decompression when requestRange is
true, but the comment and semantics require skipping decompression for range
requests; change the conditional that calls curl_easy_setopt(req,
CURLOPT_ACCEPT_ENCODING, "") to run only when requestRange is false and
request.data is false (i.e., use if (!requestRange && !request.data)) so
byte-range downloads are not decompressed; update the surrounding comment if
needed to reflect this behavior and ensure the check references the same
requestRange and request.data symbols used in the surrounding code.
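The absolute-offset fix suggested in the first comment above can be illustrated as a pure helper. Names here (`partToWriteAbs`, `responseStart`) are hypothetical, chosen only to mirror the reviewer's description:

```cpp
#include <cstdint>
#include <string>

// Hypothetical illustration of the absolute-offset comparison: positions
// are compared in file-offset space, so a resumed 206 response (whose
// per-response counter restarts at 0) is not mistaken for duplicate data.
// `responseStart` is the offset at which this response's body begins
// (0 for a full 200 response, the Range start for a 206).
std::string partToWriteAbs(uint64_t responseStart, uint64_t prevReceived,
                           uint64_t writtenToSink, const std::string & data)
{
    uint64_t absolutePrev = responseStart + prevReceived;
    if (absolutePrev >= writtenToSink)
        return data; // no overlap with previously written bytes
    uint64_t trim = writtenToSink - absolutePrev;
    return trim >= data.size() ? "" : data.substr(trim);
}
```

A 206 response resuming at offset 100 after 100 bytes were already sunk passes its data through untrimmed, while a full restart from offset 0 has its duplicate prefix dropped.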

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5ed76e76-dde5-4967-ab79-cd73d33754ab

📥 Commits

Reviewing files that changed from the base of the PR and between d461a35 and 9cc82b9.

📒 Files selected for processing (1)
  • src/libstore/filetransfer.cc

@cole-h self-requested a review May 5, 2026 15:14
@edolstra force-pushed the resume-unranged-downloads branch 2 times, most recently from 08118a0 to dbd6993, May 5, 2026 18:05
@github-actions Bot temporarily deployed to pull request May 5, 2026 18:11 (Inactive)
@coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/libstore/filetransfer.cc (1)

772-781: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Differentiate real range resumes from restart-and-skip retries.

When acceptRanges is false, requestRange stays false, but this still warns "retrying from offset ...". On the non-ranged-cache path, the next attempt restarts from byte 0 and only skips duplicates locally, so the message is misleading.

💡 Proposed fix
                     if (writtenToSink) {
                         if (acceptRanges)
                             requestRange = true;
-                        warn(
-                            "%s; retrying from offset %d in %d ms (attempt %d/%d)",
-                            exc.message(),
-                            writtenToSink,
-                            ms,
-                            attempt,
-                            fileTransfer.settings.tries);
+                        if (requestRange)
+                            warn(
+                                "%s; retrying from offset %d in %d ms (attempt %d/%d)",
+                                exc.message(),
+                                writtenToSink,
+                                ms,
+                                attempt,
+                                fileTransfer.settings.tries);
+                        else
+                            warn(
+                                "%s; retrying in %d ms and skipping the first %d bytes locally (attempt %d/%d)",
+                                exc.message(),
+                                ms,
+                                writtenToSink,
+                                attempt,
+                                fileTransfer.settings.tries);
                     } else {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/libstore/filetransfer.cc` around lines 772 - 781, The warning message is
misleading when acceptRanges is false because requestRange stays false but the
code logs "retrying from offset X" even though the next attempt restarts from 0
and skips duplicates locally; update the branch that handles writtenToSink to
check acceptRanges and emit two different warn messages: when acceptRanges is
true keep the existing "retrying from offset %d ..." message and when
acceptRanges is false log something like "retrying by restarting from the
beginning and skipping %d bytes locally ..." (use writtenToSink and
fileTransfer.settings.tries in the message), and ensure this logic is placed
around the existing requestRange/acceptRanges handling so the correct message
corresponds to requestRange being true or false.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/libstore/filetransfer.cc`:
- Line 139: The bytesReceived counter (curl_off_t bytesReceived) isn't reset on
each retry, so non-HTTP transfers reuse the previous attempt's value and can
re-send sunk bytes to dataCallback; fix this by resetting bytesReceived = 0 at
the start of every new download attempt (i.e., inside the retry/attempt loop or
immediately before the logic that handles each attempt) so both the
getHTTPStatus() != 0 and getHTTPStatus() == 0 code paths use a fresh counter;
reference the bytesReceived variable, getHTTPStatus(), and dataCallback to
locate where to add the reset.

---

Outside diff comments:
In `@src/libstore/filetransfer.cc`:
- Around line 772-781: The warning message is misleading when acceptRanges is
false because requestRange stays false but the code logs "retrying from offset
X" even though the next attempt restarts from 0 and skips duplicates locally;
update the branch that handles writtenToSink to check acceptRanges and emit two
different warn messages: when acceptRanges is true keep the existing "retrying
from offset %d ..." message and when acceptRanges is false log something like
"retrying by restarting from the beginning and skipping %d bytes locally ..."
(use writtenToSink and fileTransfer.settings.tries in the message), and ensure
this logic is placed around the existing requestRange/acceptRanges handling so
the correct message corresponds to requestRange being true or false.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bac9bc7f-7016-41a9-b784-88112531127f

📥 Commits

Reviewing files that changed from the base of the PR and between 08118a0 and dbd6993.

📒 Files selected for processing (1)
  • src/libstore/filetransfer.cc

Commit: "We just start over and discard the data we've already received."

@edolstra force-pushed the resume-unranged-downloads branch from dbd6993 to 6839d85, May 6, 2026 14:07
@github-actions Bot temporarily deployed to pull request May 6, 2026 14:14 (Inactive)
@edolstra added this pull request to the merge queue May 6, 2026
Merged via the queue into main with commit 8596db3, May 6, 2026; 31 checks passed
@edolstra deleted the resume-unranged-downloads branch May 6, 2026 15:29