Skip to content

Match standard HTTP semantics of inclusive end value in Range and Content-Range headers#7632

Merged
achamayou merged 7 commits intomicrosoft:mainfrom
eddyashton:range_header_handling
Jan 30, 2026
Merged

Match standard HTTP semantics of inclusive end value in Range and Content-Range headers#7632
achamayou merged 7 commits intomicrosoft:mainfrom
eddyashton:range_header_handling

Conversation

@eddyashton
Copy link
Member

Resolves #7626.

I think the simplest way to do this is to convert during parsing, and back just before emission, but keep our exclusive range end value for processing in C+ - that's more idiomatic, and avoids peppering +/-1s throughout the code.

Note that the Content-Range header is quite funny, with a total size as well as the Content-Length header, and because we consistently return 206 partial contents even for a complete response, you can end up with something like:

Content-Length: 500
Content-Range 0-499/500

This is a (HTTP-standard compatible) response containing all 500 bytes of a 500 byte file. Only the range end value is inclusive.

I've prodded our endpoints a little with curl and httpx when returning large files. They seem happy, but they were also happy before. It looks like because the server is allowed to return a different range, and there are multiple places where we spell that response-range's true size, they're extremely tolerant of us misparsing the range ends.

@eddyashton eddyashton requested a review from a team as a code owner January 30, 2026 12:05
Copilot AI review requested due to automatic review settings January 30, 2026 12:05
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to fix issue #7626 by properly handling HTTP Range header semantics. In HTTP, both the start and end positions in Range headers are inclusive (e.g., "bytes=0-9" requests 10 bytes). The PR converts inclusive range end values to exclusive values during parsing for idiomatic C++ processing, and converts them back to inclusive before emitting response headers.

Changes:

  • Added HTTP Range header constant to improve code consistency
  • Updated server-side range parsing to convert inclusive end values to exclusive for internal processing
  • Updated server-side response generation to convert back to inclusive before emitting Content-Range headers
  • Updated tests to verify correct inclusive range semantics

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
include/ccf/http_consts.h Added RANGE constant for consistent header name usage
src/node/rpc/file_serving_handlers.h Server-side: parses inclusive range ends, converts to exclusive for processing, converts back to inclusive for response headers
src/snapshots/fetch.h Client-side: parses inclusive range ends and converts to exclusive, but fails to convert back to inclusive when sending Range headers
tests/e2e_operations.py Updated tests to verify inclusive range semantics in server responses

@achamayou
Copy link
Member

@eddyashton should we implement the standard exactly (including the error behaviour when reading past the end), and add a custom header we can use internally when we want our different (arguably better) behaviour?

That way Range would work as expected for external clients downloading ledger chunks (or snapshots), but still be efficient and avoid the extra round trip internally.

@eddyashton
Copy link
Member Author

@eddyashton should we implement the standard exactly (including the error behaviour when reading past the end), and add a custom header we can use internally when we want our different (arguably better) behaviour?

Discussed offline, but repeating here for posterity. I'd thought that truncating an end-past-the-size response was a hack we added (surely a normal client would send an open-ended range for the remainder, and/or HEAD requests to discover the total content-size?), but it turns out this is actually the standards-mandated behaviour in https://datatracker.ietf.org/doc/html/rfc7233#section-2.1:

A client can limit the number of bytes requested without knowing the
size of the selected representation. If the last-byte-pos value is
absent, or if the value is greater than or equal to the current
length of the representation data, the byte range is interpreted as
the remainder of the representation (i.e., the server replaces the
value of last-byte-pos with a value that is one less than the current
length of the selected representation).

That's exactly what we do. So that's already standard behaviour, and we don't have to change anything.

@achamayou achamayou enabled auto-merge (squash) January 30, 2026 15:57
@achamayou achamayou merged commit b4be36f into microsoft:main Jan 30, 2026
17 checks passed
eddyashton added a commit to eddyashton/CCF that referenced this pull request Feb 4, 2026
eddyashton added a commit that referenced this pull request Feb 4, 2026
eddyashton added a commit to eddyashton/CCF that referenced this pull request Feb 4, 2026
eddyashton added a commit to eddyashton/CCF that referenced this pull request Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

The range handling is inconsistent with HTTP Range semantics

2 participants