Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(gw): entity-bytes with negative indexes beyond file size #523

Merged
merged 5 commits into from
Jan 25, 2024

Conversation

hannahhoward
Copy link
Contributor

@hannahhoward hannahhoward commented Dec 8, 2023

Goal

Fix the following problem:

  • I pass a pair of negative indexs to the entity-bytes parameter of a trustless HTTP request.
  • The start index is MORE negative than the length of the underlying entity (i.e. file) -- so entity-bytes=-1000,-50 on a 100 byte file
  • The code will see the start index is negative and add the entity length, but that will still leave it negative (https://github.com/ipfs/boxo/blob/main/gateway/blocks_backend.go#L510) -- in our example case from is no -900
  • The calculation of bytes to read will be higher than it should be -- https://github.com/ipfs/boxo/blob/main/gateway/blocks_backend.go#L533 -- in our example case, 1 + 50 - -900 = 951
  • When executed, this will simply read all 100 bytes of the underlying entity, instead of the intended behavior to read only the first 50 (i.e everything up to the -50 index)

Implementation

Very simple fix -- if after adding the entity length, the from is still negative, simply set it to 0.

This is the logic already implemented in the IPLD selector for ranges -- https://github.com/ipld/go-ipld-prime/blob/master/traversal/selector/matcher.go#L49

For discussion

This wil cause a gateway conformance test to fail, which is due to a failure in the way the test is setup -- the issue is documented here: ipfs/gateway-conformance#189

@hannahhoward hannahhoward requested review from lidel, hacdias and a team as code owners December 8, 2023 22:57
Copy link

codecov bot commented Dec 8, 2023

Codecov Report

Attention: 10 lines in your changes are missing coverage. Please review.

Comparison is base (3d57bce) 65.58% compared to head (22331dc) 65.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #523      +/-   ##
==========================================
- Coverage   65.58%   65.00%   -0.58%     
==========================================
  Files         207      204       -3     
  Lines       25589    25567      -22     
==========================================
- Hits        16782    16620     -162     
- Misses       7334     7461     +127     
- Partials     1473     1486      +13     
Files Coverage Δ
gateway/blocks_backend.go 42.09% <0.00%> (-0.28%) ⬇️

... and 19 files with indirect coverage changes

Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, adding a guard here makes sense, but since this is HTTP Gateway, [initial instinct] I feel is we should error instead of silently fixing up – rationale below.

(we will engage on this in january, after holiday break, and will include a fix in Kubo 0.26, but posting today so we have more time to think about this)

gateway/blocks_backend.go Show resolved Hide resolved
@lidel lidel changed the title Bound negative indexes to length of file to reading too much data fix(gw): entity-bytes with negative indexes beyond length of file Dec 16, 2023
@rvagg
Copy link
Member

rvagg commented Dec 16, 2023

As an implementer I don't want to have to make status code decisions after I've successfully processed the first block; I know boxo is fine buffering all of the blocks at the beginning but that's not what we do elsewhere and the transport is currently designed for maximal streaming. So introducing a requirement to status code at some point >1 block down into a DAG is going to introduce further incompatibilities between implementations.

I think I see the negative start value landing before the start of the file in a similar way to the end value being beyond the end of the file, it's like setting the end to MaxInt64, or *. It's not quite the same, because in most cases 0 would do the job for you. But perhaps you want to have a query that always grabs the last >= 100 bytes for a file - let's say Station has a reason to do this for any unixfs file it can find indexed in filecoin for some weird reason, so it has a generic query that adds entity-bytes=-100:* - I think I'd expect that to just truncate the start if it encountered a 50 byte file rather than error out that it's unsatisfiable.

@lidel lidel self-assigned this Jan 23, 2024
when using entity-bytes with a file and a negative start range, insure that ranges more negative
than the file's length are bounded at zero
@lidel
Copy link
Member

lidel commented Jan 24, 2024

(looked deeper with a bit more fresh eyes today) I agree, truncation proposed here will be less headache across systems than HTTP-style error that is hard to enforce across implementations.

(Given the non-HTTP abstractions we will be dealing with, and the fact we already have a situation where we return CAR With HTTP 200 on error cases such as sub-path not found in a DAG, it is not as big precedent as I originally thought, and truncation instead of error makes implementations easier)

@hannahhoward agree, this is a niche thing, but I do think we should have one sentence in spec and basic tests for edge cases like this because someone may build things that depend on them, like @rvagg noted above in #523 (comment)

Created ipfs/kubo#10320 to test this PR against ipfs/gateway-conformance#190.

Will try to land it this week, next steps are:

Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel lidel changed the title fix(gw): entity-bytes with negative indexes beyond length of file fix(gw): entity-bytes with negative indexes beyond file size Jan 25, 2024
@lidel lidel merged commit bf34cd0 into main Jan 25, 2024
15 checks passed
@lidel lidel deleted the feat/fix-entity-bytes-negative branch January 25, 2024 17:34
lidel added a commit to ipfs/kubo that referenced this pull request Jan 25, 2024
lidel added a commit to ipfs/specs that referenced this pull request Jan 25, 2024
Cosmetic clarification based on discussion that happened in 
ipfs/boxo#523
lidel added a commit to ipfs/rainbow that referenced this pull request Jan 29, 2024
lidel added a commit to ipfs/rainbow that referenced this pull request Jan 29, 2024
lidel added a commit to ipfs/rainbow that referenced this pull request Jan 29, 2024
lidel added a commit to ipfs/rainbow that referenced this pull request Jan 29, 2024
* Bump ipfs/gateway-conformance from 0.4.1 to 0.5.0
* chore: ipfs/boxo#523
* refactor: NewNoopTracerProvider deprecation

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

3 participants