Make it easy to seek through large objects #264

Closed
roman-khimov opened this issue May 30, 2023 · 2 comments
Labels
enhancement Improving existing functionality I2 Regular impact S2 Regular significance U4 Nothing urgent

Comments

@roman-khimov
Member

There are several ways to do this:

  1. Store the size of each chunk in the link object along with its hash. This is the most flexible option, but it adds per-part storage overhead.
  2. Keep the link object as is, but store the offset from the beginning of the payload in each regular object. This at least allows bisecting to a particular offset instead of looping through the whole list (10 HEADs is much better than 1000 HEADs).
  3. Require all parts to be the same size and specify this size in the split data header. It's pretty easy to do: we probably HEAD header.split.previous anyway, so checking that the object's size matches the previous one costs nothing.
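
To make the trade-offs concrete, here is a minimal Go sketch of how a seek could locate the part holding a given payload offset under each option. Everything in it is an assumption made for illustration, not the actual NeoFS API: the function names are hypothetical, `headOffset` stands in for a HEAD request that returns the offset stored in a part, and the part sizes are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// findBySizes sketches option 1: the link object lists every part's size,
// so the part holding a payload offset is found without any HEAD request.
// It returns the part index and the offset inside that part.
func findBySizes(sizes []uint64, off uint64) (int, uint64, bool) {
	var start uint64
	for i, s := range sizes {
		if off < start+s {
			return i, off - start, true
		}
		start += s
	}
	return 0, 0, false // offset is past the end of the payload
}

// findByBisect sketches option 2: every part stores its own starting offset,
// which headOffset (a hypothetical stand-in for a HEAD request) returns.
// Binary search needs O(log n) calls instead of walking the whole list.
// It cannot tell whether off is past the end without the last part's size.
func findByBisect(n int, headOffset func(i int) uint64, off uint64) (int, uint64, bool) {
	if n == 0 {
		return 0, 0, false
	}
	// Find the first part that starts beyond off; the one before it holds off.
	i := sort.Search(n, func(i int) bool { return headOffset(i) > off }) - 1
	if i < 0 {
		return 0, 0, false // only possible if the first part does not start at 0
	}
	return i, off - headOffset(i), true
}

// findByEqualSize sketches option 3: all parts share one size taken from the
// split header, so the lookup is plain arithmetic. partSize must be non-zero.
func findByEqualSize(partSize, off uint64) (int, uint64) {
	return int(off / partSize), off % partSize
}

func main() {
	sizes := []uint64{100, 100, 100, 50} // made-up part sizes
	starts := []uint64{0, 100, 200, 300} // matching per-part offsets
	heads := 0
	head := func(i int) uint64 { heads++; return starts[i] }

	fmt.Println(findBySizes(sizes, 250))
	fmt.Println(findByBisect(len(starts), head, 250))
	fmt.Println("simulated HEAD calls:", heads)
	fmt.Println(findByEqualSize(100, 250))
}
```

In this sketch only option 2 issues HEAD calls during the lookup. Option 1 pays for that with extra bytes in the link object, and option 3 with the constraint on part sizes.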
@carpawell
Member

I vote for the first one, as it is the most explicit way to handle this to me: a helper object carries helper info. An offset inside every object seems redundant for the object itself: it takes no part in the assembly and is not required for storing the payload. Requiring the same size would be the best and the easiest option to me, but we have not done it yet and it looks like a feature; maybe we do not want to be so strict.

@roman-khimov
Member Author

Storing the offset in a small object costs nothing, whereas storing sizes in the link object does have a cost (~10% additional per-object overhead). We can do both, though.

Equal parts are not compatible with a non-reslicing S3 multipart handler (nspcc-dev/neofs-s3-gw#843).
