
rust http client idea: stream rather than chunk #94

Open
michaelkirk opened this issue Dec 22, 2020 · 4 comments

@michaelkirk (Collaborator) commented Dec 22, 2020

The current streaming strategy is based on chunks:

  1. request a chunk
  2. iterate over the chunk
  3. request a chunk
  4. iterate over the chunk
  5. ...

It seems like we could increase throughput, and in particular reduce the latency-to-first-feature, by utilizing a continuous byte stream.

Our HTTP library, reqwest, supports Response::bytes_stream (available when the stream feature is enabled), which might be useful here.

I'm personally not familiar with creating Futures and consuming Stream APIs in Rust, but this seems conceptually straightforward for the feature-fetching portion of the HTTP reader.

There might be a way to use it for the header and index downloads as well, but there the cost-to-benefit ratio seems less favorable.
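Conceptually the change is from a pull-per-chunk loop to a single continuous read (with reqwest, the stream would come from Response::bytes_stream). A minimal synchronous sketch of the two consumption patterns; the Remote type, the byte-sized "features", and the roundtrip counting are invented for illustration and are not the actual reader API:

```rust
// Sketch: chunked vs. streamed consumption of a remote byte sequence.
// Plain synchronous code stands in for the async HTTP client.

struct Remote {
    data: Vec<u8>,
    requests: usize, // counts simulated request roundtrips
}

impl Remote {
    fn new(data: Vec<u8>) -> Self {
        Remote { data, requests: 0 }
    }

    // One roundtrip per range request, as in the current chunked strategy.
    fn fetch_range(&mut self, start: usize, len: usize) -> Vec<u8> {
        self.requests += 1;
        let end = (start + len).min(self.data.len());
        self.data[start.min(self.data.len())..end].to_vec()
    }
}

// Current strategy: request a chunk, iterate over it, request the next.
fn read_chunked(remote: &mut Remote, chunk_size: usize) -> Vec<u8> {
    let mut features = Vec::new();
    let mut offset = 0;
    loop {
        let chunk = remote.fetch_range(offset, chunk_size);
        if chunk.is_empty() {
            break;
        }
        offset += chunk.len();
        // A "feature" is handed to the caller only once its whole
        // chunk has arrived -- the roundtrip wait sits between chunks.
        features.extend(chunk);
    }
    features
}

// Proposed strategy: a single request whose body is consumed as a
// stream, so features can be parsed as bytes arrive.
fn read_streamed(remote: &mut Remote) -> Vec<u8> {
    remote.requests += 1;
    remote.data.clone()
}
```

For a 10-byte body and 3-byte chunks, read_chunked issues five simulated roundtrips (the final one returning empty) where read_streamed issues one; the win in the real client would be that the per-chunk roundtrip wait disappears.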

@michaelkirk (Collaborator, Author)

If anyone has any thoughts on this, I'm having a go at it.

@bjornharrtell (Member)

While I'm still (forever?) a rust-noob it sounds like a good idea to me. :) @pka?

@pka (Member) commented Sep 1, 2023

My gut feeling says that only the first point, reducing latency-to-first-feature, will be a measurable win. And then we'd also need a streaming FGB client that, for example, renders the first feature immediately, so the performance win is also visible to the end user.
As an alternative, experimenting with https://sozip.org/ could also bring a performance improvement for HTTP access and would address one downside of FGB: missing compression.

@michaelkirk (Collaborator, Author) commented Sep 1, 2023

I would expect measurable wins anywhere the cost of waiting on a request roundtrip is significant relative to the cost of transmitting a chunk (currently 1 MB).

Conversely, I'd expect no real impact on small files or lower bandwidth connections, especially if they have relatively low latency.

Some motivating (but naive) math for transferring a 10MB file:

[Screenshot, 2023-09-01: spreadsheet comparing chunked vs. streamed transfer times]

check my math: https://docs.google.com/spreadsheets/d/1eu35b1mFwKTN-LzpmJFjwC4n4uP9AwfuPxuSh5OwO2k/edit#gid=0
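The shape of that math: for file size S, chunk size C, bandwidth B, and roundtrip latency L, chunked fetching pays roughly ceil(S/C) roundtrips on top of the transfer time, while a single stream pays the latency once. A sketch with illustrative numbers (my own assumptions, not the spreadsheet's figures):

```rust
// Back-of-the-envelope model: total seconds to fetch a file.
// All inputs are illustrative assumptions, not measurements.

// Chunked: one roundtrip per chunk, plus the transfer itself.
fn chunked_secs(size_mb: f64, chunk_mb: f64, bandwidth_mb_s: f64, rtt_s: f64) -> f64 {
    let roundtrips = (size_mb / chunk_mb).ceil();
    roundtrips * rtt_s + size_mb / bandwidth_mb_s
}

// Streamed: a single roundtrip, then a continuous transfer.
fn streamed_secs(size_mb: f64, bandwidth_mb_s: f64, rtt_s: f64) -> f64 {
    rtt_s + size_mb / bandwidth_mb_s
}
```

For example, with a 10 MB file, 1 MB chunks, a 10 MB/s link, and a 100 ms RTT, the chunked model takes 2.0 s total (1.0 s of it pure roundtrip waiting) versus 1.1 s streamed, which matches the intuition that the gap grows with file size and latency.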

These are all theoretical numbers, and I'm sure it won't be that good in practice, but I'm reasonably optimistic there will be more wins than just time-to-first-feature, especially for larger files on fast connections, without requiring users to hand-tune their chunk size. It also has the upside that you don't have to hold chunk_size bytes in memory all at once.

I'm not yet sure how much this will benefit bbox selection. The request/response model there is more complicated. I'd expect a benefit, but the implementation might get involved.

As an alternative, experimenting with https://sozip.org/ could also bring a performance improvement for HTTP access and would address one downside of FGB: missing compression.

Oh, that's an interesting idea. Anecdotally, I've seen a lot of benefit from compressing fgb, and I can see how something like this would speed things up. Though it seems like the chunk-vs-stream question applies equally to reading a remote SOZip file.
