Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: async CAR put() & close() for backpressure & error handling #74

Merged
merged 1 commit into from
Aug 9, 2021

Conversation

rvagg
Copy link
Contributor

@rvagg rvagg commented Aug 7, 2021

Ref: #73 /cc @jimpick

Bear with me ... This is because of the challenges of using asynciterables as streams, but we can do this!

In the current form, the promises for writer.put() and writer.close() get lost, so (a) we have no means of backpressure so if we had a fast block store + a large number of blocks and a slow reader for the output then we'd use up a lot of memory, and (b) any errors are lost to the wind.

The CAR internals for {writer,out} ("IteratorChannel") are careful to maintain backpressure between them if you choose to use await on your writer.put() (because it's not strictly necessary), so the approach in this PR should maintain a steady stream state with not excessive bytes accumulating if you have a slow writer on the other end (it would be interesting to test this assertion, we're getting into complicated territory here).

So, we set up 2 parallel activities: (1) writing blocks from the blockstore to the CAR writer and (2) reading blocks from the CAR output channel. We want to give the user the results of (2) and we don't want (1) to start until the user starts reading and we don't want blocks to be written until the bytes for previous blocks have been consumed. CAR internals take care of that latter concern as long as we can await on our writer.put() calls and waiting until the user calls [Symbol.asyncIterator]() means we can defer the blockstore read->CAR write until they start iterating. We also get to hang on to the promise where we do all the writing and by awaiting that at the last we get to propagate errors.

@vasco-santos vasco-santos self-requested a review August 9, 2021 09:59
Copy link
Member

@vasco-santos vasco-santos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks ❤️
I tried this with several large files and everything is working as expected

@vasco-santos vasco-santos merged commit 42d5dde into web3-storage:main Aug 9, 2021
Copy link
Member

@alanshaw alanshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this cool! A bit hard to follow, but it looks like it works. I would like to give CarWriter an blocks (async) iterable so make this simple:

const out = await CarWriter.fromIterable([root], blockstore.blocks())

The writer would auto-close when the iterable is done and deal with backpressure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants