Streaming functionality #4

Merged
merged 6 commits into from
Feb 23, 2023

Conversation

anjor
Contributor

@anjor anjor commented Feb 23, 2023

This is built on top of #3 so we can ignore that one.

This adds the ability to stream in a CAR file from stdin:

anjor@seven data (main)$ cat 5gb.car | /Users/anjor/repos/alanshaw/go-carbites/cmd/carbites split -s 1000000000
Splitting into ~1000000000 byte chunks using strategy "simple"
Writing CAR chunk to ./stdin-0.car
Writing CAR chunk to ./stdin-1.car
Writing CAR chunk to ./stdin-2.car
Writing CAR chunk to ./stdin-3.car
Writing CAR chunk to ./stdin-4.car
Writing CAR chunk to ./stdin-5.car
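
To illustrate the streaming idea in the session above, here is a minimal sketch of cutting an `io.Reader` into roughly fixed-size chunks without holding the whole input in memory. It is not the go-carbites implementation: `splitStream` and `splitString` are hypothetical helpers, and the cut happens at raw byte offsets, whereas a real CAR splitter has to respect block boundaries and write a header per chunk. In a CLI the `next` callback would presumably return files such as `stdin-0.car` opened from `os.Stdin` input; in-memory buffers are used here to keep the example self-contained.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// splitStream copies in to a sequence of writers, each receiving at most
// targetSize bytes, and returns the number of chunks written.
// Hypothetical helper: it splits at raw byte offsets and is NOT CAR-aware.
func splitStream(in io.Reader, targetSize int64, next func(i int) io.Writer) (int, error) {
	if targetSize <= 0 {
		return 0, fmt.Errorf("target size must be positive, got %d", targetSize)
	}
	br := bufio.NewReader(in)
	count := 0
	for {
		// Peek so that a new chunk is only started when data remains.
		if _, err := br.Peek(1); err == io.EOF {
			return count, nil
		} else if err != nil {
			return count, err
		}
		w := next(count)
		count++
		// CopyN reports io.EOF when the input ends before targetSize bytes.
		if _, err := io.CopyN(w, br, targetSize); err != nil && err != io.EOF {
			return count, err
		}
	}
}

// splitString runs splitStream over a string and collects the chunks,
// standing in for stdin and output files.
func splitString(s string, targetSize int64) ([]string, error) {
	var bufs []*bytes.Buffer
	_, err := splitStream(strings.NewReader(s), targetSize, func(int) io.Writer {
		b := &bytes.Buffer{}
		bufs = append(bufs, b)
		return b
	})
	if err != nil {
		return nil, err
	}
	out := make([]string, len(bufs))
	for i, b := range bufs {
		out[i] = b.String()
	}
	return out, nil
}

func main() {
	chunks, err := splitString("abcdefghij", 4)
	if err != nil {
		panic(err)
	}
	for i, c := range chunks {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

Only one chunk's worth of data passes through at a time, which is the property that makes piping a 5 GB file through stdin practical.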

Owner

@alanshaw alanshaw left a comment

This is mostly fine, but I would like to retain the existing functionality.

carbites.go Outdated
@@ -29,7 +29,7 @@ func Split(in io.Reader, targetSize int, s Strategy) (Splitter, error) {
 	case Simple:
 		return NewSimpleSplitter(in, targetSize)
 	case Treewalk:
-		return NewTreewalkSplitter(in, targetSize)
+		return nil, fmt.Errorf("treewalk strategy caches the entier CAR, which is not allowed due to memory considerations")
Owner

Can we please allow this? It should be up to the user to determine if they are ok to buffer in memory. I think io.Reader isn't necessarily buffered anyway...

Contributor Author

@anjor anjor Feb 23, 2023

Right now this code path doesn't get accessed anyway, which is why I removed it.

In main.go we first check for and handle the treewalk strategy separately (https://github.com/alanshaw/go-carbites/blob/main/cmd/main.go#L36), which means that carbites.go will never hit case Treewalk when called from the CLI.

Contributor Author

Oh, I guess you mean because carbites.Split is public, so library callers can still pass Treewalk. Yeah, that makes sense.
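
The point of this thread can be sketched as follows: a public entry point keeps both strategies available, and the treewalk branch buffers the input itself instead of refusing. This is only an illustration under stated assumptions. `Strategy`, `Simple`, and `Treewalk` mirror the names in the diff, while `splitSketch`, `source`, and `describe` are hypothetical stand-ins and not the library's `Split`, `Splitter`, or constructors.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// Strategy mirrors the name used in the diff; the values are illustrative.
type Strategy int

const (
	Simple Strategy = iota
	Treewalk
)

// source is a hypothetical stand-in for a splitter's input handling.
type source struct {
	Buffered bool      // true when the whole input was read into memory
	Data     []byte    // set only when Buffered is true
	Stream   io.Reader // set only when Buffered is false
}

// splitSketch dispatches on the strategy. The treewalk branch reads the
// entire input, leaving the memory trade-off to the caller (assumption:
// treewalk needs the whole CAR available, as the error message in the diff says).
func splitSketch(in io.Reader, s Strategy) (*source, error) {
	switch s {
	case Simple:
		// A plain io.Reader is not necessarily buffered, so wrap it.
		return &source{Stream: bufio.NewReader(in)}, nil
	case Treewalk:
		data, err := io.ReadAll(in)
		if err != nil {
			return nil, fmt.Errorf("buffering input for treewalk: %w", err)
		}
		return &source{Buffered: true, Data: data}, nil
	default:
		return nil, fmt.Errorf("unknown strategy %d", s)
	}
}

// describe reports how a string input would be consumed for a strategy.
func describe(input string, s Strategy) (string, error) {
	src, err := splitSketch(strings.NewReader(input), s)
	if err != nil {
		return "", err
	}
	if src.Buffered {
		return fmt.Sprintf("buffered %d bytes in memory", len(src.Data)), nil
	}
	return "streaming", nil
}

func main() {
	for _, s := range []Strategy{Simple, Treewalk} {
		d, err := describe("abc", s)
		if err != nil {
			panic(err)
		}
		fmt.Println(d)
	}
}
```

Keeping the treewalk case in the exported function means library users who accept the memory cost are not blocked, even if the CLI handles that strategy on a separate path.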

Owner

@alanshaw alanshaw left a comment

LGTM

@alanshaw alanshaw merged commit 28d6432 into alanshaw:main Feb 23, 2023
2 participants