
Bao Tree Support #17

Open
pcfreak30 opened this issue May 12, 2023 · 9 comments

@pcfreak30

I am requesting to have Bao verified streaming/Merkle tree support based on the https://github.com/n0-computer/abao rust fork.

I am also requesting the outboard mode be supported as well.

Thanks.

@lukechampine
Owner

Implemented in f9980aa

Chunk sizes are fixed at 1024. I can make that configurable, but the SIMD stuff all assumes 1024 byte chunks, so there will be a performance hit.

No support for slices at this time, but I can add that if it's an important feature.

@oconnor663

Awesome! This is probably already clear, but just in case: We can't change the chunk size per se without breaking compatibility with BLAKE3. However, we can use larger "chunk groups" in the encoding, effectively pruning the lower levels of the tree, without breaking back-compat. (The encoding isn't compatible if we change the chunk group sizes, but at least the root hash is unchanged.)
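To make the pruning idea concrete, here is a rough sketch of the arithmetic, assuming the standard Bao outboard layout of an 8-byte length header plus one 64-byte parent node (two child hashes) per interior node; `outboardSize` is an illustrative helper, not the library's API:

```go
package main

import "fmt"

// outboardSize estimates the bao outboard size for an n-byte file when the
// tree is pruned to chunk groups of g bytes: ceil(n/g) leaves, leaves-1
// interior nodes of 64 bytes each, plus an 8-byte length header.
// (Assumption based on the Bao spec; the exact figures may differ.)
func outboardSize(n, g int64) int64 {
	leaves := (n + g - 1) / g
	if leaves < 1 {
		leaves = 1 // an empty file still has one (empty) leaf
	}
	return 8 + 64*(leaves-1)
}

func main() {
	const GiB = 1 << 30
	// Pruning the lower levels shrinks the outboard dramatically
	// while leaving the root hash unchanged:
	fmt.Println(outboardSize(GiB, 1024))     // 1 KiB chunks:       67108808
	fmt.Println(outboardSize(GiB, 256*1024)) // 256 KiB groups:       262088
}
```

For a 1 GiB file, pruning from 1 KiB chunks to 256 KiB chunk groups cuts the outboard from about 64 MiB to about 256 KiB.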

@pcfreak30
Author

> Implemented in f9980aa
>
> Chunk sizes are fixed at 1024. I can make that configurable, but the SIMD stuff all assumes 1024 byte chunks, so there will be a performance hit.
>
> No support for slices at this time, but I can add that if it's an important feature.

IIRC @redsolver's S5 uses 256 KB chunks, and since verifying the tree data needs to be standardized across clients (and portal nodes), and the chunk size changes the tree data, this does need to be configurable. The data generated portal-side would be downloadable by the client.

But I appreciate the work done so far, as this is a starting point to migrate. I do need 256 KB chunks/chunk groups to interop correctly. I didn't state that before as I assumed all chunk sizes could just get ported 😅.

Kudos.

@oconnor663

My fault for taking so long to add chunk group support to the original Bao implementation 😅

@redsolver

I'm using 256 KiB chunk groups by default, but would like to support other sizes too. For example, Iroh is using 64 KiB chunk groups. So right now all of my streaming implementations just download the entire outboard file first and then start streaming (and verifying) the file.

Long-term it would be nice to switch to a more flexible outboard format that supports efficiently fetching parts of the outboard file down to a specific chunk group size. That would make it possible to generate and host one outboard file that goes down to the 64 KiB level, while a client could fetch (with range requests) only the parts needed down to 256 KiB chunk groups, if they don't need to verify smaller chunks (for example when streaming video).

But I'm waiting on what the Iroh team comes up with; for now I'll just keep using the default bao outboard format with 256 KiB chunk groups.
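One way this could work is a hypothetical level-order ("breadth-first") outboard layout, where parent nodes are written root-first one level at a time; a client verifying only down to a coarser group size would then need just a prefix of the file, fetchable with a single range request. The sketch below assumes the file size is a power-of-two multiple of both group sizes, and 64 bytes per parent node; note that bao's actual outboard is pre-order, so this is a design illustration, not the current format:

```go
package main

import "fmt"

// prefixNeeded returns how many bytes of a (hypothetical) level-order
// outboard a client needs when it only verifies down to chunk groups of
// `coarse` bytes: all interior nodes above that level, root-first.
// Assumes n is a power-of-two multiple of coarse; real files would need
// partial-subtree handling.
func prefixNeeded(n, coarse int64) int64 {
	if n <= coarse {
		return 0 // single group: the root hash alone verifies it
	}
	parents := n/coarse - 1 // interior nodes above the coarse level
	return parents * 64     // 64 bytes (two child hashes) per node
}

func main() {
	const GiB = 1 << 30
	total := prefixNeeded(GiB, 64*1024)  // full 64 KiB-granularity outboard
	part := prefixNeeded(GiB, 256*1024)  // prefix a 256 KiB client would fetch
	fmt.Printf("fetch %d of %d bytes\n", part, total)
}
```

For a 1 GiB file, the 256 KiB client would fetch roughly a quarter of the 64 KiB-granularity outboard.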

@pcfreak30
Author

Also @lukechampine, please make baoOutboardSize public and add a New constructor for bufferAt, as I have to copy this code for my needs; the standard library does not have any WriterAt implementations IIRC.

I'm also unsure whether baoOutboardSize should be taking an int64?

To give an idea of what I'm currently working with:

import (
	"bufio"
	"io"

	"lukechampine.com/blake3"
)

// baoOutboardSize and bufferAt are copied from the blake3 package internals.
func ComputeTree(reader io.Reader, size int64) ([]byte, error) {
	bufSize := baoOutboardSize(int(size)) // note the int64 -> int conversion
	buf := bufferAt{buf: make([]byte, bufSize)}

	_, err := blake3.BaoEncode(&buf, bufio.NewReader(reader), size, true)
	if err != nil {
		return nil, err
	}

	return buf.buf, nil
}

@lukechampine
Owner

exported in 6e43259

bufferAt is 9 lines of code, just copy it if you need it.
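For reference, a minimal `bufferAt` along those lines might look like the sketch below: an `io.WriterAt` backed by a byte slice. This version grows the buffer on demand; the library's actual unexported helper may behave differently (e.g. a fixed-size buffer):

```go
package main

import (
	"fmt"
	"io"
)

// bufferAt is a minimal io.WriterAt over an in-memory byte slice,
// growing the slice as needed to cover each write.
type bufferAt struct {
	buf []byte
}

func (b *bufferAt) WriteAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, fmt.Errorf("bufferAt: negative offset")
	}
	if end := off + int64(len(p)); end > int64(len(b.buf)) {
		grown := make([]byte, end)
		copy(grown, b.buf)
		b.buf = grown
	}
	copy(b.buf[off:], p)
	return len(p), nil
}

var _ io.WriterAt = (*bufferAt)(nil) // compile-time interface check

func main() {
	var b bufferAt
	b.WriteAt([]byte("world"), 6) // out-of-order writes are fine
	b.WriteAt([]byte("hello "), 0)
	fmt.Printf("%q\n", b.buf) // "hello world"
}
```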

@pcfreak30
Author

> exported in 6e43259
>
> bufferAt is 9 lines of code, just copy it if you need it.

Yes, I have; I just wanted to avoid it.

@pcfreak30
Author

Hello,

To bump this issue: my project currently has the following two requirements that I'd request be supported.

As a secondary nice-to-have, if the existing implementation will not cover them: SIMD support for all of this, for speed?

Thanks!
