fix ReadFrom generate corrupted bitset when reader incompletely fills buf #111

@mingmingtsao mingmingtsao commented Sep 14, 2022

The Read method of io.Reader does not guarantee that it reads len(p) bytes. From the io.Reader documentation:

Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered. Even if Read returns n < len(p), it may use all of p as scratch space during the call. If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.
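The contract quoted above can be reproduced with a reader that deliberately returns short reads. The following is a minimal sketch, not the code from this pull request: `decodeWithRead`, `decodeWithReadFull`, and `oneByteSource` are hypothetical helpers, and little-endian byte order is an assumption made only for the demonstration. It uses `iotest.OneByteReader` from the standard library to return one byte per call.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"testing/iotest"
)

// decodeWithRead decodes one uint64 from a single Read call (hypothetical helper).
// If the reader returns fewer than 8 bytes, the rest of buf stays zero and the
// decoded value is silently wrong.
func decodeWithRead(r io.Reader) (uint64, int) {
	buf := make([]byte, 8)
	n, _ := r.Read(buf) // n may be smaller than len(buf)
	return binary.LittleEndian.Uint64(buf), n
}

// decodeWithReadFull keeps reading until all 8 bytes have arrived or an error
// occurs (hypothetical helper).
func decodeWithReadFull(r io.Reader) (uint64, error) {
	buf := make([]byte, 8)
	if _, err := io.ReadFull(r, buf); err != nil {
		return 0, err
	}
	return binary.LittleEndian.Uint64(buf), nil
}

// oneByteSource encodes v and wraps it in a reader that yields one byte per Read.
func oneByteSource(v uint64) io.Reader {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, v)
	return iotest.OneByteReader(bytes.NewReader(b))
}

func main() {
	const want = uint64(0x0102030405060708)

	got, n := decodeWithRead(oneByteSource(want))
	fmt.Printf("single Read: n=%d value=%#x\n", n, got)

	full, err := decodeWithReadFull(oneByteSource(want))
	fmt.Printf("io.ReadFull: value=%#x err=%v\n", full, err)
}
```

With the single `Read` call only the first byte arrives, so the decoded value is not the one that was written; `io.ReadFull` loops until the buffer is full and recovers the original value.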

thanhpk (Contributor) commented Sep 14, 2022

I think this function is getting more complicated than it should be.

I propose a version that is less performant but much simpler: no hidden bufio or io.LimitedReader machinery, just plain binary.Read.

// ReadFrom reads a BitSet from a stream written using WriteTo
func (b *BitSet) ReadFrom(stream io.Reader) (int64, error) {
	var length uint64

	// Read length first
	err := binary.Read(stream, binaryOrder, &length)
	if err != nil {
		if err == io.EOF {
			err = io.ErrUnexpectedEOF
		}
		return 0, err
	}
	newset := New(uint(length))

	if uint64(newset.length) != length {
		return 0, errors.New("unmarshalling error: type mismatch")
	}

	nWords := uint64(wordsNeeded(uint(length)))
	for i := uint64(0); i < nWords; i++ {
		var item uint64
		if err := binary.Read(stream, binaryOrder, &item); err != nil {
			if err == io.EOF {
				err = io.ErrUnexpectedEOF
			}
			return 0, err
		}
		newset.set[i] = item
	}

	*b = *newset
	return int64(b.BinaryStorageSize()), nil
}

lemire (Member) commented Sep 14, 2022

I am merging this and re-releasing. Thanks.

@lemire lemire merged commit e160993 into bits-and-blooms:master Sep 14, 2022
@mingmingtsao mingmingtsao deleted the fix-bitset-readfrom-incomplete-buffer branch September 15, 2022 06:39