proto: add scatter/gather API for serialization #609
If we're going to go this route we should do something like io.ReadSeeker. (We need a seeker to allow multi-pass algorithms.)
I'm intentionally trying to stay away from the io.Reader approach. My current idea is to take in this slice of slices and create a wrapper around it which provides slice-like functionality that the implementation looks for; see the wrapper sketched below. This way we make minimal changes to the current implementation.
I should mention that the …
That does look promising. We can leverage …
Here's the wrapper data structure that I think can be hooked into the current implementation with very little friction:

```go
package proto

import (
	"io"
	"unsafe"
)

// emptyBuf backs zero-length reads so that readn can return a
// non-nil slice when n == 0.
var emptyBuf [0]byte

// byteSlice wraps a slice of byte slices
// to provide slice-like functions on it.
type byteSlice struct {
	ofst int
	bufs [][]byte
	size int
}

func newByteSlice(bufs ...[]byte) *byteSlice {
	bs := &byteSlice{
		bufs: bufs,
	}
	for _, b := range bufs {
		bs.size += len(b)
	}
	return bs
}

// peekByteAt reads the byte at offset loc relative to the current
// position. It returns an error if there's no more data. It does
// not advance ofst.
func (s *byteSlice) peekByteAt(loc int) (byte, error) {
	if s.ofst+loc >= s.size {
		return 0, io.ErrUnexpectedEOF
	}
	// moveBy and readn trim consumed prefixes off bufs, so bufs[0]
	// always starts at the current offset.
	for _, b := range s.bufs {
		if len(b) > loc {
			return b[loc], nil
		}
		loc -= len(b)
	}
	panic("Incorrect implementation!")
}

// moveBy moves ofst by n.
func (s *byteSlice) moveBy(n int) error {
	if s.ofst+n > s.size {
		return io.ErrUnexpectedEOF
	}
	s.ofst += n
	for n > 0 {
		if len(s.bufs[0]) > n {
			s.bufs[0] = s.bufs[0][n:]
			return nil
		}
		n -= len(s.bufs[0])
		s.bufs = s.bufs[1:]
	}
	return nil
}

// readn reads the next n bytes into a freshly allocated slice.
// Note: this allocates new memory and should only be used when
// assigning a byte slice to a proto.Message field.
func (s *byteSlice) readn(n int) ([]byte, error) {
	// We want a non-nil value returned when n is 0.
	if n == 0 {
		return emptyBuf[:], nil
	}
	if s.ofst+n > s.size {
		return nil, io.ErrUnexpectedEOF
	}
	// TODO(mmukhi): Evaluate variable usage to reduce
	// cache misses.
	s.ofst += n
	var p []byte
	for n > 0 {
		if len(s.bufs[0]) == 0 {
			s.bufs = s.bufs[1:]
			continue
		}
		sz := n
		if sz > len(s.bufs[0]) {
			sz = len(s.bufs[0])
		}
		// The use of append here is a trick which avoids the zeroing
		// that would be required if we used a make/copy pair.
		p = append(p, s.bufs[0][:sz]...)
		s.bufs[0] = s.bufs[0][sz:]
		n -= sz
	}
	return p, nil
}

// readString reads the next n bytes as a string.
// Note: this allocates new memory and should only be used when
// assigning a string to a proto.Message field.
func (s *byteSlice) readString(n int) (string, error) {
	b, err := s.readn(n)
	if err != nil {
		return "", err
	}
	// This trick prevents the extra memory allocation made by
	// casting a byte slice to a string.
	return *(*string)(unsafe.Pointer(&b)), nil
}

func (s *byteSlice) length() int {
	return s.size - s.ofst
}

func (s *byteSlice) isEmpty() bool {
	return s.ofst >= s.size
}
```
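For illustration, here's a sketch (not part of the proposal; the frame contents and helper are made up) of reading a length-delimited field that straddles a frame boundary through the wrapper, with no up-front flattening of the frames:

```go
// exampleGatherRead is a hypothetical caller of the wrapper above.
// The wire bytes encode field 1, wire type 2 (tag byte 0x0a) with
// length 4, and the "abcd" payload is split across two frames.
func exampleGatherRead() ([]byte, error) {
	frame1 := []byte{0x0a, 0x04, 'a', 'b'}
	frame2 := []byte{'c', 'd'}
	bs := newByteSlice(frame1, frame2)

	if err := bs.moveBy(2); err != nil { // skip the tag and length bytes
		return nil, err
	}
	return bs.readn(4) // copies "abcd" out of both frames in one pass
}
```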
@MakMukhi would you know how much this change would improve gRPC benchmarks?
This will cut down the memory footprint of each RPC call (request-response) by half.
I think this is needed in Go. And if you implement this, look at the new C++ parser loop. The new parser is quite a bit faster and becomes very flexible due to resumability. In Go this can probably be implemented easily and elegantly with goroutines, removing the necessity of a side stack.
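As a rough illustration of that idea (all names hypothetical, and setting aside the per-operation cost raised in the next comment): the parser runs on its own goroutine and blocks on a channel whenever it needs more bytes, so its call stack plays the role of the explicit side stack a resumable C++ parser has to maintain.

```go
package main

import "fmt"

// pushParser is a hypothetical resumable parser. Its goroutine's
// stack remembers where parsing stopped between chunks, so no
// explicit side stack is needed.
type pushParser struct {
	chunks chan []byte
	done   chan error
}

func newPushParser() *pushParser {
	p := &pushParser{chunks: make(chan []byte), done: make(chan error, 1)}
	go p.run()
	return p
}

func (p *pushParser) run() {
	total := 0
	// A real parser would decode tags and values here, blocking on
	// p.chunks mid-field whenever a value spans chunk boundaries.
	for chunk := range p.chunks {
		total += len(chunk)
	}
	fmt.Println("parsed", total, "bytes")
	p.done <- nil
}

// Write hands one wire chunk to the parser goroutine.
func (p *pushParser) Write(b []byte) (int, error) { p.chunks <- b; return len(b), nil }

// Close signals end of input and waits for the parser to finish.
func (p *pushParser) Close() error { close(p.chunks); return <-p.done }

func main() {
	p := newPushParser()
	p.Write([]byte("abc"))
	p.Write([]byte("defgh"))
	p.Close() // prints: parsed 8 bytes
}
```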
Goroutines are certainly cheap, but not so cheap that you would want to spawn one for every unmarshal operation and deal with the synchronization. In Go, the standard library only has support for …
I think supporting an io.Reader/io.Writer API is needed; the TensorFlow model size is about 1-2 GB, so I need 5 GB of memory to load the model.
Looking back on this -- I think it's possible to support something like:

```go
func UnmarshalFrom(io.Reader, proto.Message) error
```

However, this results in an extra, unnecessary copy, as the io.Reader contract forces the data to be copied into a buffer supplied by the unmarshalling code. If we were to flip that around, we could avoid that extra copy. Something like:

```go
type Unmarshaller struct{}

func (u *Unmarshaller) Write([]byte) (int, error) {}
func (u *Unmarshaller) Message() (Message, error) {} // or maybe "Close"

// and the Unmarshaller wraps a proto.Message which is valid after
// Close returns
```

Here the implementation of Write can consume the caller's buffers directly. Vice versa, a marshaller could provide an io.Reader. @dsnet - what do you think?
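To make the shape of that API concrete, here is a minimal sketch. Everything beyond the signatures above is an assumption; in particular it buffers internally for simplicity, whereas a real implementation would decode incrementally to actually get the zero-copy benefit.

```go
package proto

// unmarshaller is a sketch of the proposed push-style API. It
// retains the written buffers and decodes them in Message; a real
// implementation would parse incrementally as bytes arrive, e.g.
// by walking the buffers with a wrapper like byteSlice above.
type unmarshaller struct {
	bufs [][]byte
	m    Message
}

// NewUnmarshaller is a hypothetical constructor wrapping the
// target message.
func NewUnmarshaller(m Message) *unmarshaller {
	return &unmarshaller{m: m}
}

// Write retains b without copying it; the caller must not reuse b
// until Message has been called.
func (u *unmarshaller) Write(b []byte) (int, error) {
	u.bufs = append(u.bufs, b)
	return len(b), nil
}

// Message flattens the retained buffers and decodes them into the
// wrapped message. (The flattening is exactly the copy the gather
// API would eliminate; it stands in here for an incremental parser.)
func (u *unmarshaller) Message() (Message, error) {
	var flat []byte
	for _, b := range u.bufs {
		flat = append(flat, b...)
	}
	return u.m, Unmarshal(flat, u.m)
}
```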
@dfawley does it look like C++'s zero-copy stream? https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.zero_copy_stream
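For reference, C++'s ZeroCopyInputStream hands the caller the stream's internal buffers rather than copying into a caller-supplied one (Next/BackUp/Skip). A rough Go translation of its shape, with hypothetical names, would look like:

```go
// zeroCopyReader mirrors the shape of C++'s ZeroCopyInputStream:
// Next yields the stream's own internal buffer instead of copying
// into one the caller provides.
type zeroCopyReader interface {
	Next() (data []byte, err error) // next internal buffer, valid until the following call
	BackUp(n int)                   // un-consume the last n bytes returned by Next
	Skip(n int) error               // discard the next n bytes
}
```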
Serialized data read from the wire often spans several data frames. The current proto.Unmarshal API requires that data from all these frames be copied into a single slice. This extra copying (and memory allocation) can be avoided if we have an unmarshal-gather API which can deserialize data from multiple byte slices.

If there's agreement on the proposed API, I'd be happy to work on its implementation as well.
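To make the extra copy concrete: today a caller must flatten the frames before calling proto.Unmarshal, whereas a gather entry point (UnmarshalGather is a hypothetical name, not a proposed signature from this thread) would take the frames directly:

```go
package main

import "github.com/golang/protobuf/proto"

// decode shows the status quo: every frame is copied into one
// contiguous slice before unmarshalling.
func decode(frames [][]byte, msg proto.Message) error {
	var flat []byte
	for _, frame := range frames {
		flat = append(flat, frame...) // the copy this issue wants to avoid
	}
	return proto.Unmarshal(flat, msg)
	// Proposed instead (hypothetical): return proto.UnmarshalGather(frames, msg)
}
```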