Make optimal memory allocation #6
Comments
Hmmm, I seem to remember that there was a reason why it was this way; I'll take a look at this ASAP. I'll probably first add a check that the length doesn't exceed the maximum (so that a client can't mount a DoS attack) and then use the size from the request header!
So I've tried this patch, but we're getting significantly worse performance.

With the patch:

```
pojntfx@fedora:~/Projects/r3map$ go build -o /tmp/r3map-benchmark-direct-mount ./cmd/r3map-benchmark-direct-mount/ && sudo /tmp/r3map-benchmark-direct-mount
Open: 10.24412ms
Latency till first two chunks: 362.348µs
Read throughput: 1713.94 MB/s (13711.50 Mb/s)
Read check: Passed
Write throughput: 477.73 MB/s (3821.86 Mb/s)
Sync: 215.875989ms
Write check: Passed
Close: 35.668024ms
```

Before the patch:

```
pojntfx@fedora:~/Projects/r3map$ go build -o /tmp/r3map-benchmark-direct-mount ./cmd/r3map-benchmark-direct-mount/ && sudo /tmp/r3map-benchmark-direct-mount
Open: 14.540542ms
Latency till first two chunks: 51.69µs
Read throughput: 2356.33 MB/s (18850.61 Mb/s)
Read check: Passed
Write throughput: 502.19 MB/s (4017.50 Mb/s)
Sync: 197.373918ms
Write check: Passed
Close: 36.308795ms
```

I'll take a look at why that is exactly.
Figured it out: if we only increase the buffer size when we actually need it, we get the same or better performance: 184321a
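For illustration, a minimal sketch of the "grow only when needed" idea (not the exact contents of 184321a; the header and constant names follow the snippet later in this thread):

```go
// Reuse one buffer across requests; reallocate only when a request needs
// more capacity than we currently have.
b := []byte{}
for {
	var requestHeader protocol.TransmissionRequestHeader
	if err := binary.Read(conn, binary.BigEndian, &requestHeader); err != nil {
		return err
	}

	length := requestHeader.Length
	if length > defaultMaximumRequestSize {
		return ErrInvalidBlocksize
	}

	if uint32(cap(b)) < length {
		b = make([]byte, length) // grow to the new maximum
	}
	b = b[:length] // reslice; the capacity is kept for later requests

	// … serve the read/write using b …
}
```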
@pojntfx Thanks for the patch. Basically, neither of these solutions (what we had before vs. what we have now) solves both problems at once. The problems:

1. The resident memory footprint: buffers that have grown stay large.
2. The per-request allocation overhead.

With the current solution, once the request size increases we allocate (or rather grow the buffer), but we never shrink it again. This properly addresses problem #2, but when we are in a memory crunch, 12 GiB of resident memory is pretty expensive. In my use case there can be large read-request spikes, which will allocate up to 12 GiB but never come back down.

An alternative is adaptive allocation: a pool of buffers in all powers of two (4 KiB to 32 MiB), where a request picks the closest greater buffer (upper bound), or a smaller buffer at the cost of a few extra CPU cycles (a sketch of such a pool follows below).

At this point I would request making the strategy configurable, so that the user can choose what suits them best. If performance is the demand, one can choose the …

It could also be an interface type:

```go
type transfer interface {
	CopyN(backend io.ReaderAt, offset int64, size int, socket io.Writer) (int, error)
}
```

Or we could introduce a new Backend interface with bufferless functions:

```go
type ReaderTo interface {
	ReadNTo(offset int64, size int, socket io.Writer) (int, error)
}

type WriterFrom interface {
	WriteNFrom(socket io.Reader, writeAtOffset int64, size int) (int, error)
}

type BackendV2 interface {
	ReaderTo
	WriterFrom
	Size() (int64, error)
	Sync() error
}
```
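For illustration, a minimal sketch of the proposed power-of-two buffer pool (4 KiB to 32 MiB size classes, as described above); the package and function names are only illustrative, and sync.Pool is just one possible backing store:

```go
package bufpool

import "sync"

const (
	minSize = 4 * 1024         // 4 KiB, smallest size class
	maxSize = 32 * 1024 * 1024 // 32 MiB, largest size class
)

// classes holds one sync.Pool per power-of-two size class.
var classes []*sync.Pool

func init() {
	for size := minSize; size <= maxSize; size *= 2 {
		s := size // capture for the closure
		classes = append(classes, &sync.Pool{
			New: func() any { return make([]byte, s) },
		})
	}
}

// Get returns a buffer of length n backed by the closest greater
// (or equal) power-of-two size class.
func Get(n int) []byte {
	for i, size := 0, minSize; size <= maxSize; i, size = i+1, size*2 {
		if n <= size {
			return classes[i].Get().([]byte)[:n]
		}
	}
	return make([]byte, n) // larger than the biggest class: allocate directly
}

// Put hands a buffer back to its size class for reuse.
func Put(b []byte) {
	for i, size := 0, minSize; size <= maxSize; i, size = i+1, size*2 {
		if cap(b) == size {
			classes[i].Put(b[:size])
			return
		}
	}
}
```

A request would call Get with its exact length and Put the buffer back once the response has been written, so peak memory tracks the request sizes actually in flight instead of growing monotonically.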
Hmm, this sounds like a good idea. I'd personally argue against changing the underlying interface (since a change from the typical …).
Another option could also be to do this:

```go
b := []byte{}
for {
	var requestHeader protocol.TransmissionRequestHeader
	if err := binary.Read(conn, binary.BigEndian, &requestHeader); err != nil {
		return err
	}

	if requestHeader.RequestMagic != protocol.TRANSMISSION_MAGIC_REQUEST {
		return ErrInvalidMagic
	}

	// Reject oversized requests before allocating anything
	length := requestHeader.Length
	if length > defaultMaximumRequestSize {
		return ErrInvalidBlocksize
	}

	// Only reallocate if the required length differs from the current one
	if length != uint32(len(b)) {
		b = make([]byte, length)
	}

	// … handle the request using b …
}
```

This way we only change the buffer length if the one we require is different from the one we currently have. In my local r3map benchmark, this gives the same performance as the "reuse or grow" strategy, and it should help fix your problem.
Considering that the fixes from #4 (comment) are done, the next step will be to support highly concurrent IO.
So far I have observed that 32 MiB is allocated for any size of IO request, which increases memory usage unnecessarily:
go-nbd/pkg/server/nbd.go, line 319 in b8f5045
As an alternative, we can allocate exactly the requested size (for both reads and writes).
I have tested this, and since Linux device-mapper devices (if used on top of an NBD device) issue very small IO requests, it reduces memory usage significantly, by around 80-90% (most IO requests do not exceed 4 MiB).
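For illustration, a minimal sketch of that exact-size allocation, reusing the header and constant names from the snippets above:

```go
// Size the buffer to this request instead of reusing a fixed 32 MiB buffer.
length := requestHeader.Length
if length > defaultMaximumRequestSize {
	return ErrInvalidBlocksize
}
b := make([]byte, length) // exactly as large as this request needs
```

The trade-off discussed in the comments above is that this allocates on every request, which is what the grow-only and pooled strategies try to avoid.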