Optimize tar stream generation #22
Conversation
This avoids extra allocations on `ReadBytes` and in the decoding buffers. Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(force-pushed 4121bf2 to 8086bff)
Travis fails because

Also, can you explain why the unpacker has the duplicate-entry check? I would think we would need something like that when making a tar-split file, but why do we have to recheck a file already on disk?
adding a simple benchmark for getter putter, and the json looks nicer
func copyWithBuffer(dst io.Writer, src io.Reader, buf []byte) (written int64, err error)
may be trivial, but would you add some tests?
To this func specifically? This is copied from https://github.com/golang/go/blob/master/src/io/io.go#L366
hah. fair.
perhaps a comment citing such
updated
- New writeTo method avoids creating an extra pipe.
- Copy with a pooled buffer instead of allocating a new buffer for each file.
- Avoid extra object allocations inside the loop.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(force-pushed 8086bff to 23b6435)
with a basic benchmark on
@vbatts that last benchmark doesn't really reflect real-world usage, where you may have thousands of file accesses per one tar-data file. Also, docker never calls

edit:
I see. Well, overall this LGTM. I wish it were a bit more comparable for benchmarking, but so it goes.
@tonistiigi I've tagged release v0.9.11 for this.
understood. feel free to offer up more appropriate benchmarks.
After the content-addressability PR, docker has a migration step when starting the daemon for the first time. This step calculates the sha256 checksum of all the current data on disk.
This is quite time-consuming if you have lots of data, so I've tried to make it as fast as possible.
This branch has the changes in docker side: moby/moby@master...tonistiigi:migration-opt
All the docker-side optimizations together make migration 55% faster in my testcase. Half of that comes from the parallel processing on the docker side; the other half from the general optimizations, mostly in this PR.
The JSON parsing itself is not optimal yet, especially the part where it creates new buffers for decoding base64. Fixing this would probably yield a 5-10% speed increase, but I'm not sure it's worth it, considering the code would become quite a bit messier.