Optimize pack readHeader() implementation #1574

Merged
merged 1 commit into from Jan 24, 2018

ifedorenko commented Jan 24, 2018

What is the purpose of this change? What does it change?

Load the pack header length and 15 header entries with a single backend
request. This eliminates the separate header Load() request for most pack
files and significantly improves index.New() performance.

Was the change discussed in an issue or in the forum before?

See #1567

Checklist

  • I have read the Contribution Guidelines
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in a subdir of changelog/x.y.z that describe the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review
Load pack header length and 15 header entries with single backend
request. This eliminates separate header Load() request for most pack
files and significantly improves index.New() performance.

Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
@ifedorenko ifedorenko force-pushed the ifedorenko:1567_optimize-pack-readHeader branch to 953f3d5 Jan 24, 2018
const maxHeaderSize = 16 * 1024 * 1024

// we require at least one entry in the header, and one blob for a pack file
var minFileSize = entrySize + crypto.Extension

// number of header entries to download as part of header-length request
var eagerEntries = uint(15)

ifedorenko Jan 24, 2018

@fd0 This number is based on stats from a single 435GB repository (where >98% of all packs have 15 or fewer header entries, fwiw). It may be useful to get stats from other repositories, assuming you have access to some or know interested users who can provide the info.


return binary.LittleEndian.Uint32(buf), nil
}

const maxHeaderSize = 16 * 1024 * 1024

// we require at least one entry in the header, and one blob for a pack file
var minFileSize = entrySize + crypto.Extension

ifedorenko Jan 24, 2018

Not related to my change, but I believe minFileSize should be 4 bytes larger to account for the header length record at the end of the file.

fd0 Jan 24, 2018

Oh indeed, that's right.
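
The arithmetic behind this suggestion can be sketched as below. This is only an illustration: `minPackFileSize` is a hypothetical helper, and the two sizes are assumed values rather than ones stated in this PR.

```go
package main

// Assumed values, for illustration only; restic defines the real ones in
// its pack and crypto packages.
const (
	assumedEntrySize       = 37
	assumedCryptoExtension = 32
	headerLengthSize       = 4 // uint32 length record at the end of the file
)

// minPackFileSize returns the lower bound on a pack file's size once the
// trailing header length record is counted in addition to one header entry
// and the encryption overhead.
func minPackFileSize(entrySize, cryptoExtension uint) uint {
	return entrySize + cryptoExtension + headerLengthSize
}
```

Under these assumptions the bound grows from 69 to 73 bytes.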

fd0 approved these changes Jan 24, 2018

fd0 left a comment

Cool, thanks!

@fd0 fd0 merged commit 953f3d5 into restic:master Jan 24, 2018
2 of 3 checks passed
continuous-integration/travis-ci/pr The Travis CI build is in progress
continuous-integration/appveyor/pr AppVeyor build succeeded
hound No violations found. Woof!
fd0 added a commit that referenced this pull request Jan 24, 2018
fd0 added a commit that referenced this pull request Jan 24, 2018
@ifedorenko ifedorenko deleted the ifedorenko:1567_optimize-pack-readHeader branch Jan 25, 2018