
Reduce index memory #2781

Merged: 5 commits merged into restic:master on Jun 14, 2020

Conversation

@aawsome (Contributor) commented Jun 12, 2020

What is the purpose of this change? What does it change?

Reduce the index memory usage by applying only small, local code changes.

The discussion in #2523 seems to have stalled, and I realize that the core developers do not have time to discuss that many changes to improve index memory.

So I cherry-picked the ideas that give the best improvement while requiring the smallest code modifications.

Currently restic stores index entries in a map where the key is 33 bytes long and the value is 24 bytes (a slice header), plus 48 bytes for each actually stored index entry.
Assuming a map load factor of 2/3, this gives 134 bytes per index entry.
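
Spelled out (my reading of the numbers above, with the 2/3 load factor taken as a 1.5x overhead on the map buckets):

```
(33 + 24) bytes * 1.5 + 48 bytes ≈ 85.5 + 48 ≈ 134 bytes per index entry
```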

This PR has three improvements:

  • Save offset and length as uint32 instead of uint. This saves 8 bytes per index entry on 64-bit architectures. With this change the pack size is effectively limited to 4 GiB, and restic will now panic if used with larger pack files.
  • Save the pack ID in a separate array and store only the index into this array in indexEntry. Assuming at least 8 blobs per pack on average, this saves at least 20 bytes per index entry. The big advantage of this change is that repositories with small files suffer particularly from index memory consumption, and this case benefits even more from the improvement.
  • As these two modifications make #2338 (Optimize in-memory index memory consumption) much more attractive, I added @MichaelEischer's idea to this PR (I did not find a git branch containing the changes to base this on). As @MichaelEischer pointed out, for smaller entries the approach of #2338 should be changed to store the indexEntry inline in the map, which I did.
    Together with the two other improvements this IMO saves an additional 28 bytes if there are no duplicates (which should be the usual case).

So with this PR, the memory usage is reduced by a total of 56 bytes per index entry (again under the above assumptions), leading to a total reduction of ~41% of the memory used by the index.
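
To make these numbers concrete, here is a minimal sketch of the per-entry layout before and after the changes. The type and field names are illustrative, not the exact identifiers from internal/repository/index.go; what matters are the sizes (48 vs. 16 bytes for the entry itself, and 8 + 20 + 28 = 56 saved bytes out of 134, i.e. the ~41% quoted above):

```go
package main

import (
	"fmt"
	"unsafe"
)

type packID [32]byte

// Old layout (illustrative): every entry carries the full 32-byte pack ID
// plus 64-bit offset/length fields.
type oldIndexEntry struct {
	pack           packID // 32 bytes
	offset, length uint   // 8 bytes each on 64-bit architectures
}

// New layout (illustrative): the pack ID is stored once in a shared []packID
// per index, and the entry keeps only its position there; offset/length
// shrink to uint32, which limits a pack file to 4 GiB.
type newIndexEntry struct {
	packIndex      int    // position in the shared pack ID slice
	offset, length uint32 // 4 bytes each
}

func main() {
	fmt.Println(unsafe.Sizeof(oldIndexEntry{})) // 48
	fmt.Println(unsafe.Sizeof(newIndexEntry{})) // 16
}
```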

Was the change discussed in an issue or in the forum before?

See #2523.
This may close some of the memory-related issues.

Checklist

  • I have read the Contribution Guidelines
  • I have enabled maintainer edits for this PR
  • I have not added tests for all changes in this PR - all present index tests succeed
  • I have not added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

@aawsome (Contributor, Author) commented Jun 12, 2020

Oh, I just realized that I should rebase now that #2749 has been merged.
This will also need a modification if #2773 (which implements other ideas from #2523) is merged first.

@aawsome (Contributor, Author) commented Jun 12, 2020

The PR is now rebased and contains three commits, one for each individual change.

@MichaelEischer / @rawtaz Thanks for giving the hint about git rebase -i! This command rocks!

@aawsome (Contributor, Author) commented Jun 12, 2020

There was another issue with saving pointers into the packID slice: every append may reallocate and copy the backing array, and as there are still references to the old array, it will never be garbage collected...
Hence I switched to saving the index into the packs slice instead.
I did, however, do something wrong when using git rebase, so the commits about saving packs and saving duplicate blobs separately are now merged into one commit - sorry about that!
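
A minimal sketch of the problem (illustrative types, not restic's actual code): a pointer taken into the slice keeps the old backing array alive after an append reallocates it, whereas a plain index does not.

```go
package main

import "fmt"

type packID [32]byte

func main() {
	packs := make([]packID, 0, 1)
	packs = append(packs, packID{1})

	// Taking a pointer into the slice pins its current backing array.
	p := &packs[0]

	// This append exceeds the capacity, so Go allocates a bigger backing
	// array and copies the elements over. The pointer still refers to the
	// old array, which cannot be garbage collected while p is reachable.
	packs = append(packs, packID{2})

	fmt.Println(p == &packs[0]) // false: p points into the abandoned array
	fmt.Println(*p == packs[0]) // true: same value, but from a pinned allocation

	// Storing the position instead of a pointer avoids the pinning and keeps
	// the index entry small.
	idx := 0
	fmt.Println(packs[idx])
}
```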

@MichaelEischer (Member) left a comment


I actually had implemented a proof-of-concept of rather similar changes in https://github.com/MichaelEischer/restic/commits/opt-hash-idx. However, I never got around to cleaning up the code. Feel free to "steal" useful parts.

For deduplicating packIDs into a packIndex, I've also used an additional hashmap during index construction to ensure that each packID is only added once. With the StorePack method from #2773 that's probably no longer necessary. See d2086a1.

The duplicates overflow map is similar to MichaelEischer@8d2fbe5. You might want to take the BenchmarkIndexAlloc from there. The commit message also contains a rather detailed explanation of how the changes work and why they were made. I can tell from my own experience with, e.g., some config file changes from roughly a decade ago, which had no explanation at all (and no, the original author didn't remember the why anymore), that it is extremely useful to include a detailed commit message when the reason for a change is not completely obvious.

The conversion to uint32 index/length fields is much less messy than my variant. 👍
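
For illustration, a minimal sketch of how the two points above (deduplicating pack IDs via a construction-time map, and an overflow map for duplicate blobs) can fit together. All names are hypothetical stand-ins, and the actual code in internal/repository/index.go differs in detail:

```go
package index

// blobHandle plays the role of restic.BlobHandle (blob ID plus blob type).
type packID [32]byte

type blobHandle struct {
	id  [32]byte
	typ uint8
}

type indexEntry struct {
	packIndex      int // position in Index.packs
	offset, length uint32
}

type Index struct {
	packs     []packID                    // each pack ID is stored exactly once
	packIndex map[packID]int              // construction-time helper: pack ID -> position in packs
	blob      map[blobHandle]indexEntry   // first (and usually only) occurrence, stored inline
	overflow  map[blobHandle][]indexEntry // rare additional occurrences of the same blob
}

// addToPacks returns the position of id in packs, appending it at most once.
func (idx *Index) addToPacks(id packID) int {
	if pos, ok := idx.packIndex[id]; ok {
		return pos
	}
	pos := len(idx.packs)
	idx.packs = append(idx.packs, id)
	idx.packIndex[id] = pos
	return pos
}

// store keeps the common single-occurrence case inline in the main map and
// pushes further occurrences of the same blob into the small overflow map.
func (idx *Index) store(h blobHandle, e indexEntry) {
	if _, ok := idx.blob[h]; !ok {
		idx.blob[h] = e
		return
	}
	idx.overflow[h] = append(idx.overflow[h], e)
}
```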

Inline review comment on internal/repository/index.go (outdated, resolved)
@aawsome (Contributor, Author) commented Jun 13, 2020

> I actually had implemented a proof-of-concept of rather similar changes in https://github.com/MichaelEischer/restic/commits/opt-hash-idx. However, I never got around to cleaning up the code. Feel free to "steal" useful parts.

Oh, I wasn't aware of this branch - I did steal some parts, thank you 😄

> For deduplicating packIDs into a packIndex, I've also used an additional hashmap during index construction to ensure that each packID is only added once. With the StorePack method from #2773 that's probably no longer necessary. See d2086a1.

I added this for the Store path - if Store is removed in the future, this code can be cleaned up as well.

> You might want to take the BenchmarkIndexAlloc from there. The commit message also contains a rather detailed explanation of how the changes work and why they were made. I can tell from my own experience with, e.g., some config file changes from roughly a decade ago, which had no explanation at all (and no, the original author didn't remember the why anymore), that it is extremely useful to include a detailed commit message when the reason for a change is not completely obvious.

I added the benchmark (thanks for the hint!) together with a detailed comment on the current state in the first commit.
In the following commits I also updated this comment to make clear what the impact of each change is, and added the explanation of why indexEntry is inlined into the map to the commit message.

I hope all comments are understandable; happy to get feedback or suggestions for improvements!

@MichaelEischer (Member) left a comment


I have a few more remarks on the comments, but besides that the PR seems finished.

10 inline review comments on internal/repository/index.go (outdated, resolved)
aawsome and others added 5 commits June 14, 2020 07:50
A side remark on the definition of Index.blob:

Another possibility would have been to use:
blob       map[restic.BlobHandle]*indexEntry

This would have led to the following sizes:
key: 32 + 1 = 33 bytes
value: 8 bytes
indexEntry:  8 + 4 + 4 = 16 bytes
each packID: 32 bytes

To save N index entries, we would therefore have needed (with OF the map overhead factor and BP the average number of blobs per pack):
N * OF * (33 + 8) bytes + N * 16 bytes + N * 32 bytes / BP = N * 82 bytes (for OF = 1.5, BP = 8)

More precisely, using a pointer instead of a direct entry is the better memory choice if:
OF * 8 bytes + entrysize < OF * entrysize <=> entrysize > 8 bytes * OF/(OF-1)
Under the assumption of OF = 1.5, this means using pointers would have been the better choice
if sizeof(indexEntry) > 24 bytes.
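
As a concrete check (my addition, using the numbers above): with OF = 1.5 the break-even size is 8 * 1.5 / 0.5 = 24 bytes, while the inlined indexEntry is only 16 bytes, so storing it by value in the map is indeed the cheaper choice here.
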
@aawsome (Contributor, Author) commented Jun 14, 2020

Here are the benchmark results on my laptop:

before the changes (only including commit 7419844):

BenchmarkPackerManager-4                           	       4	 265876473 ns/op	 198.21 MB/s	   93398 B/op	    1708 allocs/op
BenchmarkDecodeIndex-4                             	   77702	     16464 ns/op
BenchmarkIndexHasUnknown-4                         	13663558	        85.7 ns/op
BenchmarkIndexHasKnown-4                           	13718383	        87.5 ns/op
BenchmarkIndexAlloc-4                              	       1	1531219826 ns/op	904583360 B/op	 2707659 allocs/op
BenchmarkMasterIndexLookupSingleIndex-4            	 6282409	       217 ns/op
BenchmarkMasterIndexLookupMultipleIndex-4          	 1929421	       615 ns/op
BenchmarkMasterIndexLookupSingleIndexUnknown-4     	 8851906	       135 ns/op
BenchmarkMasterIndexLookupMultipleIndexUnknown-4   	 2059674	       564 ns/op
BenchmarkSaveAndEncrypt-4                          	 6194496	       191 ns/op	22009700.59 MB/s	       3 B/op	       0 allocs/op
BenchmarkLoadTree-4                                	    3322	    355004 ns/op
BenchmarkLoadBlob-4                                	     144	   8240706 ns/op	 121.35 MB/s
BenchmarkLoadAndDecrypt-4                          	     126	   9521347 ns/op	 105.03 MB/s
BenchmarkLoadIndex-4                               	      33	  40962411 ns/op
ok  	github.com/restic/restic/internal/repository	165.710s

with changes:

BenchmarkPackerManager-4                           	       4	 266619089 ns/op	 197.66 MB/s	   93466 B/op	    1709 allocs/op
BenchmarkDecodeIndex-4                             	   75967	     16353 ns/op
BenchmarkIndexHasUnknown-4                         	14869858	        85.5 ns/op
BenchmarkIndexHasKnown-4                           	13833954	        86.4 ns/op
BenchmarkIndexAlloc-4                              	       1	1130858274 ns/op	785641912 B/op	  976445 allocs/op
BenchmarkMasterIndexLookupSingleIndex-4            	 5429026	       223 ns/op
BenchmarkMasterIndexLookupMultipleIndex-4          	 1880902	       630 ns/op
BenchmarkMasterIndexLookupSingleIndexUnknown-4     	 8154114	       135 ns/op
BenchmarkMasterIndexLookupMultipleIndexUnknown-4   	 2070068	       556 ns/op
BenchmarkSaveAndEncrypt-4                          	 6143935	       192 ns/op	21827305.25 MB/s	       3 B/op	       0 allocs/op
BenchmarkLoadTree-4                                	    3522	    357187 ns/op
BenchmarkLoadBlob-4                                	     145	   8296829 ns/op	 120.53 MB/s
BenchmarkLoadAndDecrypt-4                          	     126	   9814544 ns/op	 101.89 MB/s
BenchmarkLoadIndex-4                               	      32	  40884265 ns/op

ok  	github.com/restic/restic/internal/repository	114.982s

@MichaelEischer (Member) left a comment


LGTM. Thanks!

Grrr, it seems like GitHub is unable to just show the raw difference introduced by a force push. However, git diff b6d687e2..1361341c or git range-diff b6d687e2...1361341c (which compares the patch series) comes to the rescue.

@MichaelEischer merged commit 74bc714 into restic:master on Jun 14, 2020
@aawsome deleted the reduce-index-memory branch on Jun 14, 2020 at 09:02
@greatroar (Contributor) commented:

Confirmed that this works: restic mount of a 1 TB repo used 1.714 GiB peak RAM at 1d66bb9, now 1.325 GiB, a 22.7% reduction. The discrepancy with the predicted 41% is largely due to the blob size cache.

@aawsome (Contributor, Author) commented Jun 14, 2020

I see I should bring #2587 back into good shape 😉 - will rebase it onto master soon...
