Reduce memory usage #90
I've repeated my experiment from #85. Tests below were done with version bda33e6. To recap:

- Test set: 80449 directories, 318949 files (71 GB)
- Machine used: same setup as in #85
- Backup is on a same-disk SSD.

First run:

Strangely, it appears to have become even more hungry than in #85, and right before exiting it jumps up a little higher. Is there any way we can get better numbers from this? Can we do heap profiling in Go? Looking at the numbers, things also seem to have slowed down a bit. Mind you though: this is a single run and highly unscientific.

Second run:

Any ideas how we could easily track this over time?
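(As an aside on the heap-profiling question: yes, Go supports it via the standard runtime/pprof package. A minimal sketch follows; the --memprofile flag shown here is hypothetical, not an actual restic option:)

```go
package main

import (
	"flag"
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	// Hypothetical flag for illustration; restic has no such option here.
	memprofile := flag.String("memprofile", "", "write a heap profile to this file")
	flag.Parse()

	// ... run the actual workload (e.g. the backup) here ...

	if *memprofile != "" {
		f, err := os.Create(*memprofile)
		if err != nil {
			log.Fatal(err)
		}
		defer f.Close()

		runtime.GC() // force a collection so the profile reflects live data
		if err := pprof.WriteHeapProfile(f); err != nil {
			log.Fatal(err)
		}
	}
}
```

The resulting file can then be inspected with `go tool pprof` to see which allocation sites hold the most memory.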
Thanks for the data, I expected something like this. Yesterday I changed the chunker implementation to use streams instead of buffers, but the complete tree is still held in memory. My plan is to change the internals to use a pipeline-style processing infrastructure; this is not yet finished and should be done this week. Before that is possible, I need to upgrade several other internal components (notably encryption and the backend) to use streams. We'll see how that goes. Building a pipeline that processes a tree structure of dependent nodes concurrently is surprisingly hard... Do you have a good idea for a test that tracks memory usage over time?
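(On the tracking question, one simple approach is to sample runtime.ReadMemStats at a fixed interval while the backup runs and log the numbers, so successive runs can be compared. A rough sketch; all names here are made up for illustration:)

```go
package main

import (
	"log"
	"runtime"
	"time"
)

// sampleMemory logs heap usage every interval until done is closed.
// Illustrative only, not part of restic.
func sampleMemory(interval time.Duration, done <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	var m runtime.MemStats
	for {
		select {
		case <-done:
			return
		case <-ticker.C:
			runtime.ReadMemStats(&m)
			log.Printf("heap=%d MiB, sys=%d MiB, numGC=%d",
				m.HeapAlloc>>20, m.Sys>>20, m.NumGC)
		}
	}
}

func main() {
	done := make(chan struct{})
	go sampleMemory(time.Second, done)

	// ... run the workload under test here ...
	time.Sleep(5 * time.Second)

	close(done)
}
```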
The talk "Taking Out the Trash" by @kisielk is interesting, especially the …
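(For context: the talk deals with garbage-collector pressure in Go. One allocation-reducing pattern in that spirit is reusing buffers through sync.Pool instead of allocating a fresh slice per chunk; a minimal sketch, not restic code:)

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool hands out reusable 1 MiB buffers instead of allocating a new
// slice for every chunk, which cuts down on garbage-collector work.
var bufPool = sync.Pool{
	New: func() interface{} {
		return make([]byte, 1<<20)
	},
}

func processChunk(process func([]byte)) {
	buf := bufPool.Get().([]byte)
	defer bufPool.Put(buf) // return the buffer for reuse
	process(buf)
}

func main() {
	processChunk(func(buf []byte) {
		fmt.Println("working with a", len(buf), "byte buffer")
	})
}
```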
I spent some time implementing a pipeline-based architecture for backup and switched several internal parts to the new stream-based (io.Reader/io.Writer) infrastructure; the results look promising:
The files …

Where …
I have just pushed some commits to master (master is 06ed5c1 at the moment); if you like, please have a look. Unfortunately, incremental backups are broken at the moment, as the infrastructure is not yet ready.
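(To illustrate what stream-based processing buys here: an io.Reader-based pipeline handles data incrementally instead of buffering whole files in memory. A generic sketch using io.TeeReader to hash a file while streaming it onward; this is not restic's actual API, and the filename is made up:)

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"os"
)

func main() {
	f, err := os.Open("testfile") // hypothetical input file
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	// TeeReader feeds every byte read from f into the hash as a side
	// effect, so the file is hashed and forwarded in one streaming pass
	// without ever being held in memory as a whole.
	h := sha256.New()
	tee := io.TeeReader(f, h)

	n, err := io.Copy(io.Discard, tee) // stand-in for the real destination
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("streamed %d bytes, sha256 %x\n", n, h.Sum(nil))
}
```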
Just pushed several improvements; it'd be good to have feedback on this :) The current RAM usage for my …
And it's even faster, too! I usually got only ~30MiB/s until today... btw: …
@rubenv Did you find time to give it another shot yet?
Yes I did! It's gotten better. Here's the output:
As for the memory usage: it starts up at about 250 MB (much, much less than before). Then it slowly starts to climb:
The second backup of the same data set seems to error, though:
At this point it just hangs (no activity). Note the stat error; there's probably something broken in there. I had to kill it. For completeness, the memory usage at this point: …

Both runs were performed with git revision f51aba1.
Without the snapshot reference:
This backup barely introduced new data (which is why it's so fast), so I guess the memory bubbling is in the blob packing.
Thanks for trying it again. I must acknowledge that you found a bug: the second run with …

Once #105 is done, this should not be necessary anymore.
restic doesn't terminate in this case; this was described by @rubenv in #90 (comment).
I think this issue can be closed for now.
In #85, @rubenv reported huge memory usage; this should be tracked (and fixed) in this issue.
@rubenv Could you try again with the current master branch? Thanks a lot!