Excessive memory use when uploading #233
Hi, as I said on the forum, I'll be happy to test a release candidate for this fix; just reach me here on this issue.
Forgive me; since it's the holidays, I'll be responding from a slight bit of ignorance, as I haven't gone back to look at the code in question. My feeling is that this problem is unfortunately not about memory use or garbage being made; I think it's more a problem where various usage goals aren't aligning. The end user of a network copy package has the primary goal of moving bytes across a network as fast as possible. As a secondary goal, in some situations you also want to constrain how much memory can be utilized at any moment (however, not always). Achieving those goals across multiple scenarios is where everything kinda goes bad. Are we:
What I remember of the existing structure (my memory may be faulty) was that there were three knobs:
My memory may be faulty on this, but I remember the design being such that the common case was to use a single client to transfer files to a single endpoint (reusing the sync.Pool). The API provided restrictions through an options struct, if I remember correctly. I didn't want to make changes to the API without thinking about an overhaul of the package, so that led to some restrictions on what I was willing to change. For the use cases I tested (which revolved around kusto and azcopy), these changes let me increase transfer speeds by up to 10x for many transfer types (though my main goal was to fix a couple of bugs). So apparently I was sacrificing your use case to improve others. I remember thinking when I was making the change that a good long-term plan would be to have an interface that looked something like:

```go
type TransferManager interface {
	Get() *bytes.Buffer
	Put(b *bytes.Buffer)
	Run(func() error)
}
```

This would get passed in whenever someone made a streaming call. There would be several pre-defined types that could be used, and users could define their own if none of those were satisfactory. The code using TransferManager would never use more than the size of an allocated buffer (unless it was 0, which would error). Run() would act as a thread-pool manager (which is more efficient than spinning off a goroutine per copy). So if static buffers were better for a use case, you could do that (the method before the change), or a sync.Pool for others; sometimes a combination of both is great (I've done this a few times). For long-running services doing copies in a BE QoS transfer class, you could write something that responds to network back pressure (dialing your attempted copy speed up or down). And because you would pass your concrete type to the call, you could choose to share this between all calls or create a new one per call.

I'll dig into this deeper next week (going to try to enjoy the holiday), but I am concerned specifically with the OOM mentioned. I'm not familiar with rclone, so if you could link me the code where it's using the streaming code and whatever you were running to pull the memory profile, I'd appreciate it. Thanks for bringing this up and tagging me.
The main upload code is here in rclone. Rclone calls `azblob.UploadStreamToBlockBlob` (azure-storage-blob-go/azblob/highlevel.go, lines 386 to 388 in 456ab47), which in turn calls `copyFromReader`, which is the code in question.
I don't think there is any way of re-using the sync.Pool. Since we are making and throwing away a sync.Pool for each object uploaded, the sync.Pool retains its memory until some indeterminate time later, hence the memory leak.

It may well be that there is a better interface for rclone to be calling here; all it wants to do is upload a short blob of known size which doesn't need to be chunked. It has a separate multipart upload routine of its own and doesn't use the transfer manager (which I didn't know about until now!).

Here are docs on how to get memory profiles out of rclone: https://rclone.org/rc/#debugging-memory-use - you'll need the

Enjoy your holidays - no rush :-)
Thanks for getting back to me. I'm making some changes now that hopefully will give a more robust and customizable solution. Though I feel that we should probably consider just re-writing the streaming interface into a client model, I'm going to keep this change API compatible. Back to you in a day or so.
See: Azure#233

There needs to be a more robust way of controlling memory reuse and threadpooling for streaming copies (or maybe all copies; I will not speak to that here). This introduces a new interface called TransferManager. This provides 3 main functions:

- Control buffer allocation
- Control goroutine allocation
- Allow reuse

This allows a developer to control exactly how they wish to handle this for their use case. We provide two implementations:

- Static buffer allocation, similar to the original implementation
- Threadpool allocation, similar to how it currently works since my change

Additionally, both of these support having shared pools between calls, which neither did before. It also allows users to create their own implementations that better support their needs (a hybrid of the two above, one that adjusts buffers or concurrency based on transfer speed, ...).

Note: the default behavior returns to the static buffer method here. If azcopy likes the current behavior, it should switch using NewSyncPool().

Like the original change, this does not change API compatibility. I think that is probably something that should be done, but not in this change.

There are 3 files here that were not part of my change but were not go fmt'd, so they have been updated by the go tool:

- bytes_wrtier.go
- zc_*

All tests outside the end-to-end test system that are expected to pass do (comparing the non-changed version against this one).
Added PR: #234
Add option to UploadStreamToBlockBlob to specify a custom BufferPool. This is needed to, for example, reuse a pool between transfers. Relates to Azure#233
Any news on this? It came up again in the rclone forum. @ajankovic, are you working on a fix for this? Your idea of passing in a pool looks like a great fix to me.
This fixes the excessive memory use in the Azure SDK by using the as yet unmerged PR to implement a custom memory allocator. Azure/azure-storage-blob-go#242 This is to fix the underlying bug Azure/azure-storage-blob-go#233 See: https://forum.rclone.org/t/ask-for-settings-recommendation-for-azureblob/21505/
@ncw yes, the PR is one way to solve it, but it's not up to me to merge it. I am not sure why the tests are failing, though; it's not related to my changes in any way. They are probably flaky.
You need your environment set up to run some of the tests; I remember this from doing the original patch. I left comments on this on your patch in your PR and linked it in my PR. I believe azblob has some type of release cadence, so things get looked at as those approach. Btw, @ajankovic, thanks for the patch on what was a stupid oversight from my patch. I detangled a bunch of stuff and had to write benchmarks that didn't go into the repo to make sure certain things were at least as fast as before for certain apps. I lost the forest for the trees, it appears.
I've got a user testing your patch @ajankovic on the rclone forum. Hopefully it will fix their problem. We need a fix for this ASAP though; it is causing a lot of problems for rclone users in particular.
@ncw we have our own fork of rclone that uses my patch, and it's definitely working in our tests. Having all of this upstream would be even better, to cut down on maintenance.
But this failed on CI, where the environment is correctly set up. And my PR changes neither default behaviour nor integration tests; if a pool is not provided, it's not used. I'll check it out.
I am 99% sure that certain tests never work in the CI, and that this can only be tested locally if you have the right setup. I'm pretty sure only the devs have that setup.
See: #233

There needs to be a more robust way of controlling memory reuse and threadpooling for streaming copies (or maybe all copies; I will not speak to that here). This introduces a new interface called TransferManager. This provides 3 main functions:

- Control buffer allocation
- Control goroutine allocation
- Allow reuse

This allows a developer to control exactly how they wish to handle this for their use case. We provide two implementations:

- Static buffer allocation, similar to the original implementation
- Threadpool allocation, similar to how it currently works since my change

Additionally, both of these support having shared pools between calls, which neither did before. It also allows users to create their own implementations that better support their needs (a hybrid of the two above, one that adjusts buffers or concurrency based on transfer speed, ...).

Note: the default behavior returns to the static buffer method here. If azcopy likes the current behavior, it should switch using NewSyncPool().

Like the original change, this does not change API compatibility. I think that is probably something that should be done, but not in this change.

There are 3 files here that were not part of my change but were not go fmt'd, so they have been updated by the go tool:

- bytes_wrtier.go
- zc_*

All tests outside the end-to-end test system that are expected to pass do (comparing the non-changed version against this one).

Co-authored-by: John Doak <jdoak@DANEWMAN-LENOVO.redmond.corp.microsoft.com>
@element-of-surprise @ajankovic Thanks a lot for the contribution! We really appreciate it. I also apologize for the long delay in reviewing the PRs. We really like #242 but chose to merge #234 instead, as it is more future-proof. Thanks for the detailed analysis, @ncw!
We'll make a release soon.
This fixes the excessive memory use in the Azure SDK by using the dev branch which contains a fix for it. Rclone then implements a custom TransferManager which integrates with rclone's memory pool. This is to fix the underlying bug Azure/azure-storage-blob-go#233 See: https://forum.rclone.org/t/ask-for-settings-recommendation-for-azureblob/21505/
Thanks for fixing this :-) If you could make the release in the next day or two, I can fit it into rclone 1.54, which would be very convenient for me :-)
It is done! Sorry again for the long delay. |
…ransferManager In the Azure SDK there was a bug which caused excessive memory use when doing repeated transfers: Azure/azure-storage-blob-go#233 This patch updates the SDK to v0.13.0 which allowed us to implement a custom TransferManager which integrates with rclone's memory pool. This fixes the excessive memory consumption. See: https://forum.rclone.org/t/ask-for-settings-recommendation-for-azureblob/21505/
Thank you very much for making the release @zezha-msft. I've put this fix into rclone now; it will be released with rclone 1.54, and many users will rejoice!
Which version of the SDK was used?
v0.11.0
Which platform are you using? (ex: Windows, Linux, Debian)
Linux
What problem was encountered?
Excess memory usage compared to v0.8.0
How can we reproduce the problem in the simplest way?
Uploading lots of files with rclone works pretty well.
Have you found a mitigation/solution?
Setting `GOGC=20` mitigates the problem. This is indicative of too much garbage being created.

Analysis
In c0650f8 #167 by @element-of-surprise the chunked uploader was refactored and a `sync.Pool` was introduced. This `sync.Pool` appears to be causing memory problems according to the memory profiles below.

Testing with the azureblob backend in rclone produces the memory profile attached:
memprofile-1.53.3.log
This looks like
Compared to the profile from the previous version of rclone, which shows no use of sync.Pool.
This caused memory usage for rclone to go up from 400 MB to 1.4 GB between the two versions. The old version of rclone was using v0.8.0.
What appears to be happening is that memory is getting stuck in the `sync.Pool`. From the sync.Pool docs:

> On the other hand, a free list maintained as part of a short-lived object is not a suitable use for a Pool, since the overhead does not amortize well in that scenario. It is more efficient to have such objects implement their own free list.

I think the use of sync.Pool is falling into that case. The Pool isn't shared between connections, so it isn't really helping. Because it is created and destroyed for each upload, the old Pools will hang around in memory waiting to be garbage collected, and that is the reason why GOGC=20 works around the problem, I think.

Since you are recycling buffers of a fixed size, maybe a `chan []byte` would work better than the `sync.Pool`; you could then flush it at the end of the transfer (unlike a `sync.Pool`, which you can't flush). Alternatively, implement a global `sync.Pool`. You'll need one for each size of chunk though, which is a disadvantage.

This was originally discussed on the rclone forum.