Blob upload results in excessive memory usage/leak #894

Closed
pawboelnielsen opened this issue May 22, 2019 · 34 comments

@pawboelnielsen

pawboelnielsen commented May 22, 2019

Which service(blob, file, queue, table) does this issue concern?

Blob

Which version of the SDK was used?

Microsoft.Azure.Storage.Blob -Version 10.0.3

Which platform are you using? (ex: .NET Core 2.1)

.NET Core 2.2.300

What problem was encountered?

OutOfMemory / excessive memory usage

How can we reproduce the problem in the simplest way?

using System.IO;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;

namespace ConsoleAppAzureBlobMemoryIssue
{
    class Program
    {
        static void Main(string[] args)
        {
            CloudStorageAccount account = CloudStorageAccount.Parse("StorageConnectionString");
            CloudBlobClient serviceClient = account.CreateCloudBlobClient();

            var container = serviceClient.GetContainerReference("backup");
            container.CreateIfNotExistsAsync().Wait();

            string path = @"c:\folder\gigafile.zip";
            CloudBlockBlob blob = container.GetBlockBlobReference(Path.GetFileName(path));
            blob.UploadFromFileAsync(path).Wait();
        }
    }
}

Run the program, open Task Manager, and "enjoy" how the memory usage of dotnet.exe explodes as the file is uploaded; memory usage increases in direct proportion to the number of bytes uploaded so far. So if you attempt to upload a 50GB file, you will need 50GB+ of free memory.

Have you found a mitigation/solution?

The last working version is WindowsAzure.Storage -Version 9.3.3; newer versions contain the issue.

@carlkenne

carlkenne commented Jun 4, 2019

We're about to upgrade from v8 to v10, but then I saw this. It looks like a show stopper?

@pawboelnielsen
Author

pawboelnielsen commented Jun 4, 2019

We're about to upgrade from v8 to v10, but then I saw this. It looks like a show stopper?

That depends on the size of the files you want to upload and the memory available. The memory is released when the upload is complete; it is only while the upload is in progress that memory is not released.

@djj0809
Member

djj0809 commented Jun 23, 2019

I ran into the same problem when uploading a large file (> 5GB).

.NET Core 3.0.0-preview6 & Microsoft.Azure.Storage.Blob version 10.0.3

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at Microsoft.Azure.Storage.Core.Util.StorageAsyncResult`1.End()
   at System.IO.Stream.<>c.<BeginEndWriteAsync>b__61_1(Stream stream, IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncTrimPromise`1.Complete(TInstance thisRef, Func`3 endMethod, IAsyncResult asyncResult, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.Storage.Blob.BlobWriteStream.WriteAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken token)
   at Microsoft.Azure.Storage.Core.Util.AsyncStreamCopier`1.StartCopyStreamAsyncHelper(Nullable`1 copyLength, Nullable`1 maxLength, CancellationToken token)
   at Microsoft.Azure.Storage.Core.Util.AsyncStreamCopier`1.StartCopyStreamAsync(Nullable`1 copyLength, Nullable`1 maxLength, CancellationToken cancellationToken)
   at Microsoft.Azure.Storage.Blob.CloudBlockBlob.UploadFromStreamAsyncHelper(Stream source, Nullable`1 length, AccessCondition accessCondition, BlobRequestOptions options, OperationContext operationContext, AggregatingProgressIncrementer progressIncrementer, CancellationToken cancellationToken)
   at Microsoft.Azure.Storage.Blob.CloudBlockBlob.UploadFromFileAsync(String path, AccessCondition accessCondition, BlobRequestOptions options, OperationContext operationContext, AggregatingProgressIncrementer progressIncrementer, CancellationToken cancellationToken)

@djj0809
Member

djj0809 commented Jun 27, 2019

Have you found a mitigation/solution?

In WindowsAzure.Storage -Version 9.3.3 it works; however, we cannot use this version, as some files are corrupted when uploaded.

@pawboelnielsen May I ask under what conditions files get corrupted in version 9.3.3? I think I have to go back to 9.3.3.

@pawboelnielsen
Author

@djj0809 There was an issue when an upload was aborted before it completed: upon retrying the upload with the same file name we would get the error "The specified blob or block content is invalid".
This error can be avoided by running a delete before uploading, like this: blob.DeleteIfExistsAsync()
So it was a minor issue to fix; aside from that, 9.3.3 works fine for us.

@djj0809
Member

djj0809 commented Jun 27, 2019

@djj0809 There was an issue when an upload was aborted before it completed: upon retrying the upload with the same file name we would get the error "The specified blob or block content is invalid".
This error can be avoided by running a delete before uploading, like this: blob.DeleteIfExistsAsync()
So it was a minor issue to fix; aside from that, 9.3.3 works fine for us.

A big thank you for your really quick response:)

@markheath

I'm seeing the same issue (OutOfMemoryException) uploading a 3.5GB file with UploadFromFileAsync in v10.0.3

@markheath

Can't even work around it with UploadFromStreamAsync - that also seems to require enough memory to hold the entire file:

using (var s = File.OpenRead("large-file.zip"))
{
    await targetBlob.UploadFromStreamAsync(s);
}

@pawboelnielsen
Author

pawboelnielsen commented Jun 28, 2019

Can't even work around it with UploadFromStreamAsync - that also seems to require enough memory to hold the entire file:

Internally both methods are using the same upload implementation.

@markheath

I'd be interested to know if there are any plans to fix this issue soon, or whether there is an underlying technical limitation that requires the entire file to be held in memory. I recently updated our application to the latest version of Microsoft.Azure.Storage.Blob following the advice given here, but this high memory usage is an absolute showstopper for us, and it looks like I'm going to have to roll everything back to WindowsAzure.Storage 9.3.3.

Interestingly it doesn't seem to be a problem if the source isn't from a file. I can use UploadFromStreamAsync to copy from a stream from another large blob opened with OpenReadAsync and memory usage stays well below the overall blob size.

@djj0809
Member

djj0809 commented Jun 28, 2019

I'd be interested to know if there are any plans to fix this issue soon, or whether there is an underlying technical limitation that requires the entire file to be held in memory. I recently updated our application to the latest version of Microsoft.Azure.Storage.Blob following the advice given here, but this high memory usage is an absolute showstopper for us, and it looks like I'm going to have to roll everything back to WindowsAzure.Storage 9.3.3.

Interestingly it doesn't seem to be a problem if the source isn't from a file. I can use UploadFromStreamAsync to copy from a stream from another large blob opened with OpenReadAsync and memory usage stays well below the overall blob size.

I browsed through some of the code (trying to find the bug). I think if we wrap the FileStream in a non-seekable stream (i.e. CanSeek = false), it may not have this problem.

The code internally seems to check whether the stream is seekable, and applies an optimization to seekable streams (which is what causes this issue).
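
A minimal sketch of that idea (my illustration, not code from the SDK or this thread): a read-only wrapper that reports CanSeek = false so the library has to take its non-seekable path. The class name and details are hypothetical and untested; the SDK may still treat non-seekable sources differently (e.g. require a known length or MD5 disabled), so this is only a hypothesis to try.

using System;
using System.IO;

// Illustrative wrapper: hides seekability of an underlying read stream.
class NonSeekableReadStream : Stream
{
    private readonly Stream inner;
    public NonSeekableReadStream(Stream inner) { this.inner = inner; }

    public override bool CanSeek => false;               // force the non-seekable code path
    public override bool CanRead => inner.CanRead;
    public override bool CanWrite => false;

    public override int Read(byte[] buffer, int offset, int count) => inner.Read(buffer, offset, count);
    public override void Flush() { }

    // Everything seek-related is unsupported by design.
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

Usage would be roughly await blob.UploadFromStreamAsync(new NonSeekableReadStream(File.OpenRead(path))), purely as an experiment to see whether the non-seekable path avoids the buffering.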

@markheath

Yeah I had a quick browse through the code, with UploadFromMultiStreamAsync and SubStream being possible places that are holding onto more memory than they should, but couldn't spot an obvious bug.

Not 100% sure about whether just setting CanSeek = false will be enough. For example, this works just fine, copying a huge file from blob to blob, even though CanSeek == true on the source stream:

async Task CopyWithUploadFromStreamAsync(CloudBlockBlob fromBlob, CloudBlockBlob toBlob)
{
    using (var sourceStream = await fromBlob.OpenReadAsync())
    {
        // n.b. sourceStream.CanSeek is true
        await toBlob.UploadFromStreamAsync(sourceStream);
    }
}

@dcarr42

dcarr42 commented Jun 28, 2019

@markheath What does the memory usage profile look like for the blob-to-blob sample? I would like someone on the team to officially consider this a break in functionality.

@markheath

From watching Task Manager, it slowly ramps up to about 200MB-300MB during the copy, and then presumably a GC occurs and memory usage drops back down to 50MB. I managed to copy the whole 3.5GB blob using the technique shown above in 12 minutes, with the memory usage of the app not going above about 300MB most of the time, although towards the end of the copy it did briefly reach 500MB. Presumably that's just down to when GCs occur. It's only when I try to upload the same thing from a local file that I get the memory leak issue.

@dcarr42

dcarr42 commented Jun 28, 2019

Pretty healthy then. You could try increasing the block size closer to the upper bound (100MB) to reduce the number of parallel streams; the 4MB default is probably modelled on smaller payloads. But to be honest the implementation seems a little sketchy.

@pawboelnielsen
Author

From watching Task Manager, it slowly ramps up to about 200MB-300MB during the copy, and then presumably a GC occurs and memory usage drops back down to 50MB. I managed to copy the whole 3.5GB blob using the technique shown above in 12 minutes, with the memory usage of the app not going above about 300MB most of the time, although towards the end of the copy it did briefly reach 500MB. Presumably that's just down to when GCs occur. It's only when I try to upload the same thing from a local file that I get the memory leak issue.

Using v9.3.3 with my simple upload code, at no point does the memory consumption go above 30K, even if the file being uploaded is 50GB. That is where we should be when this hopefully gets fixed.

@markheath

markheath commented Jul 2, 2019

Here's a trace from DotMemory halfway through an upload of a large file. As you can see, the memory is being held onto by the HttpClientHandler RequestState and TaskCompletionSource.

[screenshot: DotMemory retention trace]

I'm wondering if it's this code in CloudBlockBlob.UploadFromMultiStreamAsync, which references the localBlock so it can dispose it. I can't see anything wrong with the implementation of SubStream.Dispose though.

Task cleanupTask = uploadTask.ContinueWith(finishedUpload =>
{
    localBlock.Dispose();
});

Update: Having explored a bit more, I'm not sure the multi-stream code is running at all during the memory leak. It looks like AsyncStreamCopier is performing the copy in this scenario. If I get time I'll see if I can dig a bit deeper and trace through in more detail.

@markheath

Here's a Visual Studio memory snapshot of it after it's leaked a bit of memory. Looks like the instances of MultiBufferMemoryStream are being kept alive by something. I've been looking into whether any TaskCancellationSource instances aren't getting disposed correctly, but so far haven't found anything.
[screenshot: Visual Studio memory snapshot]

@markheath

Still trying to get to the bottom of this, and I've made an interesting discovery. I noticed that the docs say that ParallelOperationThreadCount should be set to 1 for 'small' blobs (less than 256MB), so I tried uploading a large file with ParallelOperationThreadCount set to 2, and the memory leak issue is resolved (for my test file anyway). I also tried with a thread count of 4, which also works.

Trying to understand from the code why that should be the case has been harder. It looks like AsyncSemaphoreAsync is used to queue up pending writes, so I wondered whether reading was happening much faster than writing and a backlog was building up, but the pendingWaits queue does not appear to grow. It really does look like, with a thread count of 1, something is holding onto all the write tasks (even the ones that have completed), which results in needing as much memory as the size of the entire file.

I've yet to try with a really huge file (e.g. 50GB), so I'm not sure whether upping the ParallelOperationThreadCount is a bulletproof solution for all large files, but so far it seems like this might be a workaround. I'll report back once I've tested with even bigger files.
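
For reference, a rough sketch of this workaround against the repro code above (BlobRequestOptions and this UploadFromFileAsync overload exist in the 10.x client; the blob and path variables are assumed from the original repro):

// Sketch of the workaround: force more than one parallel operation so the
// single-threaded upload path discussed above is avoided.
var options = new BlobRequestOptions
{
    ParallelOperationThreadCount = 2
};

await blob.UploadFromFileAsync(path, null /* accessCondition */, options, null /* operationContext */);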

@markheath

I've managed to upload a 50GB file successfully with a thread count of 2, so the high memory usage clearly only applies when thread count is 1. And it looks like from #909 you can also avoid the high memory usage by increasing StreamWriteSizeInBytes above 4MB.

@pawboelnielsen
Author

Thanks @markheath !

@markheath

I tried with StreamWriteSizeInBytes above 4MB, but got high memory usage again: uploading an 8GB file it used a full 8GB of RAM and then threw a StackOverflowException once the file had been uploaded, with the stack trace showing it unwinding lots of calls to release the async semaphore.

[screenshot: stack trace showing repeated async semaphore release calls]

@r-aghaei

Setting StreamWriteSizeInBytes is not enough; you also need to disable MD5 hash computation (StoreBlobContentMD5). Then, regardless of the ParallelOperationThreadCount, memory behaves fine.
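
As a hedged illustration of that suggestion (variable names from the earlier repro, values chosen only as examples):

// Raise the block size above the 4MB default and turn off MD5 computation.
blob.StreamWriteSizeInBytes = 8 * 1024 * 1024; // 8MB blocks, for example

var options = new BlobRequestOptions
{
    StoreBlobContentMD5 = false
};

await blob.UploadFromFileAsync(path, null /* accessCondition */, options, null /* operationContext */);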

@kfarmer-msft
Contributor

@markheath --

Thanks for your work looking into this. I'm now freed up and looking into this on our end, and it's been helpful.

Your stack trace involving the async semaphore's ReleaseAsync method is interesting. In fact, I think it's the likely culprit here. There is a simple issue with its implementation which could lead to unnecessary recursion. If you've got the source, see AsyncSemaphore.Common:

        public async Task ReleaseAsync(CancellationToken token)
        {
            Func<bool, Task> next = null;
            lock (this.pendingWaits)
            {
                if (this.pendingWaits.Count > 0)
                {
                    next = this.pendingWaits.Dequeue();
                }
                else
                {
                    this.count++;
                }
            }

            if (next != null)
            {
                await next(false).ConfigureAwait(false); 
            }
        }

Awaiting a task and returning the results is unnecessary: just return the task.

            if (next != null)
            {
                return next(false);
            }
            else
            {
                return Task.Delay(0); // IIRC Task.Delay(0) should be a cached no-op.  I would prefer Task.CompletedTask, but need to verify targets.
            }
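
Written out in full, my reading of that change is roughly the following (the async modifier also has to be dropped so the task can be returned directly; this is an illustration of the suggestion, not the shipped fix):

        public Task ReleaseAsync(CancellationToken token)
        {
            Func<bool, Task> next = null;
            lock (this.pendingWaits)
            {
                if (this.pendingWaits.Count > 0)
                {
                    next = this.pendingWaits.Dequeue();
                }
                else
                {
                    this.count++;
                }
            }

            // Return the pending wait's task directly instead of awaiting it,
            // so releases no longer chain into a deep async call stack.
            return next != null ? next(false) : Task.Delay(0);
        }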

A 5GB file no longer blows up memory. I've got a debug run with a 50GB file and I'm seeing much better behavior than before.

@kfarmer-msft
Contributor

Nope... the multibuffer is still accumulating. I'll keep looking at this.

@kfarmer-msft
Contributor

I found cases of the multibuffer not being disposed of. I've added some tests to verify the delta peak working set is < 128 MB (what I have been seeing in testing on my machine is an absolute peak of < 64 MB). After further checks, I'll see about getting this in place for our next release.

@markheath

That sounds great, thank you for looking into this. I look forward to trying out the updated version.

@seanmcc-msft
Member

Fixed in v11.0.0.

@GFoley83

Hi @seanmcc-msft, just to clarify: was this fixed for uploading a file using blob.UploadFromFileAsync(path) and/or blob.UploadFromStreamAsync(stream)? Thanks.

@seanmcc-msft
Member

Hi @GFoley83 , if I recall correctly, it was fixed in both.

-Sean

@Bud0211

Bud0211 commented Jun 23, 2020

Hi @seanmcc-msft, @kfarmer-msft, could you confirm that the StackOverflow issue is resolved entirely for the blob.UploadFromStreamAsync(stream) API? We are still seeing some StackOverflow issues with larger file uploads on v11.1.3.

@CooperRon

I am still getting the StackOverflow exception when trying to upload large blobs (i.e. more than 2GB) with ParallelOperationThreadCount=1. When using ParallelOperationThreadCount > 1 the operation is successful.

I am using Microsoft.Azure.Storage.Blob nuget version 11.0.0 but it also happens with the latest version (11.2.2).
Running with .NET framework 4.7.2.

@Bud0211

Bud0211 commented Aug 26, 2020

@CooperRon have you tried increasing the block size (setting the StreamWriteSizeInBytes property of the CloudBlockBlob object that you're uploading) and inserting a retry operation (RetryPolicy)?
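
For concreteness, a sketch of that combination (ExponentialRetry comes from Microsoft.Azure.Storage.RetryPolicies; the values and the blob/path variables are assumptions for illustration, not a confirmed fix):

// Illustrative combination: bigger blocks plus an explicit exponential retry policy.
blob.StreamWriteSizeInBytes = 16 * 1024 * 1024; // e.g. 16MB blocks

var options = new BlobRequestOptions
{
    RetryPolicy = new ExponentialRetry(TimeSpan.FromSeconds(4), 3) // delta backoff, max attempts
};

await blob.UploadFromFileAsync(path, null /* accessCondition */, options, null /* operationContext */);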

@CooperRon

@CooperRon have you tried increasing the block size (setting the StreamWriteSizeInBytes property of the CloudBlockBlob object that you're uploading) and inserting a retry operation (RetryPolicy)?

I was always using a retry policy (exponential), and I have now tested with different StreamWriteSizeInBytes values:

  • 4MB - failed with stack overflow (this is the value I was always running with)

  • 8MB/16MB - succeeded, even with ParallelOperationThreadCount=1. Memory usage increased significantly here, from ~500MB to 750MB/1GB respectively (but this is probably expected). Just a note, I am uploading a 4.6GB blob.
