GetObjectRequest hangs randomly. #152

Closed
Stormsys opened this issue Dec 8, 2014 · 47 comments
Labels
guidance Question that needs advice or information.

Comments

@Stormsys
Contributor

Stormsys commented Dec 8, 2014

My application randomly hangs indefinitely while waiting for a response inside the Amazon library, in AmazonS3Client.GetObject; see the stack trace below. The client object is not being reused.

[image: screenshot of the stack trace]

here is a sample of the code run in that thread:

using (var amazonS3Client = CreateS3Client())
{
    var s3PartDownloadRequest = new GetObjectRequest
    {
        BucketName = download.Bucket,
        Key = download.ObjectKey,
        ByteRange = new ByteRange(position, end)
    };

    using (var getObjectResponse = amazonS3Client.GetObject(s3PartDownloadRequest)) //hangs on this line
    {    
        getObjectResponse.WriteResponseStreamToFile(dest);
    }
}
@Stormsys
Contributor Author

Stormsys commented Dec 8, 2014

P.S. We are currently running AWS SDK 2.3.11, but the issue was also present with 2.3.9.

@Stormsys
Contributor Author

Stormsys commented Dec 8, 2014

Perhaps http://support.microsoft.com/kb/980817 is a likely culprit? The only thing is, I'm pretty sure we're running .NET 4.5.

@Stormsys
Contributor Author

Using the following code works, so I believe there's an issue with the sync version in the SDK:

using (var amazonS3Client = CreateS3Client())
{
    var s3PartDownloadRequest = new GetObjectRequest
    {
        BucketName = download.Bucket,
        Key = download.ObjectKey,
        ByteRange = new ByteRange(position, end)
    };

    using (var getObjectResponse = amazonS3Client.GetObjectAsync(s3PartDownloadRequest).GetResult())//async with my own sync method extension.
    {    
        getObjectResponse.WriteResponseStreamToFile(dest);
    }
}


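//GetResult:
        public static T GetResult<T>(this Task<T> task)
        {
            // Note: calling task.ConfigureAwait(false) without awaiting its result has no effect,
            // so it is omitted here. Wait() blocks the calling thread until the task completes.
            task.Wait();

            return task.Result;
        }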

@Stormsys
Contributor Author

The bug is also observed with "UploadPart" calls (the non-async version). I believe there is a bug in your "Invoke" flow.

@gokarnm
Contributor

gokarnm commented Dec 10, 2014

Thanks @Stormsys for reporting this issue and providing details! I'll look into this. I have a few questions about your application.

  1. Is your application a Windows or an ASP.NET application?
  2. Is the code which calls the S3 API multithreaded? If yes, how many concurrent threads are used?
  3. How frequently do you see this issue? Have you been able to replicate this issue on another machine?
  4. Have you noticed this issue with any other AWS SDK APIs apart from S3?

Because the hang is intermittent, it would be helpful to replicate this issue under different conditions.

  1. Can you switch to an http endpoint instead of https and check if you get the same issue?
    The following snippet shows how you can switch to http.

var s3config = new AmazonS3Config() { UseHttp = true };
var client = new AmazonS3Client(s3config);

@Stormsys
Contributor Author

Hi,

Just to add: with UploadPart the bug persists even with my sync wrapper; sometimes even Async().wait() never detects a completion state. Interestingly, we have not seen this in GetObject since we switched over to the async version.

To answer your questions:

  1. It's an x64 Windows application, running as a Windows service.
  2. The code is multithreaded and peaks at about 100 threads. However, at the point where the bug is visible and I attach the debugger, only 3 threads are active, and they are not interdependent as such (I'm certain it's not a classic deadlock). It might be worth mentioning that the server running the code has 32 hardware threads.
  3. I've seen this issue at least 1-2 times per day, perhaps every 500-700 requests or so.
  4. I'm not currently using any other services as it stands.

I can certainly try HTTP, but realistically this probably would not be a satisfactory workaround.

@gokarnm
Contributor

gokarnm commented Dec 10, 2014

Thanks @Stormsys , yes I understand, I suggested trying out http to isolate the behavior.
Have you seen this behavior on any other server machines?

@Stormsys
Contributor Author

@gokarnm I will try the HTTP setting tomorrow. We do have another server we can test on but have not yet done so; I can do that too if it helps.

@saguiitay

Has anyone been able to resolve this issue? I'm facing it too...

@theofanis

I also experience this sometimes, UploadPartAsync never returns, and actually, it doesn't even upload, since the StreamTransferProgress handler doesn't get anything.

I also provide a CancellationToken, and when the problem occurs, it even ignores the cancellation, so there's actually no way to get this task finished.

@kobi

kobi commented Oct 29, 2015

In case anyone still has this issue, I've found two changes that removed the problem:

  • Before you initialize any of the AWS SDK's classes, set ServicePointManager.DefaultConnectionLimit = 1000;.
    I set it to 1000, but any big number works. MSDN says the default number is Int32.MaxValue, but in fact it is just 2.
    The AWS SDK for .Net does use a value of 50, but that might not be enough (I also use SWF, which requires a large number of open connections).
  • Do not share instances of AmazonS3Client between threads. The SDK uses HttpWebRequest, which isn't thread safe. It isn't clear if some methods are thread safe, and I don't think the SDK guarantees this, so I try not to use the same instance in parallel.
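
A minimal sketch of the first point, assuming a .NET Framework process in which HttpWebRequest-based clients take their per-host connection cap from ServicePointManager. The value 1000 mirrors the comment above; the class and method names are hypothetical. As far as I know, the default only applies to ServicePoint objects created after it is set, which is why it has to run before any client is constructed.

```csharp
using System;
using System.Net;

public static class ConnectionLimitSetup
{
    // Hypothetical helper: call once at process start, before any AWS SDK
    // client (or any other HttpWebRequest user) is constructed.
    public static int RaiseDefaultConnectionLimit(int limit = 1000)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }

    public static void Main()
    {
        int effective = RaiseDefaultConnectionLimit();
        Console.WriteLine("DefaultConnectionLimit = " + effective);
        // Create AWS clients only after this point.
    }
}
```

Later comments in this thread report that raising the limit alone only delays the hang when responses are not disposed, so treat it as a complement to proper disposal rather than a fix.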

Two more comments:

  • I got hanging connections when opening a stream using HttpWebRequest directly to a public S3 URL (public file/static content), without using the SDK at all. This is how I've found the two workarounds above.
  • I get more hanging connections at the office than at home.

@randall-peakey-com

I am seeing something very similar, although we are using the PutObject method.
We can set the DefaultConnectionLimit very high....but as soon as that number of connections has been reached the application hangs.

Running netstat -a -n | find /c "54.231." reports a connection count equal to that limit.
A plain "netstat -a -n" shows those connections in the "CLOSE_WAIT" state.

These connections remain in this state until the application is terminated (this has caused problems when using the library in a website: it must be restarted to clear them).

My guess is that the underlying HttpWebRequest connection is not being closed properly.

We have tried (without success) setting...
System.Net.ServicePointManager.DefaultConnectionLimit = 1000;
System.Net.ServicePointManager.SetTcpKeepAlive(false, 1, 1);
System.Net.ServicePointManager.MaxServicePointIdleTime = 10 * 1000;

@djluck
Contributor

djluck commented Mar 21, 2016

I think the key might be to dispose of the GetObjectResponse as quickly as possible. In my program, I'm concurrently downloading the contents of an entire bucket (with 35 concurrent worker tasks). I noticed that I started seeing object requests hang indefinitely if I didn't immediately read the contents of GetObjectResponse.ResponseStream into memory and dispose of the stream.
Fiddling with the DefaultConnectionLimit didn't seem to offer any improvement, only the quick disposal of the stream made any difference for me.
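
A rough sketch of that "read it and dispose it right away" pattern, assuming the AWSSDK.S3 package and objects small enough to buffer in memory. The helper name DownloadToMemoryAsync is hypothetical; GetObjectAsync, GetObjectResponse and ResponseStream are the SDK members discussed in this thread.

```csharp
using System.IO;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

public static class S3Download
{
    // Hypothetical helper: buffers the whole object, so it is only suitable
    // for objects that comfortably fit in memory.
    public static async Task<byte[]> DownloadToMemoryAsync(
        IAmazonS3 client, string bucketName, string key)
    {
        // The using block disposes the response (and with it the response
        // stream) as soon as the copy finishes, even if the copy throws.
        using (GetObjectResponse response = await client.GetObjectAsync(bucketName, key))
        using (var buffer = new MemoryStream())
        {
            await response.ResponseStream.CopyToAsync(buffer);
            return buffer.ToArray();
        }
    }
}
```

Returning a byte array rather than the GetObjectResponse itself means no caller further up the stack can forget to dispose the response.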

@MikesGlitch

I'm using the Quartz Scheduler for .Net and I was receiving the problem intermittently during a Job Execution. I have implemented Kobi's and Stormsys's posts and they both improved matters - but didn't fully fix the problem.

The only thing that HAS managed to fix the problem has been to make a new instance of my S3Client in my Job Execution class whenever it executes rather than resolving it using dependency injection - I think it has something to do with thread safety. After implementing this, along with Kobi's and Stormsys's posts I haven't experienced the problem any more.

@thoean

thoean commented May 26, 2016

I have a related problem, using the asp.net core SDK version 3.2.3-beta. My problem is related to executing GetObjectAsync calls in parallel, but reading through the thread, it might be highly related. I've posted the problem at http://stackoverflow.com/questions/37471477/download-files-from-s3-in-parallel-aws-net-sdk before I found this thread.

Does the AmazonS3Client have any synchronization or shared state?

@lewislabs

I've been experiencing this, and on investigating the issue I think the line to blame is https://github.com/aws/aws-sdk-net/blob/aws-sdk-net-v2/AWSSDK_DotNet35/Amazon.Runtime/Pipeline/HttpHandler/HttpHandler.cs#L104.
If the GetResponse method throws a WebException internally, then the response stream will never be closed. That's consistent with seeing connections hanging in the CLOSE_WAIT state.

@sstevenkang
Contributor

The new 3.3.1 version of Core contains the PR #449 which addresses this problem. Please let us know if the problem persists. Thanks!

AnthonySteele pushed a commit to AnthonySteele/JustSaying that referenced this issue Nov 10, 2016
This is the latest release version and contains a fix to aws/aws-sdk-net#152
Which is likely the case of our "hangs when reading from queues" issue
We have a workaround, but the internal fix is IMHO better
AnthonySteele pushed a commit to AnthonySteele/JustSaying that referenced this issue Nov 10, 2016
This is the latest release version and contains a fix to aws/aws-sdk-net#152
Which is likely the case of our "hangs when reading from queues" issue
We have a workaround, but the internal fix is IMHO better.

The question is, do we still need the "linkedCancellationToken" measure that fixes it?
If we view it as a fix to a very specific bug that we are sure is fixed then no.
If we view it as a general "belt and braces" reliability measure then yes.
@sstevenkang
Contributor

Closing due to inactivity. If any of you guys encounter this issue again, feel free to reopen it. Thanks!

@rsrini83

rsrini83 commented Feb 4, 2017

I'm not sure how to reopen this issue. So adding my problem here.

We have a server-side .NET application built on the NancyFx framework, running self-hosted. The application receives multipart requests containing multiple files (around 100), all of which are supposed to be uploaded to an S3 bucket. We upload the files in parallel and currently create an S3 client object for every task. This causes too many HTTP connections, and after a while the system becomes slow or S3 latency increases. We have tuned things at the TCP level by reducing TcpTimedWaitDelay to 30 seconds.

Can anyone help us resolve this issue?
How can we reduce the AmazonS3Client HTTP connection pool?

Using Windows 2012
AWS SDK version : 3.3.7

Let me know if any further information is required.

Thanks in advance.

@PavelSafronov

Separate issue was opened for this question, no need to re-open - #546

@onyxmaster

Hangs on AWSSDK.Core 3.3.12 + AWSSDK.S3 3.3.5.12.

mscorlib.dll!System.Threading.WaitHandle.WaitOne(int millisecondsTimeout, bool exitContext)
System.dll!System.Net.LazyAsyncResult.WaitForCompletion(bool snap)
System.dll!System.Net.HttpWebRequest.GetResponse()
AWSSDK.Core.dll!Amazon.Runtime.Internal.HttpRequest.GetResponse()
AWSSDK.Core.dll!Amazon.Runtime.Internal.HttpHandler<System.IO.Stream>.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.RedirectHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.Unmarshaller.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.S3.dll!Amazon.S3.Internal.AmazonS3ResponseHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.ErrorHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.CallbackHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.RetryHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.CallbackHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.CallbackHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.S3.dll!Amazon.S3.Internal.AmazonS3ExceptionHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.MetricsHandler.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.Internal.RuntimePipeline.InvokeSync(Amazon.Runtime.IExecutionContext executionContext)
AWSSDK.Core.dll!Amazon.Runtime.AmazonServiceClient.Invoke<Amazon.S3.Model.PutObjectRequest, Amazon.S3.Model.PutObjectResponse>(Amazon.S3.Model.PutObjectRequest request, Amazon.Runtime.Internal.Transform.IMarshaller<Amazon.Runtime.Internal.IRequest, Amazon.Runtime.AmazonWebServiceRequest> marshaller, Amazon.Runtime.Internal.Transform.ResponseUnmarshaller unmarshaller)
AWSSDK.S3.dll!Amazon.S3.AmazonS3Client.PutObject(Amazon.S3.Model.PutObjectRequest request)

@randall-peakey-com

randall-peakey-com commented Apr 27, 2017

In our experience concurrency doesn't really matter: if you don't dispose, the connection will hang around until the application terminates. We now simply dispose of the object as soon as possible, and all the problems have disappeared.

@craigbrett17

@randall-peakey-com: Interesting. Okay, this will be my next attempt at a fix. Right now we return the whole GetObjectResponse and handle it elsewhere in the code, so I might just have to rewrite it to use only the stream and copy it out to a MemoryStream and return that. Thanks for the info.

@craigbrett17

Alternatively, just rewrite everything that was using the GetObjectResponse to just be inside a using statement and it's actually done the trick, even without the DefaultConnectionLimit change. Thanks @randall-peakey-com!

@ghost

ghost commented Aug 18, 2017

We have been hitting this issue sporadically in our application for over a year - always while hitting the S3 api concurrently on multiple threads. We just hit again tonight and I got a stack trace of the hung thread. See below. We have hit this on different sdk calls: GetObject, GetObjectMetadata, and PutObject. Our situation was greatly improved a while ago by auditing every call to the sdk to make sure we weren't leaking requests or s3 clients.

Here's the stack trace I captured tonight. At the time we detected this hang, we had 16 threads concurrently hitting S3. This is on version 3.3.0 of the SDK on .NET Framework 4.7, Windows Server 2016, running on a machine inside EC2. Does anything jump out here? I haven't tried moving all our S3 access to HTTP (as opposed to HTTPS) as suggested above. Is that still a recommended workaround?


StackTrace: at System.Net.UnsafeNclNativeMethods.OSSOCK.recv(IntPtr socketHandle, Byte* pinnedBuffer, Int32 len, SocketFlags socketFlags)
at System.Net.UnsafeNclNativeMethods.OSSOCK.recv(IntPtr socketHandle, Byte* pinnedBuffer, Int32 len, SocketFlags socketFlags)
at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, SocketError& errorCode)
at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.FixedSizeReader.ReadPacket(Byte[] buffer, Int32 offset, Int32 count)
at System.Net.Security._SslStream.StartFrameHeader(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security._SslStream.StartReading(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security._SslStream.ProcessRead(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.TlsStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.PooledStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.Connection.SyncRead(HttpWebRequest request, Boolean userRetrievedStream, Boolean probeRead)
at System.Net.ConnectStream.ProcessWriteCallDone(ConnectionReturnResult returnResult)
at System.Net.ConnectStream.CallDone(ConnectionReturnResult returnResult)
at System.Net.ConnectStream.CloseInternal(Boolean internalCall, Boolean aborting)
at System.Net.ConnectStream.System.Net.ICloseEx.CloseEx(CloseExState closeState)
at System.Net.HttpWebRequest.EndWriteHeaders_Part2()
at System.Net.HttpWebRequest.EndWriteHeaders(Boolean async)
at System.Net.HttpWebRequest.WriteHeadersCallback(WebExceptionStatus errorStatus, ConnectStream stream, Boolean async)
at System.Net.ConnectStream.WriteHeaders(Boolean async)
at System.Net.HttpWebRequest.EndSubmitRequest()
at System.Net.Connection.SubmitRequest(HttpWebRequest request, Boolean forcedsubmit)
at System.Net.ServicePoint.SubmitRequest(HttpWebRequest request, String connName)
at System.Net.HttpWebRequest.SubmitRequest(ServicePoint servicePoint)
at System.Net.HttpWebRequest.GetResponse()
at Amazon.Runtime.Internal.HttpRequest.GetResponse()
at Amazon.Runtime.Internal.HttpHandler`1.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.RedirectHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.Unmarshaller.InvokeSync(IExecutionContext executionContext)
at Amazon.S3.Internal.AmazonS3ResponseHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.ErrorHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.RetryHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.S3.Internal.AmazonS3ExceptionHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.MetricsHandler.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.Internal.RuntimePipeline.InvokeSync(IExecutionContext executionContext)
at Amazon.Runtime.AmazonServiceClient.Invoke[TRequest,TResponse](TRequest request, IMarshaller`2 marshaller, ResponseUnmarshaller unmarshaller)
at Amazon.S3.AmazonS3Client.GetObjectMetadata(GetObjectMetadataRequest request)

@Plasma

Plasma commented Apr 11, 2018

This may be related to https://github.com/dotnet/corefx/issues/21796 (.NET FX deadlock on sockets cleanup and due to ServicePoint.set_ConnectionLimit internal locks)

I've started getting deadlocks due to many S3 SDK GetObject/Metadata/Exists calls and eventually something deadlocks exactly like the above issue.

@Plasma

Plasma commented Apr 25, 2018

I've posted a workaround that is currently working for us, I believe: https://github.com/dotnet/corefx/issues/21796#issuecomment-381515493

@diehlaws diehlaws added guidance Question that needs advice or information. needs-discussion and removed Discussion labels Jan 3, 2019
@cullenjohnson

cullenjohnson commented Jan 25, 2019

@djluck commented on Mar 21, 2016:

I think the key might be to dispose of the GetObjectResponse as quickly as possible. In my program, I'm concurrently downloading the contents of an entire bucket (with 35 concurrent worker tasks). I noticed that I started seeing object requests hang indefinitely if I didn't immediately read the contents of GetObjectResponse.ResponseStream into memory and dispose of the stream.

Fiddling with the DefaultConnectionLimit didn't seem to offer any improvement, only the quick disposal of the stream made any difference for me.

This was exactly my problem. We were mistakenly calling Dispose on the resultant Task<GetObjectResponse> (returned from GetObjectAsync) instead of the .Result of the task.

Task<GetObjectResponse> s3FileResponse = S3Client.GetObjectAsync(S3BucketName, s3FilePath);

// ...

try
{
    // ...
}
finally
{
    s3FileResponse.Result?.Dispose();
    // Changed from the following incorrect line:
    //s3FileResponse?.Dispose();
}

@ghost

ghost commented Apr 15, 2019

It's April 2019 and, years later, we are STILL experiencing random and sporadic hangs when accessing S3 via the .NET AWSSDK with a high degree of parallelism.

We've tried disabling SSL. We've done multiple audits for leaked IDisposables. We've messed with ServicePointManager.DefaultConnectionLimit. The hangs still persist.

Are others still experiencing this issue as well?

@Plasma

Plasma commented Apr 15, 2019

Hey @tomlor,

We worked around this issue by first applying:

https://github.com/dotnet/corefx/issues/21796#issuecomment-381515493

And for downloading the blob data, we instead just generate a signed URL via the s3 client, and use HttpClient to download it instead of the s3 SDK. No issues now for 12 months.

@dyardyGIT

I'm not sure how to reopen this issue. So adding my problem here.

We have a server-side .NET application built on the NancyFx framework, running self-hosted. The application receives multipart requests containing multiple files (around 100), all of which are supposed to be uploaded to an S3 bucket. We upload the files in parallel and currently create an S3 client object for every task. This causes too many HTTP connections, and after a while the system becomes slow or S3 latency increases. We have tuned things at the TCP level by reducing TcpTimedWaitDelay to 30 seconds.

Can anyone help us resolve this issue?
How can we reduce the AmazonS3Client HTTP connection pool?

Using Windows 2012
AWS SDK version : 3.3.7

Let me know if any further information is required.

Thanks in advance.

I have been fighting this issue for 6 months and still have no good resolution. Help!

@clevrdavid

I think there is still an issue here.

Strangely, I'm getting this issue in an ASP.NET Core 3.1 Web API project, but not in an ASP.NET Core 3.1 MVC project. The code calling GetObjectAsync is exactly the same in both projects, and both use the same credentials and bucket.

Been tearing my hair out all day, when this should be simple.

@Plasma

Plasma commented Feb 16, 2020

@dyardyGIT @clevrdavid Depending on the stack trace you get when things get locked up, dotnet/runtime#22592 (comment) may fix this for you like it did for us.

As for too many tasks, perhaps wrap your upload code path in a Polly Bulkhead code block which will help throttle the parallelism: https://github.com/App-vNext/Polly/wiki/Bulkhead
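
For illustration, here is a sketch of the same throttling idea using a plain SemaphoreSlim from the base class library instead of Polly's Bulkhead (a deliberately swapped-in technique, not Polly's API). The Throttle class, the ForEachAsync helper, and the limit of 8 are all hypothetical; the Task.Delay stands in for an S3 upload call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs work(item) for every item, with at most maxParallel running at once.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items, int maxParallel, Func<T, Task> work)
    {
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();          // wait for a free slot
                try { await work(item); }
                finally { gate.Release(); }      // free the slot even on failure
            }).ToList();

            await Task.WhenAll(tasks);
        }
    }

    public static async Task Main()
    {
        int running = 0, peak = 0;
        var sync = new object();

        await ForEachAsync(Enumerable.Range(0, 100), 8, async i =>
        {
            lock (sync) { running++; peak = Math.Max(peak, running); }
            await Task.Delay(10);                // stand-in for an S3 upload call
            lock (sync) { running--; }
        });

        Console.WriteLine("peak concurrency: " + peak);
    }
}
```

With around 100 files per request, a cap like this keeps the number of simultaneously open connections bounded regardless of how many tasks are queued.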

@dyardyGIT

I am shocked that there is no legitimate answer to this issue. I understand the 'randomness' makes it difficult, but we have tried many things on both .NET Framework and .NET Core. This issue surfaced to a much greater extent when we moved from Windows 2008 to Windows 2012. On 2012 we have had to address HTTP connection/socket issues.

We have tried the suggestion of setting ServicePointManager.DefaultConnectionLimit=50, with no success.

@randall-peakey-com

@dyardyGIT Disposing has certainly eliminated this problem for many. #152 (comment)
Have you given this a try?

@dyardyGIT

And for downloading the blob data, we instead just generate a signed URL via the s3 client, and use HttpClient to download it instead of the s3 SDK. No issues now for 12 months.

Can you share how you generate a URL using the SDK? Thanks!

@dyardyGIT

Yes, it looks like this now, and still having the issue.
using (GetObjectResponse response = _amazonClient.GetObject(request))
{
    using (Stream responseStream = response.ResponseStream)
    {
        amazonFile = new AmazonFile();
        amazonFile.FileBytes = ReadStream(response.ResponseStream);
        amazonFile.Size = amazonFile.FileBytes.LongLength;
    }
}

@Plasma

Plasma commented Feb 18, 2020

Looks like our problem was mostly upload related, where it would hang, so we made our upload code just get the signed URL to upload to and PUT the data via regular HTTP calls:

		/// <summary>
		/// Upload to the specified key the provided stream of data
		/// </summary>
		async Task UploadUsingWebRequestAsync (string key, Stream stream) {
		    var client = CreateClient ();

		    // Calculate URL to upload to
		    var signedRequest = new GetPreSignedUrlRequest {
		        BucketName = BucketName,
		            Key = key,
		            Verb = HttpVerb.PUT,
		            Expires = DateTime.UtcNow.AddHours (1)
		    };

		    // Generate Url
		    var uploadUrl = client.GetPreSignedURL (signedRequest);

		    // Perform Upload
		    // Create content
		    var streamContent = new StreamContent (stream);

		    // Create Retry Policy
		    var retryPolicy = Policy<HttpResponseMessage>
		        .Handle<HttpRequestException> ()
		        .WaitAndRetryAsync (3, x => TimeSpan.FromSeconds(x));

		    // Put File
		    using (var response = await retryPolicy.ExecuteAsync (() => SharedClient.PutAsync(uploadUrl, streamContent))) {
		        // Verify
		        if (!response.IsSuccessStatusCode)
		            throw new ArgumentException ($"Upload of blob failed ({key}): {await response.Content.ReadAsStringAsync()}");
		    }
		}

As for downloading, we do less of that, but you can imagine the flow is similar (client.GetPreSignedURL), then just do an HTTP GET on that URL.
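
A sketch of what that download flow might look like, assuming the AWSSDK.S3 package. The PresignedDownload class, the DownloadAsync helper, the 15-minute expiry, and the static Http field (a stand-in for the SharedClient used above) are illustrative assumptions; GetPreSignedUrlRequest and GetPreSignedURL are the same SDK members used in the upload code.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

public static class PresignedDownload
{
    // One HttpClient for the whole process (hypothetical stand-in for SharedClient).
    private static readonly HttpClient Http = new HttpClient();

    public static async Task DownloadAsync(
        IAmazonS3 client, string bucketName, string key, string destinationPath)
    {
        // Ask the SDK for a time-limited, signed GET URL; this call makes no network request.
        var request = new GetPreSignedUrlRequest
        {
            BucketName = bucketName,
            Key = key,
            Verb = HttpVerb.GET,
            Expires = DateTime.UtcNow.AddMinutes(15)
        };
        string url = client.GetPreSignedURL(request);

        // Stream the body straight to disk instead of buffering it in memory.
        using (HttpResponseMessage response =
            await Http.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            using (Stream body = await response.Content.ReadAsStreamAsync())
            using (FileStream file = File.Create(destinationPath))
            {
                await body.CopyToAsync(file);
            }
        }
    }
}
```

Anyone holding the URL can fetch the object until it expires, so a short expiry is the safer default.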

@randall-peakey-com

Yes, it looks like this now, and still having the issue.
using (GetObjectResponse response = _amazonClient.GetObject(request))
{
    using (Stream responseStream = response.ResponseStream)
    {
        amazonFile = new AmazonFile();
        amazonFile.FileBytes = ReadStream(response.ResponseStream);
        amazonFile.Size = amazonFile.FileBytes.LongLength;
    }
}

I see you are disposing of the Stream and the GetObjectResponse, but not the AmazonS3Client.

@pavisalavisa

I'm a bit late to the party but I'd like to give my two cents on this issue.

We're running .NET 4.7 on windows server 2016. We have multiple worker instances (some of which are running Quartz) and multiple instances running ASP.NET Web API. API instances never displayed any problems regarding S3 requests hanging, however, worker instances did.

At first, I thought that we might have some kind of deadlock because logging in place wasn't verbose enough to pinpoint the exact location where the application stopped. It was clear that these hangs were happening only when the system was under a considerable load.

Dumping the stack trace pointed me in the wrong direction. It showed that my service blocked on the following snippet executing Exists:

public bool ObjectExists(string bucketName, string objectKey)
{
    var s3FileInfo = new S3FileInfo(_s3Client, bucketName, objectKey);
    return s3FileInfo.Exists;
}

I've tried everything I could find related to this issue including updating to .NET 4.8, using single instance S3 client, using different S3 clients for every unit of work, changing the existing implementation to use different SDK methods, etc.

Tuning ServicePointManager.DefaultConnectionLimit = N certainly affected how soon the service ground to a halt. Using netstat I noticed that there were N connections to S3 in the CLOSE_WAIT state. According to the TCP specification, this state means that the remote end (S3) has closed its side of the connection and the local TCP stack has acknowledged that, but the client (your application) has not yet closed its socket.

This information steered me away from the problem source and into the esoteric search for the bug in the framework (hence the update to a newer version of the framework). While there might be cases where the framework bug caused your implementation to misbehave, that wasn't my case.

Enter the following comment:

@dyardyGIT Disposing has certainly eliminated this problem for many. #152 (comment)
Have you given this a try?

I inspected the facade implementation that we use to wrap S3-related actions and discovered that the response returned by one of the s3Client.GetObjectAsync(getObjectRequest) calls was not being disposed of. That method is called in the normal service flow, but it is buried deep in the service layer. This caused the number of open connections to grow and eventually led the service to a grinding halt.
The interesting part is that because of the way the service was implemented, it would always stop on the same method (ObjectExists shown above) where there was no need for disposal per se.

Disposing of the object properly meant that no connections were left hanging and that the service could handle bigger loads without stopping.

I still don't understand why the SDK lets this happen in the first place. I would've been happier with an exception telling me that the number of connections has been exceeded and that new connections cannot be established.

@brinkdinges

I think I'm running into this issue as well. I'm trying to upload a 30kB text file from a .NET 4.7.2 desktop app with AWSSDK.S3 3.3.111.33. I run as a plugin in another app, which might be single-threaded.

This line from the official docs never uploads the file and hangs until I force close the application. This happens every single time. I also see lingering CLOSE_WAIT items in netstat.
PutObjectResponse response = await client.PutObjectAsync(putRequest);

I can only upload the file and close the request when I wrap the PutObjectAsync in a using statement, even though the only disposable object is the Task that it creates.

public static void UploadFile()
{
   // pseudocode
   using (credentials)
   using (client)
   create putRequest
   PutObjectAsync(client, putRequest).Wait();
}

public static async Task PutObjectAsync(AmazonS3Client client, PutObjectRequest putRequest)
{
  using (var task = client.PutObjectAsync(putRequest))
  {
    var success = task.Result.HttpStatusCode == HttpStatusCode.OK;
    if (!success) throw new CannotUploadFile();
  }
}

But this is no longer async since there is no await in the PutObjectAsync method. So the UI becomes unresponsive. Any ideas on how to work around this?

A second, related issue is that this behavior gets even worse when there is no internet connection. My working sample above times out at about 100 seconds. That's a long time to wait. I have found no setting that had any effect on this.
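
For reference, a sketch of the client-level settings that would normally govern this wait, assuming the AWSSDK.S3 package. Timeout and MaxErrorRetry are properties on the SDK's client config; the factory name and the specific values are hypothetical, and nothing in this thread confirms that they shorten the hang described above.

```csharp
using System;
using Amazon.S3;

public static class S3ConfigFactory
{
    // Hypothetical factory: a config that gives up sooner than the defaults.
    public static AmazonS3Config CreateShortTimeoutConfig()
    {
        return new AmazonS3Config
        {
            Timeout = TimeSpan.FromSeconds(15), // per-request timeout
            MaxErrorRetry = 1                   // fewer automatic retries
        };
    }
}
```

The config would then be passed to the client constructor, for example new AmazonS3Client(S3ConfigFactory.CreateShortTimeoutConfig()), with credentials and region supplied however the application already does so.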

@Plasma

Plasma commented Aug 2, 2020

@brinkdinges my workaround I'd suggest is to bypass the SDK for uploading, see my comment here #152 (comment)

The workaround is to have the SDK generate the HTTPS upload URL (which includes the signature), then use HttpClient to PUT the data directly (pretty much what the SDK does anyway).

@Plasma

Plasma commented Aug 2, 2020

@brinkdinges your using statement is also wrong, you should await the task to ensure you are async:

public static async Task PutObjectAsync(AmazonS3Client client, PutObjectRequest putRequest)
{
  // Do not dispose of task, dispose of the async result (and await the task instead of accessing .Result property directly)
  // I am not at a computer; perhaps the result of PutObjectAsync does not implement IDisposable, in which case the using is not required.
  using (var putResponse = await client.PutObjectAsync(putRequest))
  {
    var success = putResponse.HttpStatusCode == HttpStatusCode.OK;
    if (!success) throw new CannotUploadFile();
  }
}

@brinkdinges

brinkdinges commented Aug 3, 2020

@Plasma Thank you. I had already tried to use your implementation, but I couldn't find what SharedClient and Policy were. I found the policies in the SDK, but none that used generics. Could you clarify those two?

PutObjectAsync indeed doesn't implement IDisposable. But without the using, the request never ends or gives a result. That is the reason I'm trying all these workarounds. I just tried GetObjectAsync and the same happens, it also never returns a result.

Right now I have wrapped the whole method that creates credentials, the client and the request in a single task and I await that. This works for now.

@Plasma

Plasma commented Aug 3, 2020

Ah, SharedClient is just an instance of HttpClient, and Policy is Polly.Net Retry Policy -- this part is kinda optional and you can skip it.

But in your original question above, you had pasted code with a known async anti-pattern, where a deadlock can definitely occur, because you are not awaiting the task but instead accessing its .Result property directly. Ignoring my workaround, what if you change that whole method to this?

public static async Task PutObjectAsync(AmazonS3Client client, PutObjectRequest putRequest)
{
  var result = await client.PutObjectAsync(putRequest);
  var success = result.HttpStatusCode == HttpStatusCode.OK;
  if (!success) throw new CannotUploadFile();
}

@brinkdinges

I am very sure that's what I started with. It didn't work then, it does now 😄 Thanks for pushing me to do it the correct way.
