SDK v10: IOException: Connection reset by peer #363
Hi, @rocketraman. Thank you for opening this issue. We have noticed this as well. What happened here was a change to the retry policy to handle more specific network-related exceptions rather than IOException, since IOException also covers lots of non-network-related issues. We've since decided that it is better to simply look for IOException and will be changing it back in the next release, which should resolve this issue because those errors will be automatically retried.
@rickle-msft Thanks, that explains it. Any IOException emanating from the networking layer should be networking-related, so I think your decision to revert the retry behavior makes sense.
@rickle-msft Just realized this is the same problem I previously reported (Azure/autorest-clientruntime-for-java#467) -- I just didn't realize it before because it was being worked around automatically by the retry.
Ah. Good catch. So the reason we get these exceptions at all is because we pool connections and keep them alive across multiple operations, and the service will eventually time them out and close them. When we retry, we just establish a new connection and the request works fine. Reverting the retry policy should mitigate this for the user, but we'll continue looking into the root cause.
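The retry-on-IOException behavior described above can be sketched roughly like this. This is a minimal illustration of the idea, not the SDK's actual retry policy; the names `withRetry` and `maxTries` are made up for this sketch:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryOnIOException {
    // Retry a request up to maxTries times, but only when the failure is an
    // IOException (treated here as a transient network problem, e.g. a pooled
    // connection that the service has already timed out and closed).
    public static <T> T withRetry(Callable<T> request, int maxTries) throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e; // stale pooled connection: the retry grabs a fresh one
            }
        }
        throw last; // retries exhausted: surface the last network error
    }

    public static void main(String[] args) throws Exception {
        // Simulate a request that fails twice with "Connection reset by peer"
        // before succeeding on a freshly established connection.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("Connection reset by peer");
            }
            return "200 OK";
        }, 4);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Note that a policy like this only helps when the failure really is transient; it does nothing for the hang scenarios discussed later in this thread, where the connection pool itself is in a bad state.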
Was the retry policy reverted in 10.1.0? I didn't see it in the changelog.
Never mind my last, I do see the change in
Yes we did. Sorry about the gap in the changelog. I'll close this issue now since it has been addressed in the latest release. Please feel free to reopen it if you have further concerns.
@rickle-msft I just encountered this (or a very very similar) issue again with 10.1.0: Here is the stack:
Retrying multiple times did not solve the problem -- the app continued to fail with the same exception until it was restarted, at which point things started working again.
Hey, @rocketraman. I've reopened the dialogue with the team that owns the NettyClient. Hopefully we can get to the bottom of either why the socket is being closed or why we're still trying to write to it. Thanks for pointing this out again, and sorry you continue to experience difficulty here.
@rocketraman Quick question. Are you seeing this less frequently since we returned to retrying on IOException? Or the same?
Far less frequently. It's not the same issue for sure. Previously, this would happen if the CMS were idle for a while, and then a user-driven retry (without an app restart) would fix the problem. Today was the first time I saw this situation, and user-driven retries were not effective in solving the problem. The only solution was restarting the app.
Was this after the app had been running for a long period of time? Or was there some new workload it was trying to process that might hit a corner case? It sounds like you hit this error, then retries maxed out and passed the error back up to your application. You say you had to restart the app, so were all (or any number of) requests that should have been independent hitting this same issue once you hit it once?
No, it had been running for a couple of days, though it's still in dev/test so there wasn't a lot of traffic. It wasn't a corner case -- the same action (uploading a particular blob) worked after a restart.
I believe that all requests were failing once the issue had been hit, yes.
Unfortunately I had to get the service working right away, but if this happens again I'll do some more debugging at the networking level.
That would be really helpful. Thank you. I'm also going to try setting up an application that just does some uploads and downloads, run it for a couple of days, and see if I can gather any more information.
@rickle-msft I just encountered this again, with the same stack I posted in #363 (comment).
The only thread that appears to be related to the SDK is this one:
I've used [...] I also note that the [...]
Lastly, I've compared the stack dump with a working process, and there is no difference -- the SDK is only mentioned in the one thread with the same stack as shown above. I've left the process that has the SDK in the weird state running, so if there is some more debugging you want me to do on the process, let me know.
Thank you so much for following up with this. Just so I understand the timeout situation correctly, you have client -> storage sdk -> storage service. Client is timing out and is trying to cancel some outstanding operation in the sdk, and that is the point when everything seems to hang?
Yes
I suspect that is a bit backward: the SDK seems to already be hung, the client is timing out because it is hung, and the exception is reported when the client cancels its connection. I guess that causes the open channel / rx flowable for the SDK to be cancelled, which seems to prompt the exception previously reported. So the issue is not so much the exception, but why the SDK seems to be hung in the first place. I'm not seeing any odd messages or exceptions in our logs before this "hang", though.
I think we discussed before that this "Connection reset by peer" thing comes from a server-side timeout, which closes the socket; then we try to write into the half-closed socket, retry, and grab another connection from the pool. Usually this fixes it, but perhaps we're running out of connections? So after some large number of retries, there are just no more connections available, and the client sits waiting for another connection that will never come? I'm not terribly familiar with the connection pooling logic, so there could be a bug that is slowly draining connections?
@rickle-msft That seems reasonable. If it is waiting for a connection, might there be a thread blocking on a queue read or something somewhere? I still have the hung instance running, so I can continue to debug with that if we think of something concrete to look at.
@rocketraman. Update for you. The runtime team spent a while doing some investigation, and they found some suspect logic in the connection acquisition here. They are thinking that this logic rarely results in using existing channels, so the connections tend to sit idle for a long time and then close and eventually give this IOException. The runtime team is working on a fix and some testing that we should be able to try out soon.
That's good to hear. We also ran into similar hangs and eventually worked around it by creating our own
@yeroc. Thanks for sharing that workaround. I'm hopeful this fix will be successful. I'll post here when there's a version with the fix that you can try out, if you're interested.
@rickle-msft Awesome, thanks. Looking forward to it. @yeroc Can you share a bit more about your workaround? I'm starting to think it might be easiest just to use the underlying storage REST API directly with my own code and a better-tested async client -- I'm only doing PUT and GET on blobs, so it can't be that hard (famous last words) :-)
@rocketraman Yea, we started thinking the same thing (maybe it would be easier to write our own REST library) but we were on a very tight timeline so looked harder for a workaround. Anyway, more detail on the workaround... For ease-of-use we created a wrapper object which implements
and we use it to create a
the
We're creating a new wrapper for every request, so this is a sledgehammer approach to fixing the issue. We were seeing random failures after only 5 requests, so the safest approach for us was to create a new one for every high-level download/upload request. If you're doing many requests this may not work for you. In our case we're manipulating larger files relatively infrequently, so initializing a new client every time wasn't much of a performance hit.
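The pattern described here -- tearing down and recreating the client (and therefore its connection pool) for every high-level operation -- can be sketched generically as follows. `BlobClient` and `withFreshClient` are hypothetical stand-ins for this illustration, not the storage SDK's actual types:

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class FreshClientPerRequest {
    // Hypothetical stand-in for a client that owns a connection pool.
    interface BlobClient extends AutoCloseable {
        String get(String name);
        @Override
        void close(); // tears down the pool and every pooled socket
    }

    // Run one high-level operation against a brand-new client, then close it.
    // Sledgehammer approach: no connection lives long enough for a server-side
    // idle timeout to half-close it underneath us.
    static <T> T withFreshClient(Supplier<BlobClient> factory,
                                 Function<BlobClient, T> op) {
        try (BlobClient client = factory.get()) {
            return op.apply(client);
        }
    }

    public static void main(String[] args) {
        // Toy client to demonstrate the lifecycle.
        Supplier<BlobClient> factory = () -> new BlobClient() {
            public String get(String name) { return "contents of " + name; }
            public void close() { /* pool torn down here */ }
        };
        System.out.println(withFreshClient(factory, c -> c.get("report.json")));
    }
}
```

The trade-off is exactly as stated above: every operation pays full connection-setup cost (TCP + TLS handshake), which is only acceptable when requests are large and infrequent.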
@yeroc. You said you're seeing random failures after only 5 requests. Are these all the IOExceptions? That behavior is a bit odd to me. On the latest versions, we've been seeing these IOExceptions much less frequently, and those ones are almost always resolved by a retry. I think @rocketraman has only been seeing this issue become unrecoverable after running his application for a while, if I'm not mistaken. I'm curious why you're still seeing it so frequently. What version of the library are you using and what is your workflow? Have you observed retries being unsuccessful (retries are enabled by default)? And has this workaround completely mitigated the issue?
@rickle-msft These weren't 5 requests back-to-back but rather 5-6 requests spread out over a period of time (30 minutes or so). We were never able to create a test that reproduced the issue reliably. These ended in hangs. In most cases we observed no retry being attempted; things just hung, with threads blocked. We're using version 10.1.0. To date the workaround has completely mitigated the issue. As per above, we completely tear down the connection pool after each upload/download, so the window for the connections or client to get into an unrecoverable state is very small.
@anuchandy Sorry for the delay, I was on vacation last week. I've deployed a new version of our service with the correct version of Netty, and without the native epoll. I can immediately note that we are no longer seeing the [...] Will do some testing over the next week or two to see if we run into any issues similar to before, but I'm cautiously optimistic here.
@rocketraman thank you for testing this. Sure, waiting to see the result of the long run.
I wasn't sure I was hitting the same issue, as I have no other Netty dependencies, just the one introduced by azure storage, which is why I was reporting a new issue #438. I am on blob storage version 10.5.0. The dependency graph does show a superseded version though; is this normal?:
(Full dependency graph attached below) My issue is similar: I am uploading blobs to save json-formatted text data into Azure Blob Storage. When I run the application overnight, receiving messages about once every 10 minutes, the application stops being able to upload or download data from Azure storage. I also see many "channels leaked" and "Connection reset by peer" messages from the Azure SDK or libraries it uses. I do not have any other version of netty in my app except for the one pulled by
The errors I have got are like the following:
The application does stall in the sense that no new azure blob reads or writes get through anymore at that point. This happens at around 100 "leaked channels" as reported by the shared channel pool print. I do not see other errors. Here is the stall from this morning:
After this, retries just hang:
@rocketraman I need your help. Could you describe how you disabled the native transport for Linux? (A pom file change? What does it look like?) It will be very helpful for anyone having the same issue [@lagerspetz seems to be hitting a similar issue].
@anuchandy As I understand it, the native transport is enabled only if a runtime dependency on the appropriate artifact is included e.g.:
Since the Storage SDK doesn't include this in its POM, it won't be included for the user either unless they add it themselves, or another dependency they have adds it for them (as was my case above). For @lagerspetz, it's odd that his Netty version was superseded by Gradle -- that shouldn't happen unless the dependency was explicitly specified, or unless Gradle is applying a version conflict resolution algorithm, which shouldn't be the case if netty isn't being pulled in by any other dependency. A Gradle build scan (--scan) might help. I don't believe @lagerspetz's dependencies include any native libs though, based on his deps report, which is an interesting data point if so -- that means the issue is not with the native libs, but rather with newer versions of Netty (or in how the storage SDK uses them, of course).
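For anyone wondering what that runtime dependency looks like: the Netty native epoll transport is typically pulled in with coordinates like the following, shown here as a Maven sketch (the version and scope are illustrative, not taken from this thread). Omitting it, or excluding it from whatever dependency drags it in, keeps Netty on the portable NIO transport:

```xml
<dependency>
  <groupId>io.netty</groupId>
  <artifactId>netty-transport-native-epoll</artifactId>
  <version>4.1.28.Final</version>
  <classifier>linux-x86_64</classifier>
  <scope>runtime</scope>
</dependency>
```

Netty falls back to NIO automatically when this artifact is absent from the classpath, which is why simply not shipping it "disables" the native transport.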
--scan did not reveal anything; it just says that the version of netty was "Selected by rule".
@lagerspetz It looks like you are using Spring Boot -- it may very well be the cause of your "Selected by rule". I believe Spring Boot does all sorts of things to try and set the versions of things automagically. See for example: https://github.com/lkishalmi/gradle-gatling-plugin#spring-boot-and-netty-version.
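If Spring Boot's dependency management is indeed what is pinning Netty, one way to take back control is to override the version property that Spring Boot manages. This is a sketch for a Gradle build using the Spring dependency-management plugin (`netty.version` is the property Spring Boot uses for Netty; the version shown is illustrative):

```groovy
// build.gradle: force Spring Boot's dependency management to use a
// specific Netty version instead of its default
ext['netty.version'] = '4.1.28.Final'
```

With this in place, "Selected by rule" should resolve to the version you set rather than the one Spring Boot's BOM chose.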
Thanks @rocketraman, I found this as well. For reference, here is the dependencies section that produces the same deps as the one in my above comment:
Even with this version, I am still getting the error. There's also an "Unexpected failure attempting to make request." that shows up after the 100 leaked channels:
@anuchandy Like @lagerspetz, I can also confirm I still have the issue, with no native epoll and the correct version of netty. Not sure this means anything, but it also "hangs" for me way before reaching 100 leaked channels -- in fact the hanging behaviour seems to be somewhat random.
thanks @rocketraman & @lagerspetz for validating. @lagerspetz could you share a bit more about the nature of your application? You shared the following code:

@Override
public int store(String name, String jsonData) {
    BlockBlobURL blob = containerURL.createBlockBlobURL(name);
    byte[] array = jsonData.getBytes(AuthUtils.utf8);
    ByteBuffer buf = ByteBuffer.wrap(array);
    Flowable<ByteBuffer> data = Flowable.fromArray(buf);
    long length = array.length;
    Single<BlockBlobUploadResponse> resp = blob.upload(data, length);
    BlockBlobUploadResponse result = resp.blockingGet();
    return result.statusCode();
}
EDIT: I am investigating possible client issues that might cause the stored message to contain content other than expected (JSON Untyped vs JSON typed). The below is still an accurate description of the problem.
I am using Java 8 (openjdk-8-jre-headless)
I am trying with Java 10 now, and using a different client-side format, but I still get this issue.
I'm on Java 11 now, btw. That doesn't help either.
@lagerspetz @rocketraman Have you guys tried bumping Netty to the latest patch version manually? I was experiencing the same issue and doing that seems to have helped. My
It might be a temporary measure until they fix the SDK.
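One common way to bump Netty to a specific patch version across all its modules in a Maven build is to import the Netty BOM in `dependencyManagement`. This is a sketch, not @marcioos's actual configuration (which was elided above); the version shown is illustrative:

```xml
<dependencyManagement>
  <dependencies>
    <!-- Pin every io.netty:* artifact to one consistent patch version
         via Netty's published BOM -->
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-bom</artifactId>
      <version>4.1.30.Final</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Using the BOM avoids the subtle breakage that can come from mixing Netty modules at different patch levels, which a single per-artifact override can cause.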
Thanks @marcioos, I am trying that next. For me it was enough to do:
UPDATE: This does not work for me. The app still hangs.
After pushing v10 to our production environment, the issues found by @rocketraman started manifesting quite consistently. I ended up rolling back to v8.
Thank you guys, for the updates. We are currently spending a considerable amount of time investigating this issue and working towards a resolution. We will let you know here when there is progress.
We are also affected by this - at this moment this API version is useless for production loads. This should be clearly stated in the README and docs.
@mzarkowski - we have validated a fix for this internally. We'll reply to this thread with more information today.
Quick update. We are prepared to release the version which contains this fix as soon as some service features light up (to support other features added in this release). That deployment is in its final stages. Thank you all for your patience and your contributions to this effort. We look forward to publishing these fixes and unblocking everyone here. As always, if you encounter any other problems, we are happy to continue to work with you.
We have released v11.0.0, which depends on v2.1.0 of the runtime, containing several fixes related to connection issues. Please consider upgrading and giving this a try. Thank you all again for your participation in this issue and for your patience in its resolution. I will close this issue now as we believe that we have fixed it, but please feel free to reopen it if you continue to experience problems in this area.
I'm still regularly getting "Connection reset by peer" with v11; there's been no difference for me between 10.x and 11 (other than no longer getting ConcurrentModificationException errors).
I am getting Connection reset by peer regularly also, but I have so far got only 1 leaked connection in over two weeks, and the library doesn't seem to hang. So it would appear that the connection resets are OK in my case: the library retries the upload, and they do not affect functionality.
@lagerspetz I am happy to hear that you are seeing better results now. @Spinfusor Could you please provide the output of mvn dependency:tree so we can first validate that all the dependencies you are using are as they should be to pull in the fix?
I'll add a vote for positive results with v11. I haven't noticed any library hangs any more, and very few leaked connections.
@Spinfusor Thank you for sending that. Do you have some logs that we can look through that capture the failure? And can you describe the behavior of your application and when it hits this issue?
Which service(blob, file, queue, table) does this issue concern?
Blob
Which version of the SDK was used?
10.0.4-rc
What problem was encountered?
Upon upgrade from 10.0.1-preview to 10.0.4-rc, I occasionally get the following exception:
java.io.IOException: Connection reset by peer
with the stack:
This error was encountered during an upload, and it doesn't look like retry worked either.
I don't recall ever seeing anything similar with 10.0.1-Preview and will likely downgrade to that version until this is resolved.
Have you found a mitigation/solution?
No