
Connection Closed Prematurely #413

Closed
bcelenk opened this issue Aug 14, 2018 · 74 comments
Labels
type/bug A general bug

Comments

@bcelenk

bcelenk commented Aug 14, 2018

Hi,

I've come across the Connection closed prematurely problem; reproducing it is trivial. Below is the stack trace.

java.io.IOException: Connection closed prematurely
	at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:269) ~[reactor-netty-0.7.8.RELEASE.jar:0.7.8.RELEASE]
	at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:113) ~[reactor-netty-0.7.8.RELEASE.jar:0.7.8.RELEASE]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282) ~[netty-codec-http-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_171]

I'm using a vanilla cloud-gateway and a vanilla webflux app, both generated by Spring Initializr with no modifications; I just added a UUID to represent each user's session.

Load is generated using Gatling at 2000 rps. Below is the Gatling Scala simulation:


import io.gatling.core.Predef._
import io.gatling.http.Predef._
import io.gatling.http.request.builder.HttpRequestBuilder.toActionBuilder

import scala.concurrent.duration._
import java.util.UUID

class GatewayLoadTest extends Simulation {

  object UuidFeeder {
    val feeder = Iterator.continually(Map("uuid" -> UUID.randomUUID.toString))
  }

  val theHttpProtocolBuilder = http
    .baseURL("http://localhost:8080")

  val theScenarioBuilder = scenario("sample")
    .feed(UuidFeeder.feeder)
    .forever() {
      exec(http("gateway")
        .get("/premature/${uuid}")
        .header("gatling-user-id", "${uuid}")
        .check(status.find.in(200)))
    }

  setUp(
    theScenarioBuilder.inject(constantUsersPerSec(100) during (120 seconds))
  )
    .throttle(
      reachRps(2000) in (1 seconds),
      holdFor(120 seconds)
    )
    .protocols(theHttpProtocolBuilder)
}

After roughly 11k requests, the exception occurred; the exact number varies between runs.

Gateway/reactor-netty context when the exception occurred:
gateway-premature-exception

The origin is unaware of the exception:
origin

Wireshark HTTP packets: the user's first two requests were forwarded to the origin; the third one wasn't.
premature-wireshark

Details of the last HTTP packet:

Frame 275732: 210 bytes on wire (1680 bits), 210 bytes captured (1680 bits) on interface 0
    Interface id: 0 (lo0)
        Interface name: lo0
    Encapsulation type: NULL/Loopback (15)
    Arrival Time: Aug  2, 2018 16:43:41.578082000 +03
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1533217421.578082000 seconds
    [Time delta from previous captured frame: 0.000009000 seconds]
    [Time delta from previous displayed frame: 0.345598000 seconds]
    [Time since reference or first frame: 233.942944000 seconds]
    Frame Number: 275732
    Frame Length: 210 bytes (1680 bits)
    Capture Length: 210 bytes (1680 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: null:ip:tcp:http]
    [Coloring Rule Name: HTTP]
    [Coloring Rule String: http || tcp.port == 80 || http2]
Null/Loopback
    Family: IP (2)
Internet Protocol Version 4, Src: 127.0.0.1, Dst: 127.0.0.1
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        0000 00.. = Differentiated Services Codepoint: Default (0)
        .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    Total Length: 206
    Identification: 0x0000 (0)
    Flags: 0x02 (Don't Fragment)
        0... .... = Reserved bit: Not set
        .1.. .... = Don't fragment: Set
        ..0. .... = More fragments: Not set
    Fragment offset: 0
    Time to live: 64
    Protocol: TCP (6)
    Header checksum: 0x0000 [validation disabled]
    [Header checksum status: Unverified]
    Source: 127.0.0.1
    Destination: 127.0.0.1
    [Source GeoIP: Unknown]
    [Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: 57374, Dst Port: 8080, Seq: 309, Ack: 299, Len: 154
    Source Port: 57374
    Destination Port: 8080
    [Stream index: 6157]
    [TCP Segment Len: 154]
    Sequence number: 309    (relative sequence number)
    [Next sequence number: 463    (relative sequence number)]
    Acknowledgment number: 299    (relative ack number)
    1000 .... = Header Length: 32 bytes (8)
    Flags: 0x018 (PSH, ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 1... = Push: Set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
        [TCP Flags: ·······AP···]
    Window size value: 12750
    [Calculated window size: 408000]
    [Window size scaling factor: 32]
    Checksum: 0xfec2 [unverified]
    [Checksum Status: Unverified]
    Urgent pointer: 0
    Options: (12 bytes), No-Operation (NOP), No-Operation (NOP), Timestamps
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - Timestamps: TSval 458370685, TSecr 458370364
            Kind: Time Stamp Option (8)
            Length: 10
            Timestamp value: 458370685
            Timestamp echo reply: 458370364
    [SEQ/ACK analysis]
        [iRTT: 0.000137000 seconds]
        [Bytes in flight: 154]
        [Bytes sent since last PSH flag: 154]
    TCP payload (154 bytes)
Hypertext Transfer Protocol
    GET /premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1 HTTP/1.1\r\n
        [Expert Info (Chat/Sequence): GET /premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1 HTTP/1.1\r\n]
            [GET /premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1 HTTP/1.1\r\n]
            [Severity level: Chat]
            [Group: Sequence]
        Request Method: GET
        Request URI: /premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1
        Request Version: HTTP/1.1
    gatling-user-id: 43d91f31-5fc7-4c65-81eb-84d9865c2ff1\r\n
    Host: localhost:8080\r\n
    Accept: */*\r\n
    \r\n
    [Full request URI: http://localhost:8080/premature/43d91f31-5fc7-4c65-81eb-84d9865c2ff1]
    [HTTP request 3/3]
    [Prev request in frame: 270045]

TCP stream of the last HTTP packet (produced via right-click > Follow TCP Stream):
wireshark-as-tcp-stream

All apps run on my personal MacBook; the same exception also occurs on EC2/Linux, with the gateway and origin on separate instances and genuine clients.

MacBook specs:
macOS High Sierra 10.13.6 (17G65)
Processor 2.2 GHz Intel Core i7
Memory 16 GB 1600 MHz DDR3

Tried with 3 different JDKs:
jdk1.8.0_131
jdk1.8.0_171
jdk1.8.0_181

This issue is addressed directly to reactor-netty.

@smaldini smaldini added the type/bug A general bug label Aug 14, 2018
@smaldini smaldini modified the milestones: 0.8.0.M3, 0.7.x Maintenance Backlog Aug 14, 2018
@violetagg
Member

@bcelenk Can you try different settings for SO_BACKLOG? The default is 1000, so try to increase/decrease this and tell us the result.

Customise it with the code below:

import io.netty.channel.ChannelOption;
import org.springframework.boot.web.embedded.netty.NettyReactiveWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.stereotype.Component;

@Component
public class MyNettyReactiveWebServerCustomizer
        implements WebServerFactoryCustomizer<NettyReactiveWebServerFactory> {
    @Override
    public void customize(NettyReactiveWebServerFactory factory) {
        factory.addServerCustomizers(
                builder -> builder.option(ChannelOption.SO_BACKLOG, 2000));
    }
}

@violetagg violetagg added the for/user-attention This issue needs user attention (feedback, rework, etc...) label Aug 21, 2018
@violetagg
Member

@bcelenk Were you able to experiment with the TCP backlog configuration? I was wondering whether this is something like spring-cloud/spring-cloud-gateway#228 (comment).
I tried to reproduce it on my local machine, but the results are a bit different from yours:

================================================================================
---- Global Information --------------------------------------------------------
> request count                                     238668 (OK=235894 KO=2774  )
> min response time                                      0 (OK=0      KO=0     )
> max response time                                   1023 (OK=1023   KO=0     )
> mean response time                                     8 (OK=8      KO=0     )
> std deviation                                         23 (OK=23     KO=0     )
> response time 50th percentile                          4 (OK=4      KO=0     )
> response time 75th percentile                          5 (OK=5      KO=0     )
> response time 95th percentile                         14 (OK=14     KO=0     )
> response time 99th percentile                        106 (OK=107    KO=0     )
> mean requests/sec                                1956.295 (OK=1933.557 KO=22.738)
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                        235884 ( 99%)
> 800 ms < t < 1200 ms                                  10 (  0%)
> t > 1200 ms                                            0 (  0%)
> failed                                              2774 (  1%)
---- Errors --------------------------------------------------------------------
> j.n.ConnectException: Failed to open a socket.                   2624 (94.59%)
> j.n.ConnectException: connection timed out: localhost/127.0.0.    150 ( 5.41%)
1:8080
================================================================================

@bcelenk
Author

bcelenk commented Aug 21, 2018

Hi @violetagg

Sorry about the delay, I'm on vacation. While investigating the related issues, I had already seen that particular issue; back then I switched the origin to Tomcat and set maxKeepAliveRequests to -1, which worked until file descriptors were saturated, just as in your case. That is the expected behaviour, since nothing goes wrong until the limit is hit. I increased the origin's backlog and gave it a shot (by the way, there are lots of Unknown channel option 'SO_BACKLOG' for channel ... warning logs; WebServerFactoryCustomizer probably doesn't consider the channel type, i.e. whether it's the child or the parent channel), and I couldn't see the premature-close error. To be honest, I think both modifying the backlog and the Tomcat options are deceptions (I'll share another reproducer at the end of this message which still causes the exceptions with the modified backlog value).
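
As an aside, that warning suggests the backlog is being applied to the wrong channel. A sketch of a variant that targets the parent channel instead, assuming reactor-netty 0.7's ServerOptions.Builder exposes selectorOption(...) for the parent (acceptor) channel; untested, and the class name is illustrative:

import io.netty.channel.ChannelOption;
import org.springframework.boot.web.embedded.netty.NettyReactiveWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.stereotype.Component;

// Hypothetical variant of the customizer above: selectorOption(...) should target
// the parent (acceptor) channel, where SO_BACKLOG belongs, while option(...)
// targets the accepted child channels and triggers the 'Unknown channel option'
// warning seen in the logs.
@Component
public class ParentChannelBacklogCustomizer
        implements WebServerFactoryCustomizer<NettyReactiveWebServerFactory> {
    @Override
    public void customize(NettyReactiveWebServerFactory factory) {
        factory.addServerCustomizers(
                builder -> builder.selectorOption(ChannelOption.SO_BACKLOG, 2000));
    }
}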

There are several reasons that make me think so.

First of all, we are running the cloud gateway/reactor-netty under modest load, far below its capabilities: 85k requests per hour, precisely. Even at this low rate, we intermittently observe Connection closed prematurely. and SSLEngine closed already. (spring-cloud/spring-cloud-gateway#480) exceptions (sometimes the SSLEngine exception occurs for ~150 or more consecutive requests).

Second, we're using our cloud provider's load balancer in front of our origin; it is proprietary software, and we can't fine-tune these kinds of backlog/connection settings. At the end of the day, one might want to send requests to external systems using a pooled TcpClient, where such workarounds are not applicable.

What I'm pretty sure of is that something is wrong with the pooled TcpClient. At first I thought the problem was in the elastic pool; however, I could reproduce the same exceptions with a fixed pool with a high connection limit (which converges to the elastic pool's behaviour). So I came up with a reproducer to pinpoint the problem.

Basically, this reproducer creates many channels in the pool, then restarts the origin, which makes all of the pooled channels invalid/useless. Triggering another wave of requests against the gateway then results in the mentioned exceptions.

First, let's add a delay to the origin's handler method, so that TcpClient's elastic pool has to create a new channel for each request: every existing channel stays acquired for the duration of the delay. Example handler method:

    @RequestMapping("/premature/{userId}")
    @ResponseStatus(HttpStatus.OK)
    public Flux<Long> premature() {
        // hold the connection for ~1 second so the pool must grow under load
        return Flux.just(1L).delayElements(Duration.ofSeconds(1L));
    }

Using lsof, monitor the number of open connections between the gateway and the origin (7777 is the origin's port in this case):
repeat 10000 lsof -i tcp:7777 | wc -l

Then create channels in the pool using wrk2:
wrk2 -H "gatling-request-id: test" -t8 -c1000 -d10s -R1 http://localhost:9009/premature/test

There shouldn't be any exceptions up to this point, and lsof shows plenty of connections:
pool-has-channels

Then restart the origin, which closes the connections by sending FIN packets.

pool-empty

Because of the nature of TCP, the reactor-netty client's pool still holds these invalidated channels. However, the invalid channels should be evicted by the connection pool's health check while a channel is being acquired, so this should not be a problem. Yet the exception occurs; and with SSL enabled, the mentioned SSL exception is thrown instead. So something's wrong here.
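
To illustrate the race I suspect, here is a schematic sketch with hypothetical names (not reactor-netty's actual pool code):

import java.util.Deque;

import io.netty.channel.Channel;

// Schematic only: an acquire-time health check keyed off Channel.isActive() can
// race with the peer's FIN. The close event may not have been processed on the
// event loop yet, so a dead channel still passes the check, and the failure only
// surfaces once the request is written.
final class AcquireSketch {
    static Channel acquire(Deque<Channel> idle) {
        Channel ch;
        while ((ch = idle.poll()) != null) {
            if (ch.isActive()) {  // can still be true just after the peer's FIN
                return ch;        // handed out; the write later fails
            }
            ch.close();           // evict channels already known to be dead
        }
        return null;              // caller opens a fresh connection instead
    }
}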

2018-08-21 16:17:47.396 ERROR 77211 --- [ctor-http-nio-8] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

java.io.IOException: Connection closed prematurely
	at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:269) ~[reactor-netty-0.7.8.RELEASE.jar:0.7.8.RELEASE]
	at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:113) ~[reactor-netty-0.7.8.RELEASE.jar:0.7.8.RELEASE]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282) ~[netty-codec-http-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]

@violetagg violetagg removed the for/user-attention This issue needs user attention (feedback, rework, etc...) label Aug 22, 2018
@violetagg violetagg modified the milestones: 0.7.x Maintenance Backlog, 0.7.9.RELEASE Aug 22, 2018
@violetagg
Member

@bcelenk Your last example was extremely helpful. Thanks for that.
The issue is that when we drain the last element from the queue that holds the request messages (sent by the client) / response messages (sent by the server), we should not schedule the flush operation but invoke it immediately.
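
As a schematic sketch of that pattern (a hypothetical drain loop, not the actual reactor-netty code):

import java.util.Queue;

import io.netty.channel.ChannelHandlerContext;

// Illustration only: deferring the flush of the last drained element leaves a
// window in which the pooled channel can be released and reused before the
// bytes reach the socket.
final class DrainSketch {
    static void drain(ChannelHandlerContext ctx, Queue<Object> pending) {
        Object msg;
        while ((msg = pending.poll()) != null) {
            ctx.write(msg);
            if (pending.isEmpty()) {
                ctx.flush(); // fix: flush inline on the last element,
                             // instead of scheduling it on the event loop:
                             // ctx.channel().eventLoop().execute(ctx::flush);
            }
        }
    }
}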

The corresponding logs:

2018-08-22 17:06:03.978 DEBUG 39805 --- [ctor-http-nio-5] r.i.n.channel.ChannelOperationsHandler   : [id: 0xe05650db, L:/127.0.0.1:52103 - R:localhost/127.0.0.1:8060] Writing object DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
GET /premature/test HTTP/1.1
gatling-request-id: test
Forwarded: proto=http;host="localhost:8080";for="0:0:0:0:0:0:0:1:52020"
X-Forwarded-For: 0:0:0:0:0:0:0:1
X-Forwarded-Proto: http
X-Forwarded-Port: 8080
X-Forwarded-Host: localhost:8080
host: localhost:8060
...
2018-08-22 17:06:03.979 DEBUG 39805 --- [ctor-http-nio-5] r.i.n.resources.DefaultPoolResources     : [id: 0xe05650db, L:/127.0.0.1:52103 ! R:localhost/127.0.0.1:8060] Channel released, now 137 active connections
...
2018-08-22 17:06:03.983 DEBUG 39805 --- [ctor-http-nio-5] r.ipc.netty.channel.ChannelOperations    : [id: 0xe05650db, L:/127.0.0.1:52103 ! R:localhost/127.0.0.1:8060] An outbound error could not be processed

java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_141]
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_141]
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_141]
    at sun.nio.ch.IOUtil.write(IOUtil.java:51) ~[na:1.8.0_141]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_141]
    at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:403) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:934) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:360) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:901) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1396) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.handler.logging.LoggingHandler.flush(LoggingHandler.java:265) [netty-handler-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.flush(CombinedChannelDuplexHandler.java:533) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.flush(CombinedChannelDuplexHandler.java:358) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749) [netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at reactor.ipc.netty.channel.ChannelOperationsHandler.lambda$scheduleFlush$0(ChannelOperationsHandler.java:335) [reactor-netty-0.7.9.BUILD-SNAPSHOT.jar:0.7.9.BUILD-SNAPSHOT]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464) ~[netty-transport-4.1.27.Final.jar:4.1.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.27.Final.jar:4.1.27.Final]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_141]

Do you think you can grab #419 and test the change?

@bcelenk
Author

bcelenk commented Aug 22, 2018

@violetagg Thank you for spotting that. Unfortunately, I'm still experiencing the same exception with the last example. I've cherry-picked #419 onto the 0.7.x branch (also tried with 0.7.8.RELEASE as the base) and applied it to both the gateway and the origin. If the error is not reproducible on your side, my humble suggestion would be to enable SSL when connecting to the origin; in that case both Connection closed prematurely. and SSLEngine closed already. occur, and the error ratio is ~100% in my case. I have a hunch that the two exceptions are related. Below is an example log excerpt.

2018-08-22 23:41:39.793 ERROR 35017 --- [ctor-http-nio-4] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

java.io.IOException: Connection closed prematurely
	at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:271) ~[reactor-netty-0.7.9.BUILD-SNAPSHOT.jar:0.7.9.BUILD-SNAPSHOT]
	at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:115) ~[reactor-netty-0.7.9.BUILD-SNAPSHOT.jar:0.7.9.BUILD-SNAPSHOT]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282) ~[netty-codec-http-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377) ~[netty-codec-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342) ~[netty-codec-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1028) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]

2018-08-22 23:41:39.913 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

io.netty.channel.AbstractChannel$AnnotatedSocketException: Connection reset by peer: localhost/127.0.0.1:7777
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.net.SocketException: Connection reset by peer
	... 10 common frames omitted

2018-08-22 23:41:39.986 ERROR 35017 --- [ctor-http-nio-6] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

io.netty.channel.AbstractChannel$AnnotatedSocketException: Connection reset by peer: localhost/127.0.0.1:7777
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.net.SocketException: Connection reset by peer
	... 10 common frames omitted

2018-08-22 23:41:40.068 ERROR 35017 --- [ctor-http-nio-5] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.068 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.069 ERROR 35017 --- [ctor-http-nio-7] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.073 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.074 ERROR 35017 --- [ctor-http-nio-1] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.085 ERROR 35017 --- [ctor-http-nio-7] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.092 ERROR 35017 --- [ctor-http-nio-7] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.095 ERROR 35017 --- [ctor-http-nio-6] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

io.netty.channel.AbstractChannel$AnnotatedSocketException: Connection reset by peer: localhost/127.0.0.1:7777
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.net.SocketException: Connection reset by peer
	... 10 common frames omitted

2018-08-22 23:41:40.096 ERROR 35017 --- [ctor-http-nio-1] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:40.225 ERROR 35017 --- [ctor-http-nio-8] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

io.netty.channel.AbstractChannel$AnnotatedSocketException: Connection reset by peer: localhost/127.0.0.1:7777
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_131]
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_131]
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460) ~[netty-transport-4.1.28.Final.jar:4.1.28.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884) ~[netty-common-4.1.28.Final.jar:4.1.28.Final]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_131]
Caused by: java.net.SocketException: Connection reset by peer
	... 10 common frames omitted

2018-08-22 23:41:41.050 ERROR 35017 --- [ctor-http-nio-2] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.083 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.089 ERROR 35017 --- [ctor-http-nio-2] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.138 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.138 ERROR 35017 --- [ctor-http-nio-1] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.146 ERROR 35017 --- [ctor-http-nio-3] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

2018-08-22 23:41:41.146 ERROR 35017 --- [ctor-http-nio-4] .a.w.r.e.DefaultErrorWebExceptionHandler : Failed to handle request [GET http://localhost:9009/premature/test]

javax.net.ssl.SSLException: SSLEngine closed already
	at io.netty.handler.ssl.SslHandler.wrap(...)(Unknown Source) ~[netty-handler-4.1.28.Final.jar:4.1.28.Final]

@dave-fl

dave-fl commented Aug 23, 2018

Hi,

I've seen some mentions of problems related to pooling. Does this fall into the same category?

@violetagg
Member

@dave-fl Yes, the problem that I described above is related to the pooling.

@dave-fl

dave-fl commented Aug 24, 2018

@violetagg Will the use of netty-tcnative help here?

@mulderbaba

@dave-fl It won't. I think the reason lies in the pool management itself. I tried @bcelenk's sample as well: if the origin restarts, the gateway doesn't do a proper health check on the previously opened socket connections while trying to re-use them when handshaking with the new origin. @violetagg do you have any comments on this one? If you can shed some light on it class-wise, I can also take a deep look.

@dave-fl

dave-fl commented Aug 28, 2018

@mulderbaba

At any point, do the connections move out of the invalid state and become usable again? E.g. can this be worked around with a retry (sketched below)?

I’ve also read that a fixed pool can address this (not sure why).
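
For reference, one shape such a retry could take — a hedged sketch, assuming the request is idempotent and Reactor 3's retry(long, Predicate) overload (deprecated in later releases); not a confirmed fix:

import java.io.IOException;

import reactor.core.publisher.Mono;

// Hypothetical workaround: retry the call a few times when the pooled connection
// turns out to be dead mid-request. Only safe for idempotent requests.
final class RetrySketch {
    static <T> Mono<T> withRetry(Mono<T> call) {
        return call.retry(3, t -> t instanceof IOException); // e.g. "Connection closed prematurely"
    }
}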

@violetagg
Member

Hey,

I added additional debugging to 0.7.9.BUILD-SNAPSHOT.
Please execute the following:

  • enable the logger logging.level.reactor.ipc.netty.resources.DefaultPoolResources=DEBUG on the gateway application
  • run the scenario
  • stop the origin
  • look at the log file and check the last record of the type Channel closed, now x active connections and y inactive connections; this should be the last line of the log file. I want to check whether there are channels left in the pool on your machine or not.

Thanks a lot,
Violeta

@mulderbaba

mulderbaba commented Aug 29, 2018

Hey @violetagg,

  • The origin impl is running on one box and the gateway impl on another; the load tool wrk is running on the same box as the origin impl.
  • I enabled the logging as stated.
  • I implemented 2 origins, one reactor-netty based [legend: X] and one Tomcat based [legend: Y]. Both origins receive the request and delay it by 1 second.
  • I updated my gateway implementation to use 0.7.9.BUILD-SNAPSHOT.

For both origin implementations I experienced SSLEngine closed already exceptions, and the count of inactive connections wasn't 0. I'm attaching the log output of my gateway for both executions, X and Y.
[X]: gateway_output_01.txt
[Y]: gateway_output_02.txt

Wireshark captures are from the origin machine.
[X]: origin_wireshark_01.pcapng.zip

@violetagg
Member

@mulderbaba There was one use case where we missed decrementing the inactive counter (when the connection is created but cannot be acquired because of some exception; in my case it was SocketException: Connection reset by peer). I fixed that, so can you try with the new snapshot again?
Can you also enable the additional logger logging.level.reactor.ipc.netty.http.client.HttpClientOperations=DEBUG, so that when SSLEngine closed already happens we know the port number and can find the records in the tcpdump?
On my end I always see the server initiate the connection close (when SSLEngine closed already happens); when the origin is stopped, all the connections are closed, and when I start the origin again I can see that new connections are created.

2018-08-31 08:17:07.176 DEBUG 8785 --- [ctor-http-nio-3] r.i.n.resources.DefaultPoolResources     : [id: 0x3f7cffa5, L:/127.0.0.1:59578 ! R:localhost/127.0.0.1:8060] Channel closed, now 0 active connections and 0 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-1] r.i.n.resources.DefaultPoolResources     : [id: 0xe87e160d] Created new pooled channel, now 0 active connections and 5 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-5] r.i.n.resources.DefaultPoolResources     : [id: 0x275ebcb8] Created new pooled channel, now 0 active connections and 4 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-7] r.i.n.resources.DefaultPoolResources     : [id: 0x236088be] Created new pooled channel, now 0 active connections and 4 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-8] r.i.n.resources.DefaultPoolResources     : [id: 0x3d488f83] Created new pooled channel, now 0 active connections and 4 inactive connections
2018-08-31 08:17:40.264 DEBUG 8785 --- [ctor-http-nio-6] r.i.n.resources.DefaultPoolResources     : [id: 0x8ec97034] Created new pooled channel, now 0 active connections and 4 inactive connections

@mulderbaba

@violetagg I made 2 runs with the reactor-netty based gateway and origin, using the latest snapshot of 0.7.x.

The first run contains all logs in debug mode. Here are the logs of the gateway, the logs of the origin, and the tcpdump of the origin:
gateway_log_01.log
gateway_log_02.log
origin_log.log
origin_tcpdump.pcapng.zip

The second run has only HttpClientOperations=DEBUG set. Here are the logs of the gateway and the tcpdump of the origin:
gateway_simplified.log
wireshark-short-last.pcapng.zip

@JCASTANO

JCASTANO commented Sep 7, 2018

@violetagg Good afternoon. Is there an estimated date for a fix for this problem? We are starting a project with Spring WebFlux and are a little concerned, because we have a large volume of users. Greetings and many thanks.

@sharathsrinivasa

> @sharathsrinivasa I'm not able to reproduce it using the provided scenario. Can you share your environment details - OS, JAVA version etc.

OS - macOS Sierra
Java
➜ ~ java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
➜ ~

@violetagg
Member

Please try 0.7.13.BUILD-SNAPSHOT/0.8.4.BUILD-SNAPSHOT.
Snapshots repository https://repo.spring.io/snapshot
A new fix is available for the scenario client(POST) -> server(POST) (#536)

@revfactory

Our team is also seeing the following error:

BubblingException: reactor.netty.http.client.HttpClientOperations$PrematureCloseException: Connection prematurely closed DURING response
     at reactor.core.Exceptions.bubble(Exceptions.java:154)
     at reactor.core.publisher.Operators.onErrorDropped(Operators.java:512)
     at reactor.netty.channel.FluxReceive.onInboundError(FluxReceive.java:349)
     at reactor.netty.channel.ChannelOperations.onInboundError(ChannelOperations.java:396)
     at reactor.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:254)
...
(27 additional frames were not displayed)

PrematureCloseException: Connection prematurely closed DURING response

I'm not sure if this is the same issue. We tried updating to 0.8.3.RELEASE and it was not fixed. Is the issue resolved in the 0.8.4.BUILD-SNAPSHOT version (maybe Spring Boot 2.1.2)?

@violetagg
Member

@revfactory Please keep in mind that this exception might not indicate a problem in Reactor Netty: your server may be closing the connection on purpose (e.g. spring-cloud/spring-cloud-gateway#473 (comment)), so it is crucial for us to know your scenario.
Please also try 0.8.4.BUILD-SNAPSHOT.
Thanks in advance,
Violeta

@lukaszpy

lukaszpy commented Dec 4, 2018

@violetagg - maybe my IOException is related to my Docker Swarm environment, where I'm using IPVS load balancing. Could you let me know whether it could be related? https://translate.google.pl/translate?sl=auto&tl=pl&u=http%3A%2F%2Fwww.anymk.com%2Fp%2Ff05294c0a456

I'm currently using 0.8.4-SNAPSHOT and have the same issue.

@lukaszpy

lukaszpy commented Dec 6, 2018

@violetagg looks like my problems were resolved by lowering the TCP keepalive settings (presumably keeping pooled connections alive below the IPVS idle-connection timeout):

sudo sysctl -w net.ipv4.tcp_keepalive_time=600
sudo sysctl -w net.ipv4.tcp_keepalive_intvl=50
sudo sysctl -w net.ipv4.tcp_keepalive_probes=10

@violetagg
Member

violetagg commented Dec 10, 2018

All, I committed another fix related to Connection prematurely closed DURING response (#551).
If you can test 0.8.4.BUILD-SNAPSHOT, that will be very helpful.

@sharathsrinivasa

@violetagg For me, switching the ConnectionProvider from elastic to fixed cleared up a bunch of connection errors (sketched below). Will post back if I see more issues on my side.
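
A sketch of that change, assuming the reactor-netty 0.8 API (ConnectionProvider.fixed); the pool name and size below are illustrative:

import reactor.netty.http.client.HttpClient;
import reactor.netty.resources.ConnectionProvider;

// Illustration only: bound the connection pool instead of using the default
// elastic provider, so connections cannot accumulate without limit.
final class FixedPoolSketch {
    static HttpClient client() {
        ConnectionProvider fixed = ConnectionProvider.fixed("gateway-pool", 500);
        return HttpClient.create(fixed);
    }
}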

@violetagg
Member

Closing this one for now. If there are use cases other than those fixed so far, we can reopen it.

@xiaodanguoguo

@violetagg
Hey, I'm also hitting these exceptions: the same request to the gateway alternates between one success and one failure, and the debug log of the second (failing) request is as follows:


[01-14 14:31:37.089] DEBUG DefaultClientConnection.java:278- >> GET /eureka/apps/delta HTTP/1.1
[01-14 14:31:37.089] DEBUG DefaultClientConnection.java:281- >> Accept: application/json
[01-14 14:31:37.089] DEBUG DefaultClientConnection.java:281- >> DiscoveryIdentity-Name: DefaultClient
[01-14 14:31:37.089] DEBUG DefaultClientConnection.java:281- >> DiscoveryIdentity-Version: 1.4
[01-14 14:31:37.089] DEBUG DefaultClientConnection.java:281- >> DiscoveryIdentity-Id: 10.1.41.145
[01-14 14:31:37.089] DEBUG DefaultClientConnection.java:281- >> Accept-Encoding: gzip
[01-14 14:31:37.089] DEBUG DefaultClientConnection.java:281- >> Host: 127.0.0.1:10008
[01-14 14:31:37.090] DEBUG DefaultClientConnection.java:281- >> Connection: Keep-Alive
[01-14 14:31:37.090] DEBUG DefaultClientConnection.java:281- >> User-Agent: Java-EurekaClient/v1.9.3
[01-14 14:31:37.102] DEBUG Wire.java:73-  << "HTTP/1.1 200 [\r][\n]"
[01-14 14:31:37.103] DEBUG Wire.java:73-  << "Content-Encoding: gzip[\r][\n]"
[01-14 14:31:37.103] DEBUG Wire.java:73-  << "Content-Type: application/json[\r][\n]"
[01-14 14:31:37.103] DEBUG Wire.java:73-  << "Content-Length: 634[\r][\n]"
[01-14 14:31:37.103] DEBUG Wire.java:73-  << "Date: Mon, 14 Jan 2019 06:31:37 GMT[\r][\n]"
[01-14 14:31:37.103] DEBUG Wire.java:73-  << "[\r][\n]"
[01-14 14:31:37.104] DEBUG DefaultClientConnection.java:259- Receiving response: HTTP/1.1 200 
[01-14 14:31:37.104] DEBUG DefaultClientConnection.java:262- << HTTP/1.1 200 
[01-14 14:31:37.104] DEBUG DefaultClientConnection.java:265- << Content-Encoding: gzip
[01-14 14:31:37.104] DEBUG DefaultClientConnection.java:265- << Content-Type: application/json
[01-14 14:31:37.104] DEBUG DefaultClientConnection.java:265- << Content-Length: 634
[01-14 14:31:37.104] DEBUG DefaultClientConnection.java:265- << Date: Mon, 14 Jan 2019 06:31:37 GMT
[01-14 14:31:37.105] DEBUG DefaultRequestDirector.java:509- Connection can be kept alive indefinitely
[01-14 14:31:37.105] DEBUG Wire.java:73-  << "[0x1f][0x8b][0x8][0x0][0x0][0x0][0x0][0x0][0x0][0x0][0xed]U][0x8f][0x9a]@[0x14][0xfd]/[0x93]}D[0x14][0x1]q}Z+[0xa6]1M[0xd5][0xf8][0xb1][0xa6]i6f[0x16][0xae]2)[0xc]d[0x18][0xb4][0xc6][0xf0][0xdf]{[0x7][0xc4][0xd5]][0x8d][0xb6][0xd9][0xc7]>[0xe9][0xdc]{[0xee][0xd7][0xb9][0xf7][0x84]=[0xa1]I[0x12]2[0x8f]J[0x16][0xf3][0x94]t[0xf6]d[0x3]"U[0xff][0x97]K[0x1f]BII[0x87][0x18]DS([0xb4][0x4]4[\r][0xbc][0xd8][0x7]4[0xce][0xc7][0xcb][0xe6][0xb2]tT[0xe1][0xa4][0xf3]sO8[0x8d][0x94]{<[0x19][0xb9][0xf3][0xde]l0[0x1a][0xd6][0xa6][0xfd][0xc9][0xf3][0xa0][0xd7]G,[0xe3][0xa9][0xa4][0xdc][0x83][0x2]X=[0x6][0xbe]*[0xd1][0xd0][\r][0xdd]2t[0xc3][0xb2];[0xcd][0xa6][0xd9]Bp[0x10][0xa7]rX&;[0xf1][0x96][0x15][0xaf][0x16]H[0xba][0xbe]/>D`![0x99][0xa5]E[0xd3][0xf8][0x8a]qB[0xc1]|[0x1f][0xf8][0xf4]h[0x1f]~[0x1b][0x8e][0x16]Ct&[0xb1][0x90][0x8a][0x85][0x7]R[0xf4][0xa1][0x91]'[0xe0][0xf4]5[0x4][0xd5][0xa4][0x14][0x19][0x90][0x1c][0xd3][0x81][0x97][0x9][0x18][0xbf]!-[0xcb]<[0x3][0xae]h[0x98][0x16]H/[0xce][0xb8][0x14];5[0xa2][0xa1][0x11][0x9f]J[0xda][0x3].A[0xc][0xf8]*V[0xb1]O^HS[0xd5][0x80][0x17]G:[0x7][0xb9][\n]"
[01-14 14:31:37.105] DEBUG Wire.java:73-  << "[0xd9]o[0x1d][0xe7]c[0xe8][0xd7][0x7][0x15]A[0xf8]xpaE[0xb3]P[0xba][0xe7])[0xb4][0x8a][0xee][0xef][0xbb][0xd1][0x96][0xab][0x8a]![0xd0][0x14][0xaa][0xf4][0x2]8li8P[0xf8][0x8d][0xfa][0x9d][0x82][0x87][0xd5][0xcc][0x6][0xf6][0x92][0x89]bg[0x95][0xcd][0xb0]5[0x84][0xaf]Y*K[0xfb][0x8c]E[0x80][0xe5][0xa3]D[0xb9],[0xc7][0xb2][0x1c][0xd3]j:&[0xa2][0xb0]c9)[0x13]_[0x0][0xb5][0x1d][0xbb][0x81][0xac][0xc1][0x86]y[0xef][0xd2]4[0x14]q[0x2][0xed]0O[0xae]do[0xe1][0x0][0x11]H[0xaa][0x88]R[0xfd]G[0x94][0xd3]5D8[0xaf]^[0xee][0x85][0x14][0xb7][0x91][0xab][0xe3][0x88]`[0x8c][0xbe][0xb9][0x8][0xd1][0x1a]H[0x99]t[0xea][0xf5][0xf7]GT?[0xae][0xfe]6[0x94]z2[0xa3]2[0x16]uV[0xf2][0x1a][0x0][\r]e[0xd0][0xb][0xc0][0xfb]ug`[0x19][0x81][0xa1][0x9b][0xf2][\n]"
[01-14 14:31:37.105] DEBUG Wire.java:87-  << "[0xa1]Xl"b?+[0xa8][0xa8][0x1d][0xa6]'[0xd5][0x1]=[0xdf][0xc4][0xb1][0xb4][0x17][0xc7][0xc2]g[0x1c]W[0xc2][0xd7].K=u[0xbb][0xbb])[0x2]@[0x1c][0xef][0xac]\[0xc9]<A[0xd6][0xc0]?a[0x96][0x9c]R[0xeb][0x1c]`.[0x13]rw[0x19]d7L[0xa5][0xae]ro[0xbb]D[0x9d]U[0xd7]u[0xfb].[0xc9]_r[0xed]([0xec]E[0xff]Kw<[0x9e][0xd6][0xbe]vg[0xfd]E[0xf7][0xc7][0xdd][0xaa]n?[0xb6][0x1f]o[0xab][0xfa]B[0xf6][0xcf][0x94][0xb4]j[0xe2][0xbf][0xa4][0xdf]Dg;M[0xd3][0xb9]%[0xe9][0x3][0xe8][0xef]%][0x4][0xde][0x94]tq[0x18][0xf7]HZ[0x1][0xef][0x94]t[0x1][0xfd][0x17]I[0x9f][0x7]^[0x96][0xf4][0x16]^[0xd5][0xd7][0xb0][0xb6]F[0xbd]m[0xe9][0xee][0xb2][0x9e]?[0x82]>I[0xcc][0x5][0xa9][0xb7][0xc4]l[0xb7][0xda][0xcd][0xf6]U1[0xbf][0xe4][0xf9][0x1f][0xe2]2[0xfb][0xb6][0xf7][0x7][0x0][0x0]"
[01-14 14:31:37.107] DEBUG ThreadSafeClientConnManager.java:286- Released connection is reusable.
[01-14 14:31:37.107] DEBUG ConnPoolByRoute.java:431- Releasing connection [{}->http://127.0.0.1:10008][null]
[01-14 14:31:37.107] DEBUG ConnPoolByRoute.java:457- Pooling connection [{}->http://127.0.0.1:10008][null]; keep alive indefinitely
[01-14 14:31:37.107] DEBUG ConnPoolByRoute.java:678- Notifying no-one, there are no waiting threads
[01-14 14:31:37.107] DEBUG DeserializerStringCache.java:122- clearing global-level cache with size 1
[01-14 14:31:37.108] DEBUG DeserializerStringCache.java:126- clearing app-level serialization cache with size 15
[01-14 14:31:37.108] DEBUG AbstractJerseyEurekaHttpClient.java:206- Jersey HTTP GET http://127.0.0.1:10008/eureka//apps/delta?; statusCode=200
[01-14 14:31:37.108] DEBUG DiscoveryClient.java:1095- Got delta update with apps hashcode UP_2_
[01-14 14:31:37.108] DEBUG DiscoveryClient.java:1208- Added instance 10.1.41.145:2236 to the existing apps in region null
[01-14 14:31:37.108] DEBUG DiscoveryClient.java:1208- Added instance 10.1.41.145:8989 to the existing apps in region null
[01-14 14:31:37.108] DEBUG DiscoveryClient.java:1229- The total number of instances fetched by the delta processor : 2
[01-14 14:31:37.108] DEBUG DiscoveryClient.java:1126- The total number of all instances in the client now is 2
[01-14 14:31:37.108] DEBUG AbstractBeanFactory.java:254- Returning cached instance of singleton bean 'cachedCompositeRouteLocator'
[01-14 14:31:37.109] DEBUG DiscoveryClient.java:1491- Completed cache refresh task for discovery. All Apps hash code is Local region apps hashcode: UP_2_, is fetching remote regions? false 
[01-14 14:31:42.923] DEBUG Slf4JLogger.java:71- [id: 0x76628456, L:/10.1.41.145:52071 - R:/10.1.41.145:2236] READ COMPLETE
[01-14 14:31:42.923] DEBUG Loggers.java:218- [id: 0x76628456, L:/10.1.41.145:52071 ! R:/10.1.41.145:2236] Channel closed, now 1 active connections and -1 inactive connections
[01-14 14:31:42.924] DEBUG Loggers.java:218- [id: 0x76628456, L:/10.1.41.145:52071 ! R:/10.1.41.145:2236] Channel released, now 0 active connections and 0 inactive connections
[01-14 14:31:42.924] DEBUG Slf4JLogger.java:71- [id: 0x76628456, L:/10.1.41.145:52071 ! R:/10.1.41.145:2236] INACTIVE
[01-14 14:31:42.935] DEBUG RouterFunctions.java:406- Predicate "org.springframework.web.reactive.function.server.RequestPredicates$$Lambda$844/1897822856@24ab7298" matches against "POST /gateway/test"
[01-14 14:31:42.943] ERROR DefaultErrorWebExceptionHandler.java:218- Failed to handle request [POST http://localhost:8989/gateway/test]
java.io.IOException: Connection closed prematurely
	at reactor.ipc.netty.http.client.HttpClientOperations.onInboundClose(HttpClientOperations.java:271)
	at reactor.ipc.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:115)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
	at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:282)
	at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.handler.logging.LoggingHandler.channelInactive(LoggingHandler.java:167)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:446)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
	at java.lang.Thread.run(Thread.java:745)
[01-14 14:31:42.957] DEBUG Loggers.java:218- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] Writing object DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
HTTP/1.1 500 Internal Server Error
Content-Type: application/json;charset=UTF-8
Content-Length: 154
[01-14 14:31:42.958] DEBUG Slf4JLogger.java:71- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] WRITE: 105B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 48 54 54 50 2f 31 2e 31 20 35 30 30 20 49 6e 74 |HTTP/1.1 500 Int|
|00000010| 65 72 6e 61 6c 20 53 65 72 76 65 72 20 45 72 72 |ernal Server Err|
|00000020| 6f 72 0d 0a 43 6f 6e 74 65 6e 74 2d 54 79 70 65 |or..Content-Type|
|00000030| 3a 20 61 70 70 6c 69 63 61 74 69 6f 6e 2f 6a 73 |: application/js|
|00000040| 6f 6e 3b 63 68 61 72 73 65 74 3d 55 54 46 2d 38 |on;charset=UTF-8|
|00000050| 0d 0a 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 |..Content-Length|
|00000060| 3a 20 31 35 34 0d 0a 0d 0a                      |: 154....       |
+--------+-------------------------------------------------+----------------+
[01-14 14:31:42.958] DEBUG Slf4JLogger.java:71- [id: 0x76628456, L:/10.1.41.145:52071 ! R:/10.1.41.145:2236] UNREGISTERED
[01-14 14:31:42.958] DEBUG Slf4JLogger.java:71- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] FLUSH
[01-14 14:31:42.958] DEBUG Loggers.java:218- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] Writing object 
[01-14 14:31:42.959] DEBUG Slf4JLogger.java:71- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] WRITE: 154B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 7b 22 74 69 6d 65 73 74 61 6d 70 22 3a 22 32 30 |{"timestamp":"20|
|00000010| 31 39 2d 30 31 2d 31 34 54 30 36 3a 33 31 3a 34 |19-01-14T06:31:4|
|00000020| 32 2e 39 33 36 2b 30 30 30 30 22 2c 22 70 61 74 |2.936+0000","pat|
|00000030| 68 22 3a 22 2f 67 61 74 65 77 61 79 2f 74 65 73 |h":"/gateway/tes|
|00000040| 74 22 2c 22 73 74 61 74 75 73 22 3a 35 30 30 2c |t","status":500,|
|00000050| 22 65 72 72 6f 72 22 3a 22 49 6e 74 65 72 6e 61 |"error":"Interna|
|00000060| 6c 20 53 65 72 76 65 72 20 45 72 72 6f 72 22 2c |l Server Error",|
|00000070| 22 6d 65 73 73 61 67 65 22 3a 22 43 6f 6e 6e 65 |"message":"Conne|
|00000080| 63 74 69 6f 6e 20 63 6c 6f 73 65 64 20 70 72 65 |ction closed pre|
|00000090| 6d 61 74 75 72 65 6c 79 22 7d                   |maturely"}      |
+--------+-------------------------------------------------+----------------+
[01-14 14:31:42.959] DEBUG Slf4JLogger.java:71- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] FLUSH
[01-14 14:31:42.959] DEBUG ReactorHttpHandlerAdapter.java:77- Handling completed with success
[01-14 14:31:42.960] DEBUG Loggers.java:213- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] Last HTTP response frame
[01-14 14:31:42.960] DEBUG Loggers.java:218- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] Writing object EmptyLastHttpContent
[01-14 14:31:42.960] DEBUG Slf4JLogger.java:71- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] WRITE: 0B
[01-14 14:31:42.960] DEBUG Loggers.java:218- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] Decreasing pending responses, now 0
[01-14 14:31:42.960] DEBUG Slf4JLogger.java:71- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] FLUSH
[01-14 14:31:42.961] DEBUG Loggers.java:213- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] Discarding inbound content
[01-14 14:31:42.961] DEBUG Slf4JLogger.java:71- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] USER_EVENT: [Handler Terminated]
[01-14 14:31:42.961] DEBUG Loggers.java:218- [id: 0x0c34790d, L:/0:0:0:0:0:0:0:1:8989 - R:/0:0:0:0:0:0:0:1:52058] Disposing context reactor.ipc.netty.channel.ServerContextHandler@413c1998
[01-14 14:31:46.463] DEBUG ThreadSafeClientConnManager.java:332- Closing connections idle longer than 30000 SECONDS
[01-14 14:31:46.464] DEBUG ConnPoolByRoute.java:720- Closing connections idle longer than 30000 SECONDS
[01-14 14:31:46.883] DEBUG ThreadSafeClientConnManager.java:240- Get connection: {}->http://127.0.0.1:10008, timeout = 5000
[01-14 14:31:46.883] DEBUG ConnPoolByRoute.java:347- [{}->http://127.0.0.1:10008] total kept alive: 1, total issued: 0, total allocated: 1 out of 200
[01-14 14:31:46.883] DEBUG ConnPoolByRoute.java:496- Getting free connection [{}->http://127.0.0.1:10008][null]
[01-14 14:31:46.883] DEBUG DefaultRequestDirector.java:431- Stale connection check
[01-14 14:31:46.885] DEBUG RequestAddCookies.java:123- CookieSpec selected: default
[01-14 14:31:46.886] DEBUG RequestAuthCache.java:77- Auth cache not set in the context
[01-14 14:31:46.886] DEBUG RequestTargetAuthentication.java:80- Target auth state: UNCHALLENGED
[01-14 14:31:46.886] DEBUG RequestProxyAuthentication.java:89- Proxy auth state: UNCHALLENGED
[01-14 14:31:46.886] DEBUG DefaultRequestDirector.java:682- Attempt 1 to execute request
[01-14 14:31:46.887] DEBUG DefaultClientConnection.java:274- Sending request: PUT /eureka/apps/WEBAPPS-GATEWAY/10.1.41.145:8989?status=UP&lastDirtyTimestamp=1547447356828 HTTP/1.1
[01-14 14:31:46.887] DEBUG Wire.java:73-  >> "PUT /eureka/apps/WEBAPPS-GATEWAY/10.1.41.145:8989?status=UP&lastDirtyTimestamp=1547447356828 HTTP/1.1[\r][\n]"
[01-14 14:31:46.887] DEBUG Wire.java:73-  >> "DiscoveryIdentity-Name: DefaultClient[\r][\n]"
[01-14 14:31:46.887] DEBUG Wire.java:73-  >> "DiscoveryIdentity-Version: 1.4[\r][\n]"
[01-14 14:31:46.887] DEBUG Wire.java:73-  >> "DiscoveryIdentity-Id: 10.1.41.145[\r][\n]"
[01-14 14:31:46.887] DEBUG Wire.java:73-  >> "Accept-Encoding: gzip[\r][\n]"
[01-14 14:31:46.887] DEBUG Wire.java:73-  >> "Content-Length: 0[\r][\n]"
[01-14 14:31:46.888] DEBUG Wire.java:73-  >> "Host: 127.0.0.1:10008[\r][\n]"
[01-14 14:31:46.888] DEBUG Wire.java:73-  >> "Connection: Keep-Alive[\r][\n]"
[01-14 14:31:46.888] DEBUG Wire.java:73-  >> "User-Agent: Java-EurekaClient/v1.9.3[\r][\n]"
[01-14 14:31:46.888] DEBUG Wire.java:73-  >> "[\r][\n]"

I tried the newer version 0.7.12.RELEASE and ran my program on a Red Hat 4.8.5-16 server, but the result was the same exception.

My environment is as follows:

MacBook Pro
JDK 1.8.0_121
MacOS Mojave 10.14
Spring-Cloud Finchley.SR2
Reactor-netty 0.7.12.RELEASE, 0.7.10.RELEASE

@violetagg
Member

@xiaodanguoguo Try with 0.7.14.RELEASE; it contains fixes for some use cases that might cause this exception to be thrown.

@xiaodanguoguo

xiaodanguoguo commented Jan 15, 2019

@violetagg
I just tried this version; the exception still exists. Here is my pom config:

		<dependency>
			<groupId>org.springframework.cloud</groupId>
			<artifactId>spring-cloud-starter-gateway</artifactId>
			<exclusions>
				<exclusion>
					<groupId>io.projectreactor.ipc</groupId>
					<artifactId>reactor-netty</artifactId>
				</exclusion>
			</exclusions>
		</dependency>
		<dependency>
			<groupId>io.projectreactor.ipc</groupId>
			<artifactId>reactor-netty</artifactId>
			<version>0.7.14.RELEASE</version>
		</dependency>

and my gateway filter code:

public class MRSFilter implements GatewayFilter, Ordered {
	private static final String PARAM_ERR_MSG = "入参错误"; // "invalid input parameters"
	private static final String INTERFACDE_CODE_NOT_EXISTS = "接口号不存在"; // "interface code does not exist"
	private static final String COLUMN_NOT_EXISTS = "属性不存在"; // "property does not exist"
	private static final String SYS_ERR = "系统异常"; // "system error"
	private static final String DEFAULT_PRE_UTL = "http://localhost:8989/";
	private static final String PRE_SERVICE_ID = "lb://";
	private static final String SPLIT = "|";
	private static final String TARGET_OBJ_TYPE_ERROR = "不支持的对象类型"; // "unsupported object type"
	private static final String SENDER_IS_NULL = "系统号为空"; // "sender system code is empty"
	@Override
	public int getOrder() {
		return 10001;
	}

	@Override
	public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
		Map<String, Object> attributes = exchange.getAttributes();
		ServerHttpRequest request = exchange.getRequest();
		String bodyStr = resolveBodyFromRequest(request);
		NodeInfo node = JSON.parseObject(bodyStr, NodeInfo.class);
		node.getMessageHeader().getUUID();

		ServerHttpResponse response = exchange.getResponse();
		// Validate the input, then convert it into the parameter object
		Object requestBodyObj = getRequestBodyObj(node);
		if (requestBodyObj instanceof String) {
			IXBUSResponse resp = new IXBUSResponse(ServiceResponse.FAIL_CODE, (String) requestBodyObj);
			DataBuffer responseBodyDataBuffer = stringBuffer(JSON.toJSONString(resp));
			return response.writeWith(Mono.just(responseBodyDataBuffer));
		}
		
		DataBuffer bodyDataBuffer = stringBuffer(JSON.toJSONString(requestBodyObj));
		Flux<DataBuffer> bodyFlux = Flux.just(bodyDataBuffer);
		
		String sysCode = node.getMessageHeader().getSender();
		String interfacdCode = node.getMessageHeader().getSvcName();
		InterfaceInfo interfaceInfo = InterfaceInfoHelper
				.getInterfaceInfoByMessageTypeId(sysCode, interfacdCode);

		MRSServerHttpRequestDecorator mrsServerHttpRequestDecorator = new MRSServerHttpRequestDecorator(request, URI.create(DEFAULT_PRE_UTL + interfaceInfo.getUri())) {
			@Override
			public Flux<DataBuffer> getBody() {
				return bodyFlux;
			}
		};
		
		request = mrsServerHttpRequestDecorator;
		URI finalURI = URI.create(PRE_SERVICE_ID + interfaceInfo.getServiceId());
		attributes.put(ServerWebExchangeUtils.GATEWAY_REQUEST_URL_ATTR, finalURI);

		return chain.filter(exchange.mutate().request(request).build());
	}

	private Object getRequestBodyObj(NodeInfo node) {
		if (node == null) 
			return PARAM_ERR_MSG;
		
		MsgHeaderDto messageHeader = node.getMessageHeader();
		if (messageHeader == null) 
			return PARAM_ERR_MSG;
		else {
			String sysCode = messageHeader.getSender(); 
			if (StringHelper.isBlank(sysCode))
				return SENDER_IS_NULL;
			String uuid = messageHeader.getUUID();
			// IDs generated by the individual systems may collide
			String messageId = StringHelper.concat(sysCode, SPLIT, uuid);
			// If there is only one table, wrap it in a single object; if there are two,
			// build the outer object and convert the JSON rows according to its properties
			List<TablesDto> tables = node.getTables();
			// Interface code
			String interfaceId = messageHeader.getSvcName();
			InterfaceInfo interfaceInfo = InterfaceInfoHelper
					.getInterfaceInfoByMessageTypeId(sysCode, interfaceId);
			if (interfaceInfo == null) 
				return INTERFACDE_CODE_NOT_EXISTS;
			
			String[] classNames = interfaceInfo.getClassNames();
			Object result;
			try {
				// A length of one means a single-table object
				if (classNames.length == 1) {
					List<Object> objs = new ArrayList<>();
					Class<?> forName = Class.forName(classNames[0]);
					List<Map<String, Object>> rows = tables.get(0).getRows();
					for (Map<String, Object> map : rows) {
						Object obj = JSON.parseObject(JSON.toJSONString(map), forName);
						objs.add(obj); // add the converted row; without this the list stays empty
					}
					result = objs;
				} else {
					// The first entry is the outer object's class; the rest are the classes of the inner object collections
					String outerClassName = classNames[0];
					Class<?> forName = Class.forName(outerClassName);
					Object outerObj = forName.newInstance();
					
					for (TablesDto tablesDto : tables) {
						String fieldName = tablesDto.getName().toLowerCase();
						String className = interfaceInfo.getClassNameByFieldName(fieldName);
						Class<?> fieldObj = Class.forName(className);
						// Field name on the outer object
						Field declaredField = forName.getDeclaredField(fieldName);
						if (declaredField == null)
							throw new BusinessException(COLUMN_NOT_EXISTS);

						List<Object> objList = new ArrayList<>();
						tablesDto.getRows()
								.forEach(row -> objList.add(JSON.parseObject(JSON.toJSONString(row), fieldObj)));
						PropertyDescriptor property = new PropertyDescriptor(declaredField.getName(), forName);
						Method writeMethod = property.getWriteMethod();

						writeMethod.invoke(outerObj, objList);
					}
					result = outerObj;
				}
				return result;
			} catch (Exception e) {
				e.printStackTrace();
				return SYS_ERR;
			}
		}
	}

	/**
	 * Extracts the request body as a string from the Flux<DataBuffer>.
	 * 
	 * @return the request body
	 */
	private String resolveBodyFromRequest(ServerHttpRequest serverHttpRequest) {
		// Read the request body
		Flux<DataBuffer> body = serverHttpRequest.getBody();
		
		StringBuilder sb = new StringBuilder();

		body.subscribe(buffer -> {
			int readableByteCount = buffer.readableByteCount();
			byte[] bytes = new byte[readableByteCount];
			buffer.read(bytes);
			DataBufferUtils.release(buffer);
			String bodyString = new String(bytes, StandardCharsets.UTF_8);
			sb.append(bodyString);
		});
		return sb.toString();
	}

	/**
	 * Re-wraps the string into a DataBuffer.
	 * 
	 * @param value the body content to wrap
	 * @return the wrapped buffer
	 */
	private DataBuffer stringBuffer(String value) {
		byte[] bytes = value.getBytes(StandardCharsets.UTF_8);

		NettyDataBufferFactory nettyDataBufferFactory = new NettyDataBufferFactory(ByteBufAllocator.DEFAULT);
		DataBuffer buffer = nettyDataBufferFactory.allocateBuffer(bytes.length);
		buffer.write(bytes);
		return buffer;
	}

}
The overridden ServerHttpRequestDecorator used to change the request URL:

public class MRSServerHttpRequestDecorator extends ServerHttpRequestDecorator {

	private URI URI;
	
	public MRSServerHttpRequestDecorator(ServerHttpRequest delegate, URI uri) {
		super(delegate);
		this.URI = uri;
	}

	@Override
	public URI getURI() {
		return URI;
	}
}

Main class:

public class GatewayApplication {
	private static final String SERVICE = "/gateway/test";
	private static final String URI = "lb://aaa";
	@Bean
	public Queue helloQueue() {
		return new Queue(queueName);
	}

	public static void main(String[] args) {
		System.setProperty("server.TYPE", "gateway");
		SpringApplication.run(GatewayApplication.class, args);
	}

	@Bean
	public MRSFilter testFilter() {
		return new MRSFilter();
	}

    @Autowired
    private MRSFilter filter;

    @Bean
    public RouteLocator myRoutes(RouteLocatorBuilder builder) {
        RouteLocatorBuilder.Builder routes = builder.routes();
        RouteLocatorBuilder.Builder serviceProvider = routes
                .route("route_get",
                        r -> r
                                .method(HttpMethod.GET)
                                .and()
                                .path(SERVICE)
                                .filters(f -> {
                                    f.filter(filter);
                                    return f;
                                })
                                .uri(URI))
                .route("route_post",
                        r -> r
                                .header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
                                .and()
                                .method(HttpMethod.POST)
                                .and()
                                .readBody(Object.class, readBody -> {
                                    return true;
                                })
                                .and()
                                .path(SERVICE)
                                .filters(f -> {
                                    f.filter(filter);
                                    return f;
                                })
                                .uri(URI));
        RouteLocator routeLocator = serviceProvider.build();
        return routeLocator;
    }
}

I want to rewrite the request URL based on the business interface number carried in the request body, but the gateway keeps alternating between success and failure:
I send the first request and it succeeds; if I immediately send a second request, it fails. But if I wait a moment, the second request succeeds!
I don't understand what causes the error.
It looks related to my changing the content of the request body. Is this operation not supported in the gateway? If I only change the request URL, the problem cannot be reproduced.

@violetagg
Member

@xiaodanguoguo What exactly are you trying to achieve with the code?

	/**
	 * Extracts the request body as a string from the Flux<DataBuffer>.
	 * 
	 * @return the request body
	 */
	private String resolveBodyFromRequest(ServerHttpRequest serverHttpRequest) {
		// Read the request body
		Flux<DataBuffer> body = serverHttpRequest.getBody();
		
		StringBuilder sb = new StringBuilder();

		body.subscribe(buffer -> {
			int readableByteCount = buffer.readableByteCount();
			byte[] bytes = new byte[readableByteCount];
			buffer.read(bytes);
			DataBufferUtils.release(buffer);
			String bodyString = new String(bytes, StandardCharsets.UTF_8);
			sb.append(bodyString);
		});
		return sb.toString();
	}
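
Note that subscribe here is asynchronous: the lambda may run after sb.toString() has already returned, so the method can hand back an empty or partial body even though the request's Flux<DataBuffer> has already been consumed and cannot be replayed downstream. A minimal non-blocking sketch of the same idea, assuming Spring's DataBufferUtils.join (illustrative only, not the project's actual code):

import java.nio.charset.StandardCharsets;

import org.springframework.core.io.buffer.DataBufferUtils;
import org.springframework.http.server.reactive.ServerHttpRequest;

import reactor.core.publisher.Mono;

	// Sketch: aggregate all body buffers asynchronously and map them to a String,
	// instead of subscribing and returning before the data has arrived.
	private Mono<String> resolveBodyFromRequest(ServerHttpRequest serverHttpRequest) {
		return DataBufferUtils.join(serverHttpRequest.getBody())
				.map(buffer -> {
					byte[] bytes = new byte[buffer.readableByteCount()];
					buffer.read(bytes);
					DataBufferUtils.release(buffer); // release the pooled buffer
					return new String(bytes, StandardCharsets.UTF_8);
				});
	}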

@xiaodanguoguo

@violetagg
I want to read the request body to check the security parameters, then use the business parameters to route to my business service, changing the body along the way. But every time after I changed the request body, the results alternated between success and failure.

@xiaodanguoguo

xiaodanguoguo commented Jan 17, 2019

@violetagg
I switched to a different way of modifying the request body, and now it works correctly:

	@Bean
	public RouteLocator myRoutes(RouteLocatorBuilder builder) {
		RouteLocatorBuilder.Builder routes = builder.routes();
		RouteLocatorBuilder.Builder serviceProvider = routes
				// GET requests are not supported
				.route("route_post", r -> r.header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE).and()
						.method(HttpMethod.POST).and().readBody(Object.class, readBody -> {
							return true;
						}).and().path(SERVICE).filters(f -> {
							f.modifyRequestBody(String.class, String.class, ((exchange, s) -> {
								// TODO: send the message
								// sendMessage.send(s);
								// `s` is the request body
								NodeInfo node = JSON.parseObject(s, NodeInfo.class);
								Object requestBody = getRequestBodyObj(node);
								Map<String, Object> attributes = exchange.getAttributes();
								if (requestBody instanceof String)
									attributes.put(ERROR_INFO, requestBody);
								else {
									String sysCode = node.getMessageHeader().getSender();
									String interfacdCode = node.getMessageHeader().getSvcName();
									InterfaceInfo interfaceInfo = InterfaceInfoHelper
											.getInterfaceInfoByMessageTypeId(sysCode, interfacdCode);
									attributes.put(MRSFilter.CACHE_REQUEST_BODY_OBJECT, interfaceInfo);
								}

								return Mono.just(JSON.toJSONString(requestBody));
							}));
							f.filter(filter);
							return f;
						}).uri(URI));
		RouteLocator routeLocator = serviceProvider.build();
		return routeLocator;
	}
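
As far as I can tell, this variant works because modifyRequestBody decorates the exchange with a cached, re-playable body and adjusts the Content-Length header before the request is forwarded downstream, whereas my earlier filter consumed the original Flux<DataBuffer> out of band, so the proxied request could be sent with headers that no longer matched the bytes on the wire.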

@violetagg
Member

@xiaodanguoguo Please put this code up somewhere as a project; it is hard to guess all the needed dependencies. Thanks.

@tareksha
Contributor

tareksha commented Feb 3, 2019

Hi all,
I'm experiencing this issue with 0.7.x too, even 0.7.14.
I should point out that these failures are near-consistent when we use the same client from different threads - is the client thread-safe?

@violetagg
Member

@tareqhs Hi, please create a new issue with a reproducible example OR a scenario description.
Thanks, Violeta

@tony-clarke-amdocs

@violetagg I am also seeing the same issue. See: spring-cloud/spring-cloud-gateway#473
As for reproducible steps: we are seeing it from Spring Cloud Gateway. Just send many requests through Spring Cloud Gateway and wait anywhere from 1 hour to 12 hours.
See stack trace here: spring-cloud/spring-cloud-gateway#473 (comment)
We are using 0.7.14. Are there any fixes in 0.8.x that are not yet in 0.7.14? When we disable the netty client pool, we are not able to reproduce the problem.
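In case anyone wants to repeat the experiment: we disabled the pool through the gateway's HttpClient configuration, along the lines of

spring.cloud.gateway.httpclient.pool.type=DISABLED

(the property name is what I see in the HttpClientProperties class of our release line - verify it against the version you are on).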

@violetagg
Member

@tony-clarke-amdocs Are you able to test with 0.7.15.BUILD-SNAPSHOT?

@tony-clarke-amdocs

Will try and get to it soon. Thanks!

@tony-clarke-amdocs

@violetagg So far this looks better. I have not been able to reproduce the problem. So whatever you did, it looks better so far :)

@violetagg
Member

@tony-clarke-amdocs I backported several fixes from 0.8.x to 0.7.x.

@iblogc

iblogc commented Feb 20, 2021

@violetagg looks like my problems were resolved by changing:

sudo sysctl -w net.ipv4.tcp_keepalive_time=600
sudo sysctl -w net.ipv4.tcp_keepalive_intvl=50
sudo sysctl -w net.ipv4.tcp_keepalive_probes=10

Hi, is this set in the container or on the host?
