Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Webflux RestController unexpected 400 response with reused connection and incomplete expect-100 activity #2910

Closed
pivotal-Josh-Gainey opened this issue Sep 26, 2023 · 7 comments · Fixed by #2919
Assignees
Labels
type/bug A general bug
Milestone

Comments

@pivotal-Josh-Gainey
Copy link

We discovered a difference in behavior while using reactor-netty as a server compared to other servers such as tomcat or go http in regards to a specific scenario. The scenario is:

  • client makes POST to server
  • The request contains an Expect: 100-continue header and a content-length > 0
  • The server responds with a status code before reading the body or closing the connection
  • The next request on the same connection is immediately seen as malformed and response code 400 is returned to client.

The expect-100 header in an http request allows the server to validate headers before asking the client to proceed sending the request body. Once the server validates the headers it gives a 100 continue response back to client and the client continues sending the body.

What we observe is a client is reusing a keep-alive connection to the server. First request is a POST request and the response is unauthorized and given a 401 response. Since the 401 was returned to client and not a 100 continue then, the POST body is never sent from client to server. Due to the content-length specified in the POST request, the next bytes on the wire the server is looking for is the POST body - however its not the POST body but a completely new request and results in an immediate 400.

To summarize it seems this issue is very specific to connection reuse and the expect-100 header in a request that contains payload and the server returns a non 100/200 status code prior to reading the rest of the body or closing the connection.

Per rfc https://www.rfc-editor.org/rfc/rfc9110#name-expect
"A server that responds with a final status code before reading the
entire message body SHOULD indicate in that response whether it
intends to close the connection or continue reading and discarding
the request message (see Section 6.6 of [RFC7230])."

Expected Behavior

The server should close the connection on a request that results in a non 200 status code that had expectations (such as expect 100)

Actual Behavior

The server returns a status code prior to sending a 100 response to the client and due to the content-length - the next bytes the server wants to read from the wire is the POST body payload but its a new request and results in immediate 400.

Steps to Reproduce

Repro steps are as follows:

1 - The code snippits/data relevant to app
2 - The commands to use over netcat while running the app on localhost:8080

In the following Controller, if you uncomment either of those lines - the app works as expected and no 400s will be observed.

#Controller

public final class Controller {
    @RequestMapping("/**")
    public Mono<Void> service( ServerHttpRequest incoming, ServerHttpResponse response) {
        System.out.println("expect100test Received request");

        Mono<Void> x= Mono.empty();
        if(incoming.getHeaders().get(HttpHeaders.AUTHORIZATION) == null) {
            //  response.getHeaders().add(HttpHeaders.CONNECTION, "close");
            //  Flux<DataBuffer> myBody = incoming.getBody();
            response.setStatusCode(HttpStatus.UNAUTHORIZED);
            System.out.println("expect100test sending 401");}
        else {
            response.setStatusCode(HttpStatus.OK);
            x= response.writeWith(incoming.getBody());
            System.out.println("expect100test  sending 200");
        }
        System.out.println("expect100test request complete");
        return x;
    }
}

#SecurityConfig

@EnableWebFluxSecurity
class SecurityConfig {
    @Bean
    public SecurityWebFilterChain springSecurityFilterChain(final ServerHttpSecurity http) {
        return http.authorizeExchange()
                .pathMatchers(HttpMethod.OPTIONS).denyAll()
                .pathMatchers(HttpMethod.TRACE).denyAll()
                .anyExchange().permitAll()
                .and()
                .csrf().disable()
                .build();
    }
}

#pom

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.7.17-SNAPSHOT</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.example</groupId>
    <artifactId>expect100test</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>expect100test</name>
    <description>expect100test</description>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-webflux</artifactId>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-security</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>io.projectreactor</groupId>
            <artifactId>reactor-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>

Steps to repro on localhost netcat:

1 - Run the app
2 - Start netcat to localhost at the port the app is running on nc localhost 8080
3 - Copy/Paste in the request without the POST body

POST / HTTP/1.1
Host: localhost
User-Agent: curl/7.81.0
Accept: */*
Expect: 100-continue
Content-Length: 10
Content-Type: application/x-www-form-urlencoded
connection: keep-alive

4 - Press enter twice to send the request
5 - Copy/Paste in the second request to send on that same connection

GET / HTTP/1.1
Host: localhost
User-Agent: curl/7.81.0
Accept: */*
Content-Type: application/x-www-form-urlencoded
Authorization: Yes
connection: keep-alive

6 - Its an immediate 400 for second request and connection is closed.

For testing purposes - here is how to send the POST body payload in the netcat request Copy/Paste:

POST / HTTP/1.1
Host: localhost
User-Agent: curl/7.81.0
Accept: */*
Expect: 100-continue
Content-Length: 10
Content-Type: application/x-www-form-urlencoded
connection: keep-alive

[000, 000]

Possible Solution

This is where we are seeking input. We are unsure if this issue is considered a bug in netty, reactor-netty, Spring or if this is not a bug at all and its expected that the implementing application should handle edge cases like this programatically vs the underlying abstractions.

If implementing in the server it appears the following is a common approach:
Close the connection on any non-200 status codes that have expectations such as the expect-100 header present but no POST body was read.

Your Environment

This has been observed in a variety of environments such as cloud foundry and localhost.

@pivotal-Josh-Gainey pivotal-Josh-Gainey added status/need-triage A new issue that still need to be evaluated as a whole type/bug A general bug labels Sep 26, 2023
@violetagg violetagg removed the status/need-triage A new issue that still need to be evaluated as a whole label Sep 27, 2023
@violetagg violetagg self-assigned this Sep 27, 2023
@violetagg violetagg added this to the 1.0.37 milestone Sep 27, 2023
@violetagg
Copy link
Member

I can reproduce it

@hmble2
Copy link

hmble2 commented Sep 27, 2023

It's best to fix it within abstraction layer (just like Tomcat does) , treat it is as RFC non-compliance issue. It's not good idea to expose these complexities to developers who are using netty server. Developers could end up spending countless hours researching such complicated issues. Secondly, the connection must be closed anyway for any non-final status codes per RFC:

A server that responds with a final status code before reading the entire request content SHOULD indicate whether it intends to close the connection (e.g., see Section 9.6 of [HTTP/1.1]) or continue reading the request content.

So if the application-code replies with final status code (401, 500 or anything other than 100 Continue) without reading body then the connection must be closed anyway, there is no use-case where developer would want to keep connection open (as that would be violation of RFC). So I feel this complexity should be hidden from developers.

@violetagg
Copy link
Member

violetagg commented Oct 3, 2023

@pivotal-Josh-Gainey @hmble2 The fix is available in 1.0.37-SNAPSHOT and 1.1.12-SNAPSHOT. If you can test it, it would be great!

@pivotal-Josh-Gainey Thanks for the detailed description!

@pivotal-Josh-Gainey
Copy link
Author

Thank you so much for the quick help @violetagg!

Tested 1.1.12-SNAPSHOT on localhost and it shows connection is closing now!

$ nc localhost 8080
POST / HTTP/1.1
Host: localhost
User-Agent: curl/7.81.0
Accept: */*
Expect: 100-continue
Content-Length: 10
Content-Type: application/x-www-form-urlencoded
connection: keep-alive

HTTP/1.1 401 Unauthorized
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: 0
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1 ; mode=block
Referrer-Policy: no-referrer
content-length: 0
connection: close

I also tested this on an app pushed to cloud foundry and the unexpected 400s are no longer observed.

@violetagg
Copy link
Member

@pivotal-Josh-Gainey Thanks

@hmble2
Copy link

hmble2 commented Oct 4, 2023

@violetagg @pivotal-Josh-Gainey Are we closing connection on every non-200 response ? I think we don't want to close the connection when response code is 100 Continue, response code 100 means server wants to accept request-body over same connection, and so closing won't be appropriate. Response code 100 can come either from application code (behind this same reactor-netty server) when it decides to handle Expect header explicitly, OR otherwise it can come from backend server (such as golang based servers) when application based of rector-netty is just acting as proxy/reverse-proxy to backend server (and backend server handles Expect header by sending 100 Continue). Please validate.

@violetagg
Copy link
Member

violetagg commented Oct 5, 2023

@hmble2 Good point. Let me explain how Reactor Netty works:

Reactor Netty gives 2 types of API to the application:

  • send* which can be used to send the final response. This is where the fix was applied. So a final response cannot be 100 continue.
  • receive* which can be used to receive the incoming data. There is no need the application to understand how exactly the incoming data will be received. It is Reactor Netty responsibility to do the necessary things in order to received it, including when a non final response needs to be sent.

Now to your use cases:

Response code 100 can come either from application code (behind this same reactor-netty server) when it decides to handle Expect header explicitly

A request with Expect header is received.

  • The application can decide that it is not interested in the incoming data and wants to send the final response using send* API -> the status code cannot be 100 continue.
  • The application can decide that it is interested in the incoming data and uses receive* API. Reactor Netty responsibility is to send the non final 100 continue response.

OR otherwise it can come from backend server (such as golang based servers) when application based of rector-netty is just acting as proxy/reverse-proxy to backend server (and backend server handles Expect header by sending 100 Continue)

client -> gateway -> backend

When a backend sends back 100 continue, the gateway behaves exactly the same as in the first use case - it can use receive* API for consuming the incoming data. Reactor Netty responsibility is to send the non final 100 continue response.

In summary there is NO Reactor Netty API that can be used by the application to send non final response.

I hope this clarifies the fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug A general bug
Projects
None yet
3 participants