Stream request bodies when proxying to outpack server. #107

Closed
wants to merge 2 commits

Conversation

@plietar (Member) commented Aug 22, 2024

If a client tries to upload a file to Packit using the outpack API, the
Packit server would buffer the entire request in memory before sending it
on to the outpack server. This wasn't obvious from the code: it was an
implementation detail of the HTTP library being used.

This caused issues when clients uploaded very large files - too large to
fit in memory - because the Packit server would run out of heap space and
throw an exception.

Thankfully the issue is simple to fix: the HTTP library only needs to be
configured with `setBufferRequestBody(false)`. In fact,
[according to the docs][setBufferRequestBody], the latest version of
Spring disables request buffering by default; we don't use that version
of Spring yet, however.

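For reference, the change amounts to something like the following sketch,
assuming `SimpleClientHttpRequestFactory` and a `RestTemplate`-based client
(the actual factory and wiring in Packit may differ):

```
import org.springframework.http.client.SimpleClientHttpRequestFactory
import org.springframework.web.client.RestTemplate

// Sketch only: build a RestTemplate whose request factory streams request
// bodies instead of copying them into memory first.
fun streamingRestTemplate(): RestTemplate {
    val factory = SimpleClientHttpRequestFactory()
    // Before Spring 6.1 the factory buffers the whole request body in a
    // byte array by default; disabling that makes it stream instead.
    factory.setBufferRequestBody(false)
    return RestTemplate(factory)
}
```
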
Steps to reproduce:
1. Create a 1GB `data` file:

    ```
    dd if=/dev/random of=data bs=1M count=1024
    ```

2. Start the Packit server with a 128MB maximum heap size:

    ```
    java -Xmx128m -jar app/build/libs/app.jar
    ```

3. Calculate the hash of the file and upload it to the server:

    ```
    hash=$(sha256sum data | awk '{ print $1 }')
    curl -X POST -T data \
      -H "Authorization: Bearer $TOKEN" \
      http://127.0.0.1:8080/outpack/file/sha256:$hash
    ```

Without this change, step 3 results in an `OutOfMemoryError` exception.

[setBufferRequestBody]: https://docs.spring.io/spring-framework/docs/6.1.0/javadoc-api/org/springframework/http/client/SimpleClientHttpRequestFactory.html#setBufferRequestBody(boolean)

codecov bot commented Aug 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.00%. Comparing base (4d636e7) to head (b85a58a).

Additional details and impacted files
@@                    Coverage Diff                     @@
##           mrc-5677-refactor-clients     #107   +/-   ##
==========================================================
  Coverage                      96.00%   96.00%           
==========================================================
  Files                            106      106           
  Lines                            975      975           
  Branches                         252      252           
==========================================================
  Hits                             936      936           
  Misses                            38       38           
  Partials                           1        1           


@plietar (Member, Author) commented Aug 22, 2024

I spent way too long trying to write integration tests for this but got nowhere.

I had code that created a 1GB `InputStream` full of zero bytes that could be sent to the API. Naturally, the HTTP client used by the integration tests was also buffering the request and blew the heap. To fix that I tried to set `setBufferRequestBody(false)` on the test client as well, just as this PR does for the server's client; however, the test client already has `setOutputStreaming(false)`.

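For illustration, the payload looked roughly like the sketch below - hypothetical names, not the actual test code - assuming a Spring `RestTemplate`-style test client:

```
import org.springframework.core.io.InputStreamResource
import org.springframework.http.HttpEntity
import org.springframework.http.HttpHeaders
import org.springframework.http.MediaType
import java.io.InputStream

// Yields `remaining` zero bytes without ever allocating them in memory.
// (Single-byte reads only; good enough for a sketch.)
class ZeroInputStream(private var remaining: Long) : InputStream() {
    override fun read(): Int = if (remaining-- > 0) 0 else -1
}

// Wrap the stream as a request entity with an explicit Content-Length so the
// client can, in principle, stream it rather than buffer it.
fun oneGigabyteEntity(): HttpEntity<InputStreamResource> {
    val size = 1L shl 30 // 1 GiB
    val headers = HttpHeaders().apply {
        contentType = MediaType.APPLICATION_OCTET_STREAM
        contentLength = size
    }
    return HttpEntity(InputStreamResource(ZeroInputStream(size)), headers)
}
```
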
With `setOutputStreaming(false)`, "the `setFixedLengthStreamingMode` and `setChunkedStreamingMode` methods of the underlying connection will never be called", which means the request ends up being buffered one layer further down the stack (I don't know why it can't call those methods, but here we are).

I tried removing `setOutputStreaming(false)` from the test client, but then any test that is supposed to yield a 401 error produces an `HttpRetryException` instead. It turns out the underlying `URLConnection` retries requests internally, which it cannot do when streaming is enabled.

The docs for `TestRestClient` suggest using "Apache Http Client 4.3.2" instead of `URLConnection` as the request factory, so I did that. It fixes the issues with the test client, and errors are now correctly reported instead of raising `HttpRetryException`. But merely having the Apache client on the classpath changes the behaviour of the server: the `IOUtils.copy(request.inputStream, serverRequest.body)` line in `GenericClient` no longer works because there is no `getBody` implementation.

This is the point where I gave up.

@absternator deleted the branch mrc-5677-refactor-clients August 22, 2024 15:57
@plietar (Member, Author) commented Aug 22, 2024

Re-opened as #108. I don't understand what GitHub did here and it won't let me re-open.
