fix: add timeouts to avoid goroutine leaks in net/http #14995

Merged
merged 1 commit into from May 30, 2022


harshavardhana
Member

Description

fix: add timeouts to avoid goroutine leaks in net/http

Motivation and Context

The following code reproduces an unbounded goroutine buildup:
connections stay established because the client never closes
them.

https://gist.github.com/harshavardhana/2d00e6f909054d2d2524c71485ad02e1

Without this PR, any MinIO deployment is exposed to
denial-of-service attacks that can make the entire service
unavailable.

This change introduces two timeouts to control such
goroutine buildups:

  • IdleTimeout (to kill off idle connections)
  • ReadHeaderTimeout (to kill off connections that are too slow)

This change also adds two hidden options so that these values
can be adjusted in setups that need it.

How to test this PR?

Run the shared gist against the latest master and observe
the connection buildup locally; this can be reproduced with
a single MinIO instance running.

The buildup eventually causes all incoming API calls to slow
down or stop responding, at which point the MinIO service has
to be restarted.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Optimization (provides speedup with no functional changes)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • Fixes a regression (If yes, please add commit-id or PR # here)
  • Documentation updated
  • Unit tests added/updated

Contributor

@klauspost klauspost left a comment


I don't see it being applied. PTAL.

The limits look good. It is virtually impossible to protect against this type of attack at this level. It just takes a trivial bump in attack rate to get around this.

But whatever we do, there are still Slowloris attacks which can eat up connections. Virtually impossible to guard against, though it will require authentication.

internal/http/server.go (resolved)
internal/http/server.go (resolved)
@harshavardhana
Member Author

harshavardhana commented May 29, 2022

> I don't see it being applied. PTAL.
>
> The limits look good. It is virtually impossible to protect against this type of attack at this level. It just takes a trivial bump in attack rate to get around this.
>
> But whatever we do, there are still Slowloris attacks which can eat up connections. Virtually impossible to guard against, though it will require authentication.

It is applied: the total connection buildup becomes lower automatically. Connections grow up to some value and then taper off.

However, yes, in reality there is no way to fully prevent this; it can only be limited to some extent. That's why it's configurable as a hidden value.

@harshavardhana
Member Author

@klauspost correct, I meant to keep these as reasonable protection. Of course you can basically make enough fast connections before the server considers them idle; timeouts are a bloody game :-)

@minio-trusted
Contributor

Mint Automation

Test Result
mint-fs.sh ✔️
mint-gateway-s3.sh ✔️
mint-erasure.sh ✔️
mint-dist-erasure.sh ✔️
mint-compress-encrypt-dist-erasure.sh ✔️
mint-pools.sh ✔️
mint-large-bucket.sh more...

14995-f5ea4dc/mint-large-bucket.sh.log:

Running with
SERVER_ENDPOINT:      minio-k8s4:32313
ACCESS_KEY:           minio
SECRET_KEY:           ***REDACTED***
ENABLE_HTTPS:         0
SERVER_REGION:        us-east-1
MINT_DATA_DIR:        /mint/data
MINT_MODE:            full
ENABLE_VIRTUAL_STYLE: 0
RUN_ON_FAIL:          0

To get logs, run 'docker cp :/mint/log /tmp/mint-logs'

(1/15) Running aws-sdk-go tests ... done in 1 seconds
(2/15) Running aws-sdk-java tests ... done in 1 seconds
(3/15) Running aws-sdk-php tests ... done in 43 seconds
(4/15) Running aws-sdk-ruby tests ... done in 7 seconds
(5/15) Running awscli tests ... FAILED in 1 minutes and 18 seconds
{
  "name": "awscli",
  "duration": 32156,
  "function": "aws --endpoint-url http://minio-k8s4:32313 s3 sync --no-progress /mint/data s3://awscli-mint-test-bucket-6035/\n",
  "status": "FAIL",
  "error": "upload: ../../../data/datafile-0-b to s3://awscli-mint-test-bucket-6035/datafile-0-bupload: ../../../data/datafile-1-kB to s3://awscli-mint-test-bucket-6035/datafile-1-kBupload: ../../../data/datafile-10-kB to s3://awscli-mint-test-bucket-6035/datafile-10-kBupload: ../../../data/datafile-100-kB to s3://awscli-mint-test-bucket-6035/datafile-100-kBupload: ../../../data/datafile-1-b to s3://awscli-mint-test-bucket-6035/datafile-1-bupload: ../../../data/datafile-33-kB to s3://awscli-mint-test-bucket-6035/datafile-33-kBupload: ../../../data/datafile-1.03-MB to s3://awscli-mint-test-bucket-6035/datafile-1.03-MBupload: ../../../data/datafile-1-MB to s3://awscli-mint-test-bucket-6035/datafile-1-MBupload: ../../../data/datafile-5243880-b to s3://awscli-mint-test-bucket-6035/datafile-5243880-bupload: ../../../data/datafile-6-MB to s3://awscli-mint-test-bucket-6035/datafile-6-MBupload: ../../../data/datafile-10-MB to s3://awscli-mint-test-bucket-6035/datafile-10-MBupload: ../../../data/datafile-11-MB to s3://awscli-mint-test-bucket-6035/datafile-11-MBupload: ../../../data/datafile-5-MB to s3://awscli-mint-test-bucket-6035/datafile-5-MBupload: ../../../data/datafile-65-MB to s3://awscli-mint-test-bucket-6035/datafile-65-MBupload failed: ../../../data/datafile-129-MB to s3://awscli-mint-test-bucket-6035/datafile-129-MB 'ETag'"
}
(5/15) Running healthcheck tests ... done in 0 seconds
(6/15) Running mc tests ... done in 29 seconds
(7/15) Running minio-dotnet tests ... done in 6 minutes and 14 seconds
(8/15) Running minio-go tests ... done in 1 minutes and 14 seconds
(9/15) Running minio-java tests ... done in 33 seconds
(10/15) Running minio-js tests ... done in 1 minutes and 6 seconds
(11/15) Running minio-py tests ... done in 1 minutes and 22 seconds
(12/15) Running s3cmd tests ... done in 26 seconds
(13/15) Running s3select tests ... done in 4 seconds
(14/15) Running versioning tests ... done in 3 minutes and 5 seconds

Executed 14 out of 15 tests successfully.

Deleting image on docker hub
Deleting image locally

Member

@vadmeste vadmeste left a comment


LGTM
