fix: reduce crawler memory usage by orders of magnitude #11556

Merged
merged 1 commit into minio:master from the reduce-memory branch
Feb 17, 2021

Conversation

harshavardhana
Member

@harshavardhana harshavardhana commented Feb 17, 2021

Description

fix: reduce crawler memory usage by orders of magnitude

Motivation and Context

Currently, the crawler waits for an entire readdir call to
return before it processes usage, lifecycle, replication,
and healing. Instead, we should pass the applicator all
the way down, so that no special stack has to be built for
the contents of a single directory.

This allows for

  • no need to remember the entire list of entries per directory
    before applying the required functions
  • no need to wait for the entire readdir() call to finish before
    applying the required functions

How to test this PR?

All functionality should remain the same

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Optimization (provides speedup with no functional changes)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • Fixes a regression (If yes, please add commit-id or PR # here)
  • Documentation updated
  • Unit tests added/updated

Contributor

@klauspost klauspost left a comment

LGTM

Member

@vadmeste vadmeste left a comment

Some comments

cmd/os-readdir_unix.go (outdated, resolved)

Contributor

@poornas poornas left a comment

LGTM

Member

@vadmeste vadmeste left a comment

LGTM

@minio-trusted
Contributor

Mint Automation

Test Result
mint-large-bucket.sh ✔️
mint-fs.sh ✔️
mint-gateway-s3.sh ✔️
mint-erasure.sh ✔️
mint-dist-erasure.sh ✔️
mint-zoned.sh ✔️
mint-gateway-nas.sh ✔️
mint-compress-encrypt-dist-erasure.sh (failed, log below)

11556-6181864/mint-compress-encrypt-dist-erasure.sh.log:

Running with
SERVER_ENDPOINT:      minio-dev8.minio.io:30761
ACCESS_KEY:           minio
SECRET_KEY:           ***REDACTED***
ENABLE_HTTPS:         0
SERVER_REGION:        us-east-1
MINT_DATA_DIR:        /mint/data
MINT_MODE:            full
ENABLE_VIRTUAL_STYLE: 0

To get logs, run 'docker cp a12dd028c33b:/mint/log /tmp/mint-logs'

(1/15) Running aws-sdk-go tests ... done in 2 seconds
(2/15) Running aws-sdk-java tests ... done in 1 seconds
(3/15) Running aws-sdk-php tests ... done in 43 seconds
(4/15) Running aws-sdk-ruby tests ... done in 4 seconds
(5/15) Running awscli tests ... FAILED in 33 seconds
{
  "name": "awscli",
  "duration": 2770,
  "function": "aws --endpoint-url http://minio-dev8.minio.io:30761 s3api copy-object --bucket awscli-mint-test-bucket-12058 --key datafile-1-kB-copy --copy-source awscli-mint-test-bucket-12058/datafile-1-kB\n",
  "status": "FAIL",
  "error": "Hash mismatch expected 084e1383b70fb0c51acc680fef370023, got ac57de7156d7fc25ac1a65f81fa3989b"
}
(5/15) Running healthcheck tests ... done in 0 seconds
(6/15) Running mc tests ... done in 51 seconds
(7/15) Running minio-dotnet tests ... done in 43 seconds
(8/15) Running minio-go tests ... FAILED in 2 minutes and 36 seconds
{
  "args": {
    "destination": {
      "Bucket": "minio-go-test-ltgyha5lttev90eu",
      "Object": "dstObject",
      "Encryption": {},
      "UserMetadata": null,
      "ReplaceMetadata": false,
      "UserTags": null,
      "ReplaceTags": false,
      "LegalHold": "",
      "Mode": "",
      "RetainUntilDate": "0001-01-01T00:00:00Z",
      "Size": 0,
      "Progress": null
    },
    "source": {
      "Bucket": "minio-go-test-ltgyha5lttev90eu",
      "Object": "srcObject",
      "VersionID": "",
      "MatchETag": "",
      "NoMatchETag": "",
      "MatchModifiedSince": "0001-01-01T00:00:00Z",
      "MatchUnmodifiedSince": "0001-01-01T00:00:00Z",
      "MatchRange": false,
      "Start": 0,
      "End": 0,
      "Encryption": null
    }
  },
  "duration": 4858,
  "error": "We encountered an internal error, please try again.: cause(s2: corrupt input)",
  "function": "CopyObject(destination, source)",
  "message": "GetObject failed",
  "name": "minio-go: testUnencryptedToSSES3CopyObject",
  "status": "FAIL"
}
(8/15) Running minio-java tests ... FAILED in 1 minutes and 41 seconds
{
  "name": "minio-java",
  "function": "copyObject()",
  "args": "[match etag]",
  "duration": 855,
  "status": "FAIL",
  "error": "error occurred\nErrorResponse(code = PreconditionFailed, message = At least one of the pre-conditions you specified did not hold, bucketName = minio-java-test-2tj40pu, objectName = minio-java-test-llkp1g-copy, resource = /minio-java-test-2tj40pu/minio-java-test-llkp1g-copy, requestId = 1664A5DF0D9ECD14, hostId = 661fb1a5-8657-4955-8fcc-d1066fa99c93)\nrequest={method=PUT, url=http://minio-dev8.minio.io:30761/minio-java-test-2tj40pu/minio-java-test-llkp1g-copy, headers=x-amz-copy-source-if-match: 71cff0a060f852067e443ad1e24ae26c-1\nx-amz-copy-source: /minio-java-test-1n4nbtb/minio-java-test-llkp1g\nHost: minio-dev8.minio.io:30761\nAccept-Encoding: identity\nUser-Agent: MinIO (Linux; amd64) minio-java/8.0.3\nContent-MD5: 1B2M2Y8AsgTpgAmY7PhCfg==\nx-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855\nx-amz-date: 20210217T212223Z\nAuthorization: AWS4-HMAC-SHA256 Credential=*REDACTED*/20210217/us-east-1/s3/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-copy-source;x-amz-copy-source-if-match;x-amz-date, Signature=*REDACTED*\n}\nresponse={code=412, headers=Accept-Ranges: bytes\nContent-Length: 416\nContent-Security-Policy: block-all-mixed-content\nContent-Type: application/xml\nETag: \"71cff0a060f852067e443ad1e24ae26c\"\nLast-Modified: Wed, 17 Feb 2021 21:22:23 GMT\nServer: MinIO\nVary: Origin\nX-Amz-Request-Id: 1664A5DF0D9ECD14\nX-Xss-Protection: 1; mode=block\nDate: Wed, 17 Feb 2021 21:22:23 GMT\n}\n >>> [io.minio.MinioClient.execute(MinioClient.java:775), io.minio.MinioClient.execute(MinioClient.java:563), io.minio.MinioClient.executePut(MinioClient.java:904), io.minio.MinioClient.copyObject(MinioClient.java:1232), FunctionalTest.testCopyObjectMatchETag(FunctionalTest.java:1850), FunctionalTest.copyObject(FunctionalTest.java:2016), FunctionalTest.runObjectTests(FunctionalTest.java:3757), FunctionalTest.runTests(FunctionalTest.java:3783), FunctionalTest.main(FunctionalTest.java:3927)]"
}
(8/15) Running minio-js tests ... done in 1 minutes and 1 seconds
(9/15) Running minio-py tests ... done in 3 minutes and 7 seconds
(10/15) Running s3cmd tests ... FAILED in 5 seconds
{
  "name": "s3cmd",
  "duration": "3146",
  "function": "test_put_object_multipart",
  "status": "FAIL",
  "error": "WARNING: MD5 Sums don't match!\nWARNING: Retrying upload of /mint/data/datafile-65-MB\nWARNING: MD5 Sums don't match!\nWARNING: Retrying upload of /mint/data/datafile-65-MB\nWARNING: MD5 Sums don't match!\nWARNING: Retrying upload of /mint/data/datafile-65-MB\nWARNING: MD5 Sums don't match!\nWARNING: Retrying upload of /mint/data/datafile-65-MB\nWARNING: MD5 Sums don't match!\nWARNING: Retrying upload of /mint/data/datafile-65-MB\nWARNING: MD5 Sums don't match!\nWARNING: Too many failures. Giving up on '/mint/data/datafile-65-MB'\nERROR: \nUpload of '/mint/data/datafile-65-MB' part 1 failed. Use\n  /usr/local/bin/s3cmd abortmp s3://s3cmd-test-bucket-26774/s3cmd-test-object-29206 071ed344-5008-4ce2-a5ff-6361b39be4a9\nto abort the upload, or\n  /usr/local/bin/s3cmd --upload-id 071ed344-5008-4ce2-a5ff-6361b39be4a9 put ...\nto continue the upload.\nERROR: Upload of '/mint/data/datafile-65-MB' failed too many times (Last reason: )"
}
(10/15) Running s3select tests ... done in 9 seconds
(11/15) Running security tests ... done in 0 seconds

Executed 11 out of 15 tests successfully.

Deleting image on docker hub
Deleting image locally

@harshavardhana harshavardhana merged commit 289e1d8 into minio:master Feb 17, 2021
@harshavardhana harshavardhana deleted the reduce-memory branch February 17, 2021 23:34