fix: discarding results do not attempt in-memory metacache writer #11163

Merged
harshavardhana merged 1 commit into minio:master from discard-results on Dec 24, 2020

harshavardhana

Description

fix: discarding results do not attempt in-memory metacache writer

Motivation and Context

Optimizations include

  • Do not write the metacache block if its size is 0 and it is
    the first block, which happens when listing is attempted on
    a transient prefix. This avoids creating lots of empty
    metacache entries in minioMetaBucket.

  • Avoid the entire initialization sequence of cacheCh and
    metacacheBlockWriter when discardResults is set to true,
    since both would simply be skipped anyway.

  • Do not hold write locks while writing metacache blocks:
    each block is unique per bucket and per prefix, and is
    written by a single node.
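
The first two optimizations above can be sketched roughly as follows. This is a minimal, hypothetical model and not the code changed in this PR: `listOptions`, `blockStore`, `saveBlock`, and `listPath` are illustrative names, and only `discardResults` and `cacheCh` are taken from the description.

```go
package main

import "fmt"

// listOptions is a hypothetical stand-in for the listing options;
// only the discardResults flag from the description is modeled.
type listOptions struct {
	discardResults bool
}

// blockStore is a hypothetical sink for metacache blocks, keyed by block number.
type blockStore struct {
	blocks map[int][]string
	inits  int // how many times the writer pipeline was set up
}

// saveBlock persists one block, except when it is an empty first block.
func (s *blockStore) saveBlock(n int, entries []string) bool {
	if n == 0 && len(entries) == 0 {
		return false // transient prefix: do not leave an empty cache entry behind
	}
	s.blocks[n] = entries
	return true
}

// listPath returns the number of listed entries and, unless results are
// discarded, caches them in blocks of blockSize entries.
func listPath(entries []string, o listOptions, s *blockStore, blockSize int) int {
	if o.discardResults {
		// Nothing will be cached, so skip channel and writer setup entirely.
		return len(entries)
	}
	s.inits++
	cacheCh := make(chan string, blockSize)
	done := make(chan struct{})
	go func() {
		defer close(done)
		var block []string
		n := 0
		for e := range cacheCh {
			block = append(block, e)
			if len(block) == blockSize {
				s.saveBlock(n, block)
				n++
				block = nil
			}
		}
		// Flush a trailing partial block; an empty first block is
		// handed to saveBlock, which refuses to write it.
		if len(block) > 0 || n == 0 {
			s.saveBlock(n, block)
		}
	}()
	for _, e := range entries {
		cacheCh <- e
	}
	close(cacheCh)
	<-done
	return len(entries)
}

func main() {
	s := &blockStore{blocks: map[int][]string{}}
	listPath(nil, listOptions{}, s, 2)                                         // transient prefix
	listPath([]string{"a", "b", "c"}, listOptions{discardResults: true}, s, 2) // discarded
	fmt.Println(len(s.blocks), s.inits)                                        // prints: 0 1
}
```

In this model a single goroutine owns every block write for a listing, which is the same reasoning the third bullet relies on: with one writer per uniquely named block, a write lock has nothing to protect.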

How to test this PR?

Needs a large setup with lots of files in different prefixes. Tested this
on a Packet setup, with major gains obtained for per-prefix listing.

This PR addresses some performance problems observed with
Spark workloads under very high concurrency.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • Fixes a regression (If yes, please add commit-id or PR # here)
  • Documentation needed
  • Unit tests needed

@harshavardhana

This PR addresses some performance problems observed with
Spark workloads under very high concurrency.

Of course, all of this was tested under LIST_QUORUM=strict consistency

@kannappanr kannappanr left a comment

LGTM

@vadmeste vadmeste left a comment

LGTM & tested

@minio-trusted

Mint Automation

Test Result
mint-large-bucket.sh ✔️
mint-fs.sh ✔️
mint-gateway-s3.sh ✔️
mint-dist-erasure.sh ✔️
mint-zoned.sh ✔️
mint-gateway-nas.sh ✔️
mint-gateway-azure.sh more...
mint-erasure.sh more...

11163-ef79de5/mint-gateway-azure.sh.log:

Running with
SERVER_ENDPOINT:      minio-c2.minio.io:30689
ACCESS_KEY:           minioazure
SECRET_KEY:           ***REDACTED***
ENABLE_HTTPS:         0
SERVER_REGION:        us-east-1
MINT_DATA_DIR:        /mint/data
MINT_MODE:            full
ENABLE_VIRTUAL_STYLE: 0

To get logs, run 'docker cp a7ac3596699c:/mint/log /tmp/mint-logs'

(1/15) Running aws-sdk-go tests ... done in 9 seconds
(2/15) Running aws-sdk-java tests ... done in 2 seconds
(3/15) Running aws-sdk-php tests ... done in 1 minutes and 9 seconds
(4/15) Running aws-sdk-ruby tests ... done in 19 seconds
(5/15) Running awscli tests ... done in 2 minutes and 55 seconds
(6/15) Running healthcheck tests ... done in 0 seconds
(7/15) Running mc tests ... done in 4 minutes and 10 seconds
(8/15) Running minio-dotnet tests ... done in 1 minutes and 48 seconds
(9/15) Running minio-go tests ... done in 6 minutes and 30 seconds
(10/15) Running minio-java tests ... FAILED in 9 minutes and 11 seconds
{
  "name": "minio-java",
  "function": "putObject()",
  "args": "[user metadata]",
  "duration": 173,
  "status": "FAIL",
  "error": "error occurred\nErrorResponse(code = AuthenticationFailed, message = -> github.com/Azure/azure-storage-blob-go/azblob.newStorageError, github.com/Azure/azure-storage-blob-go@v0.10.0/azblob/zc_storage_error.go:42\n===== RESPONSE ERROR (ServiceCode=AuthenticationFailed) =====\nDescription=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.\nRequestId:5886a6c6-c01e-0047-5c41-da3285000000\nTime:2020-12-24T22:08:23.0103138Z, Details: \n   AuthenticationErrorDetail: The MAC signature found in the HTTP request 'mBeSn/Opvvga09ZvEBwtBdKYLFbVfx2xBiEnvsgVPxk=' is not the same as any computed signature. Server used following string to sign: 'PUT\n\n\n128\n\napplication/xml\n\n\n\n\n\n\nx-ms-blob-cache-control:\nx-ms-blob-content-disposition:\nx-ms-blob-content-encoding:\nx-ms-blob-content-language:\nx-ms-blob-content-type:application/octet-stream\nx-ms-client-request-id:b877fd74-f8e2-45f6-42c1-91dbcf1b355c\nx-ms-date:Thu, 24 Dec 2020 22:08:22 GMT\nx-ms-meta-my_header1:a   b   c\nx-ms-meta-my_header2:\"a   b   c\"\nx-ms-meta-my_project:Project One\nx-ms-meta-my_unicode_tag:商å“�\nx-ms-version:2019-02-02\n/minioazure/minio-java-test-cti9ak/minio-java-test-2tjieaf\ncomp:blocklist\ntimeout:1501'.\n   Code: AuthenticationFailed\n   PUT https://minioazure.blob.core.windows.net/minio-java-test-cti9ak/minio-java-test-2tjieaf?comp=blocklist&timeout=1501\n   Authorization: REDACTED\n   Content-Length: [128]\n   Content-Type: [application/xml]\n   User-Agent: [APN/1.0 MinIO/1.0 MinIO/2020-12-24T21:35:51Z]\n   X-Ms-Blob-Cache-Control: []\n   X-Ms-Blob-Content-Disposition: []\n   X-Ms-Blob-Content-Encoding: []\n   X-Ms-Blob-Content-Language: []\n   X-Ms-Blob-Content-Type: [application/octet-stream]\n   X-Ms-Client-Request-Id: [b877fd74-f8e2-45f6-42c1-91dbcf1b355c]\n   X-Ms-Date: [Thu, 24 Dec 2020 22:08:22 GMT]\n   X-Ms-Meta-My_header1: [a   b   c]\n   X-Ms-Meta-My_header2: [\"a   b   c\"]\n   
X-Ms-Meta-My_project: [Project One]\n   X-Ms-Meta-My_unicode_tag: [商品]\n   X-Ms-Version: [2019-02-02]\n   --------------------------------------------------------------------------------\n   RESPONSE Status: 403 Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.\n   Content-Length: [1091]\n   Content-Type: [application/xml]\n   Date: [Thu, 24 Dec 2020 22:08:22 GMT]\n   Server: [Microsoft-HTTPAPI/2.0]\n   X-Ms-Error-Code: [AuthenticationFailed]\n   X-Ms-Request-Id: [5886a6c6-c01e-0047-5c41-da3285000000]\n\n\n, bucketName = minio-java-test-cti9ak, objectName = minio-java-test-2tjieaf, resource = /minio-java-test-cti9ak/minio-java-test-2tjieaf, requestId = 1653C67634909073, hostId = afa6210a-d75f-476f-ba70-b01b24ded00d)\nrequest={method=PUT, url=http://minio-c2.minio.io:30689/minio-java-test-cti9ak/minio-java-test-2tjieaf, headers=x-amz-meta-My-Unicode-Tag: 商品\nx-amz-meta-My-Project: Project One\nx-amz-meta-My-header1: a   b   c\nx-amz-meta-My-Header2: \"a   b   c\"\nContent-Type: application/octet-stream\nHost: minio-c2.minio.io:30689\nAccept-Encoding: identity\nUser-Agent: MinIO (Linux; amd64) minio-java/8.0.3\nContent-MD5: A9oFTxee7YVcJ9fWsgQeKg==\nx-amz-content-sha256: 1ff7959f86334ddc5c188a5083268f600146328b2b6c5185e75bf7d9387d6b74\nx-amz-date: 20201224T220822Z\nAuthorization: AWS4-HMAC-SHA256 Credential=*REDACTED*/20201224/us-east-1/s3/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date;x-amz-meta-my-header1;x-amz-meta-my-header2;x-amz-meta-my-project;x-amz-meta-my-unicode-tag, Signature=*REDACTED*\n}\nresponse={code=403, headers=Accept-Ranges: bytes\nContent-Length: 3082\nContent-Security-Policy: block-all-mixed-content\nContent-Type: application/xml\nServer: MinIO\nVary: Origin\nX-Amz-Request-Id: 1653C67634909073\nX-Xss-Protection: 1; mode=block\nDate: Thu, 24 Dec 2020 22:08:23 GMT\n}\n >>> [io.minio.MinioClient.execute(MinioClient.java:775), 
io.minio.MinioClient.putObject(MinioClient.java:4547), io.minio.MinioClient.putObject(MinioClient.java:2713), io.minio.MinioClient.putObject(MinioClient.java:2830), FunctionalTest.testPutObject(FunctionalTest.java:763), FunctionalTest.putObject(FunctionalTest.java:890), FunctionalTest.runObjectTests(FunctionalTest.java:3751), FunctionalTest.runTests(FunctionalTest.java:3783), FunctionalTest.main(FunctionalTest.java:3927)]"
}
(10/15) Running minio-js tests ... FAILED in 46 seconds
{
  "name": "minio-js",
  "function": "\"after all\" hook in \"functional tests\"",
  "duration": 83,
  "status": "FAIL",
  "error": "S3Error: The bucket you tried to delete is not empty at Object.parseError (node_modules/minio/dist/main/xml-parsers.js:79:11) at /mint/run/core/minio-js/node_modules/minio/dist/main/transformers.js:156:22 at DestroyableTransform._flush (node_modules/minio/dist/main/transformers.js:80:10) at DestroyableTransform.prefinish (node_modules/readable-stream/lib/_stream_transform.js:129:10) at prefinish (node_modules/readable-stream/lib/_stream_writable.js:611:14) at finishMaybe (node_modules/readable-stream/lib/_stream_writable.js:620:5) at endWritable (node_modules/readable-stream/lib/_stream_writable.js:643:3) at DestroyableTransform.Writable.end (node_modules/readable-stream/lib/_stream_writable.js:571:22) at IncomingMessage.onend (internal/streams/readable.js:684:10) at endReadableNT (internal/streams/readable.js:1327:12) at processTicksAndRejections (internal/process/task_queues.js:80:21)"
}
(10/15) Running minio-py tests ... done in 18 minutes and 50 seconds
(11/15) Running s3cmd tests ... done in 2 minutes and 19 seconds
(12/15) Running s3select tests ... FAILED in 25 seconds
{
  "name": "s3select:test_csv_output_quote_char",
  "function": "select_object_content(bucket_name, object_name, request)",
  "args": {
    "bucket_name": "s3select-test-c35bc1d3-a52b-4c82-9116-632d79f76065"
  },
  "duration": 8305,
  "message": "Test test_9 unexpectedly failed with: InternalError: file not found",
  "error": "Traceback (most recent call last):\n  File \"/mint/run/core/s3select/csv.py\", line 42, in test_sql_api\n    for d in data.stream(10*1024):\n  File \"/usr/local/lib/python3.6/dist-packages/minio/select.py\", line 444, in stream\n    if self._read() <= 0:\n  File \"/usr/local/lib/python3.6/dist-packages/minio/select.py\", line 406, in _read\n    headers.get(\":error-code\"), headers.get(\":error-message\"),\nminio.error.MinioException: InternalError: file not found\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"./tests.py\", line 57, in main\n    test_csv_output_custom_quote_char(client, log_output)\n  File \"/mint/run/core/s3select/csv.py\", line 175, in test_csv_output_custom_quote_char\n    input_data, sql_opts, expected_output)\n  File \"/mint/run/core/s3select/csv.py\", line 48, in test_sql_api\n    'Test {} unexpectedly failed with: {}'.format(test_name, select_err))\nValueError: Test test_9 unexpectedly failed with: InternalError: file not found\n",
  "status": "FAIL"
}
(12/15) Running security tests ... done in 0 seconds

Executed 12 out of 15 tests successfully.

11163-ef79de5/mint-erasure.sh.log:

Running with
SERVER_ENDPOINT:      minio-c2.minio.io:31985
ACCESS_KEY:           minio
SECRET_KEY:           ***REDACTED***
ENABLE_HTTPS:         0
SERVER_REGION:        us-east-1
MINT_DATA_DIR:        /mint/data
MINT_MODE:            full
ENABLE_VIRTUAL_STYLE: 0

To get logs, run 'docker cp f00a6b21aaf3:/mint/log /tmp/mint-logs'

(1/15) Running aws-sdk-go tests ... done in 1 seconds
(2/15) Running aws-sdk-java tests ... done in 1 seconds
(3/15) Running aws-sdk-php tests ... done in 42 seconds
(4/15) Running aws-sdk-ruby tests ... done in 3 seconds
(5/15) Running awscli tests ... done in 2 minutes and 0 seconds
(6/15) Running healthcheck tests ... done in 1 seconds
(7/15) Running mc tests ... done in 50 seconds
(8/15) Running minio-dotnet tests ... done in 35 seconds
(9/15) Running minio-go tests ... FAILED in 1 minutes and 29 seconds
{
  "args": {
    "bucketName": "minio-go-test-euzd5hsmamu9rg4v",
    "objectName": "x4uxhvp4u6qqe0rq3qz3tfsrwrtu6m"
  },
  "duration": 6792,
  "error": "read tcp 172.17.0.2:59824->72.28.97.58:31985: read: connection reset by peer",
  "function": "GetObject(bucketName, objectName)",
  "message": "CopyN failed",
  "name": "minio-go: testSSES3EncryptedGetObjectReadSeekFunctional",
  "status": "FAIL"
}
(9/15) Running minio-java tests ... done in 1 minutes and 5 seconds
(10/15) Running minio-js tests ... done in 48 seconds
(11/15) Running minio-py tests ... done in 2 minutes and 32 seconds
(12/15) Running s3cmd tests ... done in 19 seconds
(13/15) Running s3select tests ... done in 5 seconds
(14/15) Running security tests ... done in 0 seconds

Executed 14 out of 15 tests successfully.

Deleting image on docker hub
Deleting image locally

@harshavardhana harshavardhana merged commit 027e174 into minio:master Dec 24, 2020
@harshavardhana harshavardhana deleted the discard-results branch December 24, 2020 23:02