fix: discarding results do not attempt in-memory metacache writer #11163
Conversation
Of course, all of this was tested.
Force-pushed from 4326294 to 448a138
LGTM
LGTM & tested
Optimizations include:
- do not write the metacache block if the size of the block is '0' and it is the first block, where listing is attempted for a transient prefix; this helps avoid creating lots of empty metacache entries for `minioMetaBucket`
- avoid the entire initialization sequence of `cacheCh`, `metacacheBlockWriter` if we are simply going to skip them when `discardResults` is set to true
- no need to hold write locks while writing metacache blocks: each block is unique per bucket and per prefix, and is written by a single node
Force-pushed from 448a138 to ef79de5
Description
fix: discarding results do not attempt in-memory metacache writer
Motivation and Context
Optimizations include:
- do not write the metacache block if the size of the block is '0' and it is the first block, where listing is attempted for a transient prefix; this helps avoid creating lots of empty metacache entries for `minioMetaBucket`
- avoid the entire initialization sequence of `cacheCh`, `metacacheBlockWriter` if we are simply going to skip them when `discardResults` is set to true
- no need to hold write locks while writing metacache blocks: each block is unique per bucket and per prefix, and is written by a single node
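The optimizations above can be illustrated with a short Go sketch. The type and function names below (`metacacheBlock`, `writeBlock`, `listPath`) are hypothetical stand-ins, not MinIO's actual internals; the sketch only shows the control flow the bullet points describe: skip persisting an empty first block, and never construct the cache channel or block writer when results are being discarded.

```go
package main

import "fmt"

// metacacheBlock is a simplified stand-in for a serialized listing block
// (hypothetical shape; the real type lives in MinIO's cmd package).
type metacacheBlock struct {
	data []byte
	n    int // block index; 0 means first block
}

// writeBlock sketches the first optimization: a zero-sized first block
// (listing attempted against a transient prefix) is discarded rather than
// persisted, so no empty metacache entry is created for minioMetaBucket.
func writeBlock(b metacacheBlock) (written bool) {
	if len(b.data) == 0 && b.n == 0 {
		return false // nothing listed yet: skip the write entirely
	}
	// ... persist the block here; no write lock is needed because each
	// block is unique per bucket and prefix and written by a single node.
	return true
}

// listPath sketches the second optimization: when discardResults is set,
// the cache channel and block writer are never initialized at all.
func listPath(discardResults bool) string {
	if discardResults {
		return "listed without metacache writer"
	}
	// Normal path: set up cacheCh / metacacheBlockWriter, then list.
	return "listed with metacache writer"
}

func main() {
	fmt.Println(writeBlock(metacacheBlock{data: nil, n: 0}))          // false: empty first block skipped
	fmt.Println(writeBlock(metacacheBlock{data: []byte("x"), n: 0})) // true: non-empty block written
	fmt.Println(listPath(true))
}
```

Skipping the writer setup entirely (rather than setting it up and ignoring it) avoids allocating a channel and goroutine per discarded listing, which is where the gains under high per-prefix listing concurrency come from.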
How to test this PR?
Needs a large setup with lots of files under different prefixes; tested on a Packet setup, with major gains observed for per-prefix listing.
This PR addresses performance problems observed with Spark workloads under very high concurrency.