fix: headobject will not save meta into cache when not found #19929

Contributor

@jiuker jiuker commented Jun 13, 2024

Community Contribution License

All community contributions in this pull request are licensed to the project maintainers
under the terms of the Apache 2 license.
By creating this pull request I represent that I have the right to license the
contributions to the project maintainers under the Apache 2 license.

Description

Fix: HeadObject no longer saves metadata into the cache when the object is not found in the cache.

mc cp 4mobject minio9000/mytest/object
mc stat minio9000/mytest/object           # HEAD saves the 4 MB object's metadata into the cache
mc cp 3mobject minio9000/mytest/object    # overwrite with a 3 MB object; the cached metadata is not updated
mc cp minio9000/mytest/object object      # read uses the 4 MB metadata, but it should be 3 MB
mc: <ERROR> Failed to copy `http://:9000/mytest/object`. Input reader closed pre-maturely. Expected `aaaaaaa` bytes, but only received `bbbbbb` bytes.
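
The sequence above can be replayed against a toy model. This is only a sketch: `ToyServer`, its fields, and `replay` are hypothetical names, and the model assumes (based on the discussion in this PR, not on the MinIO or mincache source) that HEAD stores metadata on a cache miss and that a later PUT does not refresh it. Byte counts of 4 and 3 stand in for the 4 MB and 3 MB objects.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical model: a backend store plus a metadata-only cache."""

    head_populates_cache: bool
    store: dict = field(default_factory=dict)       # key -> object bytes
    meta_cache: dict = field(default_factory=dict)  # key -> cached size

    def put(self, key: str, data: bytes) -> None:
        # Assumption of this model: PUT does not refresh cached metadata.
        self.store[key] = data

    def head(self, key: str) -> int:
        if key in self.meta_cache:
            return self.meta_cache[key]
        size = len(self.store[key])
        if self.head_populates_cache:
            # The behavior this PR removes: HEAD saving metadata on a miss.
            self.meta_cache[key] = size
        return size

    def get(self, key: str) -> tuple[int, bytes]:
        # Advertised length comes from the cache if present, body from the store.
        data = self.store[key]
        advertised = self.meta_cache.get(key, len(data))
        return advertised, data


def replay(head_populates_cache: bool) -> tuple[int, int]:
    """Return (advertised length, actual body length) after the four steps."""
    srv = ToyServer(head_populates_cache)
    srv.put("mytest/object", b"x" * 4)   # mc cp 4mobject ...
    srv.head("mytest/object")            # mc stat ...
    srv.put("mytest/object", b"x" * 3)   # mc cp 3mobject ...
    advertised, body = srv.get("mytest/object")
    return advertised, len(body)
```

In this model, `replay(True)` yields an advertised length that differs from the body length, which is the kind of mismatch the mc error reports, while `replay(False)` yields matching lengths.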

Motivation and Context

How to test this PR?

Types of changes

Checklist:

  • Fixes a regression (If yes, please add commit-id or PR # here)
  • Unit tests added/updated
  • Internal documentation updated
  • Create a documentation update request here

@harshavardhana
Member

The main motivation here is that caching metadata on its own is not useful, and the current code was a bit excessive in doing so.

The cache is populated only when both metadata and data are available, such as on GET and PUT.
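
A minimal sketch of that policy, under stated assumptions: `Entry` and `DataAwareCache` are hypothetical names, plain dictionaries stand in for the backend and the cache, and the MD5 digest is only a stand-in for an ETag. It is not how MinIO or mincache are implemented; it only illustrates "populate on GET and PUT, never on HEAD".

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    etag: str
    size: int
    data: bytes


class DataAwareCache:
    """Hypothetical cache that stores an entry only when data is in hand."""

    def __init__(self) -> None:
        self.backend: dict[str, Entry] = {}
        self.cache: dict[str, Entry] = {}

    def put(self, key: str, data: bytes) -> Entry:
        entry = Entry(hashlib.md5(data).hexdigest(), len(data), data)
        self.backend[key] = entry
        self.cache[key] = entry  # PUT has metadata and data: populate
        return entry

    def get(self, key: str) -> Entry:
        entry = self.cache.get(key)
        if entry is None:
            entry = self.backend[key]
            self.cache[key] = entry  # GET has metadata and data: populate
        return entry

    def head(self, key: str) -> tuple[str, int]:
        # HEAD may read the cache but never writes to it.
        entry = self.cache.get(key)
        if entry is None:
            entry = self.backend[key]
        return entry.etag, entry.size
```

Because every cached entry carries its own data, the size reported from the cache always describes the bytes that the same entry would serve.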

Contributor

@shtripat shtripat left a comment

lgtm

@ramondeklein
Contributor

I have recreated my cluster using KMS and caching enabled (also on the bucket), but I can't reproduce the issue anymore. It should be the same configuration as yesterday, so I'm not sure what has changed. Because I cannot reproduce the issue with an older Minio release, I cannot confirm that this release will fix it.

@jiuker
Contributor Author

jiuker commented Jun 13, 2024

> I have recreated my cluster using KMS and caching enabled (also on the bucket), but I can't reproduce the issue anymore. It should be the same configuration as yesterday, so I'm not sure what has changed. Because I cannot reproduce the issue with an older Minio release, I cannot confirm that this release will fix it.

I have posted my steps. You can follow them @ramondeklein

@klauspost
Contributor

klauspost commented Jun 13, 2024

@harshavardhana Shouldn't cache just be completely removed from HeadObject?

It seems counterproductive to fetch up to 1 MB of data (or whatever the limit is) just to serve the metadata headers.

@harshavardhana
Member

> @harshavardhana Shouldn't cache just be completely removed from HeadObject?

HEAD is cached to avoid reading xl.meta for metadata-only checks such as If-Match and If-Modified-Since calls. What was wrong, however, was HEAD updating the cache without data.

> It seems counterproductive to fetch up to 1 MB of data (or whatever the limit is) just to serve the metadata headers.

This needs a fix in mincache, not in MinIO.
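
To illustrate why cached metadata is enough for conditional checks, here is a simplified sketch of evaluating If-Match and If-Modified-Since against an ETag and a Last-Modified time. The function name `evaluate_conditions` is hypothetical, and the logic is a reduced reading of the HTTP precondition rules (no weak ETags, no If-None-Match or If-Unmodified-Since), not MinIO's actual handler.

```python
from __future__ import annotations

from datetime import datetime
from email.utils import parsedate_to_datetime


def evaluate_conditions(
    etag: str, last_modified: datetime, headers: dict[str, str]
) -> int:
    """Return the HTTP status implied by the request's preconditions.

    Only metadata (ETag and Last-Modified) is needed; no object data is read.
    """
    if_match = headers.get("If-Match")
    if if_match is not None:
        wanted = [tag.strip() for tag in if_match.split(",")]
        if "*" not in wanted and f'"{etag}"' not in wanted:
            return 412  # Precondition Failed

    since = headers.get("If-Modified-Since")
    if since is not None:
        if last_modified <= parsedate_to_datetime(since):
            return 304  # Not Modified

    return 200
```

For example, with the ETag `bf8b56ed8f68b800536335ad4c6fa7e8` from the logs in this thread, a request carrying `If-Match: "bf8b56ed8f68b800536335ad4c6fa7e8"` passes, while one carrying a different ETag is rejected with 412.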

@klauspost
Contributor

klauspost commented Jun 13, 2024

@harshavardhana So HEAD is saved in a separate cache from GET requests?

Just wondering if it returns data as well, or is everything split on the server.

@ramondeklein
Contributor

I have been able to reproduce the issue with this PR, so it doesn't seem to be fixed. When I restart the pods using this image it all seems to work fine:

mc: <DEBUG> GET /test/?location= HTTP/1.1
Host: tenant-rdk-hl.tenant-rdk:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.70 mc/DEVELOPMENT.GOGET
Accept-Encoding: zstd,gzip
Authorization: AWS4-HMAC-SHA256 Credential=l5m7mQiwUi2GCSH3/20240613/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: UNSIGNED-PAYLOAD
X-Amz-Date: 20240613T114107Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 128
Accept-Ranges: bytes
Content-Type: application/xml
Date: Thu, 13 Jun 2024 11:41:07 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: d5a265f7265c149d31b79b2568737f9e0d97aab546838b9090d7403265887733
X-Amz-Request-Id: 17D88D86CB930FC3
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> TLS Certificate found:
mc: <DEBUG>  >> Expires: 2025-06-12 09:16:34 +0000 UTC
mc: <DEBUG> Response Time: 5.295233ms

mc: <DEBUG> HEAD /test/audi/Audi%20cabriolet%2001.jpg HTTP/1.1
Host: tenant-rdk-hl.tenant-rdk:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.70 mc/DEVELOPMENT.GOGET
Authorization: AWS4-HMAC-SHA256 Credential=l5m7mQiwUi2GCSH3/20240613/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240613T114107Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 2971514
Accept-Ranges: bytes
Content-Type: image/jpeg
Date: Thu, 13 Jun 2024 11:41:07 GMT
Etag: "bf8b56ed8f68b800536335ad4c6fa7e8"
Last-Modified: Thu, 13 Jun 2024 11:37:10 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: d5a265f7265c149d31b79b2568737f9e0d97aab546838b9090d7403265887733
X-Amz-Request-Id: 17D88D86CBB15A37
X-Amz-Server-Side-Encryption: aws:kms
X-Amz-Server-Side-Encryption-Aws-Kms-Key-Id: arn:aws:kms:tenant-rdk-key
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> TLS Certificate found:
mc: <DEBUG>  >> Expires: 2025-06-12 09:16:34 +0000 UTC
mc: <DEBUG> Response Time: 3.414829ms

mc: <DEBUG> HEAD /test/audi/Audi%20cabriolet%2001.jpg HTTP/1.1
Host: tenant-rdk-hl.tenant-rdk:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.70 mc/DEVELOPMENT.GOGET
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=l5m7mQiwUi2GCSH3/20240613/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240613T114107Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 2971514
Accept-Ranges: bytes
Content-Type: image/jpeg
Date: Thu, 13 Jun 2024 11:41:07 GMT
Etag: "bf8b56ed8f68b800536335ad4c6fa7e8"
Last-Modified: Thu, 13 Jun 2024 11:37:10 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: d5a265f7265c149d31b79b2568737f9e0d97aab546838b9090d7403265887733
X-Amz-Request-Id: 17D88D86CBF60060
X-Amz-Server-Side-Encryption: aws:kms
X-Amz-Server-Side-Encryption-Aws-Kms-Key-Id: arn:aws:kms:tenant-rdk-key
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> TLS Certificate found:
mc: <DEBUG>  >> Expires: 2025-06-12 09:16:34 +0000 UTC
mc: <DEBUG> Response Time: 2.936505ms

mc: <DEBUG> GET /test/audi/Audi%20cabriolet%2001.jpg HTTP/1.1
Host: tenant-rdk-hl.tenant-rdk:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.70 mc/DEVELOPMENT.GOGET
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=l5m7mQiwUi2GCSH3/20240613/us-east-1/s3/aws4_request, SignedHeaders=host;if-match;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
If-Match: "bf8b56ed8f68b800536335ad4c6fa7e8"
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240613T114107Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 2971514
Accept-Ranges: bytes
Content-Type: image/jpeg
Date: Thu, 13 Jun 2024 11:41:07 GMT
Etag: "bf8b56ed8f68b800536335ad4c6fa7e8"
Last-Modified: Thu, 13 Jun 2024 11:37:10 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: d5a265f7265c149d31b79b2568737f9e0d97aab546838b9090d7403265887733
X-Amz-Request-Id: 17D88D86CC3F66D6
X-Amz-Server-Side-Encryption: aws:kms
X-Amz-Server-Side-Encryption-Aws-Kms-Key-Id: arn:aws:kms:tenant-rdk-key
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> TLS Certificate found:
mc: <DEBUG>  >> Expires: 2025-06-12 09:16:34 +0000 UTC
mc: <DEBUG> Response Time: 4.672123ms

When I copy the data to the test bucket again, the copy operation succeeds, but fetching the data fails with the following log output:

mc: <DEBUG> GET /test/?location= HTTP/1.1
Host: tenant-rdk-hl.tenant-rdk:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.70 mc/DEVELOPMENT.GOGET
Accept-Encoding: zstd,gzip
Authorization: AWS4-HMAC-SHA256 Credential=l5m7mQiwUi2GCSH3/20240613/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: UNSIGNED-PAYLOAD
X-Amz-Date: 20240613T114115Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 128
Accept-Ranges: bytes
Content-Type: application/xml
Date: Thu, 13 Jun 2024 11:41:15 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: d5a265f7265c149d31b79b2568737f9e0d97aab546838b9090d7403265887733
X-Amz-Request-Id: 17D88D888E65F5DB
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> TLS Certificate found:
mc: <DEBUG>  >> Expires: 2025-06-12 09:16:34 +0000 UTC
mc: <DEBUG> Response Time: 5.764119ms

mc: <DEBUG> HEAD /test/audi/Audi%20cabriolet%2001.jpg HTTP/1.1
Host: tenant-rdk-hl.tenant-rdk:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.70 mc/DEVELOPMENT.GOGET
Authorization: AWS4-HMAC-SHA256 Credential=l5m7mQiwUi2GCSH3/20240613/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240613T114115Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 2972986
Accept-Ranges: bytes
Content-Type: image/jpeg
Date: Thu, 13 Jun 2024 11:41:15 GMT
Etag: "cf286d3bb78192dc30574ae716b671a7"
Last-Modified: Thu, 13 Jun 2024 11:41:12 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: d5a265f7265c149d31b79b2568737f9e0d97aab546838b9090d7403265887733
X-Amz-Request-Id: 17D88D888E83DED2
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> TLS Certificate found:
mc: <DEBUG>  >> Expires: 2025-06-12 09:16:34 +0000 UTC
mc: <DEBUG> Response Time: 1.841403ms

mc: <DEBUG> HEAD /test/audi/Audi%20cabriolet%2001.jpg HTTP/1.1
Host: tenant-rdk-hl.tenant-rdk:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.70 mc/DEVELOPMENT.GOGET
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=l5m7mQiwUi2GCSH3/20240613/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240613T114115Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 2972986
Accept-Ranges: bytes
Content-Type: image/jpeg
Date: Thu, 13 Jun 2024 11:41:15 GMT
Etag: "cf286d3bb78192dc30574ae716b671a7"
Last-Modified: Thu, 13 Jun 2024 11:41:12 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: d5a265f7265c149d31b79b2568737f9e0d97aab546838b9090d7403265887733
X-Amz-Request-Id: 17D88D888EAFA7B8
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> TLS Certificate found:
mc: <DEBUG>  >> Expires: 2025-06-12 09:16:34 +0000 UTC
mc: <DEBUG> Response Time: 1.460432ms

mc: <DEBUG> GET /test/audi/Audi%20cabriolet%2001.jpg HTTP/1.1
Host: tenant-rdk-hl.tenant-rdk:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.70 mc/DEVELOPMENT.GOGET
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=l5m7mQiwUi2GCSH3/20240613/us-east-1/s3/aws4_request, SignedHeaders=host;if-match;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
If-Match: "cf286d3bb78192dc30574ae716b671a7"
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20240613T114115Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 2971514
Accept-Ranges: bytes
Content-Type: image/jpeg
Date: Thu, 13 Jun 2024 11:41:15 GMT
Etag: "cf286d3bb78192dc30574ae716b671a7"
Last-Modified: Thu, 13 Jun 2024 11:41:12 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Id-2: d5a265f7265c149d31b79b2568737f9e0d97aab546838b9090d7403265887733
X-Amz-Request-Id: 17D88D888EE0966D
X-Amz-Server-Side-Encryption: aws:kms
X-Amz-Server-Side-Encryption-Aws-Kms-Key-Id: arn:aws:kms:tenant-rdk-key
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> TLS Certificate found:
mc: <DEBUG>  >> Expires: 2025-06-12 09:16:34 +0000 UTC
mc: <DEBUG> Response Time: 14.091496ms

mc: <ERROR> Failed to copy `https://tenant-rdk-hl.tenant-rdk:9000/test/audi/Audi cabriolet 01.jpg`. Input reader closed pre-maturely. Expected `2972986` bytes, but only received `2971514` bytes.
 (3) cp-main.go:478 cmd.doCopySession(..) Tags: [https://tenant-rdk-hl.tenant-rdk:9000/test/audi/Audi cabriolet 01.jpg]
 (2) common-methods.go:508 cmd.uploadSourceToTargetURL(..) Tags: [https://tenant-rdk-hl.tenant-rdk:9000/test/audi/Audi cabriolet 01.jpg]
 (1) common-methods.go:212 cmd.putTargetStream(..) Tags: [, /mnt/c/temp/Audi cabriolet 01.jpg]
 (0) client-fs.go:355 cmd.(*fsClient).put(..)
 Release-Tag:DEVELOPMENT.GOGET | Commit:DEVELOPMENT. | Host:RDK-PREC7560 | OS:linux | Arch:amd64 | Lang:go1.22.3 | Mem:5.7 MiB/18 MiB | Heap:5.7 MiB/11 MiB

After a restart of the pods, everything works again. I can copy the files to my local disk multiple times without any issues. It only starts to fail once I copy the files from my local disk to the bucket again. I double-checked: I built my Docker image from branch jiuker:fix-headobject-will-not-save-meta-into-cache-when-not-found and my /usr/bin/minio file is only a few minutes old, so I'm pretty confident that I'm using the proper image.
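
The mismatch in the failing run can be spotted mechanically by comparing the Content-Length of the HEAD and GET responses for the same path. The sketch below parses debug output in the shape shown above; the helper names `content_lengths` and `mismatches` are hypothetical, and the regular expressions assume only the line formats visible in this thread.

```python
from __future__ import annotations

import re

REQUEST = re.compile(r"^mc: <DEBUG> (GET|HEAD|PUT) (\S+) HTTP/1\.1$")
STATUS = re.compile(r"^mc: <DEBUG> HTTP/1\.1 \d{3}")
LENGTH = re.compile(r"^Content-Length: (\d+)$")


def content_lengths(log: str) -> list[tuple[str, str, int]]:
    """Return (method, path, response Content-Length) for each exchange."""
    results = []
    method = path = None
    in_response = False
    for raw in log.splitlines():
        line = raw.strip()
        if m := REQUEST.match(line):
            method, path = m.group(1), m.group(2)
            in_response = False
        elif STATUS.match(line):
            in_response = True
        elif in_response and method and (m := LENGTH.match(line)):
            results.append((method, path, int(m.group(1))))
            in_response = False
    return results


def mismatches(log: str) -> dict[str, dict[str, int]]:
    """Paths whose last HEAD and last GET responses disagree on length."""
    by_path: dict[str, dict[str, int]] = {}
    for method, path, length in content_lengths(log):
        by_path.setdefault(path, {})[method] = length
    return {
        path: seen
        for path, seen in by_path.items()
        if "HEAD" in seen and "GET" in seen and seen["HEAD"] != seen["GET"]
    }
```

Applied to the failing run, this reports HEAD at 2972986 bytes and GET at 2971514 bytes for the same object, a difference of 1472 bytes, which matches the numbers in the mc error message.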

@ramondeklein
Contributor

ramondeklein commented Jun 13, 2024

I think the problem might be in how mincache updates its local cache during the PutObject operation. To test whether mincache is the issue, I did the following:

  1. Copy the files to the bucket and fetch one of them. This failed because not enough data could be read.
  2. Shell into the mincache side-containers of both pods and run kill 1 to kill mincache.
  3. Wait until both pods have spun up a mincache side-container again.
  4. Fetch one of the files again; this now works.

> This needs a fix in mincache, not in MinIO.

@harshavardhana I guess my test indicates the same.

@ramondeklein
Contributor

Steps to reproduce:

  1. Spin up a MinIO container that uses server-side encryption.
  2. Create a bucket named test and enable SSE on the bucket.
  3. Ensure that mincache is used and make sure mincache also caches the test bucket.
  4. Copy a file to the test bucket using mc.
  5. Copy the file from the test bucket using mc.

After restarting mincache, copying the file works fine until the file is written again.

@harshavardhana
Member

This PR is an independent fix that avoids a discrepancy on HEAD, so it does not really fix the original issue.

@harshavardhana harshavardhana merged commit 62e6dc9 into minio:master Jun 13, 2024
20 checks passed
Development

Successfully merging this pull request may close these issues.

with mincache enabled HEAD request returns different Content-Length then matching GET on a KMS setup
5 participants