This repository has been archived by the owner on Aug 27, 2023. It is now read-only.

processes metadata continuous spaces #314

Merged
stevearc merged 1 commit into stevearc:master from QSummerY:processes_continuous_spaces
Sep 3, 2022

Conversation

QSummerY
Contributor

As suggested in issue boto/botocore#2409.
Uploads may fail when a metadata value contains consecutive spaces.

Owner

@stevearc stevearc left a comment

The linked issue is talking about leading/trailing whitespace, not multiple space characters in a row. This seems like a value.strip() would be more appropriate?

@QSummerY
Contributor Author

I get SignatureDoesNotMatch when I try to upload. After verifying locally, I found that signature parsing is affected when consecutive spaces appear. The normalize_metadata_value method removes non-ASCII characters from metadata, and that removal can leave consecutive spaces behind.
The linked issue is similar to mine. I also tried values with spaces at the beginning and the end, but those uploads work normally; only consecutive spaces cause the failure.
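
A minimal sketch of how stripping non-ASCII characters can leave two spaces in a row. It assumes the filtering expression from normalize_metadata_value; the helper name `strip_non_ascii` and the sample summary are hypothetical and only serve the illustration.

```python
import unicodedata


def strip_non_ascii(value: str) -> str:
    """Drop every character that is still non-ASCII after NFKD decomposition."""
    return "".join(c for c in unicodedata.normalize("NFKD", value) if ord(c) < 128)


# Hypothetical summary: a CJK word sits between two single spaces.
summary = "Web 框架 server"
stripped = strip_non_ascii(summary)

# The CJK characters are removed, so the two single spaces become adjacent.
print(repr(stripped))
```

The input never contained a double space; it only appears once the characters between the two spaces are dropped.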

Owner

@stevearc stevearc left a comment

Out of curiosity, which object storage provider is giving you this signature error?

@@ -80,7 +80,7 @@ def normalize_metadata_value(value: Union[str, bytes]) -> str:
         value = value.decode("utf-8")
     if isinstance(value, str):
         value = "".join(c for c in unicodedata.normalize("NFKD", value) if ord(c) < 128)
-    return value
+    return re.sub(r"\s+", " ", value)
Owner

This will replace all whitespace characters with " ". Could we instead do

re.sub(r"  +", " ", value)

to only replace instances of two-or-more spaces with a single space?

Contributor Author

@QSummerY QSummerY Aug 29, 2022

I tested other whitespace characters as well. The same problem occurs when \t is present. With \n or \r the upload succeeds, but the characters after the whitespace are dropped.
I also observed that only x-amz-meta-summary affects the request. The processed values include the following.

 'x-amz-meta-hash-sha256': '71eccb33ac8b2584c86f36f8ebfbe72c0a98022b576f411ab05773343e4e2cac', 
 'x-amz-meta-hash-md5': 'f562366338d015df143c7afacabdee44', 
 'x-amz-meta-summary': ' Python3\tweb , WSGI, Web ', 
 'x-amz-meta-name': 'frontgate', 
 'x-amz-meta-version': '0.6.39'

So I think the implementation should stay the same as before.

Contributor Author

It's the COS of Tencent Cloud.

Owner

Okay, last request:
This is going to replace all kinds of whitespace with spaces. It would be better if we could preserve the original whitespace if it's not causing issues. I believe this should do it

re.sub(r'(\s)\s+', r'\1', value)
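
For comparison, here is a rough sketch of how the three substitutions discussed in this thread behave on one input. The patterns are the ones quoted in the comments; the `PATTERNS` table, the `collapse` helper, and the sample string are hypothetical.

```python
import re

# Label -> (pattern, replacement), as quoted in the review comments.
PATTERNS = {
    "two-or-more spaces": (r"  +", " "),
    "keep first whitespace": (r"(\s)\s+", r"\1"),
    "any whitespace run": (r"\s+", " "),
}


def collapse(value: str, name: str) -> str:
    """Apply the named substitution to value."""
    pattern, replacement = PATTERNS[name]
    return re.sub(pattern, replacement, value)


# Hypothetical input: a double space, a double tab, a space+tab run, a lone tab.
sample = "a  b\t\tc \td\te"

for name in PATTERNS:
    print(f"{name:22} {collapse(sample, name)!r}")
```

Only the last pattern removes every tab. The first leaves all tabs untouched, and the second keeps any tab that stands alone or leads a run.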

Contributor Author

@QSummerY QSummerY Sep 1, 2022

I have tried the case where only \t is present, and the error still appears. Keeping the original whitespace would cause the same problem, so I think it should look like this:

return re.sub(r"\s+", " ", value)

Owner

Wow COS really doesn't like whitespace, huh? I guess this is fine. The worst that we expect is some mangling of the summary formatting.
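
To make the accepted trade-off concrete, here is a sketch of the function as it would read with this change applied. The body follows the diff hunk; the `isinstance(value, bytes)` guard is an assumption inferred from the decode line, and the sample values are hypothetical.

```python
import re
import unicodedata
from typing import Union


def normalize_metadata_value(value: Union[str, bytes]) -> str:
    """Strip non-ASCII characters, then collapse each whitespace run to one space."""
    if isinstance(value, bytes):  # assumed guard; only the decode line is in the hunk
        value = value.decode("utf-8")
    if isinstance(value, str):
        value = "".join(c for c in unicodedata.normalize("NFKD", value) if ord(c) < 128)
    return re.sub(r"\s+", " ", value)


# Hypothetical multi-line summary: line breaks and indentation are flattened.
multiline = "Line one\n\tindented line\r\nLine three"
flattened = normalize_metadata_value(multiline)
print(repr(flattened))

# Hypothetical bytes input: the accent is dropped and the double space collapsed.
from_bytes = normalize_metadata_value("Café  menu".encode("utf-8"))
print(repr(from_bytes))
```

The multi-line summary comes back as a single line, which is the formatting loss the comment above refers to.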

@coveralls

Coverage Status

Coverage remained the same at 82.916% when pulling 5930658 on QSummerY:processes_continuous_spaces into 477ca21 on stevearc:master.

Owner

@stevearc stevearc left a comment

One last request before merging

@stevearc stevearc merged commit 6640512 into stevearc:master Sep 3, 2022