-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Make SingleFile BundleID the expected length (12 bytes) #116656
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @vitek-karas, @agocke |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR corrects the substring usage when generating the deterministic BundleID so that the ID is the intended fixed length rather than the remainder of the string.
- Changed the
Substring
call to use a start index of 0 and a specified length. - Ensures the generated BundleID is exactly
BundleIdLength
characters.
We would need to bump the major version, but otherwise it seems ok. How long is the bundle id right now? |
Also needs a unit test |
Haven't validated by looking at an actual bundle yet, but it should be 32 bytes long now (SHA-256 hash : 32 bytes => Base 64 Encoded : 44 bytes => Substring(12) : 32 bytes). Seems like a reasonable length dispite being maybe overkill for a unique identifier. It might make more sense to leave this as 32 bytes and update the code to make it look intentionally 32 bytes as long there's no good reason to change it back to 12 bytes. From what I gather from previous discussions, the 12 bytes was arbitrarily chosen to match GetRandomFileName(). |
Works for me. Do we actually use the bundle ID as a file name anywhere? Maybe during extraction? The problem with long file names is windows path lengths. |
I think either way (going back to 12 bytes per the original intent of making it deterministic or making the code/naming indicate what it is currently doing) is actually fine. The bundle ID is expected to be variable-length data (written/read as a length-prefixed string), so it shouldn't be breaking. |
Yeah, for extraction we do |
To clarify, since it's a variable length field this shouldn't require a version bump in the Manifest, right? |
Right - this should not need a version bump in the Manifest since it is already a variable length field. |
@@ -133,8 +133,9 @@ private string GenerateDeterministicId() | |||
byte[] manifestHash = bundleHash.Hash; | |||
bundleHash.Dispose(); | |||
bundleHash = null; | |||
|
|||
return Convert.ToBase64String(manifestHash).Substring(BundleIdLength).Replace('/', '_'); | |||
string id = Convert.ToBase64String(manifestHash).Substring(0, BundleIdLength).Replace('/', '_'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More safer escaping:
using using System.Buffers.Text;
...
#if NET
string id = Base64Url.EncodeToString(manifestHash).Substring(0, BundleIdLength);
#else
string id = Convert.ToBase64String(manifestHash).Substring(0, BundleIdLength).Replace('/', '_');
#endif
If without #if / #else
directives is desired, <PackageReference Include="System.Buffers" Version="$(SystemBuffersVersion)" Condition="'$(TargetFrameworkIdentifier)' != '.NETCoreApp'" />
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these guaranteed to be identical? I'd be worried about using a different API in different TFMs when the id needs to be deterministic.
Also, what do you mean by safer?
This was written this way because Roslyn does this for getting a unique pipe name and hasn't had issues with it: #52930 (comment).
Current implementation:
https://github.com/dotnet/roslyn/blob/c8b5f306d86bc04c59a413ad17b6152663a1e744/src/Compilers/Shared/BuildServerConnection.cs#L579
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what do you mean by safer?
https://developer.mozilla.org/en-US/docs/Glossary/Base64#url_and_filename_safe_base64, the term "URL-safe Base64 encoding" is common.
Are these guaranteed to be identical?
The PackageReference approach will make sure it's identical for downlevel TFM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not identical but it would be more correct usage anyway.
Base64Url.EncodeToString
differs in three ways:
- Replaces
/
=>_
(same as in previous code) - Removes padding with
=
at the end (irrelevant in this case because we trim at 12 characters so the padding is never part of the output) - Replaces
+
=>-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like Base64Url is not in the maintainence packages, but I can add a separate implementation for Framework.
#116656 (9ec2190) was merged accidentally before realizing that the length was unchanged and remained at 32 bytes. This actually changes the length to 12. Also uses Base64Url.EncodeToString on .NET and a custom implementation on .NET Framework rather than Convert.ToBase64().Replace(). Adds tests for the custom implementation.
It looks like #52930 meant to have the BundleID be 12 characters, but used
Substring(12)
, which created a substring starting at offset 12 instead. This may end up being more of breaking change, though.