Skip to content

Make SingleFile BundleID the expected length (12 bytes) #116656

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 24, 2025

Conversation

jtschuster
Copy link
Member

@jtschuster jtschuster commented Jun 13, 2025

It looks like #52930 meant to have the BundleID be 12 characters, but used Substring(12), which created a substring starting at offset 12 instead. This may end up being more of breaking change, though.

@github-actions github-actions bot added the area-HostModel Microsoft.NET.HostModel issues label Jun 13, 2025
Copy link
Contributor

Tagging subscribers to this area: @vitek-karas, @agocke
See info in area-owners.md if you want to be subscribed.

@jtschuster jtschuster marked this pull request as ready for review June 14, 2025 01:18
@Copilot Copilot AI review requested due to automatic review settings June 14, 2025 01:18
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR corrects the substring usage when generating the deterministic BundleID so that the ID is the intended fixed length rather than the remainder of the string.

  • Changed the Substring call to use a start index of 0 and a specified length.
  • Ensures the generated BundleID is exactly BundleIdLength characters.

@agocke
Copy link
Member

agocke commented Jun 14, 2025

We would need to bump the major version, but otherwise it seems ok. How long is the bundle id right now?

@agocke
Copy link
Member

agocke commented Jun 14, 2025

Also needs a unit test

@jtschuster
Copy link
Member Author

We would need to bump the major version, but otherwise it seems ok. How long is the bundle id right now?

Haven't validated by looking at an actual bundle yet, but it should be 32 bytes long now (SHA-256 hash : 32 bytes => Base 64 Encoded : 44 bytes => Substring(12) : 32 bytes). Seems like a reasonable length dispite being maybe overkill for a unique identifier. It might make more sense to leave this as 32 bytes and update the code to make it look intentionally 32 bytes as long there's no good reason to change it back to 12 bytes. From what I gather from previous discussions, the 12 bytes was arbitrarily chosen to match GetRandomFileName().

@agocke
Copy link
Member

agocke commented Jun 16, 2025

Works for me. Do we actually use the bundle ID as a file name anywhere? Maybe during extraction? The problem with long file names is windows path lengths.

@elinor-fung
Copy link
Member

I think either way (going back to 12 bytes per the original intent of making it deterministic or making the code/naming indicate what it is currently doing) is actually fine. The bundle ID is expected to be variable-length data (written/read as a length-prefixed string), so it shouldn't be breaking.

@elinor-fung
Copy link
Member

Do we actually use the bundle ID as a file name anywhere? Maybe during extraction?

Yeah, for extraction we do <extraction_root>/<app_name>/<bundle_id>

@jtschuster
Copy link
Member Author

I think either way (going back to 12 bytes per the original intent of making it deterministic or making the code/naming indicate what it is currently doing) is actually fine. The bundle ID is expected to be variable-length data (written/read as a length-prefixed string), so it shouldn't be breaking.

To clarify, since it's a variable length field this shouldn't require a version bump in the Manifest, right?

@elinor-fung
Copy link
Member

I think either way (going back to 12 bytes per the original intent of making it deterministic or making the code/naming indicate what it is currently doing) is actually fine. The bundle ID is expected to be variable-length data (written/read as a length-prefixed string), so it shouldn't be breaking.

To clarify, since it's a variable length field this shouldn't require a version bump in the Manifest, right?

Right - this should not need a version bump in the Manifest since it is already a variable length field.

@jtschuster jtschuster merged commit 9ec2190 into dotnet:main Jun 24, 2025
74 of 76 checks passed
@@ -133,8 +133,9 @@ private string GenerateDeterministicId()
byte[] manifestHash = bundleHash.Hash;
bundleHash.Dispose();
bundleHash = null;

return Convert.ToBase64String(manifestHash).Substring(BundleIdLength).Replace('/', '_');
string id = Convert.ToBase64String(manifestHash).Substring(0, BundleIdLength).Replace('/', '_');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More safer escaping:

using using System.Buffers.Text;
...
#if NET
            string id = Base64Url.EncodeToString(manifestHash).Substring(0, BundleIdLength);
#else
            string id = Convert.ToBase64String(manifestHash).Substring(0, BundleIdLength).Replace('/', '_');
#endif

If without #if / #else directives is desired, <PackageReference Include="System.Buffers" Version="$(SystemBuffersVersion)" Condition="'$(TargetFrameworkIdentifier)' != '.NETCoreApp'" />

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these guaranteed to be identical? I'd be worried about using a different API in different TFMs when the id needs to be deterministic.

Also, what do you mean by safer?

This was written this way because Roslyn does this for getting a unique pipe name and hasn't had issues with it: #52930 (comment).
Current implementation:
https://github.com/dotnet/roslyn/blob/c8b5f306d86bc04c59a413ad17b6152663a1e744/src/Compilers/Shared/BuildServerConnection.cs#L579

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what do you mean by safer?

https://developer.mozilla.org/en-US/docs/Glossary/Base64#url_and_filename_safe_base64, the term "URL-safe Base64 encoding" is common.

Are these guaranteed to be identical?

The PackageReference approach will make sure it's identical for downlevel TFM.

Copy link
Member

@filipnavara filipnavara Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not identical but it would be more correct usage anyway.

Base64Url.EncodeToString differs in three ways:

  • Replaces / => _ (same as in previous code)
  • Removes padding with = at the end (irrelevant in this case because we trim at 12 characters so the padding is never part of the output)
  • Replaces + => -

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like Base64Url is not in the maintainence packages, but I can add a separate implementation for Framework.

jtschuster added a commit that referenced this pull request Jun 26, 2025
#116656 (9ec2190) was merged accidentally before realizing that the length was unchanged and remained at 32 bytes. This actually changes the length to 12.

Also uses Base64Url.EncodeToString on .NET and a custom implementation on .NET Framework rather than Convert.ToBase64().Replace(). Adds tests for the custom implementation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-HostModel Microsoft.NET.HostModel issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants