Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shorten UTD marker file #9387

Merged
merged 8 commits into from
Nov 30, 2023
Merged

Conversation

JanKrivanek
Copy link
Member

@JanKrivanek JanKrivanek commented Nov 2, 2023

Fixes #9346, #7131

Context

With long project paths CopyUpToDate mareker file can get long and break build in VS as long file support is not on.
This is a proposal of short/mid term remedy by shortening the marker file path a bit. However it applies only to non-CPS cases and it applies to CopyUpToDate marker file only.
Considering those facts it's questionable if this should be part of the MSBuild repo.
Alternatively we can decide to fix it more widely by renaming the MSBuildCopyMarkerName to MSBuildProjectMarkerName and use it in various other scenarios where marker/cache files are generated in a similar way - though that would have much higher breaking possibility.

@JanKrivanek JanKrivanek marked this pull request as ready for review November 3, 2023 08:44
@YuliiaKovalova
Copy link
Member

Do you have an experimental insertion for this fix?
It would be nice to see how if affects VS in advance.

@JanKrivanek
Copy link
Member Author

JanKrivanek commented Nov 8, 2023

Do you have an experimental insertion for this fix? It would be nice to see how if affects VS in advance.

https://devdiv.visualstudio.com/DevDiv/_git/VS/pullrequest/509283

I'm looking into the checks failures.
Did you have anything else specific in your mind to watch for?

UPDATE: Just a single test is failing for insertion which is by design (as built packages are missing in the public feed)

@JanKrivanek
Copy link
Member Author

Closing in order to rekick the confused git pull

@JanKrivanek JanKrivanek reopened this Nov 8, 2023
@JanKrivanek
Copy link
Member Author

CI blocked by ADO issue. IcM [internal]: https://portal.microsofticm.com/imp/v3/incidents/details/439851463/home

@JanKrivanek
Copy link
Member Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

src/Build/Utilities/FowlerNollVo1aHash.cs Outdated Show resolved Hide resolved
src/Tasks/Microsoft.Common.CurrentVersion.targets Outdated Show resolved Hide resolved
src/Build/Utilities/FowlerNollVo1aHash.cs Outdated Show resolved Hide resolved
src/Tasks/Microsoft.Common.CurrentVersion.targets Outdated Show resolved Hide resolved
src/StringTools/FowlerNollVo1aHash.cs Outdated Show resolved Hide resolved
src/Tasks/Microsoft.Common.CurrentVersion.targets Outdated Show resolved Hide resolved
Copy link
Contributor

@f-alizada f-alizada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation :)

@JanKrivanek JanKrivanek merged commit c4aec6d into dotnet:main Nov 30, 2023
8 checks passed
@JanKrivanek JanKrivanek deleted the proto/shorten-utd-marker branch November 30, 2023 09:18
Comment on lines +111 to +115
ReadOnlySpan<byte> span = MemoryMarshal.Cast<char, byte>(text.AsSpan());
foreach (byte b in span)
{
hash = unchecked((hash ^ b) * fnvPrimeA64Bit);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does MSBuild support big-endian platforms? It looks like this would not be stable between big-endian and little-endian.

Copy link
Member Author

@JanKrivanek JanKrivanek Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KalleOlaviNiemitalo - Thank you for bringing this up!
We're not declaring what can be the expectations of stableness of the hashing between different systems (we actually do not declare anything ... yet :-)) - but it'd certainly be nice to have it more universal.

@uweigand - would you be willing to chime in and share your thoughts on the following?

Basically - since the MemoryMarshal.Cast<char, byte> changes how we fetch data from memory to processor - it'll behave differently between Endinan-ness representations. But how anout the version which manually splits each char to bytes:

for (int i = 0; i < text.Length; i++)
{
    char ch = text[i];
    byte b = (byte)ch;
    hash ^= b;
    hash *= fnvPrimeA64Bit;

    b = (byte)(ch >> 8);
    hash ^= b;
    hash *= fnvPrimeA64Bit;
}

since all this happens in processor after the data are fetched from memory - it should fetch same results regardless of endinan-ness - is that right?

FYI @ladipro (and thanks for all the offline consultation)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that would work the same in either endianness.

You could also check BitConverter.IsLittleEndian and use MemoryMarshal.Cast on little-endian platforms but fall back to the slower NET35-compatible implementation on big-endian platforms.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…although I have not measured whether the NET35-compatible implementation actually is slower. In both implementations, each operation on hash depends on the previous one, so it doesn't look like there is much that the CPU could do in parallel. And the strings probably are pretty short anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me the code above implements three methods that are (potentially) different. There's the NET35 method in ComputeHash64 (which operates on single bytes at a time, splitting the 2-byte char in an endian-independent way), the !NET35 method in ComputeHash64 (which also operates on single bytes at a time, but splits the 2-byte char in an endian-dependent way), and then there's ComputeHash64Fast (which operates on 2-byte chars at a time, which appears to give a different result from either of the two other methods).

In general, I agree that it would be preferable to use a method that is stable across native endianness. Unless the NET35 method is really slower (and I also suspect it actually isn't), and performance matters at this place, it might be best to just use that method unconditionally?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re ComputeHash64Fast, this comment seems to be false:

/// The hashing algorithm considers only the first 8 bits of each character.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all the detailed input! I'll remove the Spans version.

The Spans version is indeed not superior (it has slight to no advantage for large strings, but performs worse for a very short strings). It's nice that it doesn't need unchecked code and is more brief - but given the fact that it can perform worse and that the other version anyways need to be maintained - it's an easy choice.

I'll as well fix the description for the ComputeHash64Fast

JanKrivanek added a commit to JanKrivanek/msbuild that referenced this pull request Dec 12, 2023
rainersigwald pushed a commit that referenced this pull request Dec 12, 2023
…entVersion.targets (#9520)

#9387 introduced improved hashing for [`[MSBuild]::StableStringHash`](https://learn.microsoft.com/en-us/visualstudio/msbuild/property-functions?view=vs-2022#msbuild-stablestringhash), that however broke internal functionality relying on the hash to be stable between versions (despite documented otherwise).

### Changes Made
 * Reverted all the hashing changes
 * Kept the UTD marker change (fixing #9346)

### Note
Longer term fix: #9519
JanKrivanek added a commit that referenced this pull request Feb 8, 2024
JanKrivanek added a commit that referenced this pull request Feb 16, 2024
* Revert "Revert "Shorten UTD marker file (#9387)" except Microsoft.Common.Curr…"

This reverts commit 5af9301.

* Make FNV hash compatible across endianness

* Add StableStringHash intrinsic function overloads
JanKrivanek added a commit to JanKrivanek/msbuild that referenced this pull request Mar 13, 2024
JanKrivanek added a commit that referenced this pull request Mar 18, 2024
* Revert "Revert "Shorten UTD marker file (#9387)" except Microsoft.Common.Curr…"

This reverts commit 5af9301.

* Make FNV hash compatible across endianness

* Add StableStringHash intrinsic function overloads

* Put StringTools functions references behind changewave

* Prevent StableStringHash inlining

* Move the changewave description to proper section
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CopyUpToDateMarker path too long
6 participants