Skip to content

ProbabilisticMap and CompExactlyDependsOn #89979

Closed
@MihaZupan

Description

@MihaZupan

The ProbabilisticMap (bloom filter) implementation we use in CoreLib to speed up IndexOfAny(char[]) operations has a vectorized path (Sse41.IsSupported || AdvSimd.Arm64.IsSupported).
The process is split into two parts: 1) building the map, and 2) searching through text using said map.

The actual data structure we end up with depends on whether vectorization is supported. That is, we must make the same decision re: vectorization support when building the map and when we're searching.
If they don't agree (e.g. SetCharBit assumes these intrinsics aren't supported, but IsCharBitSet does) we will report wrong results.

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static void SetCharBit(ref uint charMap, byte value)
{
if (Sse41.IsSupported || AdvSimd.Arm64.IsSupported)
{
Unsafe.Add(ref Unsafe.As<uint, byte>(ref charMap), value & VectorizedIndexMask) |= (byte)(1u << (value >> VectorizedIndexShift));
}
else
{
Unsafe.Add(ref charMap, value & PortableIndexMask) |= 1u << (value >> PortableIndexShift);
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static bool IsCharBitSet(ref uint charMap, byte value) => Sse41.IsSupported || AdvSimd.Arm64.IsSupported
? (Unsafe.Add(ref Unsafe.As<uint, byte>(ref charMap), value & VectorizedIndexMask) & (1u << (value >> VectorizedIndexShift))) != 0
: (Unsafe.Add(ref charMap, value & PortableIndexMask) & (1u << (value >> PortableIndexShift))) != 0;

This situation is called out in BOTR docs: https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/vectors-and-intrinsics.md#code-review-and-analyzer-rules-for-code-written-in-systemprivatecorelibdll

Within a single function that uses platform intrinsics, unless marked with the CompExactlyDependsOn attribute it must behave identically regardless of whether IsSupported returns true or not. This allows the R2R compiler to compile with a lower set of intrinsics support, and yet expect that the behavior of the function will remain unchanged in the presence of tiered compilation.

The issue is that if we do apply CompExactlyDependsOn to these methods, we'll run into a viral attribute explosion.
The code analyzer in corelib will tell us that ProbabilisticMap.Contains should be marked, then all its callers like string.MakeSeparatorListAny, then string.SplitInternal, then all string.Split overloads, then ...

What should we do in such situations?

Just mark the two root methods (SetCharBit, IsCharBitSet) as AggressiveOptimization?

Metadata

Metadata

Assignees

No one assigned

    Labels

    area-crossgen2-coreclrin-prThere is an active PR which will close this issue when it is merged

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions