ProbabilisticMap and CompExactlyDependsOn

The [`ProbabilisticMap`](https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/SearchValues/ProbabilisticMap.cs) (bloom filter) implementation we use in CoreLib to speed up `IndexOfAny(char[])` operations has a vectorized path (`Sse41.IsSupported || AdvSimd.Arm64.IsSupported`).
The process is split into two parts: 1) building the map, and 2) searching through text using said map.

The actual data structure we end up with depends on whether vectorization is supported. That is, we must make the same decision re: vectorization support when building the map and when we're searching.
If they don't agree (e.g. `SetCharBit` assumes these intrinsics aren't supported, but `IsCharBitSet` does) we will report wrong results.

https://github.com/dotnet/runtime/blob/643087b28c42f1ea0fbb645ba9f7cbcb2f334060/src/libraries/System.Private.CoreLib/src/System/SearchValues/ProbabilisticMap.cs#L78-L94


This situation is called out in BOTR docs: https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/vectors-and-intrinsics.md#code-review-and-analyzer-rules-for-code-written-in-systemprivatecorelibdll

> Within a single function that uses platform intrinsics, unless marked with the `CompExactlyDependsOn` attribute it must behave identically regardless of whether IsSupported returns true or not. This allows the R2R compiler to compile with a lower set of intrinsics support, and yet expect that the behavior of the function will remain unchanged in the presence of tiered compilation.

The issue is that if we do apply `CompExactlyDependsOn` to these methods, we'll run into a viral attribute explosion.
The code analyzer in corelib will tell us that [`ProbabilisticMap.Contains`](https://github.com/dotnet/runtime/blob/643087b28c42f1ea0fbb645ba9f7cbcb2f334060/src/libraries/System.Private.CoreLib/src/System/SearchValues/ProbabilisticMap.cs#L97) should be marked, then all its callers like `string.MakeSeparatorListAny`, then `string.SplitInternal`, then all `string.Split` overloads, then ...

What should we do in such situations?

Just mark the two root methods (`SetCharBit`, `IsCharBitSet`) as `AggressiveOptimization`?

	[MethodImpl(MethodImplOptions.AggressiveInlining)]
	private static void SetCharBit(ref uint charMap, byte value)
	{
	if (Sse41.IsSupported \|\| AdvSimd.Arm64.IsSupported)
	{
	Unsafe.Add(ref Unsafe.As<uint, byte>(ref charMap), value & VectorizedIndexMask) \|= (byte)(1u << (value >> VectorizedIndexShift));
	}
	else
	{
	Unsafe.Add(ref charMap, value & PortableIndexMask) \|= 1u << (value >> PortableIndexShift);
	}
	}

	[MethodImpl(MethodImplOptions.AggressiveInlining)]
	private static bool IsCharBitSet(ref uint charMap, byte value) => Sse41.IsSupported \|\| AdvSimd.Arm64.IsSupported
	? (Unsafe.Add(ref Unsafe.As<uint, byte>(ref charMap), value & VectorizedIndexMask) & (1u << (value >> VectorizedIndexShift))) != 0
	: (Unsafe.Add(ref charMap, value & PortableIndexMask) & (1u << (value >> PortableIndexShift))) != 0;

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ProbabilisticMap and CompExactlyDependsOn #89979

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

ProbabilisticMap and CompExactlyDependsOn #89979

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions