Frozen collection follow-ups #77891
Tagging subscribers to this area: @dotnet/area-system-collections |
cc: @geeknoid |
Perfect hashing ends up being not a great choice. Perfect hashing usually involves a more complex hash function, and you'll end up with lower perf than what we have here. Their main benefit would potentially be in smaller data by not having giant sparse lookup tables. Another area worth looking into is to further optimize string-based keys (the overwhelming favorite kind of key). There are a number of comparers in the current implementation to try and fine-tune how comparison works, but more are possible. For example, comparers that know about 2-character sequences explicitly. The motivation here is to reduce the cost of hashing incoming strings to a minimum. Perhaps a broader problem than just for the frozen collections, but there is very poor interop between spans and collections. In particular, for string-based dictionaries, it would be terrific to be able to do a lookup using a ReadOnlySpan as a key in addition to a string. This would avoid needing to materialize strings just to do a dictionary lookup. A perfect use for this is a string interning table for incoming REST/gRPC calls. You could go straight from the incoming buffer and do a lookup in the dictionary to get the final string object. |
Yes, this has long been desired, and we currently lack a great answer. This is tracked by #27229. |
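For what it's worth, this capability later shipped in .NET 9 as alternate lookups. A sketch of the interning-table scenario described above, assuming .NET 9's GetAlternateLookup API (the built-in string comparers support ReadOnlySpan<char> as an alternate key):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

var interned = new Dictionary<string, string>
{
    ["content-type"] = "content-type",
    ["content-length"] = "content-length",
}.ToFrozenDictionary();

// The alternate lookup accepts a ReadOnlySpan<char> sliced straight out of
// an incoming buffer, so no string needs to be materialized for the probe.
var lookup = interned.GetAlternateLookup<ReadOnlySpan<char>>();

ReadOnlySpan<char> incoming = "content-type: text/html";
int colon = incoming.IndexOf(':');
if (lookup.TryGetValue(incoming[..colon], out string? canonical))
    Console.WriteLine(canonical); // prints "content-type"
```

On success, `canonical` is the dictionary's own string instance, which is exactly what an interning table wants to hand back.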
Frozen collection state serialization is another interesting possibility. The idea is to make it possible to serialize the state of a frozen collection to a portable format. A source generator can then generate such serialized blobs to be sucked in by a frozen collection at startup, providing cheap initialization for well-known data. Additionally, said serialized data could actually be passed around in config blobs, so that even state that is config-driven can have the benefit of low startup time. You serialize the state offline and send it through the config system. This could in fact provide opportunities for even more aggressive optimizations, since startup time would not be an issue. |
I intended the "Source generation" part of what I wrote above to cover this. But we'd also need to be really careful here. It'd be fairly easy to accidentally create a vulnerability. |
Does this exclude using perfect hash functions at all? |
That statement was unrelated to perfect hashing. |
I mean that GetHashCode uses seeding today, but a perfect hash would not. So the question is whether we can use one or not. |
It could be an enum like CompressionLevel. That is more user-friendly in a public API. |
Technically could we? Sure. Should we? That would depend on a variety of factors, not the least of which would be demonstrating it was a perf win. And Martin already commented on his experience with their perf in #77891 (comment). |
I couldn't agree more. A simple example: if we have 256 strings, all with different lengths, the perfect hash function can be just the length of the string (plus a byte lookup table). Is it possible to create something faster? Complex perfect hash functions are aimed rather at creating very compact data buffers. If we are not going to store them in the .NET runtime, we don't need to strive to make them extremely small, and hence we can use simpler and faster functions. |
The implementation already does that. |
It would be very helpful if the dictionary had such an indexer:
public ref readonly KeyValuePair<TKey, TValue> this[int entryIndex]
// for the set
public ref readonly T this[int entryIndex]
It would allow storing references to entries as integers and later quickly retrieving them. The idea is to be able to get the indexes of entries and use them to retrieve keys and values. |
What's the scenario for that such that the indexer returning a ref to the value is insufficient? |
You mean this indexer? I have a couple of scenarios.
|
@omariom Does this make sense in the context of a frozen dictionary? Seems like you're using a dictionary to build up unique name/value pairs, and then when you're done you ditch the name part and access your values by index. Do you have scenarios that routinely require having both key-based and index-based access to the same values on an on-going basis? |
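For context, the current design already permits a limited form of index-based access through the Keys/Values ImmutableArray properties; a sketch (the assumption that Keys[i] and Values[i] correspond to the same entry is based on the contiguous-storage design described in the issue body, not a documented guarantee):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;
using System.Collections.Immutable;

var fd = new Dictionary<string, int> { ["a"] = 1, ["b"] = 2 }.ToFrozenDictionary();

// Keys and Values are exposed as ImmutableArrays, so an integer index can
// be stored and later used to retrieve an entry without re-hashing the key.
ImmutableArray<string> keys = fd.Keys;
ImmutableArray<int> values = fd.Values;

for (int i = 0; i < keys.Length; i++)
    Console.WriteLine($"{keys[i]} -> {values[i]}");
```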
@iSazonov For those cases where the input is known statically, a source generator that has a large suite of tricks to generate custom code seems like a great idea. This is not so much focused on the need for perfect hashing, it could be anything that makes the code fast. In the scenarios where I've used frozen collections so far, the set of items in the collections have always come from config or some other dynamic source, so I haven't had the need/desire for a build-time generator. But for parsers which have a well-known set of tokens, there could be value in a highly optimizing source generator. It all depends on the actual gains you could achieve relative to what you already get with the hard-coded frozen collection implementations. |
@stephentoub I have not experimented with purely length-based "hashing", which I think is a fascinating idea. If there is sufficient variance in the length of the keys in the frozen collection, it should be possible to just use the string length as the hash code, completely eliminating the cost of hashing. It'd be trivial to write a new comparer implementation that just uses string length. I bet that could be a considerable win in many cases. |
I spent some time on it, which is where https://github.com/dotnet/runtime/blob/main/src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/LengthBucketsFrozenDictionary.cs came from. There's of course more opportunity to experiment. |
Ah, you're one step ahead. I'm surprised this required a new dictionary type, rather than just a string comparer implementation. |
It didn't require it, but the comparer itself added overhead. It might still be useful to have the comparer with the frozen hashtable as a fallback if it doesn't meet the criteria for this impl. |
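To make the comparer-as-fallback idea concrete, a minimal sketch of such a comparer (hypothetical; this is not the runtime's implementation):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical length-as-hash comparer. Correct for any inputs (equal
// strings always have equal lengths), but collision-heavy unless the key
// set has well-distributed lengths, so it should only be selected after
// inspecting the keys at construction time.
sealed class LengthOnlyComparer : IEqualityComparer<string>
{
    public bool Equals(string? x, string? y) => string.Equals(x, y, StringComparison.Ordinal);
    public int GetHashCode(string s) => s.Length; // no per-character hashing
}

var set = new HashSet<string>(new[] { "a", "bb", "ccc" }, new LengthOnlyComparer());
Console.WriteLine(set.Contains("bb")); // prints True
```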
Here's another idea: If an integer set is created that contains a constrained range of values, it would be more efficient to just have an array of bits to hold the set's state. The idea is to leverage this for sets of enums. FrozenSet can then be implemented using simple array lookups rather than hashing. I just saw an enum-based set in some production code... |
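A minimal sketch of that idea (hypothetical type name; assumes the values fall in a known [min, max] range determined at construction):

```csharp
using System;
using System.Collections.Generic;

// Set over a bounded integer range backed by a bit array, so membership is
// a bounds check plus shift/mask instead of hashing.
sealed class RangeBitSet
{
    private readonly int _min;
    private readonly ulong[] _bits;

    public RangeBitSet(IEnumerable<int> items, int min, int max)
    {
        _min = min;
        _bits = new ulong[((max - min) >> 6) + 1];
        foreach (int item in items)
            _bits[(item - min) >> 6] |= 1UL << ((item - min) & 63);
    }

    public bool Contains(int value)
    {
        // Unsigned subtraction folds both out-of-range directions into one check.
        uint offset = (uint)(value - _min);
        return offset < (uint)(_bits.Length << 6)
            && (_bits[offset >> 6] & (1UL << (int)(offset & 63))) != 0;
    }
}

// Enum members map to a small contiguous range, so a set of enums becomes
// a few machine words:
var weekend = new RangeBitSet(new[] { (int)DayOfWeek.Saturday, (int)DayOfWeek.Sunday }, 0, 6);
Console.WriteLine(weekend.Contains((int)DayOfWeek.Monday)); // prints False
```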
I suggested providing optimized read-only collections for SortedSet and SortedDictionary. I made a separate topic for discussing the API and other issues related to the implementation of these collections, so as not to mix discussions of many different topics in this one issue. |
The Adaptive Radix Tree could be suitable for (ordinal) string keys, especially to implement a sorted set. A trade-off is retaining the actual string keys vs. keeping only their compacted bytes in the tree. |
Come to think of it, the adaptive part is unnecessary for frozen collections. Any radix tree would do. |
Is the scenario of frozen keys, mutable values supported by the new collections? Eg., in this scenario (on Kestrel hot path) a dictionary lookup is basically hand generated (these days maybe Roslyn would generate similar code from a switch statement) and the value may change. https://github.com/dotnet/aspnetcore/blob/41627fe50a7a71e395f2fdfccce9f8cbff10a49a/src/Shared/HttpSys/RequestProcessing/RequestHeaders.Generated.cs#L968 Edit: never mind, it uses fields for the values. But I'm interested in the answer anyway. |
You can't mutate what value the dictionary/set stores for a key. If that value is a reference type, though, or if it contains a reference type, you can of course mutate the referenced instance. In other words, the immutability here is shallow. |
Yeah in this case it's a struct. OK thanks. |
@danmoseley I have definitely seen someone use GetValueRefOrNullRef() and mutate a struct value in a frozen dictionary somewhere… |
That returns a ref readonly. Using a ref readonly to erroneously mutate the collection requires unsafe code. With unsafe code you can violate almost any kind of constraint, including immutability. |
Ahhh, so that was the trick. Indeed, they used a method from the |
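A quick illustration of the shallow immutability discussed above (requires .NET 8's frozen collections):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

var fd = new Dictionary<string, List<int>> { ["k"] = new List<int> { 1 } }
    .ToFrozenDictionary();

// The key -> value mapping is fixed, but the List<int> the value refers to
// is an ordinary mutable object: the immutability is shallow.
fd["k"].Add(2);
Console.WriteLine(fd["k"].Count); // prints 2
```

Had the value been a struct, no safe code could have changed what the dictionary returns for "k".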
There is absolutely no requirement for perfect hashing to be complex. Take a look at https://gist.github.com/mjsabby/d7baf64d524176ee35f69ffa2c278179 and it outperforms the current .NET 7 dictionary. It implements the hash-displace algorithm. I would implement frozen sets and dictionaries with perfect hashing.
[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public static int Lookup(ReadOnlySpan<byte> key, ReadOnlySpan<int> values, ReadOnlySpan<int> seeds)
{
    // Hash64 is the 64-bit hash function defined in the linked gist.
    ulong hash;
    unsafe
    {
        fixed (byte* ptr = &MemoryMarshal.GetReference(key))
        {
            hash = Hash64(ptr, key.Length);
        }
    }

    // First-level bucket: the table size is a power of two, so a mask
    // replaces a modulo.
    var size = (ulong)values.Length;
    var index = hash & (size - 1);
    var seed = seeds[(int)index];

    // A negative seed encodes a direct slot index for a single-entry bucket.
    if (seed < 0)
    {
        return values[0 - seed - 1];
    }

    // Otherwise mix the seed into the hash (xorshift-style) to displace the
    // entry into its collision-free slot.
    var x = (ulong)seed + hash;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    index = (x * 0x2545F4914F6CDD1D) & (size - 1);
    return values[(int)index];
} |
@geeknoid
A source generator for static sets could be fantastic. It can make top-of-the-line lookup perf easily accessible. Being able to customize the underlying hash function and key comparer opens the door for super-efficient implementations that make use of the static set's properties and/or assumptions. About minimal PHFs: I've yet to do thorough benchmarking of the PHF area (I'm planning to), but my hunch is that having a minimal PHF results in worse performance.
@danmoseley Edit: Looked at the code. Like Stephen said for the frozen collections, a source-generated PHF-based solution would also allow for shallow immutability. If the value can be an index in an array - that's probably best. This is the sort of use case I'd expect could see great benefit from a source-generated PHF set/dictionary. Come to think of it - how does Roslyn do keyword matching? |
@YoniFeng I’m not familiar with that code. Of course any contributions or analysis you may have are welcome. |
Is this on the radar for .NET 8? |
No. It requires exposing the details of the abstraction, and we want some bake time on that before we do. It's something we'll consider for .NET 9. |
@stephentoub are there any work items tracked in this issue that we're planning to deliver in .NET 8? If not we should update the milestone. |
Everything for 8 is done |
Another way to deliver on the premise of improving lookup time at a potential one-time cost of construction would be some kind of mechanism which tracks usage patterns. This could be provided either at construction time (e.g. an array of key lookups) or by enabling some kind of sampling to be run to track lookup keys. The frozen collection would then have a method to re-optimize to speed up lookups based on the usage pattern. Here are some metrics which would be great to know ahead of time for stringly keyed FrozenDictionaries in order to create more optimal and targeted lookups:
I'm sure that some of these ideas aren't suitable or are slow to compute, but equally I'm sure there are others I haven't thought of which could be perfect optimizations. I guess the problem with this suggestion is that I'm not entirely sure how |
After I saw that routing in ASP.NET Core uses a reflection-emitted trie, I also spent some time attempting to implement an alternative to stringly-keyed ASCII |
Is it being considered for .NET 9? |
Unlikely at this point. |
Any chance of considering adding something like:
Perhaps via analogous methods such as those for |
What functionality are you hoping for here? Specifically just that the exposed Items and Key/Value arrays are sorted and that enumerating happens in that order as well? |
@stephentoub exactly. I have a ConcurrentDictionary that I accumulate into and then need to read from ordered by key, as quickly as possible (I'm using this for some UDP traffic re-ordering, the dictionary is a Right now I'm using |
Thanks. In that case, I don't think this warrants or necessitates new types. We could just have the creation of a FrozenDictionary/Set sort the arrays at construction, either using a type's built-in notion of ordering (via IComparable) which could be done without any API additions, or via an extra overload that accepts a Comparer to use for that purpose. |
That would definitely work, thanks for the feedback @stephentoub . |
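Pending such an overload, a sketch of the snapshot-and-sort approach available today (the uint sequence-number key and byte[] payload are assumptions based on the UDP re-ordering description above):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

var buffer = new ConcurrentDictionary<uint, byte[]>();
buffer[3] = new byte[] { 0x03 };
buffer[1] = new byte[] { 0x01 };
buffer[2] = new byte[] { 0x02 };

// Snapshot once, sort once, then read in key order as many times as needed.
KeyValuePair<uint, byte[]>[] ordered = buffer.ToArray();
Array.Sort(ordered, (a, b) => a.Key.CompareTo(b.Key));

foreach (var kvp in ordered)
    Console.WriteLine(kvp.Key); // prints 1, 2, 3
```

A FrozenDictionary overload that sorted at construction would fold the sort into the freeze step and keep the span/indexing access patterns.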
#77799 adds new FrozenDictionary and FrozenSet types. These are in decent shape but not yet ready to ship. Several follow-ups are required:
The initial use cases for these types were for long-lived processes, where overall it's a better use of resources to spend more time at construction time to optimize the data structures and algorithms used for later data access. However, the current construction time is one or more orders of magnitude more than for the standard dictionary/set types. We need to a) invest in driving down those construction costs, and b) consider making the default performance of ToFrozenDictionary/Set as close to that of Dictionary/Set as possible via an additional optional parameter to these methods, e.g.
public static FrozenSet<T> ToFrozenSet<T>(
    this IEnumerable<T> source,
    IEqualityComparer<T>? comparer = null,
+   double constructionOptimization = default // pick a better name
);
When 0.0, this would effectively default to creating a FrozenSet/Dictionary that just wrapped a HashSet/Dictionary, providing the same FrozenSet/Dictionary semantics but primarily just as a wrapper for a HashSet/Dictionary, such that the only additional costs were cloning of the data (assuming the Items/Keys/Values properties remain as they are today, they could be lazily created on first need). On a scale from 0.0 to 1.0, the larger the double value, the more optimization would be performed at construction time, enabling the caller to choose how much time is spent at construction vs how fast subsequent data access can be made, with 1.0 meaning construction would take the longest in exchange for fastest data access. Having the default be fast construction would make this the least surprising for casual use and would avoid falling into pits of failure in the average case; consumers willing to spend more time at construction because they know their processes will be long-lived can then opt in to longer startup times.
The current design has Items/Keys/Values properties that return ImmutableArray<>. This is to enable access to the data as a ReadOnlySpan<>, indexability, access via refs, and very fast enumeration. This does, however, place a requirement on implementations that they either store the keys/values/items separately and in contiguous memory, or that they maintain a second copy of the data that does so. If we'd be willing to give up on the ReadOnlySpan<> and indexing capabilities, these types should instead be changed to some FrozenEnumerable<>-like type that would allow different implementations to vary how the data is stored.
There's currently an optimized implementation for empty collections, but not for other small sizes, e.g. <= 8 elements. We should investigate adding simple implementations that don't do any hashing or other complicated storage for small sizes. This includes looking at various code bases / sources of information to determine the most common sizes to optimize for, e.g. whether it's worth having a dedicated implementation for collections of just 1 or 2 elements.
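The small-size idea can be sketched as follows (hypothetical type; real implementations would live behind the frozen factory and be picked based on element count):

```csharp
using System;
using System.Collections.Generic;

// For a handful of elements, a flat array plus linear scan typically beats
// any hashing scheme: no hash computation, and the whole set fits in a
// cache line or two.
sealed class SmallFrozenSet<T>
{
    private readonly T[] _items;
    private readonly IEqualityComparer<T> _comparer;

    public SmallFrozenSet(IEnumerable<T> items, IEqualityComparer<T>? comparer = null)
    {
        _comparer = comparer ?? EqualityComparer<T>.Default;
        var unique = new List<T>();
        foreach (T item in items)
        {
            bool found = false;
            foreach (T existing in unique)
                if (_comparer.Equals(existing, item)) { found = true; break; }
            if (!found) unique.Add(item);
        }
        _items = unique.ToArray();
    }

    public int Count => _items.Length;

    public bool Contains(T item)
    {
        // O(n) scan; for n <= 8 this is usually cheaper than hashing.
        foreach (T existing in _items)
            if (_comparer.Equals(existing, item))
                return true;
        return false;
    }
}
```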
Would a perfect hashing implementation yield any benefits? Would tries be useful in some situations for strings? Should we consider doing reflection emit code gen in some circumstances?
The factory nature here means that every implementation needs to be kept by a trimmer regardless of whether it's actually used (though a trimmer should, even if not today, be able to get rid of implementations unique to Ts where it can see those Ts are never used). We thus need to be thoughtful about how many custom implementations there are, and potentially have one or more feature switches to exclude one or more that add only incremental value, potentially opting them out by default.
If/when Roslyn source generators support some kind of mechanism that would allow a source generator to look at constant data in the C# file and augment ToFrozenSet/Dictionary call sites in some manner, it would be beneficial if the optimizations could be performed at compile-time, feeding something into an optional parameter to ToFrozenSet/Dictionary to enable them to be fast, with all the work supplied at build time. This would likely also require more work on the API to make the existing types extensible, which is something we've actively avoided today.