Frozen collection follow-ups #77891
Tagging subscribers to this area: @dotnet/area-system-collections |
cc: @geeknoid |
Perfect hashing ends up being not a great choice. Perfect hashing usually involves a more complex hash function, and you'll end up with lower perf than what we have here. Their main benefit would potentially be in smaller data by not having giant sparse lookup tables. Another area worth looking into is to further optimize string-based keys (the overwhelming favorite kind of key). There are a number of comparers in the current implementation to try and fine-tune how comparison works, but more are possible. For example, comparers that know about 2-character sequences explicitly. The motivation here is to reduce the cost of hashing incoming strings to a minimum. Perhaps a broader problem than just for the frozen collections, but there is very poor interop between spans and collections. In particular, for string-based dictionaries, it would be terrific to be able to do a lookup using a ReadOnlySpan as a key in addition to a string. This would avoid needing to materialize strings just to do a dictionary lookup. A perfect use for this is a string interning table for incoming REST/gRPC calls. You could go straight from the incoming buffer and do a lookup in the dictionary to get the final string object. |
Yes, this has long been desired, and we currently lack a great answer. This is tracked by #27229. |
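For what it's worth, this capability later shipped in .NET 9 as alternate lookups. A sketch of the interning-table scenario described above, assuming .NET 9's GetAlternateLookup API (the built-in string comparers support ReadOnlySpan<char> as an alternate key):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

var interned = new Dictionary<string, string>
{
    ["content-type"] = "content-type",
    ["content-length"] = "content-length",
}.ToFrozenDictionary();

// The alternate lookup accepts a ReadOnlySpan<char> sliced straight out of
// an incoming buffer, so no string needs to be materialized for the probe.
var lookup = interned.GetAlternateLookup<ReadOnlySpan<char>>();

ReadOnlySpan<char> incoming = "content-type: text/html";
int colon = incoming.IndexOf(':');
if (lookup.TryGetValue(incoming[..colon], out string? canonical))
    Console.WriteLine(canonical); // prints "content-type"
```

On success, `canonical` is the dictionary's own string instance, which is exactly what an interning table wants to hand back.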
Frozen collection state serialization is another interesting possibility. The idea is to make it possible to serialize the state of a frozen collection to a portable format. A source generator can then generate such serialized blobs to be sucked in by a frozen collection at startup, providing cheap initialization for well-known data. Additionally, said serialized data could actually be passed around in config blobs, so that even state that is config-driven can have the benefit of low startup time. You serialize the state offline and send it through the config system. This could in fact provide opportunities for even more aggressive optimizations, since startup time would not be an issue. |
I intended the "Source generation" part of what I wrote above to cover this. But we'd also need to be really careful here. It'd be fairly easy to accidentally create a vulnerability. |
Does this exclude using perfect hash functions at all? |
That statement was unrelated to perfect hashing. |
I mean that GetHashCode uses seeding today, but a perfect hash would not. So the question is whether we can use one or not. |
It could be an enum like CompressionLevel. That is more user-friendly in a public API. |
Technically could we? Sure. Should we? That would depend on a variety of factors, not the least of which would be demonstrating it was a perf win. And Martin already commented on his experience with their perf in #77891 (comment). |
I couldn't agree more. A simple example: if we have 256 strings, all with different lengths, the perfect hash function can be just the length of the string (plus a byte lookup table). Is it possible to create something faster? Complex perfect hash functions are aimed rather at creating very compact data buffers. If we are not going to store them in the .NET runtime, we don't need to strive to make them extremely small, and hence we can use simpler and faster functions. |
The implementation already does that. |
It would be very helpful if the dictionary had such an indexer:
public ref readonly KeyValuePair<TKey, TValue> this[int entryIndex]
// for the set
public ref readonly T this[int entryIndex]
It would allow storing references to entries as integers and later quickly retrieving them. The idea is to be able to get the indexes of entries and use them to retrieve keys and values. |
What's the scenario for that such that the indexer returning a ref to the value is insufficient? |
You mean this indexer? I have a couple of scenarios.
|
@omariom Does this make sense in the context of a frozen dictionary? Seems like you're using a dictionary to build up unique name/value pairs, and then when you're done you ditch the name part and access your values by index. Do you have scenarios that routinely require having both key-based and index-based access to the same values on an on-going basis? |
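For context, the current design already permits a limited form of index-based access through the Keys/Values ImmutableArray properties; a sketch (the assumption that Keys[i] and Values[i] correspond to the same entry is based on the contiguous-storage design described in the issue body, not a documented guarantee):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;
using System.Collections.Immutable;

var fd = new Dictionary<string, int> { ["a"] = 1, ["b"] = 2 }.ToFrozenDictionary();

// Keys and Values are exposed as ImmutableArrays, so an integer index can
// be stored and later used to retrieve an entry without re-hashing the key.
ImmutableArray<string> keys = fd.Keys;
ImmutableArray<int> values = fd.Values;

for (int i = 0; i < keys.Length; i++)
    Console.WriteLine($"{keys[i]} -> {values[i]}");
```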
@iSazonov For those cases where the input is known statically, a source generator that has a large suite of tricks to generate custom code seems like a great idea. This is not so much focused on the need for perfect hashing, it could be anything that makes the code fast. In the scenarios where I've used frozen collections so far, the set of items in the collections have always come from config or some other dynamic source, so I haven't had the need/desire for a build-time generator. But for parsers which have a well-known set of tokens, there could be value in a highly optimizing source generator. It all depends on the actual gains you could achieve relative to what you already get with the hard-coded frozen collection implementations. |
@stephentoub I have not experimented with purely length-based "hashing", which I think is a fascinating idea. If there is sufficient variance in the length of the keys in the frozen collection, it should be possible to just use the string length as the hash code, completely eliminating the cost of hashing. It'd be trivial to write a new comparer implementation that just uses string length. I bet that could be a considerable win in many cases. |
I spent some time on it, which is where https://github.com/dotnet/runtime/blob/main/src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/LengthBucketsFrozenDictionary.cs came from. There's of course more opportunity to experiment. |
Ah, you're one step ahead. I'm surprised this required a new dictionary type, rather than just a string comparer implementation. |
It didn't require it, but the comparer itself added overhead. It might still be useful to have the comparer with the frozen hashtable as a fallback if it doesn't meet the criteria for this impl. |
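To make the comparer-as-fallback idea concrete, a minimal sketch of such a comparer (hypothetical; this is not the runtime's implementation):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical length-as-hash comparer. Correct for any inputs (equal
// strings always have equal lengths), but collision-heavy unless the key
// set has well-distributed lengths, so it should only be selected after
// inspecting the keys at construction time.
sealed class LengthOnlyComparer : IEqualityComparer<string>
{
    public bool Equals(string? x, string? y) => string.Equals(x, y, StringComparison.Ordinal);
    public int GetHashCode(string s) => s.Length; // no per-character hashing
}

var set = new HashSet<string>(new[] { "a", "bb", "ccc" }, new LengthOnlyComparer());
Console.WriteLine(set.Contains("bb")); // prints True
```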
Here's another idea: If an integer set is created that contains a constrained range of values, it would be more efficient to just have an array of bits to hold the set's state. The idea is to leverage this for sets of enums. FrozenSet can then be implemented using simple array lookups rather than hashing. I just saw an enum-based set in some production code... |
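A minimal sketch of that idea (hypothetical type name; assumes the values fall in a known [min, max] range determined at construction):

```csharp
using System;
using System.Collections.Generic;

// Set over a bounded integer range backed by a bit array, so membership is
// a bounds check plus shift/mask instead of hashing.
sealed class RangeBitSet
{
    private readonly int _min;
    private readonly ulong[] _bits;

    public RangeBitSet(IEnumerable<int> items, int min, int max)
    {
        _min = min;
        _bits = new ulong[((max - min) >> 6) + 1];
        foreach (int item in items)
            _bits[(item - min) >> 6] |= 1UL << ((item - min) & 63);
    }

    public bool Contains(int value)
    {
        // Unsigned subtraction folds both out-of-range directions into one check.
        uint offset = (uint)(value - _min);
        return offset < (uint)(_bits.Length << 6)
            && (_bits[offset >> 6] & (1UL << (int)(offset & 63))) != 0;
    }
}

// Enum members map to a small contiguous range, so a set of enums becomes
// a few machine words:
var weekend = new RangeBitSet(new[] { (int)DayOfWeek.Saturday, (int)DayOfWeek.Sunday }, 0, 6);
Console.WriteLine(weekend.Contains((int)DayOfWeek.Monday)); // prints False
```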
I suggested providing optimized read-only collections for SortedSet and SortedDictionary. I made a separate topic for discussing the API and other issues related to the implementation of these collections, so as not to mix discussions of many different topics in this one issue. |
The Adaptive Radix Tree could be suitable for (ordinal) string keys, especially to implement a sorted set. A trade-off is retaining the actual string keys vs. keeping only their compacted bytes in the tree. |
Come to think of it, the adaptive part is unnecessary for frozen collections. Any radix tree would do. |
Is the scenario of frozen keys, mutable values supported by the new collections? Eg., in this scenario (on Kestrel hot path) a dictionary lookup is basically hand generated (these days maybe Roslyn would generate similar code from a switch statement) and the value may change. https://github.com/dotnet/aspnetcore/blob/41627fe50a7a71e395f2fdfccce9f8cbff10a49a/src/Shared/HttpSys/RequestProcessing/RequestHeaders.Generated.cs#L968 Edit: never mind, it uses fields for the values. But I'm interested in the answer anyway. |
You can't mutate what value the dictionary/set stores for a key. If that value is a reference type, though, or if it contains a reference type, you can of course mutate the referenced instance. In other words, the immutability here is shallow. |
Yeah in this case it's a struct. OK thanks. |
@danmoseley I have definitely seen someone use GetValueRefOrNullRef() and mutate a struct value in a frozen dictionary somewhere… |
That returns a ref readonly. Using a ref readonly to erroneously mutate the collection requires unsafe code. With unsafe code you can violate almost any kind of constraint, including immutability. |
Ahhh, so that was the trick. Indeed, they used a method from the |
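A quick illustration of the shallow immutability discussed above (requires .NET 8's frozen collections):

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

var fd = new Dictionary<string, List<int>> { ["k"] = new List<int> { 1 } }
    .ToFrozenDictionary();

// The key -> value mapping is fixed, but the List<int> the value refers to
// is an ordinary mutable object: the immutability is shallow.
fd["k"].Add(2);
Console.WriteLine(fd["k"].Count); // prints 2
```

Had the value been a struct, no safe code could have changed what the dictionary returns for "k".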
There is absolutely no requirement for perfect hashing to be complex. Take a look at https://gist.github.com/mjsabby/d7baf64d524176ee35f69ffa2c278179 and it outperforms the current .NET 7 dictionary. It implements the hash-displace algorithm. I would implement frozen sets and dictionaries with perfect hashing.
[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public static int Lookup(ReadOnlySpan<byte> key, ReadOnlySpan<int> values, ReadOnlySpan<int> seeds)
{
    // Hash64 is the 64-bit hash function defined in the linked gist.
    ulong hash;
    unsafe
    {
        fixed (byte* ptr = &MemoryMarshal.GetReference(key))
        {
            hash = Hash64(ptr, key.Length);
        }
    }

    // First-level bucket: the table size is a power of two, so a mask
    // replaces a modulo.
    var size = (ulong)values.Length;
    var index = hash & (size - 1);
    var seed = seeds[(int)index];

    // A negative seed encodes a direct slot index for a single-entry bucket.
    if (seed < 0)
    {
        return values[0 - seed - 1];
    }

    // Otherwise mix the seed into the hash (xorshift-style) to displace the
    // entry into its collision-free slot.
    var x = (ulong)seed + hash;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    index = (x * 0x2545F4914F6CDD1D) & (size - 1);
    return values[(int)index];
} |
@geeknoid
A source generator for static sets could be fantastic. It can make top-of-the-line lookup perf easily accessible. Being able to customize the underlying hash function and key comparer opens the door for super-efficient implementations that make use of the static set's properties and/or assumptions. About minimal PHFs: I've yet to do thorough benchmarking of the PHF area (I'm planning to), but my hunch is that having a minimal PHF results in worse performance.
@danmoseley Edit: Looked at the code. Like Stephen said for the frozen collections, a source-generated PHF-based solution would also allow for shallow immutability. If the value can be an index in an array - that's probably best. This is the sort of use case I'd expect could see great benefit from a source-generated PHF set/dictionary. Come to think of it - how does Roslyn do keyword matching? |
@YoniFeng I’m not familiar with that code. Of course any contributions or analysis you may have are welcome. |
Is this on the radar for .NET 8? |
No. It requires exposing the details of the abstraction, and we want some bake time on that before we do. It's something we'll consider for .NET 9. |
@stephentoub are there any work items tracked in this issue that we're planning to deliver in .NET 8? If not we should update the milestone. |
Everything for 8 is done |
Another way to deliver on the premise of improving lookup time at a potential one-time cost of construction would be some kind of mechanism which tracks usage patterns. This could be provided either at construction time (e.g. an array of key lookups) or by enabling some kind of sampling to be run to track lookup keys. The frozen collection would then have a method to re-optimize to speed up lookups based on the usage pattern. Here are some metrics which would be great to know ahead of time for stringly keyed FrozenDictionaries in order to create more optimal and targeted lookups:
I'm sure that some of these ideas aren't suitable or are slow to compute, but equally I'm sure there are others I haven't thought of which could be perfect optimizations. I guess the problem with this suggestion is that I'm not entirely sure how |
After I saw that routing in ASP.NET Core uses a reflection-emitted trie, I also spent some time attempting to implement an alternative to stringly-keyed ASCII |
Is it being considered for .NET 9? |
Unlikely at this point. |
Any chance of considering adding something like:
Perhaps via analogous methods such as those for |
What functionality are you hoping for here? Specifically just that the exposed Items and Key/Value arrays are sorted and that enumerating happens in that order as well? |
@stephentoub exactly. I have a ConcurrentDictionary that I accumulate into and then need to read from ordered by key, as quickly as possible (I'm using this for some UDP traffic re-ordering, the dictionary is a Right now I'm using |
Thanks. In that case, I don't think this warrants or necessitates new types. We could just have the creation of a FrozenDictionary/Set sort the arrays at construction, either using a type's built-in notion of ordering (via IComparable) which could be done without any API additions, or via an extra overload that accepts a Comparer to use for that purpose. |
That would definitely work, thanks for the feedback @stephentoub . |
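Pending such an overload, a sketch of the snapshot-and-sort approach available today (the uint sequence-number key and byte[] payload are assumptions based on the UDP re-ordering description above):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

var buffer = new ConcurrentDictionary<uint, byte[]>();
buffer[3] = new byte[] { 0x03 };
buffer[1] = new byte[] { 0x01 };
buffer[2] = new byte[] { 0x02 };

// Snapshot once, sort once, then read in key order as many times as needed.
KeyValuePair<uint, byte[]>[] ordered = buffer.ToArray();
Array.Sort(ordered, (a, b) => a.Key.CompareTo(b.Key));

foreach (var kvp in ordered)
    Console.WriteLine(kvp.Key); // prints 1, 2, 3
```

A FrozenDictionary overload that sorted at construction would fold the sort into the freeze step and keep the span/indexing access patterns.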
#77799 adds new FrozenDictionary and FrozenSet types. These are in decent shape but not yet ready to ship. Several follow-ups are required:
The initial use cases for these types were for long-lived processes, where overall it's a better use of resources to spend more time at construction time to optimize the data structures and algorithms used for later data access. However, the current construction time is one or more orders of magnitude more than for the standard dictionary/set types. We need to a) invest in driving down those construction costs, and b) consider making the default performance of ToFrozenDictionary/Set as close to that of Dictionary/Set as possible via an additional optional parameter to these methods, e.g.
public static FrozenSet<T> ToFrozenSet<T>(
    this IEnumerable<T> source,
    IEqualityComparer<T>? comparer = null,
+   double constructionOptimization = default // pick a better name
);
When 0.0, this would effectively default to creating a FrozenSet/Dictionary that just wrapped a HashSet/Dictionary, providing the same FrozenSet/Dictionary semantics but primarily just as a wrapper for a HashSet/Dictionary, such that the only additional costs were cloning of the data (assuming the Items/Keys/Values properties remain as they are today, they could be lazily created on first need). On a scale from 0.0 to 1.0, the larger the double value, the more optimization would be performed at construction time, enabling the caller to choose how much time is spent at construction vs how fast subsequent data access can be made, with 1.0 meaning construction would take the longest in exchange for fastest data access. Having the default be fast construction would make this the least surprising for casual use and would avoid falling into pits of failure in the average case; consumers willing to spend more time at construction because they know their processes will be long-lived can then opt in to longer startup times.
The current design has Items/Keys/Values properties that return ImmutableArray<>. This is to enable access to the data as a ReadOnlySpan<>, indexability, access via refs, and very fast enumeration. This does, however, place a requirement on implementations that they either store the keys/values/items separately and in contiguous memory, or that they maintain a second copy of the data that does so. If we'd be willing to give up on the ReadOnlySpan<> and indexing capabilities, these types should instead be changed to some FrozenEnumerable<>-like type that would allow different implementations to vary how the data is stored.
There's currently an optimized implementation for empty collections, but not for other small sizes, e.g. <= 8 elements. We should investigate adding simple implementations that don't do any hashing or other complicated storage for small sizes. This includes looking at various code bases / sources of information to determine the most common sizes to optimize for, e.g. whether it's worth having a dedicated implementation for collections of just 1 or 2 elements.
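The small-size idea can be sketched as follows (hypothetical type; real implementations would live behind the frozen factory and be picked based on element count):

```csharp
using System;
using System.Collections.Generic;

// For a handful of elements, a flat array plus linear scan typically beats
// any hashing scheme: no hash computation, and the whole set fits in a
// cache line or two.
sealed class SmallFrozenSet<T>
{
    private readonly T[] _items;
    private readonly IEqualityComparer<T> _comparer;

    public SmallFrozenSet(IEnumerable<T> items, IEqualityComparer<T>? comparer = null)
    {
        _comparer = comparer ?? EqualityComparer<T>.Default;
        var unique = new List<T>();
        foreach (T item in items)
        {
            bool found = false;
            foreach (T existing in unique)
                if (_comparer.Equals(existing, item)) { found = true; break; }
            if (!found) unique.Add(item);
        }
        _items = unique.ToArray();
    }

    public int Count => _items.Length;

    public bool Contains(T item)
    {
        // O(n) scan; for n <= 8 this is usually cheaper than hashing.
        foreach (T existing in _items)
            if (_comparer.Equals(existing, item))
                return true;
        return false;
    }
}
```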
Would a perfect hashing implementation yield any benefits? Would tries be useful in some situations for strings? Should we consider doing reflection emit code gen in some circumstances?
The factory nature here means that every implementation needs to be kept by a trimmer regardless of whether it's actually used (though a trimmer should, even if not today, be able to get rid of implementations unique to Ts where it can see those Ts are never used). We thus need to be thoughtful about how many custom implementations there are, and potentially have one or more feature switches to exclude one or more that add only incremental value, potentially opting them out by default.
If/when Roslyn source generators support some kind of mechanism that would allow a source generator to look at constant data in the C# file and augment ToFrozenSet/Dictionary call sites in some manner, it would be beneficial if the optimizations could be performed at compile-time, feeding something into an optional parameter to ToFrozenSet/Dictionary to enable them to be fast, with all the work supplied at build time. This would likely also require more work on the API to make the existing types extensible, which is something we've actively avoided today.