Add a generic OrderedDictionary class #24826
Comments
📝 Edit: (And removing my related comments that followed) I was not aware of the semantics of the non-generic |
@sharwell |
Related to dotnet/corefx#26638 |
Would the implementation optimize for retrieval (presumably array + hashtable) or for space (presumably a tree/heap)? Are there scenarios for both? What complexity are you hoping for on lookup, Add, Remove, and ContainsKey (and ContainsValue, if we have that)? [snipped example table in favor of @TylerBrinkley's below] |
Thanks for putting that chart together. The implementation will be nearly identical to Below is the chart filled out
|
💭 I almost prefer the indexer for this type be based on the "bias" implied by the type name. To me, |
If we only have one I'd agree that the |
FYI, if this gets approved I'd happily implement it. |
I'd also be interested in seeing generic variant of Thank you in advance for considering this. |
Hello, I too would like to see a generic version of this, and have it moved to the and so on.. So it would be nice to have this type implemented in the framework. |
I still think it would be beneficial to have an Immutable Parent/Child Collection, but I understand it is not specifically related to this thread. |
Just had need for this today. Any news if/when this will be implemented? |
Moved to corefxlab#2456 as part of the specialized collections initiative. |
Reopening in place of #29570. For context this has already been included in corefxlab via dotnet/corefxlab#2525 |
Bump |
Any news? |
We have no plans on adding such a type in our immediate roadmap. We will post an update on this thread as soon as anything changes. |
This has been around in various forms and other issues for years now. Is there really no priority in closing a glaring gap in Microsoft Collections? Especially given that it has been done time and time again, and with a general messiness in its placement: moved from one repo to another, without a stable NuGet package available. I dislike having to resort to third-party implementations for stuff like this. |
Also consider adding these methods:
void Sort()
void Sort(IComparer<TKey> Comparer)
void SortedInsert(TKey key, TValue value)
void SortedInsert(TKey key, TValue value, IComparer<TKey> Comparer) |
Came here to say I would like this too. |
Why are Remove(key) and RemoveAt(index) O(n) for OrderedDictionary? The implementation should keep the index as an (int, TValue) pair in the value of the key itself. |
Different solutions are possible, and remove is the hard part of ordered dictionaries.
Different decisions make different presumptions about how something's being used and/or what the user cares about. |
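To make the trade-off in the reply above concrete, here is a minimal sketch of one common design: a list that holds the entries in order, plus a dictionary that maps each key to its list index. The type name `IndexedOrderedMap` and its members are hypothetical and are not the proposed or shipped API. Storing the index alongside the value does not avoid the O(n) cost, because removing an entry changes the index of every entry after it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch only; not the proposed or shipped OrderedDictionary.
public sealed class IndexedOrderedMap<TKey, TValue> where TKey : notnull
{
    // Entries in insertion order; the list position is the element's index.
    private readonly List<KeyValuePair<TKey, TValue>> _entries =
        new List<KeyValuePair<TKey, TValue>>();

    // Maps each key to its current position in _entries.
    private readonly Dictionary<TKey, int> _indexOf = new Dictionary<TKey, int>();

    public int Count => _entries.Count;

    public void Add(TKey key, TValue value)
    {
        _indexOf.Add(key, _entries.Count); // throws on a duplicate key
        _entries.Add(new KeyValuePair<TKey, TValue>(key, value));
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        if (_indexOf.TryGetValue(key, out int index))
        {
            value = _entries[index].Value;
            return true;
        }
        value = default!;
        return false;
    }

    public KeyValuePair<TKey, TValue> GetAt(int index) => _entries[index];

    public bool Remove(TKey key)
    {
        if (!_indexOf.Remove(key, out int index)) return false;

        // O(n): the list shifts every later entry down by one slot.
        _entries.RemoveAt(index);

        // O(n): every later entry's stored index is now stale and must be fixed.
        for (int i = index; i < _entries.Count; i++)
        {
            _indexOf[_entries[i].Key] = i;
        }
        return true;
    }
}
```

Lookup by key and by index both stay O(1) in this layout; removal pays for that by touching every entry after the removed one.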
This is just one possible implementation (not a particularly smart one) of such a need, without detailed checking, sync for updates, sync of capacity (size of dict + list), and much more. The concept could also be optimized a lot to use fewer than 3 objects (self and 2 collections). Short objective: an ordered dictionary (a dictionary plus ordered elements used in foreach, based on insertion order).
Obs: I do not see a problem with iteration.
And of course, this can easily be adjusted to implement IReadOnlyCollection<KeyValuePair<TKey,TValue>> |
|
From .NET documentation:
Also:
Lots of individually allocated objects. |
As I said, this is just a possible idea, do not take it 1:1, I am aware that a linked node is allocated for every entry. |
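The code from the comment above did not survive in this copy of the thread, so the following is only a guess at the general shape being discussed: a dictionary whose values are linked-list nodes, with the linked list holding insertion order. The type name `LinkedOrderedMap` and its members are hypothetical. The sketch shows both sides of the exchange: removal becomes O(1), but every entry costs a separately allocated node and there is no O(1) access by index.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch only; not the commenter's code and not the proposed API.
public sealed class LinkedOrderedMap<TKey, TValue> : IEnumerable<KeyValuePair<TKey, TValue>>
    where TKey : notnull
{
    // Holds the entries in insertion order.
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order =
        new LinkedList<KeyValuePair<TKey, TValue>>();

    // Maps each key to its node so the node can be unlinked without a search.
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _nodes =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();

    public int Count => _nodes.Count;

    public void Add(TKey key, TValue value)
    {
        // One heap allocation per entry: the drawback raised in the thread.
        var node = new LinkedListNode<KeyValuePair<TKey, TValue>>(
            new KeyValuePair<TKey, TValue>(key, value));
        _nodes.Add(key, node); // throws on a duplicate key before linking
        _order.AddLast(node);
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        if (_nodes.TryGetValue(key, out var node))
        {
            value = node.Value.Value;
            return true;
        }
        value = default!;
        return false;
    }

    public bool Remove(TKey key)
    {
        if (!_nodes.Remove(key, out var node)) return false;
        _order.Remove(node); // O(1): unlinks the node, nothing is shifted
        return true;
    }

    // Enumerates in insertion order.
    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator() => _order.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```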
Interesting that only 73 issues have been tagged with the wishlist label (and just 36 if you also include api-suggestion). I'd have expected such a small number of features to get a bit more attention. |
I too am in need of this. I need to be able to serialize an OrderedDictionary. Currently we're using the non-generic version and it requires every value object to include the type information. If there was a generic OrderedDictionary<> the type information would only be serialized once with the dictionary object type. |
I am actually genuinely curious what it would take to move this forward. It's entirely unclear to me what the criteria might be, given these observations:
Evidently there is real-world demand here. It is not a niche ask. It seems like the only real blocker is getting this through API review? Why can that not happen? I would just like to get some insight into the decision-making process here, because all available indicators leave me very confused. |
You're right to point out that the primary bottleneck is getting the proposal lined up for API review. This does require some prep-work, including evaluating prototypes and ensuring that the API shape is on par with other designs that have already been shipped. Not all API proposals are created equal from a complexity standpoint however, and championing brand new collection types can be a long-running and expensive process. It's unlikely we could get around to such a proposal unless it registers high in our prioritization. |
Thanks for the reply! I totally understand and agree with most of what you bring up, save for this point:
This is the part I was really getting at. What would make it register high in your prioritization? As I mentioned, it's unclear to me how e.g. frozen collections could have registered higher going by the indicators I posted. (I don't say that to knock them, by the way - I use them myself! - they're just an example to illustrate the point.) If the answer is "we simply decided we really cared about maximizing read performance for collections in that release cycle" then, hey, fair enough. Like I said, I'm just looking for some insight into the process here. I think a lot of folks have the impression that prioritization of API proposals is to a large extent driven by the volume of demand. If that isn't the case, I think it would be good to just clarify what the different factors are, and maybe their relative importance. |
While our planning does take upvotes into consideration, it is not the only driving factor. In the interest of transparency, frozen collections were added because they were a first-party team requirement at the time. There are other factors as well: as you mention an implementation is already available via a NuGet package which plays a role as well. Not everything needs to be part of the BCL, or at least it doesn't urgently need to be part of the BCL. |
It goes without saying that our resources aren't infinite and our backlog is substantial. Oftentimes we might not invest in collections at all in a particular release cycle, simply because the team is pursuing different opportunities. |
To this particular point:
This is true of course, but with some caveats: Microsoft.Experimental.Collections is pre-release, so you will get NU5104 if you use package validation. There's also the fact that, with corefxlab archived, the package is deprecated and unmaintained, and even finding the source code requires a fair bit of digging. To be fair, nothing stops anyone from taking that code and publishing a new package. But I suspect part of the problem here is that, rightly or wrongly, .NET doesn't really have a culture of publishing small utility packages that are narrowly focused on one specific thing, as you'd see in e.g. Node.js and Rust. And on top of that, maintainers of libraries don't seem to like taking dependencies on such small utility packages. So, many who aren't using Microsoft.Experimental.Collections just end up copying an implementation into their project. I think several of the links I provided earlier at least partially substantiate this line of thinking. |
I wonder why this is. But that is off topic .. |
We should just do this. As has been noted, there are a plethora of implementations floating around, including very close to home in System.IO.Packaging, EF Core, WCF, MAUI, and WPF, and, as also noted, there are a multitude of implementations in a myriad of other projects. We can do it once in the core libraries and avoid all that duplication, for something where we already have a non-generic implementation and just need a generic one. We can also start with a more minimal surface area and add to it in the future if we're missing anything. Some notes on the original proposal:
I've updated the top proposal and marked it ready for review. |
When |
The ambiguity is that there are then two overloads with the exact same arguments that do two completely different things. For example, this will successfully augment a histogram:
public static void AddToHistogram(OrderedDictionary<string, int> counts, IEnumerable<string> source)
{
    foreach (var item in source) counts[item] = counts.TryGetValue(item, out int count) ? count + 1 : 1;
}
but this, with the exact same method body, will likely either blow up or produce meaningless results:
public static void AddToHistogram(OrderedDictionary<int, int> counts, IEnumerable<int> source)
{
    foreach (var item in source) counts[item] = counts.TryGetValue(item, out int count) ? count + 1 : 1;
}
 |
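The overload behaviour behind that example can be reproduced in isolation. The sketch below uses a hypothetical `TwoIndexers<TKey>` type (not part of any proposal) that declares both a key indexer and an int indexer and reports which one the compiler picked. With a string key the two never collide; with an int key both indexers accept the same argument, and C# overload resolution prefers the one whose declared parameter type is not a type parameter.

```csharp
// Hypothetical type used only to show which indexer overload is selected.
public sealed class TwoIndexers<TKey>
{
    // Key-based access, as a dictionary would offer.
    public string this[TKey key] => "key indexer";

    // Position-based access, as a list would offer.
    public string this[int index] => "index indexer";
}

public static class IndexerDemo
{
    // TKey is string: only the key indexer accepts a string argument.
    public static string WithStringKey() => new TwoIndexers<string>()["a"];

    // TKey is int: both indexers accept an int, and the compiler binds to
    // this[int index] because int is more specific than the type parameter.
    public static string WithIntKey() => new TwoIndexers<int>()[5];
}
```

`IndexerDemo.WithStringKey()` returns "key indexer" and `IndexerDemo.WithIntKey()` returns "index indexer", which is why the second AddToHistogram above would silently treat each item as a position rather than a key.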
Thanks, yeah I agree it would likely cause issues for some users and using the |
EDITED on 4/10/2024 by @stephentoub to update proposal
Oftentimes I've come across places when needing a Dictionary where the insertion order of the elements is important to me. Unfortunately, .NET does not currently have a generic OrderedDictionary class. We've had a non-generic OrderedDictionary class since .NET Framework 2.0, which oddly enough was when generics were added, but no generic equivalent. This has forced many to roll their own solution, typically by using a combination of a List and Dictionary field, resulting in the worst of both worlds in terms of performance and in larger memory usage. Even worse, sometimes users instead rely on implementation details of Dictionary for ordering, which is quite dangerous.
Proposed API
Perhaps one of the reasons there was no generic OrderedDictionary added initially was due to issues with having both a key and index indexer when the key is an int. A call to the indexer would be ambiguous. Roslyn prefers the non-generic parameter, so in this case the index indexer will be called.
API Details
Insert allows index to be equal to Count to insert the element at the end.
SetAt(int index, TValue value) requires index to be less than Count, but SetAt(int index, TKey key, TValue value) allows index to be equal to Count, similar to Insert.
… Dictionary for all operations except Remove, which will necessarily be O(n). Insert and RemoveAt, which aren't members of Dictionary, will also be O(n).
Open Questions
… System.Collections.Generic when it could easily be System.Collections.Specialized, where the non-generic version is located? I just felt this collection is far too useful to be relegated to that namespace.
… ICollection, IList, and IOrderedDictionary be implemented?
Updates
… IEnumerable<KeyValuePair<TKey, TValue>>.
… ContainsValue method due to being needed for the ValueCollection.Contains method.