Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Howto implement HashSet.GetOrAdd(T) extension method in .NET 6 or .NET 7 using CollectionsMarshal #82875

Closed
bairog opened this issue Mar 2, 2023 · 5 comments

Comments

@bairog
Copy link

bairog commented Mar 2, 2023

Hello.
On dotnet runtime Github repo there was a proposal for adding HashSet.GetOrAdd(T) - similar to

public static T GetOrAdd<T>(this HashSet<T> hashSet, T equalValue)
{
    if (hashSet.TryGetValue(equalValue, out var actualValue))
        return actualValue;

    hashSet.Add(equalValue);
    return equalValue;
}

but without a duplicate hash lookup (performance impact). That issue was closed with the following resolution:

Given we're going down CollectionsMarshal route for #15059, we should probably consider a similar approach for HashSet

In issue #15059 the last comment has an example of CollectionsMarshal usage for a Dictionary:

public static TValue GetOrAdd<TKey, TValue>(this Dictionary<TKey, TValue> dictionary, TKey key, Func<TKey, TValue> valueFactory)
where TKey : notnull
{
    if (dictionary == null)
        throw new ArgumentNullException(nameof(dictionary));

    if (valueFactory == null)
        throw new ArgumentNullException(nameof(valueFactory));            

    ref TValue? value = ref CollectionsMarshal.GetValueRefOrAddDefault(dictionary, key, out bool exists);
    if (!exists)
        value = valueFactory(key);

    return value!;
}

I've tried to use the same approach to implement HashSet.GetOrAdd(T) extention but failed - I don't have any idea how to get key value for CollectionsMarshal.GetValueRefOrAddDefault() method.
Both issues are locked as resolved and limited to collaborators - so I cannot ask directly there.
Help me please. Thank you.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 2, 2023
@ghost
Copy link

ghost commented Mar 2, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

Hello.
On dotnet runtime Github repo there was a proposal for adding HashSet.GetOrAdd(T) - similar to

public static T GetOrAdd<T>(this HashSet<T> hashSet, T equalValue)
{
    if (hashSet.TryGetValue(equalValue, out var actualValue))
        return actualValue;

    hashSet.Add(equalValue);
    return equalValue;
}

but without a duplicate hash lookup (performance impact). That issue was closed with the following resolution:

Given we're going down CollectionsMarshal route for #15059, we should probably consider a similar approach for HashSet

In issue #15059 the last comment has an example of CollectionsMarshal usage for a Dictionary:

public static TValue GetOrAdd<TKey, TValue>(this Dictionary<TKey, TValue> dictionary, TKey key, Func<TKey, TValue> valueFactory)
where TKey : notnull
{
    if (dictionary == null)
        throw new ArgumentNullException(nameof(dictionary));

    if (valueFactory == null)
        throw new ArgumentNullException(nameof(valueFactory));            

    ref TValue? value = ref CollectionsMarshal.GetValueRefOrAddDefault(dictionary, key, out bool exists);
    if (!exists)
        value = valueFactory(key);

    return value!;
}

I've tried to use the same approach to implement HashSet.GetOrAdd(T) extention but failed - I don't have any idea how to get key value for CollectionsMarshal.GetValueRefOrAddDefault() method.
Both issues are locked as resolved and limited to collaborators - so I cannot ask directly there.
Help me please. Thank you.

Author: bairog
Assignees: -
Labels:

area-System.Collections

Milestone: -

@bairog bairog changed the title HashSet.GetOrAdd(T) extention method Implement HashSet.GetOrAdd(T) extension method in .NET 6 or .NET 7 using CollectionsMarshal Mar 2, 2023
@bairog bairog changed the title Implement HashSet.GetOrAdd(T) extension method in .NET 6 or .NET 7 using CollectionsMarshal Howto implement HashSet.GetOrAdd(T) extension method in .NET 6 or .NET 7 using CollectionsMarshal Mar 2, 2023
@theodorzoulias
Copy link
Contributor

theodorzoulias commented Mar 2, 2023

Adding CollectionsMarshal support for the HashSet<T> doesn't make much sense. So maybe the original proposal about a HashSet<T>.GetOrAdd method should be reopened?

@eiriktsarpalis
Copy link
Member

So maybe the original proposal about a HashSet<T>.GetOrAdd method should be reopened?

How common is it that you need to have access to the precise instance stored by the HashSet? In most applications I can think of, if hashSet.Add(value) == false then for all intents and purposes I can just use value in lieu of whatever equal instance is stored in the set.

We generally try to avoid adding new instance methods to very common generic types like HashSet<T> as it disproportionately impacts the static footprint of an application. We might make an exception for high-impact methods but frankly that doesn't seem to be one.

@theodorzoulias
Copy link
Contributor

@eiriktsarpalis I think that it is pretty rare. It is similar to searching a Dictionary<K,V> for a key, and wanting to get not only the value but also the actual key that you are searching for. AFAIK you can't even do this with a dictionary, so why should you be able to do it with a hashset?

On the other hand the original proposal is closed with a comment that gives the impression that the CollectionsMarshal has an answer to this request, or that it could have an answer in the future. Which apparently is not the case. The GetOrAdd feature can only be added directly in the HashSet<T>, or not at all. Is it possible to edit that last comment, so that it reflects the real reason that the proposal was rejected? (i.e. to avoid impacting the static footprint)

@eiriktsarpalis
Copy link
Member

Added a comment in the original issue for posterity. To answer the original question in this issue, I would recommend just using Add if obtaining the original instance is not essential, alternatively you could try creating an extension method that performs double lookup.

@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Mar 2, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Apr 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants