Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a variety of issues in FrozenCollections #81194

Merged
merged 1 commit into from Jan 26, 2023

Conversation

stephentoub
Copy link
Member

@stephentoub stephentoub commented Jan 25, 2023

  • Add the new ToFrozenDictionary/Set methods that take optimizedForReading and make the existing overloads cheaper for construction. This makes construction time and allocation by default much closer to that of a normal dictionary or set while also enabling more of the implementation to be trimmed away by default.
  • Fixed the use of FrozenSet/Dictionary.Empty such that we only use the singleton when its comparer matches; otherwise, we could end up handing back an instance with the wrong comparer.
  • Avoided passing both a dictionary/set and a comparer down through constructors, as the dictionary/set contains the comparer and we don't want to accidentally end up in a situation where we pass through the wrong one.
  • Condensed some of the construction code to be a bit less verbose
  • Removed unused locals from ToFrozenDictionary (the min/max/range were no longer being used)
  • Remove unnecessary array allocations from PickIntegerDictionary
  • Delete the sparse range integer sets, which are rarely needed and are broken for TryGetValue
  • Fix CheckUniqueAndUnfoundElements to slice down to only the relevant size prior to clearing

Performance-wise, the change when not specifying optimizeForReading has the desired effect (make construction a lot faster in exchange for making read throughput somewhat slower). For example, with this data (which is the list of case statements from one of our switches in System.Formats.Asn1):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Collections.Frozen;
using System.Collections.Generic;
using System.Linq;

[MemoryDiagnoser(false)]
public partial class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);

    private Dictionary<string, string> _source =
        // from https://github.com/dotnet/runtime/blob/a30de6d40f69ef612b514344a5ec83fffd10b957/src/libraries/System.Formats.Asn1/src/System/Formats/Asn1/WellKnownOids.cs#L317-L419
        new[]
        {
            "1.2.840.10040.4.1", "1.2.840.10040.4.3", "1.2.840.10045.2.1", "1.2.840.10045.1.1", "1.2.840.10045.1.2", "1.2.840.10045.3.1.7", "1.2.840.10045.4.1", "1.2.840.10045.4.3.2", "1.2.840.10045.4.3.3", "1.2.840.10045.4.3.4",
            "1.2.840.113549.1.1.1", "1.2.840.113549.1.1.5", "1.2.840.113549.1.1.7", "1.2.840.113549.1.1.8", "1.2.840.113549.1.1.9", "1.2.840.113549.1.1.10", "1.2.840.113549.1.1.11", "1.2.840.113549.1.1.12", "1.2.840.113549.1.1.13",
            "1.2.840.113549.1.5.3", "1.2.840.113549.1.5.10", "1.2.840.113549.1.5.11", "1.2.840.113549.1.5.12", "1.2.840.113549.1.5.13", "1.2.840.113549.1.7.1", "1.2.840.113549.1.7.2", "1.2.840.113549.1.7.3", "1.2.840.113549.1.7.6",
            "1.2.840.113549.1.9.1", "1.2.840.113549.1.9.3", "1.2.840.113549.1.9.4", "1.2.840.113549.1.9.5", "1.2.840.113549.1.9.6", "1.2.840.113549.1.9.7", "1.2.840.113549.1.9.14", "1.2.840.113549.1.9.15", "1.2.840.113549.1.9.16.1.4",
            "1.2.840.113549.1.9.16.2.12", "1.2.840.113549.1.9.16.2.14", "1.2.840.113549.1.9.16.2.47", "1.2.840.113549.1.9.20", "1.2.840.113549.1.9.21", "1.2.840.113549.1.9.22.1", "1.2.840.113549.1.12.1.3", "1.2.840.113549.1.12.1.5",
            "1.2.840.113549.1.12.1.6", "1.2.840.113549.1.12.10.1.1", "1.2.840.113549.1.12.10.1.2", "1.2.840.113549.1.12.10.1.3", "1.2.840.113549.1.12.10.1.5", "1.2.840.113549.1.12.10.1.6", "1.2.840.113549.2.5", "1.2.840.113549.2.7",
            "1.2.840.113549.2.9", "1.2.840.113549.2.10", "1.2.840.113549.2.11", "1.2.840.113549.3.2", "1.2.840.113549.3.7", "1.3.6.1.4.1.311.17.1", "1.3.6.1.4.1.311.17.3.20", "1.3.6.1.4.1.311.20.2.3", "1.3.6.1.4.1.311.88.2.1",
            "1.3.6.1.4.1.311.88.2.2", "1.3.6.1.5.5.7.3.1", "1.3.6.1.5.5.7.3.2", "1.3.6.1.5.5.7.3.3", "1.3.6.1.5.5.7.3.4", "1.3.6.1.5.5.7.3.8", "1.3.6.1.5.5.7.3.9", "1.3.6.1.5.5.7.6.2", "1.3.6.1.5.5.7.48.1", "1.3.6.1.5.5.7.48.1.2",
            "1.3.6.1.5.5.7.48.2", "1.3.14.3.2.26", "1.3.14.3.2.7", "1.3.132.0.34", "1.3.132.0.35", "2.5.4.3", "2.5.4.5", "2.5.4.6", "2.5.4.7", "2.5.4.8", "2.5.4.10", "2.5.4.11", "2.5.4.97", "2.5.29.14", "2.5.29.15", "2.5.29.17", "2.5.29.19",
            "2.5.29.20", "2.5.29.35", "2.16.840.1.101.3.4.1.2", "2.16.840.1.101.3.4.1.22", "2.16.840.1.101.3.4.1.42", "2.16.840.1.101.3.4.2.1", "2.16.840.1.101.3.4.2.2", "2.16.840.1.101.3.4.2.3", "2.23.140.1.2.1", "2.23.140.1.2.2",
        }.ToDictionary(s => s, s => s);
    private FrozenDictionary<string, string> _fd;

    [GlobalSetup]
    public void Setup() => _fd = _source.ToFrozenDictionary();

    [Benchmark]
    public FrozenDictionary<string, string> Create() => FrozenDictionary.ToFrozenDictionary(_source);

    [Benchmark]
    public int ContainsKey()
    {
        int count = 0;
        foreach (string key in _fd.Keys)
        {
            if (_fd.ContainsKey(key))
            {
                count++;
            }
        }
        return count;
    }
}

it results in this:

Method Toolchain Mean Error StdDev Ratio Allocated Alloc Ratio
Create \main\corerun.exe 111,525.4 ns 2,065.66 ns 2,295.98 ns 1.000 39960 B 1.00
Create \pr\corerun.exe 938.4 ns 7.10 ns 6.30 ns 0.008 3176 B 0.08
ContainsKey \main\corerun.exe 996.2 ns 6.69 ns 6.26 ns 1.00 - NA
ContainsKey \pr\corerun.exe 1,301.5 ns 11.09 ns 9.26 ns 1.31 - NA

This shows both that there is a measurable read improvement achieved by specifying optimizeForReading here (since main is implicitly that) as well as it's now much cheaper to create one of these by default, and it would take tens of thousands of lookups to break even overall.

I also augmented the tests to improve code coverage. Before:

image

And after:

image

Fixes #81088
Contributes to #77891

@ghost
Copy link

ghost commented Jan 25, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details
  • Add the new ToFrozenDictionary/Set methods that take optimizedForReading and make the existing overloads cheaper for construction. This makes construction time and allocation by default much closer to that of a normal dictionary or set while also enabling more of the implementation to be trimmed away by default.
  • Fixed the use of FrozenSet/Dictionary.Empty such that we only use the singleton when its comparer matches; otherwise, we could end up handing back an instance with the wrong comparer.
  • Avoided passing both a dictionary/set and a comparer down through constructors, as the dictionary/set contains the comparer and we don't want to accidentally end up in a situation where we pass through the wrong one.
  • Condensed some of the construction code to be a bit less verbose
  • Removed unused locals from ToFrozenDictionary (the min/max/range were no longer being used)
  • Remove unnecessary array allocations from PickIntegerDictionary
  • Delete the sparse range integer sets, which are rarely needed and are broken for TryGetValue
  • Fix CheckUniqueAndUnfoundElements to slice down to only the relevant size prior to clearing

Performance-wise, the change when not specifying optimizeForReading has the desired effect (make construction a lot faster in exchange for making read throughput somewhat slower). For example, with this data (which is the list of case statements from one of our switches in System.Formats.Asn1):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Collections.Frozen;
using System.Collections.Generic;
using System.Linq;

[MemoryDiagnoser(false)]
public partial class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);

    private Dictionary<string, string> _source =
        // from https://github.com/dotnet/runtime/blob/a30de6d40f69ef612b514344a5ec83fffd10b957/src/libraries/System.Formats.Asn1/src/System/Formats/Asn1/WellKnownOids.cs#L317-L419
        new[]
        {
            "1.2.840.10040.4.1", "1.2.840.10040.4.3", "1.2.840.10045.2.1", "1.2.840.10045.1.1", "1.2.840.10045.1.2", "1.2.840.10045.3.1.7", "1.2.840.10045.4.1", "1.2.840.10045.4.3.2", "1.2.840.10045.4.3.3", "1.2.840.10045.4.3.4",
            "1.2.840.113549.1.1.1", "1.2.840.113549.1.1.5", "1.2.840.113549.1.1.7", "1.2.840.113549.1.1.8", "1.2.840.113549.1.1.9", "1.2.840.113549.1.1.10", "1.2.840.113549.1.1.11", "1.2.840.113549.1.1.12", "1.2.840.113549.1.1.13",
            "1.2.840.113549.1.5.3", "1.2.840.113549.1.5.10", "1.2.840.113549.1.5.11", "1.2.840.113549.1.5.12", "1.2.840.113549.1.5.13", "1.2.840.113549.1.7.1", "1.2.840.113549.1.7.2", "1.2.840.113549.1.7.3", "1.2.840.113549.1.7.6",
            "1.2.840.113549.1.9.1", "1.2.840.113549.1.9.3", "1.2.840.113549.1.9.4", "1.2.840.113549.1.9.5", "1.2.840.113549.1.9.6", "1.2.840.113549.1.9.7", "1.2.840.113549.1.9.14", "1.2.840.113549.1.9.15", "1.2.840.113549.1.9.16.1.4",
            "1.2.840.113549.1.9.16.2.12", "1.2.840.113549.1.9.16.2.14", "1.2.840.113549.1.9.16.2.47", "1.2.840.113549.1.9.20", "1.2.840.113549.1.9.21", "1.2.840.113549.1.9.22.1", "1.2.840.113549.1.12.1.3", "1.2.840.113549.1.12.1.5",
            "1.2.840.113549.1.12.1.6", "1.2.840.113549.1.12.10.1.1", "1.2.840.113549.1.12.10.1.2", "1.2.840.113549.1.12.10.1.3", "1.2.840.113549.1.12.10.1.5", "1.2.840.113549.1.12.10.1.6", "1.2.840.113549.2.5", "1.2.840.113549.2.7",
            "1.2.840.113549.2.9", "1.2.840.113549.2.10", "1.2.840.113549.2.11", "1.2.840.113549.3.2", "1.2.840.113549.3.7", "1.3.6.1.4.1.311.17.1", "1.3.6.1.4.1.311.17.3.20", "1.3.6.1.4.1.311.20.2.3", "1.3.6.1.4.1.311.88.2.1",
            "1.3.6.1.4.1.311.88.2.2", "1.3.6.1.5.5.7.3.1", "1.3.6.1.5.5.7.3.2", "1.3.6.1.5.5.7.3.3", "1.3.6.1.5.5.7.3.4", "1.3.6.1.5.5.7.3.8", "1.3.6.1.5.5.7.3.9", "1.3.6.1.5.5.7.6.2", "1.3.6.1.5.5.7.48.1", "1.3.6.1.5.5.7.48.1.2",
            "1.3.6.1.5.5.7.48.2", "1.3.14.3.2.26", "1.3.14.3.2.7", "1.3.132.0.34", "1.3.132.0.35", "2.5.4.3", "2.5.4.5", "2.5.4.6", "2.5.4.7", "2.5.4.8", "2.5.4.10", "2.5.4.11", "2.5.4.97", "2.5.29.14", "2.5.29.15", "2.5.29.17", "2.5.29.19",
            "2.5.29.20", "2.5.29.35", "2.16.840.1.101.3.4.1.2", "2.16.840.1.101.3.4.1.22", "2.16.840.1.101.3.4.1.42", "2.16.840.1.101.3.4.2.1", "2.16.840.1.101.3.4.2.2", "2.16.840.1.101.3.4.2.3", "2.23.140.1.2.1", "2.23.140.1.2.2",
        }.ToDictionary(s => s, s => s);
    private FrozenDictionary<string, string> _fd;

    [GlobalSetup]
    public void Setup() => _fd = _source.ToFrozenDictionary();

    [Benchmark]
    public FrozenDictionary<string, string> Create() => FrozenDictionary.ToFrozenDictionary(_source);

    [Benchmark]
    public int ContainsKey()
    {
        int count = 0;
        foreach (string key in _fd.Keys)
        {
            if (_fd.ContainsKey(key))
            {
                count++;
            }
        }
        return count;
    }
}

it results in this:

Method Toolchain Mean Error StdDev Ratio Allocated Alloc Ratio
Create \main\corerun.exe 111,525.4 ns 2,065.66 ns 2,295.98 ns 1.000 39960 B 1.00
Create \pr\corerun.exe 938.4 ns 7.10 ns 6.30 ns 0.008 3176 B 0.08
ContainsKey \main\corerun.exe 996.2 ns 6.69 ns 6.26 ns 1.00 - NA
ContainsKey \pr\corerun.exe 1,301.5 ns 11.09 ns 9.26 ns 1.31 - NA

This shows both that there is a measurable read improvement achieved by specifying optimizeForReading here (since main is implicitly that) as well as it's now much cheaper to create one of these by default, and it would take several hundred lookups to break even overall.

I also augmented the tests to improve code coverage. Before:

image

And after:

image

Fixes #81088
Contributes to #77891

Author: stephentoub
Assignees: -
Labels:

area-System.Collections

Milestone: -

- Add the new ToFrozenDictionary/Set methods that take optimizedForReading and make the existing overloads cheaper for construction.  This makes construction time and allocation by default much closer to that of a normal dictionary or set while also enabling more of the implementation to be trimmed away by default.
- Fixed the use of FrozenSet/Dictionary.Empty such that we only use the singleton when its comparer matches; otherwise, we could end up handing back an instance with the wrong comparer.
- Avoided passing both a dictionary/set and a comparer down through constructors, as the dictionary/set contains the comparer and we don't want to accidentally end up in a situation where we pass through the wrong one.
- Condensed some of the construction code to be a bit less verbose
- Removed unused locals from ToFrozenDictionary (the min/max/range were no longer being used)
- Remove unnecessary array allocations from PickIntegerDictionary
- Delete the sparse range integer sets, which are rarely needed and are broken for TryGetValue
- Fix CheckUniqueAndUnfoundElements to slice down to only the relevant size prior to clearing
@@ -9,6 +9,7 @@ namespace System.Collections.Frozen
public static partial class FrozenDictionary
{
public static System.Collections.Frozen.FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(this System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<TKey, TValue>> source, System.Collections.Generic.IEqualityComparer<TKey>? comparer = null) where TKey : notnull { throw null; }
public static System.Collections.Frozen.FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(this System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<TKey, TValue>> source, System.Collections.Generic.IEqualityComparer<TKey>? comparer, bool optimizeForReading) where TKey : notnull { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the comparer parameter be optional? Seems you need to always supply a comparer when opting for optimized reads, even if null.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can't be optional here or both overloads with optional parameters would be ambiguous when neither are supplied.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about having another overload:

public static FrozenDictionary<TKey, TValue> ToFrozenDictionary<TKey, TValue>(
    this IEnumerable<KeyValuePair<TKey, TValue>> source, 
    bool optimizeForReading);

Seems many people would want to call this without bothering to specify a comparer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about having another overload

I'm fine with that if you think it's valuable. It's a pure addition so we can merge this and follow-up on adding more overloads.

@stephentoub stephentoub merged commit 523d0f9 into dotnet:main Jan 26, 2023
@stephentoub stephentoub deleted the dialablefrozen branch January 26, 2023 19:23
@dotnet dotnet locked as resolved and limited conversation to collaborators Feb 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[API Proposal]: FrozenDictionary/Set.ToFrozen* overloads that take optimization level
2 participants