This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Improve the performance of ImmutableList.Contains #40540

Merged: 5 commits, Oct 19, 2019

Conversation

shortspider
Contributor

Fixes #36407

IndexOf has a lot of overhead because it's looking for an index when
all Contains wants to do is find a matching element.
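
To illustrate the point above, here is a minimal sketch, assuming a simplified balanced binary tree. The `Node<T>` class and its `Build`, `Contains`, and `IndexOf` members are hypothetical names for illustration only and are not the real `ImmutableList<T>` internals. The idea is that `Contains` can return as soon as any node matches and never tracks positions, while `IndexOf` has to walk in list order and carry a running offset.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified tree node; NOT the real ImmutableList<T>.Node.
public sealed class Node<T>
{
    public T Value;
    public Node<T> Left, Right;
    public int Count; // number of elements stored in this subtree

    // Builds a balanced tree whose in-order traversal is items[start .. start + length).
    public static Node<T> Build(IReadOnlyList<T> items, int start, int length)
    {
        if (length == 0) return null;
        int leftLength = length / 2;
        return new Node<T>
        {
            Left = Build(items, start, leftLength),
            Value = items[start + leftLength],
            Right = Build(items, start + leftLength + 1, length - leftLength - 1),
            Count = length,
        };
    }

    // Contains only needs a yes/no answer, so it checks this node first,
    // then the subtrees, and never computes an index.
    public bool Contains(T value, IEqualityComparer<T> comparer)
    {
        return comparer.Equals(value, Value)
            || (Left != null && Left.Contains(value, comparer))
            || (Right != null && Right.Contains(value, comparer));
    }

    // IndexOf must visit elements in list order (left, self, right) and
    // carry the offset of the current subtree to report a position.
    public int IndexOf(T value, IEqualityComparer<T> comparer, int offset = 0)
    {
        int leftCount = Left?.Count ?? 0;
        if (Left != null)
        {
            int i = Left.IndexOf(value, comparer, offset);
            if (i >= 0) return i;
        }
        if (comparer.Equals(value, Value)) return offset + leftCount;
        return Right != null ? Right.IndexOf(value, comparer, offset + leftCount + 1) : -1;
    }
}
```

In this sketch a lookup for the element stored at the root returns after a single comparison in `Contains`, whereas `IndexOf` first has to search the whole left subtree.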
@dnfclas

dnfclas commented Aug 23, 2019

CLA assistant check
All CLA requirements met.


@AArnott AArnott left a comment


The current iteration looks good to me. :shipit:

@stephentoub
Member

stephentoub commented Sep 7, 2019

@AArnott, can you share an example where using the enumerator is measurably faster than using the indexer? I'm having trouble creating a collection large enough, but maybe it requires creating one in a special way?

@AArnott

AArnott commented Sep 7, 2019

@stephentoub I don't recall doing the experiment myself. I thought you had tried it and found 1M to be the turning point. So I'm happy to go with a pure-indexer enumerator if you couldn't find a size where that was worse.

@shortspider
Contributor Author

@AArnott @stephentoub
So I redid the enumerator to use the indexer and I don't see a real improvement. Here are the results without my changes:

| Type | Method | Size | Mean | Error | StdDev |
|---|---|---|---|---|---|
| IterateForEach{Int32} | ImmutableList | 512 | 12.678 us | 0.3188 us | 0.3544 us |
| IterateForEach{String} | ImmutableList | 512 | 35.445 us | 1.4779 us | 1.5177 us |

And then with the indexer implementation:

| Type | Method | Size | Mean | Error | StdDev |
|---|---|---|---|---|---|
| IterateForEach{Int32} | ImmutableList | 512 | 17.388 us | 0.0719 us | 0.0561 us |
| IterateForEach{String} | ImmutableList | 512 | 29.857 us | 3.7395 us | 4.3064 us |

After switching Contains back to use IndexOf (and the Enumerator directly), here are the results, first without my changes and then with the indexer enumerator:

| Type | Method | Size | Mean | Error | StdDev |
|---|---|---|---|---|---|
| ContainsFalse{Int32} | ImmutableList | 512 | 8,273.21 us | 743.876 us | 856.648 us |
| ContainsFalse{String} | ImmutableList | 512 | 23,145.14 us | 3,396.777 us | 3,775.509 us |
| ContainsTrue{Int32} | ImmutableList | 512 | 3,997.04 us | 327.578 us | 377.239 us |
| ContainsTrue{String} | ImmutableList | 512 | 10,036.07 us | 29.512 us | 24.644 us |

| Type | Method | Size | Mean | Error | StdDev |
|---|---|---|---|---|---|
| ContainsFalse{Int32} | ImmutableList | 512 | 10,064.98 us | 62.073 us | 55.026 us |
| ContainsFalse{String} | ImmutableList | 512 | 13,986.43 us | 38.286 us | 35.813 us |
| ContainsTrue{Int32} | ImmutableList | 512 | 4,665.12 us | 421.096 us | 468.047 us |
| ContainsTrue{String} | ImmutableList | 512 | 7,144.50 us | 8.922 us | 7.909 us |

I honestly don't know what to make of this data. With a pure indexer implementation, the enumerator seems to take a performance hit for ints, while for strings it performs better.

None of this matters for the Contains case but it's interesting.

@maryamariyan
Member

cc: @stephentoub @AArnott

As explained by @shortspider, the benchmark numbers don't seem to show enough reason for merging. Should this be closed then?

@safern
Member

safern commented Oct 11, 2019

Actually the numbers shown are by changing the Enumerator to use the indexer... @shortspider do you have numbers for your current change to speed up Contains as is?

Overall the change looks good to me, but do you have benchmark numbers for the current change with big and small collections?

@stephentoub stephentoub changed the title from "Immutable list vs list contains" to "Improve the performance of ImmutableList.Contains" Oct 19, 2019
@stephentoub
Member

Thanks for the improvement, @shortspider. I tried this out, and it looks like a very nice improvement to throughput.

| Method | Toolchain | Length | Mean |
|---|---|---|---|
| Contains0 | \new\corerun.exe | 1 | 2.467 ns |
| Contains0 | \old\CoreRun.exe | 1 | 155.496 ns |
| ContainsLengthDiv2 | \new\corerun.exe | 1 | 2.589 ns |
| ContainsLengthDiv2 | \old\CoreRun.exe | 1 | 156.111 ns |
| ContainsLength | \new\corerun.exe | 1 | 5.038 ns |
| ContainsLength | \old\CoreRun.exe | 1 | 173.002 ns |
| Contains0 | \new\corerun.exe | 10 | 11.719 ns |
| Contains0 | \old\CoreRun.exe | 10 | 174.735 ns |
| ContainsLengthDiv2 | \new\corerun.exe | 10 | 2.571 ns |
| ContainsLengthDiv2 | \old\CoreRun.exe | 10 | 382.319 ns |
| ContainsLength | \new\corerun.exe | 10 | 39.087 ns |
| ContainsLength | \old\CoreRun.exe | 10 | 556.378 ns |
| Contains0 | \new\corerun.exe | 100 | 21.540 ns |
| Contains0 | \old\CoreRun.exe | 100 | 189.930 ns |
| ContainsLengthDiv2 | \new\corerun.exe | 100 | 2.608 ns |
| ContainsLengthDiv2 | \old\CoreRun.exe | 100 | 2,423.277 ns |
| ContainsLength | \new\corerun.exe | 100 | 399.896 ns |
| ContainsLength | \old\CoreRun.exe | 100 | 4,510.335 ns |
| Contains0 | \new\corerun.exe | 1000 | 29.990 ns |
| Contains0 | \old\CoreRun.exe | 1000 | 207.408 ns |
| ContainsLengthDiv2 | \new\corerun.exe | 1000 | 2.613 ns |
| ContainsLengthDiv2 | \old\CoreRun.exe | 1000 | 23,011.236 ns |
| ContainsLength | \new\corerun.exe | 1000 | 4,364.078 ns |
| ContainsLength | \old\CoreRun.exe | 1000 | 46,108.501 ns |

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Running;
using System.Collections.Immutable;
using System.Linq;

[MemoryDiagnoser]
public class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssemblies(new[] { typeof(Program).Assembly }).Run(args);

    private ImmutableList<int> _list;

    [Params(1, 10, 100, 1000)]
    public int Length { get; set; }

    [GlobalSetup]
    public void Setup() => _list = ImmutableList<int>.Empty.AddRange(Enumerable.Range(0, Length));

    [Benchmark]
    public bool Contains0() => _list.Contains(0); // first element

    [Benchmark]
    public bool ContainsLengthDiv2() => _list.Contains(Length / 2); // root, or close to it

    [Benchmark]
    public bool ContainsLength() => _list.Contains(Length); // doesn't exist
}

@stephentoub stephentoub merged commit a866d96 into dotnet:master Oct 19, 2019
@adamsitnik adamsitnik added the tenet-performance Performance related issue label Oct 20, 2019
@adamsitnik adamsitnik added this to the 5.0 milestone Oct 20, 2019
@stephentoub
Member

stephentoub commented Oct 21, 2019

> I don't recall doing the experiment myself. I thought you had tried it and found 1M to be the turning point. So I'm happy to go with a pure-indexer enumerator if you couldn't find a size where that was worse.

@AArnott, I just tried it again. Not sure what I was looking at when I commented before, but I rewrote the enumerator purely in terms of indexing, and while it was 2-3x faster for small sizes, it started to get slower at somewhere around 5_000 nodes on my current machine. By 100_000 nodes it was close to 25% slower. So, it does seem like we shouldn't go with a pure-indexer approach.
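
One way to see why an indexer-only enumerator might fall behind at larger sizes is to count node visits. The following is a rough sketch under stated assumptions: `TreeNode` and `EnumerationCost` are hypothetical names, the tree is a simplified balanced binary tree rather than the real `ImmutableList<T>` internals, and visit counts are only a proxy that ignores allocation and cache effects, so they do not predict the measured crossover point. Each indexer lookup restarts at the root, so enumerating n elements touches roughly n log n nodes, while a conventional stack-based in-order walk touches each node once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for a balanced tree list;
// not the real ImmutableList<T> internals.
public sealed class TreeNode
{
    public int Value;
    public TreeNode Left, Right;
    public int Count; // number of elements stored in this subtree

    // Builds a balanced tree holding the values start .. start + length - 1 in order.
    public static TreeNode Build(int start, int length)
    {
        if (length == 0) return null;
        int leftLength = length / 2;
        return new TreeNode
        {
            Left = Build(start, leftLength),
            Value = start + leftLength,
            Right = Build(start + leftLength + 1, length - leftLength - 1),
            Count = length,
        };
    }
}

public static class EnumerationCost
{
    // Enumerates by index: every lookup descends from the root again.
    public static long VisitsUsingIndexer(TreeNode root)
    {
        long visits = 0;
        int count = root?.Count ?? 0;
        for (int index = 0; index < count; index++)
        {
            TreeNode node = root;
            int i = index;
            while (true)
            {
                visits++;
                int leftCount = node.Left?.Count ?? 0;
                if (i < leftCount) node = node.Left;
                else if (i == leftCount) break; // reached the element at this index
                else { i -= leftCount + 1; node = node.Right; }
            }
        }
        return visits;
    }

    // Enumerates in order with an explicit stack: each node is popped exactly once.
    public static long VisitsUsingStack(TreeNode root)
    {
        long visits = 0;
        var stack = new Stack<TreeNode>();
        TreeNode node = root;
        while (node != null || stack.Count > 0)
        {
            while (node != null) { stack.Push(node); node = node.Left; }
            node = stack.Pop();
            visits++;
            node = node.Right;
        }
        return visits;
    }
}
```

The stack walk pays a fixed setup cost for the stack itself, which is one plausible reason a pure indexer could still win for small lists, as reported above.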

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Improve ImmutableList.IndexOf performance

* Implement a specific Contains method

IndexOf has a lot of overhead because it's looking for an index when
all Contains wants to do is find a matching element.

* Fix tests

* Revert change to IndexOf


Commit migrated from dotnet/corefx@a866d96