Improve List<T>.AddRange performance for enumerables #76043

Merged
merged 4 commits into dotnet:main on Sep 25, 2022

stephentoub
Member

AddRange is currently implemented as delegating to InsertRange, and InsertRange in turn has a more complicated inner loop as part of adding each item from a source enumerable into the list. By just copying InsertRange's source into AddRange, deleting all the irrelevant stuff, and changing the Insert call to Add, throughput improves measurably.

| Method | Toolchain | Count | Mean | Ratio |
|---|---|---:|---:|---:|
| AddRange_Enumerable | \main\corerun.exe | 2 | 30.77 ns | 1.00 |
| AddRange_Enumerable | \pr\corerun.exe | 2 | 26.35 ns | 0.85 |
| AddRange_Enumerable | \main\corerun.exe | 256 | 1,584.66 ns | 1.00 |
| AddRange_Enumerable | \pr\corerun.exe | 256 | 1,069.29 ns | 0.69 |
| AddRange_Enumerable | \main\corerun.exe | 1024 | 6,223.58 ns | 1.00 |
| AddRange_Enumerable | \pr\corerun.exe | 1024 | 4,175.18 ns | 0.67 |

private List<int> _list = new List<int>();
private IEnumerable<int> _sourceEnumerable;

[Params(2, 256, 1024)]
public int Count { get; set; }

[GlobalSetup]
public void Setup()
{
    _sourceEnumerable = Enumerable.Range(0, Count);
}

[Benchmark]
public void AddRange_Enumerable()
{
    _list.Clear();
    _list.AddRange(_sourceEnumerable);
}
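
The shape described above can be sketched roughly as follows. This is a minimal illustration, not the actual runtime source: `SketchList<T>`, its growth policy, and the `Generate` helper are hypothetical names introduced here. It assumes the same two paths the description implies: a bulk `CopyTo` when the source is an `ICollection<T>`, and otherwise a plain loop that calls `Add` for each item rather than going through an insert-at-index path.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for List<T>; not the runtime implementation.
public sealed class SketchList<T>
{
    private T[] _items = Array.Empty<T>();
    private int _size;

    public int Count => _size;

    public T this[int index] =>
        (uint)index < (uint)_size ? _items[index] : throw new ArgumentOutOfRangeException(nameof(index));

    public void Add(T item)
    {
        if (_size == _items.Length)
        {
            Grow(_size + 1);
        }
        _items[_size++] = item;
    }

    public void AddRange(IEnumerable<T> collection)
    {
        if (collection is null) throw new ArgumentNullException(nameof(collection));

        if (collection is ICollection<T> c)
        {
            // Known count: make room once, then copy in bulk.
            int count = c.Count;
            if (count > 0)
            {
                if (_items.Length - _size < count)
                {
                    Grow(_size + count);
                }
                c.CopyTo(_items, _size);
                _size += count;
            }
        }
        else
        {
            // Unknown count: append one item at a time; no index bookkeeping needed.
            foreach (T item in collection)
            {
                Add(item);
            }
        }
    }

    // Assumed growth policy for the sketch: double, but at least the requested capacity.
    private void Grow(int capacity)
    {
        int newCapacity = _items.Length == 0 ? 4 : 2 * _items.Length;
        if (newCapacity < capacity) newCapacity = capacity;
        Array.Resize(ref _items, newCapacity);
    }
}

public static class SketchListDemo
{
    // Lazy iterator, so the source is not an ICollection<T>.
    public static IEnumerable<int> Generate(int n)
    {
        for (int i = 0; i < n; i++) yield return i;
    }

    public static void Main()
    {
        var list = new SketchList<int>();
        list.AddRange(new[] { 1, 2, 3 }); // ICollection<T> path
        list.AddRange(Generate(3));       // enumerable path
        Console.WriteLine(list.Count);    // 6
    }
}
```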

@ghost

ghost commented Sep 22, 2022

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: stephentoub
Assignees: -
Labels:

area-System.Collections

Milestone: -

@tfenise
Contributor

tfenise commented Sep 22, 2022

The benchmark code makes me wonder, why Enumerable.Range doesn't return some type implementing ICollection<int>, so that List<int>.InsertRange may call ICollection<int>.CopyTo.

@stephentoub
Member Author

> The benchmark code makes me wonder, why Enumerable.Range doesn't return some type implementing ICollection<int>, so that List<int>.InsertRange may call ICollection<int>.CopyTo.

It could; the question would be what scenario does that really serve, other than messing up my benchmarks :) It'd be more interesting I think to look at whether any of the other internal enumerable implementations in LINQ could/should implement ICollection<T>.

@stephentoub
Member Author

> It'd be more interesting I think to look at whether any of the other internal enumerable implementations in LINQ could/should implement ICollection<T>.

I don't think it works. The most valuable ones would be on the iterators used for things like Select (it's really rare, for example, to see an Enumerable.Range used in production where the result is consumed directly; it's much more common to see a Select or the like off of it). The problem, however, is we'd need to run the selectors as part of ICollection<T>.Count, just as we do for Enumerable.Count(), and that means if someone did a fairly common pattern of using the Count to presize an output array and then CopyTo'ing into that array, it'd result in the selectors running twice.
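
To make the double-evaluation concern concrete, here is a small sketch. `SelectCollection<TSource, TResult>` is a hypothetical wrapper written for this illustration, not a real LINQ type. It assumes, as the comment above does, that `Count` has to run the selector for every element so that side effects are preserved. The presize-then-`CopyTo` pattern then invokes the selector twice per element.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical: a Select-style projection exposed as ICollection<TResult>.
public sealed class SelectCollection<TSource, TResult> : ICollection<TResult>
{
    private readonly IReadOnlyList<TSource> _source;
    private readonly Func<TSource, TResult> _selector;

    public SelectCollection(IReadOnlyList<TSource> source, Func<TSource, TResult> selector)
    {
        _source = source;
        _selector = selector;
    }

    // Assumption for this sketch: counting runs the selector on each element.
    public int Count
    {
        get
        {
            int count = 0;
            foreach (TSource item in _source)
            {
                _selector(item);
                count++;
            }
            return count;
        }
    }

    public bool IsReadOnly => true;

    // Copying runs the selector again to produce the actual values.
    public void CopyTo(TResult[] array, int arrayIndex)
    {
        foreach (TSource item in _source)
        {
            array[arrayIndex++] = _selector(item);
        }
    }

    public IEnumerator<TResult> GetEnumerator()
    {
        foreach (TSource item in _source)
        {
            yield return _selector(item);
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // Mutation and lookup are out of scope for the sketch.
    public bool Contains(TResult item) => throw new NotSupportedException();
    public void Add(TResult item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Remove(TResult item) => throw new NotSupportedException();
}

public static class SelectorDemo
{
    // Returns how many times the selector ran for the presize-then-CopyTo pattern.
    public static int CountSelectorCalls(int sourceLength)
    {
        int calls = 0;
        int[] source = new int[sourceLength];
        var projected = new SelectCollection<int, int>(source, x => { calls++; return x * 2; });

        int[] output = new int[projected.Count]; // first pass over the selector
        projected.CopyTo(output, 0);             // second pass over the selector
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountSelectorCalls(4)); // 8: two selector calls per element
    }
}
```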

@tfenise

This comment was marked as off-topic.

Grow(_size + count);
}

c.CopyTo(_items, _size);
Member

I'm guessing this could regress cases where we're appending the list to itself?

Member Author

Regress functionally? We have this test:

public void AddRange_AddSelfAsEnumerable_ThrowsExceptionWhenNotEmpty()

Is there a case that's missing?
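
For readers unfamiliar with the behavior that test name refers to, the following sketch shows the two self-append cases against the real `List<T>`. `AsLazy` and `ThrowsWhenAddingSelfLazily` are hypothetical helpers written for this illustration; the sketch is not the test from the repository. Passing the list itself takes the `ICollection<T>` path and succeeds, while passing a lazy view of the same list enumerates it during mutation, which trips the enumerator's version check once the list is non-empty.

```csharp
using System;
using System.Collections.Generic;

public static class SelfAppendDemo
{
    // Hypothetical helper: hides the list behind a lazy iterator so that
    // AddRange cannot treat the source as an ICollection<T>.
    public static IEnumerable<T> AsLazy<T>(List<T> list)
    {
        foreach (T item in list)
        {
            yield return item;
        }
    }

    // Returns true if appending a lazy view of the list to itself throws.
    public static bool ThrowsWhenAddingSelfLazily(List<int> list)
    {
        try
        {
            list.AddRange(AsLazy(list));
            return false;
        }
        catch (InvalidOperationException)
        {
            // The list was modified while its own enumerator was still active.
            return true;
        }
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2 };
        list.AddRange(list);                                  // collection path: list doubles
        Console.WriteLine(list.Count);                        // 4
        Console.WriteLine(ThrowsWhenAddingSelfLazily(list));  // True

        var empty = new List<int>();
        Console.WriteLine(ThrowsWhenAddingSelfLazily(empty)); // False: nothing to enumerate
    }
}
```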

Member

Functionality looks good, I meant to say regress performance.

Member Author

I don't believe so. I just tried with:

[Benchmark]
public List<int> AddToSelf()
{
    var list = new List<int>();
    list.Add(1);
    list.Add(2);
    list.AddRange(list);
    list.AddRange(list);
    list.AddRange(list);
    list.AddRange(list);
    list.AddRange(list);
    list.AddRange(list);
    return list;
}

and depending on the run there was either no difference or a small difference in favor of the new version.

Is there a specific aspect of the change you think would regress that? I can try to address it if so. Though I'm not particularly concerned about the performance of that case, anyway.

Member

Assuming we cared about optimizing that particular case, hardcoding the operation in the style of InsertRange might be marginally faster compared to using ICollection<T>.CopyTo, at the expense of added branching for every other case. But like you said, it doesn't seem important enough to justify.

@stephentoub
Member Author

> I want to comment on #1530, but I can't because that conversation "has been locked as resolved and limited to collaborators", even though that issue has been reopened,

I've unlocked it. Thanks.

@stephentoub stephentoub merged commit c6dbf9c into dotnet:main Sep 25, 2022
@stephentoub stephentoub deleted the listaddrange branch September 25, 2022 21:00
@dotnet dotnet locked as resolved and limited conversation to collaborators Oct 26, 2022