Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.Linq performance improvement suggestions #14366

Closed
1 of 8 tasks
ikopylov opened this issue Mar 19, 2015 · 38 comments
Closed
1 of 8 tasks

System.Linq performance improvement suggestions #14366

ikopylov opened this issue Mar 19, 2015 · 38 comments
Assignees
Labels
area-System.Linq tenet-performance Performance related issue
Milestone

Comments

@ikopylov
Copy link
Contributor

With Linq-to-Objects it is quite common practice to perform a series of transformations, and then materialize sequence to a concrete collection type by calling ToArray(), ToList(), ToDictionary(). These operations would work much faster if they knew the number of elements in the sequence.

Currently System.Linq.Enumerable has special treatment for ICollection interface only.
I suppose that additional support for IReadOnlyCollection can improve performance in some cases, because through it we can figure out the total number of elements.

Another problem with System.Linq is that in many cases the information about the number of elements is lost. One of the most common example:

List<int> source = ...;
List<int> transformed = source.Select(o => o + 1).ToList();

Obviously Select does not change the number of elements in the sequence, but ToList method cannot take advantage of that. Information is lost.
In this particular scenario, it would be great for Select to return some SelectIterator instance, which implements IReadOnlyCollection, and thereby passes the number of elements to the subsequent methods.

Steps to measure performance gain:

  1. Find in System.Linq.Tests\Performance the test for the method you've changed;
  2. Uncomment [Fact] attribute above that method;
  3. Build test project by Visual Studio in Release mode;
  4. Go to the folder with tests binaries: bin\tests\Windows_NT.AnyCPU.Release\System.Linq.Tests\aspnetcore50\;
  5. Open command prompt (cmd.exe or PowerShell);
  6. Run command: CoreRun.exe xunit.console.netcore.exe System.Linq.Tests.dll -parallel none -trait "Perf=true";
  7. Wait for results that will be printed right in the command window;
  8. Don't forget to run tests with different collection sizes. This can be done by varying elementCount argument of Measure method.

Casting to IReadOnlyCollection<T> is a slow operation so it is not a good idea to check if this interface implemented. The performance drop will be most noticeable on the small collections.

Tasks:

  • Select: add iterators for List<T>, Array[T], ICollection<T>;
  • ToArray, ToDictionary: add special support for ICollection<T> (and, very carefully, for IReadOnlyCollection<T>) to get the initial capacity;
  • ToList (???): add special support for IReadOnlyCollection<T> to get the initial capacity (separated as it will affect System.Collections.Generic);
  • OrderBy(Descending)/ThanBy(Descending): implement special iterator for ICollection<T> to propagate Count;
  • Cast, Reverse: add iterators for ICollection<T> to propagate Count;
  • Range, Repeat: add an iterator that implements ICollection<T>;
  • Skip, Take: add an iterator that handle ICollection<T>;
  • Add performance tests for System.Linq
@Clockwork-Muse
Copy link
Contributor

The underlying problem seems to be because Select is combined with a Where operation (Where has it's own standalone operation, too, but not Select it seems). This was probably done for a common case of the two operations being combined, but it means that it can't preserve the original count in the first place...

You may want to mention this particular problem in issue #14352

@MarcinJuraszek
Copy link
Contributor

Did you mean something like this: https://github.com/MarcinJuraszek/corefx/commit/efd50b9f56be93047f3b7b2a6929637b6b567f7d

It's not really tested or anything, but that's the general idea how I see some improvements around LINQ queries that use Select but don't use Where.

It would be nice to have tests around System.Linq first before attempting to optimize some corner cases (and to make sure nothing breaks when doing the change).

@ikopylov
Copy link
Contributor Author

Did you mean something like this: MarcinJuraszek@efd50b9

Exactly. But the implementation of ICollection<T> is overkill here. That's why I suggested to add special support for IReadOnlyCollection<T>.
Also the same approach (using a special IEnumerable+IReadOnlyCollection implementation) can be applied to Cast, OrderBy, Reverse, Range and Repeat methods.

@MarcinJuraszek
Copy link
Contributor

Basing it on IReadOnlyCollection<T> would require a change in List<T> constructor, because right now it only tries to be smart for ICollection<T>. The same applies to Buffer<TElement>, which is internally used for ToArray operation.

System.Collections.Generic is not on github yet.

I do agree that making some improvements how Case, OrderBy and ThanBy work when number of elements in the source collection is known, as these are quite common cases - just sorting a list or array using LINQ is quite common. I'm not sure about Reverse, Range or Repeat - they might not be worth the effort.

@ikopylov
Copy link
Contributor Author

Yes, it would require a change in List<T> and Buffer<TElement>. Also definitely it would affect the Count and ToDictionary methods. But this modifications should make Linq-to-Objects faster for immutable collections that only implements IReadOnlyCollection<T> (not a full ICollection<T>).

For Buffer<TElement> the change is very simple. It should just preallocate items array of proper length.

System.Collections.Generic is not on github yet.

In issue dotnet/corefx#1187 was mentioned that it will be coming soon. So this is not a problem.

@MarcinJuraszek
Copy link
Contributor

I really like all the suggestions.

I think this issue could be split into couple smaller ones, e.g. one for Select improvements, one for OrderBy(Descending)/ThanBy(Descending), etc. This way it will be smaller of a change + can be split between multiple people. What do you think?

Also, whoever will end up working on any of them should also include unit tests around given method, to make sure it still works the way it's expected. Kind of TDD, where e.g. you write tests around existing Select implementation and later add improvements. Maybe even perf tests to see how big the perf gain is?

@rosieks
Copy link

rosieks commented Mar 20, 2015

Maybe good idea would be to just provide a few additional overloads that allow provide initial capacity:

  • ToList(int capacity)
  • ToArray(int capacity)
  • ToDictionary(..., int capacity)

@ikopylov
Copy link
Contributor Author

From my point of view, the number of changes would not be very large, so I suggest not to split this issue into several smaller ones. But the tests (which definitely should be done) potentially will be much larger than modifications by themselves. So it would be a good idea to create a separate issue for them. In fact, I found that one already exists for the entire System.Linq (#1143). So I don't know, if it is better to create a new issue or just wait for completion of that one.
@MarcinJuraszek, what do you think?

Maybe even perf tests to see how big the perf gain is?

Yes, it would be great. From my calculations the gain should be 2x for simple transformation by Select.

@ikopylov
Copy link
Contributor Author

@rosieks, good suggestion. But it would require a change in public API.

@MarcinJuraszek
Copy link
Contributor

@rosieks I think that library should be smart enough to use all possible information to make it more efficient, without forcing users to explicitly specify it. E.g. when input is ICollection<T> we already know how many elements there are, no need for capacity after all.

@ikopylov Getting UT for the entire System.Linq will take a while as there is quite a bit of code to test. I think it would be better to split that effort into different parts of the library, so that it's more obvious what's already done, what's left, who's working on what. That would also make it easier for multiple people to prepare changes for different methods. I still think the same applies to perf improvements.

Getting perf tests seems like a good work that could be totally separated from both UT and perf improvements.

@MarcinJuraszek
Copy link
Contributor

I've updated my fork to use IReadOnlyCollection<T> instead of ICollection<T> + added simple tests: https://github.com/dotnet/corefx/compare/master...MarcinJuraszek:linq-select?expand=1

@ikopylov
Copy link
Contributor Author

Ok. Let's split this issue into several smaller ones.
I propose the following sub-tasks:

  1. Select: add iterators for List<T>, Array[T], ICollection<T> and IReadOnlyCollection<T>;
  2. ToArray, ToDictionary: add special support for IReadOnlyCollection<T> to get the initial capacity;
  3. ToList: add special support for IReadOnlyCollection<T> to get the initial capacity (separated as it will affect System.Collections.Generic);
  4. OrderBy(Descending)/ThanBy(Descending): implement IReadOnlyCollection<T> to propagate Count;
  5. Cast, Reverse: add iterators for ICollection<T> and IReadOnlyCollection<T> to propagate Count;
  6. Range, Repeat: add an iterator that implements IReadOnlyCollection<T>.

@MarcinJuraszek, what do you think about that?

@ikopylov
Copy link
Contributor Author

I've updated my fork to use IReadOnlyCollection instead of ICollection + added simple tests

Awesome.

I have some thoughts on this task.

  1. Count should not be stored as a local field, but should be read from the original sequence each time. Your implementation is perfect, but it would be nice to add a test like this:
List<int> source = new List<int>() { 1, 2, 3 };
var seq = source.Select(o => o + 1);

var arr1 = seq.ToArray();
Assert.AreEqual(source.Count, arr1.Length);

source.Add(10);
var arr2 = seq.ToArray();
Assert.AreEqual(source.Count, arr2.Length);
  1. One advantage of ICollection<T> is the method CopyTo. With it, we can make a copy using only a single virtual call. With IReadOnlyCollection<T> to get a copy we should use IEnumerator<T>, which leads to a large number of virtual calls.
    Now I think it is better to implement ICollection<T> on SelectListIterator<T> and on SelectArrayIterator<T>. This will give a huge performance gain by reducing the number of virtual calls.

@MarcinJuraszek
Copy link
Contributor

I really like how you split the items into smaller pieces! I'd add 7. Add performance tests for System.Linq

  1. Good idea.
  2. CopyTo would still need to iterate over all elements because selector needs to be called on every element of source collection. Unless I'm missing something.

@ikopylov
Copy link
Contributor Author

I really like how you split the items into smaller pieces! I'd add 7. Add performance tests for System.Linq

Great.
I have added a checkbox list of tasks to this issue. Work on them can be done in separate PRs. Do I need to create a separate GitHub issue for every task mentioned here?

CopyTo would still need to iterate over all elements because selector needs to be called on every element of source collection. Unless I'm missing something.

Iteration should still be there, but it will be a fast for-cycle instead of heavy foreach-cycle. CopyTo is a part of a concrete iterator class, so it is aware of the specific collection type and can use a for-cycle.
You can check the difference in the following simple performance test:

// 350ms (1000000 elements x 1000 times)
static int ForArray(int[] array)
{
    int result = 0;
    for (int i = 0; i < array.Length; i++)
        result += array[i];
    return result;
}
// 740ms (1000000 elements x 1000 times) 
static int ForList(List<int> array)
{
    int result = 0;
    for (int i = 0; i < array.Count; i++)
        result += array[i];
    return result;
}
// 2205ms (1000000 elements x 1000 times)
static int ForEachList(List<int> array)
{
    int result = 0;
    foreach (int elem in array)
        result += elem;
    return result;
}
// 4918ms (1000000 elements x 1000 times)
static int ForEachEnumerable(IEnumerable<int> array)
{
    int result = 0;
    foreach (int elem in array)
        result += elem;
    return result;
}

@MarcinJuraszek
Copy link
Contributor

I think one issue with a list of tasks is OK.

How about starting with perf tests, so we can actually see if implementing improvements like ICollection<T> vs. IReadOnlyCollection<T> makes sense and how much gain it gives.

Also, would be nice to see someone from .NET team approving the plan and/or giving some comments about it.

@ikopylov
Copy link
Contributor Author

I have prepared the changes for ToArray, ToDictionary task: dotnet/corefx@master...ikopylov:linq_to_array_to_dictionary

How about starting with perf tests, so we can actually see if implementing improvements like ICollection vs. IReadOnlyCollection makes sense and how much gain it gives.

I agree. I'll try to add some perf-tests.

@ikopylov
Copy link
Contributor Author

I've completed performance tests: dotnet/corefx@master...ikopylov:linq_perf_tests
I plan to submit them as Pull-Request tomorrow.
@MarcinJuraszek, it would be great if you could look at them.

@MarcinJuraszek
Copy link
Contributor

Didn't look deeply, but looks good after taking a quick look.

I was thinking about making perf tests separate project instead of putting it in the UT project. I don't think it's worth running them as part of UT, I would rather have separate .exe I could run (outside of VS) that would print perf for all the operations.

@rosieks
Copy link

rosieks commented Mar 23, 2015

What about Skip and Take methods. I think that magic could be also introduced in this methods.

@ikopylov
Copy link
Contributor Author

I was thinking about making perf tests separate project instead of putting it in the UT project.

I checked how performance tests implemented in other projects and found that they all done in the same way.

I would rather have separate .exe I could run (outside of VS) that would print perf for all the operations.

That can easily be done for unit-tests by executing them with xunit.console.netcore.exe:

CoreRun.exe xunit.console.netcore.exe System.Linq.Tests.dll -parallel none -trait "Perf=true"

Actually, I haven't found a way to run tests inside VS, so I always run them using the stated command.

What about Skip and Take methods. I think that magic could be also introduced in this methods.

Right. I've added them to the task list.

@MarcinJuraszek
Copy link
Contributor

I checked how performance tests implemented in other projects and found that they all done in the same way.

Sounds good. I didn't actually look. What projects already have perf tests?

Actually, I haven't found a way to run tests inside VS, so I always run them using the stated command.

You can run tests in VS15 by setting test project as Startup Project and just running debug session.

For perf tests it's important to compile using Release flavor and run them outside of devenv. Can you add a command that can be used to do that into your Pull Request + update first post in this issue when it's accepted?

@ikopylov
Copy link
Contributor Author

What projects already have perf tests?

For example, System.Collections.Immutable. In fact, there's not so much projects with perf tests.

Can you add a command that can be used to do that into your Pull Request

Steps are quite simple:

  1. Uncomment [Fact] attribute above the test method that you need;
  2. Build test project by Visual Studio in Release mode;
  3. Go to the folder with tests binaries: bin\tests\Windows_NT.AnyCPU.Release\System.Linq.Tests\aspnetcore50\;
  4. Open command prompt (cmd.exe or PowerShell);
  5. Run command: CoreRun.exe xunit.console.netcore.exe System.Linq.Tests.dll -parallel none -trait "Perf=true";
  6. Wait for results that will be printed right in the command window.

@theoy
Copy link

theoy commented Mar 25, 2015

Actually - @VSadov, can you take a look at reviewing this? Thanks!

VSadov referenced this issue in dotnet/corefx Mar 25, 2015
Performance tests for System.Linq (issue #1182)
@terrajobst
Copy link
Member

I really like the overall direction. We should do an API review for this. I've labelled the issue accordingly.

@MarcinJuraszek
Copy link
Contributor

What are the next steps? Should we just wait to hear back from the API Review Board? When should we expect that to happen?

@theoy
Copy link

theoy commented Mar 26, 2015

@MarcinJuraszek - for the API reviews, I believe @terrajobst conducts API reviews in the open for this on a weekly cadence.

@terrajobst - do we have a more lightweight process since this doesn't change public surface area, but instead is suggesting for implementation optimizations for being aware of more knowledge-rich collections?

@terrajobst
Copy link
Member

If there are no APIs being added then it doesn't need to be API reviewed. Based on the task from the description I assumed that this will result in new APIs:

Select : add iterators for List<T>, Array[T], ICollection<T> and IReadOnlyCollection<T>

@MarcinJuraszek
Copy link
Contributor

No, there is no API change. It will add new internal iterator classes to make publicly available API faster for some cases.

@ikopylov
Copy link
Contributor Author

While improving ToArray and ToDictionary methods I encountered an unexpected drawback of type conversion. As it turned out, casting to the generic interface with covariant or contravariant type parameter is a very slow operation. According to my tests it is 20 times slower than the cast to the interface (generic or non-generic) without covariance and contravariance.

Unfortunately, IReadOnlyCollection<T> has covariant type parameter, which lead to significant performance drop in some performance tests. So I suggest to reject any special optimisations for IReadOnlyCollection<T>. Hence, it is also pointless to implement IReadOnlyCollection<T> on handcrafted iterators. It is much better to implement ICollection<T> (if possible).

The information that I have found by examining the source code of CoreCLR:

  1. Casting to Array is just a one flag check, so it is extremely fast;
  2. Casting to Class is a linear search in an array of base classes;
  3. Casting to Interface is a linear search in an array of implemented interfaces;
  4. Casting to Interface with covariant or contravariant generic type parameter is a linear search in an array of implemented interfaces with an additional check of the possibility of conversion between type parameters.

@ikopylov
Copy link
Contributor Author

@terrajobst, public API will not be changed. All changes are internal to System.Linq.
The only side effect is that IEnumerable<T> returned from Select could be casted to ICollection<T> (or IReadOnlyCollection<T>), while previously it was not possible.

@VSadov
Copy link
Member

VSadov commented Mar 27, 2015

@ikopylov - since the optimization mostly relies on Count, perhaps specialcasing nongeneric ICollection is sufficient?

I am not sure if we can rely on indexing (or copying) functionality when iterating something hat is not an array. There is one, fairly inconvenient from performance prospective contract, that many mutable collections implement - they do not allow mutations while iterating. On the other hand indexers are ok with mutations, so replacing one with another is observable. Not sure if this is a big enough change in behavior to be considered compat breaking.

@VSadov
Copy link
Member

VSadov commented Mar 27, 2015

@terrajobst - yes, I believe there is no public API changes. However there could be subtle changes in behavior that might need to be called out.

  • LINQ iterators will become convertible to more interfaces. While observable, I do not think implementing more interfaces is considered breaking.
  • What happens if implementation of Count is incorrect (like returns N-1), very slow, or simply throws NotSupported?
  • something else ...

There might still be enough questions for a small discussion even though there are no surface changes.

@ikopylov
Copy link
Contributor Author

since the optimization mostly relies on Count, perhaps specialcasing nongeneric ICollection is sufficient?

ICollection is very similar to its generic twin ICollection<T>, but latter already has a special support in several places in System.Linq. So I prefer to rely on ICollection<T>.

I am not sure if we can rely on indexing (or copying) functionality when iterating something hat is not an array. There is one, fairly inconvenient from performance prospective contract, that many mutable collections implement - they do not allow mutations while iterating.

I agree with that. In order not to break the behaviour we should use IEnumerator<T> to iterate over any collection (with the exception of Array).
But the copying (with CopyTo) does not break anything. It is used only when materialization required and it is an 'atomic' operation, which means that nothing can be changed during this operation (of course, in single-threaded scenarios).

What happens if implementation of Count is incorrect (like returns N-1), very slow, or simply throws NotSupported?

This is the problem of the library which contains such ill-implemented collections.

I've updated a task list slightly to reflect last changes with IReadOnlyCollection<T> support.

@MarcinJuraszek
Copy link
Contributor

I've just sent a Pull Request for Select improvements, without optimizing for IReadOnlyCollection<T> based on comments by @ikopylov.

@terrajobst
Copy link
Member

We reviewed this issue today. We don't believe it's ready yet. Please the notes for more details.

@terrajobst terrajobst assigned stephentoub and unassigned VSadov Sep 10, 2015
@terrajobst
Copy link
Member

We reviewed this item and concluded that it should be split.

@stephentoub, could you take a stab at this?

@stephentoub
Copy link
Member

We reviewed this issue today. We don't believe it's ready yet. Please the notes for more details.

This wasn't really a reviewable item, as there wasn't any concrete API being proposed, as discussed earlier in the thread ;) It's a set of proposed improvements to performance in existing implementation rather than new API.

Through this thread as well as a bunch of PRs that have happened in the interim, we did agree that we don't want to expose existing collection interfaces out of LINQ, e.g. the object returned by Select shouldn't implement ICollection<T>, but we are ok with and have introduced additional internal interfaces to help curry certain information through the built-in operators, e.g. the iterator implemented by Select implements IArrayProvider so that its ToArray implementation can be optimized to work on the original array with complete knowledge of its size, using its indexer rather than going through IEnumerable, etc.

At this point, given the original tasks called out on this thread, it looks like the only concrete remaining items would be around specializing methods like ToDictionary to have more knowledge of the original source. I suggest that if someone wants to implement such optimizations and do detailed measurements, such optimizations with associated measurements could be submitted as individual PRs, and an issue like this isn't needed to track them. So I'm going to leave this closed and not open new issues. Obviously folks should feel free to open subsequent issues / PRs for individual topics to be discussed / solutions implemented.

Thanks for the good discussion!

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 1.0.0-rtm milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Linq tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

10 participants