Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Combine linq list optimisations #5777

Merged
merged 1 commit into from
Feb 15, 2016

Conversation

JonHanna
Copy link
Contributor

A few different optimisations to Linq added recently can be usefully combined with each other.

Combine IArrayProvider and IListProvider

Anything that can be one can be the other, so merge the two interfaces into IIListProvider.

Anything implementing these can also fast-path Count(), so do so.

Have IList<T> optimised result of Skip() partitionable.

Optimisation of Skip() for IList<T> sources from #4551 fits with optimisations of Skip() and Take() for other sources from #2401.

Combine the approaches, extending how the result of Skip() on a list handles subsequent operations

@stephentoub
Copy link
Member

cc: @VSadov

@VSadov
Copy link
Member

VSadov commented Jan 29, 2016

LGTM nice change!

@stephentoub
Copy link
Member

@JonHanna, I'm going to go through and review, but could rebase first to resolve the merge conflicts?

@JonHanna JonHanna force-pushed the combine_linq_list_optimisations branch from 61722b4 to e981b71 Compare January 30, 2016 02:46
@JonHanna
Copy link
Contributor Author

Nearly managed to have the simultaneous PRs out of each others way, but for one line.

@@ -854,7 +869,8 @@ public static IEnumerable<TSource> Take<TSource>(this IEnumerable<TSource> sourc
if (count <= 0) return new EmptyPartition<TSource>();
IPartition<TSource> partition = source as IPartition<TSource>;
if (partition != null) return partition.Take(count);
return TakeIterator<TSource>(source, count);
IList<TSource> sourceList = source as IList<TSource>;
return (sourceList != null) ? new SkipListIterator<TSource>(sourceList, 0, count - 1) : TakeIterator(source, count);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should SkipListIterator really be named SkipTakeListIterator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a better name. The current one shows its legacy more than its function.

@stephentoub
Copy link
Member

Seems like there are some non-ASCII characters in the commit title:

c:\Users\stoub\Source\repos\corefx>git rebase master
First, rewinding head to replay your work on top of it...
Applying: Combine I?ArrayProvider and IListProvider

Warning: Your console font probably doesn't support Unicode. If you experience strange characters in the output, consider switching to a TrueType font such as Consolas!

@stephentoub
Copy link
Member

Thanks for working on this. Seems like there are several non-trivial issues to address before this can be merged.

@JonHanna
Copy link
Contributor Author

Hmm. A Left-To-Right Mark got into the title. I don't know what keypress makes that happen!

@JonHanna JonHanna force-pushed the combine_linq_list_optimisations branch 3 times, most recently from dd9bc31 to b6b1e49 Compare January 30, 2016 22:49
@JonHanna
Copy link
Contributor Author

@stephentoub I've included the CountIsConstantTime property on IIListProvider, which allows code to decide whether it's Count can be used for an optimised collection-building path, but since nothing yet uses it, the drop in test coverage is due to that.

@stephentoub
Copy link
Member

I've included the CountIsConstantTime property on IIListProvider

Thanks, though would it not make more sense to use a TryGetCount method like I suggested and avoid the additional interface call? Are there cases where you'd want to know it's cheap and not actually get it?

@JonHanna
Copy link
Contributor Author

The problem with that approach is that there are two possibilities:

  1. We want Count to facilitate an optimisation that is only worth it if Count is cheap.
  2. We want Count. Obviously the cheaper the better.

All of the changes here improve the second case, which is the only case currently used.

The first would come up if we wanted to handle a subsequent operation on an IPartition with a class that would provide ToArray() or ToList(). I intend to do that for Select() (which already goes through a virtual call, so there wouldn't even be the cost of an extra type check), and no other method seems like a likely candidate.

Maybe what I should do is int GetCount(bool onlyIfCheap) and if true is passed and the operation wouldn't be sufficiently cheap, return -1.

@JonHanna JonHanna force-pushed the combine_linq_list_optimisations branch from b6b1e49 to 4e80453 Compare January 31, 2016 00:08
@JonHanna
Copy link
Contributor Author

I've done that. It does seem to give the best of both worlds.

@JonHanna
Copy link
Contributor Author

Test Innerloop OSX Debug Build and Test
Test Innerloop OSX Release Build and Test
Test Innerloop Ubuntu Debug Build and Test
Test Innerloop Ubuntu Release Build and Test

/// <param name="onlyIfCheap">If true then the count should only be calculated if doing
/// so is quick (sure or likely to be constant time), otherwise -1 should be returned.</param>
/// <returns>The number of elements.</returns>
int GetCount(bool onlyIfCheap);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it look like this is always called with "false"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, yes. Through the comment trail, first I had this as a Count property and @stephentoub pointed out that someone might use it as a false economy in an optimisation that would be worse when that property wasn't cheap. I said that while it wasn't necessary yet, I'd plans for that (I'm holding back on further changes that build on this as the changes will be massive if I put them all in one PR) and brought a property that reported on whether Count was cheap or not. Stephen pointed out that this would mean two interface calls would be used, and agreeing with him I changed to this. As of right now though, it's still only called with false.

@JonHanna
Copy link
Contributor Author

Test Innerloop CentOS7.1 Release Build and Test

@JonHanna
Copy link
Contributor Author

Test Innerloop CentOS7.1 Release Build and Test
Test Innerloop OpenSUSE13.2 Debug Build and Test

@JonHanna
Copy link
Contributor Author

Innerloop CentOS7.1 Release Build and Test
(Clean-up failure)

@stephentoub
Copy link
Member

@dotnet-bot test this please

1 similar comment
@stephentoub
Copy link
Member

@dotnet-bot test this please

@JonHanna
Copy link
Contributor Author

Test Innerloop CentOS7.1 Release Build and Test

@@ -15,6 +15,8 @@ public static int Count<TSource>(this IEnumerable<TSource> source)
if (source == null) throw Error.ArgumentNull("source");
ICollection<TSource> collectionoft = source as ICollection<TSource>;
if (collectionoft != null) return collectionoft.Count;
IIListProvider<TSource> listProv = source as IIListProvider<TSource>;
if (listProv != null) return listProv.GetCount(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: would you mind using a named argument at the call site so that it's clear what the Boolean means? (Same for all the occurrences of this).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Of the three places this was called that weren't passing through that value, there was this and two cases where ListPartition called it on itself. I've made those two into a call to a private Count property, to make it clearer that the flag isn't relevant to that use.

That leaves this as the sole actual use of it, but I have plans for it 😄

@JonHanna JonHanna force-pushed the combine_linq_list_optimisations branch from 6871a86 to cd1d2b4 Compare February 12, 2016 16:46
{
IIListProvider<TElement> listProv = source as IIListProvider<TElement>;
if (listProv != null) return listProv.GetCount(onlyIfCheap);
return !onlyIfCheap || source is ICollection || source is ICollection<TElement>? source.Count() : -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: space before the ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also swap the order of checking ICollection<TElement> and ICollection. If Count() takes that as the sensible order to check them, I should either do the same or else demonstrate why they should both be changed.

@JonHanna JonHanna force-pushed the combine_linq_list_optimisations branch 2 times, most recently from 6475a2b to 896553a Compare February 12, 2016 17:18
@@ -15,7 +15,8 @@ public static IEnumerable<TSource> Take<TSource>(this IEnumerable<TSource> sourc
if (count <= 0) return new EmptyPartition<TSource>();
IPartition<TSource> partition = source as IPartition<TSource>;
if (partition != null) return partition.Take(count);
return TakeIterator(source, count);
IList<TSource> sourceList = source as IList<TSource>;
return sourceList != null ? new ListPartition<TSource>(sourceList, 0, count - 1) : TakeIterator(source, count);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: FWIW, in cases like this where there's already a bunch of cases like:

IInterface1 i1 = source as IInterface1;
if (i1 != null) return i1.Whatever();
IInterface1 i2 = source as IInterface2;
if (i2 != null) return i2.Whatever();
...

etc., I'd find the last one easier to understand as:

IInterfaceN iN = source as IInterfaceN;
if (iN != null) return iN.Whatever();
return Whatever();

rather than:

IInterfaceN iN = source as IInterfaceN;
return iN != null ? iN.Whatever() : Whatever();

Normally I like ?:, but in this case it breaks from the pattern used above it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Skip() ended up with a ?: from the way its changes happened, and this was analogous and so ended up with it too, but I'll change both of them to the more consistent pattern.

Anything that can serve as one can serve as the other, and also provide
a faster path for Count(). Merge the two interfaces and add a Count
property.

Have IList optimised result of Skip() partitionable.

Optimisation of Skip() for IList sources from dotnet#4551 fits with
optimisations of Skip() and Take() for other sources from dotnet#2401.

Combine the approaches, extending how the result of Skip() on a
list handles subsequent operations.
@JonHanna JonHanna force-pushed the combine_linq_list_optimisations branch from 896553a to a087c2d Compare February 12, 2016 17:30
@JonHanna
Copy link
Contributor Author

Test Innerloop Windows_NT Debug Build and Test please, because I'm not optimistic enough to ask about OpenSUSE.

@stephentoub
Copy link
Member

LGTM

@stephentoub
Copy link
Member

@VSadov, look good to you?

@VSadov
Copy link
Member

VSadov commented Feb 15, 2016

LGTM

VSadov added a commit that referenced this pull request Feb 15, 2016
@VSadov VSadov merged commit 6a0c20d into dotnet:master Feb 15, 2016
@JonHanna JonHanna deleted the combine_linq_list_optimisations branch February 15, 2016 22:41
@karelz karelz modified the milestone: 1.0.0-rtm Dec 3, 2016
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…_optimisations

Combine linq list optimisations

Commit migrated from dotnet/corefx@6a0c20d
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants