Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ToArray() and ToList() with length #70

Closed
wants to merge 7 commits into from
Closed

ToArray() and ToList() with length #70

wants to merge 7 commits into from

Conversation

konrad-gora
Copy link

Simple overload for ToArray() and ToList() with lenght parameter. With lenght we can avoid multiple creations of array/list when rewriting IEnumerable to these collections.

Copy link
Owner

@viceroypenguin viceroypenguin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I like the proposal. I have reservations about the semantics of .ToArray(int length) as stated in the comments. Some code improvements and .ToList(int length) will be fine. Thanks for the PR!

README.md Outdated Show resolved Hide resolved
Source/SuperLinq/ToList.cs Show resolved Hide resolved
Source/SuperLinq/ToList.cs Outdated Show resolved Hide resolved
public static partial class SuperEnumerable
{
/// <summary>
/// Creates an array from <see cref="IEnumerable{T}"/>.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think .ToArray(int length) is too ambiguous and deserves further discussion:

  • Does the consumer expect an array of length length always? (my initial expectation)
  • Does the consumer expect an array of the length of the sequence, and the length is an advisory for allocation performance? (your expectation)

I see both sides, and I'm not sure which would be the natural expectation. @Arithmomaniac do you have any thoughts on this?

Copy link
Contributor

@Arithmomaniac Arithmomaniac Oct 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the List constructor it's an advisory (new List<T>(3)), and in array initialization (new T[3]) it's a definition - maybe we should match that semantics here as well.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you suggest throwing an exception in ToArray if length doesn't match to data in source?
I think we should have ToArray and ToList consistent with each other.
I preferred a safe solution (like in List<>).
Maybe we should consider naming this parameter to make it more clear. I'm thinking about initialCapacity (similar to List ctor).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks- that explains my expectation. @konrad-gora please update to a fixed array size with no resizing. Throw an exception using ThrowHelper if the array is not large enough for source.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If source is too short, then remaining items are left as default. If source is too long, then throw an exception.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to use TryGetCollectionCount and/or (in dotnet 6) TryGetNonEnumeratedCount when doing that array size check

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I couldn't agree. ToList() and ToArray() should be consistent with each other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree if they were consistent in dotnet. But in a library like this, you're trying to extend the language idiomatically. (Unlike, for example, https://github.com/louthy/language-ext).

Source/SuperLinq/ToArray.cs Show resolved Hide resolved
Tests/SuperLinq.Test/ToArrayTests.cs Show resolved Hide resolved
Tests/SuperLinq.Test/ToListTests.cs Show resolved Hide resolved
var result = enumerable.ToList(Length);

Assert.Equal(enumerable, result);
Assert.True(result.Capacity > Length);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert is not necessary, as if the capacity is less than length, then the Assert.Equal() will fail due to invalid size.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test shows that we don't trim the List. I was also thinking about adding an optional bool (shouldBeTrimmed = false) to make it possible to trim the List to the expected length.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trimming is not a behavior used by the corefx .ToList(). If the consumer wants to trim after the fact, they can do so, but there is no benefit for us to do so (automatically or based on parameter).

@Arithmomaniac
Copy link
Contributor

The dotnet source code is already optimized for many, if not most, cases where the length can be precomputed. This is nice as a shortcut for the constructors, but I wouldn't add it for supposed optimization benefits.

Guard.IsGreaterThanOrEqualTo(length, 0);

var resultList = new List<T>(length);
resultList.AddRange(source);
Copy link
Contributor

@Arithmomaniac Arithmomaniac Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is as or less efficient than using ToList (which has lots of optimizations for taking into account input length and array sources, some based on the internal inferface IIListProvider) in virtually all scenarios. The only exception I can think of would be if you somehow knew in advance the count of an enumerable that does not implement ICollection or IIListProvider, which is extremely rare.

If you still think this is worthwhile to add, a better implementation would be

var resultList = new List<T>(source);
resultList.EnsureCapacity(int capacity);
return resultList;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of this applies to ToArray as well, especially since there are internal-only methods that allow copying collections into arrays.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about cases when I want to convert to list results of Where? Or some custom generator methods that use yield return?

@viceroypenguin
Copy link
Owner

Closing in favor of #72. Making the consumer provide the reference avoids this discussion entirely, and provides further opportunities for improvement (ArrayPool.Rent for example).

@konrad-gora if you would like to take a try at implementing #72, by all means. Otherwise, I'll put something up next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants