Decompose TagFirstLast implementation for speed #643
Depends on #928, which introduces a test for TagFirstLast.
Codecov Report
@@            Coverage Diff             @@
##           master     #643      +/-   ##
==========================================
- Coverage   92.41%   92.41%   -0.01%
==========================================
  Files         112      112
  Lines        3426     3439      +13
  Branches     1017     1021       +4
==========================================
+ Hits         3166     3178      +12
  Misses        199      199
- Partials       61       62       +1
Subject to correction of test in #928, approved
Can we publish the performance numbers using BenchmarkDotNet? I'm afraid that a screenshot doesn't say much about the configuration, runtime and other factors that could have influenced the run.
Also, please bear in mind that TagFirstLast, which reuses CountDown, also benefits from its optimisations for lists and collections, so the implementation in this PR could introduce a performance regression in those cases. The benchmarks should therefore include sequences, lists & collections as sources.
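As a rough illustration of the request above, a BenchmarkDotNet class covering the three kinds of source might look like the sketch below. The class name, the `Count` values and the choice of `HashSet<int>` as the "collection that is not a list" are assumptions made for this example, not taken from the benchmarks actually published for this PR. It assumes the BenchmarkDotNet and MoreLinq packages are referenced.

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using MoreLinq;

// Hypothetical benchmark class; names and parameter values are illustrative only.
[MemoryDiagnoser]
public class TagFirstLastBenchmarks
{
    [Params(10, 1_000, 100_000)]
    public int Count { get; set; }

    private IEnumerable<int> _sequence = Enumerable.Empty<int>();
    private List<int> _list = new List<int>();
    private HashSet<int> _collection = new HashSet<int>();

    // A plain iterator: exposes neither ICollection<T> nor IList<T>.
    private static IEnumerable<int> Generate(int count)
    {
        for (var i = 0; i < count; i++)
            yield return i;
    }

    [GlobalSetup]
    public void Setup()
    {
        _sequence = Generate(Count);
        _list = Enumerable.Range(0, Count).ToList();   // IList<T>
        _collection = new HashSet<int>(_list);         // ICollection<T>, not IList<T>
    }

    // The selector returns a constant so that the operator itself dominates the cost.
    [Benchmark]
    public void Sequence() => _sequence.TagFirstLast((x, isFirst, isLast) => 0).Consume();

    [Benchmark]
    public void List() => _list.TagFirstLast((x, isFirst, isLast) => 0).Consume();

    [Benchmark]
    public void Collection() => _collection.TagFirstLast((x, isFirst, isLast) => 0).Consume();
}
```

Such a class would typically be run from a console project built in Release mode with `BenchmarkDotNet.Running.BenchmarkRunner.Run<TagFirstLastBenchmarks>()`, once against the current implementation and once against the one proposed here.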
I'd argue (without proof, but with reasonable belief) that collection evaluation would not improve performance significantly, if at all, for this operator. Knowing the size of the collection improves performance in one of two cases: when it allows us to avoid iterating the collection at all, or when it allows us to avoid buffering ( In this case, we need to iterate the full list anyway (since we are returning an enumeration that contains every value from the original), and we are only buffering a single value, and that only for a single loop each. As such, I would find it surprising (but not impossible) to find a performance improvement using
I will try to find time for this. Sorry for the screenshot; I can't believe I was able to do such a thing 3 years ago 🙄
Actually MoreLINQ/MoreLinq/TagFirstLast.cs Lines 57 to 64 in f4806f5
Since But we can add some over the implementation proposed in this PR. |
Cool.
That's a good point.
Let's benchmark and we can always add those later since the optimisations (if any) weren't getting leveraged due to chaining with |
I quickly set up some benchmarks (available here).
Thanks for sharing the benchmark numbers and code.
Since this operator is now implemented entirely on its own, it's worth updating the tests to use TestingSequence (to ensure disposal and single-pass iteration of the source sequence) like so:
diff --git a/MoreLinq.Test/TagFirstLastTest.cs b/MoreLinq.Test/TagFirstLastTest.cs
index b4bf9cb..acab948 100644
--- a/MoreLinq.Test/TagFirstLastTest.cs
+++ b/MoreLinq.Test/TagFirstLastTest.cs
@@ -26,9 +26,10 @@ public class TagFirstLastTest
public void TagFirstLastDoesOneLookAhead()
{
var source = MoreEnumerable.From(() => 123, () => 456, BreakingFunc.Of<int>());
- source.TagFirstLast((item, isFirst, isLast) => new { Item = item, IsFirst = isFirst, IsLast = isLast })
- .Take(1)
- .Consume();
+ using var result = source.TagFirstLast((item, isFirst, isLast) => new { Item = item, IsFirst = isFirst, IsLast = isLast })
+ .AsTestingSequence();
+ result.Take(1).Consume();
+
}
[Test]
@@ -41,24 +42,27 @@ public void TagFirstLastIsLazy()
public void TagFirstLastWithSourceSequenceOfZero()
{
var source = Enumerable.Empty<int>();
- var sut = source.TagFirstLast(BreakingFunc.Of<int, bool, bool, int>());
- Assert.That(sut, Is.Empty);
+ using var result = source.TagFirstLast(BreakingFunc.Of<int, bool, bool, int>())
+ .AsTestingSequence();
+ Assert.That(result, Is.Empty);
}
[Test]
public void TagFirstLastWithSourceSequenceOfOne()
{
var source = new[] { 123 };
- source.TagFirstLast((item, isFirst, isLast) => new { Item = item, IsFirst = isFirst, IsLast = isLast })
- .AssertSequenceEqual(new { Item = 123, IsFirst = true, IsLast = true });
+ using var result = source.TagFirstLast((item, isFirst, isLast) => new { Item = item, IsFirst = isFirst, IsLast = isLast })
+ .AsTestingSequence();
+ result.AssertSequenceEqual(new { Item = 123, IsFirst = true, IsLast = true });
}
[Test]
public void TagFirstLastWithSourceSequenceOfTwo()
{
var source = new[] { 123, 456 };
- source.TagFirstLast((item, isFirst, isLast) => new { Item = item, IsFirst = isFirst, IsLast = isLast })
- .AssertSequenceEqual(new { Item = 123, IsFirst = true, IsLast = false },
+ using var result = source.TagFirstLast((item, isFirst, isLast) => new { Item = item, IsFirst = isFirst, IsLast = isLast })
+ .AsTestingSequence();
+ result.AssertSequenceEqual(new { Item = 123, IsFirst = true, IsLast = false },
new { Item = 456, IsFirst = false, IsLast = true });
}
@@ -66,8 +70,9 @@ public void TagFirstLastWithSourceSequenceOfTwo()
public void TagFirstLastWithSourceSequenceOfThree()
{
var source = new[] { 123, 456, 789 };
- source.TagFirstLast((item, isFirst, isLast) => new { Item = item, IsFirst = isFirst, IsLast = isLast })
- .AssertSequenceEqual(new { Item = 123, IsFirst = true, IsLast = false },
+ using var result = source.TagFirstLast((item, isFirst, isLast) => new { Item = item, IsFirst = isFirst, IsLast = isLast })
+ .AsTestingSequence();
+ result.AssertSequenceEqual(new { Item = 123, IsFirst = true, IsLast = false },
new { Item = 456, IsFirst = false, IsLast = false },
new { Item = 789, IsFirst = false, IsLast = true });
}
With this, we'll be good to merge!
Thanks for this!
@atifaziz I think you recommended the wrong change. .AsTestingSequence() should be done on the source, and then .TagFirstLast() should be done on the TestingSequence returned by .AsTestingSequence(). Tests do not actually test that TagFirstLast() disposes properly.
Right, that's what happens when you do a review/recommendation while in a rush to catch the shops before they close. I've always suffered from mistakes when I go fast, but then when I go slow, the project suffers. Anyway, fortunately, there's more than one pair of eyes on this. @viceroypenguin Thanks for spotting this.
@Orace Sorry for misleading you. I would appreciate it if you could change the tests to chain AsTestingSequence() on the source sequence passed to the operator rather than on the operator's result.
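To illustrate why the placement matters, here is a small self-contained sketch. `DisposalTrackingSequence<T>` is a hypothetical, much simplified stand-in for MoreLINQ's TestingSequence (it only records whether its enumerator was disposed), and `LeakyPassThrough` is a made-up operator that deliberately forgets to dispose its source enumerator. Neither is part of MoreLINQ.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for TestingSequence: records disposal of its enumerator.
public sealed class DisposalTrackingSequence<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> _inner;

    public DisposalTrackingSequence(IEnumerable<T> inner) => _inner = inner;

    public bool EnumeratorDisposed { get; private set; }

    public IEnumerator<T> GetEnumerator()
    {
        try
        {
            foreach (var item in _inner)
                yield return item;
        }
        finally
        {
            // Runs when iteration completes or when a started enumerator is disposed.
            EnumeratorDisposed = true;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class DisposalDemo
{
    // Made-up operator with a bug: the source enumerator is never disposed.
    public static IEnumerable<T> LeakyPassThrough<T>(IEnumerable<T> source)
    {
        var e = source.GetEnumerator();
        while (e.MoveNext())
            yield return e.Current;
    }

    public static (bool WrappedResult, bool WrappedSource) Run()
    {
        // Wrapper around the operator's RESULT: it only observes that the
        // consumer (Take) disposed the wrapper, so the leak goes unnoticed.
        var overResult = new DisposalTrackingSequence<int>(LeakyPassThrough(new[] { 1, 2, 3 }));
        overResult.Take(1).ToList();

        // Wrapper around the SOURCE: it observes what the operator itself does
        // with the enumerator it obtained, so the leak is visible.
        var overSource = new DisposalTrackingSequence<int>(new[] { 1, 2, 3 });
        LeakyPassThrough(overSource).Take(1).ToList();

        return (overResult.EnumeratorDisposed, overSource.EnumeratorDisposed);
    }
}
```

With the wrapper on the result, the flag ends up set even though the operator leaks; with the wrapper on the source, the flag stays unset and a test asserting disposal would fail, which is the behaviour the tests in this PR are meant to have.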
Thanks!
LGTM
This implementation avoids the creation of a KeyValuePair for each item of the sequence. It improves evaluation with a resultSelector that returns 0 by a factor of 3.
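For readers who want a picture of the approach described above, the following is a minimal sketch of a standalone operator that keeps one look-ahead item in a local variable rather than allocating a pair per element. The names `TagFirstLastSketch` and `TagFirstLastStandalone` are hypothetical, and the code is an illustration under those assumptions rather than the exact code of this PR.

```csharp
using System;
using System.Collections.Generic;

public static class TagFirstLastSketch
{
    // Hypothetical name, chosen to avoid clashing with MoreLINQ's own TagFirstLast.
    public static IEnumerable<TResult> TagFirstLastStandalone<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, bool, bool, TResult> resultSelector)
    {
        // Validate eagerly; the iteration itself stays lazy.
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (resultSelector == null) throw new ArgumentNullException(nameof(resultSelector));

        return Iterator(source, resultSelector);

        static IEnumerable<TResult> Iterator(
            IEnumerable<TSource> source,
            Func<TSource, bool, bool, TResult> resultSelector)
        {
            using var e = source.GetEnumerator();

            if (!e.MoveNext())
                yield break; // empty source: the selector is never called

            var current = e.Current;
            var isFirst = true;

            // Read one item ahead so that the last item is known when it is yielded.
            while (e.MoveNext())
            {
                yield return resultSelector(current, isFirst, false);
                isFirst = false;
                current = e.Current;
            }

            yield return resultSelector(current, isFirst, true);
        }
    }
}
```

Only a single item is held back at any time, which matches the one-look-ahead behaviour checked by the TagFirstLastDoesOneLookAhead test in the diff above.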