-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reimplement splits #349
Reimplement splits #349
Conversation
Using `Data.List.inits` to produce the initial segments saves quite a bit of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using AutoBench, I've compared the time perf of the two implementations and the result looks good.
fast_splits
is the proposed one, splits
is the current one.
―― Test summary ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
Programs fast_splits, splits
Data Random, size range [0,5..100]
Normalisation nf
QuickCheck ✔
GHC flags n/a
―― Analysis ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
fast_splits
Size 0 5 10 15 20 25 30 35
40 45 50 55 60 65 70 75
80 85 90 95 100
Time (μs) 0.031 0.616 2.533 4.202 5.301 7.294 15.82 15.95
17.88 22.98 26.58 33.20 35.82 44.96 57.78 111.0
118.4 124.9 202.4 272.8 271.3
Std dev (μs) 16.85
Average variance introduced by outliers: 98% (severely inflated)
Fits y = 3.49e-16 + 1.80e-14x + 8.75e-13x² + 3.35e-11x³
+ 2.56e-12x⁴
y = -7.62e-12 - 3.42e-10x - 1.16e-8x² + 3.96e-10x³
y = -3.87e-5 + 2.01e-10x + 2.82e-8x²
splits
Size 0 5 10 15 20 25 30 35
40 45 50 55 60 65 70 75
80 85 90 95 100
Time (μs) 0.016 0.847 2.865 5.787 7.992 11.59 15.40 20.79
33.53 41.96 40.39 57.93 70.99 77.91 145.4 213.9
222.8 152.8 426.5 278.9 296.1
Std dev (μs) 34.35
Average variance introduced by outliers: 98% (severely inflated)
Fits y = -2.90e-5 + 2.85e-10x + 3.65e-8x²
y = -5.65e-5 + 5.21e-7xlog₂(x)
y = 6.63e-12 + 3.08e-10x + 1.05e-8x² + 2.51e-10x³
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
Unless @HuwCampbell or @jacobstanley has a different opinion, we can merge this after addressing the style nit comment above.
I'm not into that style guide. Do what you like when you merge.
…On Fri, Jan 3, 2020, 9:34 AM Nikos Baxevanis ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In hedgehog/src/Hedgehog/Internal/Tree.hs
<#349 (comment)>
:
> @@ -320,12 +321,13 @@ mapMaybeT p m =
-- ]
--
splits :: [a] -> [([a], a, [a])]
-splits = \case
- [] ->
- []
- x : xs ->
- ([], x, xs) :
- fmap (\(as, b, cs) -> (x : as, b, cs)) (splits xs)
+-- This implementation is significantly more efficient than the others I've
+-- tried when the lists are long, thanks to the carefully optimized
+-- implementation of Data.List.inits.
+splits = \xs -> go (List.inits xs) xs
+ where
+ go (front : fronts) (x : xs) = (front, x, xs) : go fronts xs
+ go _ _ = []
(If not, we can do that instead.)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#349?email_source=notifications&email_token=AAOOF7PYXMPW6VRHLWWC3ALQ35EH7A5CNFSM4KBDHYAKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCQTRO7Y#discussion_r362829257>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7I6FLWWLEA5QUI2UUDQ35EH7ANCNFSM4KBDHYAA>
.
|
…into treeowl-splits
@moodmosaic I don't think your benchmark has enough data to really support those regression curves. There's too much noise, and presumably heavy cache effects on the low end. It all looks good for my change, but I think you'll see more pronounced/reliable effects with larger list sizes. Moreover, a cubic fit (as shown in the graph) doesn't seem to have any obvious theoretical support; I'd expect the reality to be approximately piecewise quadratic. |
Yep, agreed, FYI it's tested with sizes up to 100, and the cubit as well as the NF were just the defaults. I just wanted to have something to refer to, if needed. Better from nothing, I suppose. |
Using
Data.List.inits
to produce the initial segments savesquite a bit of time.