Make Binary.lines O(n) instead of O(n^2) #209

Merged
merged 2 commits into from Apr 21, 2015

Conversation

Projects
None yet
2 participants
@ndmitchell
Contributor

ndmitchell commented Apr 21, 2015

Given the benchmark:

import Control.Monad.Trans.Resource
import Data.Conduit.Binary as C
import Data.Conduit as C
import Data.Conduit.List as C
import Data.Void
import qualified Data.ByteString.Char8 as BS
import Control.Monad
import Control.Monad.IO.Class
import System.Time.Extra
import Numeric.Extra

main = do
    let s = BS.singleton 'x'
    forM_ [0..1000] $ \i -> do
        (t, res) <- duration $ runConduit $ replicateM_ (i*10000) (yield s) =$= C.lines =$= C.map BS.length =$= C.fold (+) 0
        print (showDuration $ t / intToDouble i, res)

In other words, generate a lot of short bytestrings, call lines (which basically concats it, since there are no newlines), count the total number of lines and print the time per 10K characters, I see:

  • Before this patch: at 10K it takes 0.01s, at 300K it takes 0.13s per 10K elements.
  • After this patch: it takes 0.00s per 10K elements at all numbers.

The cause of the poor performance before was an O(n^2) term. Someone did the "append as a function trick" - which kinda works for lists, but doesn't work at all for bytestrings. I did the simpler (and in my experience, more performant) list that you reverse construction, which actually works as you then call S.concat once.

snoyberg added a commit that referenced this pull request Apr 21, 2015

Merge pull request #209 from ndmitchell/master
Make Binary.lines O(n) instead of O(n^2)

@snoyberg snoyberg merged commit 91e8ed8 into snoyberg:master Apr 21, 2015

@snoyberg

This comment has been minimized.

Show comment
Hide comment
@snoyberg

snoyberg Apr 21, 2015

Owner

Thanks. I haven't run it myself, but I'll trust your benchmark numbers. This is the first I've heard of a O(n^2) generated by using difference lists, is this a well known phenomenon?

Owner

snoyberg commented Apr 21, 2015

Thanks. I haven't run it myself, but I'll trust your benchmark numbers. This is the first I've heard of a O(n^2) generated by using difference lists, is this a well known phenomenon?

@ndmitchell

This comment has been minimized.

Show comment
Hide comment
@ndmitchell

ndmitchell Apr 21, 2015

Contributor

Difference lists tend to suck performance wise due to a high constant factor (thunks do not have zero cost). In this case the difference list wasn't the issue, but the lots of appends. This patch transforms:

(a `S.append` b) `S.append` c

Into:

S.concat [a,b,c]

Which does one allocation for the entire size, instead of n allocations of increasingly long prefixes of the list (which in the limit take n), making n^2. It doesn't matter whether you run the appends left or right associated, they're always n^2 as S.append a b takes O(a+b). In contrast, for lists/String, a ++ b takes O(a) so if you associate the right way you get O(n).

Contributor

ndmitchell commented Apr 21, 2015

Difference lists tend to suck performance wise due to a high constant factor (thunks do not have zero cost). In this case the difference list wasn't the issue, but the lots of appends. This patch transforms:

(a `S.append` b) `S.append` c

Into:

S.concat [a,b,c]

Which does one allocation for the entire size, instead of n allocations of increasingly long prefixes of the list (which in the limit take n), making n^2. It doesn't matter whether you run the appends left or right associated, they're always n^2 as S.append a b takes O(a+b). In contrast, for lists/String, a ++ b takes O(a) so if you associate the right way you get O(n).

@snoyberg

This comment has been minimized.

Show comment
Hide comment
@snoyberg

snoyberg Apr 21, 2015

Owner

Oh, of course, I just misread what you wrote originally. Thanks for the clarification.

Owner

snoyberg commented Apr 21, 2015

Oh, of course, I just misread what you wrote originally. Thanks for the clarification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment