findIndexOrEnd is faster in Data.ByteString than in Data.ByteString.Lazy for no reason. #334

noughtmare · 2020-12-18T15:06:16Z

The findIndexOrEnd function defined in Data.ByteString.Lazy is slower than the one in Data.ByteString, I think because it threads two parameters (a pointer and an index) through its loop:

bytestring/Data/ByteString/Lazy.hs

Lines 1411 to 1423 in 34f972c

    
    -- | 'findIndexOrEnd' is a variant of findIndex, that returns the length
    -- of the string if no element is found, rather than Nothing.
    findIndexOrEnd :: (Word8 -> Bool) -> P.ByteString -> Int
    findIndexOrEnd k (S.BS x l) =
        S.accursedUnutterablePerformIO $
          withForeignPtr x $ \f -> go f 0
      where
        go !ptr !n | n >= l    = return l
                   | otherwise = do w <- peek ptr
                                    if k w
                                      then return n
                                      else go (ptr `plusPtr` 1) (n+1)
    {-# INLINE findIndexOrEnd #-}

The strict findIndexOrEnd is here:

bytestring/Data/ByteString.hs

Lines 2026 to 2039 in 34f972c

    
    -- | 'findIndexOrEnd' is a variant of findIndex, that returns the length
    -- of the string if no element is found, rather than Nothing.
    findIndexOrEnd :: (Word8 -> Bool) -> ByteString -> Int
    findIndexOrEnd k (BS x l) =
        accursedUnutterablePerformIO $ withForeignPtr x g
      where
        g ptr = go 0
          where
            go !n | n >= l    = return l
                  | otherwise = do w <- peek $ ptr `plusPtr` n
                                   if k w
                                     then return n
                                     else go (n+1)
    {-# INLINE findIndexOrEnd #-}

This difference is significant: the strict version is about 1.5x faster in my benchmark. I still need to clean the benchmark up, and I don't know whether you are interested in seeing it; if you want, I can link it in a gist. (EDIT: take these results with a grain of salt; I don't think my benchmark is completely correct.)

EDIT: I have done some more benchmarks: the performance difference between the two versions is about 1.4x, but marking dropWhile inlineable makes it 3x faster than that for my benchmark. Here is the benchmark: https://gist.github.com/noughtmare/f2478b9ea7a466d33b3f0185dc51f0dd

I think it would be best if this function could be deduplicated, preferably by exporting it in Data.ByteString.Internal.

Additionally, the dropWhile function in Data.ByteString.Lazy is not marked INLINE and not even INLINABLE. I think this also causes a big performance difference in my code. Should I open a separate issue for this?


sjakobi · 2020-12-18T19:38:00Z

Nice catch!

Indeed I see no reason why D.B.Lazy shouldn't simply use the findIndexOrEnd defined in Data.ByteString. A PR to fix this would be welcome.

> Additionally, the dropWhile function in Data.ByteString.Lazy is not marked INLINE and not even INLINABLE. I think this also causes a big performance difference in my code. Should I open a separate issue for this?

Thanks for pointing this out! There are quite a few definitions in D.B.Lazy that suspiciously lack INLINE or INLINABLE pragmas. It would be good to check whether we're leaving any performance on the table there. Recording this in a separate issue sounds like a good idea! 👍
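For readers unfamiliar with the pragmas mentioned here, the following is a minimal sketch of a dropWhile-like function over lazy bytestrings carrying an INLINABLE pragma. The name `dropWhileSketch` is hypothetical and the body is an illustration written for this thread, not the library's actual definition; it only assumes the `Empty` and `Chunk` constructors exported from Data.ByteString.Lazy.Internal.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Internal as LI
import Data.Word (Word8)

-- Hypothetical illustration, not the definition used in bytestring.
-- Drops the longest prefix of bytes satisfying the predicate, chunk by chunk.
dropWhileSketch :: (Word8 -> Bool) -> L.ByteString -> L.ByteString
dropWhileSketch p = go
  where
    go LI.Empty = LI.Empty
    go (LI.Chunk c cs)
      | S.null c' = go cs            -- whole chunk dropped, keep scanning
      | otherwise = LI.Chunk c' cs   -- first non-matching byte found here
      where
        c' = S.dropWhile p c
-- INLINABLE keeps the unfolding in the interface file, so GHC may
-- specialise or inline the function at call sites in other modules.
{-# INLINABLE dropWhileSketch #-}
```

Whether INLINE or INLINABLE is the better choice for a given definition is something a benchmark would have to decide; the sketch only shows where the pragma goes.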

noughtmare · 2020-12-19T11:57:49Z

I'm actually not quite sure why findIndexOrEnd is not just exported in Data.ByteString. Maybe it has to do with its use of accursedUnutterablePerformIO?

sjakobi · 2020-12-19T12:14:50Z

> I'm actually not quite sure why findIndexOrEnd is not just exported in Data.ByteString.

I believe it's not exported because of its somewhat crude semantics where the return value has special meaning when equal to the bytestring's length.

I think we can export findIndexOrEnd from D.B.Internal, so we can use it from D.B.Lazy.
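To make the semantics concrete, here is a small sketch of the same behaviour expressed with the public API only. The name `findIndexOrEndSketch` is hypothetical; it is a model of what the function returns, not the pointer-based implementation quoted above.

```haskell
import qualified Data.ByteString as B
import Data.Maybe (fromMaybe)
import Data.Word (Word8)

-- Hypothetical reference model: the index of the first matching byte,
-- or the length of the string when no byte matches.
findIndexOrEndSketch :: (Word8 -> Bool) -> B.ByteString -> Int
findIndexOrEndSketch k bs = fromMaybe (B.length bs) (B.findIndex k bs)
```

For example, `findIndexOrEndSketch (== 0) (B.pack [1, 2, 0, 3])` is 2, while `findIndexOrEndSketch (== 0) (B.pack [1, 2, 3])` is 3, the length of the input. A caller has to compare the result against the length to tell "not found" apart from a real index, which is the special meaning described above.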

noughtmare changed the title ~~Strict findIndexOrEnd is faster than lazy findIndexOrEnd for no reason.~~ findIndexOrEnd is faster in Data.ByteString than in Data.ByteString.Lazy findIndexOrEnd for no reason. Dec 18, 2020

noughtmare changed the title ~~findIndexOrEnd is faster in Data.ByteString than in Data.ByteString.Lazy findIndexOrEnd for no reason.~~ findIndexOrEnd is faster in Data.ByteString than in Data.ByteString.Lazy for no reason. Dec 18, 2020

sjakobi added performance pr-welcome labels Dec 18, 2020

noughtmare mentioned this issue Dec 19, 2020

Deduplicate findIndexOrEnd by exporting it from Data.ByteString.Internal #337

Merged

Bodigrim closed this as completed in #337 Jan 11, 2021

Bodigrim added this to the 0.11.1.0 milestone Jan 11, 2021
