How to make Rewrite Rules Fire #4

After trying a simple test, I noticed some strange performance results from stylistic changes to the code.

For example,

{-# LANGUAGE FlexibleContexts #-}

import Control.DeepSeq
import Data.Int
import qualified Data.Vector.Unboxed as U

{-# INLINE f #-}
f :: U.Vector Int64 -> U.Vector Int64 -> U.Vector Int64
f = U.zipWith (+) -- version 1
--f x = U.zipWith (+) x -- version 2
--f x = (U.zipWith (+) x) . id -- version 3
--f x y = U.zipWith (+) x y -- version 4

main = do
  let iters = 100
      dim = 221184
      y = U.replicate dim 0 :: U.Vector Int64
  let ans = iterate (f y) y !! iters
  putStr $ (show $ U.foldl1' (+) ans)

Versions 1 and 2 run in 1.6 seconds, while versions 3 and 4 run in 0.09 seconds (with vector- and GHC 7.6.2). According to this answer, the difference arises because the first two versions end up using the generic vector code rather than the low-level imperative code.

Is there anything a user can do to ensure the best vector code is used, without more or less guessing at a style that will make the rule fire? (I'm thinking of pragmas, compiler flags, any other hints, intuition about why style "x" is a bad choice, etc.)


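I think this is a quirk of GHC's inliner. It will inline a function marked INLINE only if the application is saturated, that is, applied to at least as many arguments as appear on the left-hand side of the definition. For example, it will inline zipWith f xs ys but not zipWith f. Consider the definition of zipWith: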

zipWith :: (Vector v a, Vector v b, Vector v c) => (a -> b -> c) -> v a -> v b -> v c
{-# INLINE zipWith #-}
zipWith f xs ys = unstream (Bundle.zipWith f (stream xs) (stream ys))
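To make the saturation rule concrete, here is a minimal sketch that uses only lists from base. The function `combine` is a hypothetical stand-in for a library function written like zipWith above (three arguments on the left-hand side plus an INLINE pragma); it is not part of vector, and the `sumV*` names simply mirror the version numbers in the question.

```haskell
-- Hypothetical stand-in for a library function; not part of vector.
-- Three arguments appear to the left of '=', so GHC treats a call as
-- saturated only when all three are supplied.
{-# INLINE combine #-}
combine :: (a -> b -> c) -> [a] -> [b] -> [c]
combine f xs ys = zipWith f xs ys

-- Like version 1: combine receives one argument, so the call is unsaturated.
sumV1 :: [Int] -> [Int] -> [Int]
sumV1 = combine (+)

-- Like version 2: two arguments, still unsaturated.
sumV2 :: [Int] -> [Int] -> [Int]
sumV2 x = combine (+) x

-- Like version 4: all three arguments are present, so the INLINE
-- unfolding of combine can be used at this call site.
sumV4 :: [Int] -> [Int] -> [Int]
sumV4 x y = combine (+) x y
```

All three wrappers compute the same result; only the inlining behaviour is expected to differ. One way to check what actually happened in a given build is to compile with -O2 together with -ddump-simpl or -ddump-rule-firings and look at the output for the function in question.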

The only remedy I can think of is to eta-contract zipWith, so that only f appears on the left-hand side:

zipWith f = \xs ys -> ...
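A filled-in sketch of the eta-contracted shape, again on plain lists rather than vector's stream machinery. `combineEta` and `sumPointFree` are hypothetical names used only for illustration; the point is that with a single argument on the left-hand side, a point-free call such as `combineEta (+)` already counts as saturated.

```haskell
-- Hypothetical stand-in written in the eta-contracted style: only 'f' is
-- on the left-hand side, and the remaining arguments are bound by a lambda.
{-# INLINE combineEta #-}
combineEta :: (a -> b -> c) -> [a] -> [b] -> [c]
combineEta f = \xs ys -> zipWith f xs ys

-- A point-free caller, like version 1 in the question. combineEta is
-- applied to one argument, which matches the number on its left-hand
-- side, so the call is saturated from the inliner's point of view.
sumPointFree :: [Int] -> [Int] -> [Int]
sumPointFree = combineEta (+)
```

The trade-off is that this has to be done by the library author; a user who cannot change the library can get a similar effect by writing the call site fully applied, as in version 4 of the question.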

Here's the relevant GHC Trac ticket. Someone did indeed suggest eta-contracting zipWith as a solution, but hopefully there will be a better way soon.

It's done to allow inlining even if the function application is not fully
saturated. This is important because a simple stylistic change could change
performance by orders of magnitude.

