Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quadratic performance on left-associated binds #234

Open
tomjaguarpaw opened this issue Oct 2, 2021 · 7 comments
Open

Quadratic performance on left-associated binds #234

tomjaguarpaw opened this issue Oct 2, 2021 · 7 comments

Comments

@tomjaguarpaw
Copy link

tomjaguarpaw commented Oct 2, 2021

The problem

Consider walking over a tree using a streaming library to do something at every leaf:

data Tree = Branch Tree Tree | Leaf Int

walkTreeTransformer :: (Monad (m IO), MonadTrans m)
                    => (Int -> IO ()) -> Tree -> m IO ()
walkTreeTransformer doSomething = loop
  where loop = \case
          Leaf i -> lift (doSomething i)
          Branch t1 t2 -> do
            loop t1
            loop t2

On left-skewed trees, i.e. trees that will end up generating left-associated binds, for example those generated by leftSkewed

-- Fully evaluated left-skewed tree
leftSkewed :: Int -> Tree
leftSkewed 0 = Leaf 0
leftSkewed n = (Branch $! leftSkewed (n - 1)) (Leaf n)

the performance of your streaming library is quadratic. In fact performance is quadratic under

(Neither streamly nor conduit have quadratic performance)

The plot below demonstrates the claim. The code to generate the plot is available at https://github.com/tomjaguarpaw/streaming-benchmark. The compiler was GHC 8.10.7.

image

The solution

Firstly, before talking about a solution, do you actually consider this a bug? I do, and I think the risk of hitting unexpected quadratic performance makes this library unsuitable for production use as it is. On the other hand maybe you have a good reason that I shouldn't think that. If so I would be interested to hear it.

If you do think this is a bug then let's think about what can be done. I know of two straightforward options:

  1. Add an explicit constructor for binds and then reassociate binds during elimination
  2. Use Codensity (streamly essentially uses handwritten Codensity.)

In my benchmarks explicit bind comes out slightly ahead. (The examples are implemented to match streaming but the same techniques ought to apply to your library.)

Perhaps there is another option more suitable for your library.

I welcome your thoughts.

(Related to #131)

@tomjaguarpaw
Copy link
Author

tomjaguarpaw commented Oct 3, 2021

[N.B. conduit doesn't have quadratic performance. I made a mistake when setting up the benchmarks: tomjaguarpaw/streaming-benchmark@2a965a7]

@mitchellwrosen
Copy link
Contributor

For what it's worth, this is mentioned in the tutorial: https://hackage.haskell.org/package/pipes-4.3.16/docs/Pipes-Tutorial.html#g:10

Here's an old ticket about it: #100

@tomjaguarpaw
Copy link
Author

tomjaguarpaw commented Oct 3, 2021

Thanks for pointing that out. My comments:

  • The tell-tale sign of the issue is the use of "List done wrong" (i.e. a monad action returning a list).

    I don't agree with this (the same claim is made in the tutorial). There's no monadic action returning a list in my example.

  • conduit does not do this optimization either, but is currently the mostly highly used streaming library, so I don't think this flaw is as serious as you make it out to be.

    This claim is no longer true. Michael Snoyman corrected me about it this morning.

  • I do not agree with the argument that just because an uninformed user can potentially write code with poor time complexity that it is a serious bug. If I did, then I would also conclude that we should abandon all linked lists for difference lists because uninformed Haskell beginners can (and do) write linked list code that gives quadratic time complexit

    Personally I do think that we should abandon linked lists but not for difference lists, for Seq, Vector/Array or for a streaming library as appropriate.

  • I don't think that the option of a Bind constructor plus carefully written run has been considered. Perhaps that would be considered less complex than Codensity.

@Gabriella439
Copy link
Owner

I don't have plans to fix this myself, but I'd accept a pull request if it didn't lead to a significant performance regression for other benchmarks

@tomjaguarpaw
Copy link
Author

I don't have plans to fix this myself, but I'd accept a pull request if it didn't lead to a significant performance regression for other benchmarks

Which sort of change would you accept? Previously you said you would prefer not to CPS the internal representation. The alternative is to add a Bind constructor plus carefully written run (and perhaps carefully written other things).

@Gabriella439
Copy link
Owner

Gabriella439 commented Oct 11, 2021

Bind + carefully written run, but I'd have to see the full change to be sure, since I'm not sure what other changes are entailed

@tomjaguarpaw
Copy link
Author

Thanks Gabriella. To clarify, I don't have plans to fix this either but I think it's helpful to have the requirements spelled out like this in case anyone else wants to take it on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants