Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmarks and core inspections need to be made #22

Open
kvanbere opened this issue Jan 19, 2014 · 5 comments
Open

Benchmarks and core inspections need to be made #22

kvanbere opened this issue Jan 19, 2014 · 5 comments

Comments

@kvanbere
Copy link

There's a lot of new exciting things being added in at the moment, which will make this library the BEST way to setup a server so far!

When development settles down and the API becomes concrete, to complement the excellent performance of pipes, this library should be benchmarked/micro-optimised and profiled to make sure it's server-ready (and catch deadlocks or loopholes that may have been missed in the process).

This is a ticket to hold performance notes and ideas.

@kvanbere
Copy link
Author

I'm personally a little bit sus about all the closures being used and such.

@Gabriella439
Copy link
Owner

Just so other people now, right now most of my optimization work is being poured into pipes-parse, which is the current performance bottle-neck. The issue is that FreeT makes binds in the base monad obligatory, and I'm working on a replacement type that will make the bind optional (at the slight cost of making the monad transformer laws slightly incorrect). The type I'm looking at to replace FreeT is:

data List f m r = Impure (m (List f m r)) | Cons (f (List f m r)) | Nil r

I'm actually planning on going one step further and replacing the use of Producers within pipes-parse with the following type:

data Of a r = a `Onto` r

-- example type signature I'm aiming for
lines :: List (Of String) IO r -> List (List (Of String IO) IO r

Here's an example of what code written using List would look like:

zip :: Monad m => List (Of a) m r -> List (Of b) m r -> List (Of (a, b)) m r
zip l1 l2 = case l1 of
    Nil            r    -> Nil r
    Impure         io   -> Impure (liftM (`zip` l2) io)
    Cons (a `Onto` l1') -> zipR l2
      where
        zipR l2 = case l2 of
            Nil            r    -> Nil r
            Impure         io   -> Impure (liftM zipR io)
            Cons (b `Onto` l2') -> Cons ((a, b) `Onto` zip l1' l2')

There would be an isomorphism from List (Of a) m r to Producer a m r for compatibility with pipes, and this would play nicely with the upcoming lens support.

Note that the reason I'm switching to List is not only performance, but to also simplify the writing of lenses for the upcoming lens-based parsing approach. Right now my observation is that writing new lenses is the difficult part of the new pipes-parse API and the List type greatly simplifies this process.

@Gabriella439
Copy link
Owner

Also, the use of closures is primarily for two reasons:

  • Performance (I'm manually implementing the worker/wrapper transformation for the compiler)
  • Space leaks. You'll notice that some of the closures appear to be doing absolutely nothing (like stdinLn from Pipes.Prelude). This is because back in the old days of pipes there would be top-level recursive functions and they would sometimes trigger bizarre corner cases that would cause space leaks. Making a trivial call to a recursive function (i.e. go) would fix this problem. This is why none of the top-level functions are directly recursive.

@kvanbere
Copy link
Author

Cheers! When I was talking about closures I was more talking about being sus about the dispatch performance for read/write for pipes-concurrency, since they use PAPs heavily.

Keep up the good work.

@Gabriella439
Copy link
Owner

If you want to know where might be room for performance improvement, I can think of three main areas:

  • Decrease the cost of using lift. This is tricky and the only approach I know of requires very sophisticated rewrite rules using the monad transformer laws, including rules for functions defined in Control.Monad, which is pretty sketchy.
  • Decrease the cost of pull-based composition (i.e. (>->)) when it cannot be transformed to use shortcut fusion. This might require playing with closures or INLINE/INLINABLE annotations.
  • Improve performance of pipes-bytestring when dealing with small chunks, especially when writing to a handle. Right now, stdout does not buffer bytestring chunks so if you do a manipulation that splits the bytestring into lots of tiny pieces you get tons of small writes which kills performance. There are other operations which also suffer from shredding the bytestring, but this is the most notable one. However, this optimization should probably wait until after changing pipes-parse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants