Memory Leak associated with apply #52

Closed
bjoeris opened this Issue Feb 18, 2013 · 12 comments

Comments

Projects
None yet
3 participants
@bjoeris

bjoeris commented Feb 18, 2013

I've run into a memory leak exemplified by the following code, which fills up about 100MB/sec. The key factors seem to be:

  1. An event foo that is firing rapidly
  2. An event bar that is constructed from apply; it makes no difference whether bar is firing or not, or whether bar is in any way related to foo.
  3. Both events reactimated (in this case to do nothing)

I haven't been able to figure much out from profiling. The allocations seem to be associated with dozens of different cost centers.

I am using GHC 7.6.1 and reactive-banana 0.7.1.1.

{-# LANGUAGE ScopedTypeVariables #-}
module Main where

import Reactive.Banana
import Reactive.Banana.Frameworks

networkDescription :: forall t. (Frameworks t) => AddHandler () -> Moment t ()
networkDescription addHandler =  do
    foo <- fromAddHandler addHandler
    let bar = (pure (const True) :: Behavior t (()->Bool)) `apply` foo
    --let bar = (pure (const True) :: Behavior t (()->Bool)) `apply` never :: Event t Bool -- this line has the same problem
    --let bar = (const True) `fmap` foo                                                    -- but this line fixes the problem

    -- commenting out either reactimate also fixes the problem
    reactimate $ fmap (const $ return()) foo
    reactimate $ fmap (const $ return ()) bar

main::IO()
main = do
    (addHandler,runHandlers) <- newAddHandler
    network <- compile (networkDescription addHandler)
    actuate $ network
    let loop = do
        runHandlers ()
        loop
    loop
@HeinrichApfelmus

This comment has been minimized.

Show comment Hide comment
@HeinrichApfelmus

HeinrichApfelmus Feb 18, 2013

Owner

Thanks a lot for your report!

Other people have reported space leaks as well, so I am currently working on it. The core issue is that behavior values need to be evaluated fully, which reactive-banana currently doesn't do very well. I can still reproduce your report in the current devel branch, though the rate has dropped to 0.5MB/sec.

Owner

HeinrichApfelmus commented Feb 18, 2013

Thanks a lot for your report!

Other people have reported space leaks as well, so I am currently working on it. The core issue is that behavior values need to be evaluated fully, which reactive-banana currently doesn't do very well. I can still reproduce your report in the current devel branch, though the rate has dropped to 0.5MB/sec.

HeinrichApfelmus added a commit that referenced this issue Mar 16, 2013

Use the new `Data.Vault.Strict` to store the latch values in a `Vault…
…` that is strict both in the spine and the values.

Fix strictness of `applyL` to force applications of `readValueL` that would otherwise retain the `Vault` of old latch values.

Fix issue #52.
@HeinrichApfelmus

This comment has been minimized.

Show comment Hide comment
@HeinrichApfelmus

HeinrichApfelmus Mar 16, 2013

Owner

Thanks again for reporting the space leak, bjoeris!

I have now figured out where exactly it came from and how to fix it. The latest development version eliminates the leak. (It does introduce a couple of unrelated test case failures regarding recursion, though, so you may not want to use it for production just yet.)

Please don't hesitate to report any additional leaks that you might encounter!

Owner

HeinrichApfelmus commented Mar 16, 2013

Thanks again for reporting the space leak, bjoeris!

I have now figured out where exactly it came from and how to fix it. The latest development version eliminates the leak. (It does introduce a couple of unrelated test case failures regarding recursion, though, so you may not want to use it for production just yet.)

Please don't hesitate to report any additional leaks that you might encounter!

@rwbarton

This comment has been minimized.

Show comment Hide comment
@rwbarton

rwbarton Oct 19, 2013

This fix (or something on the develop branch anyways, there isn't much) definitely fixed a space leak for me also. Can it get merged into master at some point, pretty please? :)

This fix (or something on the develop branch anyways, there isn't much) definitely fixed a space leak for me also. Can it get merged into master at some point, pretty please? :)

@rwbarton

This comment has been minimized.

Show comment Hide comment
@rwbarton

rwbarton Oct 19, 2013

Oh wait, this fix also causes reactive-banana to compute many things that aren't needed, which slows my program down a lot :(

Oh wait, this fix also causes reactive-banana to compute many things that aren't needed, which slows my program down a lot :(

@HeinrichApfelmus

This comment has been minimized.

Show comment Hide comment
@HeinrichApfelmus

HeinrichApfelmus Oct 20, 2013

Owner

Well, you only have the choice between computing things or leaking them... Reactive-banana is probably still to blame, though. Any chance you could give me a short overview of the problematic code and/or a link to it, so I can have a look?

Owner

HeinrichApfelmus commented Oct 20, 2013

Well, you only have the choice between computing things or leaking them... Reactive-banana is probably still to blame, though. Any chance you could give me a short overview of the problematic code and/or a link to it, so I can have a look?

@rwbarton

This comment has been minimized.

Show comment Hide comment
@rwbarton

rwbarton Oct 21, 2013

Let me give an overview first, and I can try to construct a small example later if it would be helpful.

I have code that in essence is like this:

e = liftA2 mplus (f <$> a <*> b) (g <$> c <*> d)

where a, b, c, d, e are Behaviors. a through d maybe come from steppers or accumB or something, in which case I want to compute them eagerly. But in my case f is usually cheap to compute and usually Just something, while g is usually expensive to compute. So I don't want the values of g <$> c <*> d to be computed when I don't need them.

Now I could write

e = (\a_ b_ c_ d_ -> f a_ b_ `mplus` g c_ d_) <$> a <*> b <*> c <*> d

to avoid computing values of g when they're not needed, but having to apply that transformation everywhere in my program limits my ability to modularize (imagine that e is an input to another similar Behavior).

I'm imagining that Behaviors constructed from things like accumB (definitely) or stepper (maybe) would have their values forced eagerly, while Behaviors constructed from other Behaviors with (<$>) and (<*>) don't have their values forced except when needed by normal Haskell computation (such as if an Event produced by applying the Behavior is reactimated, or used in an accumB). I don't think not forcing the values of a Behavior constructed with (<$>)/(<*>) can cause a space/time leak, since those values are determined by the values of other Behaviors at the same time.

I guess I can implement this policy in my end application with explicit Boxes. I might try that (using the develop branch) and see how it goes.

Let me give an overview first, and I can try to construct a small example later if it would be helpful.

I have code that in essence is like this:

e = liftA2 mplus (f <$> a <*> b) (g <$> c <*> d)

where a, b, c, d, e are Behaviors. a through d maybe come from steppers or accumB or something, in which case I want to compute them eagerly. But in my case f is usually cheap to compute and usually Just something, while g is usually expensive to compute. So I don't want the values of g <$> c <*> d to be computed when I don't need them.

Now I could write

e = (\a_ b_ c_ d_ -> f a_ b_ `mplus` g c_ d_) <$> a <*> b <*> c <*> d

to avoid computing values of g when they're not needed, but having to apply that transformation everywhere in my program limits my ability to modularize (imagine that e is an input to another similar Behavior).

I'm imagining that Behaviors constructed from things like accumB (definitely) or stepper (maybe) would have their values forced eagerly, while Behaviors constructed from other Behaviors with (<$>) and (<*>) don't have their values forced except when needed by normal Haskell computation (such as if an Event produced by applying the Behavior is reactimated, or used in an accumB). I don't think not forcing the values of a Behavior constructed with (<$>)/(<*>) can cause a space/time leak, since those values are determined by the values of other Behaviors at the same time.

I guess I can implement this policy in my end application with explicit Boxes. I might try that (using the develop branch) and see how it goes.

@rwbarton

This comment has been minimized.

Show comment Hide comment
@rwbarton

rwbarton Oct 21, 2013

You can also see it as a violation of the Functor laws with respect to strictness (if you believe in such things) if fmap (const () . g) x doesn't ever evaluate g but fmap (const ()) (fmap g x) does.

You can also see it as a violation of the Functor laws with respect to strictness (if you believe in such things) if fmap (const () . g) x doesn't ever evaluate g but fmap (const ()) (fmap g x) does.

@HeinrichApfelmus

This comment has been minimized.

Show comment Hide comment
@HeinrichApfelmus

HeinrichApfelmus Oct 21, 2013

Owner

Ok, I see. At the moment, the development branch evaluates all behaviors strictly, which is not what you have in mind. The reason is that there <$> can actually create a space leak because it would keep around unevaluated references to old versions of the Behavior store. However, this is a problem with the implementation, and can be fixed by using Boxes internally.

Put differently, I think the workaround with Box will work -- that's how I would implement it internally. (You may want to test it on a really minimal example first, though, just in case.)

I'm not entirely sure whether it's really canonical to make <$> and <*> lazy, but everyone seems to assume that they are, so it's probably a good idea to follow what people expect from a lazy non-strict language.

Owner

HeinrichApfelmus commented Oct 21, 2013

Ok, I see. At the moment, the development branch evaluates all behaviors strictly, which is not what you have in mind. The reason is that there <$> can actually create a space leak because it would keep around unevaluated references to old versions of the Behavior store. However, this is a problem with the implementation, and can be fixed by using Boxes internally.

Put differently, I think the workaround with Box will work -- that's how I would implement it internally. (You may want to test it on a really minimal example first, though, just in case.)

I'm not entirely sure whether it's really canonical to make <$> and <*> lazy, but everyone seems to assume that they are, so it's probably a good idea to follow what people expect from a lazy non-strict language.

@rwbarton rwbarton referenced this issue in rwbarton/rw Oct 22, 2013

Open

fix memory leak #1

@rwbarton

This comment has been minimized.

Show comment Hide comment
@rwbarton

rwbarton Nov 11, 2013

There's still a laziness issue in the develop version that can cause latch values to retain references to earlier versions of the value store. In accumP:

... mdo
    x       <- stepperL a result
    result  <- mkPulse $
        {-# SCC accumP #-} (\x -> fmap ($ x)) <$> readLatchP x <*> readPulseP p

there is nothing to force the lookup of the old latch value and so if the function read from the pulse happens to be lazy (e.g. a data constructor), while the new value of the stepper will be reduced to WHNF by the strict Vault, the lookup of the old value will not be evaluated at that time.

It's true that, in this situation where the values of the argument to accumP are lazy functions, the values of the resulting pulse will necessarily be growing in size also. So this isn't exactly a classic space leak, in which a program that would run in constant space instead takes linear space as a function of time. However, there is still a big difference between retaining a growing value for this single pulse, which may be what the programmer needs anyways (to consume later), and retaining all previous versions of the entire value store.

The fix is just to make the application stricter:

... mdo
    x       <- stepperL a result
    result  <- mkPulse $
        {-# SCC accumP #-} (\x -> fmap ($! x)) <$> readLatchP x <*> readPulseP p

so that (in particular) the lookup of the old value will be forced by the insertion of the new value.


However, I think the whole strategy of ensuring that latch values are evaluated to WHNF before being stored in the Vault is not necessary. You should be able to get the desired level of strictness with a lazy Vault just by being careful to evaluate the lookups needed to build a new value before inserting it, without touching the latch values themselves. This is possible because the lookup operation for Vault (or equivalently for HashMap) returns a Maybe a, and you can force the Nothing or Just constructor to do the lookup, thereby eliminating any references to the Vault itself, without evaluating the actual value. On top of that, you can choose any desired strictness semantics for your combinators; for example it's surely best to have a strict accumulator for accumP but I don't think strictness is good for applyP.

In mkLatch's getValueL you currently pass the result of lookup to maybe err id, which destroys your ability to do that exact extent of evaluation; but you can store the result of getValueL in a Box to regain that ability. I think this is simpler than forcing everything all the time at the lowest level of the implementation and then using Box-valued latches (actually storing Boxes in the value store, rather than just using them as temporaries) to regain laziness where needed.

I'll try to implement this later and will send a patch.

There's still a laziness issue in the develop version that can cause latch values to retain references to earlier versions of the value store. In accumP:

... mdo
    x       <- stepperL a result
    result  <- mkPulse $
        {-# SCC accumP #-} (\x -> fmap ($ x)) <$> readLatchP x <*> readPulseP p

there is nothing to force the lookup of the old latch value and so if the function read from the pulse happens to be lazy (e.g. a data constructor), while the new value of the stepper will be reduced to WHNF by the strict Vault, the lookup of the old value will not be evaluated at that time.

It's true that, in this situation where the values of the argument to accumP are lazy functions, the values of the resulting pulse will necessarily be growing in size also. So this isn't exactly a classic space leak, in which a program that would run in constant space instead takes linear space as a function of time. However, there is still a big difference between retaining a growing value for this single pulse, which may be what the programmer needs anyways (to consume later), and retaining all previous versions of the entire value store.

The fix is just to make the application stricter:

... mdo
    x       <- stepperL a result
    result  <- mkPulse $
        {-# SCC accumP #-} (\x -> fmap ($! x)) <$> readLatchP x <*> readPulseP p

so that (in particular) the lookup of the old value will be forced by the insertion of the new value.


However, I think the whole strategy of ensuring that latch values are evaluated to WHNF before being stored in the Vault is not necessary. You should be able to get the desired level of strictness with a lazy Vault just by being careful to evaluate the lookups needed to build a new value before inserting it, without touching the latch values themselves. This is possible because the lookup operation for Vault (or equivalently for HashMap) returns a Maybe a, and you can force the Nothing or Just constructor to do the lookup, thereby eliminating any references to the Vault itself, without evaluating the actual value. On top of that, you can choose any desired strictness semantics for your combinators; for example it's surely best to have a strict accumulator for accumP but I don't think strictness is good for applyP.

In mkLatch's getValueL you currently pass the result of lookup to maybe err id, which destroys your ability to do that exact extent of evaluation; but you can store the result of getValueL in a Box to regain that ability. I think this is simpler than forcing everything all the time at the lowest level of the implementation and then using Box-valued latches (actually storing Boxes in the value store, rather than just using them as temporaries) to regain laziness where needed.

I'll try to implement this later and will send a patch.

@HeinrichApfelmus

This comment has been minimized.

Show comment Hide comment
@HeinrichApfelmus

HeinrichApfelmus Nov 15, 2013

Owner

@rwbarton Thanks for reporting this once more!

You should be able to get the desired level of strictness with a lazy Vault just by being careful to evaluate the lookups needed to build a new value before inserting it, without touching the latch values themselves

Not entirely sure about that. If I recall correctly, the Build allows for some kinds of recursion that don't go well with evaluating the lookup early. The nice thing about the strict Vault is that it has a space invariant, so I get all the guarantees I desire without having to control the evaluation order in minute detail (which would be hopeless).

Concerning recursion, I have found that I have to implement it differently anyway, so maybe that helps in the long run.

Owner

HeinrichApfelmus commented Nov 15, 2013

@rwbarton Thanks for reporting this once more!

You should be able to get the desired level of strictness with a lazy Vault just by being careful to evaluate the lookups needed to build a new value before inserting it, without touching the latch values themselves

Not entirely sure about that. If I recall correctly, the Build allows for some kinds of recursion that don't go well with evaluating the lookup early. The nice thing about the strict Vault is that it has a space invariant, so I get all the guarantees I desire without having to control the evaluation order in minute detail (which would be hopeless).

Concerning recursion, I have found that I have to implement it differently anyway, so maybe that helps in the long run.

HeinrichApfelmus added a commit that referenced this issue Jan 18, 2014

Latch accumulation does not display the foldl-like space leak anymore.
…#52

The 'Box' type still allows lazy evaluation when using 'fmap' or the Control.Applicative combinators.
@HeinrichApfelmus

This comment has been minimized.

Show comment Hide comment
@HeinrichApfelmus

HeinrichApfelmus Jan 18, 2014

Owner

Alright everyone, I was finally able to put aside some time to properly address these issues. As of commit e879caa , the development version of reactive-banana has greatly improved run-time performance. In particular, the space leak here should be fixed while still allowing lazy evaluation for latch values. All the existing test cases pass, so feel free to use this version right away if you need it sooner rather than later.

Owner

HeinrichApfelmus commented Jan 18, 2014

Alright everyone, I was finally able to put aside some time to properly address these issues. As of commit e879caa , the development version of reactive-banana has greatly improved run-time performance. In particular, the space leak here should be fixed while still allowing lazy evaluation for latch values. All the existing test cases pass, so feel free to use this version right away if you need it sooner rather than later.

@HeinrichApfelmus

This comment has been minimized.

Show comment Hide comment
@HeinrichApfelmus

HeinrichApfelmus Jan 25, 2014

Owner

I am about to release reactive-banana-0.8.0.0 from the current master branch (efda6ab) where this space leak should be fixed for good. Please reopen the issue if it doesn't work for you.

Owner

HeinrichApfelmus commented Jan 25, 2014

I am about to release reactive-banana-0.8.0.0 from the current master branch (efda6ab) where this space leak should be fixed for good. Please reopen the issue if it doesn't work for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment