New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed memory-leak in incremental readEvents. #37

Merged
merged 1 commit into from Jul 10, 2018

Conversation

Projects
None yet
2 participants
@joecrayne
Contributor

joecrayne commented Jul 9, 2018

I noticed a memory leak when opening large eventlog files. I'm not sure why the original version runs linear in space, but replacing the helper function f inside readEvents with a version that avoids break seems to fix it. I also fixed a few comment typos while I was in there.

@maoe

This comment has been minimized.

Show comment
Hide comment
@maoe

maoe Jul 9, 2018

Member

How did you check the original code had a space leak? ghc-events show has a known space leak because of the use of sortEvents. Or did you write a function that uses readEvents?

Member

maoe commented Jul 9, 2018

How did you check the original code had a space leak? ghc-events show has a known space leak because of the use of sortEvents. Or did you write a function that uses readEvents?

@maoe maoe self-requested a review Jul 9, 2018

@joecrayne

This comment has been minimized.

Show comment
Hide comment
@joecrayne

joecrayne Jul 9, 2018

Contributor

I used my own program that uses GHC.RTS.Events.Incremental.readEventLog. It's just some experimental nonsense at the moment, but I've uploaded the source for you at https://github.com/joecrayne/eventlog-monitor

Contributor

joecrayne commented Jul 9, 2018

I used my own program that uses GHC.RTS.Events.Incremental.readEventLog. It's just some experimental nonsense at the moment, but I've uploaded the source for you at https://github.com/joecrayne/eventlog-monitor

@maoe

This comment has been minimized.

Show comment
Hide comment
@maoe

maoe Jul 9, 2018

Member

Thanks for providing the source code. I’ll take a look.

Member

maoe commented Jul 9, 2018

Thanks for providing the source code. I’ll take a look.

@maoe

This comment has been minimized.

Show comment
Hide comment
@maoe

maoe Jul 10, 2018

Member

Thanks to the eventlog-monitor I now see the space leak.

The tuple (rs, ls) that was returned by break was allocated in the heap. If you look at the Core:

readEvents
  = \ (header_a79H :: Header) ->
      let {
        g_s9cu :: Decoder Event
        [LclId]
        g_s9cu = decodeEvents header_a79H } in
      \ (x_i8XX :: BL.ByteString) ->
        let {
          ds_s9cw [Dmd=<L,U(1*U,1*U)>]
            :: ([Either String Event], [Either String Event])
          [LclId]
          ds_s9cw
            = src<libraries/base/GHC/Base.hs:1272:19-23>
              case GHC.List.$wbreak
                     @ (Either String Event)
                     (isLeft @ String @ Event)
                     (GHC.RTS.Events.Incremental.readEvents_go1
                        g_s9cu (src<libraries/base/GHC/Base.hs:1272:22> x_i8XX))
              of
              { (# ww1_i8Y4, ww2_i8Y5 #) ->
              (ww1_i8Y4, ww2_i8Y5)
              } } in
        src<libraries/base/GHC/Base.hs:1272:1-23>
        (case ds_s9cw of { (rs_a8HO, ls_a8HQ) ->
         rights @ String @ Event rs_a8HO
         },
         src<libraries/base/Data/Maybe.hs:232:1-42>
         src<libraries/base/Data/Either.hs:190:1-27>
         case ds_s9cw of { (rs_a8HO, ls_a8HQ) ->
         src<libraries/base/GHC/Base.hs:(996,1)-(999,35)>
         GHC.RTS.Events.Incremental.readEvents_go ls_a8HQ
         })

ds_s8cw is bound by let.

One way to fix this is to float out break like this:

diff --git a/src/GHC/RTS/Events/Incremental.hs b/src/GHC/RTS/Events/Incremental.hs
index b299882..d83dd1f 100644
--- a/src/GHC/RTS/Events/Incremental.hs
+++ b/src/GHC/RTS/Events/Incremental.hs
@@ -124,15 +124,12 @@ readHeader = go $ Left decodeHeader
 -- Note that it doesn't fail if it consumes all input in the middle of decoding
 -- of an event.
 readEvents :: Header -> BL.ByteString -> ([Event], Maybe String)
-readEvents header = f . go (decodeEvents header)
+readEvents header = f . break isLeft . go (decodeEvents header)
   where
-    f :: [Either e a] -> ([a], Maybe e)
-    f xs = (rights rs, listToMaybe (lefts ls))
-      where
-        (rs, ls) = break isLeft xs
+    f (rs, ls) = (rights rs, listToMaybe (lefts ls))
 #if !MIN_VERSION_base(4, 7, 0)
-        isLeft (Left _) = True
-        isLeft _ = False
+    isLeft (Left _) = True
+    isLeft _ = False
 #endif
     go :: Decoder Event -> BL.ByteString -> [Either String Event]
     go decoder bytes = case decoder of

Now Core looks much better:

readEvents
  = \ (header_a2aEC :: Header) ->
      let {
        g_s2bsC :: Decoder Event
        [LclId]
        g_s2bsC = decodeEvents header_a2aEC } in
      \ (x_XatE :: BL.ByteString) ->
        src<libraries/base/GHC/Base.hs:1272:1-23>
        case GHC.List.$wbreak
               @ (Either String Event)
               (isLeft @ String @ Event)
               (GHC.RTS.Events.Incremental.readEventLog_go1
                  g_s2bsC (src<libraries/base/GHC/Base.hs:1272:22> x_XatE))
        of
        { (# ww1_a2bpv, ww2_a2bpw #) ->
        (rights @ String @ Event ww1_a2bpv,
         src<libraries/base/Data/Maybe.hs:232:1-42>
         src<libraries/base/Data/Either.hs:190:1-27>
         src<libraries/base/GHC/Base.hs:(996,1)-(999,35)>
         GHC.RTS.Events.Incremental.readEventLog_go ww2_a2bpw)
        }

Could we float out the break in this PR? I'd prefer to use combination of existing functions rather than writing an explicit recursion in this case.

Member

maoe commented Jul 10, 2018

Thanks to the eventlog-monitor I now see the space leak.

The tuple (rs, ls) that was returned by break was allocated in the heap. If you look at the Core:

readEvents
  = \ (header_a79H :: Header) ->
      let {
        g_s9cu :: Decoder Event
        [LclId]
        g_s9cu = decodeEvents header_a79H } in
      \ (x_i8XX :: BL.ByteString) ->
        let {
          ds_s9cw [Dmd=<L,U(1*U,1*U)>]
            :: ([Either String Event], [Either String Event])
          [LclId]
          ds_s9cw
            = src<libraries/base/GHC/Base.hs:1272:19-23>
              case GHC.List.$wbreak
                     @ (Either String Event)
                     (isLeft @ String @ Event)
                     (GHC.RTS.Events.Incremental.readEvents_go1
                        g_s9cu (src<libraries/base/GHC/Base.hs:1272:22> x_i8XX))
              of
              { (# ww1_i8Y4, ww2_i8Y5 #) ->
              (ww1_i8Y4, ww2_i8Y5)
              } } in
        src<libraries/base/GHC/Base.hs:1272:1-23>
        (case ds_s9cw of { (rs_a8HO, ls_a8HQ) ->
         rights @ String @ Event rs_a8HO
         },
         src<libraries/base/Data/Maybe.hs:232:1-42>
         src<libraries/base/Data/Either.hs:190:1-27>
         case ds_s9cw of { (rs_a8HO, ls_a8HQ) ->
         src<libraries/base/GHC/Base.hs:(996,1)-(999,35)>
         GHC.RTS.Events.Incremental.readEvents_go ls_a8HQ
         })

ds_s8cw is bound by let.

One way to fix this is to float out break like this:

diff --git a/src/GHC/RTS/Events/Incremental.hs b/src/GHC/RTS/Events/Incremental.hs
index b299882..d83dd1f 100644
--- a/src/GHC/RTS/Events/Incremental.hs
+++ b/src/GHC/RTS/Events/Incremental.hs
@@ -124,15 +124,12 @@ readHeader = go $ Left decodeHeader
 -- Note that it doesn't fail if it consumes all input in the middle of decoding
 -- of an event.
 readEvents :: Header -> BL.ByteString -> ([Event], Maybe String)
-readEvents header = f . go (decodeEvents header)
+readEvents header = f . break isLeft . go (decodeEvents header)
   where
-    f :: [Either e a] -> ([a], Maybe e)
-    f xs = (rights rs, listToMaybe (lefts ls))
-      where
-        (rs, ls) = break isLeft xs
+    f (rs, ls) = (rights rs, listToMaybe (lefts ls))
 #if !MIN_VERSION_base(4, 7, 0)
-        isLeft (Left _) = True
-        isLeft _ = False
+    isLeft (Left _) = True
+    isLeft _ = False
 #endif
     go :: Decoder Event -> BL.ByteString -> [Either String Event]
     go decoder bytes = case decoder of

Now Core looks much better:

readEvents
  = \ (header_a2aEC :: Header) ->
      let {
        g_s2bsC :: Decoder Event
        [LclId]
        g_s2bsC = decodeEvents header_a2aEC } in
      \ (x_XatE :: BL.ByteString) ->
        src<libraries/base/GHC/Base.hs:1272:1-23>
        case GHC.List.$wbreak
               @ (Either String Event)
               (isLeft @ String @ Event)
               (GHC.RTS.Events.Incremental.readEventLog_go1
                  g_s2bsC (src<libraries/base/GHC/Base.hs:1272:22> x_XatE))
        of
        { (# ww1_a2bpv, ww2_a2bpw #) ->
        (rights @ String @ Event ww1_a2bpv,
         src<libraries/base/Data/Maybe.hs:232:1-42>
         src<libraries/base/Data/Either.hs:190:1-27>
         src<libraries/base/GHC/Base.hs:(996,1)-(999,35)>
         GHC.RTS.Events.Incremental.readEventLog_go ww2_a2bpw)
        }

Could we float out the break in this PR? I'd prefer to use combination of existing functions rather than writing an explicit recursion in this case.

@joecrayne

This comment has been minimized.

Show comment
Hide comment
@joecrayne

joecrayne Jul 10, 2018

Contributor

I've updated the PR to float out break rather than use an explicit recursion as you suggested.

Contributor

joecrayne commented Jul 10, 2018

I've updated the PR to float out break rather than use an explicit recursion as you suggested.

@maoe

This comment has been minimized.

Show comment
Hide comment
@maoe

maoe Jul 10, 2018

Member

Thank you! Can you also confirm that this actually fixes the space leak issue?

Member

maoe commented Jul 10, 2018

Thank you! Can you also confirm that this actually fixes the space leak issue?

@joecrayne

This comment has been minimized.

Show comment
Hide comment
@joecrayne

joecrayne Jul 10, 2018

Contributor

Yes, confirmed. It fixes the issue.

Contributor

joecrayne commented Jul 10, 2018

Yes, confirmed. It fixes the issue.

@maoe

This comment has been minimized.

Show comment
Hide comment
@maoe

maoe Jul 10, 2018

Member

Great, thank you!

Member

maoe commented Jul 10, 2018

Great, thank you!

@maoe maoe merged commit e4d41c3 into haskell:master Jul 10, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@maoe

This comment has been minimized.

Show comment
Hide comment
Member

maoe commented Jul 10, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment