New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor #22

Merged
merged 17 commits into from Apr 9, 2017

Conversation

Projects
None yet
2 participants
@maoe
Member

maoe commented Mar 12, 2017

This PR aims to

  • implement a lazy eventlog reader API that supports streaming
  • make the GHC.RTS.Events API closer to v0.4

In order to achieve these, I did

  • rewrite the incremental API as GHC.RTS.Events.Incremental
  • factor out the parser/serializer to GHC.RTS.Events.Binary, which is a hidden module (for now)

Also I did some small cleanup and fixed a space leak.

As for reviews, I guess it's easier to understand if you look at each commit in order. The first 9 commits implement the new incremental API, then 3dd58cf and bcb4595 make the API closer to v0.4. Other commits are small cleanup etc.

GHC.RTS.Events.Incremental

While I'm implementing ghc-events gc (pause|interval) commands in my gc-stats branch, I found that the current incremental API is a bit awkward to use because readEventLogFromFile has been deprecated and it cannot stream output as it reads events. It cannot stream output because it has to reverse events.

The new incremental API allows readEventLogFromFile to stream output and events don't have to be reversed. Also I've retracted the deprecation on readEventLogFromFile as it's now able to stream output.

GHC.RTS.Events.Binary

This module was necessary to make the API closer to the previous version of ghc-events. I found it a bit strange that the Incremental (and EventsIncremental) module exports eventlog serializers because the serializers don't have anything to do with the incremental decoder.

I made the GHC.RTS.Events.Binary module and moved all serializers/deserializers into it so it can be used either from GHC.RTS.Events and GHC.RTS.Events.Incremental.

Now that we can move the I/O interface (read/writeEventLogFromFile) back to GHC.RTS.Events.

Breaking changes

  • GHC.RTS.EventsIncremental has been rewritten as GHC.RTS.Events.Incremental
  • Some format has changed in ghc-events show
    • No leading whitespace in timestamps
    • No indication of incomplete eventlog
  • GHC.RTS.Events doesn't export the parsers

Performance

The performance characteristics of ghc-events show has changed. Heap allocation has been greatly reduced. Unfortunately due to the use of sortEvents, ghc-events show still can't run in constant space though.

% ls -lh small.eventlog
-rw-r--r--  1 maoe  staff   2.0M Feb 18 07:24 small.eventlog
% ./ghc-events.master show small.eventlog > master.out +RTS -sstderr
Warning: The event log was not fully parsed. It could have been malformed or incomplete.
   1,564,294,056 bytes allocated in the heap
     135,686,952 bytes copied during GC
      22,383,952 bytes maximum residency (7 sample(s))
       1,502,000 bytes maximum slop
              66 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2122 colls,     0 par    0.06s    0.07s     0.0000s    0.0009s
  Gen  1         7 colls,     0 par    0.05s    0.06s     0.0092s    0.0254s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.22s  (  0.23s elapsed)
  GC      time    0.11s  (  0.13s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.33s  (  0.37s elapsed)

  %GC     time      32.9%  (36.2% elapsed)

  Alloc rate    7,052,755,224 bytes per MUT second

  Productivity  67.0% of total user, 60.1% of total elapsed

% ./ghc-events.refactor show small.eventlog > refactor.out +RTS -sstderr
     438,965,280 bytes allocated in the heap
     326,603,680 bytes copied during GC
       8,016,040 bytes maximum residency (57 sample(s))
         361,144 bytes maximum slop
              25 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       766 colls,     0 par    0.11s    0.11s     0.0002s    0.0005s
  Gen  1        57 colls,     0 par    0.10s    0.11s     0.0019s    0.0105s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.10s  (  0.10s elapsed)
  GC      time    0.21s  (  0.22s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.31s  (  0.33s elapsed)

  %GC     time      68.7%  (68.1% elapsed)

  Alloc rate    4,578,468,855 bytes per MUT second

  Productivity  31.3% of total user, 29.5% of total elapsed
% diff -w master.out refactor.out
111017d111016
<

The runtime of ghc-events inc has been improved a bit:

% ls -lh app.eventlog
-rw-r--r--  1 maoe  staff    36M Feb 18 07:24 app.eventlog
% ./ghc-events.master inc app.eventlog > master.out +RTS -sstderr
   4,614,234,640 bytes allocated in the heap
       5,488,672 bytes copied during GC
          90,080 bytes maximum residency (2 sample(s))
          35,472 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      8787 colls,     0 par    0.04s    0.04s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0002s    0.0003s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    1.33s  (  1.49s elapsed)
  GC      time    0.04s  (  0.04s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    1.37s  (  1.53s elapsed)

  %GC     time       2.8%  (2.9% elapsed)

  Alloc rate    3,468,658,229 bytes per MUT second

  Productivity  97.2% of total user, 86.9% of total elapsed

% ./ghc-events.refactor inc app.eventlog > refactor.out +RTS -sstderr
   4,297,141,368 bytes allocated in the heap
       5,085,992 bytes copied during GC
          86,104 bytes maximum residency (2 sample(s))
          29,776 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      8113 colls,     0 par    0.03s    0.04s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0002s    0.0003s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    1.24s  (  1.38s elapsed)
  GC      time    0.03s  (  0.04s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    1.27s  (  1.42s elapsed)

  %GC     time       2.6%  (2.7% elapsed)

  Alloc rate    3,479,314,562 bytes per MUT second

  Productivity  97.4% of total user, 87.3% of total elapsed
% diff -w master.out refactor.out
2244442d2244441
< Finished (file was parsed successfully)
@maoe

This comment has been minimized.

Show comment
Hide comment
@maoe

maoe Mar 12, 2017

Member

@kvelicka Would you mind reviewing this PR because it does break your incremental parser API?

Member

maoe commented Mar 12, 2017

@kvelicka Would you mind reviewing this PR because it does break your incremental parser API?

@kvelicka

This comment has been minimized.

Show comment
Hide comment
@kvelicka

kvelicka Mar 13, 2017

Collaborator

I've only had a first pass at the code but the first impression is 👍

I'll try to look into it more later in the week.

Collaborator

kvelicka commented Mar 13, 2017

I've only had a first pass at the code but the first impression is 👍

I'll try to look into it more later in the week.

maoe added a commit to maoe/ThreadScope that referenced this pull request Mar 30, 2017

@maoe

This comment has been minimized.

Show comment
Hide comment
@maoe

maoe Mar 30, 2017

Member

I made a change in threadscope for this refactor: haskell/ThreadScope@master...maoe:ghc-events-refactor. Basically nothing has changed.

Member

maoe commented Mar 30, 2017

I made a change in threadscope for this refactor: haskell/ThreadScope@master...maoe:ghc-events-refactor. Basically nothing has changed.

@maoe maoe requested a review from kvelicka Mar 30, 2017

Done leftover -> Right (header, BL.Chunk leftover bytes)
_ -> fail "readHeader: unexpected decoder"
Consume k -> case bytes of
BL.Empty -> fail "readHeader: not enough bytes"

This comment has been minimized.

@kvelicka

kvelicka Apr 8, 2017

Collaborator

Does this mean that a caller has to supply a ByteString containing (at least) all of the header in one go? In other words, is reading the header non-incremental?

@kvelicka

kvelicka Apr 8, 2017

Collaborator

Does this mean that a caller has to supply a ByteString containing (at least) all of the header in one go? In other words, is reading the header non-incremental?

This comment has been minimized.

@maoe

maoe Apr 8, 2017

Member

That's right. For incremental parsing of a header, one should use decodeHeader. Maybe I should mention this behavior in the Haddock comment.

@maoe

maoe Apr 8, 2017

Member

That's right. For incremental parsing of a header, one should use decodeHeader. Maybe I should mention this behavior in the Haddock comment.

This comment has been minimized.

@maoe

maoe Apr 8, 2017

Member

Added a note about this in 53f2872.

@maoe

maoe Apr 8, 2017

Member

Added a note about this in 53f2872.

@kvelicka

Looks very nice, good job!

Admittedly I haven't tested this myself but given that the tests pass (and the fact that you seem to be using the library in your own code), I am quite confident in the correctness of the patch. I've had a couple of questions here and there but they're mostly minor things.

Show outdated Hide outdated src/GHC/RTS/Events/Incremental.hs Outdated
Show outdated Hide outdated src/GHC/RTS/Events.hs Outdated
@maoe

Thanks for reviewing this. I'm going to extend the Haddock comment on readHeader.

Done leftover -> Right (header, BL.Chunk leftover bytes)
_ -> fail "readHeader: unexpected decoder"
Consume k -> case bytes of
BL.Empty -> fail "readHeader: not enough bytes"

This comment has been minimized.

@maoe

maoe Apr 8, 2017

Member

That's right. For incremental parsing of a header, one should use decodeHeader. Maybe I should mention this behavior in the Haddock comment.

@maoe

maoe Apr 8, 2017

Member

That's right. For incremental parsing of a header, one should use decodeHeader. Maybe I should mention this behavior in the Haddock comment.

Show outdated Hide outdated src/GHC/RTS/Events/Incremental.hs Outdated
Show outdated Hide outdated src/GHC/RTS/Events.hs Outdated
@kvelicka

This comment has been minimized.

Show comment
Hide comment
@kvelicka

kvelicka Apr 9, 2017

Collaborator

Thanks for adding the comment. 👍 from me

Collaborator

kvelicka commented Apr 9, 2017

Thanks for adding the comment. 👍 from me

@maoe

This comment has been minimized.

Show comment
Hide comment
@maoe

maoe Apr 9, 2017

Member

Thanks! I'm going to merge this.

Member

maoe commented Apr 9, 2017

Thanks! I'm going to merge this.

@maoe maoe merged commit 58aefed into haskell:master Apr 9, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@maoe maoe deleted the maoe:refactor branch Apr 9, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment