Advise `call-interactively`; add streaming logging; move profiles to `explain-pause-top`; add integration & regression tests #42

lastquestion · 2020-06-13T19:44:21Z

Feature complete, integration tests work, manually tested on 26 and 28, test ran on 26, 27, 28.

Needs to handle debugger

Large PR to fix the measurement problem and at the same time separate the display abstractions. This thing needs:

integration tests
regression tests on existing bugs also
test on emacs windowed
test on 26, 27, 28

call-interactively

In #26, it turns out that hooking pre-command-hook is rife with troubles, because elisp can call it and we cannot be sure that the calls are matched. After a lot of thought, instead redesign the measurement engine to advise call-interactively instead. This gives much more accurate data for minibuffer, so we don't need the minibuffer hooks either. This work also fixes #18, because now call-inteactively time spent in native waits for keyboard input (from interactive specs) are excluded.

To do this, this PR introduces proper frame records and stack. I have some concern about the memory usage of this and how well the GC collects the unused branches of the tree.

Native frames are inserted whenever elisp is going to call into anything that might wait for keyboard or user input. They are also inserted in between timers and their callbacks, so that in all cases, native wait code has a parent frame to subtract time from. This resolves #31 as well.

streaming logging

This change is very large and rebuilds the measurement engine completely. In order not to regress existing bug fixes and also newly fixed bugs, a integration test suite that runs emacs in interactive mode is going to be needed. To enable this, a new feature is added. Logs of the command records entry/exit/profile/measurement can be streamed to arbitrary UNIX socket. The major intent of this is to enable diffing of measurement engine behavior in tests, but it will also allow for out of process visualization of emacs, which will be really useful when GC or blocking stuff happens; you can't see what's actually running right now because we're inside the same process, of course.

Move profiles to `explain-pause-top` and remove `explain-pause-log` and `explain-pause-profiles`

The log isn't actually super useful and the UX is not great for it anyway. Since the measurement engine rebuilds the command hook API, I took this chance to fix the display engines as well. Remove explain-pause-log and explain-pause-profiles' as well. Instead, profiles are displayed directly in explain-pause-top. Now the limit of profiles applies per command. When explain-pause-top` is first open, it backfills from the profiling statistics so even if you didn't have it open, it still knows about slow commands. Currrently, I don't have a good UX for native frames yet, so they are hidden from the top view. Fixes #39.

Also, while doing this, I had to move the click handlers, so add a new defcustom to allow customization of what actually happens when the profile button is clicked. Fixes #16.

explain-pause-top--table is extended even more to be even more generic and handle multiple extra lines of full width text after the column header line.

However, this drops the "parent" stack of each command, and combines them all together at the leaf. The previous UX allowed you to see separate parents if they came from timers. I removed this because I think I want to build a tree view for this that roots commands within the buffer that they were taken.

This removes the now unused logging defcustoms.

Testing

Add an integration test runner that runs emacs under tmux. Drive that emacs via another emacs that runs the test code. Drive all of that via Make. Add regression tests for all existing bugs that had repros that would not be covered by unit tests. There is still some areas that need testing, especially regarding the display layer which ought to be tested via batch emacs and reading a buffer, probably driven via buttercup.

The test suite can run against all the versions of Emacs supported, but it doesn't work for windowed emacs yet. Not sure I need to double the matrix there. If more bugs come at that layer I will consider adding it.

Random things

Also, rearrange deffaces into the right group.

lastquestion · 2020-06-13T19:48:55Z

Also fixes #19, partially. I think with some small fix it should fix it for both TTY menu bar kinds. Did not test on windowed yet.

lastquestion · 2020-06-14T18:09:08Z

I think I need to deal with file-watching now (see issue Handle file-notify callbacks #3) because otherwise time is attributed to timers when it actually belongs to file notification updates from things like autorevert. This is way more noticeable in windowed mode for some reason.
I think I need to fix the timer callback stack trace problem (see issue Handle recursive timers so their command stacks do not grow infinitely #41) because when running very long emacs sessions (e.g. like everyone does), some timers end up with very deep stacks and since the parent pointer makes them completely live, GC can't free them.

lastquestion · 2020-06-15T03:55:45Z

Rebased to get the fix in #43 in master. Also, squashed the changes so far.

lastquestion · 2020-06-15T20:43:59Z

File notify events are sent as special key map entries in keyboard.c which ultimately become command_execute on file-notify keymap which binds to file-nofity-handle-event. So it will get measured correctly, except not correctly attributed to the actual callback, but the general file-notify code. However it should be easy to advice file-notify-add-watch similar to timers.

So the mismeasurement is somewhere else :-(

lastquestion · 2020-06-16T21:20:52Z

Above PR fixes #41. Also, I discovered that repeat timers often repeat a bunch of times on mac when AppNap is enabled (which it is by default on Mac). timer-max-repeats isn't bindable around a timer schedule, so move to rescheduling timer manually.

I made a comment about this in Bug#41865 https://lists.gnu.org/archive/html/bug-gnu-emacs/2020-06/msg00651.html in GNU emacs bug list.

…`explain-pause-top'; performance improvements; handle recursive timers > call-interactively In #26, it turns out that hooking `pre-command-hook` is rife with troubles, because elisp can call it and we cannot be sure that the calls are matched. After a lot of thought, instead redesign the measurement engine to advise `call-interactively` instead. This gives much more accurate data for minibuffer, so we don't need the minibuffer hooks either. This work also fixes #18, because now `call-inteactively` time spent in native waits for keyboard input (from interactive specs) are excluded. To do this, this PR introduces proper frame records and stack. I have some concern about the memory usage of this and how well the GC collects the unused branches of the tree. Native frames are inserted whenever elisp is going to call into anything that might wait for keyboard or user input. They are also inserted in between timers and their callbacks, so that in all cases, native wait code has a parent frame to subtract time from. This resolves #31 as well. > streaming logging This change is very large and rebuilds the measurement engine completely. In order not to regress existing bug fixes and also newly fixed bugs, a integration test suite that runs emacs in interactive mode is going to be needed. To enable this, streaming logging of the command records entry/exit/profile/measurement is added to a UNIX socket. The major intent of this is to enable diffing of measurement engine behavior in tests, but it will also allow for out of process visualization of emacs, which will be really useful when GC or blocking stuff happens; you can't see what's actually running right now because we're inside the same process, of course. > Move profiles to `explain-pause-top` and remove `explain-pause-log` and `explain-pause-profiles` The log isn't actually super useful and the UX is not great for it anyway. Since the measurement engine rebuilds the command hook API, I took this chance to fix the display engines as well. Remove `explain-pause-log` and `explain-pause-profiles' as well. Instead, profiles are displayed directly in `explain-pause-top`. Now the limit of profiles applies per command. When `explain-pause-top` is first open, it backfills from the profiling statistics so even if you didn't have it open, it still knows about slow commands. Currrently, I don't have a good UX for native frames yet, so they are hidden from the top view. Fixes #39. Also, while doing this, I had to move the click handlers, so add a new defcustom to allow customization of what actually happens when the profile button is clicked. Fixes #16. `explain-pause-top--table` is extended even more to be even more generic and handle multiple extra lines of full width text after the column header line. This removes the now unused logging `defcustoms`. > Performance improvements to `explain-pause-top` Generally, opt to preallocate vectors of strings and try moderately hard not to allocate memory during refreshes. Also, move to a more consistent double buffering algorithm, which is where we were going anyway. Now we flip between two sets of cached data, which allows re-use of strings even if the entries have moved in order. This also simplifies the code. In tight loops, move let bindings outside the loop, to save on frame allocations. Also, use a recursive timer instead of a interval timer because interval timers will reschedule repeatedly after a delay, and this is not configurable via a binding like I thought before. See emacs bug #41865. > Handle recursive timers If a timer reschedules itself by calling `run-with-timer` within itself, the command stack grows indefinitely and will run out of memory eventually. Fix this by reusing the native command frame if possible; if not, after a certain depth - 64, `explain-pause--timer-frame-max-depth`, not configurable - just reparent the entire stack to root-emacs.

lastquestion · 2020-06-21T18:03:29Z

Rebase to get fix for #44 and squash the history.

I have been working on reducing memory allocations for the refresh handler and cleaning up the top display code.

Going to add one more feature to show the times for the "slow executions" in a new line similar to profiles, clean up some more TODOs, fix some more bugs, and add the integration tests. This might be 2 more weeks at this rate...

1. Update the profiling code to always store slow times and counts. 2. If a command is marked for profiling, but fails to actually save the profile, e.g. it isn't slow, for `explain-pause-profile-slow-threshold` times, reset the profiling request and start over. 3. Update the top UI to show slow times as a similar line as profiles. 4. Generally improve memory performance by hoisting lets out of tight loops if possible, and use closure globals for anything in the measurement hooks. Still more possible work here.

Full integration tests that run emacs in tmux so we can press keys and fully simulate a user. Use the event logging stream to connect to the driver, which is also written in elisp. Drive the driver via Make. Add test cases for most existing (fixed) bugs that are not covered at unit test layer.

When we wrap a process filter or sentinel, we end up putting in a lambda into the actual process. If later, `process-filter` is called to get the filter out, eq comparisons that used to work will fail. Store the original callbacks into the process object so we can hijack `process-filter` to return the original. Fixes #46. Originally found from `magit`, in `with-editor`, in `with-editor-set-process-filter`.

lastquestion · 2020-06-29T16:10:36Z

The above commit fixes #46 in this branch and adds yet more tests. I'm probably not going to backport that fix.

Also pass bytecode warnings.

lastquestion · 2020-06-30T06:50:11Z

debugger needs to be handled and then this PR is done.

lastquestion · 2020-07-03T19:10:37Z

🎉 let's merge. There's a few tests I'd like to add around displaying things but I will do that in followup PRs. This will be 0.2 for sure, the next job is to update README & make a screencast video.

lastquestion mentioned this pull request Jun 13, 2020

Track menu-bar-open in TTY only #19

Closed

3 tasks

This was linked to issues Jun 13, 2020

Track menu-bar-open in TTY only #19

Closed

"CPU profiler is already running" in magit (root cause: server-start aka server.el is not handled) #26

Closed

lastquestion mentioned this pull request Jun 14, 2020

Error running timer ‘explain--measure-timer-callback’: (args-out-of-range [] 0) Root cause: explain-pause-top--table-refresh called on first load with one entry already #40

Closed

lastquestion force-pushed the execution-loop branch from 2f911d4 to 603669f Compare June 15, 2020 03:55

lastquestion linked an issue Jun 15, 2020 that may be closed by this pull request

Handle file-notify callbacks #3

Closed

2 tasks

lastquestion mentioned this pull request Jun 15, 2020

Handle file-notify callbacks #3

Closed

2 tasks

lastquestion linked an issue Jun 16, 2020 that may be closed by this pull request

Handle recursive timers so their command stacks do not grow infinitely #41

Closed

lastquestion force-pushed the execution-loop branch from 9087a0e to 28291ed Compare June 21, 2020 17:56

lastquestion added 2 commits June 23, 2020 01:01

Add tests; fix developer-alert; fix interactive-check.

5fa5eea

lastquestion mentioned this pull request Jun 27, 2020

Get testing working #21

Closed

3 tasks

lastquestion linked an issue Jun 27, 2020 that may be closed by this pull request

Get testing working #21

Closed

3 tasks

lastquestion changed the title ~~Advise call-interactively; add streaming logging; move profiles to explain-pause-top~~ Advise call-interactively; add streaming logging; move profiles to explain-pause-top; add integration & regression tests Jun 27, 2020

lastquestion mentioned this pull request Jun 27, 2020

nil deref error in treemacs when explain-pause-mode is active (root cause: process-filter wrap nil callback) #44

Closed

lastquestion added 2 commits June 27, 2020 23:35

Add more tests for startup modes and make test driver more reliable.

36c1549

add gitignore ignoring elc

5303e4a

lastquestion mentioned this pull request Jun 28, 2020

Enabling explain-pause-mode causes committing via magit/with-editor over TRAMP to stop working (Root cause: with-editor checks for itself with process-filter, which fails because we wrapped it) #46

Closed

lastquestion added 2 commits June 29, 2020 01:01

Add more tests for #26, add buffering for DGRAM, improve test driver

4621c93

lastquestion force-pushed the execution-loop branch from 1905bcf to 58785b8 Compare June 29, 2020 16:09

lastquestion linked an issue Jun 29, 2020 that may be closed by this pull request

Enabling explain-pause-mode causes committing via magit/with-editor over TRAMP to stop working (Root cause: with-editor checks for itself with process-filter, which fails because we wrapped it) #46

Closed

Advise x-popup-menu; add a test for it.

f55fa50

Also pass bytecode warnings.

lastquestion mentioned this pull request Jun 30, 2020

Pass bytecompile #38

Closed

lastquestion linked an issue Jun 30, 2020 that may be closed by this pull request

Pass bytecompile #38

Closed

Make makefile work for any emacs. Reorganize test folder.

54dbc20

lastquestion force-pushed the execution-loop branch from 04eaf62 to 54dbc20 Compare June 30, 2020 07:21

lastquestion mentioned this pull request Jul 1, 2020

Handle recursive-edit (fixes debugger, and other things) #47

Closed

lastquestion added 3 commits June 30, 2020 20:03

Actually pass bytecompile.

034df20

Support recursive edit, improve install check, fix TODO for log full

772de98

fix bytecompile warning by moving install of hook inside enable

ae1d673

lastquestion merged commit 36b9b71 into master Jul 3, 2020

lastquestion linked an issue Jul 3, 2020 that may be closed by this pull request

Handle recursive-edit (fixes debugger, and other things) #47

Closed

lastquestion deleted the execution-loop branch July 3, 2020 21:40

lastquestion mentioned this pull request Jul 10, 2020

"wrong-type-argument hash-table-p nil" when opening a buffer to write a commit message in magit (duplicate with #26) #69

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Advise `call-interactively`; add streaming logging; move profiles to `explain-pause-top`; add integration & regression tests #42

Advise `call-interactively`; add streaming logging; move profiles to `explain-pause-top`; add integration & regression tests #42

Uh oh!

lastquestion commented Jun 13, 2020 •

edited

Loading

Uh oh!

lastquestion commented Jun 13, 2020

Uh oh!

lastquestion commented Jun 14, 2020

Uh oh!

lastquestion commented Jun 15, 2020

Uh oh!

lastquestion commented Jun 15, 2020

Uh oh!

lastquestion commented Jun 16, 2020 •

edited

Loading

Uh oh!

lastquestion commented Jun 21, 2020

Uh oh!

lastquestion commented Jun 29, 2020

Uh oh!

lastquestion commented Jun 30, 2020 •

edited

Loading

Uh oh!

lastquestion commented Jul 3, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Advise call-interactively; add streaming logging; move profiles to explain-pause-top; add integration & regression tests #42

Advise call-interactively; add streaming logging; move profiles to explain-pause-top; add integration & regression tests #42

Uh oh!

Conversation

lastquestion commented Jun 13, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

call-interactively

streaming logging

Move profiles to explain-pause-top and remove explain-pause-log and explain-pause-profiles

Testing

Random things

Uh oh!

lastquestion commented Jun 13, 2020

Uh oh!

lastquestion commented Jun 14, 2020

Uh oh!

lastquestion commented Jun 15, 2020

Uh oh!

lastquestion commented Jun 15, 2020

Uh oh!

lastquestion commented Jun 16, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lastquestion commented Jun 21, 2020

Uh oh!

lastquestion commented Jun 29, 2020

Uh oh!

lastquestion commented Jun 30, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lastquestion commented Jul 3, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Advise `call-interactively`; add streaming logging; move profiles to `explain-pause-top`; add integration & regression tests #42

Advise `call-interactively`; add streaming logging; move profiles to `explain-pause-top`; add integration & regression tests #42

lastquestion commented Jun 13, 2020 •

edited

Loading

Move profiles to `explain-pause-top` and remove `explain-pause-log` and `explain-pause-profiles`

lastquestion commented Jun 16, 2020 •

edited

Loading

lastquestion commented Jun 30, 2020 •

edited

Loading