
Perfmon System version 2 #1453

Merged
merged 33 commits into from Jul 22, 2020

Conversation

@pwinston (Contributor) commented Jul 14, 2020

Description

Updates to the perfmon system which were made while investigating and solving #1320:

  1. Use monkey patching from a config file instead of decorators.
  2. Config file has "sets" of callables you can toggle on and off.
  3. Config file has new option to turn Qt event tracing on or off.
  4. Added kwargs to perf_timer to display per-event key/values in Chrome tracing.
  5. Added add_instant_event() API to add vertical lines in Chrome tracing.
  6. New napari.utils.patcher.py does the monkey patching; it's generic, not perf-specific.

Type of change

  • Update existing feature (non-breaking change which adds functionality)

Example 1

[screenshot]

Example 2

[animation: napari-slow-8k]

Notes

New NAPARI_TRACE_FILE env var

Set this to a file path and napari will trace to that file on startup. I basically always use this and never start a trace from the debug menu anymore, although we should keep that option because it's more discoverable and useful for one-time tracing. This is how I tend to use perfmon today:

  1. Set NAPARI_TRACE_FILE to some local latest.json file
  2. Run Napari, do what I want, exit with Command-Q (hitting red circle to exit does not work?)
  3. Drop latest.json into chrome://tracing.

So every time I run napari I get a trace file, and if I care what's in it I just throw it into Chrome.

Checking NAPARI_PERFMON

Previously we took care to check the NAPARI_PERFMON env var without importing perf in a few places. Now with a config file we have to check the settings in that file to know whether to trace Qt events, so it's more natural to import perf. We check whether to trace Qt events like this:

if perf_config and perf_config.trace_qt_events:
    # setup tracing

Perfmon Config File

If NAPARI_PERFMON=1 we turn on Qt Event tracing like before, and we do nothing else. If NAPARI_PERFMON is a file path however, we load that as our perfmon config file. The format of the config file is:

{
    "trace_qt_events": true,
    "trace_callables": [
        "chunk_loader",
        "vispy_rendering"
    ],
    "callable_lists": {
        "chunk_loader": [
            "napari.components.dims.Dims.set_point",
            "napari.components.viewer_model.ViewerModel._update_layers",
            . . . 
        ],
        "vispy_rendering": [
            "vispy.app.canvas.Canvas.set_current",
            "vispy.gloo.program.Program.draw",
            "vispy.gloo.context.GLContext.flush_commands",            
            . . . 
        ]
    }
}

The PerfmonConfig class monkey patches our perf_timer context object into each function or method above. This means we don't have to modify the code at all, not even to add a decorator. And it also means we can have different trace sets, even overlapping ones, that we can toggle on or off as needed. You want enough tracing to see what you want, but not so much that the overhead is annoying.
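Roughly, and only as an illustration (the helper name and import path below are assumptions, not the code in this PR), the patching amounts to replacing each listed callable with a thin wrapper that runs the original inside perf_timer:

import functools

from napari.utils.perf import perf_timer  # assumed import path


def wrap_with_perf_timer(func, name):
    # Illustrative patch_func: time every call to `func` under `name`.
    @functools.wraps(func)
    def timed(*args, **kwargs):
        with perf_timer(name):
            return func(*args, **kwargs)

    return timed

The patched version is then assigned back in place of the original, e.g. Dims.set_point = wrap_with_perf_timer(Dims.set_point, "Dims.set_point").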

Add kwargs to perf_timer

Now PerfTimer has a kwargs field that lets you set arbitrary key=value pairs that will be displayed in the chrome://tracing GUI when you click on an event. With monkey patching you can't use these, but with a manual perf_timer you can.
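A usage sketch (the import path is an assumption; the key=value pairs end up as the event's "args" in the trace file, which is what chrome://tracing shows under "Args"):

import time

from napari.utils.perf import perf_timer  # assumed import path

# data_id and layer_name are made-up keys; any key=value pairs work and
# show up under "Args" when you click the event in chrome://tracing.
with perf_timer("load_chunk", data_id=42, layer_name="nuclei"):
    time.sleep(0.01)  # stand-in for the work being timed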

New add_instant_event()

New add_instant_event() is shown in chrome://tracing as a vertical line. This is useful because very short events are invisible unless you zoom way in, but instant events are always visible as lines. For instance, when debugging I created instant events for MousePressed and MouseReleased. There are 3-4 other graphical things in the chrome tracing spec that we aren't using yet; we should look into those.
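A minimal sketch of the idea (import path assumed; in the Chrome trace event format an instant event has zero duration, so it renders as a vertical line regardless of zoom level):

from napari.utils.perf import add_instant_event  # assumed import path


def on_mouse_press(event):
    # Hypothetical handler: mark the exact moment of the press so it stays
    # visible in the timeline no matter how short the surrounding events are.
    add_instant_event("MousePressed")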

"Tracing" vs. "Timing"

I've been switching to the term "tracing" based on my experience using the perf stuff. The Chrome feature is called chrome://tracing and tracing really is the right term. There is timing information, but just visualizing how the code executes is a big part of things, beyond the timing information. I might rename the whole module from "perf" to "tracing" at some point.

Call Stack

The perf_timer wrappers clutter up the stack. If you want to debug, you might want to turn off perfmon.

Patcher

The new napari.utils.patcher.py is actually totally generic for patching functions/methods. The perf code uses patcher to patch in perf_timer, but you could use it to patch anything. Not sure if there is some library (even wrapt itself?) that can patch lists of callables like we are doing, but it was not hard to implement; it's only 150 lines with comments.
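The core idea, as a hedged sketch (the function name and error handling here are illustrative, not the actual patcher.py API): resolve each dotted string to a module attribute, then replace it with whatever patch_func returns.

import importlib


def patch_callable(dotted_name, patch_func):
    # Illustrative only: import the longest importable module prefix of
    # "package.module.Class.method", walk the remaining attributes, and
    # swap the final one for patch_func(original, dotted_name).
    parts = dotted_name.split(".")
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
            break
        except ImportError:
            continue
    else:
        raise ImportError(f"No importable module prefix in {dotted_name!r}")
    *attr_path, last = parts[i:]
    for attr in attr_path:
        obj = getattr(obj, attr)
    setattr(obj, last, patch_func(getattr(obj, last), dotted_name))

The perf code then just passes its perf_timer wrapper as patch_func; any other wrapper works the same way.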

How has this been tested?

Final checklist:

  • My PR is the minimum possible work for the desired functionality
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

@pwinston pwinston changed the title Major upgrade of perfmon system Perfmon System version 2 Jul 14, 2020
@pwinston pwinston marked this pull request as ready for review July 14, 2020 21:52
@pwinston (Contributor, Author) commented Jul 14, 2020

I still need to polish some comments and add types, but it's ready for high-level review.

@sofroniewn sofroniewn added this to the 0.3.7 milestone Jul 14, 2020
@sofroniewn sofroniewn added the feature New feature or request label Jul 14, 2020
@sofroniewn sofroniewn requested review from jni, AhmetCanSolak and tlambert03 and removed request for jni July 14, 2020 23:38
pwinston added a commit to pwinston/napari that referenced this pull request Jul 15, 2020
Moved the perfmon parts of napari#1354 to new PR napari#1453. This PR contains the ChunkLoader parts of napari#1354.
@sofroniewn (Contributor) left a comment

Wow, this is amazing! Just gave it a spin and everything works great. I love the config file; it's so great to be able to specify the callables that I want to trace, and I can imagine this being incredibly useful when developing and debugging. Right now I need to use benchmarks and airspeed velocity for all profiling, which can be nice for seeing everything, but I see this as being much faster and easier to use when figuring things out the first time.

Curious how @tlambert03 thinks about this, but I guess after we merge #1183 we'll probably want to combine that config file into that approach, but that can come later. Definitely nothing that needs to hold this PR up.

Now PerfTimer has a kwargs field that lets you set arbitrary key=value pairs that will be displayed in the GUI when you click on an event. With monkey patching you can't use these, but with a manual perf_timer you can.

When you say GUI here you mean the chrome tracing GUI, right? Not napari? This is how you injected the additional metadata that you wanted to view in the tracing tool, right? Just want to make sure I understand.

but you could use it to patch anything

Just curious what else could you imagine we might want to patch in like this?

final comment

I also read through the whole code and everything seemed very simple and clear - I won't claim to understand every line after this read, but it all looks very good to me and well documented. I think this is good to merge, but will let @jni and @tlambert03 review before then

@pwinston (Contributor, Author) commented Jul 16, 2020

When you say GUI here you mean the chrome tracing GUI right? not napari?

Good point. I corrected it above to say "chrome://tracing GUI". Before, I just tacked the key/value pairs onto the string name, like with perf_timer(f"calculate count={count}"). In some ways that's even better because the values are visible in the timeline rectangles themselves. However, with this kwargs feature you can do this:

        with perf_timer("clear_pending", data_id=request.data_id, my_value="pizza"):

When you click on that event in Chrome you can see your values under Args:

[screenshot: the event's key/value pairs shown under Args in chrome://tracing]

So this might be useful if you had a lot of values or big values. In fact looking at those Args it looks like the variable view in the debugger. And the nested rectangles depict the stack. So tracing is sort of a marriage between "debug prints" and "debugger technology". It's after the fact, but a fairly rich display of what happened.

I'm bullish on tracing being useful well beyond timing. From my experience debugging with it, the code can look fine statically but the trace looks wonky. You might see A calls B and B takes unexpectedly long. Or why do we call C from 3 different places? Or doing D triggers a ton of stuff. And it's all through the lens of time, because really fast stuff basically disappears unless you zoom way in.

Just curious what else could you imagine we might want to patch in like this?

I'm not really sure. At first the patcher was perf-specific, but then I realized perf could just pass perf_timer as a patch_func to the patcher. That made the patcher generic, so I pulled it out of the utils.perf directory.

I guess you could have a patch_func that created some totally different profiling or tracing format? You could maybe disable or replace/mock some functions for testing or other reasons? Really anything you could do with a decorator, but without actually modifying the code to add the decorator. And being able to do it in other people's modules/packages. Although as we learned with Qt you can't monkey patch some C++/sip functions.
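For example (purely hypothetical, not something this PR does), a patch_func that disables a callable could be plugged into the same machinery:

import functools


def disable_function(func, name):
    # Hypothetical patch_func: turn the callable into a no-op, e.g. to see
    # how the app behaves with an expensive step skipped during a test.
    @functools.wraps(func)
    def disabled(*args, **kwargs):
        return None

    return disabled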

The patcher format is so simple and useful I'm guessing there are packages that do the same thing? Maybe even wrapt itself? I didn't hunt around since it was easy enough to just do it; the code is like 150 lines. Taking a list like this and patching all the functions and methods is pretty nice. And we could add wildcards to make it even more powerful.

            "napari.components.dims.Dims.set_point",
            "napari.components.viewer_model.ViewerModel._update_layers",
            "napari.layers.image._image_slice.ImageSlice.chunk_loaded",
            "napari.layers.image._image_slice.ImageSlice.load_chunk",

@tlambert03 (Member) left a comment

I'm with @sofroniewn... this looks amazing & useful and the code is very clean and readable. I also can't claim to fully follow everything that's happening here without spending more time than I currently have... but since this is still an opt-in feature, I have no issues merging this to master.

I guess one request for a follow-up PR would be a bit of a "beginners guide" to using it. I'm new to this sort of performance tracing, so it's not totally clear to me whether the only output here is the "latest.json" file which should be evaluated in chrome://tracing, or whether there are other ways that you're using this. I wouldn't want you to regurgitate some chrome tracing tutorial that is already available elsewhere, but some docs that point to useful resources, with a quick example workflow (e.g. create something that doesn't perform that well, monitor it, fix it, and show how you know it got fixed) would be great.

@pwinston (Contributor, Author)

Yes, I think it very much needs some kind of usage walkthrough. I can create a folder docs/source/perfmon maybe? With the title "Performance Monitoring"?

Although as an aside, I've grown to somewhat feel that "tracing" is actually a bigger thing than just "performance monitoring". I think "tracing" is kind of purpose-neutral in the same way that "logging" is purpose-neutral. And in a way tracing is just logging, but it's logging in a very specific format.

Because you could use tracing or logging to investigate performance issues, but you could use them for other reasons. A lot of the traces I made working on async were not really about "performance"; they were about getting things working correctly, at whatever speed.

That's a pretty subtle point though. All the code is in utils.perf and it was written with performance monitoring in mind. So I think using the umbrella term is fine. If I were to start over from scratch I might brand it all "tracing" instead, but I don't think it's worth changing.

@sofroniewn (Contributor)

Yes, I think it very much needs some kind of usage walkthrough. I can create a folder docs/source/perfmon maybe? With the title "Performance Monitoring"?

Sounds great @pwinston thanks. I think I'll see if @jni has any additional comments on this PR in the next few hours and if not will merge before I go to sleep tonight so you can make the updated async PR tomorrow morning. That docs change can then come in its own PR.

Although as an aside, I've grown to somewhat feel that "tracing" is actually a bigger thing than just "performance monitoring". I think "tracing" is kind of purpose-neutral in the same way that "logging" is purpose-neutral.

Good points! Exciting to get this in!!

@jni (Member) commented Jul 20, 2020

I can create a folder docs/source/perfmon maybe

I'd stick it in docs/source/developers, where benchmarks live. (For now, anyway.) In the future, maybe we can collect this together with your previous performance doc in a "speed" folder?

btw why are all the filenames yelling at me in that folder? 😂 All in favour of changing the filenames to lowercase? 🙋

I'll see if @jni has any additional comments on this PR in the next few hours

Gonna have a look in a bit but if you don't hear from me, go ahead. =)

@jni (Member) left a comment

Thanks @pwinston, much simpler now from my perspective! I've made a couple of extra notes around docstrings.

I'm unclear from your comment whether we should hold off merging so you can use my toolz suggestion to simplify the timer, or whether you'd rather move on to other things! Either is fine. (But toolz is awesome ;)

pwinston and others added 10 commits July 21, 2020 06:44

  • Change pid/tid to process_id/thread_id
  • Add comment about "X" and "I" (this makes it clearer that it's a chrome://tracing thing and not just a generic "type" thing)
  • Co-authored-by: Juan Nunez-Iglesias <juan.nunez-iglesias@monash.edu>
  • Co-authored-by: Juan Nunez-Iglesias <juan.nunez-iglesias@monash.edu>
  • I have not really noticed where this shows up in the UI so until we feel it's useful best to leave it out.
  • These are not needed now that we have the patcher.
@pwinston (Contributor, Author)

I think everything is resolved. All the feedback/changes were beneficial, I think; they improved things.

The only thing totally missing is external documentation, like a "how to" guide. I don't think that needs to be part of this PR relative to getting async further along.

Remove _utils.py since there was only 1 thing left in it.
@sofroniewn (Contributor)

I think everything is resolved. All the feedback/changes were beneficial, I think; they improved things.

Great! Thanks for the review @jni and making those changes @pwinston!

I don't think that needs to be part of this PR relative to getting async further along

Agreed. I will wait till tests pass here and then merge shortly. I'm looking forward to checking out the async PR!!

Add new start_file_on_start config file option instead.

Using a separate env var was legacy, everything should be
controlled via the config file now.
@pwinston (Contributor, Author) commented Jul 21, 2020

I snuck in the previous commit to get rid of the NAPARI_TRACE_FILE env var, which was now legacy. I switched it to use the config file instead, since we're doing everything else with the config file. So my config file starts:

{
    "trace_qt_events": true,
    "trace_file_on_start": "/Users/pbw/Desktop/perf/latest.json",

Which means that if perfmon is enabled, it starts writing to latest.json as soon as napari starts. I find this really useful because when investigating something I just want to:

  1. Start napari.
  2. Do the thing I'm testing.
  3. Hit command-Q to exit
  4. Drag the latest.json file into Chrome to view it.

No need to use our Debug->Start Trace File menu option. Although I think it makes sense to keep that option.

@sofroniewn (Contributor)

No need to use our Debug->Start Trace File menu option. Although I think it makes sense to keep that option.

Yes, love this! Very nice

@jni jni merged commit c53051e into napari:master Jul 22, 2020
@sofroniewn sofroniewn mentioned this pull request Jul 22, 2020
@pwinston pwinston deleted the feature/perfmon2 branch July 29, 2020 13:55
@pwinston pwinston mentioned this pull request Jul 29, 2020