Add execution time to notebook/console cells #3320

Closed
jasongrout opened this issue Dec 5, 2017 · 45 comments · Fixed by #6864

Comments

@jasongrout
Contributor

@jasongrout jasongrout commented Dec 5, 2017

It would be useful to have the execution time and running time of notebook cells and console cells as part of the cell UI. There are at least two implementations of this already: the CoCalc project, and the Execute Time widget.

See jupyter-widgets/ipywidgets#1429 for more conversation about this.

@jasongrout
Contributor Author

@jasongrout jasongrout commented Dec 5, 2017

Here's what CoCalc looks like:

[Screenshot: screen shot 2017-12-05 at 11 58 49 am]

@jasongrout jasongrout added this to the Future milestone Dec 5, 2017
@ellisonbg
Contributor

@ellisonbg ellisonbg commented Dec 5, 2017

@tgeorgeux
Contributor

@tgeorgeux tgeorgeux commented Mar 12, 2018

I agree with @ellisonbg on this; I think it should be either toggle-able or something you can check when you want to.

@JoshuaC3

@JoshuaC3 JoshuaC3 commented Mar 16, 2018

IMO it is the most useful of all the extensions in jupyter notebook. So many times I want to retrospectively know the time to run a cell. Also, it's very discreet in the nbs.

@jakirkham

@jakirkham jakirkham commented Mar 31, 2018

There's one for the Jupyter Notebook currently.

In fact, there are a lot of things that I see listed as nice features for JupyterLab that exist as extensions in that repo. Maybe it is just worth the effort to port them over/add a compatibility API. Would be really great if JupyterLab could benefit from that existing body of work. Not to mention the developers over there seem like really nice, sharp people that JupyterLab could benefit from engaging more with. Is there some discussion already occurring with the developers of jupyter_contrib_nbextensions about what would help make that possible? If this is already happening and I'm just oblivious, that would be great to know. :)

cc @jcb91 @jfbercher

@saulshanabrook
Member

@saulshanabrook saulshanabrook commented May 2, 2018

Before this can be implemented, we need to decide how this information should be displayed in the UI.

There are a couple of options I have considered so far, including

  • Inspector in the cell tools that shows the runtimes for the current cell.
  • Mirror the cocalc UI, show the start and duration on the right side of the code cells.
  • Mirror the old notebook extension UI, putting the duration and time below the cell.

There are actually three pieces of information we could display for each code cell. The start time, end time, and the duration. The times could also optionally be relative instead of absolute.
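As a rough illustration of how these quantities relate, assuming the start and end times are recorded as ISO 8601 strings (the function names below are hypothetical and not part of any JupyterLab API), the duration and a relative label can both be derived from the timestamps:

```python
from datetime import datetime, timezone


def duration_seconds(start_iso: str, end_iso: str) -> float:
    """Duration of a cell run, derived from its start and end timestamps."""
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    return (end - start).total_seconds()


def relative_label(when_iso: str, now: datetime) -> str:
    """Render an absolute timestamp relative to `now`, e.g. '2 minutes ago'."""
    seconds = int((now - datetime.fromisoformat(when_iso)).total_seconds())
    if seconds < 60:
        return f"{seconds} seconds ago"
    if seconds < 3600:
        return f"{seconds // 60} minutes ago"
    return f"{seconds // 3600} hours ago"
```

Storing only the absolute start and end times would be enough here, since the duration and any relative display can be recomputed from them.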

If we can get a sense of the use cases first that this information facilitates, then this can help us figure out how the UI can help support those. What would you use this information for in your workflow?

In using the notebook extension, I personally just used the duration, as a way to do an easy benchmark, to see if code changes I made sped up execution. So for this case, for me, having it off to the side in the inspector would work out, because I only need to see the information for one cell at a time.

@jakirkham

@jakirkham jakirkham commented May 2, 2018

So the current notebook extension works great on this front IMHO. :)

To clarify, it places the time info as close as possible to where the user is already looking (right below the input cell). IOW there is nowhere else one needs to think of checking or go searching (i.e. no context switching) to find this info.

Basically what we want with this is the ability to see when a cell was run and how long it took to run.

As to what this is used for, there are a few scenarios it might be used for.

  1. Timestamping cells (e.g. did I run this step recently?)
  2. Rough estimate of cost (e.g. how long did this take to run?)
  3. Debugging issues (e.g. what was I doing when X happened?)

To add to 3, we normally run Jupyter Notebooks on our cluster. As some of the debug information only gets back to us later (possibly even after we are done running), having a way to isolate what step was problematic is very helpful.
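The debugging scenario could be sketched roughly as follows, assuming each cell run is available as a (cell index, start, end) tuple; that layout and the helper name are illustrative assumptions, not an existing format:

```python
from datetime import datetime
from typing import List, Optional, Tuple


def cell_running_at(runs: List[Tuple[int, datetime, datetime]],
                    moment: datetime) -> Optional[int]:
    """Return the index of the cell that was executing at `moment`, if any."""
    for index, start, end in runs:
        if start <= moment <= end:
            return index
    return None
```

Given a timestamp from a cluster log, this kind of lookup would point back to the step that was running when the problem occurred.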

@JoshuaC3

@JoshuaC3 JoshuaC3 commented May 2, 2018

An example of the nb UI:
[image]

@saulshanabrook
Member

@saulshanabrook saulshanabrook commented May 3, 2018

I re-read @ellisonbg's post in the pull request about his thoughts on the UI.

> tl;dr - I am not in favor of the CoCalc UI for that information - I think it is better suited at the top of the Cell Tools panel.

He was saying that there are already some plans to use the top left corner of the cell. One question I have about putting it in the cell tools is that then there is no way to see the execution times in the console.

I could try implementing it like it looks in the original extension for the notebook and console.

@jasongrout
Contributor Author

@jasongrout jasongrout commented May 3, 2018

> One question I have about putting it in the cell tools is that then there is no way to see the execution times in the console.

It's also not easy to scroll and see at a glance when cells executed or how long they took - it requires actually focusing the cell to see the information in the cell tools panel. That may be a good or bad thing, depending on the use case.

@ellisonbg
Contributor

@ellisonbg ellisonbg commented May 7, 2018

@goerz

@goerz goerz commented May 16, 2018

My vote is definitely on the UI of the old notebook extension. I'd really love to see this!

@moble

@moble moble commented Jun 13, 2018

Big +1 for the ExecuteTime format being built-in, but toggle-able via settings.

But I also want to point out that another important feature nobody has mentioned here is the "execution queued" time that displays after you've hit run on a cell but before it's finished. That also suggests a possible improvement of switching to something like "execution began" after the cell has actually started (which may have been well after the queued time if other slow cells were run first). This can give a more accurate sense of how long you have to run to the bathroom or whatever.

I just have to add that the lack of this feature is literally the only thing keeping me from switching to jupyterlab (even the prospect of losing my beloved snippets extension). IMO, conscientious scientists absolutely need this information included in their notebooks.

@saulshanabrook
Member

@saulshanabrook saulshanabrook commented Jun 27, 2018

> But I also want to point out that another important feature nobody has mentioned here is the "execution queued" time that displays after you've hit run on a cell but before it's finished.

@moble I think this is a great idea. Just to clarify, this would be helpful as a duration, not a timestamp, right?

I closed my work on this (#4161) until we can figure out the design that should be implemented here.

I am currently thinking that execution time should be switched off by default, because saving that extra metadata to the notebook file makes git diffs worse. However, I see a couple of options here:

  • Always save execution times in notebook metadata (as currently implemented)
  • Save them according to a setting/global flag, default is off
  • Always calculate timings and display them in the UI, but only save them to the notebook file when using "save with extras", similar to the collapsed state of cells #3981 (my preference, but it does add more visual information to the UI that not everyone might want)
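For the options that avoid persisting timings, a minimal sketch of dropping them before saving might look like this. It assumes timings live under a per-cell metadata key; the key name `execution` and the helper `strip_timing` are assumptions for illustration, not a confirmed notebook format:

```python
import copy


def strip_timing(notebook: dict, key: str = "execution") -> dict:
    """Return a copy of a notebook dict with per-cell timing metadata removed.

    `key` is an assumed metadata field name, not a confirmed format.
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        # Cells without the key (or without metadata) are left unchanged.
        cell.get("metadata", {}).pop(key, None)
    return cleaned
```

The in-memory notebook keeps its timings for display, while the copy written to disk stays free of values that change on every run.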

The other thing to decide is whether to support them in consoles or not. My preference is to support them, but they wouldn't be persisted since console state is not persisted.

The final decision is how they should appear in the UI:

  • Old notebook extension UI, under the cells. This seems to be the favorite in comments on this issue.
  • CoCalc UI
  • In Cell tools. Issues are that you might want to see multiple times at once and there are no cell tools currently for console outputs.

@moble

@moble moble commented Jun 27, 2018

> > But I also want to point out that another important feature nobody has mentioned here is the "execution queued" time that displays after you've hit run on a cell but before it's finished.
>
> @moble I think this is a great idea. Just to clarify, this would be helpful as a duration, not a timestamp, right?

I'm not sure I understand the question. By "duration" do you mean an active timer that ticks off the seconds as they pass? Either one would work, since it's just simple mental math to get whatever information you need. But I imagine timestamps would be easier to code, especially when dealing with things like saving and even kernel crashes, so you don't leave a running timer in the notebook. To clarify, I think there are three stages that need to be treated differently.

  1. Queued but not yet running -> display something like execution queued 11:36:20 2018-06-27
  2. Running but not yet finished -> display something like execution started 11:36:30 2018-06-27
  3. Finished -> display something like executed in 10.0s, finished 11:36:40 2018-06-27
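The three stages above could be sketched as a single formatting function. This is only an illustration under the assumption that queued, started, and finished times are tracked separately; the name `status_line` is hypothetical and this is not the nbextension's actual code:

```python
from datetime import datetime
from typing import Optional


def status_line(queued: datetime,
                started: Optional[datetime] = None,
                finished: Optional[datetime] = None) -> str:
    """Pick the message for whichever stage the cell has reached."""
    fmt = "%H:%M:%S %Y-%m-%d"
    if started is None:
        # Stage 1: queued but not yet running.
        return f"execution queued {queued.strftime(fmt)}"
    if finished is None:
        # Stage 2: running but not yet finished.
        return f"execution started {started.strftime(fmt)}"
    # Stage 3: finished; report the run duration and the end time.
    elapsed = (finished - started).total_seconds()
    return f"executed in {elapsed:.1f}s, finished {finished.strftime(fmt)}"
```

Because only timestamps are stored, nothing is left "ticking" in the notebook if the kernel crashes mid-run.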

The first and third are currently done in the nbextension, whereas I'm suggesting the second as a new addition. I don't have strong opinions on details like wording as long as all relevant information is present and reliable. But IMO the CoCalc UI isn't as nice, since the placement feels like it's crowding my code and leads to possible overlap — plus any time I see things like "2 minutes ago" on a webpage I'm skeptical that it's been updated correctly.

> I am currently thinking that execution time should be switched off by default, because saving that extra metadata to the notebook file makes git diffs worse.

But surely barely worse than including the execution_count, or numerous other things that can change on execution, right?

Anyway, I really feel that having proper timestamps by default is crucial to the integrity of notebooks as part of the scientific process — and file timestamps are not remotely good enough. One analogy I would draw is with paper notebooks maintained by people working in a biology lab, for example, where it's widely considered scientific misconduct to fail to record the date for entries. But beyond scientific integrity, it's also frequently just very convenient to have the information available, to see how long things took, or to ensure that input data were from the correct time period, or to verify that its outputs were up to date when used elsewhere, etc., because notebooks don't exist in isolation. And sometimes you don't know you need the information until after the code has run. So I'm arguing that there are potentially huge benefits to always storing the information, whereas the downside of uglier git diffs is minimal and manageable.

@ellisonbg
Contributor

@ellisonbg ellisonbg commented Jul 3, 2018

@concretevitamin

@concretevitamin concretevitamin commented Jul 18, 2018

Can't +1 enough on this. The execution time of cells is critical information for scientific experiments. Big surprise that the extension exists for jupyter notebooks but not in jupyter lab.

@BoPeng
Collaborator

@BoPeng BoPeng commented Jul 18, 2018

I personally hate to clutter the cell UI with time information, and I think this could be considered together with other resource types (jupyter/jupyter#264, @rgbkrk). I think the transient_display_message interface (#4879) could fit the purpose well. For example, an extension could allow the following:

  1. When a cell is evaluated, the timestamps (and perhaps also CPU/memory information) are saved as metadata of the cell.
  2. The information is also sent to one of the tabs of the inspection panel.
  3. When the user switches to a new cell, the information on that cell (if available) is displayed in the inspection panel.

Note that CPU/memory information was considered one of the possible kinds of 'transient' information in the proposal for transient_display_data. In this case the information is saved to the notebook, but is displayed only temporarily (transiently).

@concretevitamin

@concretevitamin concretevitamin commented Jul 18, 2018

I appreciate all the discussions. However, I imagine most if not all of these considerations should've been conducted when people developed the corresponding extension for notebooks. If that extension is widely used, I see no reason to reinvent the discussion.

@jasongrout
Contributor Author

@jasongrout jasongrout commented Jul 18, 2018

We don't know what UI conversation happened when the different implementations were developed by various third parties. This would be the first time we'd support such UI in Jupyter core, and JupyterLab has some different UI constraints and capabilities anyway, so I think having a discussion for such an ever-present UI element is definitely warranted, informed of course by what is out there.

@moble

@moble moble commented Jul 18, 2018

It sounds like this has fully evolved into two fundamentally separate issues:

  1. The need to store various pieces of time-related metadata for each cell as part of the saved notebook, which at least @concretevitamin and I agree (and most other commenters seem not to dispute) are crucial for scientific integrity.
  2. The various options for displaying this information, whether
    a. no display at all,
    b. ExecuteTime-style as a separate little section at the bottom of the cell,
    c. CoCalc-style within the cell,
    d. as part of a cell inspector,
    e. as part of the table of contents,
    f. in a global status bar,
    g. as part of some transient display data,
    h. as hover text on the input and output "prompt" divs [similar to what was suggested in #1826], or
    i. any one of many other possibilities

It seems that at least two of us will argue that item 1 should be an integral part of jupyterlab's format and its default behavior. In fact, I might take it further to argue that some sort of timestamp should be the default behavior for every cell type — not just code cells.

On the other hand, item 2 is obviously much more subjective, so that users should be able to choose their preferred UI. As a topic for discussion, it seems that this item further separates into two parts:

  1. which options jupyterlab itself should support and which to leave for extensions, and
  2. what the default setting should be.

Personally, I think it's reasonable to keep the UI clean by not displaying the information by default. On the other hand, I think it would be helpful to make it readily discoverable by someone who doesn't realize it's stored — which is why I suggested the hover text thing.

@saulshanabrook
Member

@saulshanabrook saulshanabrook commented Jul 18, 2018

@takashimokobe

@takashimokobe takashimokobe commented Jul 25, 2018

Hi all, my name is Taka. I'm on an intern team at Project Jupyter working on a status bar extension for JupyterLab. We're interested in possibly adding running/execution time as part of our status bar. We'd like to hear from everyone about why they would want to view this data and how important it would be to their workflow. @ellisonbg @tgeorgeux @saulshanabrook @jasongrout

@goerz

@goerz goerz commented Jul 25, 2018

@takashimokobe It's fairly important to my workflow, as I do "small-scale" calculations that take seconds to minutes interactively in the notebook, which are then scaled up to run on an HPC cluster and can take hours to days. Being able to see the runtime in the notebook (as shown by the plugin for the traditional notebook) allows me to have a good sense of how long things will take when scaling up.

The other effect of the traditional notebook plugin that I really like is that it color-codes the section that's currently running/scheduled in the TOC. As I often "Run all cells" from the top, this gives me a good sense of how far along a notebook is.

So, basically, personally I'm 100% happy with the traditional plugin, and it would be really nice if the exact same functionality was available in JupyterLab.

@ellisonbg
Contributor

@ellisonbg ellisonbg commented Jul 26, 2018

@saulshanabrook
Member

@saulshanabrook saulshanabrook commented Jul 30, 2018

I have a small initial PR with a new approach that just adds some event timings to the metadata: #5009. This sidesteps some of the problems about "what really counts as start/end time" and gives us more flexibility to include a variety of times and have UIs display/interpret them however they like.
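To illustrate how a UI might interpret raw event timings, here is a minimal sketch. The event names `queued`, `started`, and `finished` are hypothetical placeholders, not the names used in the PR, and timestamps are assumed to be ISO 8601 strings:

```python
from datetime import datetime


def summarize_events(events: dict) -> dict:
    """Derive durations (in seconds) from whichever event timestamps are present.

    Event names here are hypothetical placeholders.
    """
    times = {name: datetime.fromisoformat(stamp) for name, stamp in events.items()}
    summary = {}
    if "queued" in times and "started" in times:
        summary["waited"] = (times["started"] - times["queued"]).total_seconds()
    if "started" in times and "finished" in times:
        summary["ran"] = (times["finished"] - times["started"]).total_seconds()
    return summary
```

Recording events rather than a single precomputed duration leaves each UI free to decide which pair of events counts as the start and end.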

@EamonKeane

@EamonKeane EamonKeane commented Aug 21, 2018

+1 for this also. My workaround for now, which I'll post in case it's of interest, is to use Prometheus+Grafana to view the CPU/memory spikes. If you're on GKE, there's a one-click extension to deploy Prometheus+Grafana.

[Image: grafana-jupyterhub]

@saulshanabrook saulshanabrook removed this from the Future milestone Sep 5, 2018
@saulshanabrook saulshanabrook added this to the 0.35 milestone Sep 5, 2018
@blink1073 blink1073 removed this from the 0.35 milestone Sep 5, 2018
@blink1073 blink1073 added this to the 1.0 milestone Sep 5, 2018
@jasongrout
Contributor Author

@jasongrout jasongrout commented Sep 7, 2018

#1826 also has some suggestions about the UX for showing the execution timestamp.

@elgalu

@elgalu elgalu commented Sep 19, 2018

Hi, what's the status? Will #5009 be merged, and will work on the UI continue?

@saulshanabrook
Member

@saulshanabrook saulshanabrook commented Sep 19, 2018

@elgalu Yep, updating that PR so that it passes tests and possibly adding more timings to it is on my todolist for the next release. If you are interested in helping, happy to give you any feedback.

@elgalu

@elgalu elgalu commented Sep 19, 2018

Nice! Can you merge the master branch into your timing branch? Then I will test it on my setup.

@saulshanabrook saulshanabrook self-assigned this Sep 21, 2018
@saulshanabrook
Member

@saulshanabrook saulshanabrook commented Oct 4, 2018

@elgalu I just added more timings and rebased off master.

@tgeorgeux
Contributor

@tgeorgeux tgeorgeux commented Oct 4, 2018

As far as the UI is concerned, the upcoming Status Bar has a spot for execution time.

@jasongrout jasongrout removed this from the 1.0 milestone Nov 27, 2018
@jasongrout jasongrout added this to the Future milestone Nov 27, 2018
@bl-deepakchawla

@bl-deepakchawla bl-deepakchawla commented Jan 8, 2019

How can I see the running time of a cell in JupyterLab?

@YubinXie

@YubinXie YubinXie commented Jan 31, 2019

Any recent update on this function? Execute time is really one of the most crucial features in jupyter, since it records 'when this is done'. And it is important if jupyterlab is to replace notebook. Thanks!

@saulshanabrook
Member

@saulshanabrook saulshanabrook commented Jan 31, 2019

@YubinXie I just summarized the status on my PR: #5009 (comment)

If you are interested in helping push it over the line, I am happy to support you!

@pkasinathan

@pkasinathan pkasinathan commented Aug 22, 2019

+1

@lock lock bot locked as resolved and limited conversation to collaborators Oct 9, 2019
@saulshanabrook
Member

@saulshanabrook saulshanabrook commented Apr 13, 2020

If anyone is looking for a UI for execution timing, you can check out the jupyterlab-execute-time extension, which provides a similar UI to the notebook extension: https://github.com/deshaw/jupyterlab-execute-time
