Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a timeout argument to execute #48

Merged
merged 9 commits into from
May 29, 2020

Conversation

AakashGfude
Copy link
Member

a timeout option to specify what is the maximum time allowed to execute for each cell before it times out. The default specified here is 30s, which is also the default value of ExecutePreProcessor.

@chrisjsewell
Copy link
Member

As in https://nbsphinx.readthedocs.io/en/0.6.0/timeout.html#Cell-Execution-Timeout,
this should this also be controlled by notebook level metadata, i.e. there would be a global value (which for myst-nb could be an option in conf.py) which is used by default, but would be superseded by a value set in the notebook metadata, if present.

@AakashGfude
Copy link
Member Author

Great. So, we will need to read the timeout metadata information somewhere in that loop before https://github.com/ExecutableBookProject/jupyter-cache/blob/517366e31adeafd4818fb40f7c85a364e44f600b/jupyter_cache/executors/basic.py#L159.

I will update this PR for this, and the one in myst-nb to check for conf variable

@AakashGfude
Copy link
Member Author

Hey @chrisjsewell , a quick question. Where do you think the timeout keyword should reside here in the metadata. Like in the example above it is inside "nbsphinx" object.
We don't have any objects at present in metadata where timeout would fit. Nor I guess, it makes sense to create a new one unless we plan to have more variables.
Top-level sounds good enough?

@chrisjsewell
Copy link
Member

Where do you think the timeout keyword should reside here in the metadata.

Not quite sure TBH, maybe execution/timeout rather than "polluting" the base namespace.
Would also be nice if it was something agreed on in nbclient

cc'in @choldgraf

@choldgraf
Copy link
Member

I opened up jupyter/nbclient#45 to see if others in the nbclient repo have thoughts

@AakashGfude
Copy link
Member Author

AakashGfude commented Apr 17, 2020

@choldgraf should we just create an object like nbsphinx does to store the timeout.

"mystnb": {
   "timeout": 60
}

No one seems to be replying to that thread. Probably there is no jupyter-standard, else nbsphinx might have used it as well?
Else we can wait a bit longer for their reply.

@choldgraf
Copy link
Member

I think that you're right that there isn't really a standard, so let's just go with something that seems reasonable to us and we can support a new standard later on if we wish

@chrisjsewell
Copy link
Member

chrisjsewell commented Apr 17, 2020

jupyter-client doesn't/shouldn't have any direct ties to mystnb though, so it would be better to use a more generic top level key like execution or nbclient which we would then suggest in jupyter/nbclient#45

@AakashGfude
Copy link
Member Author

AakashGfude commented Apr 20, 2020

jupyter-client doesn't/shouldn't have any direct ties to mystnb though, so it would be better to use a more generic top level key like execution or nbclient which we would then suggest in jupyter/nbclient#45

ok, so committing again with execution key then

Added

"execution": {
     "timeout":  
}

@choldgraf
Copy link
Member

Just a note that matt did have some thoughts in jupyter/nbclient#45 he just posted...it mostly comes down to "there is no current standard" but I think nbclient folks are open to agreeing on a metadata spec that is reasonable.

@AakashGfude
Copy link
Member Author

@choldgraf that's great. So, basically he prefers to have separate metadata for execution as well instead of namespaced subsection. Then we can ditch the option of putting timeout in nbclient and can use execution instead in the top-level?

@choldgraf
Copy link
Member

I'm not entirely sure - but feel free to throw out some ideas in the nbclient thread and perhaps we can come to a quick agreement

@choldgraf
Copy link
Member

choldgraf commented May 22, 2020

@AakashGfude this looks good from a functionality standpoint, three major points:

  1. Could you document the timeout parameter in the docstring of one of those classes? E.g. make it clear that the number is in seconds, and also add to the docs of jupyter-cache about how to control this?
  2. Could you add a test to make sure that this works as expected. Maybe just add the notebook timeout metadata to one of the test notebooks and make sure there's a computation in there that both: 1. breaks the timeout, and 2. happens underneath the timeout
  3. I think it should also be possible to control the timeout globally from, say, the command line or the Python API. That way downstream in Jupyter Book, we could set a single config like execution_timeout: 60 and it would then set the timeout for every notebook execution, rather than requiring us to put the metadata in every notebook.

Does that all make sense? Lemme know what you think.

@AakashGfude AakashGfude force-pushed the exec-timeout branch 2 times, most recently from dc84e82 to d4a6bbc Compare May 23, 2020 15:45
@AakashGfude
Copy link
Member Author

AakashGfude commented May 23, 2020

@choldgraf have added tests and also added this as an option in cli commands, coz it might come handy?
usage example (for 10 second timeout):-

jb execute --timeout 10 or jb execute -t 10 
  1. Could you document the timeout parameter in the docstring of one of those classes? E.g. make it clear that the number is in seconds, and also add to the docs of jupyter-cache about how to control this?

have added a line in python API section of the docs. But, the execute command line option has none of its parameters specified in the docs, so, was not sure where to put it.
would like to know your thoughts.

  1. Could you add a test to make sure that this works as expected. Maybe just add the notebook timeout metadata to one of the test notebooks and make sure there's a computation in there that both: 1. breaks the timeout, and 2. happens underneath the timeout

Thanks. Have added the tests now.

  1. I think it should also be possible to control the timeout globally from, say, the command line or the Python API. That way downstream in Jupyter Book, we could set a single config like execution_timeout: 60 and it would then set the timeout for every notebook execution, rather than requiring us to put the metadata in every notebook.

it indeed was setup to handle a global config value for timeout , In jupyter-book inside the execute object in _config.yml, we can have a timeout key.

@AakashGfude AakashGfude force-pushed the exec-timeout branch 3 times, most recently from 8b5df22 to a7c2b8d Compare May 23, 2020 16:01
@AakashGfude
Copy link
Member Author

AakashGfude commented May 23, 2020

This is so weird. Not sure, what got updated, but pre-commit tests are failing for files in Travis in which I did not work on. Needless to say, Working fine on my local system.

@AakashGfude
Copy link
Member Author

not sure why complex_outputs*.ipynb files are failing in Travis CI. Probably because of lack of installed packages in the server. I will have a deeper look later.

@AakashGfude
Copy link
Member Author

AakashGfude commented May 24, 2020

Have added the same files complex_outputs_unrun and complex_outputs_unrun_metadata as MyST-NB tests. and also the same extra testing dependencies.

@choldgraf
Copy link
Member

nice! is this ready for a review or merge?

@AakashGfude
Copy link
Member Author

@choldgraf yes

Copy link
Member

@choldgraf choldgraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me - just two quick comments in there 👍 thanks Aakash!

jupyter_cache/cli/commands/cmd_cache.py Outdated Show resolved Hide resolved
@@ -57,12 +57,13 @@ class JupyterExecutorBasic(JupyterExecutorAbstract):

The execution is split into two methods: `run` and `execute`.
In this way access to the cache can be synchronous, but the execution can be
multi/async processed.
multi/async processed. Takes timeout parameter in seconds for execution
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a short section on "timeout" under the execution documentation here: https://jupyter-cache.readthedocs.io/en/latest/using/api.html#execution

And mention that you can specify a timeout either via a the call to jupyter-cache, or via the execution: timeout: metadata field?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had added a small Note there https://github.com/executablebooks/jupyter-cache/pull/48/files#diff-7d5c2179cfd8d418561a37c077b98278R1063 . But, creating a small section sounds better

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@choldgraf what do you think?

@choldgraf
Copy link
Member

ok this LGTM and the integration tests are passing, so I'll give it a merge!

@choldgraf choldgraf merged commit 55db432 into executablebooks:master May 29, 2020
@choldgraf
Copy link
Member

thanks very much @AakashGfude for working on this one! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants