Optional --output_path param & Async/await support #502

Open
EugeneTorap opened this issue May 12, 2020 · 4 comments

@EugeneTorap

Hi, I'm using papermill in a high-load microservice to display notebooks converted to HTML in real time.

  1. I don't need to create output notebooks, because I just receive the nb object, convert it to HTML and render it on the site.
    How can I skip or disable creating an output notebook?
  2. Regarding a new async/await API for execute_notebook: nbclient 0.2.0 supports async. Do you plan to add an async method such as async_execute_notebook, built on the new nbclient APIs async_setup_kernel, async_wait_for_reply and async_execute_cell?
@MSeal
Member

MSeal commented May 12, 2020

  1. I don't need to create output notebooks, because I just receive the nb object, convert it to HTML and render it on the site.

I'm assuming you're using the fetch pattern from papermill? Otherwise why not use nbclient / nbconvert directly if you're converting? Are you executing and converting to html or just converting to html? If you're just converting to html I'm not sure why you need papermill or nbclient -- sorry for the confusion here on my part.

How can I skip or disable creating an output notebook?

Set the output to /dev/null. In general, though, papermill is a highly opinionated tool, so it always requires an input and an output.
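
A minimal sketch of the /dev/null suggestion, assuming a POSIX system where `os.devnull` resolves to `/dev/null`. The helper name `run_without_output` and its `execute` parameter are hypothetical, added only so the helper can be exercised without papermill installed; by default it calls `papermill.execute_notebook`, which returns the executed notebook object.

```python
import os


def run_without_output(input_path, parameters=None, execute=None):
    """Execute a notebook and discard the output file.

    `execute` is a hypothetical injection point for testing; when it is
    omitted, papermill.execute_notebook is used.
    """
    if execute is None:
        # Imported lazily so the helper itself has no hard dependency.
        import papermill as pm

        execute = pm.execute_notebook
    # papermill still needs an output path; os.devnull throws the file away
    # while the executed notebook is returned in memory.
    return execute(input_path, os.devnull, parameters=parameters or {})
```

The returned object can then be handed straight to nbconvert. Behaviour on non-POSIX systems has not been checked here.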

  2. Regarding a new async/await API for execute_notebook: nbclient 0.2.0 supports async. Do you plan to add an async method such as async_execute_notebook, built on the new nbclient APIs async_setup_kernel, async_wait_for_reply and async_execute_cell?

Yes, we just haven't gotten around to adding async to papermill now that the stack above is async. One thing to note is that our IO fetch methods rely on external libraries that are mostly NOT async, so even if execution is made async, fetching / saving notebooks may not always be.
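
Until papermill has a native async API, one possible stopgap (not something proposed in this thread) is to push the blocking call onto an executor so the event loop stays responsive. The sketch below uses only the standard library; `run_blocking` is a hypothetical helper name, and whether `papermill.execute_notebook` is safe to run in a worker thread is an assumption that should be verified before relying on it.

```python
import asyncio
import functools


async def run_blocking(func, *args, executor=None, **kwargs):
    """Await a blocking callable without blocking the event loop.

    `executor=None` selects the loop's default thread pool; a
    concurrent.futures executor can be passed in instead.
    """
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments, so bind
    # keyword arguments up front with functools.partial.
    call = functools.partial(func, *args, **kwargs)
    return await loop.run_in_executor(executor, call)
```

Inside a request handler this would look like `nb = await run_blocking(pm.execute_notebook, path_to_nb, output_path, parameters=params)`. The notebook is still executed synchronously inside the worker; only the caller is freed up.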

@EugeneTorap
Author

@MSeal

Are you executing and converting to html or just converting to html?

I receive path_to_nb and params from a user request; params is the data used to parameterize the notebook. I then use papermill.execute_notebook to parameterize and execute the notebook, convert the resulting nb with nbconvert, and return the HTML to the user.

@MSeal
Member

MSeal commented May 12, 2020

Got it, makes sense. In that case I'd just use a tmp path based on the session id (or a random one) as the output and clean up the output on successful request termination.
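
The tmp-path approach could be sketched roughly as follows. `temp_output_path` and `render_html` are hypothetical names, and the sketch assumes `session_id` is already safe to use as a filename fragment. Unlike the suggestion above, it removes the file on failure as well as on success.

```python
import contextlib
import os
import tempfile
import uuid


@contextlib.contextmanager
def temp_output_path(session_id=None):
    """Yield a per-request output notebook path and delete it afterwards."""
    # Fall back to a random name when no session id is available.
    name = f"papermill-{session_id or uuid.uuid4().hex}.ipynb"
    path = os.path.join(tempfile.gettempdir(), name)
    try:
        yield path
    finally:
        # The file may never have been written if execution failed early.
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)


def render_html(input_path, params, session_id=None):
    """Execute a notebook with papermill and return it as an HTML string."""
    # Imported lazily so temp_output_path can be used on its own.
    import papermill as pm
    from nbconvert import HTMLExporter

    with temp_output_path(session_id) as out_path:
        nb = pm.execute_notebook(input_path, out_path, parameters=params)
        body, _resources = HTMLExporter().from_notebook_node(nb)
    return body
```

Keeping failed outputs around for debugging would mean moving the `os.remove` call out of the `finally` block so it only runs after a successful request.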

On the async front, happy to review PRs if you wanted to help add async support :)

@devstein

Hi @MSeal, I'm having issues executing the sparkmagic PySpark kernel while testing the async changes on this PR. The execution hangs on the first cell (see logs below).

Do you know if this is related to adding async support to papermill? If so, I'm happy to try to make a contribution.

Using selector: EpollSelector
Starting kernel (async): ['/usr/bin/python3.7', '-m', 'sparkmagic.kernels.pysparkkernel.pysparkkernel', '-f', '/tmp/tmpk5lukfi7.json']
Connecting to: tcp://127.0.0.1:34727
connecting iopub channel to tcp://127.0.0.1:57569
Connecting to: tcp://127.0.0.1:57569
connecting shell channel to tcp://127.0.0.1:55435
Connecting to: tcp://127.0.0.1:55435
connecting stdin channel to tcp://127.0.0.1:47223
Connecting to: tcp://127.0.0.1:47223
connecting heartbeat channel to tcp://127.0.0.1:58479
Using selector: EpollSelector
connecting control channel to tcp://127.0.0.1:34727
Connecting to: tcp://127.0.0.1:34727
Executing notebook with kernel: pysparkkernel
Executing Cell 1---------------------------------------
Skipping non-executing cell 0
Ending Cell 1------------------------------------------
Executing Cell 2---------------------------------------
Skipping non-executing cell 1
Ending Cell 2------------------------------------------
Executing Cell 3---------------------------------------
Executing cell:
%%info
msg_type: status
content: {'execution_state': 'busy'}
msg_type: execute_input
content: {'code': '%%info', 'execution_count': 1}
msg_type: status
