-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Combined run_cell and execute_cell #12
Conversation
I would like to break execute_cell into smaller parts, but given it's complexity from nbconvert and all the tracking it does I wait to tackle that later on. |
@choldgraf if you wouldn't mind taking a look at this one, I'd like to merge it before the initial release and before more PRs get made that'd be in conflict. |
Hey, will try to get to it soon. Sorry, I just finished a flight to Australia 🇦🇺 |
No worries, I really appreciate all the reviews you've been doing. If you can't get to it for a bit that's fine too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks pretty reasonable to me - what was the differing purposes of run_cell
and execute_cell
in the first place? I had one question about execute_cell
s args, but other than that this looks like a nice simplification to me
nbclient/execute.py
Outdated
if execution_count: | ||
cell['execution_count'] = execution_count | ||
self._check_raise_for_error(cell, exec_reply) | ||
self.nb['cells'][cell_index] = cell |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we've already got access to the notebook in self.nb
, couldn't this method be simplified by simply providing a cell_index
rather than cell
and cell_index
? Then the first thing that could be done is running
cell = self.nb['cells'][cell_index]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we've already got access to the notebook in
self.nb
I think that's actually a bad thing.
The notebook should not be stored in the executor!
This mistake was made in nbconvert
, where the notebook was passed to the preprocess()
method and subsequently stored in the executor:
A later preprocess()
call could overwrite the locally stored notebook.
Therefore, the preprocess()
call was not re-entrant.
I've already mentioned this problem in nbconvert
: jupyter/nbconvert#886
During the move to nbclient
, the situation seems to have gotten worse!
Now the notebook seems to be passed to the constructor of the executor, which means one executor instance can only ever execute a single notebook.
Is that really what you want?
To me, it would make much more sense if one executor instance could execute multiple notebooks in a row. Ideally also concurrently.
[UPDATE: I've created a separate issue for this: #18]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the notebook seems to be passed to the constructor of the executor, which means one executor instance can only ever execute a single notebook.
Is that really what you want?
Yes. There's a lot of state it holds for this particular execution and all the configuration for the execution means that having a owning object makes it a lot easier to manage this state and object manipulation. Otherwise we would have to pass many many arguments to every function, making the function signatures unwieldy.
A later preprocess() call could overwrite the locally stored notebook.
Therefore, the preprocess() call was not re-entrant.
...During the move to nbclient, the situation seems to have gotten worse!
The issue in nbconvert was that it was half-way between abstractions. There were some functions that were reusable and some that were not without a new object. This class should now be re-entrant on execution methods at the top. .execute
called twice will execute the whole notebook twice without having prior run state cause issues (I should add a test for this). I don't think it's "worse", it just more an object oriented approach than a functional one. in this case with lots of state and configuration object oriented has a lot of benefits over functional.
The rename that merged for changing to NotebookClient should help with the abstraction of concerns. Specifically it's denoting it acts on behalf of a notebook, meaning it should hold the state of the said notebook.
To me, it would make much more sense if one executor instance could execute multiple notebooks in a row. Ideally also concurrently.
I didn't design for this pattern. We could have designed for that but I choose to keep it closer to how the code was before to reduce the number of changes -- also because I'm not sure that's strictly a better model. You do have the execute
method for repeating calls to the client object to execute everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't design for this pattern. We could have designed for that but I choose to keep it closer to how the code was before to reduce the number of changes -- also because I'm not sure that's strictly a better model.
This is pragmatic. I'd like to see the code evolve as well to be more functional, but at the same time we're still in this annoying mess of very class oriented dynamic traitlets for configuration in Jupyterland.
Mostly an artifact of organic growth of the code in nbconvert I think. There were competing thoughts of how the interface should operate that weren't reconciled. One of the two was doing some state prep before it did an almost stateless execution call. Not necessary at this point imo. |
I think after the conflict is cleaned up you should go ahead with this simplification before taking it even further for a refactor. 👍 |
I agree w/ @rgbkrk - we can tidy-up and extend the package over time, but I don't think any of that should block on an initial release, and this is a clear improvement to me! |
Woo 🎉 |
Addresses #11
Changed the default for store_history to False (was intended to be changed with nbconvert 6.0). In doing so I found an issue with ipython kernel execution, where execution counts were not tracked when history is disabled. I've made nbclient manage execution_count itself by default now with the ability to control the behavior in execute_cell.
Also removed extra debug prints left in tests