Refactored execute preprocessor to have a process_message function #905

MSeal · 2018-10-28T23:58:52Z

Built ontop of #982 which added additional requested testing. We can merge that PR first if it's desired.

Broke apart the large run_cell function to enable intercepting messages easier. This will enable deleting code from papermill where we have to reimplement the entire run_cell method.

MSeal · 2018-10-29T01:08:26Z

Hmmm seeings this test fail against master so I don't think it's related to the PR

maartenbreddels · 2018-10-29T20:32:57Z

I see the same in #900

mpacer · 2018-10-30T22:28:29Z

@Carreau I think you mentioned something about changing the error formatting in IPython a while back — shall I assume that the errors that are cropping up are related to a recent release of IPython?

mpacer · 2018-10-30T22:39:44Z

nbconvert/preprocessors/execute.py


-            outs.append(out)
+                cell.outputs.append(result)


I don't really like the idea of having an API that works primarily via side-effects — that's a difficult API to document and test for.

Can you think of a way that we could do this more as a pure function? E.g., could we adhere to the pattern for we use with preprocessors? I think that would involve returning and reassigning the modified cell object rather than returning the result itself while treating the cell.outputs as a location where we can append changes?

I can take a look. Mostly I tried to keep the logic the same and not change the design, just break it into smaller parts.

mpacer

First, this is a really exciting idea as it should make it much easier for people to handle messages in custom ways, which I know we're already doing in papermill in order to log messages as they come in.

~~However, given how this is a piece of core functionality to the app, I'd prefer to see a new test for handling the new API in the standard case explicitly.~~ Because it's such core functionality it would be a testing failure if we weren't already covering this.

Ideally, I think we should also add a new test specifically to add a mocked replacement method, in part because that would surface some of the details of how the API would work when it is overridden. I think the side-effects would show up as an anti-pattern in that case.

Overall, I'd like if if we could find a way to make the new API work with fewer side-effects and if we could make the logic of how messages are being handled a little more explicit.

mpacer · 2018-10-30T22:42:00Z

nbconvert/preprocessors/execute.py

+                if cell_index in cell_map:
+                    cell_map[cell_index] = []
+        # Check for remaining messages we don't process
+        elif not (msg_type in ['execute_input', 'update_display_data'] or msg_type.startswith('comm')):


This seems like really different logic from what we had before with msg_type.startsith('comm'); previously it was a continue.

It's actually identical logically, by not returning None as a function it will effectively continue for the parent function loop

Might have missed the not?

I don't think so. If the msg_type.startswith('comm') is true, then it does not fall into here and is not handled with another else, instead it will return the msg as the response (since that's what it defaults to and it is never reassigned).

Additionally now if something is an update_display_data message, it will return something (specifically the content of the msg).

I'm starting to wonder if this all might be fixed if we default to returning None…

Was working fine post-rebase, but I changed this to a raise control flow pattern which I think is a lot easier to read.

mpacer · 2018-10-30T22:45:48Z

nbconvert/preprocessors/execute.py

            try:
-                out = output_from_msg(msg)
+                # Assign output as our processed "result"


Unless I'm reading this incorrectly, I think this may have in effect added nesting and only applied The cell_map logic in the case where it is not an execute_input or a comm? if that's what this is intended to accomplish we should talk about that, but I was reviewing this originally as a refactor not a change in functionality.

! (A || B) == (!A and !B) So it should be equivalent to before I believe. I could change it to msg_type not in ['execute_input', 'update_display_data'] and not msg_type.startswith('comm') if you think it reads easier?

The problem is that display_id should have been dealt with in the cell_map logic regardless of the message type previously, except in the cases where a continue was followed. That try is not going to be hit by all message types and so the else also won't be hit by all message types.

mpacer · 2018-10-30T22:48:48Z

nbconvert/preprocessors/execute.py

+
+        display_id = content.get('transient', {}).get('display_id', None)
+        if display_id and msg_type in {'execute_result', 'display_data', 'update_display_data'}:
+            self._update_display_id(display_id, msg)


Previously there was a continue in the case where there was an update_display_data message with a statement that # update_display_data doesn't get recorded am I correct that that logic is now implicit if it's present at all?

@MSeal could you respond to this comment?

Late response (post rebase), but yes it's implicit by not raising CellExecutionComplete here

Carreau · 2018-10-30T23:25:43Z

I made a release of IPython on Sept 27 and one on Oct 27. There are likely errors due to that.

…

On Tue, Oct 30, 2018, 16:04 M Pacer ***@***.***> wrote: ***@***.**** requested changes on this pull request. I'd prefer to see a new test for handling this functionality in the standard case as well as a mocked replacement method as a way to make sure we understand how the API would work. Overall, I'd like if if we could find a way to make the new API work with fewer side-effects and if we could make the logic of how messages are being handled a little more explicit. ------------------------------ In nbconvert/preprocessors/execute.py <#905 (comment)>: > + # set the prompt number for the input and the output + if 'execution_count' in content: + cell['execution_count'] = content['execution_count'] + + if msg_type == 'status': + if content['execution_state'] == 'idle': + # Set result to None to halt execution + result = None + elif msg_type == 'clear_output': + cell.outputs[:] = [] + # clear display_id mapping for this cell + for display_id, cell_map in self._display_id_map.items(): + if cell_index in cell_map: + cell_map[cell_index] = [] + # Check for remaining messages we don't process + elif not (msg_type in ['execute_input', 'update_display_data'] or msg_type.startswith('comm')): This seems like really different logic from what we had before with msg_type.startsith('comm'); previously it was a continue. ------------------------------ In nbconvert/preprocessors/execute.py <#905 (comment)>: > try: - out = output_from_msg(msg) + # Assign output as our processed "result" Unless I'm reading this incorrectly, I think this may have in effect added nesting and only applied The cell_map logic in the case where it is not an execute_input or a comm? if that's what this is intended to accomplish we should talk about that, but I was reviewing this originally as a refactor not a change in functionality. ------------------------------ In nbconvert/preprocessors/execute.py <#905 (comment)>: > + + def process_message(self, msg, cell, cell_index): + ''' + Returns None if execution should be halted. + ''' + msg_type = msg['msg_type'] + self.log.debug("msg_type: %s", msg_type) + content = msg['content'] + self.log.debug("content: %s", content) + + # Default to our input as the "result" of processing the message + result = msg + + display_id = content.get('transient', {}).get('display_id', None) + if display_id and msg_type in {'execute_result', 'display_data', 'update_display_data'}: + self._update_display_id(display_id, msg) Previously there was a continue in the case where there was an update_display_data message with a statement that # update_display_data doesn't get recorded am I correct that that logic is now implicit if it's present at all? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#905 (review)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAUez8A5iUQVDLSd8gyzJxopCU4TrHcPks5uqNsagaJpZM4X-Q1I> .

MSeal · 2018-10-31T04:22:36Z

I can add a test for the inheritance case. It did seem when I tried moving things to the wrong places on purpose that tests would usually fail. I'll do a more thorough check.

Do you want me to do functional rewrite to avoid the side_effect patterns already there? Might be higher risk of changing execution logic but I can do the more extensive changes if you think it's worth it.

MSeal · 2018-12-08T00:39:13Z

@mpacer I added a new test. From reading the code I think we'd need to do a semi-significant update to add in the message handler we talked about. Because of that I'd prefer to split the implementation into two parts and do that refactor post-merge of this PR.

nbconvert/preprocessors/tests/test_execute.py

+
+        original = copy.deepcopy(input_nb)
+        wpp = WrappedPreProc()
+        executed = wpp.preprocess(input_nb, {})[0]


mpacer · 2018-12-13T22:36:00Z

nbconvert/preprocessors/tests/test_execute.py

+        original = copy.deepcopy(input_nb)
+        wpp = WrappedPreProc()
+        executed = wpp.preprocess(input_nb, {})[0]
+        self.assertEqual(outputs, [


Elsewhere in the library we tend to use direct asserts as we are using pytest to run the tests. In addition to stylistic consistency, I think it's a lot easier to read.

Noted, will fix

nbconvert/preprocessors/tests/test_execute.py

mpacer · 2018-12-13T22:47:13Z

nbconvert/preprocessors/tests/test_execute.py

+    def test_process_message_wrapper(self):
+        outputs = []
+
+        class WrappedPreProc(ExecutePreprocessor):


I would prefer if we directly created the context needed so that process_message can be tested functionally, iterating over the cell, msg and cell_index as directly as possible.

I'd prefer if we weren't concatenating all of the outputs into a single object for the whole notebook, I'm thinking it should follow the cell structure.

I'd also like to make sure that message types that are supposed to return None are returning None, which means keeping track of at least the msg type in relation to the output if not some of the more predictive fields (e.g., a display_id that we know is set to a particular value).

Done -- this is rebased off #982 now

MSeal · 2019-04-15T03:19:18Z

@mpacer PR should now be ready for re-review

…r each kernel message

mpacer · 2019-04-16T18:09:12Z

nbconvert/preprocessors/execute.py

@@ -25,6 +25,8 @@
 from ..utils.exceptions import ConversionException


+class CellExecutionComplete(Exception): pass  # Used as a control signal


can you make this an actual docstring? Specifically one that describes what this is going to be used for?

mpacer · 2019-04-16T18:11:27Z

nbconvert/preprocessors/tests/test_execute.py

@@ -185,18 +185,18 @@ def normalize_output(output):
    def assert_notebooks_equal(self, expected, actual):
        expected_cells = expected['cells']
        actual_cells = actual['cells']
-        self.assertEqual(len(expected_cells), len(actual_cells))
+        assert len(expected_cells) == len(actual_cells)


Huzzah! Many thanks for switching to using the pytest assert idiom.

mpacer · 2019-04-16T18:14:13Z

nbconvert/preprocessors/tests/test_execute.py

+            def process_message(self, msg, cell, cell_index):
+                result = super(WrappedPreProc, self).process_message(msg, cell, cell_index)
+                if result:
+                    outputs.append(result)


why does this still need special logic distinct from the standard process_message?

It's purely to test that one could wrap an retrieve outputs to process on it's own. It has no real purpose for doing so beyond showing how to wrap the function in inherited classes. Namely to prove we can wrap the function safely in papermill cleanly.

mpacer · 2019-04-21T23:50:28Z

This looks great! Thanks for working through all the issues!

MSeal requested review from minrk and mpacer October 28, 2018 23:58

MSeal mentioned this pull request Oct 28, 2018

Moved code into nbconvert nteract/papermill#241

Closed

mpacer reviewed Oct 30, 2018

View reviewed changes

mpacer requested changes Oct 30, 2018

View reviewed changes

MSeal force-pushed the processMessage branch from 114d66e to b64442e Compare December 8, 2018 00:37

MSeal mentioned this pull request Dec 10, 2018

5.4.1 Release #929

Closed

mpacer reviewed Dec 13, 2018

View reviewed changes

nbconvert/preprocessors/tests/test_execute.py

original = copy.deepcopy(input_nb)

wpp = WrappedPreProc()

executed = wpp.preprocess(input_nb, {})[0]

This comment was marked as outdated.

Sign in to view

mpacer reviewed Dec 13, 2018

View reviewed changes

nbconvert/preprocessors/tests/test_execute.py Show resolved Hide resolved

mpacer reviewed Dec 13, 2018

View reviewed changes

MSeal mentioned this pull request Mar 5, 2019

Output notebook while executing #648

Open

MSeal mentioned this pull request Mar 18, 2019

output saved partially to the notebook when executing with timeout=-1 #968

Closed

MSeal mentioned this pull request Apr 7, 2019

Added tests for each branch in execute's run_cell method #982

Merged

MSeal force-pushed the processMessage branch from b64442e to f2a99cc Compare April 15, 2019 02:01

MSeal force-pushed the processMessage branch from f2a99cc to 6804e9e Compare April 16, 2019 17:30

MSeal added 5 commits April 16, 2019 11:04

Added tests for each branch in execute's run_cell method

72bac11

Refactored execute preprocessor to have a process_message function fo…

6ed3f0b

…r each kernel message

Added execute wrapper test

a08dc68

Refactored process_message to raise CellExecutionComplete on completion

6cbad47

Changed new asserts to pytest friendlier pattern

9ed71d7

MSeal force-pushed the processMessage branch from 6804e9e to 9ed71d7 Compare April 16, 2019 18:04

mpacer reviewed Apr 16, 2019

View reviewed changes

MSeal added 3 commits April 16, 2019 11:16

Added docstring to control exception

a8b4989

Added backwards compatability inheritance fix for run_cell

498d886

Removed unnecessary else

2f9c697

mpacer merged commit 7502276 into jupyter:master Apr 21, 2019

MSeal added this to the 5.5 milestone Apr 23, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactored execute preprocessor to have a process_message function #905

Refactored execute preprocessor to have a process_message function #905

MSeal commented Oct 28, 2018 •

edited

MSeal commented Oct 29, 2018

maartenbreddels commented Oct 29, 2018

mpacer commented Oct 30, 2018

mpacer Oct 30, 2018 •

edited

MSeal Oct 31, 2018

mpacer left a comment •

edited

mpacer Oct 30, 2018

MSeal Oct 31, 2018 •

edited

MSeal Oct 31, 2018

mpacer Dec 13, 2018

mpacer Dec 13, 2018

MSeal Apr 15, 2019

mpacer Oct 30, 2018

MSeal Oct 31, 2018

mpacer Dec 13, 2018 •

edited

mpacer Oct 30, 2018

mpacer Dec 13, 2018

MSeal Apr 15, 2019

Carreau commented Oct 30, 2018 via email

MSeal commented Oct 31, 2018

MSeal commented Dec 8, 2018

This comment was marked as outdated.

mpacer Dec 13, 2018

MSeal Dec 14, 2018

mpacer Dec 13, 2018

MSeal Apr 15, 2019

MSeal commented Apr 15, 2019

mpacer Apr 16, 2019 •

edited

mpacer Apr 16, 2019

mpacer Apr 16, 2019

MSeal Apr 16, 2019

mpacer commented Apr 21, 2019

		@@ -25,6 +25,8 @@
		from ..utils.exceptions import ConversionException


		class CellExecutionComplete(Exception): pass # Used as a control signal

Refactored execute preprocessor to have a process_message function #905

Refactored execute preprocessor to have a process_message function #905

Conversation

MSeal commented Oct 28, 2018 • edited

MSeal commented Oct 29, 2018

maartenbreddels commented Oct 29, 2018

mpacer commented Oct 30, 2018

mpacer Oct 30, 2018 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mpacer left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MSeal Oct 31, 2018 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mpacer Dec 13, 2018 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Carreau commented Oct 30, 2018 via email

MSeal commented Oct 31, 2018

MSeal commented Dec 8, 2018

This comment was marked as outdated.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MSeal commented Apr 15, 2019

mpacer Apr 16, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mpacer commented Apr 21, 2019

MSeal commented Oct 28, 2018 •

edited

mpacer Oct 30, 2018 •

edited

mpacer left a comment •

edited

MSeal Oct 31, 2018 •

edited

mpacer Dec 13, 2018 •

edited

mpacer Apr 16, 2019 •

edited