Expose error due to failure of local python pipeline node execution #1411

kiersten-stokes · 2021-03-10T23:34:20Z

Resolves #1382 by adding an except block to trap CalledProcessError exceptions raised specifically when Python subprocess execution fails.
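A minimal sketch of the pattern described above, assuming the script is launched with `subprocess.run(..., check=True)`. The helper name `run_script` and the message wording are hypothetical and are not the code in this PR. It uses `stderr=subprocess.PIPE` rather than `capture_output`, since the latter requires Python 3.7 or later.

```python
import subprocess
import sys


def run_script(argv):
    """Run a command and surface a failure as a RuntimeError (hypothetical helper)."""
    try:
        # check=True makes run() raise CalledProcessError on a non-zero exit code.
        # stderr is captured so the child's traceback is available on the exception.
        subprocess.run(argv, check=True, stderr=subprocess.PIPE)
    except subprocess.CalledProcessError as cpe:
        # stderr is bytes here because no text mode was requested.
        stderr_text = cpe.stderr.decode(errors="replace").strip()
        raise RuntimeError(f"Error executing {argv[-1]}: {stderr_text}") from cpe
```

Chaining with `from cpe` keeps the original exception reachable through `__cause__`, so the full details still appear in any logged traceback.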

New look

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

elyra-bot · 2021-03-10T23:34:22Z

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link:

…rsten-stokes/elyra into local-failure-error-handling

kevin-bates · 2021-03-11T23:12:59Z

Hi @kiersten-stokes - thanks for working on this.

I've pulled your branch and run a pipeline with a known error. Prior to this change, the error message produced from a local run was:

After the changes, it seems that most of the traceback now appears in the title. Also, note the double scrollbars in the details window - along with the '.' - which leads me to believe something additional was expected.

While the actual error (File 'node1a.out' already exists!) appears a tad more prominently at the bottom of the details box, I'm not sure it's an improvement.

Have you stepped through with a debugger to see what pieces of CalledProcessError might be usable? If we can identify a succinct portion of the actual error message, we could log the full error and raise another error with the more succinct portion. At least those are my thoughts on this.

I would also recommend adding an additional assertion in the test that ensures this (TBD) succinct portion is in the raised exception.
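One way such an assertion could look, sketched with the standard library's `unittest`. The function `failing_operation` is a hypothetical stand-in for a pipeline run with a known error, and the asserted text is only an example since the succinct portion was still to be decided at this point in the discussion.

```python
import unittest


def failing_operation():
    """Stand-in for a pipeline run that fails (hypothetical)."""
    raise RuntimeError("Error processing operation node1py: TypeError: bad operand")


class ErrorMessageTest(unittest.TestCase):
    def test_succinct_portion_is_reported(self):
        # assertRaisesRegex checks both the exception type and that the
        # pattern is found somewhere in the exception message.
        with self.assertRaisesRegex(RuntimeError, "TypeError"):
            failing_operation()
```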

kevin-bates · 2021-03-11T23:23:44Z

@kiersten-stokes - please ignore my last comment! I just realized, when looking into debugging this, that your changes only apply to python script execution, not notebook node execution - which is what my response was about. I apologize.

The different output must be a function of the PR branch, and I wonder if there are changes in rc3 that are not in your PR. At any rate, I will switch to looking at this from the perspective of a python node.

In general, when fixing issues that change what is visible to the user, I try to include the updated message or dialog for the reviewer. If you could update your opening description with a screenshot of the new message, that would be helpful - thank you.

kevin-bates · 2021-03-12T00:07:38Z

OK - here's the error dialog - now that I know I'm running the right stuff!!!

This looks good. The header is detailed yet the details are rich with traceback.

It would be awesome if the header portion could read more like:
Error processing operation node1py: TypeError: '<' not supported between instances of 'str' and 'int' .

Have you explored gaining access to only the TypeError portion? Not sure this is possible w/o impacting the traceback info.

kiersten-stokes · 2021-03-12T00:13:51Z

@kevin-bates No double scrollbars in my case.

It would be awesome if the header portion could read more like:
Error processing operation node1py: TypeError: '<' not supported between instances of 'str' and 'int' .

^^ I completely agree. I tried to get something like this, but unfortunately I don't think CalledProcessError returns anything that can be used to get nicer formatting. Let me triple check on that front, though!

In general, when fixing issues that change what is visible to the user, I try to include the updated message or dialog for the reviewer. If you could update your opening description with a screenshot of the new message, that would be helpful - thank you.

^^ Will do! Thanks for another good tip!

elyra/pipeline/processor_local.py

kevin-bates · 2021-03-12T01:44:41Z

I completely agree. I tried to get something like this, but unfortunately I don't think CalledProcessError returns anything that can be used to get nicer formatting. Let me triple check on that front, though!

You're right - CPE is very limited (kinda frustrating actually). I think trying to log something useful and trimming the error message string to essentially the file_name might be the best we can do.

I think we have some better options with notebook node issues. Trapping PapermillExecutionError (and its evalue field), similar to what you've done for (the limited) CalledProcessError, might be something we should add as well (in the notebook processing method).

In your experience with CPE.stderr is it always the case that the interesting error is at the end of the stream (TypeError: in this case)? (We might want to experiment with a few.) If so, perhaps a reverse search of 'Error:' - followed by further capture of the full error name, might prove a useful way to grab the interesting portion.
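The reverse search suggested above might look roughly like this. It is a sketch that assumes the interesting exception is the last line of stderr containing `Error:`, and the helper name `succinct_error` is hypothetical.

```python
def succinct_error(stderr_text):
    """Return the last line containing 'Error:', or None (hypothetical helper)."""
    # Search backwards, since the raised exception is normally printed last.
    idx = stderr_text.rfind("Error:")
    if idx == -1:
        return None
    # Walk back to the start of the line to capture the full error name,
    # e.g. the 'Type' in 'TypeError:'. rfind returns -1 on the first line,
    # which the +1 turns into index 0.
    line_start = stderr_text.rfind("\n", 0, idx) + 1
    line_end = stderr_text.find("\n", idx)
    if line_end == -1:
        line_end = len(stderr_text)
    return stderr_text[line_start:line_end].strip()
```

This misses exceptions whose names do not end in `Error` (for example `KeyboardInterrupt`) and can be misled by messages that themselves contain `Error:`, so a fallback message is still needed when it returns None.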

kiersten-stokes · 2021-03-12T23:48:43Z

Local notebook handling errors now look like the below, matching the general style of the log and popup box messages for python script execution errors. Note that the evalue attribute of the PapermillExecutionError is sometimes empty, as can be seen in the second screenshot when compared to the corresponding lines of code (line 211 in processor_local).

kevin-bates

This looks really good - just had a few comments.

kevin-bates · 2021-03-13T00:39:37Z

elyra/pipeline/processor_local.py

+        except papermill.PapermillExecutionError as pmee:
+            self.log.error(f'Internal error executing {file_name}: {str(pmee.ename)}' +
+                           f'{str(pmee.evalue)} in [{str(pmee.exec_count)}]')
+            raise RuntimeError(f'{str(pmee.ename)} {str(pmee.evalue)} in [{str(pmee.exec_count)}]') from pmee


I find the in [2] portion of this confusing and regular users won't know what this means. I suggest we drop this from the RuntimeError and even the log message.

The log will contain a traceback for this somewhere - I suspect when the web request throws - so that kind of information should be in the log.

Will do! Makes sense if the normal user wouldn't get any useful info from it

Perhaps if we use notebook cell X?

@ptitzler - good idea, but I'd rather convey the cell index slightly differently...

f'{str(pmee.ename)} {str(pmee.evalue)} executing cell {str(pmee.cell_index)}'

which would yield...

Error processing operation load_data: FileNotFoundError [Errno 2] No such file or directory: 'data/file1.csv' executing cell 3

or

Error processing operation load_data: AssertionError executing cell 3

per examples above. Does that sound okay?

(Note: I'm assuming the attribute name is cell_index.)

Actually, amending my previous comment, cell_index is not the correct attribute, exec_count still gives the correct cell value.
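The message format discussed in this thread can be sketched as follows. To stay self-contained it takes the three values as plain arguments instead of a real PapermillExecutionError, and the function name `notebook_error_message` is hypothetical. It also skips an empty evalue, which was noted earlier as a case that does occur.

```python
def notebook_error_message(ename, evalue, exec_count):
    """Build '<ename> <evalue> executing cell <n>' (hypothetical helper)."""
    parts = [str(ename)]
    # evalue can be empty (e.g. a bare assert), so only include it when
    # present to avoid a doubled space in the message.
    if evalue:
        parts.append(str(evalue))
    parts.append(f"executing cell {exec_count}")
    return " ".join(parts)
```

With the values from the examples above, this yields `FileNotFoundError [Errno 2] No such file or directory: 'data/file1.csv' executing cell 3` and `AssertionError executing cell 3`.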

kevin-bates · 2021-03-13T00:44:13Z

elyra/pipeline/processor_local.py

        except Exception as ex:
-            raise RuntimeError(f'Internal error executing {filepath}: {ex}') from ex
+            self.log.error(f'Internal error executing {file_name}: {str(ex)}')
+            raise RuntimeError('Internal error executing notebook') from ex


We should include the notebook name in the error.

Suggested change

-            raise RuntimeError('Internal error executing notebook') from ex
+            raise RuntimeError(f'Internal error executing {file_name}') from ex

Ah sorry, I must have misunderstood this piece in our earlier conversation. I removed the file name in the popup message because the exception message one up the chain already includes the node name, which already is the file name (sans extension).

Error processing operation load_data: Internal error executing load_data.ipynb: ...

Do you think we should keep both in?

I'm not sure the node name will always be derived from the file name, and I feel including the actual file name is an extra level of detail that could be useful. We might also find other "internal execution errors" that are based on other things - not necessarily related to the node/file name.

That said, everyone has opinions and mine aren't necessarily correct. 😄 I'm going to ask @ptitzler to review as well.

I agree. Unlike in earlier releases the node name and the file name can be entirely different.

Alright, that's helpful! I was wondering if there were cases where node name != file name. I'll add the file name to each of these errors, and remove the word "internal" since it's unnecessary.

kevin-bates · 2021-03-13T00:44:20Z

elyra/pipeline/processor_local.py

+            if error_trim_index != -1:
+                raise RuntimeError(error_msg[error_trim_index:]) from cpe
+            else:
+                raise RuntimeError('Internal error executing Python script') from cpe


We should include the actual script name.

Suggested change

-                raise RuntimeError('Internal error executing Python script') from cpe
+                raise RuntimeError(f'Internal error executing {file_name}') from cpe

kevin-bates · 2021-03-13T00:44:32Z

elyra/pipeline/processor_local.py

        except Exception as ex:
-            raise RuntimeError(f'Internal error executing {filepath}: {ex}') from ex
+            self.log.error(f'Internal error executing {file_name}: {str(ex)}')
+            raise RuntimeError('Internal error executing Python script') from ex


Ditto for script name.

ptitzler · 2021-03-15T17:06:28Z

General comment: there's probably no need to classify these as internal errors in the message text.

kevin-bates

These changes look good. Thanks for tackling a similar issue for the notebook node as well!

kiersten-stokes · 2021-03-15T22:40:47Z

Thanks, @kevin-bates! I'll just clean up those tests and hopefully we'll be all set!

kevin-bates · 2021-03-16T18:44:17Z

This looks awesome now - thank you @kiersten-stokes!

kevin-bates · 2021-03-17T18:22:01Z

Thanks @kiersten-stokes

Here's a python script error:

And a notebook error:

…1485) It introduces a new base class named ScriptOperationProcessor, from which the existing PythonScriptOperationProcessor and RScriptOperationProcessor classes now derive. This base class contains 90% of the applicable code, with the subclasses providing their name and argument vectors. It introduces a log_and_raise() method on the base FileOperationProcessor class that is available to all file-based operations. Building on the work done in #1411, this method checks the length of the error message and truncates it to around the max (80), replacing overflow with ellipses (...). Adds a test that removes the kernel metadata from a notebook node and ensures the appropriate error is raised.
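The truncation step described above can be illustrated with a short sketch. The exact cut-off rule is an assumption here (the description only says "around the max"), and `truncate_message` is a hypothetical name, not the log_and_raise() method itself.

```python
def truncate_message(message, max_len=80):
    """Shorten a message to at most max_len characters (hypothetical helper)."""
    if len(message) <= max_len:
        return message
    # Reserve three characters for the ellipsis that replaces the overflow.
    return message[:max_len - 3] + "..."
```

A caller would typically log the full message first and pass only the truncated form to the raised exception, so no detail is lost from the log.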

Kiersten Stokes added 2 commits March 10, 2021 17:26

Expose error due to failure of local python pipeline node execution

c729e91

Expose error due to failure of local python pipeline node execution

78e9457

kiersten-stokes added the component:pipeline-runtime label (issues related to pipeline runtimes, e.g. kubeflow pipelines) Mar 10, 2021

Kiersten Stokes added 3 commits March 10, 2021 18:41

Update tests to reflect change made to raised error message

4a7ad4e

Merge branch 'local-failure-error-handling' of https://github.com/kie…

fd4bf05

…rsten-stokes/elyra into local-failure-error-handling

Change run() arguments to work in python 3.6 envs

c1f485d

lresende requested review from kevin-bates and lresende March 11, 2021 03:43

kevin-bates reviewed Mar 12, 2021

elyra/pipeline/processor_local.py (outdated)

kiersten-stokes added the status:Work in Progress label (development in progress; a PR tagged with this label is not review ready unless stated otherwise) Mar 12, 2021

Kiersten Stokes added 2 commits March 12, 2021 16:17

Add logic to parse CalledProcessError for better display

015e47a

Add handling for notebook-specific execution errors

a523135

kevin-bates reviewed Mar 13, 2021

kevin-bates requested a review from ptitzler March 15, 2021 16:36

Kiersten Stokes added 2 commits March 15, 2021 16:40

Remove redundant/unnecessary wording in exception chain

efee92b

Change notebook execution error attribute from cell_index to exec_count

39ac7e1

kevin-bates approved these changes Mar 15, 2021

Update tests to reflect newest changes to error messages

621f20e

kiersten-stokes removed the status:Work in Progress label (development in progress; a PR tagged with this label is not review ready unless stated otherwise) Mar 16, 2021

akchinSTC added this to the 2.2.0 milestone Mar 16, 2021

Parse PapermillExecutionError to remove potential punctuation

d4d7fbf

Kiersten Stokes added 2 commits March 16, 2021 13:16

Change location of error-causing cell for clarity in message

689f9cc

Remove all added punctuation from error messages

52d326a

Change word 'node' back to 'operation' in error message

37879d2

lresende approved these changes Mar 18, 2021

lresende merged this pull request into elyra-ai:master Mar 18, 2021

lresende pushed a commit that referenced this pull request Mar 18, 2021

Expose error details on Python node local execution (#1411)

044463f

This was referenced Mar 25, 2021

Missing kernelspec on new notebooks created in VSCode #1337

Closed

Refactor script processors, include brief detail on generic errors #1485

Merged

kiersten-stokes deleted the local-failure-error-handling branch August 20, 2021 18:38
