Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipeline execution hides actual issue when local failed execution #1382

Closed
lresende opened this issue Mar 8, 2021 · 1 comment · Fixed by #1411
Closed

Pipeline execution hides actual issue when local failed execution #1382

lresende opened this issue Mar 8, 2021 · 1 comment · Fixed by #1411
Assignees
Labels
component:pipeline-runtime issues related to pipeline runtimes e.g. kubeflow pipelines
Milestone

Comments

@lresende
Copy link
Member

lresende commented Mar 8, 2021

Describe the issue
When executing a pipeline locally, and there is a failure in one of the nodes, the actual failure is not exposed trough the dialogbox error details.

To Reproduce
Steps to reproduce the behavior:

  1. Create a new pipeline
  2. Include a new node with the python script below
  3. Add an env variable on the node properties FACTORIAL_NUMBER=10
  4. Execute the pipeline
import os

# change the value for a different result
num =os.getenv('FACTORIAL_NUMBER', 7)

factorial = 1

# check if the number is negative, positive or zero
if num < 0:
    raise RuntimeError('Sorry, factorial does not exist for negative numbers')
elif num == 0:
    print('The factorial of 0 is 1')
else:
    for i in range(1, num + 1):
        factorial = factorial * i
    print(f'The factorial of {num} is {factorial}')

Screenshots or log output

image

Log Output
Traceback (most recent call last):
  File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor_local.py", line 224, in process
    run(argv, cwd=file_dir, env=envs, check=True)
  File "/home/lresende/opt/conda/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['python3', '/home/lresende/opensource/jupyter-notebooks/pipelines/factorial.py', '--PYTHONHOME', '/home/lresende/opensource/jupyter-notebooks/pipelines']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor_local.py", line 74, in process
operation_processor.process(operation)
File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor_local.py", line 226, in process
raise RuntimeError(f'Internal error executing {filepath}: {ex}') from ex
RuntimeError: Internal error executing /home/lresende/opensource/jupyter-notebooks/pipelines/factorial.py: Command '['python3', '/home/lresende/opensource/jupyter-notebooks/pipelines/factorial.py', '--PYTHONHOME', '/home/lresende/opensource/jupyter-notebooks/pipelines']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/lresende/opt/conda/lib/python3.8/site-packages/tornado/web.py", line 1704, in _execute
result = await result
File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/handlers.py", line 89, in post
response = await PipelineProcessorManager.instance().process(pipeline)
File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor.py", line 75, in process
res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
File "/home/lresende/opt/conda/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor_local.py", line 79, in process
raise RuntimeError(f'Error processing operation {operation.name}.') from ex
RuntimeError: Error processing operation factorial.

Expected behavior
When looking at the server logs, there is actually more detailed information regarding the failure, which should be the actual best information to show the user.

Processing python script: /Users/lresende/opensource/jupyter-notebooks/elyra-pipelines/factorial.py
|Traceback (most recent call last):
  File "/Users/lresende/opensource/jupyter-notebooks/elyra-pipelines/factorial.py", line 9, in <module>
    if num < 0:
TypeError: '<' not supported between instances of 'str' and 'int'
Server Log Output
[I 2021-03-08 13:40:51.927 ServerApp] local 'helloworld' - processing pipeline 
[D 2021-03-08 13:40:51.927 ServerApp] Processing python script: /home/lresende/opensource/jupyter-notebooks/pipelines/factorial.py
Traceback (most recent call last):
  File "/home/lresende/opensource/jupyter-notebooks/pipelines/factorial.py", line 9, in 
    if num < 0:
TypeError: '<' not supported between instances of 'str' and 'int'
[D 2021-03-08 13:40:51.975 ServerApp] 200 GET /api/metrics/v1?1615239651748 (9.211.65.219) 8.40ms
[E 2021-03-08 13:40:51.976 ServerApp] Uncaught exception POST /elyra/pipeline/schedule?1615239651735 (9.211.65.219)
    HTTPServerRequest(protocol='http', host='lresende-dev-node.fyre.ibm.com:8888', method='POST', uri='/elyra/pipeline/schedule?1615239651735', version='HTTP/1.1', remote_ip='9.211.65.219')
    Traceback (most recent call last):
      File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor_local.py", line 224, in process
        run(argv, cwd=file_dir, env=envs, check=True)
      File "/home/lresende/opt/conda/lib/python3.8/subprocess.py", line 512, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command '['python3', '/home/lresende/opensource/jupyter-notebooks/pipelines/factorial.py', '--PYTHONHOME', '/home/lresende/opensource/jupyter-notebooks/pipelines']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor_local.py", line 74, in process
    operation_processor.process(operation)
  File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor_local.py", line 226, in process
    raise RuntimeError(f'Internal error executing {filepath}: {ex}') from ex
RuntimeError: Internal error executing /home/lresende/opensource/jupyter-notebooks/pipelines/factorial.py: Command '['python3', '/home/lresende/opensource/jupyter-notebooks/pipelines/factorial.py', '--PYTHONHOME', '/home/lresende/opensource/jupyter-notebooks/pipelines']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lresende/opt/conda/lib/python3.8/site-packages/tornado/web.py", line 1704, in _execute
    result = await result
  File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/handlers.py", line 89, in post
    response = await PipelineProcessorManager.instance().process(pipeline)
  File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor.py", line 75, in process
    res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
  File "/home/lresende/opt/conda/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/lresende/opt/conda/lib/python3.8/site-packages/elyra/pipeline/processor_local.py", line 79, in process
    raise RuntimeError(f'Error processing operation {operation.name}.') from ex
RuntimeError: Error processing operation factorial.

[E 2021-03-08 13:40:51.977 ServerApp] {
"Host": "lresende-dev-node.fyre.ibm.com:8888",
"Connection": "keep-alive",
"Content-Length": "1961",
"X-Xsrftoken": "2|89b27294|9d5ac43878cd20950833bcc577ea2153|1615224692",
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36",
"Content-Type": "text/plain;charset=UTF-8",
"Accept": "/",
"Origin": "http://lresende-dev-node.fyre.ibm.com:8888",
"Referer": "http://lresende-dev-node.fyre.ibm.com:8888/lab/tree/pipelines/helloworld.pipeline",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "en-US,en;q=0.9,pt-BR;q=0.8,pt;q=0.7",
"Cookie": "cmTPSet=Y; _cc_id=61f53dd10d06ed74b6ac17a90fb122e0; _ga=GA1.2.1614255526.1614623251; _hjTLDTest=1; _hjid=96e8c559-ed72-4da0-a5dd-ee40481be517; CoreM_State=44~-1~-1~-1~-13353377||||||||||||||||||; CoreM_State_Content=6|~~|~|; ajs_anonymous_id=%221af1a763-15d2-45d5-a310-1fa134bf6b44%22; IBM_W3SSO_ACCESS=; ajs_user_id=%22IBMid-060001CBRY%22; cd_user_id=177fefef26faa2-0d88ac0449820c-121a4759-4b9600-177fefef270dc1; OPTOUTMULTI=0:0%7Cc1:1%7Cc2:0%7Cc3:0; CoreID6=22398414543616146231263&ci=51040000|IBMTESTW3_50200000|DWNEXT_51040000|NEWBLUEPAGES_50200000|Watson_50200000|IBMTESTWWW; utag_main=v_id:0177ef0b9b9300129c5be62676c603078002807001328$_sn:9$_se:1$_ss:1$_st:1614957378970$is_country_requiring_explicit_consent:false$dc_visit:6$ses_id:1614955578970%3Bexp-session$_pn:1%3Bexp-session; 50200000_clogin=l=71332101614955578995&v=33&e=1614957378995; _xsrf=2|89b27294|9d5ac43878cd20950833bcc577ea2153|1615224692; 51040000_clogin=v=1&l=58694521615227181595&e=1615228995595; username-lresende-dev-node-fyre-ibm-com-8888="2|1:0|10:1615233418|44:username-lresende-dev-node-fyre-ibm-com-8888|44:Mzk5ZDVjNTEzZDZmNDU4YTlhNjJlNWYyNDVkMjA0NTE=|c89b0f3d11213ffe6de66b7527feded04781b4ff048901b94e992b825a938393""
}
[E 2021-03-08 13:40:51.977 ServerApp] 500 POST /elyra/pipeline/schedule?1615239651735 (9.211.65.219) 54.00ms referer=http://lresende-dev-node.fyre.ibm.com:8888/lab/tree/pipelines/helloworld.pipeline

Deployment information
Describe what you've deployed and how:

  • Elyra version: [e.g. 2.1.rc2]
  • Operating system: [e.g. macos and linux]
  • Installation source: [e.g. PyPI]
  • Deployment type: [e.g. local installation]
@kevin-bates
Copy link
Member

FYI: The example from @lresende is not failing for me. I think num =os.getenv('FACTORIAL_NUMBER', 7) needs to list 7 as a string '7' otherwise no error is produced. I suspect the single-quotes were stripped when creating the issue. Here's an update that produces the referenced error message.

import os

# change the value for a different result
num =os.getenv('FACTORIAL_NUMBER', '7')

factorial = 1

# check if the number is negative, positive or zero
if num < 0:
    raise RuntimeError('Sorry, factorial does not exist for negative numbers')
elif num == 0:
    print('The factorial of 0 is 1')
else:
    for i in range(1, num + 1):
        factorial = factorial * i
    print(f'The factorial of {num} is {factorial}')

@lresende lresende added this to the 2.2.0 milestone Mar 13, 2021
@kiersten-stokes kiersten-stokes self-assigned this Mar 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:pipeline-runtime issues related to pipeline runtimes e.g. kubeflow pipelines
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants