Watcher - if Zeebe is emitting StatusCode.RESOURCE_EXHAUSTED for one task, but not the other, the worker will not be stopped #138

Closed
kbakk opened this issue Mar 12, 2021 · 0 comments · Fixed by #146
Labels
bug Something isn't working

Collaborator

kbakk commented Mar 12, 2021

Describe the bug
I have set up a worker that connects to Camunda Cloud (a development Zeebe cluster). For some reason, the broker responds with StatusCode.RESOURCE_EXHAUSTED for just one of the configured tasks (the other succeeds). In this case, the watcher does not detect that one of the tasks has a problem, and the worker stays in a state where it constantly restarts that task's thread.

To Reproduce
Steps to reproduce the behavior:
This is seen intermittently with Camunda Cloud; I have no way to reproduce it on demand at the moment.

Expected behavior
After a task has failed n times, the watcher should give up and shut down the worker.
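
A minimal sketch of the expected behavior, not pyzeebe's actual watcher: the class `RestartLimitWatcher` and its parameters `max_restarts` and `interval` are hypothetical names made up for illustration. It assumes the restart count is kept per task, so that a healthy task cannot hide a task that keeps dying.

```python
import threading
import time
from collections import defaultdict


class RestartLimitWatcher:
    """Hypothetical watcher: restarts dead task threads and gives up
    once a single task has been restarted max_restarts times."""

    def __init__(self, targets, max_restarts=3, interval=0.01):
        self.targets = targets              # task type -> callable run in a thread
        self.max_restarts = max_restarts    # the "n" from the expected behavior
        self.interval = interval            # seconds between status checks
        self.restarts = defaultdict(int)    # restart count, tracked per task
        self.threads = {}
        self.stop_event = threading.Event()
        self.failed_task = None

    def _start(self, task_type):
        thread = threading.Thread(
            target=self.targets[task_type],
            name=f"Task-{task_type}",
            daemon=True,
        )
        self.threads[task_type] = thread
        thread.start()

    def watch(self):
        for task_type in self.targets:
            self._start(task_type)
        while not self.stop_event.is_set():
            time.sleep(self.interval)
            # Iterate over a copy because _start() replaces dict entries.
            for task_type, thread in list(self.threads.items()):
                if thread.is_alive():
                    continue
                if self.restarts[task_type] >= self.max_restarts:
                    # One task exhausted its restarts: stop the whole worker.
                    self.failed_task = task_type
                    self.stop_event.set()
                    break
                self.restarts[task_type] += 1
                self._start(task_type)
```

With one task that keeps running and one whose thread exits immediately, `watch()` returns after the failing task has been restarted `max_restarts` times, and `failed_task` names it.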

Logs

2021-03-12 14:49:20,987 | DEBUG    |  ZeebeWorker-Watch | pyzeebe.worker.worker | Checking task thread status
2021-03-12 14:49:20,988 | WARNING  |  ZeebeWorker-Watch | pyzeebe.worker.worker | Task thread ffprofile.multiaudio_video_proxy_from_audio_files is not alive, restarting
2021-03-12 14:49:20,988 | DEBUG    |  ZeebeWorker-Watch | pyzeebe.worker.worker | Starting task thread for ffprofile.multiaudio_video_proxy_from_audio_files
2021-03-12 14:49:20,988 | DEBUG    |  ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files | pyzeebe.worker.worker | Handling task {'type': 'ffprofile.multiaudio_video_proxy_from_audio_files', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_paths', 'input_overlay_video_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'err']}
2021-03-12 14:49:20,988 | DEBUG    |  ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files | pyzeebe.worker.worker | Activating jobs for task: {'type': 'ffprofile.multiaudio_video_proxy_from_audio_files', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_paths', 'input_overlay_video_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'err']}
2021-03-12 14:49:25,059 | DEBUG    |  ZeebeWorker-Task-ffprofile.copy | pyzeebe.worker.worker | Activating jobs for task: {'type': 'ffprofile.copy', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'input_full_path', 'output_full_path']}
2021-03-12 14:49:30,996 | DEBUG    |  ZeebeWorker-Watch | pyzeebe.worker.worker | Checking task thread status
2021-03-12 14:49:31,053 | DEBUG    |  ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files | pyzeebe.worker.worker | Activating jobs for task: {'type': 'ffprofile.multiaudio_video_proxy_from_audio_files', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_paths', 'input_overlay_video_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'err']}
2021-03-12 14:49:35,107 | DEBUG    |  ZeebeWorker-Task-ffprofile.copy | pyzeebe.worker.worker | Activating jobs for task: {'type': 'ffprofile.copy', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'input_full_path', 'output_full_path']}
Exception in thread ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/grpc_internals/zeebe_job_adapter.py", line 20, in activate_jobs
    for response in self._gateway_stub.ActivateJobs(
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 416, in __next__
    return self._next()
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 803, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.RESOURCE_EXHAUSTED
    details = "Expected to activate jobs of type 'ffprofile.multiaudio_video_proxy_from_audio_files', but no jobs available and at least one broker returned 'RESOURCE_EXHAUSTED'. Please try again later."
    debug_error_string = "{"created":"@1615560578.126142761","description":"Error received from peer ipv4:34.77.154.112:443","file":"src/core/lib/surface/call.cc","file_line":1062,"grpc_message":"Expected to activate jobs of type 'ffprofile.multiaudio_video_proxy_from_audio_files', but no jobs available and at least one broker returned 'RESOURCE_EXHAUSTED'. Please try again later.","grpc_status":8}"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/worker/worker.py", line 167, in _handle_task
    self._handle_jobs(task)
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/worker/worker.py", line 171, in _handle_jobs
    for job in self._get_jobs(task):
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/grpc_internals/zeebe_job_adapter.py", line 32, in activate_jobs
    self._common_zeebe_grpc_errors(rpc_error)
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/grpc_internals/zeebe_adapter_base.py", line 84, in _common_zeebe_grpc_errors
    raise ZeebeBackPressure()
pyzeebe.exceptions.zeebe_exceptions.ZeebeBackPressure
2021-03-12 14:49:41,005 | DEBUG    |  ZeebeWorker-Watch | pyzeebe.worker.worker | Checking task thread status
2021-03-12 14:49:41,005 | WARNING  |  ZeebeWorker-Watch | pyzeebe.worker.worker | Task thread ffprofile.multiaudio_video_proxy_from_audio_files is not alive, restarting
2021-03-12 14:49:41,005 | DEBUG    |  ZeebeWorker-Watch | pyzeebe.worker.worker | Starting task thread for ffprofile.multiaudio_video_proxy_from_audio_files
2021-03-12 14:49:41,006 | DEBUG    |  ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files | pyzeebe.worker.worker | Handling task {'type': 'ffprofile.multiaudio_video_proxy_from_audio_files', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_paths', 'input_overlay_video_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'err']}
2021-03-12 14:49:41,006 | DEBUG    |  ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files | pyzeebe.worker.worker | Activating jobs for task: {'type': 'ffprofile.multiaudio_video_proxy_from_audio_files', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_paths', 'input_overlay_video_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'err']}
2021-03-12 14:49:45,146 | DEBUG    |  ZeebeWorker-Task-ffprofile.copy | pyzeebe.worker.worker | Activating jobs for task: {'type': 'ffprofile.copy', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'input_full_path', 'output_full_path']}
Exception in thread ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/grpc_internals/zeebe_job_adapter.py", line 20, in activate_jobs
    for response in self._gateway_stub.ActivateJobs(
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 416, in __next__
    return self._next()
  File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 803, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.RESOURCE_EXHAUSTED
    details = "Expected to activate jobs of type 'ffprofile.multiaudio_video_proxy_from_audio_files', but no jobs available and at least one broker returned 'RESOURCE_EXHAUSTED'. Please try again later."
    debug_error_string = "{"created":"@1615560588.156142878","description":"Error received from peer ipv4:34.77.154.112:443","file":"src/core/lib/surface/call.cc","file_line":1062,"grpc_message":"Expected to activate jobs of type 'ffprofile.multiaudio_video_proxy_from_audio_files', but no jobs available and at least one broker returned 'RESOURCE_EXHAUSTED'. Please try again later.","grpc_status":8}"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/worker/worker.py", line 167, in _handle_task
    self._handle_jobs(task)
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/worker/worker.py", line 171, in _handle_jobs
    for job in self._get_jobs(task):
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/grpc_internals/zeebe_job_adapter.py", line 32, in activate_jobs
    self._common_zeebe_grpc_errors(rpc_error)
  File "/usr/local/lib/python3.8/site-packages/pyzeebe/grpc_internals/zeebe_adapter_base.py", line 84, in _common_zeebe_grpc_errors
    raise ZeebeBackPressure()
pyzeebe.exceptions.zeebe_exceptions.ZeebeBackPressure
2021-03-12 14:49:51,016 | DEBUG    |  ZeebeWorker-Watch | pyzeebe.worker.worker | Checking task thread status
2021-03-12 14:49:51,016 | WARNING  |  ZeebeWorker-Watch | pyzeebe.worker.worker | Task thread ffprofile.multiaudio_video_proxy_from_audio_files is not alive, restarting
2021-03-12 14:49:51,016 | DEBUG    |  ZeebeWorker-Watch | pyzeebe.worker.worker | Starting task thread for ffprofile.multiaudio_video_proxy_from_audio_files
2021-03-12 14:49:51,017 | DEBUG    |  ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files | pyzeebe.worker.worker | Handling task {'type': 'ffprofile.multiaudio_video_proxy_from_audio_files', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_paths', 'input_overlay_video_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'err']}
2021-03-12 14:49:51,017 | DEBUG    |  ZeebeWorker-Task-ffprofile.multiaudio_video_proxy_from_audio_files | pyzeebe.worker.worker | Activating jobs for task: {'type': 'ffprofile.multiaudio_video_proxy_from_audio_files', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_paths', 'input_overlay_video_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'err']}
2021-03-12 14:49:55,217 | DEBUG    |  ZeebeWorker-Task-ffprofile.copy | pyzeebe.worker.worker | Activating jobs for task: {'type': 'ffprofile.copy', 'timeout': 10000, 'max_jobs_to_activate': 32, 'variables_to_fetch': ['input_file_path', 'output_file_path', 'input_path_prefix', 'output_path_prefix', 'output_suffix', 'ffmpeg_command', 'input_full_path', 'output_full_path']}
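
In the tracebacks above, ZeebeBackPressure escapes the task thread, which is why the thread dies and the watcher restarts it. A complementary option, separate from the watcher limit requested in this issue, would be to retry with a delay inside the task thread, since the broker message says "Please try again later." The sketch below is an assumption, not pyzeebe code: `BackPressureError` is a hypothetical stand-in for the library's exception, and `activate_with_backoff` and its parameters are made-up names.

```python
import time


class BackPressureError(Exception):
    """Hypothetical stand-in for pyzeebe's ZeebeBackPressure exception."""


def activate_with_backoff(activate, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call activate(); on back pressure, wait with exponential backoff and retry.

    The last BackPressureError is re-raised once max_attempts calls have failed,
    so a caller (for example a watcher) can still decide to shut down.
    """
    for attempt in range(max_attempts):
        try:
            return activate()
        except BackPressureError:
            if attempt == max_attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable only so the delay schedule can be checked without waiting; in real use the default `time.sleep` would apply.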

Version
Python: 3.8
Pyzeebe: https://github.com/JonatanMartens/pyzeebe/releases/tag/v2.3.0
Server: Docker python:3.8-slim-buster
