Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No exception on job timeout #247

Closed
kiriusm2 opened this issue May 11, 2021 · 11 comments
Closed

No exception on job timeout #247

kiriusm2 opened this issue May 11, 2021 · 11 comments

Comments

@kiriusm2
Copy link
Contributor

kiriusm2 commented May 11, 2021

In v0.20 when job is timed out asyncio.TimeoutError is not raised. Instead there is log message: job_name cancelled, will be run again and job is restarted immediately.
Is it intentional? Having some exception on timeout and allow app to decide how to handle it seems more appropriate to me.

@samuelcolvin
Copy link
Member

I think asyncio.TimeoutError is raised, it's just caught and the job is re-run.

If you want to do something different, you might be able to catch it inside the job?

@kiriusm2
Copy link
Contributor Author

Inside the job there is asyncio.CancelledError on timeout now.
I guess it's because in #212
async with async_timeout.timeout(timeout_s):
was replaced with
cancel_handler = self.loop.call_at(self.loop.time() + timeout_s, task.cancel)

@samuelcolvin
Copy link
Member

Humm, the the logic I implemented was copied from async-timeout, so I would have expected the behaviour to be the same.

Guess I'm wrong...

@kiriusm2
Copy link
Contributor Author

https://github.com/aio-libs/async-timeout/blob/master/async_timeout/__init__.py#L191-L194
async-timeout checks exception type in __exit__
and raises asyncio.TimeoutError if it's asyncio.CancelledError and task was cancelled by async-timeout.
I can't find similar logic in arq

@samuelcolvin
Copy link
Member

Oh, missed that. Happy to accept a PR if we can fix it.

@kiriusm2
Copy link
Contributor Author

Is it ok to just return async-timeout back or there is some deep reason behind removing it?

@samuelcolvin
Copy link
Member

yes, we need access to the task to add it to self.job_tasks

@samuelcolvin
Copy link
Member

For stuff like this, it's worth reading the code rather than asking questions, it's relatively clear in the code:

https://github.com/samuelcolvin/arq/blob/8ee04f96934684c9b8f8278df0530ae9fad1be4a/arq/worker.py#L520-L535

@kiriusm2
Copy link
Contributor Author

Actually inside a job it's always been asyncio.CancelledError on timeout. The difference only in retries. In 0.20 job is retried and previously exception was just logged.

@samuelcolvin
Copy link
Member

Humm, I think we should revert this to just log the error, you can always catch the CancelledError and raise a retry exception.

Would you create a PR?

kiriusm2 pushed a commit to kiriusm2/arq that referenced this issue May 16, 2021
samuelcolvin pushed a commit that referenced this issue Sep 2, 2021
* Fix jobs timeout #247

* Bump codecov-action to v1.5.2

* Fix test

Co-authored-by: Kirill Matveev <kirill.matveev@akvelon.com>
isobelhooper pushed a commit to CQCL-DEV/arq that referenced this issue May 4, 2022
* Fix jobs timeout python-arq#247

* Bump codecov-action to v1.5.2

* Fix test

Co-authored-by: Kirill Matveev <kirill.matveev@akvelon.com>
@JonasKs
Copy link
Collaborator

JonasKs commented Apr 4, 2023

This seems to have been resolved.

@JonasKs JonasKs closed this as completed Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants