Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unhandled exceptions occurring inside function passed to executor.map are swallowed #776

Open
naglis opened this issue Apr 13, 2024 · 7 comments
Labels
good first issue Good for newcomers

Comments

@naglis
Copy link
Contributor

naglis commented Apr 13, 2024

Currently, if an unhandled exception occurs inside convert_doc when running dangerzone-cli, the exception is swallowed (traceback is not logged, the exit code is not changed).

This manifests in tests with dummy conversion - e.g. if shutil.copy raises an exception (e.g. src and dst being the same path), the CLIResult is still reported as a success due to the swallowed exception. Here is a demo test case.

IIUC during actual (not dummy) execution it would hide potential exceptions occurring here or here.

I believe this is because the generator returned by executor.map is not consumed - from ThreadPoolExecutor docs:

If a fn call raises an exception, then that exception will be raised when its value is retrieved from the iterator.

Here is a simplified form showing the issue:

>>> from concurrent.futures import ThreadPoolExecutor
>>>
>>> def do_work(i):
...     print(f"do_work({i})")
...     1 / 0
...
>>> with ThreadPoolExecutor(max_workers=4) as executor:
...     _ = executor.map(do_work, range(4))
...
do_work(0)
do_work(1)
do_work(2)
do_work(3)
>>>
@apyrgio
Copy link
Contributor

apyrgio commented Apr 15, 2024

Thanks a lot for bringing this issue to our attention, and for writing a test case.

I agree that the generator result should be consumed in principle, but we catch and log all exceptions (as you've already pointed out) here. I have to dig deeper, but my bet would be that there's a different underlying cause :-/.

@naglis
Copy link
Contributor Author

naglis commented Apr 15, 2024

but we catch and log all exceptions (as you've already pointed out) here

FWIW, I've ran into this in CLI tests when dummy conversion is used. The Dummy isolation provider overrides the convert method and does not have any exception handling.

I do not have a test case when using e.g. the container isolation provider (which IIUC uses the convert method defined on the base isolation provider), but the two sections (here and here (come to think of it, it also applies to the except errors.ConverterProcException and except errors.ConversionException blocks as well)) I've mentioned could hide exceptions if an exception were to occur inside them, since those sections are not inside a try/except (this part is inside the except Exception block, but any potential exception coming from inside the except block is not handled (there is no exception handling in convert_doc) and would be swallowed by exception.map).

To simulate a swallowed exception when using the container provider, we would have to raise it explicitly. E.g. try adding 1 / 0 here (before the try block). Running dev_scripts/dangerzone-cli /tmp/sample.pdf then results in:

╭──────────────────────────╮
│           ▄██▄           │
│          ██████          │
│         ███▀▀▀██         │
│        ███   ████        │
│       ███   ██████       │
│      ███   ▀▀▀▀████      │
│     ███████  ▄██████     │
│    ███████ ▄█████████    │
│   ████████████████████   │
│    ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀    │
│                          │
│    Dangerzone v0.6.0     │
│ https://dangerzone.rocks │
╰──────────────────────────╯
[INFO ] Assigning ID '0sHZEe' to doc '/tmp/sample.pdf'

Converting document to safe PDF
[DEBUG] Marking doc 0sHZEe as 'converting'

@apyrgio
Copy link
Contributor

apyrgio commented Apr 18, 2024

Got it. So we practically need a try: ... except Exception: log.exception at the top level, that would log any stray exceptions from the exception blocks you pointed out.

@naglis
Copy link
Contributor Author

naglis commented Apr 24, 2024

Indeed. IIUC this wrapping handler should also mark the document as failed upon exception (if it was not done already) to correctly report it at the end.

@apyrgio apyrgio added the good first issue Good for newcomers label Apr 30, 2024
@amnak613
Copy link

amnak613 commented May 7, 2024

I'm interested in contributing to this

@apyrgio
Copy link
Contributor

apyrgio commented May 8, 2024

Cool! Hopefully this issue provides enough context. In a nutshell, we want to:

  1. Catch and report all stray conversion exceptions from IsolationProvider.convert.
  2. Mark the related document as failed.

Happy to answer any question you may have 🙂

@amnak613
Copy link

Setting up the environment might take a while because I'm busy right now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants