Store task exceptions #68
This is less of a foot-gun than receiving a different exception than expected.
While storing the entire stack with full objects is not necessary, having at least the textual representation of the traceback would be fantastic for debugging. Most of the time, you can't look at an exception message and guess what went wrong. Debugging async tasks is very hard, so having a clue with file paths and line numbers would be super helpful. Even if you have logging set up well (which a lot of people don't), at some point people will build Django admin plugins to monitor tasks. Being able to click on a task there and see a string representation of the traceback would be huge. |
The full exception is logged, so for debugging the traceback is still accessible. From a code perspective, it should only be the exception instance itself which is needed (at least in most cases). |
True, but again:
Since I understand you might start to feel like I talk a lot while you do all the hard work, maybe I could contribute a PR? If you are open to the concept. |
I think this PR is a good start, but I'd definitely welcome some improvements to it to include a traceback! I'd rather avoid external libraries or too much added complexity if at all possible.
The logs show the job ID when showing the exception. The Job's ID is logged before execution, and on success/failure, so it shouldn't be too difficult to correlate.
That's one of the use cases I have in mind. The exception instance and its arguments might be enough for simple jobs, although more complex ones might benefit from at least showing the text representation of the call stack (which I don't think you could add back into an exception very easily, but I'm happy to be proved wrong!). |
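To illustrate the point about the text representation of the call stack, here is a minimal sketch using only the standard library. It assumes pickle as a stand-in for whatever serialization the backend uses (the PR itself uses a JSON-compatible format), and `fail` is a hypothetical task body. It shows that the traceback does not survive a round trip, while the formatted text can be captured beforehand.

```python
import pickle
import traceback


def fail():
    # Hypothetical failing task body.
    raise ValueError("Example exception")


try:
    fail()
except ValueError as exc:
    # Capture the text form while the traceback is still attached.
    tb_text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    # Round-trip the exception the way a result store might (assumption).
    restored = pickle.loads(pickle.dumps(exc))

# The restored exception keeps its type and arguments, not its traceback.
print(type(restored).__name__, restored.args, restored.__traceback__)
print(tb_text)
```

The string in `tb_text` is plain text, so it can be stored next to the exception data without any special handling.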
The traceback is gathered from the context and not the exception itself (sys.exc_info() is even thread-specific), so you are right, we can't add it back to the exception. Making it appear in a natural way would mean monkey-patching the whole system, which wouldn't fit the goal of avoiding too much added complexity. I can get the traceback lines from
Which would you prefer I start working on? I like the last one: it avoids cluttering the TaskResult namespace, so we can extend the mechanism with more info on the TaskError later on if necessary. It lets you catch any TaskError, or the original error if you want because of task chaining. I can even provide a signature such as But I'll work on your selection. |
Having the re-serialized exception be different to how it was before sounds like a real foot-gun. Unless we can replicate it entirely, I don't think it's worth doing. Adding it to the bare exception dict would be nice, but there's no real interface for it. I guess we could add a |
Ok. I'll look into that. |
While working on the So I have an additional, maybe more natural proposal. There is a bit of black magic to it, but we can limit it because we only support Python 3.8+. Here is what it looks like.

Serializing part

We need one function to unravel the entire stack trace (otherwise, at the exception handling site, since we don't reach the top of the program, we only get the current frame):

    import sys
    import types

    def get_traceback():
        tb = None
        # Depth 0 is this function and depth 1 is its caller, whose frame
        # is already covered by the exception's own __traceback__.
        depth = 2
        while True:
            try:
                frame = sys._getframe(depth)
                depth += 1
            except ValueError:
                # Raised once depth goes past the outermost frame.
                break
            tb = types.TracebackType(tb, frame, frame.f_lasti, frame.f_lineno)
        return tb

Then, we dump it into a dict, frame by frame:

    def tb_to_dict(tb):
        tb_list = []
        while tb:
            frame = tb.tb_frame
            code = frame.f_code
            tb_list.append(
                {
                    "filename": code.co_filename,
                    "name": code.co_name,
                    "lineno": tb.tb_lineno,
                    "globals": {
                        k: str(v)
                        for k, v in frame.f_globals.items()
                        if k in ("__file__", "__name__")
                    },
                    "locals": {k: str(v) for k, v in frame.f_locals.items()},
                    "code_info": {
                        "co_name": code.co_name,
                        "co_filename": code.co_filename,
                    },
                }
            )
            tb = tb.tb_next
        return tb_list

Globals and locals are

Deserialization part

Black magic is needed because Python won't let us create a Frame instance, so we create and compile a code object that raises an exception, then execute it.

    import sys

    def rebuild_traceback(serialized_traceback):
        top_tb = None
        tb = None
        for frame in serialized_traceback:
            code_info = frame["code_info"]
            # Pad with newlines so the raise lands on the recorded line number.
            code = compile(
                "\n" * (frame["lineno"] - 1) + "raise __traceback_maker",
                frame["filename"],
                "exec",
            )
            # For our purpose, we can make it as minimal as we need
            code = code.replace(
                co_argcount=0,
                co_filename=code_info["co_filename"],
                co_name=code_info["co_name"],
                co_freevars=(),
                co_cellvars=(),
            )
            try:
                exec(code, frame["globals"], frame["locals"])
            except Exception:
                # Skip the entry for this function; keep the one for the fake frame.
                next_tb = sys.exc_info()[2].tb_next
                if top_tb is None:
                    top_tb = next_tb
                if tb is not None:
                    tb.tb_next = next_tb
                tb = next_tb
                del next_tb
        return top_tb

What it looks like in action

source_of_error.py:

    import json
    import pickle
    from pathlib import Path
    import pprint
    ...

    def fail():
        raise ValueError("Example exception")

    def one():
        try:
            fail()
        except Exception as e:
            tb_dump = tb_to_dict(get_traceback()) + tb_to_dict(e.__traceback__)
            pprint.pprint(tb_dump)
            Path("/tmp/tb").write_text(json.dumps(tb_dump))
            # This simulates your exception serialization without recoding it
            Path("/tmp/ex").write_bytes(pickle.dumps(e))

    def two():
        one()

    def three():
        two()

    three()

This saves the stack trace to a file, and prints the dump, which looks like this:

    [{'code_info': {'co_filename': '/home/user/Bureau/source_of_error.py',
                    'co_name': '<module>'},
      'filename': '/home/user/Bureau/source_of_error.py',
      'globals': {'__file__': '/home/user/Bureau/source_of_error.py', '__name__': '__main__'},
      'lineno': 72,
      'locals': {'Path': "<class 'pathlib.Path'>",
                 '__annotations__': '{}',
                 '__builtins__': "<module 'builtins' (built-in)>",
                 '__cached__': 'None',
                 '__doc__': 'None',
                 '__file__': '/home/user/Bureau/source_of_error.py',
                 '__loader__': '<_frozen_importlib_external.SourceFileLoader '
                               'object at 0x79f230121ae0>',
                 '__name__': '__main__',
                 '__package__': 'None',
                 '__spec__': 'None',
                 'get_traceback': '<function get_traceback at 0x79f2300ab880>',
                 'json': "<module 'json' from "
                         "'/usr/lib/python3.10/json/__init__.py'>",
                 'one': '<function one at 0x79f22ffa9a20>',
                 'pickle': "<module 'pickle' from '/usr/lib/python3.10/pickle.py'>",
                 'pprint': "<module 'pprint' from '/usr/lib/python3.10/pprint.py'>",
                 'sys': "<module 'sys' (built-in)>",
                 'tb_to_dict': '<function tb_to_dict at 0x79f23018f5b0>',
                 'three': '<function three at 0x79f22ffa9b40>',
                 'two': '<function two at 0x79f22ffa9ab0>',
                 'types': "<module 'types' from '/usr/lib/python3.10/types.py'>"},
      'name': '<module>'},
     {'code_info': {'co_filename': '/home/user/Bureau/source_of_error.py', 'co_name': 'three'},
      'filename': '/home/user/Bureau/source_of_error.py',
      'globals': {'__file__': '/home/user/Bureau/source_of_error.py', '__name__': '__main__'},
      'lineno': 69,
      'locals': {},
      'name': 'three'},
     {'code_info': {'co_filename': '/home/user/Bureau/source_of_error.py', 'co_name': 'two'},
      'filename': '/home/user/Bureau/source_of_error.py',
      'globals': {'__file__': '/home/user/Bureau/source_of_error.py', '__name__': '__main__'},
      'lineno': 65,
      'locals': {},
      'name': 'two'}]

In a different file, we can rebuild the exception:

    import json
    import pickle
    import traceback
    from pathlib import Path

    a = pickle.loads(Path("/tmp/ex").read_bytes())
    stack_trace = json.loads(Path("/tmp/tb").read_text())
    a.__traceback__ = rebuild_traceback(stack_trace)
    traceback.print_exception(type(a), a, a.__traceback__)

Which prints the content as if it were from the original module:

    Traceback (most recent call last):
      File "/home/user/Bureau/foo.py", line 73, in <module>
        three()
      File "/home/user/Bureau/foo.py", line 70, in three
        two()
      File "/home/user/Bureau/foo.py", line 66, in two
        one()
      File "/home/user/Bureau/foo.py", line 56, in one
        fail()
      File "/home/user/Bureau/foo.py", line 51, in fail
        raise ValueError("Example exception")
    ValueError: Example exception

Pros and cons

Pros
Cons
There are some things to iron out but it's promising. Should I try this out, or continue on the TaskResult.traceback attribute solution? |
Personally, I think the need for |
Fair enough. I'll continue in that direction. |
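For reference, a rough sketch of what a traceback attribute on a result object could look like. Everything here is hypothetical: `SketchTaskResult`, its fields, and `run_and_record` are illustrative names, not the library's API, and the only real calls are from the standard library.

```python
import traceback
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class SketchTaskResult:
    """Hypothetical stand-in for a task result; not the library's class."""

    result: Any = None
    exception_class_path: Optional[str] = None
    traceback: Optional[str] = None


def run_and_record(func: Callable[[], Any]) -> SketchTaskResult:
    """Run func and keep either its return value or the failure details."""
    try:
        return SketchTaskResult(result=func())
    except Exception as exc:
        cls = type(exc)
        return SketchTaskResult(
            exception_class_path=f"{cls.__module__}.{cls.__qualname__}",
            # format_exc() returns the active traceback as plain text.
            traceback=traceback.format_exc(),
        )


def divide_by_zero():
    return 1 / 0


failed = run_and_record(divide_by_zero)
ok = run_and_record(lambda: 42)
print(failed.exception_class_path)
print(failed.traceback)
```

Keeping the traceback as a plain string means it stays JSON-compatible and never has to be reattached to the exception.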
Does complex_exception need a tweak?

    def test_complex_exception(self) -> None:
        result = default_task_backend.enqueue(test_tasks.complex_exception, [], {})
        ...
        # Is this meant as "self.assertIsInstance"?
        self.assertIsNone(result.result, TypeError)

Currently, this passes because result.result is actually If this is expected I can:
|
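As background for the question about the test: the second positional argument of unittest's `assertIsNone` is the failure message, not a type to check against, so passing `TypeError` there is never inspected when the value is None. A small self-contained sketch (the test class name is made up for the example):

```python
import unittest


class AssertIsNoneDemo(unittest.TestCase):
    def test_second_argument_is_only_a_message(self):
        # Passes: the value is None, and TypeError is never inspected.
        self.assertIsNone(None, TypeError)

    def test_message_is_used_on_failure(self):
        # The second argument only shows up in the failure text.
        with self.assertRaises(AssertionError) as ctx:
            self.assertIsNone("not none", "custom message")
        self.assertIn("custom message", str(ctx.exception))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertIsNoneDemo)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

If the intent is to check the type of a stored value, `assertIsInstance(value, TypeError)` is the assertion that does so.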
Closes #63
Tasks can now store the exception they raised, if they raised one.
The serialization process isn't perfect, since it must be JSON-compatible. Some information (such as the traceback) is lost as a result. I suspect that in most cases an instance check or extracting the message is all that's needed, and both are possible here. The format is internal, so it can be expanded in future if needed.
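To make the trade-off concrete, here is a minimal sketch of a JSON-compatible round trip. The dict layout (`exc_type`, `exc_args`) and both helper names are assumptions made for the example, not the internal format this PR uses, and the sketch only handles exception classes defined at module top level.

```python
import importlib
import json

JSON_SCALARS = (str, int, float, bool, type(None))


def exception_to_dict(exc: BaseException) -> dict:
    """Reduce an exception to JSON-compatible data (hypothetical layout)."""
    cls = type(exc)
    return {
        "exc_type": f"{cls.__module__}.{cls.__qualname__}",
        # Arguments that are not JSON scalars are replaced by their repr.
        "exc_args": [
            arg if isinstance(arg, JSON_SCALARS) else repr(arg)
            for arg in exc.args
        ],
    }


def exception_from_dict(data: dict) -> BaseException:
    """Rebuild an exception instance; the traceback is not restored."""
    module_name, _, class_name = data["exc_type"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    if not (isinstance(cls, type) and issubclass(cls, BaseException)):
        raise TypeError(f"{data['exc_type']} is not an exception class")
    return cls(*data["exc_args"])


payload = json.dumps(exception_to_dict(ValueError("Example exception")))
restored = exception_from_dict(json.loads(payload))
print(isinstance(restored, ValueError), str(restored))
```

The instance check and the message both survive the round trip, while `restored.__traceback__` stays empty, which matches the limitation described above.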