Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate frame in traceback of exception raised inside trace function #102818

Closed
chgnrdv opened this issue Mar 18, 2023 · 8 comments
Closed

Duplicate frame in traceback of exception raised inside trace function #102818

chgnrdv opened this issue Mar 18, 2023 · 8 comments
Assignees
Labels
type-bug An unexpected behavior, bug, or error

Comments

@chgnrdv
Copy link
Contributor

chgnrdv commented Mar 18, 2023

First appeared in e028ae9.
Reproducer:

import sys

def f():
    pass

def trace(frame, event, arg):
    raise ValueError()

sys.settrace(trace)
f()

Before 'bad' commit (3e43fac):

Traceback (most recent call last):
  File "/home/.../trace_tb_bug.py", line 10, in <module>
    f()
    ^^^
  File "/home/.../trace_tb_bug.py", line 3, in f
    def f():
  File "/home/.../trace_tb_bug.py", line 7, in trace
    raise ValueError()
    ^^^^^^^^^^^^^^^^^^
ValueError

After 'bad' commit (e028ae9):

Traceback (most recent call last):
  File "/home/.../trace_tb_bug.py", line 10, in <module>
    f()
    ^^^
  File "/home/.../trace_tb_bug.py", line 3, in f
    def f():
    
  File "/home/.../trace_tb_bug.py", line 3, in f
    def f():
    
  File "/home/.../trace_tb_bug.py", line 7, in trace
    raise ValueError()
    ^^^^^^^^^^^^^^^^^^
ValueError

3.11.0 release and main (039714d) also lack pointers to error locations, but this probably needs a different issue:

Traceback (most recent call last):
  File "/home/.../trace_tb_bug.py", line 10, in <module>
    f()
  File "/home/.../trace_tb_bug.py", line 3, in f
    def f():
    
  File "/home/.../trace_tb_bug.py", line 3, in f
    def f():
    
  File "/home/.../trace_tb_bug.py", line 7, in trace
    raise ValueError()
ValueError

Linked PRs

@chgnrdv chgnrdv added the type-bug An unexpected behavior, bug, or error label Mar 18, 2023
@Eclips4
Copy link
Member

Eclips4 commented Mar 18, 2023

cc @markshannon

@chgnrdv chgnrdv changed the title Duplicate frame in traceback if exception is raised inside trace function Duplicate frame in traceback of exception raised inside trace function Mar 18, 2023
@gaogaotiantian
Copy link
Member

This actually will reproduce before the commit given. The reason the example code was working was because the trace function happens to raise an exception on "call" event, which happened in start_frame label and is directed to exit_unwind. If you change the code of trace to:

def trace(frame, event, arg):
    if event == "line":
        raise ValueError()
    return trace

It will reproduce this issue even on e028ae9. The fundamental issue here is - who is responsible to set the traceback when the trace function raises an exception.

Currently, call_trampoline has the code to set traceback, but on the frame the trace function is being called. In the meantime, error label in _PyEval_EvalFrameDefault also has the code to set the traceback. If both code executes, two identical entries will be added to the traceback.

In this specific case, TRACE_FUNCTION_ENTRY in RESUME went to error label when failed, which caused the duplicate frames (It used to goto exit_unwind which hides this issue).

The solution is not trivial here - multiple functions are relying on call_trampoline to set the traceback. A clear way to fix this is to always do traceback at the same place, preferably in _PyEval_EvalFrameDefault where the tracebacks are set for normal functions. However, to achieve that, all the trace function failures need to go through error lable, rather than some going through exit_unwind on failure as of now.

I'm not familiar with the structure enough to make the call, but I guess this would be a good question to @markshannon - can we do that? There is definitely performance hit(on a very rare case), but is that even valid? To go through error lable when the frame got an error from the tracing function?

I can help implementing this if needed, just need to confirm the solution.

@chgnrdv
Copy link
Contributor Author

chgnrdv commented Mar 20, 2023

@gaogaotiantian Yes, I overlooked the case on 3e43fac when trace function raises during line event.

My first rough attempt to fix this issue was to make goto in TRACE_FUNCTION_ENTRY point to exception_unwind label, but tests were failing and I was not familiar enough with code in _PyEval_EvalFrameDefault to make other attempts.

This draft patch works as expected and passes all tests. It's not a final fix but something that can prove the concept.

diff --git a/Python/ceval.c b/Python/ceval.c
index 7d60cf987e..0b70c7dc8b 100644
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -835,6 +835,8 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
     }
     DISPATCH();
 
+    bool tracing_error = false;
+
     {
     /* Start instructions */
 #if !USE_COMPUTED_GOTOS
@@ -896,6 +898,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
                         // instruction. Increment it before handling the error,
                         // so that it looks the same as a "normal" instruction:
                         next_instr++;
+                        tracing_error = true;
                         goto error;
                     }
                     // Reload next_instr. Don't increment it, though, since
@@ -982,13 +985,15 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
 
         /* Log traceback info. */
         assert(frame != &entry_frame);
-        if (!_PyFrame_IsIncomplete(frame)) {
+        if (!_PyFrame_IsIncomplete(frame) && !tracing_error) {
             PyFrameObject *f = _PyFrame_GetFrameObject(frame);
             if (f != NULL) {
                 PyTraceBack_Here(f);
             }
         }
 
+        tracing_error = false;
+
         if (tstate->c_tracefunc != NULL) {
             /* Make sure state is set to FRAME_UNWINDING for tracing */
             call_exc_trace(tstate->c_tracefunc, tstate->c_traceobj,
@@ -996,6 +1001,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
         }
 
 exception_unwind:
+
         {
             /* We can't use frame->f_lasti here, as RERAISE may have set it */
             int offset = INSTR_OFFSET()-1;
diff --git a/Python/ceval_macros.h b/Python/ceval_macros.h
index c2257515a3..107a5724e0 100644
--- a/Python/ceval_macros.h
+++ b/Python/ceval_macros.h
@@ -312,6 +312,7 @@ GETITEM(PyObject *v, Py_ssize_t i) {
         stack_pointer = _PyFrame_GetStackPointer(frame); \
         frame->stacktop = -1; \
         if (err) { \
+            tracing_error = true; \
             goto error; \
         } \
     }

@gaogaotiantian
Copy link
Member

Well, even though the fix is easy to understand, I'm not sure if it's elegant enough. Introducing a state variable makes the code less robust and harder to maintain in the future. But I agree that it proves the logic of the problem. I guess we still need to wait for @markshannon for the actual path to fix this.

@chgnrdv
Copy link
Contributor Author

chgnrdv commented Mar 20, 2023

I think we can get rid of variable by placing the code that sets traceback just after error label, since, as far as I see, this code cannot clear the current exception set on tstate and obviously can't change kwnames variable. But yes, let's wait and see what markshannon say about it.

diff --git a/Python/ceval.c b/Python/ceval.c
index 7d60cf987e..9d4f25a58b 100644
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -896,7 +896,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
                         // instruction. Increment it before handling the error,
                         // so that it looks the same as a "normal" instruction:
                         next_instr++;
-                        goto error;
+                        goto trace_error;
                     }
                     // Reload next_instr. Don't increment it, though, since
                     // we're going to re-dispatch to the "true" instruction now:
@@ -969,6 +969,16 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
 pop_1_error:
     STACK_SHRINK(1);
 error:
+        /* Log traceback info. */
+        assert(frame != &entry_frame);
+        if (!_PyFrame_IsIncomplete(frame)) {
+            PyFrameObject *f = _PyFrame_GetFrameObject(frame);
+            if (f != NULL) {
+                PyTraceBack_Here(f);
+            }
+        }
+
+trace_error:
         kwnames = NULL;
         /* Double-check exception status. */
 #ifdef NDEBUG
@@ -980,15 +990,6 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int
         assert(_PyErr_Occurred(tstate));
 #endif
 
-        /* Log traceback info. */
-        assert(frame != &entry_frame);
-        if (!_PyFrame_IsIncomplete(frame)) {
-            PyFrameObject *f = _PyFrame_GetFrameObject(frame);
-            if (f != NULL) {
-                PyTraceBack_Here(f);
-            }
-        }
-
         if (tstate->c_tracefunc != NULL) {
             /* Make sure state is set to FRAME_UNWINDING for tracing */
             call_exc_trace(tstate->c_tracefunc, tstate->c_traceobj,
diff --git a/Python/ceval_macros.h b/Python/ceval_macros.h
index c2257515a3..c8a077a9a6 100644
--- a/Python/ceval_macros.h
+++ b/Python/ceval_macros.h
@@ -312,7 +312,7 @@ GETITEM(PyObject *v, Py_ssize_t i) {
         stack_pointer = _PyFrame_GetStackPointer(frame); \
         frame->stacktop = -1; \
         if (err) { \
-            goto error; \
+            goto trace_error; \
         } \
     }

@markshannon
Copy link
Member

It is a bit strange that call_trampoline is setting the traceback.
I'll see if we can do something more sensible.

@markshannon markshannon self-assigned this May 17, 2023
@markshannon
Copy link
Member

The documentation for PyEval_SetTrace doesn't say anything about adding the caller's frame to the traceback.
It would a strange API design if it did.

So, call_trampoline should not be adding the frame. It is the job of the interpreter to add the frame.
There are no tests for PyEval_SetTrace set trace functions raising exceptions, and it is a very rare use case.

carljm added a commit to gsallam/cpython_with_perfmap_apii that referenced this issue May 20, 2023
* main: (30 commits)
  pythongh-103987: fix several crashes in mmap module (python#103990)
  docs: fix wrong indentation causing rendering error in dis page (python#104661)
  pythongh-94906: Support multiple steps in math.nextafter (python#103881)
  pythongh-104472: Skip `test_subprocess.ProcessTestCase.test_empty_env` if ASAN is enabled (python#104667)
  pythongh-103839: Allow building Tkinter against Tcl 8.7 without external libtommath (pythonGH-103842)
  pythongh-85984: New additions and improvements to the tty library. (python#101832)
  pythongh-104659: Consolidate python examples in enum documentation (python#104665)
  pythongh-92248: Deprecate `type`, `choices`, `metavar` parameters of `argparse.BooleanOptionalAction` (python#103678)
  pythongh-104645: fix error handling in marshal tests (python#104646)
  pythongh-104600: Make type.__type_params__ writable (python#104634)
  pythongh-104602: Add additional test for listcomp with lambda (python#104639)
  pythongh-104640: Disallow walrus in comprehension within type scopes (python#104641)
  pythongh-103921: Rename "type" header in argparse docs (python#104654)
  Improve readability of `typing._ProtocolMeta.__instancecheck__` (python#104649)
  pythongh-96522: Fix deadlock in pty.spawn (python#96639)
  pythonGH-102818: Do not call `PyTraceBack_Here` in sys.settrace trampoline.  (pythonGH-104579)
  pythonGH-103545: Add macOS specific constants for ``os.setpriority`` to ``os`` (python#104606)
  pythongh-104623: Update macOS installer to SQLite 3.42.0 (pythonGH-104624)
  pythongh-104619: never leak comprehension locals to outer locals() (python#104637)
  pythongh-104602: ensure all cellvars are known up front (python#104603)
  ...
@chgnrdv
Copy link
Contributor Author

chgnrdv commented May 29, 2023

I'm closing this one as it is fixed and the fix is backported.

@chgnrdv chgnrdv closed this as completed May 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants