-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crash on windows invoking flake8 #88350
Comments
I installed python using the installers from python.org -- I originally reproduced this using github actions using pyflakes's testsuite
Microsoft Windows [Version 10.0.19041.985]
C:\Users\asott\AppData\Local\Temp\y\pyflakes>venv\Scripts\python --version --version to reproduce:
# install tox somehow
here are the versions I have at that point:
then run this a few times:
even with it occasionally crashes like this:
from some googling:
I don't do much development on windows so I'm passing the torch to someone who knows more :) |
I can reproduce it outside of tox using: venv\Scripts\pip install flake8==3.6.0 C:\Users\asott\AppData\Local\Temp\y\pyflakes>venv\Scripts\flake8.exe setup.py C:\Users\asott\AppData\Local\Temp\y\pyflakes>echo %ERRORLEVEL% |
This looks like a pyflakes error to me. And you've also mentioned you can reproduce it outside tox with pyflakes. |
Anthony, could you please check if any of your dependencies has a native C extension? On Windows they have a .pyd extension. |
It's reproducible. I reproduced it on my Windows 10 with Python 3.10.0b1 |
everything in this virtualenv is pure python so I don't think it's a faulty third party extension module |
Pablo, Steve, please take a look. |
Is b02ba019e16f7c156ec63c2ea05c627a0fe86c48 the correct commit? I cannot checkout this: ❯ git clone https://github.com/pycqa/pyflakes Cloning into 'pyflakes'... /tmp pyflakes on master pyenv 3.9.1 fatal: reference is not a tree: b02ba019e16f7c156ec63c2ea05c627a0fe86c48 |
ah sorry, the branch got squash-merged this is the equivalent revision after the merge: f3b1b44bf3d2d5927004fa1c2fcf1ab2def816b9 |
Can recreate on the latest 3.10 checkout, taking a look. |
Where does flake8.exe come from? Is it created by pip? Or by Python? Is it part of the flake8 project? |
vstinner I showed the directions above, but here they are again: venv\Scripts\pip install flake8==3.6.0 |
This bug is really hard to reproduce on Windows. It depends on flake8 is run. It's a random crash in the last GC collection at Python exit. Crash related to AST in interpreter_clear() remains me bpo-41796. Extract of the code: static void
interpreter_clear(PyInterpreterState *interp, PyThreadState *tstate)
{
...
_PyAST_Fini(interp);
...
/* Last garbage collection on this interpreter */
_PyGC_CollectNoFail(tstate);
_PyGC_Fini(interp);
...
} The crash occurs in _PyGC_CollectNoFail(). _PyAST_Fini() clears references to AST types in the interpreter AST state. AST type instances hold a reference to the their heap type: static int
ast_traverse(AST_object *self, visitproc visit, void *arg)
{
Py_VISIT(Py_TYPE(self));
Py_VISIT(self->dict);
return 0;
} But the crash happens when self->dict is deallocated. |
ISTR there were some changes made to assigning attributes on AST classes recently? I forget who did them, but I remember discussing it during the sprints last year. Perhaps it's related to that? (Though more likely a subclass, as Steve suggests.) |
would it maybe be helpful to bisect a history where the dataclasses / inspect import change is introduced earlier? this would perhaps help pinpoint the other commit which is causing this? |
I have a reproducer (.py script) on Windows, but the crash rate is between 1/3 and 1/2. Also, in VS, if I run "import bug" (bug.py) in the REPL, I fail to reproduce the crash. Using "exec(open('../bug.py').read())" is more likely to trigger the crash, but in VS, I have hard time to trigger the bug. I only reproduced it once. The bug seems to require very precise conditions. Maybe it depends on the randomized hash function. |
Apply attached debug_subtype_dealloc.patch to reproduce the issue on Linux with attached ref.py script: $ ./python ref.py
exit
subtype_dealloc(_ABC): call basedealloc() with Py_REFCNT(type)=1
subtype_dealloc(_Precedence): call basedealloc() with Py_REFCNT(type)=1
subtype_dealloc(property): call basedealloc() with Py_REFCNT(type)=1
subtype_dealloc(FlagBoundary): call basedealloc() with Py_REFCNT(type)=1
LAST GC
Cycle.__del__
Cycle.__del__
subtype_dealloc(keyword): call basedealloc() with Py_REFCNT(type)=1
python: Objects/typeobject.c:1462: subtype_dealloc: Assertion `!_PyMem_IsPtrFreed(type->tp_name)' failed.
Abandon (core dumped) |
Ok, I got a crash under the address sanitizer using ref.py: ./python lel.py previously allocated by thread T0 here: |
Valgrind detects the bug (unmodified Python): $ PYTHONMALLOC=debug valgrind ./python ref.py
(...)
==607098== Invalid read of size 8
==607098== at 0x493FBE: subtype_dealloc (typeobject.c:1456)
==607098== by 0x47C914: _Py_Dealloc (object.c:2288)
==607098== by 0x45F6BF: _Py_DECREF (object.h:500)
==607098== by 0x45F70D: _Py_XDECREF (object.h:567)
==607098== by 0x4654F5: dict_dealloc (dictobject.c:2068)
==607098== by 0x47C914: _Py_Dealloc (object.c:2288)
==607098== by 0x668EBE: _Py_DECREF (object.h:500)
==607098== by 0x66E8BE: ast_dealloc (Python-ast.c:764)
==607098== by 0x493FB9: subtype_dealloc (typeobject.c:1450)
(...) |
Well remembered, Victor! Bisecting using Pablo's reproducer:
|
Anthony Sottile: I'm surprised that AST nodes survive until the last GC collection. It seems like somehow a reference cycle prevent to delete these nodes, and this reference cycle is kept alive somehow until the last GC collection at Python exit. It would be interesting to follow references. You may start from PyInterpreterState which keeps objects alive until the last GC collection:
Callbacks keep global variables of a module alive through its __globals__ attribute (namespace of the module where it's defined). |
that version of flake8 uses multiprocessing (even for 1 file) -- would the ast objects be involved in that way? (pyflakes also makes reference cyles to handle "parent" relationships) |
The issue is fixed by: commit 615069e
|
Anthony Sottile: "that version of flake8 uses multiprocessing (even for 1 file) -- would the ast objects be involved in that way? (pyflakes also makes reference cyles to handle "parent" relationships)" I expect flake8 to use the ast somewhere to analyze source code. But I don't have the bandwidth to investigate flake8 creates a reference cycle. The Python regression is fixed, I closed the issue. Thanks for the bug report Anthony! |
Indeed, it's quite a tricky issue so I'm glad it was caught in the beta. Thank you for the report Anthony. Thanks for tracing the root cause and the fix Victor and thank you to everyone who helped debug. |
Reopening this issue, as there is another branch (for non-GC heaptypes) earlier in subtype_dealloc that I believe suffers from the same problem. Actually triggering the error in a test has been difficult because as far as I can tell it relies on garbage collection at the right time, but reading the code it seems clear it's problematic. I'll prepare a PR to fix it there. I'm also reopening this issue because I believe it should've been backported to 3.9, and possibly 3.8 (if it's considered a security problem to get python to read and write freed memory). I found this issue in 3.9 while debugging a pybind11 crash. I'll backport after the other PR is in (or rejected). |
Fix extended to the other branch (and backported to 3.10), and both parts backported to 3.9. I don't think it counts as a security issue, so not backporting to 3.8. |
Thanks Thomas, I missed this code path! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: