Python logic error when deal with re and muti-threading #68743
Bug 0x01 is the main problem.
t.start()
t.join(timeout)
In the normal case, where the sub-thread just runs a while() loop, the main thread regains control of the program once t.join(timeout) times out.
But in our POC, even after the timeout expires, the main thread cannot continue executing. After analyzing this, I found that the main thread is trapped in an infinite loop, as I described in the PDF. |
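For reference, the expected behaviour of `join(timeout)` in the normal case can be sketched as below. This is a minimal illustration, not code from the report: the `spin` function and the `stop` event are hypothetical names added so that the worker can be shut down cleanly. It assumes the worker runs pure-Python code, so the interpreter can switch threads between bytecodes.

```python
import threading

def spin(stop):
    # Pure-Python busy loop (hypothetical stand-in for the report's while loop).
    # The interpreter can hand the GIL to other threads between bytecodes.
    while not stop.is_set():
        pass

stop = threading.Event()
worker = threading.Thread(target=spin, args=(stop,), daemon=True)
worker.start()

# join() with a timeout only waits; it does not stop the thread.
worker.join(0.2)
still_running = worker.is_alive()  # the worker is still looping at this point

# Ask the worker to exit, then wait for it without a timeout.
stop.set()
worker.join()
print("worker alive after timed join:", still_running)
```

Note that `join(timeout)` returning says nothing about whether the thread has finished; `is_alive()` has to be checked afterwards.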
If you re-post your bug information as plain text and/or as a test program, it might get faster attention. |
# Python logic error when dealing with re and multi-threading
Using re together with multi-threading triggers the bug.
Bug type:
Test Environment:
-----------------------------Normal Case------------------------
Test Code:
#!/usr/bin/python
__author__ = 'bee13oy'
import re
import threading

timeout = 2
source = "(.*(.)?)*bcd\\t\\n\\r\\f\\a\\e\\071\\x3b\\$\\\\\?caxyz"

def run(source):
    # Pure-Python busy loop that never returns.
    while(1):
        print("test1")

def handle():
    try:
        t = threading.Thread(target=run,args=(source,))
        t.setDaemon(True)
        t.start()
        # Wait at most `timeout` seconds for the sub-thread.
        t.join(timeout)
        print("thread finished...It's a normal case!\n")
    except:
        print("exception ...\n")

handle()
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-----------------------------Bug Case-----------------------------------------------------------------------------
POC:
#!/usr/bin/python
__author__ = 'bee13oy'
import re
import os
import threading

timeout = 2
source = "(.*(.)?)*bcd\\t\\n\\r\\f\\a\\e\\071\\x3b\\$\\\\\?caxyz"

def run(source):
    # The same string is used as both the pattern and the text to search.
    regexp = re.compile(r''+source+'')
    sgroup = regexp.search(source)

def handle():
    try:
        t = threading.Thread(target=run,args=(source,))
        t.setDaemon(True)
        t.start()
        # Expected to return after `timeout` seconds, but it does not.
        t.join(timeout)
        print("finished...\n")
    except:
        print("exception ...\n")

handle()
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
----------------------------------------------------------------
At first, execution enters the sub-thread, but the sub-thread never ends normally. The bug is that the sub-thread enters an infinite loop and the main thread enters an infinite loop too, which causes the program to hang. By analyzing the source code of Python, we found that:
-----------------------------code block 0----------------------------------
-----------------------------code block 1----------------------------------
static void take_gil(PyThreadState *tstate)
{
    int err;
    if (tstate == NULL)
        Py_FatalError("take_gil: NULL tstate");

    err = errno;
    MUTEX_LOCK(gil_mutex);

    if (!_Py_atomic_load_relaxed(&gil_locked))
        goto _ready;

    /*Cycle code which will never return*/
    while (_Py_atomic_load_relaxed(&gil_locked)) {
        int timed_out = 0;
        unsigned long saved_switchnum;

        saved_switchnum = gil_switch_number;
        COND_TIMED_WAIT(gil_cond, gil_mutex, INTERVAL, timed_out);
        /* If we timed out and no switch occurred in the meantime, it is time
           to ask the GIL-holding thread to drop it. */
        if (timed_out &&
            _Py_atomic_load_relaxed(&gil_locked) &&
            gil_switch_number == saved_switchnum) {
            SET_GIL_DROP_REQUEST();
        }
    }
    .....
} |
Your regex is a pathological case: it suffers from catastrophic backtracking and can take a long time to finish. The other problem is that the re module never releases the GIL, so while it's performing the search in the low-level C code, other Python threads don't get a chance to run. |
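One possible workaround, sketched here under assumptions rather than taken from the thread, is to run the search in a separate interpreter process, which can be killed when it overruns. Because the search holds the GIL in its own process, the parent stays responsive. The helper name `search_with_timeout`, the child script `_CHILD`, and the exit code 3 are hypothetical choices for this illustration, and the pattern `(a+)+$` is a simpler, well-known backtracking case used in place of the pattern from the report.

```python
import subprocess
import sys

# Child program (hypothetical helper): search argv[2] with the pattern in
# argv[1]; exit 0 on a match and 3 on no match.
_CHILD = (
    "import re, sys\n"
    "m = re.search(sys.argv[1], sys.argv[2])\n"
    "sys.exit(0 if m else 3)\n"
)

def search_with_timeout(pattern, text, timeout):
    """Return True/False for match/no match, or None if the search timed out."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            timeout=timeout,  # the child is killed if it runs longer than this
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode == 0:
        return True
    if proc.returncode == 3:
        return False
    # Any other exit code means the child failed, e.g. an invalid pattern.
    raise RuntimeError("regex child exited with code %d" % proc.returncode)

if __name__ == "__main__":
    # A benign search finishes well within the limit.
    print(search_with_timeout(r"b+c", "abbbc", 30))
    # Nested quantifiers on a non-matching input backtrack catastrophically,
    # so this search is abandoned after one second.
    print(search_with_timeout(r"(a+)+$", "a" * 64 + "b", 1.0))
```

Starting an interpreter per search is expensive, so this only suits cases where untrusted or unpredictable patterns must be bounded; rewriting the pattern to avoid nested quantifiers remains the more direct fix.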
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.