Skip to content

Holding the PyGIL-lock during invokeTask causes segfault #29

@coppol-gh

Description

@coppol-gh

In "run" task of TaskCallClosure the "m_obj.invokeTask" is called from within critical section bounded by "PyGILState_Ensure/Release" pair. By design of CPython, such entrances must be nested within a thread. In contrast, "invokeTask" may take arbitrary time for execution (e.g. it can have a "#(delay)" block or loop or anything that causes it to last for a while). When multiple tasks run in parallel in a single thread (regular cocotb-coroutines) it is very probable that the order of calling Ensure/Release functions will get broken:

  1. task A: stateA = PyGILState_Ensure(), then yields due to entering SV-task; stateA is "not-locked" here
  2. task B: stateB = PyGILState_Ensure(), then yields; stateB is "locked" here
  3. task A: SV-task is over, calling PyGILState_Release(stateA); pass "not-locked" with gilstate_counter == 2 is non-fatal (yet)
  4. task B: SV-task is over, calling PyGILState_Release(stateB); pass "locked" with gilstate_counetr == 1 is fatal

"m_obj.invokeTask" indeed requires GIL-lock as it calls things like "pyhdl_if::PyLong_AsLong" to access parameters / setup return values. However, it should not hold the lock while it executes the SV-task per se. I suggest to modify "GenSVClass::gen_task_dispatch" to generate something like that:

diff --git a/src/hdl_if/share/dpi/pyhdl_if_icall_api.svh b/src/hdl_if/share/dpi/pyhdl_if_icall_api.svh
index 026188b..8c88bdf 100644
--- a/src/hdl_if/share/dpi/pyhdl_if_icall_api.svh
+++ b/src/hdl_if/share/dpi/pyhdl_if_icall_api.svh
@@ -26,7 +26,8 @@ interface class ICallApi;
 
     pure virtual task invokeTask(
         output PyObject     retval,
+        inout PyGILState_STATE state,
         input string        method,
         input PyObject      args);
 
diff --git a/src/hdl_if/impl/call/gen_sv_class.py b/src/hdl_if/impl/call/gen_sv_class.py
index faa4c8e..ae89fac 100644
--- a/src/hdl_if/impl/call/gen_sv_class.py
+++ b/src/hdl_if/impl/call/gen_sv_class.py
@@ -413,6 +413,7 @@ class GenSVClass(object):
         self.println("virtual task invokeTask(")
         self.inc_ind()
         self.println("output pyhdl_if::PyObject retval,")
+        self.println("inout pyhdl_if::PyGILState_STATE state,")
         self.println("input string method,")
         self.println("input pyhdl_if::PyObject args);")
         self.dec_ind()
@@ -439,6 +440,7 @@ class GenSVClass(object):
                         self.py2sv_func(p[1]),
                         i
                     ))
+                self.println("pyhdl_if::PyGILState_Release(state);")
                 self.write("%s" % self._ind)
                 self.write("%s" % m.name)
                 if len(m.params) == 0:
@@ -471,6 +473,7 @@ class GenSVClass(object):
                             "," if (i+1 < len(m.params)) else ");"
                         ))
                     self.dec_ind()
+                self.println("state = pyhdl_if::PyGILState_Ensure();")
                 if m.rtype is not None:
                     if type(m.rtype) is tuple:
                         last_idx = len(m.rtype) - 1

diff --git a/src/hdl_if/share/dpi/pyhdl_if_taskcall_closure.svh b/src/hdl_if/share/dpi/pyhdl_if_taskcall_closure.svh
index a03666f..91fe366 100644
--- a/src/hdl_if/share/dpi/pyhdl_if_taskcall_closure.svh
+++ b/src/hdl_if/share/dpi/pyhdl_if_taskcall_closure.svh
@@ -23,7 +23,7 @@ class TaskCallClosure implements PyHdlPiRunnable;
         evt_obj_set = PyObject_GetAttrString(m_evt_obj, "set");
         args = PyTuple_New(1);
 
-        m_obj.invokeTask(res, m_method_name, m_args);
+        m_obj.invokeTask(res, state, m_method_name, m_args);
 
         if (res == null) begin
             res = None;

It may be not the very best way to resolve the issue. Also, we can imagine the SV-task will try to call Python code in turn (and therefore the GIL-lock will need to be acquired/released again). Rather it is a description of the issue I have encountered trying to run many tasks in parallel.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions