GIL image title

Contents

Related files

  • cpython/Python/ceval.c
  • cpython/Python/ceval_gil.h
  • cpython/Include/internal/pycore_gil.h

Introduction

This is the definition of the Global Interpreter Lock.

In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

Thread scheduling before Python 3.2

A tick is, roughly, a count of how many opcodes the current thread has executed consecutively without releasing the GIL.

If the current thread is running a CPU-bound task, it releases the GIL every 100 ticks to give other threads an opportunity to run.

If the current thread is running an IO-bound task, the GIL is released whenever the thread makes a blocking call such as sleep, recv, or send, even if the count has not yet reached 100 ticks.
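The effect of releasing the GIL around a blocking call can be observed from Python itself. The following is a minimal sketch, assuming time.sleep stands in for any blocking call; the helper names blocking_task and run_two_sleepers are made up for illustration. Two threads that each sleep should overlap rather than run back to back.

```python
import threading
import time


def blocking_task(seconds):
    # time.sleep releases the GIL while the thread is blocked,
    # so other threads are free to run in the meantime.
    time.sleep(seconds)


def run_two_sleepers(seconds=0.2):
    """Start two sleeping threads and return the total wall-clock time."""
    threads = [threading.Thread(target=blocking_task, args=(seconds,))
               for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Expected to be close to 0.2 seconds rather than 0.4 seconds.
    print(f"elapsed: {run_two_sleepers(0.2):.3f}s")
```

If the sleeps were serialized by the GIL, the elapsed time would be roughly the sum of the two sleeps instead of roughly the longer one.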

You can call sys.setcheckinterval() to use a tick count other than the default of 100. (This function was deprecated in Python 3.2 and removed in Python 3.9.)

old_gil (picture from Understanding the Python GIL(youtube))

Because the tick is not time-based, some threads might run far longer than others.
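To make the point concrete, here is a small sketch that models each opcode as a fixed cost in microseconds and sums the cost of every 100-tick slice. The function hold_times and the per-opcode costs are invented for illustration; real opcodes do not have fixed costs, and the model assumes one tick per opcode.

```python
def hold_times(op_costs_us, check_interval=100):
    """Return the time (in microseconds) the GIL is held during each
    slice of `check_interval` ticks, given one cost per opcode."""
    slices = []
    for start in range(0, len(op_costs_us), check_interval):
        slices.append(sum(op_costs_us[start:start + check_interval]))
    return slices


cheap = [1] * 300       # 300 opcodes costing 1 microsecond each (made up)
expensive = [50] * 300  # 300 opcodes costing 50 microseconds each (made up)

print(hold_times(cheap))      # [100, 100, 100]
print(hold_times(expensive))  # [5000, 5000, 5000]
```

Both threads release the GIL after the same number of ticks, yet in this model the second one holds it 50 times longer per slice.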

On a multi-core machine, if two threads are both running CPU-bound tasks, the OS might schedule them on different cores. One thread may then hold the GIL and execute its 100-tick cycle on one core, while the thread on the other core wakes up periodically, tries to acquire the GIL, fails, and burns CPU time in the attempt.

Thread scheduling is fully controlled by the operating system. A thread handling an IO-bound task has to wait for another thread to release the GIL, and that other thread might re-acquire the GIL right after releasing it, which makes the IO-bound thread wait even longer. (In practice, a thread that triggers a context switch voluntarily gets higher priority than a thread that is preempted by the OS; programmers can take advantage of this by putting the IO-bound thread to sleep as soon as possible.)

gil_battle (picture from Understanding the Python GIL(youtube))

Thread scheduling after Python 3.2

Because of these performance issues on multi-core machines, the implementation of the GIL was changed substantially in Python 3.2.

If there's only one thread, it can run forever without checking and releasing the GIL.

If there is more than one thread, a thread blocked on the GIL waits for a timeout period, sets gil_drop_request to 1, and continues waiting. When the thread holding the GIL sees that gil_drop_request is 1, it releases the GIL and signals the blocked thread, which is then able to acquire the GIL. The thread that released the GIL then waits in turn, with the same timeout, to get it back.

new_gil (picture from Understanding the Python GIL(youtube))

The thread that sets gil_drop_request to 1 might not be the thread that acquires the GIL.

If the owner of the GIL changes while the current thread is waiting out the interval, then after waking up the current thread must wait for another full interval before it sets gil_drop_request to 1, and then wait again.

new_gil2 (picture from Understanding the Python GIL(youtube))
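The handshake described in this section can be imitated in pure Python. The sketch below is a toy model built on threading.Condition, not CPython's implementation: the class ToyGIL, the functions worker and run_workers, and the shared state dictionary are all hypothetical names, and the model leaves out forced switching.

```python
import threading


class ToyGIL:
    """Toy model of the timeout / drop-request handshake."""

    def __init__(self, interval=0.005):
        self.interval = interval            # seconds, like the switch interval
        self.cond = threading.Condition()   # stands in for gil.mutex + gil.cond
        self.locked = False
        self.last_holder = None
        self.switch_number = 0
        self.drop_request = False

    def take(self):
        me = threading.get_ident()
        with self.cond:
            while self.locked:
                saved = self.switch_number
                signaled = self.cond.wait(self.interval)
                if (not signaled and self.locked
                        and self.switch_number == saved):
                    # Timed out and no switch happened: ask the holder to drop.
                    self.drop_request = True
            self.locked = True
            if self.last_holder != me:
                self.last_holder = me
                self.switch_number += 1
            self.drop_request = False

    def drop(self):
        with self.cond:
            self.locked = False
            self.cond.notify()              # wake one waiting thread


def worker(gil, state, iterations):
    gil.take()
    for _ in range(iterations):
        state["count"] += 1                 # "opcode" run while holding the toy GIL
        if gil.drop_request:                # checked between "opcodes"
            gil.drop()
            gil.take()
    gil.drop()


def run_workers(iterations=100_000, n_threads=2):
    gil = ToyGIL()
    state = {"count": 0}
    threads = [threading.Thread(target=worker, args=(gil, state, iterations))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"], gil.switch_number


if __name__ == "__main__":
    print(run_workers())
```

Because only the holder of the toy lock increments the counter, the final count always equals iterations times n_threads, while the number of switches varies from run to run.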

For those who are interested in further details, please refer to Understanding the Python GIL(article).

Memory layout

git_layout

Fields

The Python interpreter is a program written in C, and every executable program written in C has a main function.

The main-related functions are defined in cpython/Modules/main.c. They perform some initialization of the interpreter state before executing the main loop, and _gil_runtime_state is created and initialized as part of that initialization.

./python.exe

init

interval

>>> import sys
>>> sys.getswitchinterval()
0.005

interval is the timeout, in microseconds, that a waiting thread sleeps before it sets gil_drop_request. The default of 5000 microseconds is 0.005 seconds.

It's stored as microseconds in C and represented as seconds in Python.
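The seconds-to-microseconds relationship can be checked from Python with sys.getswitchinterval() and sys.setswitchinterval(), both of which take or return seconds. The helper names switch_interval_us and measure_with_interval below are hypothetical, and the sketch restores the previous value when it is done.

```python
import sys


def switch_interval_us():
    """Return the current switch interval converted to whole microseconds."""
    return round(sys.getswitchinterval() * 1_000_000)


def measure_with_interval(seconds):
    """Temporarily set the switch interval and report it in microseconds."""
    old = sys.getswitchinterval()
    try:
        sys.setswitchinterval(seconds)   # the Python-level API uses seconds
        return switch_interval_us()
    finally:
        sys.setswitchinterval(old)       # always restore the previous value


if __name__ == "__main__":
    print(switch_interval_us())
    print(measure_with_interval(0.01))
```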

last_holder

last_holder stores the C address of the last PyThreadState that held the GIL. This helps us know whether anyone else was scheduled after we dropped the GIL.

locked

locked is a field of type _Py_atomic_int: -1 indicates uninitialized, 0 means no one is currently holding the GIL, and 1 means someone is holding it. It is atomic because it can be read in ceval.c without any lock being taken.

/* cpython/Python/ceval_gil.h */
static void take_gil(PyThreadState *tstate)
{
    /* omit */
    /* We now hold the GIL */
    _Py_atomic_store_relaxed(&_PyRuntime.ceval.gil.locked, 1);
    _Py_ANNOTATE_RWLOCK_ACQUIRED(&_PyRuntime.ceval.gil.locked, /*is_write=*/1);
    if (tstate != (PyThreadState*)_Py_atomic_load_relaxed(
                    &_PyRuntime.ceval.gil.last_holder))
    {
        _Py_atomic_store_relaxed(&_PyRuntime.ceval.gil.last_holder,
                                 (uintptr_t)tstate);
        ++_PyRuntime.ceval.gil.switch_number;
    }
    /* omit */
}

static void drop_gil(PyThreadState *tstate)
{
    /* omit */
    if (tstate != NULL) {
        _Py_atomic_store_relaxed(&_PyRuntime.ceval.gil.last_holder,
                                 (uintptr_t)tstate);
    }
    MUTEX_LOCK(_PyRuntime.ceval.gil.mutex);
    _Py_ANNOTATE_RWLOCK_RELEASED(&_PyRuntime.ceval.gil.locked, /*is_write=*/1);
    _Py_atomic_store_relaxed(&_PyRuntime.ceval.gil.locked, 0);
    /* omit */
}

switch_number

switch_number is a counter for the number of GIL switches since the beginning.

It is used in the function take_gil to detect whether a switch happened while a thread was waiting.

static void take_gil(PyThreadState *tstate)
{
    /* omit */
    while (_Py_atomic_load_relaxed(&_PyRuntime.ceval.gil.locked)) {
        /* as long as the GIL is locked */
        int timed_out = 0;
        unsigned long saved_switchnum;

        saved_switchnum = _PyRuntime.ceval.gil.switch_number;
        /* release gil.mutex and wait for INTERVAL microseconds (default 5000),
           or until gil.cond is signaled within the INTERVAL */
        COND_TIMED_WAIT(_PyRuntime.ceval.gil.cond, _PyRuntime.ceval.gil.mutex,
                        INTERVAL, timed_out);
        /* currently holding gil.mutex */
        if (timed_out &&
            _Py_atomic_load_relaxed(&_PyRuntime.ceval.gil.locked) &&
            _PyRuntime.ceval.gil.switch_number == saved_switchnum) {
            /* If we timed out and no switch occurred in the meantime, it is time
               to ask the GIL-holding thread to drop it:
               set gil_drop_request to 1 */
            SET_GIL_DROP_REQUEST();
        }
        /* go on to the while loop to check if the gil is locked */
    }
    /* omit */
}

mutex

mutex is a mutex used for protecting locked, last_holder, switch_number, and other variables in _gil_runtime_state.

cond

cond is a condition variable, combined with mutex, used for signaling the release of the GIL.

switch_cond and switch_mutex

switch_cond is another condition variable. Combined with switch_mutex, it is used to make sure that the thread acquiring the GIL is not the thread that just released it, which avoids wasting the time slice.

It can be turned off by leaving FORCE_SWITCHING undefined.

static void drop_gil(PyThreadState *tstate)
{
/* omit */
#ifdef FORCE_SWITCHING
    if (_Py_atomic_load_relaxed(&_PyRuntime.ceval.gil_drop_request) &&
        tstate != NULL)
    {
        /* if gil_drop_request is set and tstate is not NULL,
           lock the mutex switch_mutex */
        MUTEX_LOCK(_PyRuntime.ceval.gil.switch_mutex);
        if (((PyThreadState*)_Py_atomic_load_relaxed(
                    &_PyRuntime.ceval.gil.last_holder)
            ) == tstate)
        {
            /* if the last_holder is the current thread, release the switch_mutex
               and wait until there's a signal for switch_cond */
            RESET_GIL_DROP_REQUEST();
            /* NOTE: if COND_WAIT does not atomically start waiting when
               releasing the mutex, another thread can run through, take
               the GIL and drop it again, and reset the condition
               before we even had a chance to wait for it. */
            COND_WAIT(_PyRuntime.ceval.gil.switch_cond,
                      _PyRuntime.ceval.gil.switch_mutex);
        }
        MUTEX_UNLOCK(_PyRuntime.ceval.gil.switch_mutex);
    }
#endif
}

When the GIL will be released

The main_loop in cpython/Python/ceval.c is a big for loop, and a big switch statement.

The big for loop loads opcodes one by one, and the big switch statement executes different C code according to the opcode.

The for loop checks the variable gil_drop_request and releases the GIL if necessary.

Not every opcode checks gil_drop_request. An opcode that ends with FAST_DISPATCH() jumps straight to fast_next_opcode and skips the check, while an opcode that ends with DISPATCH() acts like a continue statement and goes back to the beginning of the for loop, where the check is made.

/* cpython/Python/ceval.c */
main_loop:
    for (;;) {
        /* omit */
        if (_Py_atomic_load_relaxed(&_PyRuntime.ceval.eval_breaker)) {
            opcode = _Py_OPCODE(*next_instr);
            if (opcode == SETUP_FINALLY ||
                opcode == SETUP_WITH ||
                opcode == BEFORE_ASYNC_WITH ||
                opcode == YIELD_FROM) {
                /* go to switch statement without check for the gil */
                goto fast_next_opcode;
            }
            /* omit */
            if (_Py_atomic_load_relaxed(
                        &_PyRuntime.ceval.gil_drop_request))
            {
                /* if gil_drop_request is set by another thread */
                /* Give another thread a chance */
                if (PyThreadState_Swap(NULL) != tstate)
                    Py_FatalError("ceval: tstate mix-up");
                drop_gil(tstate);

                /* Other threads may run now */

                take_gil(tstate);

                /* Check if we should make a quick exit. */
                if (_Py_IsFinalizing() &&
                    !_Py_CURRENTLY_FINALIZING(tstate))
                {
                    drop_gil(tstate);
                    PyThread_exit_thread();
                }

                if (PyThreadState_Swap(tstate) != NULL)
                    Py_FatalError("ceval: orphan tstate");
            }
            /* omit */
        }

    fast_next_opcode:
        /* omit */
    switch (opcode) {
        case TARGET(NOP): {
            FAST_DISPATCH();
        }
        /* omit */
        case TARGET(UNARY_POSITIVE): {
            PyObject *value = TOP();
            PyObject *res = PyNumber_Positive(value);
            Py_DECREF(value);
            SET_TOP(res);
            if (res == NULL)
                goto error;
            DISPATCH();
        }
        /* omit */
    }
    /* omit */
}
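The difference between the two dispatch styles in the excerpt above can be sketched in a few lines of Python. This is a toy model rather than the real eval loop: count_checks and the FAST_DISPATCH_OPCODES set are hypothetical, and the only opcodes modeled are the two that appear in the excerpt.

```python
# Opcodes that end with FAST_DISPATCH() in the excerpt; the rest use DISPATCH().
FAST_DISPATCH_OPCODES = {"NOP"}


def count_checks(opcodes):
    """Count how often the top-of-loop check (eval_breaker and
    gil_drop_request) would be reached while running `opcodes`."""
    checks = 0
    check_before_next = True          # the first opcode enters at the loop top
    for opcode in opcodes:
        if check_before_next:
            checks += 1               # top of the for loop
        # FAST_DISPATCH() jumps to fast_next_opcode, skipping the check;
        # DISPATCH() behaves like `continue`, so the next opcode is checked.
        check_before_next = opcode not in FAST_DISPATCH_OPCODES
    return checks


program = ["NOP", "UNARY_POSITIVE", "NOP", "NOP", "UNARY_POSITIVE"]
print(count_checks(program))  # 2
```

In this model the five opcodes pass through the check only twice: once on entry and once after the first UNARY_POSITIVE.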

ceval