In [1]:
%load_ext cython

## There's a GIL

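The GIL is the global interpreter lock, which makes Python thread safe by allowing only one thread at a time to execute Python bytecode and work with Python objects. Cython functions that do not touch Python objects do not need to hold it, so they can release it and run in parallel.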
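
The effect of the lock can be seen from plain Python. The sketch below is only an illustration: `busy`, `timed`, `N_TASKS` and `N_ITER` are made-up names, and the workload is arbitrary. It runs the same CPU-bound loop serially and then on a thread pool; on a standard CPython build with the GIL enabled, the two timings are typically close, because the threads take turns rather than running at the same time.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def busy(n: int) -> float:
    """CPU-bound pure-Python loop (hypothetical workload for illustration)."""
    total = 0.0
    for i in range(n):
        total += math.sin(i) ** 2
    return total


def timed(fn):
    """Call fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


N_TASKS, N_ITER = 4, 200_000

# Same work, once in a plain loop and once spread over threads.
serial, t_serial = timed(lambda: [busy(N_ITER) for _ in range(N_TASKS)])
with ThreadPoolExecutor(max_workers=N_TASKS) as pool:
    threaded, t_threaded = timed(
        lambda: list(pool.map(busy, [N_ITER] * N_TASKS))
    )

print(f"serial:   {t_serial:.3f} s")
print(f"threaded: {t_threaded:.3f} s")
```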

On Windows, the number of cores is displayed on the Performance tab of the Task Manager.
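
The count can also be read from Python itself. A minimal sketch using the standard library follows; `logical_core_count` is a hypothetical helper name. Note that `os.cpu_count()` reports logical cores and may return `None` when the count cannot be determined, which the helper maps to 1.

```python
import os


def logical_core_count() -> int:
    """Return the number of logical cores, falling back to 1 if unknown."""
    count = os.cpu_count()  # may be None on some platforms
    return count if count is not None else 1


print(logical_core_count())
```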

In [2]:
%%cython

cdef extern from 'math.h':
    double cos(double x)
    double sin(double x)
    double tan(double x)

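cdef double func(double x):
    # Benchmark kernel: x is unused, and ss is overwritten on every
    # iteration, so only the last term (i = 999) is returned.
    cdef double ss = 0
    cdef int i
    for i in range(1000):
        ss = (cos(i) * sin(i) * tan(i))**100
    return ss

cpdef runWithGil():
    # Serial baseline: call func 100 times while holding the GIL.
    cdef int i
    cdef double ss
    for i in range(100):
        ss = func(i)
    return ss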

In [3]:
%timeit runWithGil()

10.4 ms ± 308 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [12]:
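%%cython
# Note: prange only runs in parallel when the extension is compiled and linked
# with OpenMP enabled (for example -fopenmp for GCC, /openmp for MSVC).
# This cell passes no such flags, in which case the loops below run serially.

from cython.parallel cimport prange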

cdef extern from 'math.h':
    double cos(double x) nogil
    double sin(double x) nogil
    double tan(double x) nogil

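cdef double func(double x) nogil:
    cdef double ss = 0
    cdef int i
    # ss is assigned inside prange, so Cython treats it as lastprivate:
    # after the loop it holds the value from the final iteration.
    for i in prange(1000, nogil=True, schedule='static', chunksize=1):
        ss = (cos(i) * sin(i) * tan(i))**100
    return ss

cdef double calcParallel() nogil:
    cdef double ss = 0
    cdef int i
    # Outer parallel loop; each iteration calls func, which has its own prange.
    for i in prange(100):
        ss = func(i)
    return ss

def runSumNoGil():
    # Python-callable wrapper around the nogil entry point.
    return calcParallel()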

In [13]:
%timeit runSumNoGil()

11 ms ± 400 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


Releasing the GIL here did not speed anything up (11 ms against 10.4 ms with the GIL), and the fact that we cannot use Python objects such as numpy arrays in nogil code makes the whole approach better avoided. Python offers parallelisation libraries which can do the job better, with a minimal yet debuggable interface.
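
As an illustration of the library route, here is a minimal sketch using `concurrent.futures` from the standard library. The text does not say which library it has in mind, so this choice is an assumption, as are the names `run_serial` and `run_parallel` and the `chunksize` value. `func` is a pure-Python counterpart of the Cython `func` above, so it is much slower per call than the compiled version; the point is only the shape of the interface.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def func(x: float) -> float:
    """Pure-Python counterpart of the Cython func (x is unused there too)."""
    ss = 0.0
    for i in range(1000):
        ss = (math.cos(i) * math.sin(i) * math.tan(i)) ** 100
    return ss


def run_serial(n_tasks: int = 100) -> list:
    """Evaluate func for each task index in a plain loop."""
    return [func(i) for i in range(n_tasks)]


def run_parallel(n_tasks: int = 100, max_workers=None) -> list:
    """Evaluate func for each task index across worker processes.

    Processes each have their own interpreter and GIL, so CPU-bound
    work can use several cores. chunksize batches the small tasks to
    reduce inter-process overhead (10 is an arbitrary choice).
    """
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, range(n_tasks), chunksize=10))
```

In a script, `run_parallel()` should be called under an `if __name__ == "__main__":` guard, which process pools need on Windows. In a notebook on Windows, the worker function generally has to live in an importable module rather than in a cell.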