# Description

Similar to notebook `07`, but with numba disabled, so that the numba-compiled version can be compared against a pure Python implementation.

# Disable numba

In [1]:
%env NUMBA_DISABLE_JIT=1

env: NUMBA_DISABLE_JIT=1
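The variable only takes effect if it is set before numba is first imported, since numba reads its configuration at import time. As a minimal sketch, the check below reads the environment directly; `jit_disabled_by_env` is a hypothetical helper written for this note, not part of numba or clustermatch.

```python
import os


def jit_disabled_by_env(environ=os.environ):
    """Return True if NUMBA_DISABLE_JIT is set to a non-zero integer.

    Hypothetical helper: it only inspects the environment mapping and
    does not import or query numba itself.
    """
    raw = environ.get("NUMBA_DISABLE_JIT", "0")
    try:
        return int(raw) != 0
    except ValueError:
        # Non-integer values are treated as "not disabled" in this sketch.
        return False


print(jit_disabled_by_env())
```

Once numba has been imported, `numba.config.DISABLE_JIT` should reflect the same setting.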


# Remove pycache dir

In [2]:
!echo ${CODE_DIR}

/opt/code


In [3]:
!find ${CODE_DIR} -regex '^.*\(__pycache__\)$' -print

/opt/code/libs/clustermatch/__pycache__
/opt/code/libs/clustermatch/sklearn/__pycache__
/opt/code/libs/clustermatch/scipy/__pycache__
/opt/code/libs/clustermatch/pytorch/__pycache__


In [4]:
!find ${CODE_DIR} -regex '^.*\(__pycache__\)$' -prune -exec rm -rf {} \;

In [5]:
!find ${CODE_DIR} -regex '^.*\(__pycache__\)$' -print
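The second `find` prints nothing, which confirms that the cache directories are gone. Removing them presumably ensures that no previously cached compiled code is reused, since numba's on-disk cache lives in `__pycache__` by default. The same cleanup can be sketched in Python for environments without a shell; `remove_pycache_dirs` is a hypothetical helper, and the root directory is whatever `CODE_DIR` points to.

```python
import shutil
from pathlib import Path


def remove_pycache_dirs(root):
    """Delete every __pycache__ directory under root; return the removed paths."""
    removed = []
    # Collect the matches first so that deleting does not disturb the traversal.
    for cache_dir in sorted(Path(root).rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed.append(cache_dir)
    return removed
```

Calling `remove_pycache_dirs(os.environ["CODE_DIR"])` (after `import os`) would mirror the shell commands above.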

# Modules

In [6]:
import numpy as np

from clustermatch.coef import ccc

In [7]:
# with JIT disabled nothing is compiled here; this call only checks that
# ccc runs before timing and profiling
ccc(np.random.rand(10), np.random.rand(10))

0.28

# Data

In [8]:
n_genes, n_samples = 10, 30000

In [9]:
np.random.seed(0)

In [10]:
data = np.random.rand(n_genes, n_samples)

In [11]:
data.shape

(10, 30000)

# With default `internal_n_clusters`

In [12]:
def func():
    n_clust = list(range(2, 10 + 1))
    return ccc(data, internal_n_clusters=n_clust)

In [13]:
%%timeit func()
func()

5.04 s ± 33.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
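The "7 runs, 1 loop each" summary can be approximated outside IPython with the standard `timeit` module. This is a rough sketch, not IPython's own implementation; `time_like_ipython` is a hypothetical name, and using the population standard deviation is an assumption.

```python
import statistics
import timeit


def time_like_ipython(fn, repeat=7, number=1):
    """Return (mean, std. dev.) of the per-loop time in seconds.

    Each of the `repeat` runs calls fn `number` times.
    """
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    per_loop = [total / number for total in totals]
    return statistics.mean(per_loop), statistics.pstdev(per_loop)


# Stand-in workload; in this notebook the argument would be `func`.
mean_s, std_s = time_like_ipython(lambda: sum(range(100_000)))
print(f"{mean_s * 1e3:.3f} ms ± {std_s * 1e3:.3f} ms per loop")
```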


In [14]:
%%prun -s cumulative -l 50 -T 09-cm_many_samples-default_internal_n_clusters.txt
func()

 
*** Profile printout saved to text file '09-cm_many_samples-default_internal_n_clusters.txt'. 
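Outside IPython, a comparable report (sorted by cumulative time, limited to 50 entries) can be produced with the standard `cProfile` and `pstats` modules. The sketch below uses a stand-in workload; `profile_to_text` and `workload` are hypothetical names, and in this notebook the profiled callable would be `func`.

```python
import cProfile
import io
import pstats


def profile_to_text(fn, limit=50):
    """Profile fn() and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        fn()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Same ordering and entry limit as `%%prun -s cumulative -l 50`.
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


def workload():
    # Stand-in for the real computation.
    return sum(i * i for i in range(10_000))


report = profile_to_text(workload)
print(report)
```

The returned string can be written to a text file to keep a record similar to the `-T` option.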


With the default `internal_n_clusters`, the pure Python run is only slightly slower than the numba-compiled version (notebook `07`).

# With reduced `internal_n_clusters`

In [15]:
def func():
    n_clust = list(range(2, 5 + 1))
    return ccc(data, internal_n_clusters=n_clust)

In [16]:
%%timeit func()
func()

500 ms ± 3.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [17]:
%%prun -s cumulative -l 50 -T 09-cm_many_samples-less_internal_n_clusters.txt
func()

 
*** Profile printout saved to text file '09-cm_many_samples-less_internal_n_clusters.txt'. 


With the reduced `internal_n_clusters`, the pure Python run is slightly faster than the numba-compiled version (notebook `07`), which is surprising. A useful next step would be to disable threading here, so that the profiler output accounts for all of the work, and then use those results to investigate the difference.
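One way to approach the threading question: `cProfile` only records the thread it runs in, so work done in worker threads does not show up in the report. A minimal sketch for limiting thread counts through environment variables follows. `limit_threads` is a hypothetical helper, and which of these variables actually affect `ccc` is an assumption. They need to be set before numpy and numba are imported.

```python
import os

# Commonly honored thread-count variables (numba, OpenMP, OpenBLAS, MKL).
THREAD_ENV_VARS = (
    "NUMBA_NUM_THREADS",
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
)


def limit_threads(n_threads=1, environ=os.environ):
    """Set each thread-count variable to n_threads; return the values set."""
    for name in THREAD_ENV_VARS:
        environ[name] = str(n_threads)
    return {name: environ[name] for name in THREAD_ENV_VARS}


print(limit_threads(1))
```

If `ccc` exposes its own parameter for controlling parallelism (not shown in this notebook), setting that would be the more direct option.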