In [None]:
import os
import sys

os.chdir("..")
sys.path.append("../../")

# Get the bpparam for `bpmapply`

## Introduction

In this section, we show how to use the `get_bpparam()` function to obtain an **R** `BiocParallel::MulticoreParam` or `BiocParallel::SnowParam` object for use with `parallelization="bpmapply"`.

For details on the function's inputs and outputs, see the [API](../set_up/_autosummary/pyscDesign3.get_bpparam.rst).

For more information on the **R** package `BiocParallel` and help with setting its parameters, see the [documentation](https://www.bioconductor.org/packages/devel/bioc/manuals/BiocParallel/man/BiocParallel.pdf).

## Step 1: Import packages

In [None]:
import time
import anndata as ad
import pyscDesign3

## Step 2: Call `get_bpparam` function

- For Linux/Mac users:

The available parallelization methods are `mcmapply`, `pbmcmapply`, and `bpmapply`. If you use `bpmapply`, run this function and choose either `MulticoreParam` or `SnowParam` mode.

- For Windows users:

The only option is the `bpmapply` method, with this function run in `SnowParam` mode. **The other methods do not allow more than 1 core.**

In [None]:
bpparam = pyscDesign3.get_bpparam(mode="MulticoreParam", show=True, stop_on_error=False)
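
The platform rule above can be sketched as a small helper that picks a mode name from the operating system. This is only an illustration: `choose_bpparam_mode` is a hypothetical function, not part of `pyscDesign3`, and it assumes that `MulticoreParam` is the preferred choice on Linux/Mac (either mode is allowed there).

```python
import platform


def choose_bpparam_mode(system=None):
    """Return a BiocParallel mode name suited to the operating system.

    Hypothetical helper for illustration; not part of pyscDesign3.
    """
    system = system or platform.system()
    # Windows users must use SnowParam; on Linux/Mac we assume MulticoreParam.
    return "SnowParam" if system == "Windows" else "MulticoreParam"


mode = choose_bpparam_mode()
print(mode)
```

The returned string could then be passed as the `mode` argument, for example `pyscDesign3.get_bpparam(mode=mode, show=True, stop_on_error=False)`.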

## Step 3: Read in data and Run the scDesign3 methods

The raw data come from [scvelo](https://scvelo.readthedocs.io/scvelo.datasets.pancreas/); we keep only the first 30 genes to save time.

In [None]:
data = ad.read_h5ad("data/PANCREAS.h5ad")
data = data[:, 0:30]
data

Here we simply show the difference in run time when fitting the marginal models with 1 core versus 4 cores, using the `bpparam` object created in Step 2 (`MulticoreParam` mode in this example).

In [None]:
# create the instance
test1 = pyscDesign3.scDesign3(n_cores=1, parallelization="bpmapply", bpparam=bpparam, return_py=False)
test2 = pyscDesign3.scDesign3(n_cores=4, parallelization="bpmapply", bpparam=bpparam, return_py=False)

# construct data
test1.construct_data(
    anndata=data,
    default_assay_name="counts",
    celltype="cell_type",
    pseudotime="pseudotime",
    corr_formula="1",
)
test2.construct_data(
    anndata=data,
    default_assay_name="counts",
    celltype="cell_type",
    pseudotime="pseudotime",
    corr_formula="1",
)

Fit the marginal models using 1 core

In [None]:
start = time.time()
test1.fit_marginal(
    mu_formula="s(pseudotime, k = 10, bs = 'cr')",
    sigma_formula="s(pseudotime, k = 5, bs = 'cr')",
    family_use="nb",
    usebam=False,
)
end = time.time()
print("Total time cost when using 1 core is {:.2f} sec".format(end-start))

Fit the marginal models using 4 cores

In [None]:
start = time.time()
test2.fit_marginal(
    mu_formula="s(pseudotime, k = 10, bs = 'cr')",
    sigma_formula="s(pseudotime, k = 5, bs = 'cr')",
    family_use="nb",
    usebam=False,
)
end = time.time()
print("Total time cost when using 4 cores is {:.2f} sec".format(end-start))
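
The two timing cells repeat the same start/stop pattern, so a small helper may make the comparison easier to reuse. The sketch below is one possible way to do it: `time_call` and `speedup` are hypothetical names, the workload is a stand-in for `fit_marginal`, and the 12 s and 4 s figures are made-up illustrative values, not measurements from this notebook.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def speedup(serial_seconds, parallel_seconds):
    """Ratio of serial to parallel wall time; above 1 means the parallel run was faster."""
    if parallel_seconds <= 0:
        raise ValueError("parallel_seconds must be positive")
    return serial_seconds / parallel_seconds


# Stand-in workload; in the notebook this would be a call such as test1.fit_marginal(...).
_, elapsed = time_call(sum, range(100_000))
print("elapsed: {:.4f} sec".format(elapsed))

# Illustrative numbers only (not measured): 12 s on 1 core versus 4 s on 4 cores.
print("speedup: {:.1f}x".format(speedup(12.0, 4.0)))
```

`time.perf_counter()` is used here instead of `time.time()` because it is a monotonic, high-resolution clock intended for measuring intervals.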