In [None]:
import os
import sys

os.chdir("..")
sys.path.append("../../")

# Use `scdesign3()` to achieve all-in-one simulation

## Introduction

This section shows how to use the `scdesign3()` method of the `scDesign3` class to run the all-in-one simulation and obtain a new dataset.

For details on the inputs and outputs of the function, see the [API](../set_up/_autosummary/pyscDesign3._core.scDesign3.scdesign3.rst).

## Step 1: Import packages and read in data

### Import packages

When the `pyscDesign3` package is imported, its initialization locates the **R** interpreter and checks whether the **R** package **scDesign3** is installed. If **scDesign3** isn't installed, `pyscDesign3` tries to install it automatically.
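
If the import fails, a quick look for an **R** installation can help narrow down the cause. The snippet below is a rough pre-check using only the Python standard library. The helper name `locate_r` is hypothetical, and the lookup (the `R_HOME` environment variable, then `PATH`) is an assumption for illustration; it does not reproduce how `pyscDesign3` itself searches for **R**.

```python
import os
import shutil


def locate_r():
    """Return a hint about where R lives, or None if nothing obvious is found."""
    # R_HOME, when set, points at the R installation directory.
    r_home = os.environ.get("R_HOME")
    if r_home:
        return r_home
    # Otherwise fall back to searching PATH for the R executable.
    return shutil.which("R")


r_location = locate_r()
if r_location is None:
    print("No R installation found via R_HOME or PATH.")
else:
    print(f"R appears to be available at: {r_location}")
```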

In [None]:
import anndata as ad
import pyscDesign3

### Read in data

The input data should be an `anndata.AnnData` object, because so far only the conversion from `anndata.AnnData` to the **R** `SingleCellExperiment` object has been implemented.

Here, we read the `h5ad` file directly. The raw data comes from [scvelo](https://scvelo.readthedocs.io/scvelo.datasets.pancreas/), and we keep only the first 30 genes to save time.

```{eval-rst}
.. Note::
    If you have trouble building an `anndata.AnnData` object, see the `anndata documentation <https://anndata.readthedocs.io/en/latest/>`_.
```

In [None]:
data = ad.read_h5ad("data/PANCREAS.h5ad")
data = data[:, 0:30]
data

## Step 2: Create an instance of the scDesign3 class

When creating the instance, you can specify the basic settings: how many cores to use for computation, which parallelization method to use, and whether to return a more Pythonic output.

Details of the settings are shown in [API](../set_up/_autosummary/pyscDesign3._core.scDesign3.__init__.rst).

In [None]:
test = pyscDesign3.scDesign3(n_cores=1, parallelization="mcmapply", return_py=True)
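
The cell above uses a single core. If you want more, one simple way to choose a value is to derive it from the machine's core count. This is a minimal sketch; `pick_n_cores` and its `reserve` argument are hypothetical names introduced here, not part of `pyscDesign3`.

```python
import os


def pick_n_cores(reserve=1):
    """Return a core count that leaves `reserve` cores free, never less than 1."""
    total = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, total - reserve)


n_cores = pick_n_cores()
print(n_cores)
# The value could then be passed as `n_cores=n_cores` when creating the instance.
```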

## Step 3: Call the `scdesign3()` method

In [None]:
simu_res = test.scdesign3(
    anndata=data,
    default_assay_name="counts",
    celltype="cell_type",
    pseudotime="pseudotime",
    mu_formula="s(pseudotime, k = 10, bs = 'cr')",
    sigma_formula="s(pseudotime, k = 5, bs = 'cr')",
    family_use="nb",
    usebam=True,
    corr_formula="1",
    copula="gaussian",
)
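
Before running a call like the one above, it can help to confirm that the names passed as `celltype` and `pseudotime` actually exist in the data. The sketch below assumes these arguments refer to columns of `data.obs`; that is an assumption based on how the arguments are used here, so check the API page to confirm. The helper `missing_obs_columns` is hypothetical, and a small stand-in object replaces the real `AnnData` so the example runs on its own.

```python
from types import SimpleNamespace


def missing_obs_columns(adata, required):
    """Return the names in `required` that are absent from `adata.obs`."""
    # `in` checks column labels for a pandas DataFrame and keys for a dict.
    return [name for name in required if name not in adata.obs]


# Stand-in object for illustration; in the notebook, pass `data` instead.
toy = SimpleNamespace(obs={"cell_type": ["Alpha", "Beta"], "pseudotime": [0.1, 0.7]})

print(missing_obs_columns(toy, ["cell_type", "pseudotime"]))
print(missing_obs_columns(toy, ["cell_type", "batch"]))
```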

## Step 4: Check the simulation results and do downstream analysis if needed

Because we set `return_py=True` when creating the instance, the return value of `scdesign3()` is converted to a more Pythonic form.

In [None]:
simu_res["new_count"].head()

In [None]:
simu_res["model_aic"]
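
As one example of a simple downstream check, you could compare the fraction of zero counts in the original and simulated data. The sketch below assumes `simu_res["new_count"]` is a `pandas.DataFrame` (suggested by the `.head()` call above, but not confirmed here). The helper `zero_fraction` is hypothetical, and the two small tables are made-up stand-ins rather than real results.

```python
import pandas as pd


def zero_fraction(counts: pd.DataFrame) -> float:
    """Fraction of entries equal to zero, regardless of table orientation."""
    return float((counts == 0).to_numpy().mean())


# Toy stand-ins; in the notebook these would be the original counts
# and simu_res["new_count"].
real = pd.DataFrame({"gene_a": [0, 3, 5, 0], "gene_b": [1, 0, 0, 2]})
simulated = pd.DataFrame({"gene_a": [0, 2, 6, 1], "gene_b": [0, 0, 1, 3]})

print(f"zeros in real data:      {zero_fraction(real):.3f}")
print(f"zeros in simulated data: {zero_fraction(simulated):.3f}")
```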

The class property `whole_pipeline_res` also stores the simulation result, but as an `rpy2.robjects.vectors.ListVector`. If `return_py=False`, the return value of `scdesign3()` is exactly this object. You can call `print()` on it to show the result in **R**-style output.

In [None]:
print(test.whole_pipeline_res)

You can use the `rx2` method to extract the component you are interested in.

In [None]:
print(test.whole_pipeline_res.rx2("model_aic"))

```{eval-rst}
.. Caution::
    If you are familiar with the `rpy2` package, or if you do not need to manipulate the result, you may set `return_py` to False.

    If you are new to `rpy2`, you may prefer to set `return_py` to True: the output is converted to a form that is likely more familiar to you, although the conversion adds some overhead.
```