# Run *SDePER* on simulated data: Platform effect demonstration + scRNA-seq data as reference + NO CVAE

In this Notebook we run SDePER on STARmap-based simulated data. To best demonstrate the platform effect, we treat all 2,002 STARmap single cells as fake spatial spots and perform cell type deconvolution on them.

**Platform effect demonstration** means we actually use single cells for cell type deconvolution to best demonstrate the platform effect. The reference data for deconvolution includes all single cells with the **matched 12 cell types**.
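
Because every fake spot here is a single cell, its true composition is 1 for its own cell type and 0 for all others. The sketch below shows how such a one-hot ground truth could be built with pandas. The function name `one_hot_truth`, the cell IDs and the type labels are hypothetical placeholders, not the actual STARmap annotations.

```python
import pandas as pd


def one_hot_truth(cell_types: pd.Series) -> pd.DataFrame:
    """Return a cells x cell-types matrix with 1.0 at each cell's own type."""
    truth = pd.get_dummies(cell_types).astype(float)
    truth.index = cell_types.index
    return truth


# Hypothetical labels for three fake spots (placeholder names, not real data)
labels = pd.Series(
    ["typeA", "typeB", "typeA"],
    index=["cell_1", "cell_2", "cell_3"],
)
truth = one_hot_truth(labels)
print(truth)
```

Each row of `truth` sums to 1, so it can be compared directly against an estimated proportion matrix with the same row and column layout.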

**scRNA-seq data as reference** means the reference data is scRNA-seq data ([GSE115746](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE115746)) from the same tissue as the simulated spatial data. Because the reference comes from a different platform than the STARmap data, a **platform effect exists**.

**NO CVAE** means we DO NOT use CVAE to remove the platform effect, although a platform effect exists here.

==================================================================================================================

We therefore use the **3 input files** listed below:

1. raw nUMI counts of all 2,002 STARmap single cells (cells × genes): [STARmap_cell_nUMI.csv](https://github.com/az7jh2/SDePER_Analysis/blob/main/Simulation/Run_SDePER_on_simulation_data/Scenario_1/ref_spatial/STARmap_cell_nUMI.csv)
2. raw nUMI counts of reference scRNA-seq data (cells × genes): `scRNA_data_full.csv`. Since the CSV file of the raw nUMI matrix of all 23,178 cells and 45,768 genes is up to 2.3 GB, we do not provide this file in our repository. It is simply the **matrix transpose** of [GSE115746_cells_exon_counts.csv.gz](https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE115746&format=file&file=GSE115746%5Fcells%5Fexon%5Fcounts%2Ecsv%2Egz) in [GSE115746](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE115746), which satisfies the file format requirement of rows as cells and columns as genes.
3. cell type annotations for cells of **the matched 12 cell types** in reference scRNA-seq data (cells × 1): [ref_scRNA_cell_celltype.csv](https://github.com/az7jh2/SDePER_Analysis/blob/main/Simulation/Run_SDePER_on_simulation_data/Scenario_1/ref_scRNA_seq/ref_scRNA_cell_celltype.csv)

NOTE: No adjacency matrix was used as input!
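
The matrix transpose described in item 2 could be done along the following lines. This is a minimal sketch, not the script actually used to produce `scRNA_data_full.csv`. It assumes the source CSV stores gene identifiers in its first column and one cell per remaining column. The helper name `transpose_counts` is hypothetical, and the demonstration uses a tiny made-up matrix.

```python
import os
import tempfile

import pandas as pd


def transpose_counts(src: str, dst: str) -> None:
    """Read a genes x cells CSV and write it back as cells x genes."""
    # Assumption: the first column of `src` holds the gene identifiers
    counts = pd.read_csv(src, index_col=0)
    counts.T.to_csv(dst)


# Tiny made-up genes x cells matrix to show the effect
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "genes_by_cells.csv")
    dst = os.path.join(tmp, "cells_by_genes.csv")
    pd.DataFrame(
        {"cell_1": [5, 0], "cell_2": [1, 3]},
        index=["gene_a", "gene_b"],
    ).to_csv(src)
    transpose_counts(src, dst)
    result = pd.read_csv(dst, index_col=0)

print(result)
```

For the real data the call would look like `transpose_counts("GSE115746_cells_exon_counts.csv.gz", "scRNA_data_full.csv")`; pandas infers gzip compression from the `.gz` extension. Loading the full matrix this way needs enough memory to hold it at once.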

==================================================================================================================

SDePER settings are:

* number of CPU cores used `n_core`: 64
* **whether to use CVAE to remove platform effect `use_cvae`: false**

ALL other options are left as default.

==================================================================================================================

The `bash` command to start cell type deconvolution is:

`runDeconvolution -q STARmap_cell_nUMI.csv -r scRNA_data_full.csv -c ref_scRNA_cell_celltype.csv -n 64 --use_cvae false`
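
The same command can also be assembled as an argument list, which lets `subprocess.run` be called without `shell=True`. The sketch below only builds and prints the command; it uses exactly the flags shown above and does not invoke SDePER. The variable names are illustrative.

```python
import shlex

# Input files and settings taken from the command above
spatial_file = "STARmap_cell_nUMI.csv"
ref_file = "scRNA_data_full.csv"
ref_celltype_file = "ref_scRNA_cell_celltype.csv"
n_cores = 64

argv = [
    "runDeconvolution",
    "-q", spatial_file,
    "-r", ref_file,
    "-c", ref_celltype_file,
    "-n", str(n_cores),
    "--use_cvae", "false",
]

# shlex.join quotes arguments only where needed, giving a copy-pasteable command
command_line = shlex.join(argv)
print(command_line)
```

Passing `argv` directly, as in `subprocess.run(argv, check=True)`, avoids shell quoting issues if a file path ever contains spaces.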

Note that this Notebook uses **SDePER v1.5.0**. The cell type deconvolution result was renamed to [PlatEffDemo_ref_scRNA_SDePER_NO_CVAE_celltype_proportions.csv](https://github.com/az7jh2/SDePER_Analysis/blob/main/Simulation/Run_SDePER_on_simulation_data/PlatEffDemo/PlatEffDemo_ref_scRNA_SDePER_NO_CVAE_celltype_proportions.csv).
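
A quick sanity check of a proportions table could look like the sketch below. It assumes, without confirmation from this Notebook, that the result CSV has one row per spot, one column per cell type, and spot names in the first column. The helper `summarize_proportions` is hypothetical, and the demonstration values are made up.

```python
import pandas as pd


def summarize_proportions(props: pd.DataFrame) -> pd.DataFrame:
    """Report the row sum and the dominant cell type for every spot."""
    return pd.DataFrame(
        {
            "row_sum": props.sum(axis=1),
            "top_celltype": props.idxmax(axis=1),
            "top_proportion": props.max(axis=1),
        }
    )


# Made-up proportions for two fake spots (not real SDePER output)
demo = pd.DataFrame(
    [[0.9, 0.1], [0.2, 0.8]],
    index=["cell_1", "cell_2"],
    columns=["typeA", "typeB"],
)
summary = summarize_proportions(demo)
print(summary)
```

Under the stated layout assumption, the real result could be loaded with `pd.read_csv("PlatEffDemo_ref_scRNA_SDePER_NO_CVAE_celltype_proportions.csv", index_col=0)` and passed to the same function. Since each fake spot is a single cell, a `top_proportion` close to 1 for the correct cell type would indicate little distortion from the platform effect.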

In [1]:
import subprocess

cmd = '''runDeconvolution -q STARmap_cell_nUMI.csv \
                          -r scRNA_data_full.csv \
                          -c ref_scRNA_cell_celltype.csv \
                          -n 64 \
                          --use_cvae false
'''

subprocess.run(cmd, check=True, text=True, shell=True)


SDePER (Spatial Deconvolution method with Platform Effect Removal) v1.5.0


running options:
spatial_file: /home/exouser/Spatial/STARmap_cell_nUMI.csv
ref_file: /home/exouser/Spatial/scRNA_data_full.csv
ref_celltype_file: /home/exouser/Spatial/ref_scRNA_cell_celltype.csv
marker_file: None
loc_file: None
A_file: None
n_cores: 64
threshold: 0
use_cvae: False
use_imputation: False
diagnosis: False
verbose: True
use_fdr: True
p_val_cutoff: 0.05
fc_cutoff: 1.2
pct1_cutoff: 0.3
pct2_cutoff: 0.1
sortby_fc: True
n_marker_per_cmp: 20
filter_cell: True
filter_gene: True
n_hv_gene: 200
n_pseudo_spot: 100000
pseudo_spot_min_cell: 2
pseudo_spot_max_cell: 8
seq_depth_scaler: 10000
cvae_input_scaler: 10
cvae_init_lr: 0.01
num_hidden_layer: 1
use_batch_norm: True
cvae_train_epoch: 500
use_spatial_pseudo: False
redo_de: True
seed: 383
lambda_r: [0.1, 0.268, 0.72, 1.931, 5.179, 13.895, 37.276, 100.0]
lambda_g: [0.1, 0.268, 0.72, 1.931, 5.179, 13.895, 37.276, 100.0]
diameter: 200
impute_diameter: [160, 

CompletedProcess(args='runDeconvolution -q STARmap_cell_nUMI.csv                           -r scRNA_data_full.csv                           -c ref_scRNA_cell_celltype.csv                           -n 64                           --use_cvae false\n', returncode=0)