# Installation for Pysodb

This tutorial demonstrates how to install the pysodb package in a conda environment.

## Installing software

### 1. Install Anaconda and Visual Studio Code

Reference tutorials can be found at https://docs.anaconda.com/anaconda/install/index.html and https://docs.anaconda.com/anaconda/user-guide/tasks/integration/vscode/

### 2. Launch Visual Studio Code and open a terminal window. 


From this point on, packages and modules are installed from the command line.

## Installing pysodb

### 3. Select the installation path and open it

In [None]:
cd <path>

### 4. Clone the pysodb repository

In [None]:
git clone https://github.com/TencentAILabHealthcare/pysodb.git

### 5. Open the pysodb directory

In [None]:
cd pysodb

### 6. Create a conda environment

In [None]:
conda env create -n <environment_name> --file pysodb.yml

### 7. Activate the conda environment

To use the conda environment in the terminal, run the following command to activate it:

In [None]:
conda activate <environment_name>
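
As a quick check that the activation took effect, the following minimal sketch reads the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated. The helper name `active_conda_env` is hypothetical and is not part of pysodb.

```python
import os


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if none is active.

    Hypothetical helper: it only reads CONDA_DEFAULT_ENV from the given mapping.
    """
    return environ.get("CONDA_DEFAULT_ENV")


# Prints the environment name chosen in step 6, or None outside conda.
print(active_conda_env())
```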

### 8. Install the pysodb package from source

In [None]:
python setup.py install

### 9. Install pysodb as a dependency or third-party package with pip

In [None]:
pip install <third-party package>
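
Before moving on to the usage section, it can help to confirm that the package is visible to the active interpreter. The sketch below uses only the standard library; the helper name `is_installed` is hypothetical, and the check says nothing about whether the installed version is the expected one.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by the current interpreter."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("pysodb"):
    print("pysodb is importable from this environment")
else:
    print("pysodb was not found; check that the conda environment is active")
```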

## Usage

The next steps demonstrate how to use the pysodb package in Jupyter.

In Jupyter, choose "Select Kernel" and then select the Python environment created above.

### Import the pysodb package

In [1]:
import pysodb

### Initialization

In [2]:
sodb = pysodb.SODB() 

### Get the list of datasets

In [3]:
dataset_list = sodb.list_dataset()

### Get the list of datasets in a specific category

The available categories are ["Spatial Transcriptomics", "Spatial Proteomics", "Spatial Metabolomics", "Spatial Genomics", "Spatial MultiOmics"].

For example, for "Spatial Transcriptomics":

In [4]:
dataset_list = sodb.list_dataset_by_category("Spatial Transcriptomics")
dataset_list

['maynard2021trans',
 'codeluppi2018spatial',
 'xia2022the',
 'backdahl2021spatial',
 'eng2019transcriptome',
 'berglund2018spatial',
 'Sanchez2021A',
 'thrane2018spatially',
 'Dhainaut2022Spatial',
 'Buzzi2022Spatial',
 'Gouin2021An',
 'Wang2018Three_1k',
 'wang2021easi',
 'lohoff2021integration',
 'chen2020spatial',
 'wang2022high',
 'Sun2022Excitatory',
 'Garcia2021Mapping',
 'ji2020multimodal',
 'Dixon2022Spatially',
 'Zeng2023Integrative',
 'asp2019a',
 'seqFISH_VISp',
 'Wang2018three',
 'rodriques2019slide',
 'chen2021decoding',
 'stickels2020highly',
 'liu2022spatiotemporal',
 'Alon2021Expansion',
 'Allen2022Molecular_lps',
 'chen2022spatiotemporal_compre_20',
 'carlberg2019exploring',
 'zhang2021spatially',
 'Marshall2022High_human',
 'Vickovic2019high_update',
 'scispace',
 'hunter2021spatially',
 'Kadur2022Human',
 'fawkner2021spatiotemporal',
 'stahl2016visualization',
 'ortiz2020molecular',
 'Vickovic2019high',
 'Biermann2022Dissecting',
 'DARTFISH',
 'Marshall2022High_mous
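
To compare categories, the same call can be repeated for each category name listed above. The following is a minimal sketch: `count_datasets_by_category` is a hypothetical helper, and it assumes only that the object passed in offers the `list_dataset_by_category` method shown in this section.

```python
# Category names copied from the list given earlier in this section.
CATEGORIES = [
    "Spatial Transcriptomics",
    "Spatial Proteomics",
    "Spatial Metabolomics",
    "Spatial Genomics",
    "Spatial MultiOmics",
]


def count_datasets_by_category(sodb, categories=CATEGORIES):
    """Return a dict mapping each category to the number of datasets it lists.

    Hypothetical helper: `sodb` is any object with list_dataset_by_category().
    """
    return {
        category: len(sodb.list_dataset_by_category(category))
        for category in categories
    }


# Usage in a live session (requires pysodb to be installed):
# import pysodb
# counts = count_datasets_by_category(pysodb.SODB())
# print(counts)
```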

### Load a specific experiment 

Loading a specific experiment requires two arguments: dataset_name and experiment_name.

Valid values for both arguments are listed at https://gene.ai.tencent.com/SpatialOmics/.

In [5]:
adata = sodb.load_experiment('hunter2021spatially','sample_B')
adata

load experiment[sample_B] in dataset[hunter2021spatially]


AnnData object with n_obs × n_vars = 2179 × 32268
    obs: 'col_0', 'leiden'
    var: 'gene_ids', 'feature_types', 'genome', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'leiden', 'leiden_colors', 'log1p', 'moranI', 'neighbors', 'pca', 'spatial_neighbors', 'umap'
    obsm: 'X_pca', 'X_umap', 'spatial', 'spatial_pixel', 'spatial_real'
    varm: 'PCs'
    obsp: 'connectivities', 'distances', 'spatial_connectivities', 'spatial_distances'
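
The printed representation above can also be read programmatically. The sketch below gathers a few fields into a plain dict using standard AnnData attributes (`n_obs`, `n_vars`, `obs`, `obsm`); the helper name `summarize_experiment` is hypothetical and is not part of pysodb.

```python
def summarize_experiment(adata):
    """Collect basic facts about an AnnData-like object into a plain dict."""
    return {
        "n_obs": adata.n_obs,                      # number of observations (rows)
        "n_vars": adata.n_vars,                    # number of variables (columns)
        "obs_columns": list(adata.obs.columns),    # per-observation annotations
        "obsm_keys": list(adata.obsm.keys()),      # multi-dimensional annotations
        "has_spatial": "spatial" in adata.obsm,    # whether coordinates are stored
    }


# Usage in a live session, with `adata` loaded as shown above:
# print(summarize_experiment(adata))
```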

### Load a specific dataset

In [6]:
adataset = sodb.load_dataset('hunter2021spatially') 
adataset

load experiment[sample_A] in dataset[hunter2021spatially]
load experiment[sample_C] in dataset[hunter2021spatially]
load experiment[sample_B] in dataset[hunter2021spatially]


{'sample_A': AnnData object with n_obs × n_vars = 2425 × 32268
     obs: 'col_0', 'leiden'
     var: 'gene_ids', 'feature_types', 'genome', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
     uns: 'hvg', 'leiden', 'leiden_colors', 'log1p', 'moranI', 'neighbors', 'pca', 'spatial_neighbors', 'umap'
     obsm: 'X_pca', 'X_umap', 'spatial', 'spatial_pixel', 'spatial_real'
     varm: 'PCs'
     obsp: 'connectivities', 'distances', 'spatial_connectivities', 'spatial_distances',
 'sample_C': AnnData object with n_obs × n_vars = 2677 × 32268
     obs: 'col_0', 'leiden'
     var: 'gene_ids', 'feature_types', 'genome', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
     uns: 'hvg', 'leiden', 'leiden_colors', 'log1p', 'moranI', 'neighbors', 'pca', 'spatial_neighbors', 'umap'
     obsm: 'X_pca', 'X_umap', 'spatial', 'spatial_pixel', 'spatial_real'
     varm: 'PCs'
     obsp: 'connectivities', 'distances', 'spatial_connectivities', 'spatial_distances',
 'sample_B': Ann
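
Because the output above is a dict keyed by experiment name, the experiments can be processed with ordinary dict iteration. The following is a minimal sketch; the helper names `experiment_shapes` and `total_observations` are hypothetical, and they rely only on the `shape` attribute of each AnnData object.

```python
def experiment_shapes(adataset):
    """Map each experiment name to its (n_obs, n_vars) shape."""
    return {name: adata.shape for name, adata in adataset.items()}


def total_observations(adataset):
    """Sum the number of observations across all experiments in the dataset."""
    return sum(n_obs for n_obs, _ in experiment_shapes(adataset).values())


# Usage in a live session, with `adataset` loaded as shown above:
# for name, shape in experiment_shapes(adataset).items():
#     print(name, shape)
# print(total_observations(adataset))
```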