# Demo

## The aim
The purpose of this notebook is to demonstrate the utilities in the simulation framework.

Note that the exact experimental protocol has not been fixed yet, so this notebook will change and should not be treated as final.

## Loading the data and training the model

We first read the data from disk and keep only two cell types, T cells and B cells.

In [None]:
import anndata as ad
import pathlib
import scvi

DATA_PATH = pathlib.Path("../data")


adata = ad.read_h5ad(DATA_PATH / "non_malignant.h5ad")
# Keep only T cells and B cells; .copy() turns the view into an independent object.
adata = adata[adata.obs["celltype"].isin(["Tcells", "Bcells"])].copy()

Now we will train the model:

In [None]:
scvi.model.SCVI.setup_anndata(adata, batch_key="batch")

model = scvi.model.SCVI(adata, n_hidden=64, n_layers=2)
model.train(early_stopping=True, max_epochs=300)
model.save(DATA_PATH / "scvi-model")

If you have already trained and saved the model, you can skip training and load it instead:

In [None]:
scvi.model.SCVI.setup_anndata(adata, batch_key="batch")
model = scvi.model.SCVI.load(DATA_PATH / "scvi-model", adata)
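
The choice between the two cells above can be automated. The following is a minimal sketch, assuming the saved model lives in a directory and that its existence is a good enough signal that training has already happened. `load_or_train`, `toy_load` and `toy_train` are hypothetical names used for illustration; in this notebook the two callables would wrap `scvi.model.SCVI.load` and the training cell, respectively.

```python
import pathlib
import tempfile


def load_or_train(model_dir, load_fn, train_fn):
    """Call load_fn if model_dir exists, otherwise call train_fn (hypothetical helper)."""
    model_dir = pathlib.Path(model_dir)
    if model_dir.is_dir():
        return load_fn(model_dir)
    return train_fn(model_dir)


# Toy stand-ins for the real training and loading steps.
def toy_train(path):
    path.mkdir()  # pretend that saving the model creates the directory
    return "trained"


def toy_load(path):
    return "loaded"


with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "scvi-model"
    first = load_or_train(target, toy_load, toy_train)   # directory missing: train
    second = load_or_train(target, toy_load, toy_train)  # directory present: load
```

Passing the two steps as callables keeps the helper independent of the model class, so the same pattern works if the model or its arguments change.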

The likelihood parameters for three cells (indices 1, 2 and 3), using 10 samples from the posterior, can be retrieved as:

In [None]:
zinb_params = model.get_likelihood_parameters(n_samples=10, indices=[1, 2, 3])
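
To make the role of these parameters concrete, the sketch below draws counts from a zero-inflated negative binomial (ZINB) using NumPy only. It rests on several assumptions: that the returned dictionary exposes `mean`, `dispersions` and `dropout` arrays of equal shape, that the dispersion is the inverse-dispersion parameter of the negative binomial, and that `dropout` is a probability in [0, 1]. `sample_zinb` is a hypothetical helper, not part of scvi-tools, and the toy arrays stand in for real model output.

```python
import numpy as np


def sample_zinb(mean, dispersion, dropout, rng):
    """Draw ZINB counts via a gamma-Poisson mixture (hypothetical helper)."""
    mean = np.asarray(mean, dtype=float)
    dispersion = np.asarray(dispersion, dtype=float)
    dropout = np.asarray(dropout, dtype=float)
    # Negative binomial part: rate ~ Gamma(theta, mean / theta), count ~ Poisson(rate),
    # where theta is the (assumed) inverse dispersion.
    rate = rng.gamma(shape=dispersion, scale=mean / dispersion)
    counts = rng.poisson(rate)
    # Zero inflation: each entry is replaced by 0 with probability `dropout`.
    keep = rng.random(mean.shape) >= dropout
    return counts * keep


rng = np.random.default_rng(0)
# Toy parameters for 3 cells and 5 genes, mimicking the assumed dictionary layout.
toy_params = {
    "mean": np.full((3, 5), 4.0),
    "dispersions": np.full((3, 5), 2.0),
    "dropout": np.full((3, 5), 0.3),
}
toy_counts = sample_zinb(
    toy_params["mean"], toy_params["dispersions"], toy_params["dropout"], rng
)
```

Under these assumptions the expected count is `mean * (1 - dropout)`, which is a quick sanity check to run on simulated data.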

Now we can analyse how the dropout changes with mean, across several cells and all genes:

In [None]:
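# Hand-picked cell positions; these are assumed to point at B cells and T cells
# respectively, which depends on the row order of `adata`.
b_cells_ind = [0, 100, 129, 8]
t_cells_ind = [-4, -3, -1]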


zinb_params_b_cells = model.get_likelihood_parameters(n_samples=1, indices=b_cells_ind)
zinb_params_t_cells = model.get_likelihood_parameters(n_samples=1, indices=t_cells_ind)

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 6))

# Each point is one (cell, gene) pair.
ax.scatter(zinb_params_b_cells["mean"].ravel(), zinb_params_b_cells["dropout"].ravel(), label="B-cells")
ax.scatter(zinb_params_t_cells["mean"].ravel(), zinb_params_t_cells["dropout"].ravel(), label="T-cells")

ax.set_xlabel("mean")
ax.set_ylabel("dropout")
ax.legend()  # display the labels passed to scatter
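
The hard-coded positions in the cell above rely on the row order of `adata`. A possible alternative is to select positions from the cell-type labels directly. The sketch below assumes the labels are available as a NumPy array, for example from `adata.obs["celltype"].to_numpy()`; `pick_cells` is a hypothetical helper and the toy labels are for illustration only.

```python
import numpy as np


def pick_cells(labels, cell_type, n_cells, rng):
    """Return sorted positions of up to n_cells randomly chosen cells with the given label."""
    positions = np.flatnonzero(np.asarray(labels) == cell_type)
    n_pick = min(n_cells, positions.size)
    return np.sort(rng.choice(positions, size=n_pick, replace=False))


# Toy labels standing in for adata.obs["celltype"].to_numpy().
toy_labels = np.array(["Bcells", "Tcells", "Bcells", "Tcells", "Tcells", "Bcells"])
rng = np.random.default_rng(0)
b_idx = pick_cells(toy_labels, "Bcells", 2, rng)
t_idx = pick_cells(toy_labels, "Tcells", 3, rng)
```

The resulting arrays are non-negative integer positions, so they could be passed as `indices` in place of `b_cells_ind` and `t_cells_ind`.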