Fig. X: Title of the Analysis
----
Short description of what is being done here. 

# Preliminaries

## Dependency notebooks

TODO: link here to any notebooks that must be run, in the correct order, before this one
(e.g. data preprocessing or data generation).
If there are none, remove this section.

## Import packages

In [20]:
# import standard packages
import scanpy as sc
import scanpy.external as sce
import scvelo as scv
import cellrank as cr

import numpy as np
import pandas as pd 

import matplotlib.pyplot as plt
import seaborn as sns

import os
import sys

# set verbosity levels
cr.settings.verbosity = 2
scv.settings.verbosity = 3 

## Print package versions for reproducibility

To reproduce the results shown here exactly, make sure that your package versions match those printed below.

In [2]:
cr.logging.print_versions()

cellrank==1.0.0-rc.12 scanpy==1.6.0 anndata==0.7.4 numpy==1.19.2 numba==0.51.2 scipy==1.5.2 pandas==1.1.3 scikit-learn==0.23.2 statsmodels==0.12.0 python-igraph==0.8.3 scvelo==0.2.2 pygam==0.8.0 matplotlib==3.3.2 seaborn==0.11.0


## Set up paths

Define the paths to load data, cache results and write figure panels.

TODO for Mike

Import the paths to the top-level folders.

In [25]:
sys.path.insert(0, "PATH_TO_ROOT, LIKE '../..'")
from paths import DATA_DIR, CACHE_DIR, FIG_DIR

ModuleNotFoundError: No module named 'paths'
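
The import fails when no `paths` module can be found on `sys.path`. A minimal sketch of what such a module could look like, assuming it sits in the repository root; the file name `paths.py` follows from the import, but the folder names `data`, `cache` and `figures` are placeholders, not the project's actual layout:

```python
# paths.py (hypothetical): assumed to live in the repository root
from pathlib import Path

# Use the folder containing this file as the root. Fall back to the current
# working directory when __file__ is undefined (e.g. inside a notebook cell).
try:
    ROOT = Path(__file__).resolve().parent
except NameError:
    ROOT = Path.cwd()

# Top-level folders; the names are assumptions and should match your project.
DATA_DIR = ROOT / "data"
CACHE_DIR = ROOT / "cache"
FIG_DIR = ROOT / "figures"
```

Because the three names are `pathlib.Path` objects, sub-folders can be appended with `/`, as in the `DATA_DIR / ...` and `CACHE_DIR / ...` expressions used in this notebook.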

Set up the paths to save figures.

In [13]:
scv.settings.figdir = str(FIG_DIR)
sc.settings.figdir = str(FIG_DIR)
cr.settings.figdir = str(FIG_DIR)

## Set up caching

Note: this analysis uses a caching extension called `scachepy`, see [here](https://github.com/theislab/scachepy). Caching shortens the runtime of this notebook by skipping the most expensive computations. Below, we check whether scachepy is installed; if it is not, all results are recomputed automatically.

In [23]:
try:
    import scachepy
    c = scachepy.Cache(CACHE_DIR / ADD_SPECIFIC_PATH)
except ImportError:
    c = None
    
use_caching = c is not None
use_caching

False

## Set global parameters

Set some plotting parameters.

In [17]:
scv.settings.set_figure_params('scvelo', dpi_save=400, dpi=80, transparent=True, fontsize=20, color_map='viridis')

Do we want to write figures?

In [18]:
save_figure = True
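
One way to use this flag is to turn it into the `save` argument of a plotting call. The helper below is a sketch only; `save_name` is a hypothetical name introduced for illustration and is not part of scVelo or CellRank:

```python
from typing import Optional


def save_name(fname: str, enabled: bool) -> Optional[str]:
    """Return `fname` if figures should be written, otherwise None."""
    return fname if enabled else None


# Hypothetical usage, relying on scVelo plotting functions accepting
# `save=None` (do not save) or a file name:
# scv.pl.scatter(adata, save=save_name("umap_clusters.pdf", save_figure))
```

With this pattern, switching `save_figure` to `False` disables all figure output without touching the individual plotting cells.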

## Define utility functions

Any utility functions you may need in this notebook go here. 

## Load the data

Load the AnnData object; a raw copy is created and pre-processed in the next section.

In [24]:
adata = cr.datasets.pancreas(DATA_DIR / ADD_SPECIFIC_PATH)
adata

NameError: name 'ADD_SPECIFIC_PATH' is not defined

# Main analysis part 1

Structure your analysis by sections and subsections. Often, the first part of the analysis will be about pre-processing the data and computing velocities. 

## Pre-process the data

Often, we want to process the raw data a bit differently than the data in `.X`.

### Raw data (for plotting)

In [None]:
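# work on a copy so that the original `adata` is left untouched
adata_raw = adata.copy()
# drop genes detected in fewer than 10 cells, then normalise and log-transform
sc.pp.filter_genes(adata_raw, min_cells=10)
scv.pp.normalize_per_cell(adata_raw)
sc.pp.log1p(adata_raw)

# annotate highly variable genes, but don't filter them out
sc.pp.highly_variable_genes(adata_raw)
print(f"This detected {np.sum(adata_raw.var['highly_variable'])} highly variable genes.")
# store the processed copy in `.raw` (used for plotting)
adata.raw = adata_raw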

### Data for velocity computation

In [None]:
# filter, normalise, log transform
scv.pp.filter_and_normalize(adata, min_shared_counts=20, log=True, n_top_genes=2000)

# compute pca, knn graph and scvelo's moments
sc.tl.pca(adata)
sc.pp.neighbors(adata, n_pcs=30, n_neighbors=30)
scv.pp.moments(adata, n_pcs=None, n_neighbors=None)

## Visualize annotations

Often, we want to show UMAP plots of cluster annotations or marker genes here.
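
A possible cell for this step is sketched below. It assumes that `adata.obsm['X_umap']` already exists and that the keys passed in are valid `.obs` columns or gene names for your dataset; `plot_annotations` is a hypothetical helper name:

```python
def plot_annotations(adata, keys, basis="umap", save=None):
    """Scatter plot of the embedding, with one panel per entry in `keys`."""
    # Imported lazily so the helper can be defined before scVelo is loaded.
    import scvelo as scv

    # `keys` may mix .obs columns (e.g. cluster labels) and gene names.
    scv.pl.scatter(adata, basis=basis, color=list(keys), save=save)


# Hypothetical usage; "clusters" is an assumed .obs column name:
# plot_annotations(adata, keys=["clusters"])
```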

## Compute velocities using scVelo

In [None]:
# recover the dynamical model parameters (or load them from cache), then compute velocities
if use_caching:
    c.tl.recover_dynamics(adata, fname='recover_dynamics', force=False)
else:
    scv.tl.recover_dynamics(adata)
    
scv.tl.velocity(adata, mode='dynamical')

## Project velocities onto the embedding

Often, we want to show velocities in the UMAP embedding here.
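
A possible cell for this step is sketched below, assuming velocities have been computed as above and that a UMAP embedding is stored in `adata.obsm['X_umap']`; `plot_velocity_stream` is a hypothetical helper name:

```python
def plot_velocity_stream(adata, basis="umap", color=None, save=None):
    """Compute the velocity graph and draw velocities as streamlines."""
    # Imported lazily so the helper can be defined before scVelo is loaded.
    import scvelo as scv

    # The velocity graph is needed to project velocities onto the embedding.
    scv.tl.velocity_graph(adata)
    scv.pl.velocity_embedding_stream(adata, basis=basis, color=color, save=save)


# Hypothetical usage; "clusters" is an assumed .obs column name:
# plot_velocity_stream(adata, color="clusters")
```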

# Main analysis part 2

Typically, the second part of the analysis will involve CellRank. 