# Plain Jupyter Notebook in Python

## Introduction
This notebook helps you get started quickly with FASTGenomics in general and with this environment in particular. The environment reproduces the [single-cell tutorial](https://github.com/theislab/single-cell-tutorial) image presented in [Luecken & Theis (2019)](https://www.embopress.org/doi/full/10.15252/msb.20188746).

### Data loading
Data from the FASTGenomics database is loaded with the `fgread` package, which imports the data as AnnData objects. If you attached multiple datasets to this analysis, the variable `dsets` will have more than one element. Note that, for technical reasons, the first element is accessed via `dsets[1]`, not `dsets[0]`.
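
The indexing convention can be illustrated with a small sketch. It uses a plain dictionary named `dsets_example` as a hypothetical stand-in and assumes that `dsets` behaves like a mapping keyed by integers starting at 1; the object actually returned by `fgread` may differ in detail.

```python
# Hypothetical stand-in for `dsets`: a plain dict keyed by 1-based integers.
# The real object returned by fgread may behave differently.
dsets_example = {1: "first dataset", 2: "second dataset"}

# The first element lives under key 1; key 0 does not exist.
first = dsets_example[1]

# Iterate over all attached datasets in key order.
for key in sorted(dsets_example):
    print(key, dsets_example[key])
```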

### `Scanpy` environment
The `scanpy` environment includes many packages used for preprocessing and downstream analysis of single-cell data. In addition to `scanpy` and `anndata`, which provide the basic functionality, you can use
* `scikit-learn` for machine learning methods
* `louvain` for Louvain clustering
* `MulticoreTSNE` for t-SNE dimensionality reduction
* `python-igraph` for complex network analysis
* `fa2` for visualization using Gephi's ForceAtlas2
* `statsmodels` for statistical computations complementary to `scipy`
* and many more.

### R support via `rpy2`
This environment also supports R through the `rpy2` package. You can therefore run R code in this notebook; just start the cell with the cell magic `%%R`. R libraries available in this environment include:
* `BiocManager` to install Bioconductor packages
* `scran` for preprocessing
* `MAST` for differential expression analysis
* `slingshot` for pseudotime and trajectory inference
* `monocle2` for pseudotime and trajectory inference
* `gam` for finding genes that change over pseudotime
* `RColorBrewer` and `clusterExperiment` for visualization in R
* `ComplexHeatmap` for heatmap plotting

### Add more packages
If a package you need is not available in this environment, you can install it with, e.g., `!pip install [your_package]`. To list the packages that are already installed, run `!conda list`.
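
Before installing anything, it can be useful to check from Python whether a module is already importable. The following is a minimal sketch using only the standard library; the helper name `is_installed` and the second module name are made up for illustration. Keep in mind that the import name can differ from the name used with `pip` (for example, `scikit-learn` is imported as `sklearn`).

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# `json` ships with Python; the second name is a made-up placeholder.
print(is_installed("json"))
print(is_installed("surely_not_a_real_package_xyz"))
```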

## Import packages
Here, we import the most important packages to provide the full functionality of this environment.

In [None]:
import scanpy as sc
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import fgread
import warnings

import rpy2.rinterface_lib.callbacks
import logging
from rpy2.robjects import pandas2ri
import anndata2ri

In [None]:
# Ignore some Python warning messages for better readability
sc.settings.verbosity = 3             # verbosity: errors (0), warnings (1), info (2), hints (3)
warnings.filterwarnings('ignore')     # suppress all Python warnings for better readability
warnings.simplefilter("ignore", category=DeprecationWarning)
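
The settings in the cell above silence warnings for the whole session. If you would rather keep warnings visible in general and suppress them only around a specific call, a scoped alternative is sketched below. It uses only the standard library; the function `noisy` is a hypothetical example, not part of this environment.

```python
import warnings


def noisy():
    """Hypothetical function that emits a deprecation warning."""
    warnings.warn("this API is deprecated", DeprecationWarning)
    return 42


# Inside this block all warnings are ignored; the previous filters
# are restored automatically when the block exits.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore")
    result = noisy()

# For comparison: with the "always" filter the warning is recorded.
with warnings.catch_warnings(record=True) as caught_always:
    warnings.simplefilter("always")
    noisy()

print(result, len(caught), len(caught_always))
```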

### Set up R support

In [None]:
# Ignore R warning messages
# Note: comment out the next line to get more verbose R output
rpy2.rinterface_lib.callbacks.logger.setLevel(logging.ERROR)

# Automatically convert rpy2 outputs to pandas dataframes
pandas2ri.activate()
anndata2ri.activate()
%load_ext rpy2.ipython
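
The effect of `setLevel(logging.ERROR)` can be seen with a standard-library logger. This sketch uses a hypothetical logger name, `example.r_callbacks`, rather than the actual `rpy2` logger, and only demonstrates which message levels pass the threshold.

```python
import logging

# Hypothetical logger used for illustration only.
logger = logging.getLogger("example.r_callbacks")
logger.setLevel(logging.ERROR)

# Messages below ERROR (such as warnings) are dropped; errors still pass.
print(logger.isEnabledFor(logging.WARNING))
print(logger.isEnabledFor(logging.ERROR))
```

Setting the level back to `logging.WARNING` would let warning messages through again.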

In [None]:
%%R
# Load all the R libraries we will be using in the notebook
library(scran)
library(RColorBrewer)
library(gam)
library(ggplot2)
library(plyr)

## Load data
The data can be loaded with `fgread` like this:

In [None]:
import fgread  # FASTGenomics package for reading datasets

In [None]:
# Get the datasets attached to this analysis
dsets = fgread.get_datasets()
dsets

In [None]:
# Load the first dataset into an AnnData object (note the index 1)
adata = fgread.read_datasets(dsets[1])
adata