# FigureOneLabs: Proper acquisition of cell class identity in organoids allows definition of fate specification programs of the human cerebral cortex (Uzquiano, 2022)
## Recreating figure one

In [None]:
import os
import numpy as np
import pandas as pd
import scipy
import anndata
import scanpy as sc
import pybiomart
import scvi
import torch
import random
import seaborn as sns

Define a function, `PlotUMAP`, that draws one UMAP plot for each marker in the provided `markers` list. The function takes the following parameters:
* `adata`: an AnnData object.
* `markers`: a list of markers (genes) to be visualised on the UMAP plot.
* `layer`: specifies the layer of data to use, with a default value of `log2_counts_scvi`.
* `size`: sets the size of the points in the plot, with a default value of `2`.
* `vmin` and `vmax`: set the minimum and maximum values for the colour scale. The defaults, `'p0'` and `'p99'`, are percentile strings: the scale runs from the lowest value to the 99th percentile, so a few very high values do not wash out the rest of the plot.

In [None]:
def PlotUMAP(adata, markers, layer='log2_counts_scvi', size=2, vmin='p0', vmax='p99'):
    # Colour map running from light grey (low expression) to red-orange (high expression)
    cmap = sns.blend_palette(['lightgray', sns.xkcd_rgb['red orange']], as_cmap=True)
    # Loop over each marker and call the umap function of the scanpy.plotting module (sc.pl)
    for marker in markers:
        sc.pl.umap(adata,  # The AnnData object to be plotted
                   color=marker,  # Colours the UMAP plot points according to the expression of the current marker
                   layer=layer,  # Layer of data for the plot
                   size=size,  # Size of points in the plot
                   cmap=cmap,  # Colour map for the plot
                   vmin=vmin, vmax=vmax)  # Minimum and maximum values for the colour scale
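
To make the percentile-style colour limits concrete, here is a minimal sketch of how a string such as `'p99'` can be turned into a numeric limit. The helper `resolve_color_limit` and the simulated `expression` values are hypothetical and only illustrate the idea; this is not scanpy's internal code.

```python
import numpy as np

def resolve_color_limit(values, limit):
    """Hypothetical helper: turn a limit such as 'p99' into a number.

    Strings starting with 'p' are read as percentiles of `values`;
    anything else is treated as a plain numeric limit.
    """
    if isinstance(limit, str) and limit.startswith('p'):
        return float(np.percentile(values, float(limit[1:])))
    return float(limit)

# Simulated expression values (an assumption, not real data) with a few extreme outliers
rng = np.random.default_rng(0)
expression = rng.gamma(shape=2.0, scale=1.0, size=1000)
expression[:5] = 50.0

vmin = resolve_color_limit(expression, 'p0')   # lowest value
vmax = resolve_color_limit(expression, 'p99')  # 99th percentile, below the outliers

# Values outside [vmin, vmax] are drawn with the colours at the ends of the scale
clipped = np.clip(expression, vmin, vmax)
print(f"vmin={vmin:.2f}, vmax={vmax:.2f}, max before clipping={expression.max():.2f}")
```

With `vmax='p99'`, the five outliers share the top colour instead of stretching the scale, so differences among the remaining cells stay visible.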

Generate a dot plot showing the expression of each marker gene in the `markers` list across the clusters identified by Leiden clustering (`leiden_scvi`). Then create a UMAP plot for each marker gene, using the `log2_counts_scvi` layer. The dot plot summarises expression per cluster, while the UMAP plots show where each gene is expressed in the reduced-dimensional embedding. Together they help assign cell-type identities to the clusters.

In [None]:
# Dataset1-MarkerGeneExpression.tif

# List of gene markers - each element in the list is a gene identifier
markers = ['EMX1','SOX2','PAX3','INSM1','EOMES',
           'PAX2','LAMP5','NEFL','FOXP2','WLS',
           'GDF7','FOXD3','SIX1','ISL1','TWIST1']
# Create a dotplot showing the expression of multiple genes across different clusters
# groupby="leiden_scvi" specifies how to group the data for the dotplot - each cluster will be represented as a separate group in the dot plot
sc.pl.dotplot(adata, markers, groupby='leiden_scvi')
# Use the PlotUMAP function to create a UMAP plot for each marker gene
PlotUMAP(adata, markers, layer='log2_counts_scvi', size=5)
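
As a rough guide to reading the dot plot, the sketch below computes the two per-cluster summaries that a dot plot typically encodes: mean expression (dot colour) and the fraction of cells with non-zero expression (dot size). The toy values and cluster labels are hypothetical and stand in for the real `adata`; the code is plain pandas, not the scanpy implementation.

```python
import pandas as pd

# Hypothetical toy data: 6 cells, 2 marker genes
expr = pd.DataFrame({
    'SOX2':  [2.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    'EOMES': [0.0, 0.0, 0.0, 3.0, 1.0, 2.0],
})
# Hypothetical cluster label for each cell, standing in for 'leiden_scvi'
clusters = pd.Series(['0', '0', '0', '1', '1', '1'], name='leiden_scvi')

# Dot colour: mean expression of each gene within each cluster
mean_expression = expr.groupby(clusters).mean()

# Dot size: fraction of cells in each cluster with expression above zero
fraction_expressing = (expr > 0).groupby(clusters).mean()

print(mean_expression)
print(fraction_expressing)
```

In this toy example `SOX2` is detected only in cluster `0` and `EOMES` only in cluster `1`, which is the kind of pattern that marks a gene as cluster-specific in the real dot plot.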