
Author: Erno Hänninen

Created: 10.12.2022

Title: AnnotateZhouData.ipynb

Description:
- Notebook that processes raw 10x count data using the corresponding processed and integrated dataset as a reference. The integrated dataset described in the Zhou paper is not publicly available, so a corresponding in-house version was used instead.

Procedure
- Read both raw and processed data
- From the raw data, filter out cells that do not occur in the reference
- Copy the cell type annotation column from the reference to the raw data
- Write the annotated data to file
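
The procedure above can be illustrated on toy data. This is a minimal sketch that uses plain pandas objects in place of the AnnData objects; the barcodes, gene names, cell types, and the names `reference_obs` and `raw_counts` are made up for illustration and do not come from the Zhou data.

```python
import pandas as pd

# Hypothetical reference annotation, indexed by cell barcode
reference_obs = pd.DataFrame(
    {"ident": ["HSC", "EC", "HSC"]},
    index=["AAAC-1", "AAAG-1", "AATT-1"],
)

# Hypothetical raw count matrix (cells x genes); "CCCC-1" is absent from the reference
raw_counts = pd.DataFrame(
    [[1, 0], [0, 3], [2, 2], [5, 1]],
    index=["AAAG-1", "CCCC-1", "AAAC-1", "AATT-1"],
    columns=["GeneA", "GeneB"],
)

# Keep only cells that occur in both datasets
shared_cells = reference_obs.index.intersection(raw_counts.index)
filtered_counts = raw_counts.loc[shared_cells]

# Look up the annotation by barcode for the retained cells
cell_types = reference_obs.loc[shared_cells, "ident"]
```

Selecting by barcode label rather than by position means the two datasets do not need to list their cells in the same order.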
    

Usage:
- This script is launched and parameterized from the pipeline (data_processing_wf.nf)
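
The variables `annotated_data_path` and `raw_read_path` are never assigned in the notebook itself, so they are presumably injected by the pipeline. A guard along these lines could fail early with a clear message when one is missing. This is only a sketch: the helper `require_params` is hypothetical and not part of the pipeline, and the paths in the example are placeholders.

```python
def require_params(namespace, names):
    """Return the requested parameters, raising if any is undefined."""
    missing = [name for name in names if name not in namespace]
    if missing:
        raise NameError(f"Missing notebook parameters: {', '.join(missing)}")
    return {name: namespace[name] for name in names}


# In the notebook this would be called with globals(); a dict stands in here.
# The path values are placeholders, not the real pipeline inputs.
example_namespace = {"annotated_data_path": "reference.rds", "raw_read_path": "raw/"}
params = require_params(example_namespace, ["annotated_data_path", "raw_read_path"])
```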


In [None]:
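import os

# Limit the number of threads used by the numerical backends.
# Set these before importing scanpy so the libraries pick them up when they load.
os.environ["MKL_NUM_THREADS"] = "15"
os.environ["NUMEXPR_NUM_THREADS"] = "15"
os.environ["OMP_NUM_THREADS"] = "15"

import scanpy as sc
import scib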

## Load and explore the integrated Zhou data


In [183]:
# Read the annotated reference data (annotated by Yuan)
adata_processed = scib.pp.read_seurat(annotated_data_path)

In [None]:
adata_processed

In [None]:
# Plot the cell types and time points
sc.pl.umap(adata_processed, color="ident")
sc.pl.umap(adata_processed, color="cellBatch")

## Filter, annotate and process raw data

In [189]:
# Read the raw matrices
adata_raw = sc.read_10x_mtx(raw_read_path, prefix="GSE169109_") #.gz files
adata_raw

In [191]:
# Raw data processing
# Gene filtering: keep only genes present in both datasets
shared_genes = adata_processed.var_names.intersection(adata_raw.var_names)
adata_raw = adata_raw[:, shared_genes].copy()

In [None]:
# Cell filtering: keep only cells present in both datasets
shared_cells = adata_processed.obs_names.intersection(adata_raw.obs_names)

adata_raw = adata_raw[shared_cells, :].copy()

adata_filtered = adata_raw.copy()

In [None]:
# Copy the cell type and sample annotations from the reference (matched on cell barcodes)
adata_filtered.obs["Cell_types"] = adata_processed.obs["ident"]
adata_filtered.obs["sample"] = adata_processed.obs["cellBatch"]
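
The two assignments above rely on pandas matching rows by index label rather than by position. A minimal sketch of that behaviour follows, with a check that every cell received an annotation; the barcodes and cell types are made up, and `reference_ident` and `filtered_obs` are hypothetical stand-ins for `adata_processed.obs["ident"]` and `adata_filtered.obs`.

```python
import pandas as pd

# Hypothetical annotation from the reference, in reference order
reference_ident = pd.Series(
    ["HSC", "EC", "HSC"], index=["AAAC-1", "AAAG-1", "AATT-1"], name="ident"
)

# Hypothetical obs table of the filtered raw data, in a different order
filtered_obs = pd.DataFrame(index=["AATT-1", "AAAC-1", "AAAG-1"])

# Column assignment aligns on the index, so order differences are harmless
filtered_obs["Cell_types"] = reference_ident

# A barcode missing from the reference would yield NaN; fail early if so
n_missing = int(filtered_obs["Cell_types"].isna().sum())
assert n_missing == 0, f"{n_missing} cells have no annotation"
```

Because the raw data was already restricted to cells shared with the reference, the check is expected to pass; it would only fail if the cell filtering step were skipped or the barcodes were formatted differently in the two datasets.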

In [None]:
adata_filtered

In [None]:
#Write the processed data to file
adata_filtered.write("Processed_zhou_adata.h5ad")