## Analysis of Sequencing Data

**This notebook analyzes the CBA spiral ganglion sequencing data for three age groups: 3, 12, and 24 months.**<br>
The analysis covers quality control, clustering, and annotation. The clustering and cell type annotation from an earlier analysis have been preserved and were used as a guide.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scanpy as sc
import anndata as ad
import scipy.stats as st
import sys, os
import seaborn as sns
from matplotlib import rc_context

pd.set_option('display.max_columns', None)

sc.settings.set_figure_params(
    dpi=90,
    facecolor="white",
    color_map= "viridis_r")

### Reading Data

We read the data from three .h5ad files, one per age group. Each file contains the raw count matrix with some annotations and metadata.

The following metadata are carried over:
- plate ID (a unique barcode for each 384-well plate sequenced)
- age of animals

In [4]:
# Reading data
adatas = {"adata_3mo": sc.read_h5ad(r"C:\Users\Johann\Documents\Coding\Python\Master Thesis\Data\CBA_3mo_SGN.h5ad"),
          "adata_12mo": sc.read_h5ad(r"C:\Users\Johann\Documents\Coding\Python\Master Thesis\Data\CBA_12mo_SGN.h5ad"),
          "adata_24mo": sc.read_h5ad(r"C:\Users\Johann\Documents\Coding\Python\Master Thesis\Data\CBA_24mo_SGN.h5ad")}

# Concatenating all age groups
adata_original = ad.concat([adatas["adata_3mo"], adatas["adata_12mo"], adatas["adata_24mo"]],
                           label= "age",
                           keys= ["3 months", "12 months", "24 months"])

# Making a clean copy ...
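adata = ad.AnnData(adata_original.X.copy())
# Keep the gene names; AnnData(X) alone would label the genes "0", "1", ...
adata.var_names = adata_original.var_names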

# ... and preserving some of the metadata
adata.obs["plate"] = adata_original.obs["plate"].values
adata.obs["age"] = adata_original.obs["age"].values

adata

AnnData object with n_obs × n_vars = 8617 × 20454
    obs: 'plate', 'age'
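
The read step above uses absolute paths that only exist on one machine. As a minimal sketch, assuming all three files sit in a single directory, the paths could instead be built from one base folder. `DATA_DIR` and `ages_in_months` are hypothetical names introduced here for illustration; only the file name pattern is taken from the notebook.

```python
from pathlib import Path

# Hypothetical data directory; adjust to wherever the .h5ad files are stored.
DATA_DIR = Path("Data")

# File names follow the pattern used in the read step: CBA_<age>mo_SGN.h5ad
ages_in_months = [3, 12, 24]
paths = {
    f"adata_{age}mo": DATA_DIR / f"CBA_{age}mo_SGN.h5ad"
    for age in ages_in_months
}

for key, path in paths.items():
    print(key, "->", path)
```

Each entry of `paths` could then be passed to `sc.read_h5ad`, which keeps the dictionary keys identical to the ones used for `adatas`.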

### Quality Control

We calculate quality control metrics.
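
To make concrete what per-cell quality control metrics typically look like, here is a minimal sketch on a toy count matrix. The counts and gene names are made up, and the choice of metrics (total counts, number of detected genes, percentage of mitochondrial counts) is an assumption about common practice, not necessarily the set used in this analysis. In scanpy, `sc.pp.calculate_qc_metrics` reports comparable values as `total_counts`, `n_genes_by_counts`, and, when a boolean `mt` column in `adata.var` is passed via `qc_vars`, `pct_counts_mt`.

```python
import numpy as np

# Toy raw count matrix: 4 cells (rows) x 5 genes (columns). Values are made up.
counts = np.array([
    [10, 0, 3, 0, 2],
    [0, 0, 1, 0, 0],
    [5, 5, 5, 5, 5],
    [0, 2, 0, 8, 10],
])

# Hypothetical gene names; mouse mitochondrial genes carry the "mt-" prefix.
gene_names = ["Gata3", "Prph", "Tubb3", "mt-Co1", "mt-Nd1"]
is_mt = np.array([name.startswith("mt-") for name in gene_names])

# Total number of counts per cell.
total_counts = counts.sum(axis=1)

# Number of genes with at least one count per cell.
n_genes = (counts > 0).sum(axis=1)

# Percentage of each cell's counts that come from mitochondrial genes.
pct_mt = 100 * counts[:, is_mt].sum(axis=1) / total_counts

for i in range(counts.shape[0]):
    print(f"cell {i}: total={total_counts[i]}, genes={n_genes[i]}, pct_mt={pct_mt[i]:.1f}")
```

Cells with very few counts or detected genes, or with a high mitochondrial percentage, are the usual candidates for closer inspection before filtering.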

In [None]:
# QC