# Clustering by RNA: Integration

*See the original hypothesis in `notebooks/cluster-rna.ipynb`*

Integrating data from [Poscablo 2024](https://doi.org/10.1016/j.cell.2024.04.018) with an [internal perinatal nicotine exposure dataset (dubbed PNE)](https://cells.ucsc.edu/?ds=mouse-pne-project+bonemarrow) may provide more insight into the original hypothesis.

**Hypothesis**: When young and old HSCs are compared, old HSCs form distinct subclusters that may indicate whether a given old HSC will differentiate along the known hematopoiesis tree or become a non-canonical MKP.

In [1]:
# setup: imports

import scanpy as sc
from matplotlib.figure import Figure

from pathlib import Path
from typing import Any
import pyrootutils

# setup: path constants

ROOT = pyrootutils.setup_root(Path.cwd(), indicator='.git')
DATA = ROOT / "data"
FIGURES = ROOT / "figures/cluster-rna-integrated"

POSCABLO_PATH = DATA / "rna_annotation_normalized.h5ad"
PNE_PATH = DATA / "pne-bonemarrow.h5ad"

# setup: verify paths

for path in POSCABLO_PATH, PNE_PATH:
    if not path.exists():
        raise FileNotFoundError(f"Could not find required input: {path}")

FIGURES.mkdir(parents=True, exist_ok=True)
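
The path check above stops at the first missing file. As a minimal sketch of an alternative, the helpers below collect every missing input and report them in a single error, which can save a rerun when both datasets are absent. The names `find_missing` and `require_inputs` are hypothetical and are not part of the notebook or of any library.

```python
from pathlib import Path


def find_missing(paths: list[Path]) -> list[Path]:
    """Return every path in `paths` that does not exist on disk."""
    return [p for p in paths if not p.exists()]


def require_inputs(paths: list[Path]) -> None:
    """Raise a single FileNotFoundError naming all missing inputs."""
    missing = find_missing(paths)
    if missing:
        listing = ", ".join(str(p) for p in missing)
        raise FileNotFoundError(f"Could not find required input(s): {listing}")
```

In this notebook the call would be `require_inputs([POSCABLO_PATH, PNE_PATH])`, placed where the verification loop currently sits.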