# Splitting of HLCA by compartment(s)

Here we'll split the HLCA into three subsets for figure generation, i.e. into epithelial, immune, and endothelial-stromal sub-HLCAs:

### Load modules and set paths:

In [1]:
import scanpy as sc
import os
import matplotlib.pyplot as plt

for pretty code formatting:

In [2]:
%load_ext lab_black

In [3]:
path_HLCA = "../../data/HLCA_core_h5ads/HLCA_v2.h5ad"
dir_HLCA_subsets = "../../data/HLCA_core_h5ads/HLCA_subsets/"

### Split atlas, re-calculate umaps, and store:

In [4]:
adata = sc.read(path_HLCA)

set mapping of clusters (`leiden_1` labels) to compartments:

In [5]:
cl2comp = {"0": "epithelial", "1": "immune", "2": "endothelial_and_stromal"}
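
Before splitting, it can help to confirm that the mapping covers every cluster label actually present in the atlas. The following is a minimal sketch, not part of the original workflow: `check_cluster_mapping` is a hypothetical helper, and the toy labels stand in for `adata.obs["leiden_1"]`.

```python
def check_cluster_mapping(cluster_labels, mapping):
    """Compare observed cluster labels with the keys of a cluster-to-compartment mapping.

    Returns (unmapped, unused): labels seen in the data without a compartment,
    and mapping keys that never occur in the data.
    """
    observed = {str(label) for label in cluster_labels}
    unmapped = sorted(observed - set(mapping))
    unused = sorted(set(mapping) - observed)
    return unmapped, unused


# toy stand-ins (assumptions for illustration only)
cl2comp_example = {"0": "epithelial", "1": "immune", "2": "endothelial_and_stromal"}
toy_labels = ["0", "1", "1", "2", "3"]

unmapped, unused = check_cluster_mapping(toy_labels, cl2comp_example)
print(unmapped, unused)  # ['3'] []
```

In the notebook this would presumably be called as `check_cluster_mapping(adata.obs["leiden_1"], cl2comp)`; any label listed under `unmapped` would be silently left out of all three subsets.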

initiate dictionary to store the atlas subsets in:

In [6]:
subadatas = dict()

Now subset to each of the specified groups using the clusters, re-calculate the neighbor graph (based on the scANVI embedding) and the UMAPs, then store:

In [7]:
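# make sure the output directory exists before writing
os.makedirs(dir_HLCA_subsets, exist_ok=True)
for cl_number, comp_name in cl2comp.items():
    # select the cells of this leiden_1 cluster
    subadata = adata[adata.obs.leiden_1 == cl_number, :].copy()
    # rebuild the neighbor graph on the scANVI embedding, then the UMAP
    sc.pp.neighbors(subadata, n_neighbors=15, use_rep="X_scanvi_emb")
    sc.tl.umap(subadata)
    # keep the new UMAP under a dedicated key as well
    subadata.obsm["X_umap_scanvi"] = subadata.obsm["X_umap"]
    subadatas[comp_name] = subadata
    subadata.write(os.path.join(dir_HLCA_subsets, f"HLCA_{comp_name}.h5ad"))
    # drop the loop variable; the subset stays available in subadatas
    del subadata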
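
As a sanity check after the split, the number of cells per compartment can be derived directly from the cluster labels and compared with the size of each stored subset. This is a minimal sketch under stated assumptions: `cells_per_compartment` is a hypothetical helper, and the toy labels and mapping below stand in for `adata.obs["leiden_1"]` and `cl2comp`.

```python
from collections import Counter


def cells_per_compartment(cluster_labels, mapping):
    """Count cells per compartment, ignoring labels that are not in the mapping."""
    counts = Counter(
        mapping[str(label)] for label in cluster_labels if str(label) in mapping
    )
    return dict(counts)


# toy stand-ins (assumptions for illustration only)
mapping_example = {"0": "epithelial", "1": "immune", "2": "endothelial_and_stromal"}
toy_labels = ["0", "0", "1", "2", "2", "2"]

expected_counts = cells_per_compartment(toy_labels, mapping_example)
print(expected_counts)
```

In the notebook, the result of `cells_per_compartment(adata.obs["leiden_1"], cl2comp)` could be compared with `{name: sub.n_obs for name, sub in subadatas.items()}`; the two dictionaries should agree if the split worked as intended.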