## Stitch HCA_F_RepTsp13902013 & HCA_F_RepTsp13902014 (Uterovaginal canal)

In [None]:
import scanpy as sc

Nadav did a cool thing and stitched together two separate Visium slides of the uterovaginal canal. He did all the image lifting:
- put together the two images into one
- the scale factors were exactly the same (likely as the result of the same exact zoom being used when taking the slide pictures), so there was no need to mess with the source images on that axis
- once the images were put together, he corrected the underlying spot coordinates so they're all compatible

The files exist in two separate Visium mapping folders, with an extra `spatial_0` folder holding the new stuff. And now we need to join them somehow!

We can't exactly just `sc.read_visium()` the thing as that specifically requires `.h5` formatted matrices. However, we can emulate the final outcome of `sc.read_visium()` by importing the two count matrices separately, merging them, and then adding the stitched-together image and coordinates into the necessary spatial slots of the object. Start with the count matrices!
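For orientation, here is a rough sketch of the parts of the `sc.read_visium()` layout that the rest of this notebook rebuilds by hand. Plain dicts stand in for `adata.uns` and `adata.obsm`, and every shape and number is a placeholder rather than a value from these samples.

```python
import numpy as np

library_id = "joint"  # arbitrary label, the same one used later in the notebook

# Placeholder image and coordinates; the real ones come from the stitched files.
hires_image = np.zeros((10, 10, 3))
spot_coords = np.zeros((4, 2))

# Stand-in for adata.uns: image and scale factors live under the library ID.
uns = {
    "spatial": {
        library_id: {
            "images": {"hires": hires_image},
            "scalefactors": {
                "tissue_hires_scalef": 1.0,    # placeholder value
                "spot_diameter_fullres": 1.0,  # placeholder value
            },
        }
    }
}

# Stand-in for adata.obsm: one coordinate pair per spot.
obsm = {"spatial": spot_coords}
```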

### 1. Import raw count matrices per sample

In [None]:
HCA_F_RepTsp13902013 = sc.read_10x_h5("/nfs/team292/vl6/FetalReproductiveTract/VISIUM/data/HCA_F_RepTsp13902013/raw_feature_bc_matrix.h5")
HCA_F_RepTsp13902013.var_names_make_unique()
HCA_F_RepTsp13902014 = sc.read_10x_h5("/nfs/team292/vl6/FetalReproductiveTract/VISIUM/data/HCA_F_RepTsp13902014/raw_feature_bc_matrix.h5")
HCA_F_RepTsp13902014.var_names_make_unique()

In [None]:
HCA_F_RepTsp13902013.obs_names = ["HCA_F_RepTsp13902013_"+i for i in HCA_F_RepTsp13902013.obs_names]
HCA_F_RepTsp13902013.obs["sample"] = "HCA_F_RepTsp13902013"
HCA_F_RepTsp13902014.obs_names = ["HCA_F_RepTsp13902014_"+i for i in HCA_F_RepTsp13902014.obs_names]
HCA_F_RepTsp13902014.obs["sample"] = "HCA_F_RepTsp13902014"
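The prefixing matters because two Visium slides can share barcode strings, so the raw barcodes alone would collide once the objects are merged. A minimal sketch with toy barcodes; `prefix_barcodes` and the sample names are made up for illustration.

```python
def prefix_barcodes(barcodes, sample_id):
    """Return the barcodes with the sample ID prepended (hypothetical helper)."""
    return [f"{sample_id}_{bc}" for bc in barcodes]


# Toy barcodes: "AAAC-1" appears in both samples.
s1 = prefix_barcodes(["AAAC-1", "AAAG-1"], "sampleA")
s2 = prefix_barcodes(["AAAC-1", "AAAT-1"], "sampleB")

combined = s1 + s2
# After prefixing, every barcode is unique across the two samples.
assert len(combined) == len(set(combined))
```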

### 2. Concatenate anndata objects

In [None]:
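# Note: AnnData.concatenate() is deprecated in recent anndata releases in favour of anndata.concat().
# index_unique=None keeps the already-prefixed barcodes untouched.
adata = HCA_F_RepTsp13902013.concatenate(HCA_F_RepTsp13902014, index_unique=None)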
adata

Start preparing the spatial stuff in the object, mirroring the `sc.read_visium()` source code in terms of import and storage and whatnot.

Create the base slots for the thing. The chosen library ID doesn't really matter.

In [None]:
adata.uns["spatial"] = dict()
library_id = "joint"
adata.uns["spatial"][library_id] = dict()

The image is imported like so. Use the stitched together one Nadav provided.

In [None]:
from matplotlib.image import imread

adata.uns["spatial"][library_id]['images'] = dict()
adata.uns["spatial"][library_id]['images']["hires"] = imread("HCA_F_RepTsp13902013_14_tissue_hires_image.png")

Import the scale factor JSON. This is technically marginally different syntax than the scanpy code but it gets the job done.

As a reminder, this is consistent across both images, so use whichever one and it's ok.

In [None]:
import json

with open("/nfs/team292/vl6/FetalReproductiveTract/VISIUM/data/HCA_F_RepTsp13902013/spatial/scalefactors_json.json", "r") as fid:
    adata.uns["spatial"][library_id]['scalefactors'] = json.load(fid)
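To see why the shared scale factors matter: as far as we understand the scanpy plotting code, full-resolution spot coordinates get multiplied by `tissue_hires_scalef` to land on the hires image. The sketch below uses made-up numbers in the same shape as `scalefactors_json.json`; none of them come from these samples.

```python
import json

import numpy as np

# Made-up scale factors with the same keys as scalefactors_json.json.
scalefactors = json.loads(
    '{"spot_diameter_fullres": 100.0, "tissue_hires_scalef": 0.1, '
    '"fiducial_diameter_fullres": 150.0, "tissue_lowres_scalef": 0.03}'
)

# Made-up full-resolution pixel coordinates, one row per spot.
fullres_xy = np.array([[1000.0, 2000.0], [1500.0, 2500.0]])

# Positions of the same spots on the hires image.
hires_xy = fullres_xy * scalefactors["tissue_hires_scalef"]
print(hires_xy)
```

If the two slides had different scale factors, a single value could not place both sets of spots correctly on the stitched image.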

And now for the fun part - the spatial coordinates, which are present in two separate files. Import the two separate files.

In [None]:
import pandas as pd

p1 = pd.read_csv("HCA_F_RepTsp13902013_tissue_positions_list.csv", header=None)
p2 = pd.read_csv("HCA_F_RepTsp13902014_tissue_positions_list.csv", header=None)

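Name the columns appropriately and set the barcodes as the index. The index has to match the sample-prefixed barcodes in the object; the code below uses the barcode column as-is, so it relies on the stitched positions files already carrying the sample ID prefix.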

In [None]:
p1.columns = [
    'barcode',
    'in_tissue',
    'array_row',
    'array_col',
    'pxl_col_in_fullres',
    'pxl_row_in_fullres',
]

p1.index = p1['barcode']

p2.columns = [
    'barcode',
    'in_tissue',
    'array_row',
    'array_col',
    'pxl_col_in_fullres',
    'pxl_row_in_fullres',
]

p2.index = p2['barcode']
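The naming and indexing can also be folded into the read itself, which avoids repeating the column list per sample. This is only a sketch on a toy CSV: `read_positions` is a hypothetical helper, and the optional `sample_id` prefix is only needed if the positions file holds bare barcodes.

```python
import io

import pandas as pd

POSITION_COLUMNS = [
    "barcode", "in_tissue", "array_row", "array_col",
    "pxl_col_in_fullres", "pxl_row_in_fullres",
]


def read_positions(path_or_buffer, sample_id=None):
    """Read a headerless positions CSV and index it by barcode (hypothetical helper)."""
    df = pd.read_csv(path_or_buffer, header=None, names=POSITION_COLUMNS)
    if sample_id is not None:
        # Prefix the barcodes so they match the prefixed obs_names of the object.
        df["barcode"] = sample_id + "_" + df["barcode"]
    df.index = df["barcode"]
    return df


# Toy file with two spots; the values are made up.
toy_csv = io.StringIO("AAAC-1,1,0,0,10,20\nAAAG-1,0,0,2,30,40\n")
toy = read_positions(toy_csv, sample_id="sampleA")
```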


In [None]:
p1.head()

In [None]:
p1.shape

### 3. Read the overlapping barcodes and add as metadata

In [None]:
with open("HCA_F_RepTsp13902013_14_overlapping_barcodes.json", "r") as f:
    overlapping_barcodes = json.load(f)

In [None]:
len(overlapping_barcodes)

In [None]:
p1['overlaps_with'] = None

In [None]:
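# Record the overlapping partner barcode for each spot that has one.
for i in p1.index:
    if i in overlapping_barcodes:  # dict membership test, no need to build a list each time
        p1.loc[i, 'overlaps_with'] = overlapping_barcodes[i]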

In [None]:
# Reverse the mapping for the second sample; this assumes each barcode has at most one partner.
overlapping_barcodes_reverse = {v: k for k, v in overlapping_barcodes.items()}

In [None]:
p2['overlaps_with'] = None

In [None]:
# Same as above, from the second sample's point of view.
for i in p2.index:
    if i in overlapping_barcodes_reverse:
        p2.loc[i, 'overlaps_with'] = overlapping_barcodes_reverse[i]
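The same lookup can be done without explicit loops via `Series.map`, which takes a dict and leaves NaN where a barcode has no partner. A minimal sketch on toy tables; the barcodes and the `t1`/`t2` names are made up.

```python
import pandas as pd

# Toy stand-ins for the two positions tables and the overlap JSON.
t1 = pd.DataFrame(index=["A_1", "A_2", "A_3"])
t2 = pd.DataFrame(index=["B_1", "B_2"])
overlaps = {"A_1": "B_2"}  # barcode in sample 1 -> barcode in sample 2
overlaps_reverse = {v: k for k, v in overlaps.items()}

# Look each barcode up in the dict; barcodes without a partner get NaN.
t1["overlaps_with"] = t1.index.to_series().map(overlaps)
t2["overlaps_with"] = t2.index.to_series().map(overlaps_reverse)

# The reversal only round-trips if no two barcodes share the same partner.
assert len(overlaps_reverse) == len(overlaps)
```

Missing values come out as NaN rather than `None` here, which makes no difference to a later `isna()` check.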

Concatenate the two positions tables.

In [None]:
positions = pd.concat([p1, p2])

In [None]:
import numpy as np

In [None]:
# 1 where the spot has an overlapping partner, 0 otherwise.
positions['is_overlap'] = positions['overlaps_with'].notna().astype(int)

In [None]:
positions['is_overlap'].value_counts(dropna = False)

Absorb into the object.

In [None]:
adata.obs = adata.obs.join(positions, how="left")

adata.obsm['spatial'] = adata.obs[
    ['pxl_row_in_fullres', 'pxl_col_in_fullres']
].to_numpy()
adata.obs.drop(
    columns=['barcode', 'pxl_row_in_fullres', 'pxl_col_in_fullres'],
    inplace=True,
)
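A left join quietly fills unmatched spots with NaN, so it can be worth checking that the positions index is unique and actually covers the object's barcodes. A toy sketch of such a check; the tables are made up, and `obs`/`pos` stand in for `adata.obs` and `positions`.

```python
import pandas as pd

# Toy stand-ins for adata.obs and the concatenated positions table.
obs = pd.DataFrame({"sample": ["A", "A", "B"]}, index=["A_1", "A_2", "B_1"])
pos = pd.DataFrame(
    {"in_tissue": [1, 0], "pxl_row_in_fullres": [10, 20]},
    index=["A_1", "B_1"],
)

# Duplicate index entries in the right-hand table would duplicate rows in the join.
assert pos.index.is_unique

# Barcodes present in the object but missing from the positions table.
missing = obs.index.difference(pos.index)
print(f"{len(missing)} of {len(obs)} barcodes have no position")

joined = obs.join(pos, how="left")
```

If every barcode turned up as missing, that would point to a mismatch such as an absent sample prefix on one side.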

Swap the two coordinate columns in `adata.obsm['spatial']`, then plot the sample...

In [None]:
# Swap the two coordinate columns (indexing with a list returns a copy, so nothing is overwritten mid-swap).
adata.obsm['spatial'] = adata.obsm['spatial'][:, [1, 0]]

In [None]:
sc.pl.spatial(adata, color="sample")

In [None]:
sc.pl.spatial(adata, color="is_overlap")

In [None]:
sc.pl.spatial(adata, color="in_tissue")

In [None]:
adata.obs['in_tissue_and_is_overlap'] = np.where((adata.obs['is_overlap'] == 1) & (adata.obs['in_tissue'] == 1), 1, 0)

In [None]:
sc.pl.spatial(adata, color="in_tissue_and_is_overlap")

### 4. Save joint anndata object

In [None]:
adata.write('HCA_F_RepTsp13902013_14_joint.h5ad')