## HRA Hierarchical Tissue Unit Annotation

In this notebook, we will build on [an existing one on hierarchical tissue unit annotation](https://github.com/HickeyLab/Hierarchical-Tissue-Unit-Annotation) by [Dr. John Hickey](https://bme.duke.edu/people/john-hickey/). Concretely, we will take a CSV file with cell positions, cell types, donor IDs, and extraction sites, and then create a node-dist-vis widget from it. For more information and documentation on hra-jupyter-widgets, please see [https://github.com/x-atlas-consortia/hra-jupyter-widgets/blob/main/usage.ipynb](https://github.com/x-atlas-consortia/hra-jupyter-widgets/blob/main/usage.ipynb).

## Load libraries

In [None]:
# Import native packages
import time
import sys
import math
import os

In [None]:
# Install and import external packages
%pip install matplotlib
import matplotlib.pyplot as plt

%pip install pandas
import pandas as pd

%pip install seaborn
import seaborn as sns

%pip install numpy
import numpy as np

%pip install -U scikit-learn
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import MiniBatchKMeans
from sklearn.cluster import KMeans

%pip install ipywidgets
import ipywidgets as widgets

In [None]:
# Import hra-jupyter-widgets. For documentation, please see https://github.com/x-atlas-consortia/hra-jupyter-widgets/blob/main/usage.ipynb
%pip install hra_jupyter_widgets
from hra_jupyter_widgets import (
    BodyUi,
    CdeVisualization,
    Eui,
    EuiOrganInformation,
    FtuExplorer,
    FtuExplorerSmall,
    MedicalIllustration,
    ModelViewer,
    NodeDistVis,
    Rui,
)

## Download data from Dryad

In [None]:
# Downloading the CSV file from Dryad with curl returned a 403 (Forbidden) response, so the file was downloaded manually via the browser from https://datadryad.org/stash/downloads/file_stream/2572152 and placed in data/. Since it is 2.91 GB, it is listed in .gitignore.
# The curl command is kept for reference; it writes to data/ so the path matches the read step below.
!curl -L https://datadryad.org/stash/downloads/file_stream/2572152 --create-dirs -o data/23_09_CODEX_HuBMAP_alldata_Dryad_merged.csv
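
Because the download may have to be done manually, it can help to confirm that the file is where the read step expects it before loading 2.91 GB. The following is a minimal sketch, assuming the file is stored under `data/`; the helper name `require_data_file` is hypothetical and not part of any library.

```python
from pathlib import Path

# Path expected by the read step; adjust if the file is stored elsewhere
DATA_FILE = Path("data/23_09_CODEX_HuBMAP_alldata_Dryad_merged.csv")


def require_data_file(path: Path) -> Path:
    """Return path if the file exists, otherwise raise with download instructions."""
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found. Download it manually from "
            "https://datadryad.org/stash/downloads/file_stream/2572152 "
            f"and place it in {path.parent}/."
        )
    return path
```

Calling `require_data_file(DATA_FILE)` at the top of the read cell fails early with a readable message instead of a long pandas traceback.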

## Read data as DataFrame

In [None]:
# Read the CSV file and convert it to a df
df = pd.read_csv('data/23_09_CODEX_HuBMAP_alldata_Dryad_merged.csv', index_col=0)
df
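
Loading the full file at once needs a lot of memory. If only one region is of interest, one option is to read the file in chunks, keep only the needed columns, and filter each chunk as it arrives. The sketch below demonstrates the pattern on a tiny in-memory CSV; the rows, the non-Endothelial cell type names, and the chunk size are made up for illustration, and a real run would pass the file path and a much larger `chunksize`.

```python
import io

import pandas as pd

# Tiny stand-in for the Dryad CSV (values are invented for illustration)
csv_text = """,x,y,Cell Type,donor,unique_region
0,1.0,2.0,Endothelial,B004,B004_Ascending
1,3.0,4.0,TypeA,B004,B004_Descending
2,5.0,6.0,Endothelial,B005,B005_Ascending
3,7.0,8.0,TypeB,B004,B004_Ascending
"""

wanted_cols = ["x", "y", "Cell Type", "donor", "unique_region"]

# Read two rows at a time and keep only the rows from the region of interest
chunks = []
for chunk in pd.read_csv(io.StringIO(csv_text), usecols=wanted_cols, chunksize=2):
    chunks.append(chunk[chunk["unique_region"] == "B004_Ascending"])

df_small = pd.concat(chunks, ignore_index=True)
print(df_small)
```

Only the filtered rows are ever held in memory together, at the cost of one pass over the file per filter.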

In [None]:
# Only keep cells from one dataset by selecting 1 donor and 1 region
df_filtered = df[(df['donor'] == "B004") & (
    df['unique_region'] == "B004_Ascending")]

In [None]:
# Make new df with only x, y, and Cell Type columns (needed for node-dist-vis)
df_cells = df_filtered[['x', 'y', 'Cell Type']]
df_cells

In [None]:
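# Next, let's define a function that turns a DataFrame into a node list that can then be passed into the CdeVisualization or NodeDistVis widget
def make_node_list(df: pd.DataFrame, is_3d: bool = False) -> list:
  """Turn a DataFrame into a list of dicts for passing them into an HRA widget.

  Args:
      df (pd.DataFrame): A DataFrame with one row per cell and the columns
          'x', 'y', 'Cell Type', and, if is_3d is True, 'z'
      is_3d (bool): If False, z is set to 0 for every cell

  Returns:
      list: One dict per cell with the keys 'x', 'y', 'z', and 'Cell Type'
  """
  # Work on a copy so the caller's DataFrame is not modified
  nodes = df.copy()
  if not is_3d:
    nodes['z'] = 0

  # to_dict('records') builds the list of dicts without a slow row-by-row loop
  return nodes[['x', 'y', 'z', 'Cell Type']].to_dict('records')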
  

In [None]:
# Prepare df_cells for visualization with NodeDistVis widget
node_list = make_node_list(df_cells, False)
node_list
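
Before handing a node list to a widget, a quick sanity check can catch missing keys or a target cell type that does not occur in the data. The sketch below runs on a toy node list in the same shape as the one above; the helper `check_nodes` is hypothetical, and the coordinates and the `TypeA` label are invented.

```python
from collections import Counter

# Toy node list in the same shape as the real one (values are made up)
toy_nodes = [
    {"x": 0.0, "y": 0.0, "z": 0, "Cell Type": "Endothelial"},
    {"x": 3.0, "y": 4.0, "z": 0, "Cell Type": "TypeA"},
    {"x": 1.0, "y": 1.0, "z": 0, "Cell Type": "TypeA"},
]


def check_nodes(nodes, target_key="Cell Type", target_value="Endothelial"):
    """Verify that every node has the required keys and that the target type occurs."""
    required = {"x", "y", "z", target_key}
    for i, node in enumerate(nodes):
        missing = required - node.keys()
        if missing:
            raise ValueError(f"node {i} is missing keys: {sorted(missing)}")
    counts = Counter(node[target_key] for node in nodes)
    if counts[target_value] == 0:
        raise ValueError(f"no nodes with {target_key} == {target_value!r}")
    return counts


print(check_nodes(toy_nodes))
```

The returned counts also give a quick overview of how many cells of each type will be drawn.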

In [None]:
# Finally, let's instantiate the NodeDistVis class with some parameters. We pass in the node_list and mark Endothelial cells as the targets for the edges.
# As we are not supplying an edge list, we need to provide a max_edge_distance, which is set generously to 1000.

node_dist_vis = NodeDistVis(
    nodes = node_list,
    node_target_key="Cell Type",
    node_target_value="Endothelial",
    max_edge_distance = 1000
)

# Display our new widget
display(node_dist_vis)
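
To build intuition for what a distance cutoff such as `max_edge_distance` implies, the sketch below links each non-target cell to its nearest target cell and keeps the link only if it is within the cutoff. This illustrates the general idea under that assumption and is not the widget's actual implementation; the function name `nearest_target_edges` and the coordinates are hypothetical.

```python
import numpy as np


def nearest_target_edges(coords, is_target, max_edge_distance):
    """Return (source_index, target_index, distance) for every non-target node
    whose nearest target node lies within max_edge_distance."""
    coords = np.asarray(coords, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    target_idx = np.flatnonzero(is_target)
    source_idx = np.flatnonzero(~is_target)
    if target_idx.size == 0 or source_idx.size == 0:
        return []

    # Pairwise Euclidean distances, shape (n_sources, n_targets)
    diff = coords[source_idx][:, None, :] - coords[target_idx][None, :, :]
    dist = np.linalg.norm(diff, axis=2)

    # For each source, pick the closest target and its distance
    nearest = dist.argmin(axis=1)
    nearest_dist = dist[np.arange(source_idx.size), nearest]

    return [
        (int(s), int(target_idx[t]), float(d))
        for s, t, d in zip(source_idx, nearest, nearest_dist)
        if d <= max_edge_distance
    ]


# Node 0 is the only target; node 2 is too far away to get an edge
coords = [[0, 0, 0], [3, 4, 0], [2000, 0, 0]]
is_target = [True, False, False]
print(nearest_target_edges(coords, is_target, max_edge_distance=1000))
```

The pairwise distance matrix grows with the number of sources times the number of targets, so for a full dataset a spatial index such as scikit-learn's `NearestNeighbors` (imported above) would be the more practical choice.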

## Make a 3D tissue stack from multiple regions

In [None]:
# Only keep cells from 1 donor and 3 regions.
# isin groups the region tests so that the donor test applies to all of them (& binds tighter than |);
# .copy() gives us an independent DataFrame that we can safely add columns to.
df_filtered_3d = df[(df['donor'] == "B004") & df['unique_region'].isin(
    ['B004_Descending', 'B004_Ascending', 'B004_Transverse'])].copy()
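
When a donor test is combined with several region tests, operator precedence matters: in Python, `&` binds more tightly than `|`. The toy example below shows how an ungrouped expression lets a row from another donor through, and how `isin` avoids this. The `C001` row is invented purely to expose the difference and does not reflect the real data.

```python
import pandas as pd

# Toy data; the C001 row is hypothetical
df_toy = pd.DataFrame({
    "donor": ["B004", "B004", "C001"],
    "unique_region": ["B004_Descending", "B004_Ascending", "B004_Ascending"],
})

regions = ["B004_Descending", "B004_Ascending", "B004_Transverse"]

# Ungrouped: `&` is evaluated before `|`, so the donor test only
# applies to the first region comparison
loose = df_toy[
    (df_toy["donor"] == "B004") & (df_toy["unique_region"] == "B004_Descending")
    | (df_toy["unique_region"] == "B004_Ascending")
]

# Grouped via isin: the donor test applies to every listed region
strict = df_toy[(df_toy["donor"] == "B004") & df_toy["unique_region"].isin(regions)]

print(len(loose), len(strict))
```

If region names are unique to one donor, both filters return the same rows, but the grouped form states the intent and stays correct if that ever changes.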

In [None]:
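# Set the z-offset between consecutive regions in the stack
offset = 1000

# Set the z axis (or any other axis) by region.
# .assign returns a new DataFrame, which avoids pandas' SettingWithCopyWarning.
df_filtered_3d = df_filtered_3d.assign(
    z=df_filtered_3d['unique_region'].apply(lambda v: 0 if v == 'B004_Descending'
                                            else offset if v == 'B004_Ascending'
                                            else offset * 2))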

# Make new df with only x, y, z, and Cell Type columns
df_cells_3d = df_filtered_3d[['x', 'y', 'z','Cell Type']]
df_cells_3d
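
The same stacking idea extends to any number of regions by deriving the z value from each region's position in an ordered list. The following is a small sketch on toy data; the coordinates are made up, and the stacking order is simply the one used above.

```python
import pandas as pd

offset = 1000

# Stacking order from bottom to top; each region sits one offset above the previous one
region_order = ["B004_Descending", "B004_Ascending", "B004_Transverse"]
z_by_region = {region: i * offset for i, region in enumerate(region_order)}

# Toy cells, one per region (coordinates are invented)
df_toy = pd.DataFrame({
    "x": [1.0, 2.0, 3.0],
    "y": [4.0, 5.0, 6.0],
    "unique_region": ["B004_Transverse", "B004_Descending", "B004_Ascending"],
})

# map looks up each region's z value; regions missing from the dict would become NaN
df_toy = df_toy.assign(z=df_toy["unique_region"].map(z_by_region))
print(df_toy["z"].tolist())
```

Unlike a chained conditional, a lookup table makes unlisted regions visible as missing values instead of silently placing them on the top layer.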

In [None]:
# Prepare df_cells_3d for visualization with CdeVisualization widget
node_list = make_node_list(df_cells_3d, True)
node_list

In [None]:
# Finally, let's instantiate the CdeVisualization class and pass in the 3D node_list.
cde = CdeVisualization(
    nodes=node_list
)

# Display our new widget
display(cde)