In [None]:
%load_ext autoreload
%autoreload 2

# This notebook illustrates how to annotate a Xenium morphology image of Mouse Brain

Welcome!
In this tutorial, we’ll walk through the process of annotating a **Xenium image** from a **fresh frozen (FF) mouse brain sample**. This notebook will guide you from raw data download to generating high-quality tissue annotations using both automated and manual approaches.

## 📦 Download the Sample Data

We'll use the publicly available dataset from 10x Genomics.
Run the following commands in your terminal to download all necessary files and set up the demo directory:

<small>

```bash
mkdir -p ../data/tissue_tag_example_xenium
cd ../data/tissue_tag_example_xenium
curl -O https://cf.10xgenomics.com/samples/xenium/3.0.0/Xenium_Prime_Mouse_Brain_Coronal_FF/Xenium_Prime_Mouse_Brain_Coronal_FF_outs.zip
unzip Xenium_Prime_Mouse_Brain_Coronal_FF_outs.zip
tar -zvxf cell_feature_matrix.tar.gz
```
</small>
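
Before moving on, it can help to confirm that the archive unpacked as expected. The sketch below uses a hypothetical helper, `missing_files` (not part of TissueTag), to check for the three matrix files that later cells of this notebook read. The file list is taken from those cells and is not a full description of the Xenium output bundle.

```python
from pathlib import Path

# Files that later cells of this notebook read from the unpacked bundle.
EXPECTED_FILES = (
    "cell_feature_matrix/matrix.mtx.gz",
    "cell_feature_matrix/features.tsv.gz",
    "cell_feature_matrix/barcodes.tsv.gz",
)


def missing_files(root, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


# Hypothetical usage: an empty list means all three files were found.
print(missing_files("../data/tissue_tag_example_xenium"))
```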


## 🧭 Annotation Strategy

We’ll explore several annotation strategies, ranging from automated predictions to manual refinement:

1. **Fully Automatic** – Quickly capture broad tissue features (e.g., tissue boundaries, white and gray matter) using a simple pixel classifier.
2. **Semi-Automatic** – Gene-guided labeling combined with manual annotation and drawing for finer control.

All annotations will be saved as a `TissueTag` annotation class object in an `.h5` file.


---

Let's get started! 🚀

In [None]:
# initialisation 
import os
import panel as pn
import socket
import numpy as np
import tissue_tag as tt
import tissue_tag.annotation

os.environ["BOKEH_ALLOW_WS_ORIGIN"] = "*"
# host = '5011'  # on the farm (remote cluster), use the port shown in the browser address bar
host = '8888'  # port of the Jupyter server when working locally, e.g. on a desktop

In [None]:
# set path
# here you can either read a single image (grayscale or RGB) or generate a virtual H&E from 2 images in the next cell
Path = '../' #directory of tissuetag repo
path = Path +'data/tissue_tag_example_xenium/'

# Step 1 - Load the Xenium image

Load a Xenium morphology image and downscale it to a more manageable size. `ppm_out` is the desired resolution of the output image in pixels per micron.
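
To make the pixels-per-micron setting concrete, the sketch below computes the resize factor and output shape for a target resolution. It assumes a source pixel size of 0.2125 µm, the value 10x Genomics documents for Xenium morphology images (check `experiment.xenium` for your own run). `downscaled_shape` is a hypothetical helper, not part of TissueTag, and the image dimensions are illustrative only.

```python
def downscaled_shape(height_px, width_px, pixel_size_um, ppm_out):
    """Return (scale, new_height, new_width) for a target pixels-per-micron."""
    ppm_in = 1.0 / pixel_size_um  # source resolution in pixels per micron
    scale = ppm_out / ppm_in      # a value below 1 means the image shrinks
    return scale, round(height_px * scale), round(width_px * scale)


# Illustrative numbers only: a 20000 x 40000 px image at 0.2125 um per pixel.
scale, new_h, new_w = downscaled_shape(20000, 40000, pixel_size_um=0.2125, ppm_out=1)
print(f"scale={scale:.4f}, output shape=({new_h}, {new_w})")
```

With `ppm_out=1`, each output pixel covers one micron, so the image shrinks to roughly a fifth of its original size along each axis.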

In [None]:
tt_object = tt.read_xenium(path, ppm_out=1, image_quantiles=(5e-4, 0.9995), image_output="fluorescence", plot=True)

# Automatic annotations - Create de-novo annotations from gene expression (or not)

In [None]:
import anndata
import pandas as pd

# read the sparse count matrix (stored genes x cells) and transpose it to cells x genes
adata = anndata.io.read_mtx(path + "cell_feature_matrix/matrix.mtx.gz").T
adata.var = pd.read_csv(path + "cell_feature_matrix/features.tsv.gz", sep="\t", header=None)
adata.var.columns = ["gene_ids", "gene_name", "feature_types"]
adata.var_names = adata.var["gene_name"]

adata.obs = pd.read_csv(path + "cell_feature_matrix/barcodes.tsv.gz", sep="\t", header=None)
adata.obs.columns = ["barcodes"]
adata.obs_names = adata.obs["barcodes"]

In [None]:
# Define anatomical structure annotations
# and their associated color codes.
#
# Color families:
#   • Red:         'red', 'darkred', 'firebrick', 'indianred'
#   • Green:       'green', 'darkgreen', 'lime', 'seagreen', 'forestgreen'
#   • Blue:        'blue', 'darkblue', 'royalblue', 'dodgerblue', 'deepskyblue'
#   • Cyan:        'cyan', 'lightcyan', 'darkcyan', 'teal'
#   • Magenta:     'magenta', 'purple', 'darkmagenta', 'orchid', 'violet'
#   • Yellow/Orange:'gold', 'orange', 'darkorange', 'goldenrod'
#   • Brown:       'brown', 'saddlebrown', 'chocolate', 'peru', 'tan'
#   • Gray/Black:  'black', 'gray', 'darkgray', 'dimgray', 'lightgray'
#   • White:       'white'

tt_object.annotation_map = {
    'unassigned':      'yellow',
    'isocortex':       'green',
    'hippocampus':     'darkgreen',
    'olfactory':       'orange',
    'striatum':        'red',
    'thalamus':        'blue',
    'amygdala':        'lime',
    'choroid_plexus':  'gold',
    'pia':             'deepskyblue',
    'white_matter':    'white',
    'gray_matter':     'teal',
    'dentate_gyrus':   'violet',
    'layer_1':         'tan',

}
# Note: if you need to add annotations to an existing object, append them at the end and do not change the order,
# because the order corresponds to the pixel values of the label image, e.g. unassigned = 1, isocortex = 2, etc.
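
The ordering rule in the note above can be made explicit. Assuming, as that note states, that label values start at 1 and follow the insertion order of `annotation_map`, this sketch shows why appending an entry is safe while reordering is not. It uses a shortened, illustrative map; `label_values` is a hypothetical helper and `cerebellum` is a made-up extra entry.

```python
def label_values(annotation_map):
    """Map each annotation name to its integer pixel value (1-based, insertion order)."""
    return {name: i for i, name in enumerate(annotation_map, start=1)}


# Shortened, illustrative map; the real one is defined in the cell above.
original = {"unassigned": "yellow", "isocortex": "green", "hippocampus": "darkgreen"}
appended = {**original, "cerebellum": "purple"}  # hypothetical new entry, added at the end
reordered = {"isocortex": "green", "unassigned": "yellow", "hippocampus": "darkgreen"}

print(label_values(original))   # baseline values
print(label_values(appended))   # existing values are unchanged
print(label_values(reordered))  # 'unassigned' and 'isocortex' swap values
```

After reordering, pixels already stored with the old values would be displayed under the wrong name, which is why new annotations belong at the end.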

In [None]:
# Define gene markers per region
# Format: region_name: List of (gene, expression threshold)

gene_markers = {
    'gray_matter': [
        ('Gad1', 2500),
        ('Gad2', 2500),
    ],
    'white_matter': [
        ('Gfap', 500),
    ]
}

# Generate training labels from gene expression
# This maps regions based on marker expression

tissue_tag.annotation.gene_labels_from_adata(
    adata=adata,
    gene_markers=gene_markers,
    tissue_tag_annotation=tt_object,
    diameter=20,           # Labeling diameter
    override_labels=True,  # Replace any existing labels
    normalize=True         # Normalize expression (set to False to use raw values)
)
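
Conceptually, each `(gene, threshold)` pair flags the cells whose expression of that gene exceeds the threshold. The sketch below illustrates that idea on toy data. It is not TissueTag's implementation: `cells_passing_markers` is a hypothetical helper, the rule that a single marker is enough is an assumption, and neither `diameter` nor `normalize` is modelled.

```python
def cells_passing_markers(expression, gene_markers):
    """Assign each cell the first region for which any marker exceeds its threshold.

    expression:   {cell_id: {gene: value}}
    gene_markers: {region: [(gene, threshold), ...]}
    Assumption: one marker above its threshold is enough to flag a cell.
    Cells that pass no marker are left out of the result.
    """
    labels = {}
    for cell_id, genes in expression.items():
        for region, markers in gene_markers.items():
            if any(genes.get(gene, 0) > thr for gene, thr in markers):
                labels[cell_id] = region
                break
    return labels


# Toy values, chosen only to exercise the thresholds used above.
toy_expression = {
    "cell_a": {"Gad1": 3000, "Gfap": 10},
    "cell_b": {"Gad2": 100, "Gfap": 800},
    "cell_c": {"Gad1": 50, "Gfap": 20},
}
toy_markers = {
    "gray_matter": [("Gad1", 2500), ("Gad2", 2500)],
    "white_matter": [("Gfap", 500)],
}
print(cells_passing_markers(toy_expression, toy_markers))
```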

In [None]:
# Visualize the assigned labels
tissue_tag.annotation.median_filter(tt_object, filter_radius=8)
tissue_tag.annotation.plot_labels(tt_object, alpha=0.5)  # note: the output of plot_labels could not be suppressed here

# Part 2 - Iterative annotation section

At this stage, you can choose whether to render the image with datashader (`use_datashader=True`), which is recommended for large images or high-resolution annotation. Annotating is slower with datashader, but the image loads in a reasonable time. Without it, a very large image may take an extremely long time to load or fail to load at all.

Annotate by drawing convex shapes in single strokes; the pixels inside each convex region are filled in during the `update_annotator` step. A mouse with a scroll wheel is recommended for zooming in and out.
*To remove a label, click on it and press the Backspace key.

In [None]:
# use annotator to label tissue regions according to categories indicated above
annotator = tissue_tag.annotation.annotator(tt_object, use_datashader=True)
pn.io.notebook.show_server(annotator, notebook_url=f'localhost:{host}')
#annotator.servable()

In [None]:
# This step fills in the shapes created for the pixel classifier
tissue_tag.annotation.plot_labels(tt_object, alpha=0.5)

In [None]:
%%time
# Train and predict the image pixels with a random forest classifier. This step takes about 1 to 10 min depending on number of training areas and resolution  
tissue_tag.annotation.pixel_label_classifier(tt_object)

From this point, go back to the annotator and correct the annotations until you are happy with the results.

# Part 3 - Gene-Guided Labeling

In this step, we overlay **gene expression–based labels** on top of the previous classification.  
These labels serve as additional guidance for **manual tissue annotation**, helping refine regional identity using known gene markers.

We define a dictionary of **marker genes** and their expression thresholds for key brain regions.  
From this, we generate spatial labels directly from the `AnnData` object and add them to the existing `TissueTag` annotation.

> 🔍 These gene-based labels are not final annotations—they are **hints** to support accurate manual refinement.

In [None]:
# select gene markers 
gene_markers = {
    'isocortex': [('Pak7', 500), ('Myl4', 500), ('Ttc9b', 500)],
    'amygdala': [('Acvr2a', 300)],
    'olfactory': [('Cdhr1', 500)],
    'striatum': [('Adora2a', 200), ('Gprin3', 200)],
    'thalamus': [('Plekhg1', 500)],
    'choroid_plexus': [('Tcf21', 500)],
    'hippocampus': [('Zbtb20', 500)],
}

# generate training data from gene expression
tissue_tag.annotation.gene_labels_from_adata(
    adata=adata,
    gene_markers=gene_markers,
    tissue_tag_annotation=tt_object,
    diameter=20,
    normalize=False
)  # generate gene-marker labels
tissue_tag.annotation.plot_labels(tt_object, alpha=0.25)

In [None]:
# use annotator to label tissue regions according to categories indicated above
annotator = tissue_tag.annotation.annotator(tt_object, use_datashader=True)
pn.io.notebook.show_server(annotator, notebook_url=f'localhost:{host}')
#annotator.servable()

In [None]:
tissue_tag.annotation.plot_labels(tt_object, alpha=0.5)

In [None]:
tissue_tag.annotation.assign_annotation_label_to_positions(tt_object)
tt_object.positions
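
Assigning labels to positions amounts to reading the label image at each cell's pixel coordinate. The following is a minimal sketch of that lookup on a toy label image. It assumes coordinates in microns, a label image at `ppm` pixels per micron, and label values that start at 1; `labels_at_positions` is a hypothetical helper, not the TissueTag function called above.

```python
import math


def labels_at_positions(label_image, positions_um, ppm, names):
    """Look up the annotation name under each (x, y) position given in microns.

    label_image: 2D list of ints, indexed [row][col], values start at 1
    names:       annotation names in label order (value 1 -> names[0])
    Positions that fall outside the image are returned as None.
    """
    n_rows, n_cols = len(label_image), len(label_image[0])
    result = []
    for x_um, y_um in positions_um:
        # Convert microns to integer pixel indices (x -> column, y -> row).
        col, row = math.floor(x_um * ppm), math.floor(y_um * ppm)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            result.append(names[label_image[row][col] - 1])
        else:
            result.append(None)
    return result


# Toy 2 x 3 label image and three toy positions (the last one is out of bounds).
toy_labels = [
    [1, 1, 2],
    [1, 3, 2],
]
toy_names = ["unassigned", "isocortex", "hippocampus"]
print(labels_at_positions(toy_labels, [(2.4, 0.2), (1.0, 1.9), (9.0, 9.0)], ppm=1, names=toy_names))
```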

In [None]:
tissue_tag.annotation.plot_cell_label_annotations(tt_object, alpha=0.5)

# Save annotations (and load)

The resulting images and information can be saved for later use.

In [None]:
# create the output directory if it does not exist yet, then save the annotations
out_dir = os.path.join(path, 'tissue_annotations')
os.makedirs(out_dir, exist_ok=True)

tt_object.save_annotation(file_path=os.path.join(out_dir, 'annotations.h5'))

In [None]:
# optional - reload the saved annotations, e.g. to resume work in a later session
tt_object = tt.load_annotation(file_path=path + 'tissue_annotations/annotations.h5')

In [None]:
# use annotator to label tissue regions according to categories indicated above
annotator = tissue_tag.annotation.annotator(tt_object, use_datashader=True)
pn.io.notebook.show_server(annotator, notebook_url=f'localhost:{host}')
#annotator.servable()