# Tutorial 4: Detect anomalous subdomains with multimodal data

Once anomalous regions have been detected, STANDS can further divide them into anomalous subtypes, which supports heterogeneity analysis of cancerous regions. As in the previous tutorials, this step uses multimodal learning on H&E images and spatial gene expression.

This tutorial will guide you through the process step by step. The dataset used here is exactly the same as in Tutorial 2. The reference dataset consists of healthy human breast tissue, while the target dataset consists of human breast cancer tissue, including two types of anomalous regions: cancer in situ and invasive cancer.

## Preparation

In [1]:
import warnings
warnings.filterwarnings("ignore")

In [2]:
import stands
import scanpy as sc
import squidpy as sq
sc.set_figure_params(figsize=(4, 4))

In [3]:
# Replace with the path to the downloaded demo data or to another dataset.
input_dir = '/volume3/kxu/data/'
ref_name = 'Breast_full_ref1'
tgt_name = 'Breast_full_tumor1'

ref = sc.read_h5ad(input_dir + ref_name + '.h5ad')
tgt = sc.read_h5ad(input_dir + tgt_name + '.h5ad')
ref_g, tgt_g = stands.read_cross(input_dir, input_dir, ref_name, tgt_name, train_mode=False)
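# Encode the ground-truth annotation: 1 = cancer in situ, 2 = invasive cancer, 0 = everything else.
label = [1 if i == 'Cancer in situ' else 2 if i == 'Invasive cancer' else 0 for i in tgt.obs['domain.type']]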
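
The list comprehension above encodes the ground-truth annotation as integers: 0 for normal tissue, 1 for cancer in situ, and 2 for invasive cancer. As a minimal sketch of the same mapping, a dictionary lookup gives an equivalent result and may be easier to extend to more subtypes. The `domain_types` list below is made up for illustration, and the `'Healthy'` string is an assumption, not a value taken from the dataset.

```python
# Hypothetical annotations standing in for tgt.obs['domain.type'].
domain_types = ['Healthy', 'Cancer in situ', 'Invasive cancer', 'Healthy']

# Integer codes used in this tutorial: 1 = cancer in situ, 2 = invasive cancer.
subtype_codes = {'Cancer in situ': 1, 'Invasive cancer': 2}

# Any annotation not listed in subtype_codes falls back to 0 (normal).
label = [subtype_codes.get(t, 0) for t in domain_types]
print(label)
```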

## Detect anomalous regions

The first step is to identify anomalous regions using the exact same approach as in Tutorial 2.

In [4]:
model = stands.ADNet(n_epochs=10)
model.fit(ref_g)

# obtain anomaly scores and predicted labels
score, pred = model.predict(tgt_g)

Begin to fine-tune the model on reference datasets...

Train Epochs: 100%|██████████| 10/10 [01:42<00:00,  9.33s/it, D_Loss=-.221, G_Loss=0.472]

Fine-tuning has been finished.

Detect anomalous spots on target dataset...

Inference Epochs:  19%|██        | 19/100 [00:00<00:00, 368.01it/s]

GMM-based thresholder has converged.
Anomalous spots have been detected.

In [5]:
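# Keep only the spots predicted as anomalous; they are the input for subtyping.
tgt.obs['pseudo.label'] = pred
# Copy the subset so that later assignments to .obs do not write to a view of tgt.
train = tgt[tgt.obs['pseudo.label'] == 1, :].copy()
train_g = stands.read(train, train_mode=False)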

## Detect anomalous subdomains

`stands.SubNet` takes the generator trained by `stands.ADNet` in the previous step (`model.G`) as input, so that both stages extract features in a consistent way.

In [6]:
sub_model = stands.SubNet(model.G, n_epochs=50)
sub_pred = sub_model.fit(train_g)

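# Store the subdomain labels on the anomalous spots.
train.obs['subpred'] = sub_pred
train.obs['subpred'] = train.obs['subpred'].astype(int)
# Expand the labels to every target spot; spots not flagged as anomalous get 0.
sub_pred = [train.obs.loc[i, 'subpred'] if i in train.obs_names else 0 for i in tgt.obs_names]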

Begin to detect anomalous subdomains...

Train Epochs: 100%|██████████| 50/50 [01:32<00:00,  1.62s/it, Loss=1e-15]

Anomalous subdomains have been detected.
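
The last line of the cell above expands the subdomain predictions, which exist only for the spots flagged as anomalous, back to the full target dataset and assigns 0 to every other spot. A minimal sketch of that step, using made-up spot names and labels (everything below is hypothetical and independent of STANDS):

```python
# Hypothetical spot names for the full target dataset.
all_spots = ['spot_a', 'spot_b', 'spot_c', 'spot_d', 'spot_e']

# Hypothetical subdomain predictions, available only for anomalous spots.
sub_labels = {'spot_b': 1, 'spot_d': 2, 'spot_e': 1}

# Spots missing from sub_labels were predicted normal, so they get 0.
full_pred = [sub_labels.get(s, 0) for s in all_spots]
print(full_pred)
```

The resulting list has one entry per target spot, in the same order as the ground-truth labels, which is what an element-wise comparison needs.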


## Evaluate anomalous subdomain detection results

STANDS also provides an evaluation function that computes the `F1*NMI` metric to assess subtyping performance.

In [7]:
stands.evaluate(['F1*NMI'], y_true=label, y_pred=sub_pred)

0.42
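
To give a feel for what a combined score of this kind measures, here is a minimal, self-contained sketch. It assumes, and this is only an assumption, that the score is the product of a binary F1 score (anomalous versus normal, with label 0 treated as normal) and the normalized mutual information (NMI, arithmetic-mean normalization) between true and predicted labels over all spots. The helper names `binary_f1`, `nmi`, and `f1_times_nmi` are hypothetical and are not part of STANDS; the exact definition used by `stands.evaluate` may differ, so the API reference remains the authoritative source.

```python
import math
from collections import Counter


def binary_f1(y_true, y_pred):
    """F1 for 'anomalous' (label != 0) versus 'normal' (label == 0)."""
    t = [v != 0 for v in y_true]
    p = [v != 0 for v in y_pred]
    tp = sum(a and b for a, b in zip(t, p))
    fp = sum((not a) and b for a, b in zip(t, p))
    fn = sum(a and (not b) for a, b in zip(t, p))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Mutual information normalized by the mean of the two entropies."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * math.log(c * n / (ca[x] * cb[y]))
             for (x, y), c in cab.items())
    denom = (entropy(a) + entropy(b)) / 2
    # Both labelings constant: treat them as perfectly matching.
    return mi / denom if denom > 0 else 1.0


def f1_times_nmi(y_true, y_pred):
    """Assumed combination: detection F1 multiplied by subtyping NMI."""
    return binary_f1(y_true, y_pred) * nmi(y_true, y_pred)


# Toy labels: same partition of spots, but subtype ids 1 and 2 are swapped.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 2, 2, 1, 1]
print(f1_times_nmi(y_true, y_pred))
```

Because NMI depends only on how spots are grouped and not on the integer ids of the groups, the swapped subtype ids in the toy example are not penalized. This matters for clustering-style outputs, where the predicted ids are arbitrary.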

For further details on `stands.SubNet`, including its parameters, please refer to the API reference documentation.