<!-- ## Step-by-step instructions to run HEST-Benchmark

This tutorial will guide you to:

- **Reproduce** HEST-Benchmark results provided in the paper (Random Forest regression and Ridge regression models)
- Benchmark your **own** model -->


## Overview

Each task involves predicting the expression levels of the 50 most variable genes from 112×112 μm H&E-stained image patches centered on each spatial transcriptomics spot. The tasks are formulated as multivariate regression problems.

| **Task ID** | **Oncotree** | **Number of Samples** | **Technology** | **Sample ID** |
|-------------|--------------|-----------------------|----------------|---------------|
| Task 1      | IDC          | 4                     | Xenium         | TENX95, TENX99, NCBI783, NCBI785      |
| Task 2      | PRAD         | 23                    | Visium         | MEND139~MEND162      |
| Task 3      | PAAD         | 3                     | Xenium         | TENX116, TENX126, TENX140      |
| Task 4      | SKCM         | 2                     | Xenium         | TENX115, TENX117      |
| Task 5      | COAD         | 4                     | Xenium         | TENX111, TENX147, TENX148, TENX149      |
| Task 6      | READ         | 4                     | Visium         | ZEN36, ZEN40, ZEN48, ZEN49      |
| Task 7      | ccRCC        | 24                    | Visium         | INT1~INT24      |
| Task 8      | LUAD         | 2                     | Xenium         | TENX118, TENX141      |
| Task 9     | IDC-LymphNode | 4                    | Visium         | NCBI681, NCBI682, NCBI683, NCBI684     |
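
To make the "multivariate regression" framing concrete, here is a minimal sketch on synthetic data, assuming NumPy is available. Only the number of genes (50) comes from the text above; the feature dimension, sample counts, the closed-form ridge solver, and the per-gene mean squared error are illustrative choices, not necessarily what HEST-Benchmark does internally.

```python
import numpy as np

def fit_ridge(X, Y, alpha):
    """Closed-form multi-output ridge: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def per_gene_mse(Y_true, Y_pred):
    """Mean squared error computed separately for each gene (column)."""
    return ((Y_true - Y_pred) ** 2).mean(axis=0)

rng = np.random.default_rng(0)
n_spots, n_features, n_genes = 200, 16, 50  # toy sizes; only n_genes=50 is from the text

# Synthetic stand-ins for patch embeddings (X) and gene expression targets (Y).
X = rng.normal(size=(n_spots, n_features))
W_true = rng.normal(size=(n_features, n_genes))
Y = X @ W_true + 0.01 * rng.normal(size=(n_spots, n_genes))

# Fit on the first 150 spots, evaluate on the remaining 50.
W = fit_ridge(X[:150], Y[:150], alpha=1e-3)
errors = per_gene_mse(Y[150:], X[150:] @ W)
print(errors.shape)  # (50,)
```

Each spot contributes one row of features and one row of 50 targets, so a single weight matrix predicts all genes at once.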



### Reproducing HEST-Benchmark results 

- Ensure that HEST has been properly installed (see README, Installation)
- Preprocessed patches, h5ad files, and gene targets are downloaded automatically
- Publicly available patch encoders are downloaded automatically

**Note:** Not all public foundation models can be shared due to licensing issues. We provide model-specific instructions that users can follow to access weights:

#### CONCH installation (model + weights request)

1. Request access to the model weights from the Hugging Face model page [here](https://huggingface.co/MahmoodLab/CONCH).

2. Install the CONCH PyTorch model:

```
pip install git+https://github.com/Mahmoodlab/CONCH.git
```

#### UNI weights request

Request access to the model weights from the Hugging Face model page [here](https://huggingface.co/MahmoodLab/UNI).

#### GigaPath weights request

Request access to the model weights from the Hugging Face model page [here](https://huggingface.co/prov-gigapath/prov-gigapath).
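
Once access to the gated Hugging Face models above (CONCH, UNI, GigaPath) has been granted, downloads typically need an authenticated session. A minimal sketch, assuming the `huggingface_hub` package is installed and that the token is stored in an environment variable; the helper name `hf_login_from_env` and the default variable name are illustrative choices, not part of HEST.

```python
import os

def hf_login_from_env(var_name="HF_TOKEN"):
    """Log in to the Hugging Face Hub if a token is set; return whether a login was attempted."""
    token = os.environ.get(var_name)
    if not token:
        return False
    # Imported lazily so the helper can be defined even without the package installed.
    from huggingface_hub import login
    login(token=token)
    return True

if not hf_login_from_env():
    print("No token found; set HF_TOKEN before downloading gated weights.")
```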

#### Remedis (weights only)

1. Request access to the model weights from the PhysioNet project page [here](https://physionet.org/content/medical-ai-research-foundation/1.0.0/).
2. Download the model weights (`path-152x2-remedis-m_torch.pth`) and place them at `{weights_root}/fm_v1/remedis/path-152x2-remedis-m_torch.pth`, where `weights_root` is specified in the config. Alternatively, edit the Remedis path directly in `{PATH_TO_HEST}/src/hest/bench/local_ckpts.json`.
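
Before launching a run, it can help to confirm that a manually placed checkpoint is where the benchmark will look for it. The sketch below assumes `local_ckpts.json` maps an encoder name to a path relative to `weights_root`; the map entries and the helper name `resolve_weights` are hypothetical and only mirror the Remedis location given above.

```python
import tempfile
from pathlib import Path

def resolve_weights(weights_root, ckpt_map, name):
    """Return the checkpoint path for `name`, or None if unmapped or missing on disk."""
    rel_path = ckpt_map.get(name)
    if not rel_path:
        return None
    candidate = Path(weights_root) / rel_path
    return candidate if candidate.is_file() else None

# Hypothetical map entries; the real local_ckpts.json may differ.
ckpt_map = {"remedis": "fm_v1/remedis/path-152x2-remedis-m_torch.pth", "uni": ""}

# Demo against a throwaway directory containing an empty placeholder file.
with tempfile.TemporaryDirectory() as weights_root:
    target = Path(weights_root) / ckpt_map["remedis"]
    target.parent.mkdir(parents=True)
    target.touch()
    found = resolve_weights(weights_root, ckpt_map, "remedis")
    missing = resolve_weights(weights_root, ckpt_map, "uni")

print(found is not None, missing is None)  # True True
```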


In [1]:
# force to use local copy
#import sys
# Add your local hest path at the front of sys.path
#sys.path.insert(0, '/home/k/kxu/.local/lib/python3.11/site-packages')

In [2]:
import hest
print(hest.__version__)
print(hest.__file__)

0.0.1
/home/k/kxu/.local/lib/python3.11/site-packages/hest/__init__.py
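
When several copies of a package exist on one machine, it is useful to check which one Python would import without actually importing it. A minimal sketch using only the standard library; the helper name `package_origin` is an illustrative choice.

```python
import importlib.util

def package_origin(name):
    """Return the file a package would be imported from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None

print(package_origin("hest"))  # path to hest/__init__.py, or None if not installed
```

If the printed path is not the installation you expect, adjusting `sys.path` or `PYTHONPATH` before the import is one way to select another copy.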


### Launching HEST-bench via CLI

In [3]:
%%bash
ls /ceph-fast/package/u22/hest/1.2.0/lib/python3.11/site-packages/hest/bench/

benchmark.py
cpath_model_zoo
__init__.py
__pycache__
st_dataset.py
trainer.py
utils


In [4]:
%%bash
ls /home/k/kxu/.local/lib/python3.11/site-packages/hest/bench/

benchmark.py
cpath_model_zoo
__init__.py
__pycache__
st_dataset.py
trainer.py
utils


In [5]:
%%bash
cp /project/simmons_hts/kxu/hest/local_ckpts.json \
   /home/k/kxu/.local/lib/python3.11/site-packages/hest/bench/

In [3]:
%%bash
ls /home/k/kxu/.local/lib/python3.11/site-packages/hest/bench/

benchmark.py
cpath_model_zoo
__init__.py
local_ckpts.json
__pycache__
st_dataset.py
trainer.py
utils
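
The copy step above can also be done from Python, which avoids hard-coding the site-packages path. This is a sketch under the assumption that the checkpoint map belongs next to `benchmark.py`; the helper name `install_ckpt_map` is hypothetical, and the demo runs on throwaway directories so nothing real is modified.

```python
import shutil
import tempfile
from pathlib import Path

def install_ckpt_map(src, bench_dir):
    """Copy a checkpoint map into bench_dir and return the destination path."""
    src, bench_dir = Path(src), Path(bench_dir)
    if not bench_dir.is_dir():
        raise NotADirectoryError(f"{bench_dir} is not a directory")
    return Path(shutil.copy2(src, bench_dir / src.name))

# Demo on temporary directories standing in for the real locations.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "local_ckpts.json"
    src.write_text("{}")
    bench_dir = Path(tmp) / "bench"
    bench_dir.mkdir()
    dest = install_ckpt_map(src, bench_dir)
    copied_ok = dest.is_file() and dest.read_text() == "{}"

print(copied_ok)  # True
```

With a real installation, `bench_dir` could be taken from `Path(hest.bench.__file__).parent` after `import hest.bench`.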


In [8]:
# from hestcore.segmentation import get_path_relative as orig_get_path_relative
# import os

# def local_ckpt_path(file, rel_path):
#     # ignore the original file location, always look in local folder
#     return os.path.join("/project/simmons_hts/kxu/hest/", rel_path)

# import hest.bench.benchmark as bench
# bench.get_path_relative = local_ckpt_path

# import hest.bench.benchmark as bench
# import json, os

# local_json = "/project/simmons_hts/kxu/hest/local_ckpts.json"
# weights_root = "/project/simmons_hts/kxu/hest/fm_v1"

# with open(local_json) as f:
#     ckpts = json.load(f)

# def get_bench_weights_local(weights_root, name):
#     path = ckpts[name]
#     return os.path.join(weights_root, path) if path else None

# bench.get_bench_weights = get_bench_weights_local

In [4]:
%%bash
#export PYTHONPATH=/project/simmons_hts/kxu/hest:$PYTHONPATH
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libffi.so.7
python -m hest.bench.benchmark --config /project/simmons_hts/kxu/hest/bench_config.yaml 

ERROR: ld.so: object '/usr/lib/x86_64-linux-gnu/libffi.so.7' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.


Updating overwrite with False
Updating kfold with False
Updating benchmark_encoders with False
Updating config with /project/simmons_hts/kxu/hest/bench_config.yaml
15:48:26 | INFO | Saving models to eval/fm_v1...


Fetching 2 files:   0%|          | 0/2 [00:00<?, ?it/s]
Fetching 2 files: 100%|██████████| 2/2 [00:00<00:00, 1556.91it/s]


15:48:27 | INFO | Fetch the bench data...


Fetching 294 files:   0%|          | 0/294 [00:00<?, ?it/s]
Fetching 294 files:  86%|████████▋ | 254/294 [00:00<00:00, 2501.72it/s]
Fetching 294 files: 100%|██████████| 294/294 [00:00<00:00, 2573.69it/s]


15:48:27 | INFO | Benchmarking on the following datasets ['IDC']
15:48:27 | INFO | run parameters Namespace(seed=1, overwrite=False, bench_data_root='eval/bench_data', embed_dataroot='eval/ST_data_emb', weights_root='eval/fm_v1', results_dir='eval/ST_pred_results', batch_size=128, num_workers=1, private_weights_root=None, exp_code=None, gene_list='var_50genes.json', method='ridge', alpha=None, kfold=False, benchmark_encoders=False, normalize=True, dimreduce='PCA', latent_dim=256, encoders=['ctranspath'], datasets=['IDC'], config='/project/simmons_hts/kxu/hest/bench_config.yaml')
15:48:27 | INFO | HESTBench task: IDC, Encoder: ctranspath
15:48:27 | INFO | Embedding tiles for IDC using ctranspath encoder


  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]


15:48:31 | INFO | Missing keys: ['layers.0.blocks.1.attn_mask', 'layers.1.blocks.1.attn_mask', 'layers.2.blocks.1.attn_mask', 'layers.2.blocks.3.attn_mask', 'layers.2.blocks.5.attn_mask']
15:48:31 | INFO | Unexpected keys: []


  0%|          | 0/3 [00:00<?, ?it/s]
  0%|          | 0/61 [00:00<?, ?it/s]
ERROR: ld.so: object '/usr/lib/x86_64-linux-gnu/libffi.so.7' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
  with torch.inference_mode(), torch.cuda.amp.autocast(dtype=precision):

  2%|▏         | 1/61 [00:01<01:05,  1.08s/it]
  3%|▎         | 2/61 [00:01<00:30,  1.94it/s]
  5%|▍         | 3/61 [00:01<00:19,  2.97it/s]
  7%|▋         | 4/61 [00:01<00:16,  3.54it/s]
  8%|▊         | 5/61 [00:01<00:15,  3.57it/s]
 10%|▉         | 6/61 [00:02<00:14,  3.78it/s]
 11%|█▏        | 7/61 [00:02<00:14,  3.81it/s]
 13%|█▎        | 8/61 [00:02<00:13,  4.02it/s]
 15%|█▍        | 9/61 [00:02<00:12,  4.09it/s]
 16%|█▋        | 10/61 [00:03<00:12,  3.98it/s]
 18%|█▊        | 11/61 [00:03<00:12,  4.08it/s]
 20%|█▉        | 12/61 [00:03<00:11,  4.21it/s]
 21%|██▏       | 13/61 [00:03<00:13,  3.47it/s]
 23%|██▎       | 14/61 [00:04<00:12,  3.72it/s]
 2

 20%|██        | 33/163 [00:08<00:31,  4.11it/s]
 21%|██        | 34/163 [00:08<00:31,  4.08it/s]
 21%|██▏       | 35/163 [00:08<00:30,  4.17it/s]
 22%|██▏       | 36/163 [00:08<00:30,  4.11it/s]
 23%|██▎       | 37/163 [00:09<00:30,  4.13it/s]
 23%|██▎       | 38/163 [00:09<00:28,  4.36it/s]
 24%|██▍       | 39/163 [00:09<00:29,  4.18it/s]
 25%|██▍       | 40/163 [00:09<00:30,  4.09it/s]
 25%|██▌       | 41/163 [00:10<00:29,  4.20it/s]
 26%|██▌       | 42/163 [00:10<00:30,  3.97it/s]
 26%|██▋       | 43/163 [00:10<00:29,  4.09it/s]
 27%|██▋       | 44/163 [00:10<00:29,  4.10it/s]
 28%|██▊       | 45/163 [00:10<00:28,  4.19it/s]
 28%|██▊       | 46/163 [00:11<00:26,  4.38it/s]
 29%|██▉       | 47/163 [00:11<00:27,  4.27it/s]
 29%|██▉       | 48/163 [00:11<00:25,  4.48it/s]
 30%|███       | 49/163 [00:11<00:26,  4.31it/s]
 31%|███       | 50/163 [00:12<00:26,  4.34it/s]
 31%|███▏      | 51/163 [00:12<00:25,  4.35it/s]
 32%|███▏   

using gene_list var_50genes.json


  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
100%|██████████| 3/3 [00:00<00:00,  6.36it/s]


15:49:41 | INFO | Loaded train split with 14775 samples: (14775, 768)


  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
100%|██████████| 1/1 [00:00<00:00,  3.37it/s]


15:49:41 | INFO | Loaded test split with 20761 samples: (20761, 768)
perform PCA dim reduction
Using alpha: 0.0078125
15:49:45 | INFO | {'n_train': 14775, 'n_test': 20761, 'l2_errors': [3.392437587555049, 3.893052494177589, 0.6169989740244569, 1.7011565302026177, 5.354259213186444, 1.151173729715427, 0.8988647540908775, 3.2100849967823732, 5.835415972031101, 0.38252781420365006, 4.907274429372622, 4.213039347359241, 12.670177583598653, 16.53537507518364, 8.884976108647553, 7.739504746832268, 17.607674507500512, 3.126338787859611, 4.7619739580728195, 2.795848426227924, 2.4658376512674507, 2.9538504617941927, 1.1254978092967705, 0.9986774328387226, 1.7867550840031512, 0.8646257276938405, 0.7764030900586001, 0.5663666860497624, 2.3004828088032636, 3.1208093833972055, 11.970454785073235, 0.38198181621020266, 0.9340130455891441, 0.41553574463770154, 8.81310688206496, 2.596773314092396, 11.169018672999137, 19.754690462939603, 4.7362300062167755, 4.091561905639441, 5.1158636782482025, 1.45615

100%|██████████| 3/3 [00:00<00:00, 5506.74it/s]
100%|██████████| 1/1 [00:00<00:00, 7530.17it/s]


using gene_list var_50genes.json


  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
100%|██████████| 3/3 [00:00<00:00, 18.84it/s]


15:49:46 | INFO | Loaded train split with 27776 samples: (27776, 768)


  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
100%|██████████| 1/1 [00:00<00:00, 21.28it/s]


15:49:46 | INFO | Loaded test split with 7760 samples: (7760, 768)
perform PCA dim reduction
Using alpha: 0.0078125
15:49:53 | INFO | {'n_train': 27776, 'n_test': 7760, 'l2_errors': [10.770724360926843, 2.998967303537276, 1.4162252664691743, 30.718838600769555, 8.615988839452253, 14.377915818904016, 0.9945913346538989, 3.95126232619808, 5.120006175407561, 0.22729387134407164, 23.724936772723087, 2.087766634201664, 12.814889315784205, 16.163917355066598, 11.846921241236565, 15.800009629039522, 26.43882168778942, 2.2860494898056416, 3.090362309933066, 2.233958037797262, 4.582581982500615, 1.8582464939975598, 1.3580392974603592, 0.8349877177646357, 1.7980317645588484, 0.3610145895707849, 2.6663028431731566, 1.6685296808414598, 27.466925565570804, 2.4757802075271544, 7.784230345048516, 0.38947249968394176, 1.6107332775432548, 0.8177536807932598, 3.0416206956109657, 3.6573963802751552, 7.907979946462339, 10.449123888035663, 0.8603193351629842, 5.05255175386983, 27.57945092917559, 4.31564314

100%|██████████| 3/3 [00:00<00:00, 6750.49it/s]
100%|██████████| 1/1 [00:00<00:00, 8405.42it/s]


using gene_list var_50genes.json


  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
100%|██████████| 3/3 [00:00<00:00, 17.26it/s]


15:49:53 | INFO | Loaded train split with 31526 samples: (31526, 768)


  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
100%|██████████| 1/1 [00:00<00:00, 32.89it/s]


15:49:53 | INFO | Loaded test split with 4010 samples: (4010, 768)
perform PCA dim reduction
Using alpha: 0.0078125
15:50:01 | INFO | {'n_train': 31526, 'n_test': 4010, 'l2_errors': [5.324679324499561, 6.861622374181957, 0.9376388716963164, 11.013225725413141, 4.072077580596363, 2.5770203727150642, 1.5861620248386565, 5.700120060127406, 16.20051220555741, 1.8014485410295962, 3.9996535201659, 7.916633078626824, 16.641119086706603, 39.6583325853715, 8.811235731016419, 17.477595669000376, 25.410528055451387, 6.303002428471617, 6.137603471849508, 5.523720953476762, 6.1381825686271965, 4.948141942333553, 2.262772305333532, 3.1107294235609313, 1.597702822537208, 0.7985851475362613, 1.2876012196680742, 0.6382440783770754, 4.184396274809218, 2.4779847309526644, 11.189691729703334, 1.4700346048173054, 1.3917327861045414, 0.5567008811912467, 10.89070860256485, 5.731011581456509, 16.638100988250155, 29.729639340362336, 1.9629868327325566, 7.107799034306685, 11.954774308947517, 4.283855744812348, 

100%|██████████| 3/3 [00:00<00:00, 9245.34it/s]
100%|██████████| 1/1 [00:00<00:00, 9404.27it/s]


using gene_list var_50genes.json


  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
100%|██████████| 3/3 [00:00<00:00, 15.62it/s]


15:50:02 | INFO | Loaded train split with 32531 samples: (32531, 768)


  if not is_categorical_dtype(df_full[k]):
  if not is_categorical_dtype(df_full[k]):
100%|██████████| 1/1 [00:00<00:00, 38.99it/s]


15:50:02 | INFO | Loaded test split with 3005 samples: (3005, 768)
perform PCA dim reduction
Using alpha: 0.0078125
15:50:10 | INFO | {'n_train': 32531, 'n_test': 3005, 'l2_errors': [10.335941093147989, 1.7970105059519832, 0.5616088989390486, 5.736440368060929, 6.376135139740198, 3.9007558245289173, 1.6879095902450234, 4.554486188188503, 1.7343008174144952, 0.20408529137964945, 2.5248538698913006, 2.595042940874338, 14.620846749567438, 9.491911169999037, 2.8014455576086728, 3.366259286670526, 7.909101704024713, 2.3170094541459174, 3.751171010510172, 3.7669393229703423, 1.0954146660981257, 1.7227318045767999, 2.9142386551662662, 0.7781768032616305, 1.0939724327275582, 0.2957715592293628, 1.1857338673196982, 7.0360926974132765, 6.387886927419834, 3.8236598145556795, 11.165695977722894, 0.7406350965738179, 1.0598334019787168, 2.0851086049469627, 11.485379869613313, 4.0993260959209445, 9.56166003062324, 6.775205926829014, 1.1063931246310672, 12.500785900548385, 2.3123773256611093, 4.914386
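
The numbers in the log above can be cross-checked with a little arithmetic. The four test splits sum to the same total in every fold, which is consistent with holding out one of the four IDC samples at a time (3 training samples, 1 test sample per fold). The expression used for `alpha` below is an inference that happens to reproduce the logged value from `latent_dim=256` and 50 genes; it is not confirmed by this notebook.

```python
# Test-split sizes reported in the log, one per fold.
test_sizes = [20761, 7760, 4010, 3005]
total_spots = sum(test_sizes)

# If each fold holds out exactly one sample, training size = total - test size.
train_sizes = [total_spots - n for n in test_sizes]
print(total_spots)   # 35536
print(train_sizes)   # [14775, 27776, 31526, 32531]

# Assumed relationship (not confirmed here): alpha = 100 / (n_features * n_targets).
latent_dim, n_genes = 256, 50
alpha = 100 / (latent_dim * n_genes)
print(alpha)         # 0.0078125
```

The computed training sizes match the `n_train` values logged for the four folds.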

In [8]:
# %%bash
# export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libffi.so.7
# python ../src/hest/bench/benchmark.py --config ../bench_config/bench_config.yaml

### Benchmarking your own model with HEST-Benchmark 


In [None]:
import torch
from hest.bench import benchmark

# Replace each `...` placeholder before running.
PATH_TO_CONFIG = ...    # path to `bench_config.yaml`
model = ...             # PyTorch model (torch.nn.Module)
model_transforms = ...  # transforms to apply during inference (torchvision.transforms.Compose)
precision = torch.float32

benchmark(
    model,
    model_transforms,
    precision,
    config=PATH_TO_CONFIG,
)