## Step-by-step instructions to run HEST-Benchmark

This tutorial will guide you to:

- **Reproduce** HEST-Benchmark results provided in the paper (Random Forest regression and Ridge regression models)
- Benchmark your **own** model


## Reproducing HEST-Benchmark results 

- Ensure that HEST has been properly installed (see README, Installation)
- Download preprocessed patches, h5ad and gene targets  
- Download publicly available patch encoders

**Note:** Some public foundation models cannot be redistributed because of licensing restrictions. Follow the model-specific instructions below to obtain their weights:

#### CONCH installation (model + weights)

1. Request access to the model weights on the Hugging Face model page [here](https://huggingface.co/MahmoodLab/CONCH).

2. Download the model weights (`pytorch_model.bin`) and place them in your `fm_v1` directory at `fm_v1/conch_v1_official/pytorch_model.bin`.

3. Install the CONCH PyTorch model:

```
git clone https://github.com/mahmoodlab/CONCH.git
cd CONCH
pip install -e .
```

#### UNI installation (weights only)

1. Request access to the model weights on the Hugging Face model page [here](https://huggingface.co/MahmoodLab/UNI).

2. Download the model weights (`pytorch_model.bin`) and place them in your `fm_v1` directory at `fm_v1/uni_v1_official/pytorch_model.bin`.


#### GigaPath installation (weights only)

1. Request access to the model weights on the Hugging Face model page [here](https://huggingface.co/prov-gigapath/prov-gigapath).
2. Download the model weights (`prov-gigapath.bin`) and place them in your `fm_v1` directory `fm_v1/uni_v1_official/pytorch_model.bin`
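
Before launching the benchmark, it can help to confirm that the manually downloaded checkpoints are where the instructions above say they should be. The following is a minimal sketch, not part of HEST: the helper `find_missing_weights` and the `EXPECTED_WEIGHTS` table are hypothetical names, and the table only lists the CONCH and UNI paths given in this tutorial.

```python
from pathlib import Path

# Checkpoint locations relative to the weights root (`fm_v1`), as stated in
# this tutorial. The dictionary itself is illustrative, not a HEST API.
EXPECTED_WEIGHTS = {
    "conch_v1_official": "conch_v1_official/pytorch_model.bin",
    "uni_v1_official": "uni_v1_official/pytorch_model.bin",
}


def find_missing_weights(weights_root):
    """Return the encoder names whose checkpoint file is not on disk."""
    root = Path(weights_root)
    return [
        name
        for name, relative_path in EXPECTED_WEIGHTS.items()
        if not (root / relative_path).is_file()
    ]


if __name__ == "__main__":
    # Assumes the weights root used in the example config of this tutorial.
    missing = find_missing_weights("../hest_bench/fm_v1")
    if missing:
        print("Missing checkpoints for:", ", ".join(missing))
    else:
        print("All listed checkpoints found.")
```

Encoders reported as missing should stay commented out in the `encoders` list of the benchmark config until their weights are in place.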


```python
from huggingface_hub import snapshot_download

local_dir = '../hest_bench'

# Download HEST-Benchmark data
snapshot_download(repo_id="MahmoodLab/hest-bench", repo_type='dataset', local_dir=local_dir)
```

- Set paths in `bench_config/bench_config.yaml`. For instance:

```yaml
# Directory paths
source_dataroot: '../hest_bench'
results_dir: '../hest_bench/results'
embed_dataroot: '../hest_bench/ST_data_emb'
weights_root: '../hest_bench/fm_v1'

# Inference parameters
precision: 'fp32'
batch_size: 128
num_workers: 1

# Encoders to benchmark
encoders: ["kimianet", "plip", "resnet50_trunc", "ciga", 
           "ctranspath", "phikon_official_hf", 
           #"uni_v1_official", # uncomment after requesting the weights
           #"remedis", # uncomment after requesting the weights
           #"conch_v1_official" # uncomment after requesting the weights
]

# Datasets to benchmark
datasets: ["IDC", "PRAD", "PAAD", "SKCM", "COAD", 
           "READ", "CCRCC", "HCC", "LUNG", "LYMPH_IDC"]
```
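
A quick sanity check of the config can catch path mistakes before a long run starts. The sketch below is an illustration under stated assumptions: `check_config` and `load_config` are hypothetical helpers, not HEST functions, the list of required keys simply mirrors the example config above, and loading the file assumes PyYAML is installed.

```python
from pathlib import Path

# Keys taken from the example config in this tutorial. Whether the benchmark
# script strictly requires all of them is an assumption.
REQUIRED_KEYS = (
    "source_dataroot", "results_dir", "embed_dataroot",
    "weights_root", "encoders", "datasets",
)


def check_config(cfg):
    """Return a list of human-readable problems found in a parsed config."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in cfg]
    # Input directories should already exist; output directories are not
    # checked because they may be created during the run.
    for key in ("source_dataroot", "weights_root"):
        if key in cfg and not Path(cfg[key]).is_dir():
            problems.append(f"{key} is not an existing directory: {cfg[key]}")
    for key in ("encoders", "datasets"):
        if key in cfg and not cfg[key]:
            problems.append(f"{key} is empty")
    return problems


def load_config(path):
    """Parse a YAML config file into a dict (requires PyYAML)."""
    import yaml  # imported lazily so check_config works without PyYAML

    with open(path) as handle:
        return yaml.safe_load(handle)


# Example usage:
#   problems = check_config(load_config("bench_config/bench_config.yaml"))
#   print("\n".join(problems) or "config looks consistent")
```

An empty list from `check_config` only means that the paths and lists look plausible; it does not verify that every listed encoder has its weights available.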


Run `python src/hest/bench/training/predict_expression.py --config bench_config/bench_config.yaml`.

All results are written automatically to the `results_dir` specified in the `.yaml` config.
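
Once the run finishes, the contents of `results_dir` can be inspected programmatically. This tutorial does not specify the file names or layout of the results, so the sketch below makes no assumption about them and simply lists every file found; `list_result_files` is a hypothetical helper.

```python
from pathlib import Path


def list_result_files(results_dir):
    """Return every file under results_dir as a sorted list of relative paths."""
    root = Path(results_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    # Assumes the results directory from the example config in this tutorial.
    for name in list_result_files("../hest_bench/results"):
        print(name)
```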
