PathGLS is a novel, reference-free evaluation framework designed to assess the trustworthiness of Vision-Language Models in computational pathology without relying on expert-annotated ground truths.
It evaluates pathology VLMs across three complementary dimensions:
- **Grounding** ($S_g$): Fine-grained visual-text alignment using High-Resolution Multiple Instance Learning.
- **Logic** ($S_l$): Entailment-graph consistency using a domain-specific Natural Language Inference model to detect broken reasoning chains.
- **Stability** ($S_s$): Output variance under adversarial visual perturbations and semantic prompt injections.
PathGLS supports both patch-level and whole-slide image (WSI) level analysis.
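The three dimension scores can, for example, be combined into a single trustworthiness index. The weighted average below is only an illustration: the equal weights and the combination rule are assumptions, not part of PathGLS.

```python
def composite_score(s_g: float, s_l: float, s_s: float,
                    weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Hypothetical composite of Grounding (s_g), Logic (s_l), and
    Stability (s_s) scores, each assumed to lie in [0, 1]."""
    for s in (s_g, s_l, s_s):
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores are expected in [0, 1]")
    w_g, w_l, w_s = weights
    return w_g * s_g + w_l * s_l + w_s * s_s

# Example: composite_score(0.9, 0.8, 0.7) averages the three scores.
```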
PathGLS requires specific system-level libraries for WSI processing and specialized NLP models for logic extraction.
For Debian/Ubuntu systems, the underlying C library for reading WSIs is required:

```shell
sudo apt-get update
sudo apt-get install -y openslide-tools
```
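To confirm that both the system library and the Python bindings (installed separately, e.g. via `pip install openslide-python`) are available, a quick check might look like this. This is a sketch, not part of the PathGLS codebase:

```python
import importlib.util

def openslide_available() -> bool:
    """Return True if the OpenSlide Python bindings can be imported.

    find_spec avoids a hard ImportError when the bindings are not
    installed; the import itself fails only if the C library is broken.
    """
    if importlib.util.find_spec("openslide") is None:
        return False
    import openslide
    print("OpenSlide library version:", openslide.__library_version__)
    return True
```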
Create a virtual environment (Python 3.9+ recommended) and install the standard dependencies:

```shell
conda create -n pathgls python=3.10 -y
conda activate pathgls
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
```
Download the specific SciSpacy model required by the Medical Knowledge Engine:

```shell
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.1/en_core_sci_md-0.5.1.tar.gz
```
Before running the evaluation, you must configure the paths in `config.py`:

- `IMAGE_ROOT`: Directory containing your images (e.g., `.jpg`/`.png` for ROIs, or `.svs`/`.ndpi`/`.tiff` for WSIs).
- `INPUT_DATA_FILE`: A JSON file containing the dataset metadata. The structure should be:
```json
[
  {
    "image_id": "case_001",
    "original_path": "path/to/case_001.jpg",
    "text_control": "Optional: Ground truth or initial description."
  }
]
```

You can switch the evaluation subject (the VLM being tested) by modifying `MODEL_SOURCE` in `config.py`:
"local": Uses the local VLM engine defined inLOCAL_MODEL_PATH."api": Uses a remote API (requires settingAPI_KEY,API_HOST, andAPI_MODEL_NAME).
