# Phase 1 - Training-free baseline
To evaluate how well different models encode correspondence, we use **SPair-71k**, a standard benchmark for semantic correspondence. Each image pair in this dataset comes with annotated keypoints that represent the same semantic part (e.g., the tip of a dog’s ear, the wheel of a car) across different object instances or viewpoints.

### Evaluation Protocol
We follow the standard protocol from DIFT [1], using **PCK@T** (Percentage of Correct Keypoints) as the main metric. A predicted keypoint counts as correct when it lies within a normalized distance T of the ground truth, and PCK@T is the percentage of keypoints that meet this criterion. We report several thresholds (e.g., 0.05, 0.1, 0.2) to analyze performance at different precision levels.

Results will be reported:
- Per keypoint (correct keypoints pooled over all image pairs)
- Per image (PCK computed for each image pair, then averaged)

This analysis will show how each backbone behaves across categories and difficulty levels.
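As a rough illustration of the metric, the sketch below computes PCK@T with both aggregation modes on toy data. It is not the code in `evaluate.py`: the function names, the dictionary layout, and the choice of normalizing by the longer side of the object bounding box are assumptions made for this example.

```python
import math

def pck_counts(pred, gt, bbox_size, alpha):
    """Return (num_correct, num_total) for one image pair.

    Assumption: the threshold is alpha times the longer bounding-box side.
    """
    threshold = alpha * max(bbox_size)
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct, len(gt)

def pck_report(pairs, alpha):
    """Return (per_keypoint_pck, per_image_pck) over a list of image pairs."""
    counts = [
        pck_counts(p["pred"], p["gt"], p["bbox_size"], alpha) for p in pairs
    ]
    # Per keypoint: pool every keypoint of every pair before dividing.
    per_keypoint = sum(c for c, _ in counts) / sum(n for _, n in counts)
    # Per image: compute PCK for each pair, then average the pair scores.
    per_image = sum(c / n for c, n in counts) / len(counts)
    return per_keypoint, per_image

# Toy data (hypothetical coordinates, in pixels).
pairs = [
    {"pred": [(10, 10), (50, 50)], "gt": [(12, 10), (65, 50)], "bbox_size": (100, 60)},
    {"pred": [(5, 5)], "gt": [(5, 6)], "bbox_size": (40, 40)},
]

for alpha in (0.05, 0.1, 0.2):
    per_kp, per_img = pck_report(pairs, alpha)
    print(f"PCK@{alpha}: per keypoint = {per_kp:.3f}, per image = {per_img:.3f}")
```

The two aggregations differ whenever image pairs carry different numbers of keypoints, which is why both are worth reporting.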

### DINOv2

In [None]:

# DINOv2 - Small - 224
!python evaluate.py --phase 1 --model dinov2 --model_arch vits14 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Small - 518
!python evaluate.py --phase 1 --model dinov2 --model_arch vits14 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Base - 224
!python evaluate.py --phase 1 --model dinov2 --model_arch vitb14 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Base - 518
!python evaluate.py --phase 1 --model dinov2 --model_arch vitb14 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Large - 224
!python evaluate.py --phase 1 --model dinov2 --model_arch vitl14 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Large - 518
!python evaluate.py --phase 1 --model dinov2 --model_arch vitl14 --resolution 518 --dataset spair --batch_size 8 --num_workers 4


### DINOv3

In [None]:
# DINOv3 - Small - 224
!python evaluate.py --phase 1 --model dinov3 --model_arch vits16 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Small - 512
!python evaluate.py --phase 1 --model dinov3 --model_arch vits16 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Base - 224
!python evaluate.py --phase 1 --model dinov3 --model_arch vitb16 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Base - 512
!python evaluate.py --phase 1 --model dinov3 --model_arch vitb16 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Large - 224
!python evaluate.py --phase 1 --model dinov3 --model_arch vitl16 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Large - 512
!python evaluate.py --phase 1 --model dinov3 --model_arch vitl16 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

### SAM

In [None]:
# SAM - Base - 512
!python evaluate.py --phase 1 --model sam --model_arch vitb16 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# SAM - Base - 1024
!python evaluate.py --phase 1 --model sam --model_arch vitb16 --resolution 1024 --dataset spair --batch_size 8 --num_workers 4

# SAM - Large - 512
!python evaluate.py --phase 1 --model sam --model_arch vitl16 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# SAM - Large - 1024
!python evaluate.py --phase 1 --model sam --model_arch vitl16 --resolution 1024 --dataset spair --batch_size 8 --num_workers 4

### CLIP

In [None]:
# CLIP - ViT-B/32 - 224
!python evaluate.py --phase 1 --model clip --model_arch vitb32 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# CLIP - ViT-B/32 - 512
!python evaluate.py --phase 1 --model clip --model_arch vitb32 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# CLIP - ViT-B/16 - 224
!python evaluate.py --phase 1 --model clip --model_arch vitb16 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# CLIP - ViT-B/16 - 512
!python evaluate.py --phase 1 --model clip --model_arch vitb16 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# CLIP - ViT-L/14 - 224
!python evaluate.py --phase 1 --model clip --model_arch vitl14 --resolution 224 --dataset spair --batch_size 8 --num_workers 4

# CLIP - ViT-L/14 - 518
!python evaluate.py --phase 1 --model clip --model_arch vitl14 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

# Phase 2 - Light Finetuning of the Last Layers
In the second stage, we keep the same pipeline but unfreeze the last layers of the backbone and fine-tune them using keypoint supervision from SPair-71k.

By varying the number of fine-tuned layers (2, 4, or 6), we can observe how performance evolves as the model is given more flexibility to adapt to the task, and test whether a small amount of fine-tuning is enough to noticeably improve correspondence quality.
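To make "unfreezing the last layers" concrete, here is a minimal sketch that decides which parameters would be trainable for a given `--fine_tune_layers` value. It is only an illustration: the `blocks.<i>.` naming scheme is an assumption about how the backbone names its transformer blocks, the helper `select_trainable` is hypothetical, and `train.py` may treat other parameters (such as a final norm layer) differently.

```python
import re

def select_trainable(param_names, num_blocks, n_last):
    """Return the parameter names that belong to the last n_last blocks.

    Assumption: block parameters are named "blocks.<index>.<rest>".
    Everything else (patch embedding, earlier blocks, ...) stays frozen.
    """
    first_trainable = num_blocks - n_last
    pattern = re.compile(r"^blocks\.(\d+)\.")
    trainable = []
    for name in param_names:
        match = pattern.match(name)
        if match and int(match.group(1)) >= first_trainable:
            trainable.append(name)
    return trainable

# Toy parameter list for a hypothetical 12-block backbone.
names = (
    ["patch_embed.proj.weight"]
    + [f"blocks.{i}.mlp.fc1.weight" for i in range(12)]
    + ["norm.weight"]
)

print(select_trainable(names, num_blocks=12, n_last=2))
```

In a real training script the selected names would typically be used to enable gradients for those parameters only, leaving the rest of the backbone frozen.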

### DINOv2 - Small

In [None]:
# DINOv2 - Small - Layer 2 - Training
!python train.py --model dinov2 --model_arch vits14 --fine_tune_layers 2 --resolution 518 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv2 - Small - Layer 2 - Evaluation
!python evaluate.py --phase 2 --model dinov2 --model_arch vits14_ft2 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Small - Layer 4 - Training
!python train.py --model dinov2 --model_arch vits14 --fine_tune_layers 4 --resolution 518 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv2 - Small - Layer 4 - Evaluation
!python evaluate.py --phase 2 --model dinov2 --model_arch vits14_ft4 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Small - Layer 6 - Training
!python train.py --model dinov2 --model_arch vits14 --fine_tune_layers 6 --resolution 518 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv2 - Small - Layer 6 - Evaluation
!python evaluate.py --phase 2 --model dinov2 --model_arch vits14_ft6 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

### DINOv2 - Base

In [None]:
# DINOv2 - Base - Layer 2 - Training
!python train.py --model dinov2 --model_arch vitb14 --fine_tune_layers 2 --resolution 518 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv2 - Base - Layer 2 - Evaluation
!python evaluate.py --phase 2 --model dinov2 --model_arch vitb14_ft2 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Base - Layer 4 - Training
!python train.py --model dinov2 --model_arch vitb14 --fine_tune_layers 4 --resolution 518 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv2 - Base - Layer 4 - Evaluation
!python evaluate.py --phase 2 --model dinov2 --model_arch vitb14_ft4 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

# DINOv2 - Base - Layer 6 - Training
!python train.py --model dinov2 --model_arch vitb14 --fine_tune_layers 6 --resolution 518 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv2 - Base - Layer 6 - Evaluation
!python evaluate.py --phase 2 --model dinov2 --model_arch vitb14_ft6 --resolution 518 --dataset spair --batch_size 8 --num_workers 4

### DINOv3 - Small

In [None]:
# DINOv3 - Small - Layer 2 - Training
!python train.py --model dinov3 --model_arch vits16 --fine_tune_layers 2 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv3 - Small - Layer 2 - Evaluation
!python evaluate.py --phase 2 --model dinov3 --model_arch vits16_ft2 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Small - Layer 4 - Training
!python train.py --model dinov3 --model_arch vits16 --fine_tune_layers 4 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv3 - Small - Layer 4 - Evaluation
!python evaluate.py --phase 2 --model dinov3 --model_arch vits16_ft4 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Small - Layer 6 - Training
!python train.py --model dinov3 --model_arch vits16 --fine_tune_layers 6 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv3 - Small - Layer 6 - Evaluation
!python evaluate.py --phase 2 --model dinov3 --model_arch vits16_ft6 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

### DINOv3 - Base

In [None]:
# DINOv3 - Base - Layer 2 - Training
!python train.py --model dinov3 --model_arch vitb16 --fine_tune_layers 2 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv3 - Base - Layer 2 - Evaluation
!python evaluate.py --phase 2 --model dinov3 --model_arch vitb16_ft2 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Base - Layer 4 - Training
!python train.py --model dinov3 --model_arch vitb16 --fine_tune_layers 4 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv3 - Base - Layer 4 - Evaluation
!python evaluate.py --phase 2 --model dinov3 --model_arch vitb16_ft4 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# DINOv3 - Base - Layer 6 - Training
!python train.py --model dinov3 --model_arch vitb16 --fine_tune_layers 6 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# DINOv3 - Base - Layer 6 - Evaluation
!python evaluate.py --phase 2 --model dinov3 --model_arch vitb16_ft6 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

### SAM - Base

In [None]:
# SAM - Base - Layer 2 - Training
!python train.py --model sam --model_arch vitb16 --fine_tune_layers 2 --resolution 1024 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# SAM - Base - Layer 2 - Evaluation
!python evaluate.py --phase 2 --model sam --model_arch vitb16_ft2 --resolution 1024 --dataset spair --batch_size 8 --num_workers 4

# SAM - Base - Layer 4 - Training
!python train.py --model sam --model_arch vitb16 --fine_tune_layers 4 --resolution 1024 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# SAM - Base - Layer 4 - Evaluation
!python evaluate.py --phase 2 --model sam --model_arch vitb16_ft4 --resolution 1024 --dataset spair --batch_size 8 --num_workers 4

# SAM - Base - Layer 6 - Training
!python train.py --model sam --model_arch vitb16 --fine_tune_layers 6 --resolution 1024 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# SAM - Base - Layer 6 - Evaluation
!python evaluate.py --phase 2 --model sam --model_arch vitb16_ft6 --resolution 1024 --dataset spair --batch_size 8 --num_workers 4

### CLIP - ViT-B/16

In [None]:
# CLIP - ViT-B/16 - Layer 2 - Training
!python train.py --model clip --model_arch vitb16 --fine_tune_layers 2 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# CLIP - ViT-B/16 - Layer 2 - Evaluation
!python evaluate.py --phase 2 --model clip --model_arch vitb16_ft2 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# CLIP - ViT-B/16 - Layer 4 - Training
!python train.py --model clip --model_arch vitb16 --fine_tune_layers 4 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# CLIP - ViT-B/16 - Layer 4 - Evaluation
!python evaluate.py --phase 2 --model clip --model_arch vitb16_ft4 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# CLIP - ViT-B/16 - Layer 6 - Training
!python train.py --model clip --model_arch vitb16 --fine_tune_layers 6 --resolution 512 --dataset spair --batch_size 4 --num_workers 4 --epochs 3

# CLIP - ViT-B/16 - Layer 6 - Evaluation
!python evaluate.py --phase 2 --model clip --model_arch vitb16_ft6 --resolution 512 --dataset spair --batch_size 8 --num_workers 4

# Phase 3 - Prediction
In the baselines above, the final correspondence is obtained by taking the argmax of the similarity map. This has two clear limitations:
- it only predicts discrete pixel locations;
- it is sensitive to local noise and can miss subtle details.

As proposed by Zhang et al. [3], we replace this with **window soft-argmax**:

1. Find the peak location with argmax.
2. Apply soft-argmax only within a small fixed window around the peak.

This allows sub-pixel refinement and makes the prediction more robust to noisy similarity maps. In this step, we evaluate how this change affects PCK across different thresholds.
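The two steps above can be sketched in plain Python as follows. This is an illustrative version under stated assumptions, not the implementation behind `--match_method windowed_softargmax`: the window size, the softmax temperature, and the function names are chosen for the example, and coordinates are returned in similarity-map cells rather than image pixels.

```python
import math

def argmax_2d(sim):
    """Return (row, col) of the largest value in a 2D list."""
    best = (0, 0)
    for y, row in enumerate(sim):
        for x, value in enumerate(row):
            if value > sim[best[0]][best[1]]:
                best = (y, x)
    return best

def window_soft_argmax(sim, window=5, temperature=0.05):
    """Refine the argmax with a softmax-weighted mean inside a local window.

    Assumptions: a square window of odd size and a fixed temperature.
    """
    height, width = len(sim), len(sim[0])
    peak_y, peak_x = argmax_2d(sim)          # step 1: discrete peak
    radius = window // 2
    peak = sim[peak_y][peak_x]
    weight_sum = y_sum = x_sum = 0.0
    # Step 2: soft-argmax restricted to the window (clipped at the borders).
    for y in range(max(0, peak_y - radius), min(height, peak_y + radius + 1)):
        for x in range(max(0, peak_x - radius), min(width, peak_x + radius + 1)):
            # Subtracting the peak keeps the exponent non-positive (stable).
            weight = math.exp((sim[y][x] - peak) / temperature)
            weight_sum += weight
            y_sum += weight * y
            x_sum += weight * x
    return y_sum / weight_sum, x_sum / weight_sum

# Toy similarity map: peak at (2, 2) with a strong neighbour to its right.
sim = [[0.0] * 5 for _ in range(5)]
sim[2][2] = 1.0
sim[2][3] = 0.9

print(argmax_2d(sim))
print(window_soft_argmax(sim, window=3, temperature=0.1))
```

On this toy map the refined column lands between 2 and 3, pulled toward the strong neighbour, while the row stays at 2. Values outside the window have no influence, which is what keeps distant spurious peaks from shifting the estimate.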

### DINOv2

In [None]:
# ==================== DINOv2 - Baseline Models ====================
# DINOv2 - Small - Baseline
!python evaluate.py --phase 3 --model dinov2 --model_arch vits14 --resolution 518 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# DINOv2 - Base - Baseline
!python evaluate.py --phase 3 --model dinov2 --model_arch vitb14 --resolution 518 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# DINOv2 - Large - Baseline
!python evaluate.py --phase 3 --model dinov2 --model_arch vitl14 --resolution 518 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# ==================== DINOv2 - Fine-tuned Models ====================
# DINOv2 - Best Small Layer 2
!python evaluate.py --phase 3 --model dinov2 --model_arch vits14_ft2 --resolution 518 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# DINOv2 - Best Base Layer 2
!python evaluate.py --phase 3 --model dinov2 --model_arch vitb14_ft2 --resolution 518 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

### DINOv3

In [None]:
# ==================== DINOv3 - Baseline Models ====================
# DINOv3 - Small - Baseline
!python evaluate.py --phase 3 --model dinov3 --model_arch vits16 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# DINOv3 - Base - Baseline
!python evaluate.py --phase 3 --model dinov3 --model_arch vitb16 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# DINOv3 - Large - Baseline
!python evaluate.py --phase 3 --model dinov3 --model_arch vitl16 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# ==================== DINOv3 - Fine-tuned Models ====================
# DINOv3 - Best Small Layer 4
!python evaluate.py --phase 3 --model dinov3 --model_arch vits16_ft4 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# DINOv3 - Best Base Layer 4
!python evaluate.py --phase 3 --model dinov3 --model_arch vitb16_ft4 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

### SAM

In [None]:
# ==================== SAM - Baseline Models ====================
# SAM - Base - Resolution 512
!python evaluate.py --phase 3 --model sam --model_arch vitb16 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# SAM - Base - Resolution 1024
!python evaluate.py --phase 3 --model sam --model_arch vitb16 --resolution 1024 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# SAM - Large - Resolution 512
!python evaluate.py --phase 3 --model sam --model_arch vitl16 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# ==================== SAM - Fine-tuned Models ====================
# SAM - Best Base Layer 4 - Resolution 512
!python evaluate.py --phase 3 --model sam --model_arch vitb16_ft4 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# SAM - Best Base Layer 4 - Resolution 1024
!python evaluate.py --phase 3 --model sam --model_arch vitb16_ft4 --resolution 1024 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

### CLIP

In [None]:
# ==================== CLIP - Baseline Models ====================
# CLIP - ViT-B/32 - Resolution 224
!python evaluate.py --phase 3 --model clip --model_arch vitb32 --resolution 224 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# CLIP - ViT-B/32 - Resolution 512
!python evaluate.py --phase 3 --model clip --model_arch vitb32 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# CLIP - ViT-B/16 - Resolution 224
!python evaluate.py --phase 3 --model clip --model_arch vitb16 --resolution 224 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# CLIP - ViT-B/16 - Resolution 512
!python evaluate.py --phase 3 --model clip --model_arch vitb16 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# CLIP - ViT-L/14 - Resolution 224
!python evaluate.py --phase 3 --model clip --model_arch vitl14 --resolution 224 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# CLIP - ViT-L/14 - Resolution 518
!python evaluate.py --phase 3 --model clip --model_arch vitl14 --resolution 518 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# ==================== CLIP - Fine-tuned Model ====================
# CLIP - Best ViT-B/16 Layer 6 - Resolution 512
!python evaluate.py --phase 3 --model clip --model_arch vitb16_ft6 --resolution 512 --dataset spair --match_method windowed_softargmax --batch_size 8 --num_workers 4

# Phase 4 - Generalization
To assess the generalization capabilities of our models, we evaluate both baseline and fine-tuned models on **PF-PASCAL** and **PF-WILLOW**, two additional semantic correspondence benchmarks with different characteristics from SPair-71k.

### Datasets
- **PF-PASCAL**: Contains 1,351 image pairs across 20 object categories from PASCAL VOC, with keypoint annotations for semantic parts.
- **PF-WILLOW**: Features 900 image pairs of various objects with stronger viewpoint and appearance variations.
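The runs in this phase follow a regular grid: each model is evaluated on both datasets, once with the default argmax and once with windowed soft-argmax. As a convenience, the sketch below generates such command lines programmatically. The helper `build_eval_command` is hypothetical and not part of the repository; it only assembles the `evaluate.py` flags already used in this notebook.

```python
import shlex

def build_eval_command(phase, model, model_arch, resolution, dataset,
                       match_method=None, batch_size=8, num_workers=4):
    """Assemble an evaluate.py command line from the flags used in this notebook."""
    args = [
        "python", "evaluate.py",
        "--phase", str(phase),
        "--model", model,
        "--model_arch", model_arch,
        "--resolution", str(resolution),
        "--dataset", dataset,
        "--batch_size", str(batch_size),
        "--num_workers", str(num_workers),
    ]
    # Omitting --match_method leaves the default (argmax) matching in place.
    if match_method is not None:
        args += ["--match_method", match_method]
    return shlex.join(args)

# Grid for one backbone: two match methods times two datasets.
commands = [
    build_eval_command(4, "dinov2", "vits14", 518, dataset, method)
    for method in (None, "windowed_softargmax")
    for dataset in ("pfpascal", "pfwillow")
]

for command in commands:
    print(command)
```

Each printed line can be pasted into a cell with a leading `!`, or passed to `subprocess.run(shlex.split(command))` from a script.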


### DINOv2

In [None]:
# ==================== DINOv2 - Small ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vits14 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vits14 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vits14 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vits14 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== DINOv2 - Base ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vitb14 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vitb14 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vitb14 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vitb14 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== DINOv2 - Large ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vitl14 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vitl14 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vitl14 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vitl14 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== DINOv2 - Best Small Layer 2 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vits14_ft2 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vits14_ft2 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vits14_ft2 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vits14_ft2 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== DINOv2 - Best Base Layer 2 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vitb14_ft2 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vitb14_ft2 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov2 --model_arch vitb14_ft2 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov2 --model_arch vitb14_ft2 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

### DINOv3

In [None]:
# ==================== DINOv3 - Small ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vits16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vits16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vits16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vits16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== DINOv3 - Base ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vitb16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vitb16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vitb16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vitb16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== DINOv3 - Large ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vitl16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vitl16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vitl16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vitl16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== DINOv3 - Best Small Layer 4 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vits16_ft4 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vits16_ft4 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vits16_ft4 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vits16_ft4 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== DINOv3 - Best Base Layer 4 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vitb16_ft4 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vitb16_ft4 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model dinov3 --model_arch vitb16_ft4 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model dinov3 --model_arch vitb16_ft4 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

### SAM

In [None]:
# ==================== SAM - Base - Resolution 512 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitb16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitb16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitb16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitb16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== SAM - Base - Resolution 1024 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitb16 --resolution 1024 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitb16 --resolution 1024 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitb16 --resolution 1024 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitb16 --resolution 1024 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== SAM - Large - Resolution 512 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitl16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitl16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitl16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitl16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== SAM - Large - Resolution 1024 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitl16 --resolution 1024 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitl16 --resolution 1024 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitl16 --resolution 1024 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitl16 --resolution 1024 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== SAM - Best Base Layer 4 - Resolution 512 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitb16_ft4 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitb16_ft4 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitb16_ft4 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitb16_ft4 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== SAM - Best Base Layer 4 - Resolution 1024 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitb16_ft4 --resolution 1024 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitb16_ft4 --resolution 1024 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model sam --model_arch vitb16_ft4 --resolution 1024 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model sam --model_arch vitb16_ft4 --resolution 1024 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

### CLIP

In [None]:
# ==================== CLIP - ViT-B/32 - Resolution 224 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb32 --resolution 224 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb32 --resolution 224 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb32 --resolution 224 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb32 --resolution 224 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== CLIP - ViT-B/32 - Resolution 512 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb32 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb32 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb32 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb32 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== CLIP - ViT-B/16 - Resolution 224 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb16 --resolution 224 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb16 --resolution 224 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb16 --resolution 224 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb16 --resolution 224 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== CLIP - ViT-B/16 - Resolution 512 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb16 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb16 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== CLIP - ViT-L/14 - Resolution 224 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitl14 --resolution 224 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitl14 --resolution 224 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitl14 --resolution 224 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitl14 --resolution 224 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== CLIP - ViT-L/14 - Resolution 518 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitl14 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitl14 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitl14 --resolution 518 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitl14 --resolution 518 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# ==================== CLIP - Best ViT-B/16 Layer 6 - Resolution 512 ====================
# Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb16_ft6 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4

# Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb16_ft6 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4

# Windowed Soft-Argmax - PF-PASCAL
!python evaluate.py --phase 4 --model clip --model_arch vitb16_ft6 --resolution 512 --dataset pfpascal --batch_size 8 --num_workers 4 --match_method windowed_softargmax

# Windowed Soft-Argmax - PF-WILLOW
!python evaluate.py --phase 4 --model clip --model_arch vitb16_ft6 --resolution 512 --dataset pfwillow --batch_size 8 --num_workers 4 --match_method windowed_softargmax