# Adaptive Re-ranking Pipeline - Google Colab Execution

This notebook runs the complete adaptive re-ranking pipeline in Google Colab.

## Pipeline Overview:
1. Set up the environment and install dependencies
2. Download datasets
3. Run VPR evaluation
4. Extract features (8 improved features)
5. Train the logistic regression model
6. Apply the model with threshold calibration
7. Run full re-ranking on all queries (reference results)
8. Run adaptive image matching (hard queries only)
9. Evaluate adaptive re-ranking
10. Generate threshold analysis plots
11. Serialize results to MATLAB
12. Download results


## Step 1: Setup Environment


In [None]:
# Mount Google Drive (optional - if you want to save results)
from google.colab import drive
drive.mount('/content/drive')


In [None]:
# Clone repository
!git clone --recursive https://github.com/FarInHeight/Visual-Place-Recognition-Project.git
%cd Visual-Place-Recognition-Project


In [None]:
# Install image matching models
%cd image-matching-models
!pip install -e ".[all]"
%cd ..


In [None]:
# Install other dependencies
!pip install faiss-cpu scikit-learn joblib matplotlib scipy tqdm


## Step 2: Download Datasets

**Note**: You may need to download datasets manually from Google Drive links and upload them to Colab.


In [None]:
# Download datasets (if not already downloaded)
# Note: You may need to download manually from Google Drive links
# !python download_datasets.py


## Step 3: Run VPR Evaluation

Run this for each test dataset (SF-XS, Tokyo-XS, SVOX).

**Important**: Each run writes its outputs to `logs/<log_dir>/[timestamp]/`. Note the timestamp of your run; the later steps need it in place of `[timestamp]`.


In [None]:
# Example: SF-XS Test
# Replace paths with your actual dataset paths
!python VPR-methods-evaluation/main.py \
  --num_workers 4 \
  --batch_size 32 \
  --log_dir log_sf_xs_test \
  --method=cosplace --backbone=ResNet18 --descriptors_dimension=512 \
  --image_size 512 512 \
  --database_folder data/sf_xs/test/database \
  --queries_folder data/sf_xs/test/queries \
  --num_preds_to_save 20 \
  --recall_values 1 5 10 20 \
  --save_for_uncertainty \
  --device cuda
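
Instead of copying the timestamp by hand, the run directory can be looked up from Python. The helper below is a minimal sketch, not part of the repository: `latest_run_dir` is a hypothetical name, and it assumes that each run creates one sub-directory under `logs/<log_dir>/` and that the most recently modified one is the run you want.

```python
from pathlib import Path


def latest_run_dir(log_root: str) -> Path:
    """Return the most recently modified sub-directory of log_root.

    Assumption: every evaluation run creates exactly one sub-directory
    (the [timestamp] folder) directly under log_root.
    """
    run_dirs = [p for p in Path(log_root).iterdir() if p.is_dir()]
    if not run_dirs:
        raise FileNotFoundError(f"no run directories found in {log_root}")
    # Pick by modification time rather than by name, since the exact
    # timestamp format is not assumed here.
    return max(run_dirs, key=lambda p: p.stat().st_mtime)
```

For example, `preds_dir = latest_run_dir("logs/log_sf_xs_test") / "preds"` gives the path used in the following steps. In a notebook cell, the value can be passed to a shell command through IPython's `{variable}` interpolation, for example `--preds-dir {preds_dir}`.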


## Step 4: Extract Features (8 Improved Features)

**Important**: Replace `[timestamp]` with the actual timestamp from your log directory.


In [None]:
# Extract features for test (SF-XS test)
# Replace [timestamp] with actual timestamp
!python -m extension_6_1.stage_1_extract_features_no_inliers \
  --preds-dir logs/log_sf_xs_test/[timestamp]/preds \
  --z-data-path logs/log_sf_xs_test/[timestamp]/z_data.torch \
  --output-path data/features_and_predictions/features_sf_xs_test_improved.npz \
  --positive-dist-threshold 25


## Step 5: Train Logistic Regression Model

**Note**: This step needs training and validation features (SVOX train, SF-XS val). Produce them first by repeating Steps 3 and 4 on those splits.


In [None]:
# Train model with C tuning
!python -m extension_6_1.stage_3_train_logreg_easy_queries \
  --train-features data/features_and_predictions/features_svox_train_improved.npz \
  --val-features data/features_and_predictions/features_sf_xs_val_improved.npz \
  --output-model logreg_easy_queries_optimal_C_tuned.pkl \
  --threshold-method f1
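
To make the `--threshold-method f1` option more concrete, here is a minimal sketch of F1-based threshold selection in plain Python. It is an illustration under assumptions, not the code of `stage_3_train_logreg_easy_queries`: it assumes label `1` means "easy query" and that `probs` holds the model's predicted easy-probabilities on the validation set. The function names are hypothetical.

```python
def f1_score(labels, preds):
    """F1 of binary predictions against binary labels (1 = positive)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(probs, labels, candidates):
    """Return (threshold, f1) for the candidate threshold with the highest F1.

    A sample is predicted positive when its probability is >= threshold.
    Ties are resolved in favour of the first candidate.
    """
    best_t, best_f1 = None, -1.0
    for t in candidates:
        preds = [1 if p >= t else 0 for p in probs]
        score = f1_score(labels, preds)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1
```

The threshold is chosen on validation data only, so the test set stays untouched until Step 6.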


## Step 6: Apply Model with Threshold Calibration


In [None]:
# Apply to SF-XS test (with calibration)
!python -m extension_6_1.stage_4_apply_logreg_easy_queries \
  --model-path logreg_easy_queries_optimal_C_tuned.pkl \
  --feature-path data/features_and_predictions/features_sf_xs_test_improved.npz \
  --output-path data/features_and_predictions/logreg_sf_xs_test.npz \
  --hard-queries-output data/features_and_predictions/hard_queries_sf_xs_test.txt \
  --calibrate-threshold
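
The effect of this step can be sketched as follows: every query gets an easy-probability, and queries below the threshold are written to the hard-queries file. This is a simplified illustration that leaves out threshold calibration. The function names and the file layout (one query index per line) are assumptions, so check the file produced by the script before relying on them.

```python
def split_queries(easy_probs, threshold):
    """Return (easy_ids, hard_ids).

    A query counts as hard when its predicted easy-probability is
    below the threshold. Query ids are positions in easy_probs.
    """
    easy, hard = [], []
    for qid, prob in enumerate(easy_probs):
        (easy if prob >= threshold else hard).append(qid)
    return easy, hard


def write_hard_queries(hard_ids, path):
    """Write one query index per line (assumed file layout)."""
    with open(path, "w") as f:
        f.writelines(f"{qid}\n" for qid in hard_ids)
```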


## Step 7: Run Full Re-ranking (Ground Truth)

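**Important**: Run full re-ranking on all queries FIRST. It provides the reference results that adaptive re-ranking is compared against, and the threshold analysis in Step 10 reads its inlier counts. This is the longest step (~2-4 hours per dataset).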


In [None]:
# Run full re-ranking for SF-XS test (all queries)
# Replace [timestamp] with actual timestamp
!python match_queries_preds.py \
  --preds-dir logs/log_sf_xs_test/[timestamp]/preds \
  --matcher superpoint-lg \
  --device cuda \
  --num-preds 20 \
  --out-dir logs/log_sf_xs_test/[timestamp]/preds_superpoint-lg


## Step 8: Run Adaptive Image Matching (Only Hard Queries)

This is much faster than full re-ranking because it runs image matching only for the queries listed in the hard-queries file from Step 6.


In [None]:
# Run adaptive matching (only hard queries)
# Replace [timestamp] with actual timestamp
!python match_queries_preds_adaptive.py \
  --preds-dir logs/log_sf_xs_test/[timestamp]/preds \
  --hard-queries-list data/features_and_predictions/hard_queries_sf_xs_test.txt \
  --out-dir logs/log_sf_xs_test/[timestamp]/preds_superpoint-lg_adaptive \
  --matcher superpoint-lg \
  --device cuda \
  --num-preds 20
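
The idea behind adaptive matching can be sketched in a few lines: hard queries have their retrieved candidates re-ordered by inlier count, while easy queries keep the retrieval order and skip matching entirely. This is a conceptual sketch with hypothetical names and in-memory data structures. The actual script works on the files in `--preds-dir`.

```python
def rerank_by_inliers(candidates, inliers):
    """Sort candidates by inlier count, highest first.

    sorted() is stable, so candidates with equal inlier counts keep
    their original retrieval order.
    """
    order = sorted(range(len(candidates)), key=lambda i: -inliers[i])
    return [candidates[i] for i in order]


def adaptive_rerank(retrieved, hard_ids, inliers_by_query):
    """Re-rank only hard queries; easy queries keep their retrieval order.

    retrieved:        {query_id: [candidate, ...]} in retrieval order
    hard_ids:         iterable of query ids flagged as hard
    inliers_by_query: {query_id: [inlier_count, ...]} for hard queries
    """
    hard = set(hard_ids)
    result = {}
    for qid, candidates in retrieved.items():
        if qid in hard:
            result[qid] = rerank_by_inliers(candidates, inliers_by_query[qid])
        else:
            result[qid] = list(candidates)
    return result
```

Because inlier counts are needed only for hard queries, the matching cost scales with the number of hard queries rather than with the full query set.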


## Step 9: Evaluate Adaptive Re-ranking


In [None]:
# Evaluate adaptive re-ranking
# Replace [timestamp] with actual timestamp
!python -m extension_6_1.stage_5_adaptive_reranking_eval \
  --preds-dir logs/log_sf_xs_test/[timestamp]/preds \
  --inliers-dir logs/log_sf_xs_test/[timestamp]/preds_superpoint-lg_adaptive \
  --logreg-output data/features_and_predictions/logreg_sf_xs_test.npz \
  --num-preds 20 \
  --positive-dist-threshold 25 \
  --recall-values 1 5 10 20
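
For reference, recall@N can be computed as in the sketch below. It assumes that a prediction counts as correct when it lies within `--positive-dist-threshold` (taken here to be metres) of the query, and that the distances between each query and its ranked predictions are already available. The function name and input layout are hypothetical.

```python
def recall_at_n(pred_distances, recall_values=(1, 5, 10, 20),
                positive_dist_threshold=25.0):
    """Return {N: recall@N in percent}.

    pred_distances[q][k] is the distance between query q and its
    k-th ranked prediction (k = 0 is the top prediction). A query is
    a hit at N if any of its first N predictions is within the threshold.
    """
    recalls = {}
    for n in recall_values:
        hits = sum(
            1 for dists in pred_distances
            if any(d <= positive_dist_threshold for d in dists[:n])
        )
        recalls[n] = 100.0 * hits / len(pred_distances)
    return recalls
```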


## Step 10: Threshold Analysis (Generate Plots)

This generates plots showing R@1 vs threshold with selected threshold markers.


In [None]:
# Run comprehensive threshold analysis
# Replace [timestamp] with actual timestamps
!python adaptive_reranking_threshold_analysis.py \
  --model-path logreg_easy_queries_optimal_C_tuned.pkl \
  --datasets sf_xs_test tokyo_xs_test \
  --feature-paths \
    data/features_and_predictions/features_sf_xs_test_improved.npz \
    data/features_and_predictions/features_tokyo_xs_test_improved.npz \
  --preds-dirs \
    logs/log_sf_xs_test/[timestamp]/preds \
    logs/log_tokyo_xs_test/[timestamp]/preds \
  --inliers-dirs \
    logs/log_sf_xs_test/[timestamp]/preds_superpoint-lg \
    logs/log_tokyo_xs_test/[timestamp]/preds_superpoint-lg \
  --output-dir output_stages/threshold_analysis_comprehensive \
  --threshold-range 0.1 0.99 \
  --threshold-step 0.05 \
  --num-preds 20 \
  --positive-dist-threshold 25
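
The sweep behind these plots can be sketched as follows. For each threshold, a query uses its re-ranked top-1 result if it is classified as hard and its retrieval top-1 result otherwise, which is why this step reads the full re-ranking output from Step 7. The sketch is an assumption about the general approach, not the script's code. The names are hypothetical, and whether the script includes the upper end of `--threshold-range` is not assumed here.

```python
def threshold_grid(start, stop, step):
    """Thresholds start, start + step, ... that do not exceed stop."""
    count = int((stop - start) / step + 1e-9) + 1
    # Multiply instead of accumulating to avoid floating-point drift.
    return [round(start + i * step, 10) for i in range(count)]


def adaptive_r1_curve(easy_probs, retrieval_top1_ok, reranked_top1_ok, thresholds):
    """Return [(threshold, R@1 in percent, fraction of hard queries), ...].

    easy_probs[q]        predicted easy-probability of query q
    retrieval_top1_ok[q] True if the retrieval top-1 is correct
    reranked_top1_ok[q]  True if the top-1 after full re-ranking is correct
    """
    n = len(easy_probs)
    curve = []
    for t in thresholds:
        hits = 0
        hard = 0
        for prob, base_ok, rerank_ok in zip(
            easy_probs, retrieval_top1_ok, reranked_top1_ok
        ):
            if prob < t:  # hard query: use the re-ranked result
                hard += 1
                hits += bool(rerank_ok)
            else:         # easy query: keep the retrieval result
                hits += bool(base_ok)
        curve.append((t, 100.0 * hits / n, hard / n))
    return curve
```

The third value in each tuple is the share of queries sent to image matching, so the curve shows the trade-off between R@1 and matching cost.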


## Step 11: Serialize Results to MATLAB


In [None]:
# Serialize results to MATLAB .mat files
!python serialize_results_to_matlab.py \
  --results-dir output_stages/threshold_analysis_comprehensive \
  --model-path logreg_easy_queries_optimal_C_tuned.pkl \
  --feature-path data/features_and_predictions/features_sf_xs_test_improved.npz \
  --output-dir output_stages/matlab_files
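
If you want to export additional arrays yourself, `scipy.io.savemat` writes a dictionary of NumPy arrays to a `.mat` file. The sketch below is independent of `serialize_results_to_matlab.py`. The function name and the variable names `thresholds` and `recall_at_1` are hypothetical and need not match what the script writes.

```python
import numpy as np
from scipy.io import savemat


def save_curve_to_mat(path, thresholds, recall_at_1):
    """Save a threshold/R@1 curve as two MATLAB variables.

    With savemat's default settings, 1-D arrays are stored as row
    vectors, so each variable loads in MATLAB as a 1 x N matrix.
    """
    savemat(path, {
        "thresholds": np.asarray(thresholds, dtype=float),
        "recall_at_1": np.asarray(recall_at_1, dtype=float),
    })
```

In MATLAB, `load('curve.mat')` then exposes both variables in the workspace.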


## Step 12: Download Results

Download plots and results before the Colab session ends.


In [None]:
# Download plots and results
from google.colab import files
import zipfile
import os

# Create zip file with results
# (directory, extension) pairs to collect: plots, summary reports, MATLAB files
targets = [
    ('output_stages/threshold_analysis_comprehensive', '.png'),
    ('output_stages/threshold_analysis_comprehensive', '.md'),
    ('output_stages/matlab_files', '.mat'),
]

with zipfile.ZipFile('results.zip', 'w') as zipf:
    for directory, extension in targets:
        if not os.path.exists(directory):
            continue  # skip outputs of steps that were not run
        for file in os.listdir(directory):
            if file.endswith(extension):
                zipf.write(f'{directory}/{file}')

# Download
files.download('results.zip')


## Important Notes

### ❓ Do You Need Colab Pro?

**Answer: NO! Free Colab works fine for most steps.**

| Step | Free Colab? | Notes |
|------|-------------|-------|
| VPR Evaluation | ✅ Yes | Fast (~5-10 min), no timeout risk |
| Feature Extraction | ✅ Yes | Very fast (~1-2 min) |
| Model Training | ✅ Yes | Fast (~1-2 min) |
| Model Application | ✅ Yes | Very fast (~1 min) |
| **Full Re-ranking** | ⚠️ **Maybe** | **2-4 hours - may timeout on free tier** |
| Adaptive Matching | ✅ Yes | 30-60 min, usually fine |
| Threshold Analysis | ✅ Yes | Fast (~10-20 min) |

### Free Colab Limitations:
1. **Session timeout**: sessions run for at most ~12 hours and disconnect much sooner when idle (keep tab active!)
2. **GPU hours**: ~12 hours/day at best; the quota is dynamic and not guaranteed
3. **Storage**: ~80GB (usually enough)
4. **File persistence**: Files deleted when session ends (save to Drive!)

### Recommendations for Free Colab:
1. **Save to Google Drive**: Mount Drive and save important results
2. **Use GPU**: Enable GPU in Runtime → Change runtime type (T4 is free!)
3. **Keep tab active**: During long operations to prevent timeout
4. **Hybrid approach**: Run full re-ranking locally if you have GPU
5. **Download results**: Download plots and .mat files before session ends
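
Saving to Drive can be done with a small helper like the one below. It is a sketch under assumptions: Drive is mounted at `/content/drive` as in Step 1, and `backup_to_drive` and the target folder name are hypothetical.

```python
import shutil
from pathlib import Path


def backup_to_drive(src_dirs, drive_root):
    """Copy each existing source directory into drive_root.

    Each directory keeps its own name under drive_root. Missing source
    directories are skipped, and existing copies are updated in place.
    Returns the list of destination paths that were written.
    """
    copied = []
    for src in map(Path, src_dirs):
        if src.is_dir():
            dest = Path(drive_root) / src.name
            shutil.copytree(src, dest, dirs_exist_ok=True)
            copied.append(dest)
    return copied
```

For example, `backup_to_drive(["output_stages", "data/features_and_predictions"], "/content/drive/MyDrive/vpr_results")` copies the plots, `.mat` files, and features after each long step, so a disconnect costs at most the step in progress.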

### When You Might Need Colab Pro:
- ❌ You don't have a local GPU AND need to run full re-ranking in Colab
- ❌ You need background execution (can't keep tab active)
- ❌ You need more than 12 GPU hours/day
- ❌ You need 24-hour sessions (vs 12 hours on free)

### Time Estimates (Free Colab with T4 GPU):
- VPR evaluation: ~5-10 minutes per dataset ✅
- Full re-ranking: ~2-4 hours per dataset ⚠️ (may timeout)
- Adaptive matching: ~30-60 minutes ✅
- Threshold analysis: ~10-20 minutes ✅

### Best Strategy for Free Colab:
1. **Colab (Free)**: VPR evaluation, feature extraction, model training/application
2. **Local (if GPU)**: Full re-ranking (no timeout risk)
3. **Colab (Free)**: Adaptive matching, threshold analysis
4. **Download**: All results before session ends

**This way, you don't need Colab Pro!** 🎉
