Towards Generalizable Scene Change Detection

This repository represents the official implementation of the paper titled "Towards Generalizable Scene Change Detection (CVPR 2025)".


Jaewoo Kim · Uehwan Kim
CVPR 2025



Comparative results of the current state-of-the-art model and our GeSCF on various unseen images.

💡 Problem Formulation

We formulate the research problem by posing a fundamental question: "Can contemporary SCD models detect arbitrary real-world changes beyond the scope of research data?" Our findings, illustrated in the figure above, indicate that their reported effectiveness does not hold in real-world applications. Specifically, we observe that they (1) produce inconsistent change masks when the input order of the two images is reversed, and (2) suffer significant performance drops when deployed to unseen domains with different visual features.

In this work, we address these two pivotal SCD problems by proposing a novel framework (GeSCF) and a novel benchmark (GeSCD) to foster SCD research in generalizability.
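To make the first failure mode concrete, here is a minimal sketch of how one could quantify it, assuming a hypothetical `model(img_t0, img_t1)` inference function that returns a binary change mask (not part of this repository):

```python
import numpy as np

def reversal_consistency(model, img_t0, img_t1):
    """IoU between the change masks predicted for (t0, t1) and (t1, t0).

    `model` is a hypothetical stand-in for any SCD inference function
    returning a binary change mask; 1.0 means perfectly consistent.
    """
    mask_fwd = model(img_t0, img_t1).astype(bool)
    mask_bwd = model(img_t1, img_t0).astype(bool)
    union = np.logical_or(mask_fwd, mask_bwd).sum()
    if union == 0:
        return 1.0  # both masks empty: trivially consistent
    return np.logical_and(mask_fwd, mask_bwd).sum() / union
```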

🔧 Installation

Follow the steps below to set up the environment for running GeSCF:

1. Clone this repository

```bash
git clone https://github.com/1124jaewookim/towards-generalizable-scene-change-detection.git
cd towards-generalizable-scene-change-detection/src
```

2. Install required dependencies

You can install the dependencies manually or from a requirements.txt file.

3. Download pretrained weights

🗂 Datasets

For a comprehensive evaluation of SCD performance, we consider three standard SCD datasets with different characteristics and our proposed ChangeVPR dataset.

Please follow this page to download the VL-CMU-CD dataset.

Please follow this page to download the TSUNAMI dataset.

Please follow this page to download the ChangeSim dataset.

Please follow this page to download our ChangeVPR dataset.

📁 Dataset Structure

Prepare your dataset in the following structure:

```
your_dataset_root/
└── ChangeVPR/
    └── SF-XL/
        ├── t0/               # Images at time t0
        │   ├── 00000000.png
        │   ├── 00000001.png
        │   └── ...
        ├── t1/               # Images at time t1
        │   ├── 00000000.png
        │   ├── 00000001.png
        │   └── ...
        └── mask/             # Ground-truth binary change masks
            ├── 00000000.png
            ├── 00000001.png
            └── ...
```

or

```
your_dataset_root/
└── VL_CMU_CD/
    └── test/
        ├── t0/               # Images at time t0
        │   ├── 000_1_00_0.png
        │   ├── 000_1_01_0.png
        │   └── ...
        ├── t1/               # Images at time t1
        │   ├── 000_1_00_0.png
        │   ├── 000_1_01_0.png
        │   └── ...
        └── gt/               # Ground-truth binary change masks
            ├── 000_1_00_0.png
            ├── 000_1_01_0.png
            └── ...
```
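For reference, here is a minimal sketch of how these folders pair up by file name; `list_image_triplets` and `gt_dir` are hypothetical names, not part of this repository:

```python
from pathlib import Path

def list_image_triplets(split_root, gt_dir="mask"):
    """Collect (t0, t1, ground-truth) path triplets for a split laid out
    as above; file names are assumed to match across the three folders."""
    root = Path(split_root)
    triplets = []
    for t0_path in sorted((root / "t0").glob("*.png")):
        t1_path = root / "t1" / t0_path.name
        gt_path = root / gt_dir / t0_path.name
        if t1_path.exists() and gt_path.exists():
            triplets.append((t0_path, t1_path, gt_path))
    return triplets

# e.g. list_image_triplets("your_dataset_root/ChangeVPR/SF-XL")
#      list_image_triplets("your_dataset_root/VL_CMU_CD/test", gt_dir="gt")
```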

🚀 Inference on Dataset

To run inference on an entire dataset, use the script src/test.sh for convenience:

```bash
CUDA_VISIBLE_DEVICES=0 python test.py \
    --test-dataset VL_CMU_CD \
    --output-size 512 \
    \
    --dataset-path F:/GeSCD/VL_CMU_CD/test \
    \
    --feature-facet key \
    --feature-layer 17 \
    --embedding-layer 32 \
    \
    --sam-backbone vit_h \
    --pseudo-backbone vit_h \
    \
    --points-per-side 32 \
    --pred-iou-thresh 0.7 \
    --stability-score-thresh 0.7
```

📌 Note

| Argument | Description |
| --- | --- |
| `--test-dataset` | Dataset name (e.g., `VL_CMU_CD`, `ChangeVPR`, `TSUNAMI`) |
| `--dataset-path` | Path to the dataset root directory |
| `--output-size` | Final resolution of the output change mask |
| `--feature-facet` | Which ViT token to extract (`key`, `query`, or `value`) |
| `--feature-layer` | ViT layer to extract features from |
| `--embedding-layer` | ViT layer to extract token embeddings for similarity |
| `--sam-backbone` | Backbone used in Segment Anything (e.g., `vit_h`, `vit_l`, `vit_b`) |
| `--pseudo-backbone` | Backbone used in the pseudo mask generator |
| `--points-per-side` | Controls the sampling density for SAM proposals |
| `--pred-iou-thresh` | Higher value → fewer but more confident masks |
| `--stability-score-thresh` | Higher value → fewer but more stable masks |

Tips for Faster Inference

  • Use smaller backbones:
    Replace vit_h with vit_l or vit_b for --sam-backbone and --pseudo-backbone.
  • Reduce --points-per-side to 16 for fewer region proposals.
  • Increase --pred-iou-thresh and --stability-score-thresh to filter out weak or noisy masks.

🏆 However, the best performance reported in the paper was achieved using:
--sam-backbone vit_h, --pseudo-backbone vit_h, --points-per-side 32,
--pred-iou-thresh 0.7, and --stability-score-thresh 0.7.
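For intuition, the three SAM-related flags correspond directly to arguments of Segment Anything's automatic mask generator. Here is a minimal sketch against the upstream segment_anything package (the checkpoint and image paths are placeholders, and this is not the repository's own wiring):

```python
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load a SAM backbone; the checkpoint path is a placeholder.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to(device="cuda")

mask_generator = SamAutomaticMaskGenerator(
    model=sam,
    points_per_side=32,          # sampling density of point prompts
    pred_iou_thresh=0.7,         # filter out low-confidence masks
    stability_score_thresh=0.7,  # filter out unstable masks
)

image = cv2.cvtColor(cv2.imread("t0/00000000.png"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # list of dicts: 'segmentation', 'predicted_iou', ...
```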

🖼️ Inference on Single Image Pair

To run inference on a single image pair (e.g., for visualization or quick testing), use the script src/test_single.sh:

📌 Make sure to set --test-dataset to Random when testing with a manually specified image path.

```bash
CUDA_VISIBLE_DEVICES=0 python test_single.py \
    --test-dataset Random \
    --output-size 512 \
    \
    --img-t0-path F:/GeSCD/ChangeVPR/SF-XL/t0/00000001.png \
    --img-t1-path F:/GeSCD/ChangeVPR/SF-XL/t1/00000001.png \
    --gt-path F:/GeSCD/ChangeVPR/SF-XL/mask/00000001.png \
    \
    --feature-facet key \
    --feature-layer 17 \
    --embedding-layer 32 \
    \
    --sam-backbone vit_h \
    --pseudo-backbone vit_h \
    \
    --points-per-side 32 \
    --pred-iou-thresh 0.7 \
    --stability-score-thresh 0.7
```

📝 The ground-truth mask (--gt-path) is optional.
If provided, precision/recall/F1 will be calculated and logged.
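For clarity, here is a minimal sketch of the pixel-wise metrics in question, using the standard definitions (the repository's own evaluation code may differ in details):

```python
import numpy as np

def binary_mask_metrics(pred, gt):
    """Pixel-wise precision, recall, and F1 for binary change masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()   # true positives
    fp = np.logical_and(pred, ~gt).sum()  # false positives
    fn = np.logical_and(~pred, gt).sum()  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1
```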

🎨 Demo

Our GeSCF, as a SAM-based zero-shot framework, demonstrates exceptional robustness across a wide range of terrain conditions, extending even to challenging remote sensing change detection scenarios. Below are example triplets of t0 and t1 images with GeSCF's corresponding predictions.



Qualitative results of the GeSCF on ChangeVPR SF-XL (urban) split.


Qualitative results of the GeSCF on ChangeVPR St Lucia (suburban) split.


Qualitative results of the GeSCF on ChangeVPR Nordland (rural) split.


Qualitative results of the GeSCF on SECOND (test) benchmark.

🎖 Acknowledgement

We sincerely thank CSCDNet, CDResNet, DR-TANet, and C-3PO for providing strong SCD baselines. We also thank Segment Anything for providing an excellent vision foundation model.

🔗 Related Repositories

This project shares a similar research direction with other works exploring zero-shot scene change detection. Notable examples include segment-any-change, zero-shot-scd, and MV3DCD.

📃 Citation

If you find the work useful for your research, please cite:

```bibtex
@InProceedings{Kim_2025_CVPR,
    author    = {Kim, Jae-Woo and Kim, Ue-Hwan},
    title     = {Towards Generalizable Scene Change Detection},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {24463-24473}
}
```
