RDumb: A simple approach that questions our progress in continual test-time adaptation

🆕 CCC can now be streamed, no download required!

By default, CCC is now streamed from the cloud, with no download or generation of the data required. For example:

import eval

dataloader = eval.get_webds_loader("baseline_20_transition+speed_1000_seed_44")
for batch in dataloader:
    pass  # ... run your model on the batch here

Available datasets: baseline_<baseline acc>_transition+speed_<speed>_seed_<seed>

baseline acc: 0, 20, 40
speed: 1000, 2000, 5000
seed: 43, 44, 45
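
The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the helper names ccc_name and parse_ccc_name are hypothetical and are not part of this repository. The allowed values are copied from the list above.

```python
import re

# Allowed values, copied from the list of available datasets.
BASELINES = (0, 20, 40)
SPEEDS = (1000, 2000, 5000)
SEEDS = (43, 44, 45)

_NAME_RE = re.compile(r"baseline_(\d+)_transition\+speed_(\d+)_seed_(\d+)")


def ccc_name(baseline: int, speed: int, seed: int) -> str:
    """Build a streamed dataset name (hypothetical helper, not in the repo)."""
    if baseline not in BASELINES or speed not in SPEEDS or seed not in SEEDS:
        raise ValueError(f"unsupported combination: {baseline}, {speed}, {seed}")
    return f"baseline_{baseline}_transition+speed_{speed}_seed_{seed}"


def parse_ccc_name(name: str) -> tuple[int, int, int]:
    """Split a dataset name back into (baseline, speed, seed)."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a CCC dataset name: {name!r}")
    baseline, speed, seed = (int(group) for group in match.groups())
    return baseline, speed, seed


# All 3 x 3 x 3 = 27 combinations of the listed values.
ALL_NAMES = [ccc_name(b, sp, sd) for b in BASELINES for sp in SPEEDS for sd in SEEDS]
```

The resulting string, for example ccc_name(20, 1000, 44), can then be passed to eval.get_webds_loader as in the snippet above.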

RDumb: A simple approach that questions our progress in continual test-time adaptation

This repository contains the code used in our NeurIPS 2023 paper to evaluate models on our benchmark, Continuously Changing Corruptions (CCC). Using CCC, we show that all current TTA methods fail and become worse than a pretrained, non-adapting model. We also show that a very simple baseline approach sets the state of the art not just on CCC, but also on previous benchmarks and across different architectures.

Dataset (Continuously Changing Corruptions)

CCC can be thought of as ImageNet-C, specifically built to evaluate continuously adapting models. Each image in CCC is corrupted with two noises. Using two noises lets us keep the baseline accuracy of the dataset constant while enabling smooth transitions between pairs of noises.

You do not need to generate or download the dataset; by default, it is streamed from the cloud. The code to generate the dataset can be found in generate.py. The code is parallelizable, which means that the whole dataset can be generated quickly.

For example, you can start generating with the following command:

python3 generate.py                  \
    --imagenetval /imagenet_dir/val/ \
    --dest /destination/folder/      \
    --baseline 40                    \
    --processind 1                   \
    --totalprocesses 1

To use more processes, simply run the script multiple times with different processind arguments. Because CCC is made up of 3 seeds x 3 transition speeds, it is recommended to use a total number of processes that is a multiple of 9. Here is an example Slurm script that can be used to launch multiple processes:

#!/bin/bash
#SBATCH --job-name=ccc
#SBATCH --array=0-89

singularity exec gen.sif                \
python3 generate.py                     \
    --imagenetval /path/to/imagenetval  \
    --dest /path/to/dest/               \
    --baseline 20                       \
    --processind ${SLURM_ARRAY_TASK_ID} \
    --totalprocesses 90

Note: CCC-Hard, CCC-Medium, and CCC-Easy are generated with --baseline 0, 20, and 40 respectively.
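
If you prefer to launch the generation processes without Slurm, the command lines can be assembled programmatically. The following is a minimal sketch, not part of this repository: generate_commands is a hypothetical helper, the flags are the ones shown in the examples above, and it assumes zero-based process indices, as in the --array=0-89 / --totalprocesses 90 pairing of the Slurm script.

```python
import shlex

# Difficulty names and their --baseline values, taken from the note above.
DIFFICULTY_TO_BASELINE = {"CCC-Hard": 0, "CCC-Medium": 20, "CCC-Easy": 40}


def generate_commands(imagenet_val: str, dest: str,
                      difficulty: str, total_processes: int) -> list[str]:
    """Return one shell-quoted generate.py command per process (hypothetical helper)."""
    baseline = DIFFICULTY_TO_BASELINE[difficulty]
    commands = []
    # Assumption: process indices run from 0 to total_processes - 1.
    for process_index in range(total_processes):
        argv = [
            "python3", "generate.py",
            "--imagenetval", imagenet_val,
            "--dest", dest,
            "--baseline", str(baseline),
            "--processind", str(process_index),
            "--totalprocesses", str(total_processes),
        ]
        commands.append(shlex.join(argv))
    return commands
```

For instance, generate_commands("/path/to/imagenetval", "/path/to/dest/", "CCC-Medium", 9) returns nine command strings, one per process, which can be run in separate shells or handed to a job scheduler of your choice.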

Evaluating Adaptive Models

There are a few TTA methods available to test, including ours, RDumb. Because each difficulty level of CCC contains 3 seeds x 3 transition speeds, the evaluation code is built to evaluate the 9 runs all at once. A sample evaluation can be run in the following manner:

python3 eval.py                         \
    --mode rdumb                        \
    --dset /path/to/ccc/                \
    --logs /logs/folder/                \
    --baseline 20                       \
    --processind ${SLURM_ARRAY_TASK_ID}
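
To make the 9 runs per difficulty level concrete, the sketch below enumerates them using the streamed dataset naming scheme and summarizes per-run results. It is an illustration under assumptions: evaluate_run is a hypothetical callable that you supply (it takes a dataset name and returns an accuracy), and averaging across runs is only one possible summary, not necessarily what eval.py does.

```python
from itertools import product
from statistics import mean
from typing import Callable

# Transition speeds and seeds listed for the streamed datasets.
SPEEDS = (1000, 2000, 5000)
SEEDS = (43, 44, 45)


def runs_for_baseline(baseline: int) -> list[str]:
    """Names of the 3 speeds x 3 seeds = 9 runs for one difficulty level."""
    return [
        f"baseline_{baseline}_transition+speed_{speed}_seed_{seed}"
        for speed, seed in product(SPEEDS, SEEDS)
    ]


def evaluate_difficulty(
    baseline: int, evaluate_run: Callable[[str], float]
) -> tuple[dict[str, float], float]:
    """Call the user-supplied evaluate_run on each run and average the results."""
    per_run = {name: evaluate_run(name) for name in runs_for_baseline(baseline)}
    # Assumption: a plain mean over the nine runs is used as the summary.
    return per_run, mean(list(per_run.values()))
```

Here evaluate_run could, for example, wrap a loop over eval.get_webds_loader(name) that adapts a model and tracks its accuracy.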

Citation

@inproceedings{
press2023rdumb,
title={{RD}umb: A simple approach that questions our progress in continual test-time adaptation},
author={Ori Press and Steffen Schneider and Matthias Kuemmerer and Matthias Bethge},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=VfP6VTVsHc}
}

Acknowledgements

Much of the model code is based on the original Tent and EATA code. The generation code is based on the ImageNet-C code. Other repos used: RPL, CPL, and CoTTA. A previous version of the dataset and code was published in Shift Happens '22 @ ICML.

See the LICENSE for more details and licenses.
