
ACM

This repository contains the code for the paper:

Online Continual Learning Without the Storage Constraint
Ameya Prabhu, Zhipeng Cai, Puneet Dokania, Philip Torr, Vladlen Koltun, Ozan Sener [arXiv] [PDF] [BibTeX]

Figure: Overview of our ACM model.

Installation and Dependencies

Our code was run on a 16GB RTX 3080Ti Laptop GPU with 64GB RAM and PyTorch >= 1.13; a larger GPU and more RAM will allow for faster experimentation.

  • Install all requirements needed to run the code in a Python >= 3.9 environment by running:
# First, activate a new virtual environment
pip3 install -r requirements.txt

Fast Dataset Setup

  • There is a fast, direct mechanism to download and use our datasets, implemented in this repository.
  • Set the data_dir field in src/opts.py to the directory where the dataset was downloaded (a sketch of this option follows this list).
  • All code in this repository was run on this dataset.
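For orientation, here is a minimal sketch of what the data_dir option in src/opts.py might look like; the argument name comes from the instructions above, while the default value and surrounding code are assumptions:

import argparse

parser = argparse.ArgumentParser()
# data_dir should point at the directory holding the downloaded datasets
# (the default path below is a placeholder, not the repository's actual default).
parser.add_argument('--data_dir', type=str, default='/path/to/YOUR_DATA_DIR')
args = parser.parse_args()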

Recreating the Datasets

  • YOUR_DATA_DIR should contain two subfolders: cglm and cloc, as shown below. Instructions to set up each dataset follow.
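The expected layout:

YOUR_DATA_DIR/
├── cglm/
└── cloc/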

Continual Google Landmarks V2 (CGLM)

Download Images

  • You can download the Continual Google Landmarks V2 dataset by following the instructions in its GitHub repository; run the following in the DATA_DIR directory:
wget -c https://raw.githubusercontent.com/cvdfoundation/google-landmark/master/download-dataset.sh
mkdir train && cd train
bash ../download-dataset.sh train 499

Recreating Metadata

  • Download metadata by running the following commands in the scripts directory:
wget -c https://s3.amazonaws.com/google-landmark/metadata/train_attribution.csv
python cglm_scrape.py
  • Parse the XML files and organize them as a dictionary (a sketch of this step follows this list).
  • The ordering used in the paper is available to download from here.
  • Now, select only the images that are part of the order file, and your dataset should be ready!
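As a rough guide, here is a minimal sketch of the parse-and-filter step. The XML tag names, field names, file locations, and order-file format are all assumptions; adapt them to the actual output of cglm_scrape.py and the downloaded order file:

import glob
import xml.etree.ElementTree as ET

# Parse every scraped XML file into one metadata dictionary keyed by image id.
metadata = {}
for path in glob.glob('scraped/*.xml'):
    root = ET.parse(path).getroot()
    for img in root.iter('image'):                  # assumed tag name
        img_id = img.findtext('id')                 # assumed field names
        metadata[img_id] = {'timestamp': img.findtext('timestamp'),
                            'label': img.findtext('label')}

# Keep only the images listed in the order file (assumed: one image id per line).
with open('order_file.txt') as f:
    order = [line.strip() for line in f]
dataset = [(i, metadata[i]) for i in order if i in metadata]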

Continual YFCC100M (CLOC)

Extremely Fast Image Downloader

  • Download the cloc.txt file from this link into the YOUR_DATA_DIR/cloc directory.
  • The cloc.txt file contains 36.8M image links, with the missing/broken links of the original CLOC download file removed.
  • Download the dataset in parallel and at scale using img2dataset; it finishes in under a day on an 8-node server (read the instructions in the img2dataset repo for further distributed download options):
pip install img2dataset
img2dataset --url_list cloc.txt --input_format "txt" --output_format webdataset --output_folder images --processes_count 16 --thread_count 256 --resize_mode no --skip_reencode True
  • Match the URLs and file indexes to the idx used by the training script in the original CLOC repo via this script (a rough sketch of the matching follows).
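For intuition, here is a minimal sketch of one way to do that matching. It assumes cloc.txt holds one URL per line in the original CLOC order, and that img2dataset's webdataset output writes a per-shard .parquet file with url and key columns; the linked script remains the authoritative version:

import glob
import pandas as pd

# Original CLOC order: map each URL to its training index (its line number).
with open('cloc.txt') as f:
    url_to_idx = {url.strip(): i for i, url in enumerate(f)}

# Map every downloaded sample key back to its original CLOC index.
key_to_idx = {}
for shard in glob.glob('images/*.parquet'):
    df = pd.read_parquet(shard)
    for url, key in zip(df['url'], df['key']):
        if url in url_to_idx:
            key_to_idx[key] = url_to_idx[url]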

Running the Code

Replication

Additional Experiments

  • To reproduce our KNN scaling graphs (Figure 1b), please run the following on a machine with large RAM (a rough sketch of the measurement follows this list):
cd scripts/
python knn_scaling.py
python plot_knn_results.py
  • To reproduce the blind classifier baseline, please run the following (a sketch of the idea follows this list):
cd scripts/
python run_blind.py
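For reference, here is a minimal sketch of the kind of measurement behind the KNN scaling graphs: brute-force query time as the index grows. It is purely illustrative (knn_scaling.py runs the actual experiment on real features), and the feature dimension and index sizes below are assumptions:

import time
import numpy as np

d, k = 512, 10                                       # assumed feature dim and k
queries = np.random.randn(100, d).astype('float32')
for n in (10_000, 100_000, 1_000_000):
    index = np.random.randn(n, d).astype('float32')
    t0 = time.time()
    # Squared L2 distances via ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2
    dists = ((queries ** 2).sum(1)[:, None]
             - 2 * queries @ index.T
             + (index ** 2).sum(1)[None, :])
    nearest = np.argpartition(dists, k, axis=1)[:, :k]   # k nearest, unordered
    print(f'n={n}: {time.time() - t0:.2f}s')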
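Also, a sketch of the blind-classifier idea: a baseline that never looks at the image and predicts from the label stream alone. The exact rule in run_blind.py may differ; the "predict the previous label" rule below exploits the label correlation discussed in the Updates section:

def blind_accuracy(labels):
    """Online accuracy of always predicting the previously seen label."""
    correct, prev = 0, None
    for y in labels:
        correct += (y == prev)
        prev = y
    return correct / len(labels)

# A highly correlated label stream makes the blind classifier look deceptively good.
print(blind_accuracy([0, 0, 0, 1, 1, 2, 2, 2, 2, 3]))  # 0.6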
If you discover any bugs in the code, please contact me; I will cross-check them with my nightmares.

Updates

  • We created new ordering files using the upload_date instead of the date from EXIF metadata (more unique timestamps and more faithful to the story); the result is this new order file. It differs from the order file in the CLDatasets repo, so do not cross-compare.
  • However, no substantial changes were observed in the trends! The label correlation does not go away (in fact, it slightly increases with the better ordering, which breaks the ties between identical timestamps that previously led to random ordering!). A sketch of the reordering follows.
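A minimal sketch of such a reordering, assuming a metadata table with an upload_date column (the file and column names are illustrative, not the repository's actual ones):

import pandas as pd

# Sort samples by upload_date; a stable sort keeps the existing order among
# any remaining ties instead of shuffling them.
meta = pd.read_csv('metadata.csv')
order = meta.sort_values('upload_date', kind='stable').index.tolist()
pd.Series(order).to_csv('order_upload_date.csv', index=False, header=False)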

Citation

We hope ACM is a strong method for comparison and that this idea/codebase is useful for your cool CL idea! To cite our work:

@article{prabhu2023online,
  title={Online Continual Learning Without the Storage Constraint},
  author={Prabhu, Ameya and Cai, Zhipeng and Dokania, Puneet and Torr, Philip and Koltun, Vladlen and Sener, Ozan},
  journal={arXiv preprint arXiv:2305.09253},
  year={2023}
}

About

Codebase for Adaptive Continual Memory (ACM)
