This repository provides a Official PyTorch implementation of our CVPR 2023 paper Meta-Learning with a Geometry-Adaptive Preconditioner.
Model-agnostic meta-learning (MAML) is one of the most successful meta-learning algorithms. It has a bi-level optimization structure where the outer-loop process learns a shared initialization and the inner-loop process optimizes task-specific weights. Although MAML relies on the standard gradient descent in the inner-loop, recent studies have shown that controlling the inner-loop’s gradient descent with a meta-learned preconditioner can be beneficial. Existing preconditioners, however, cannot simultaneously adapt in a task-specific and path-dependent way. Additionally, they do not satisfy the Riemannian metric condition, which can enable the steepest descent learning with preconditioned gradient. In this study, we propose Geometry-Adaptive Preconditioned gradient descent (GAP) that can overcome the limitations in MAML; GAP can efficiently meta-learn a preconditioner that is dependent on task-specific parameters, and its preconditioner can be shown to be a Riemannian metric. Thanks to the two properties, the geometry-adaptive preconditioner is effective for improving the inner-loop optimization. Experiment results show that GAP outperforms the state-of-the-art MAML family and preconditioned gradient descent-MAML (PGD-MAML) family in a variety of few-shot learning tasks.
- We propose
a new preconditioned gradient descent method called GAP
, where it learns a preconditioner that enables a geometry-adaptive learning in the inner-loop optimization. - We prove that GAP's preconditioner has two desirable properties: (1) It is
both task-specific and path-dependent
. (2) It is aRiemannian metric
. - For large-scale architectures, we provide
a low-computational approximation called Approximate GAP
that can be theoretically shown to approximate the GAP method. - For popular few-shot learning benchmark tasks, we empirically show that
GAP outperforms the state-of-the-art MAML family and PGD-MAML family
.
This codes requires the following
- Python 3.6 or above
- PyTorch 1.8 or above
- Torchvision 0.5 or above
- Torchmeta 1.8
- mini-ImageNet
- Download mini-imagenet.tar.gz from https://github.com/renmengye/few-shot-ssl-public
- Make the folder miniimagenet and move the mini-imagenet.tar.gz into miniimagenet.
- Set the path of the folder where the data is downloaded through the argument
--folder
- Require argument
--download
when running the command for the first time.
- tiered-ImageNet
- Download tiered-imagenet.tar from https://github.com/renmengye/few-shot-ssl-public
- Make the folder tieredimagenet and move the tiered-imagenet.tar into tieredimagenet.
- Set the path of the folder where the data is downloaded through the argument
--folder
. - Require argument
--download
when running the command for the first time.
- Cars
- Download cars_train.tgz and cars_test.tgz from http://imagenet.stanford.edu/internal/car196/
- Download car_devkit.tgz from https://ai.stanford.edu/~jkrause/cars/
- Make the folder cars and move the cars_train.tgz, cars_test.tgz, and car_devkit.tgz into tieredimagenet.
- Set the path of the folder where the data is downloaded through the argument
--folder
. - Require argument
--download
when running the command for the first time.
- CUB
- Download CUB_200_2011.tgz dataset from http://www.vision.caltech.edu/visipedia-data/CUB-200-2011/
- Make the folder cub and move the CUB_200_2011.tgz into cub.
- Set the path of the folder where the data is downloaded through the argument
--folder
. - Require argument
--download
when running the command for the first time.
If you want to train 4-Conv network on mini-ImageNet, run this command:
# GAP on 5-way 1-shot
python main.py --dataset miniImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --iter 80000 --outer_lr1 0.0001 --outer_lr2 0.003 --batch_size 4 --use-cuda --GAP --download
# GAP on 5-way 5-shot
python main.py --dataset miniImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 5 --iter 80000 --outer_lr1 0.0001 --outer_lr2 0.0001 --batch_size 2 --use-cuda --GAP --download
# Approximate GAP on 5-way 1-shot
python main.py --dataset miniImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --iter 80000 --outer_lr1 0.0001 --outer_lr2 0.003 --batch_size 4 --use-cuda --GAP --approx --download
# Approximate GAP on 5-way 5-shot
python main.py --dataset miniImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 5 --iter 80000 --outer_lr1 0.0001 --outer_lr2 0.0001 --batch_size 2 --use-cuda --GAP --approx --download
If you want to train 4-Conv network on tiered-ImageNet, run this command:
# GAP on 5-way 1-shot
python main.py --dataset tieredImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --iter 130000 --outer_lr1 0.0001 --outer_lr2 0.003 --batch_size 4 --use-cuda --GAP --download
# GAP on 5-way 5-shot
python main.py --dataset tieredImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 5 --iter 200000 --outer_lr1 0.0001 --outer_lr2 0.0001 --batch_size 2 --use-cuda --GAP --download
# Approximate GAP on 5-way 1-shot
python main.py --dataset tieredImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --iter 130000 --outer_lr1 0.0001 --outer_lr2 0.003 --batch_size 4 --use-cuda --GAP --approx --download
# Approximate GAP on 5-way 5-shot
python main.py --dataset tieredImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 5 --iter 200000 --outer_lr1 0.0001 --outer_lr2 0.0001 --batch_size 2 --use-cuda --GAP --approx --download
To evaluate the trained model(s) using GAP, run this command:
# Source: miniImageNet >> Target: miniImageNet on 5-way 1-shot
python main.py --dataset_for_source miniImageNet --dataset_for_target miniImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --use-cuda --GAP --test
# Source: miniImageNet >> Target: tieredImageNet on 5-way 1-shot
python main.py --dataset_for_source miniImageNet --dataset_for_target tieredImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --use-cuda --GAP --test
# Source: miniImageNet >> Target: Cars on 5-way 1-shot
python main.py --dataset_for_source miniImageNet --dataset_for_target CARS --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --use-cuda --GAP --test
# Source: miniImageNet >> Target: CUB on 5-way 1-shot
python main.py --dataset_for_source miniImageNet --dataset_for_target CUB --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --use-cuda --GAP --test
To evaluate the trained model(s) using Approximate GAP, run this command:
# Source: miniImageNet >> Target: miniImageNet on 5-way 1-shot
python main.py --dataset_for_source miniImageNet --dataset_for_target miniImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --use-cuda --GAP --approx --test
# Source: miniImageNet >> Target: tieredImageNet on 5-way 1-shot
python main.py --dataset_for_source miniImageNet --dataset_for_target tieredImageNet --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --use-cuda --GAP --approx --test
# Source: miniImageNet >> Target: Cars on 5-way 1-shot
python main.py --dataset_for_source miniImageNet --dataset_for_target CARS --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --use-cuda --GAP --approx --test
# Source: miniImageNet >> Target: CUB on 5-way 1-shot
python main.py --dataset_for_source miniImageNet --dataset_for_target CUB --gpu_id 0 --N_ways 5 --K_shots_for_support 1 --use-cuda --GAP --approx --test
- Few-shot Regression for the sinusoid regression benchmark
- Few-shot classification on mini-ImageNet benchmark
- Few-shot classification on tiered-ImageNet benchmark
- Few-shot cross domain classification benchmark
Please consider citing our work if you find our repository/paper useful.
@InProceedings{Kang_2023_CVPR,
author = {Kang, Suhyun and Hwang, Duhun and Eo, Moonjung and Kim, Taesup and Rhee, Wonjong},
title = {Meta-Learning With a Geometry-Adaptive Preconditioner},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
pages = {16080-16090}
}
Please contact the author if you have any questions about our repository/paper: Suhyun Kang (su_hyun@snu.ac.kr
).