Comparing different active learning strategies for image classification (FSDL course 2021 capstone project)
This repository builds upon the template of lab 08 of the Full Stack Deep Learning Spring 2021 labs and extends it with new datasets, models and a full active learning strategy experiment framework.
It was implemented as a capstone project for the Spring 2021 course by Stefan Josef, Matthias Pfenninger and Ravindra Bharati.
Datasets: DroughtWatch, MNIST, Cassava and DeepWeeds
Models: PyTorch's ResNet50 extended with custom input and output layers for the datasets listed above
Active learning experiment frameworks: A self-developed extension of the course's BaseDataModule that builds on top of PyTorch Lightning's LightningDataModule (see training/run_experiment.py) and a separate experiment routine that uses the modAL library (see training/run_modal_experiment.py)
Active learning sampling strategies: The following sampling strategies are available:
- Uncertainty Sampling
  - least_confidence
  - margin
  - ratio
  - entropy
  - least_confidence_pt
  - margin_pt
  - ratio_pt
  - entropy_pt
- Bayesian Uncertainty Sampling
  - bald
  - max_entropy
  - least_confidence_mc
  - margin_mc
  - ratio_mc
  - entropy_mc
- Diversity Sampling
  - mb_outliers_mean
  - mb_outliers_max
  - mb_clustering
  - mb_outliers_glosh
- Mixed Sampling
  - mb_outliers_mean_least_confidence
  - mb_outliers_mean_entropy
- Other Advanced Strategies
  - active_transfer_learning
  - dal
- Baseline
  - random
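To give a feel for what the four basic uncertainty scores measure, here is a minimal sketch in plain Python. It uses common textbook definitions, scaled so that a higher score means a more uncertain prediction; the repository's own implementations may normalize or order things differently. The function names mirror the strategy names above, while `select_most_uncertain` and the example `pool` are hypothetical and only serve the illustration.

```python
import math


def least_confidence(probs):
    """1 minus the top class probability."""
    return 1.0 - max(probs)


def margin(probs):
    """1 minus the gap between the two most likely classes."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)


def ratio(probs):
    """Second-highest probability divided by the highest."""
    top, second = sorted(probs, reverse=True)[:2]
    return second / top


def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs, score_fn, k):
    """Hypothetical helper: indices of the k highest-scoring pool items."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score_fn(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up softmax outputs for three unlabelled images (assumption, not real data).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # very uncertain prediction
    [0.70, 0.20, 0.10],  # moderately uncertain prediction
]

print(select_most_uncertain(pool, entropy, k=2))  # prints [1, 2]
```

Any of the four score functions can be passed as `score_fn`; on this toy pool they all rank the second image as the most informative one to label next.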
git clone https://github.com/ravindrabharathi/fsdl-active-learning2 # clone from git
cd fsdl-active-learning2
make conda-update # creates a conda env with the base packages
conda activate fsdl-active-learning-2021 # activates the conda env
make pip-tools # installs required pip packages inside the conda env
# active learning experiment with DroughtWatch
python training/run_experiment.py \
--sampling_method=active_transfer_learning \
--data_class=DroughtWatch \
--model_class=ResnetClassifier \
--n_train_images=1000 \
--al_samples_per_iter=500 \
--al_iter=20 \
--max_epochs=20 \
--pretrained=True \
--binary \
--rgb \
--lr=3e-4 \
--gpus=1 \
--wandb
# active learning experiment with MNIST
python training/run_experiment.py \
--data_class=MNIST \
--model_class=MNISTResnetClassifier \
--gpus=1 \
--wandb
# active learning experiment via modAL framework
python training/run_modal_experiment.py \
--data_class=DroughtWatch \
--model_class=ResnetClassifier \
--al_query_strategy=margin_sampling \
--gpus=1 \
--wandb
# clone project from github
!git clone https://github.com/ravindrabharathi/fsdl-active-learning2
%cd fsdl-active-learning2
# install necessary packages and add library directory to your pythonpath
!pip3 install boltons wandb pytorch_lightning==1.2.8
!pip3 install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 torchtext==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
!pip3 install modAL tensorflow skorch hdbscan
%env PYTHONPATH=.:$PYTHONPATH
# initialize w&b with your personal info
!wandb login your_wandb_key
!wandb init --project your_wandb_project --entity your_wandb_entity
# start experimenting
!python training/run_experiment.py \
--data_class=MNIST \
--model_class=MNISTResnetClassifier \
--gpus=1 \
--wandb
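The flags `--n_train_images`, `--al_samples_per_iter` and `--al_iter` used in the commands in this README suggest a pool-based loop: start from a small labelled set, then repeatedly query a batch of new samples and add them to it. The sketch below only illustrates that bookkeeping, under the assumption that the flags have this meaning; it is not the code in training/run_experiment.py. The function `run_active_learning`, the `pool_size` value and the random query step (standing in for the `random` baseline) are hypothetical.

```python
import random


def run_active_learning(pool_size, n_train_images, al_samples_per_iter, al_iter, seed=0):
    """Track how the labelled set grows over the active learning iterations."""
    rng = random.Random(seed)
    indices = list(range(pool_size))
    rng.shuffle(indices)

    # Assumption: the first n_train_images samples form the initial labelled set.
    labelled = set(indices[:n_train_images])
    unlabelled = set(indices[n_train_images:])
    history = [len(labelled)]

    for _ in range(al_iter):
        if not unlabelled:
            break  # nothing left to query
        # A real experiment would (re)train the model on `labelled` here
        # and score the unlabelled pool with the chosen sampling strategy.
        k = min(al_samples_per_iter, len(unlabelled))
        queried = rng.sample(sorted(unlabelled), k)  # random baseline query
        labelled.update(queried)
        unlabelled.difference_update(queried)
        history.append(len(labelled))

    return history


# Values taken from the DroughtWatch example; pool_size is a made-up number.
history = run_active_learning(
    pool_size=20000, n_train_images=1000, al_samples_per_iter=500, al_iter=20
)
print(history[0], history[-1])  # prints 1000 11000
```

Under these assumptions, the DroughtWatch example would end with 1000 + 20 × 500 = 11000 labelled images, provided the unlabelled pool is large enough.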
For more examples refer to the notebooks in the notebooks folder.
For more details please refer to the documentation and detailed project report.
We welcome contributions. To discuss anything, contact us on LinkedIn (see links above) or open an issue in this repository.