This repository contains the code necessary to reproduce the results from my master's thesis: "Experimental comparison of regularization on learning dynamics in deep learning".
mixup is a neural network training method that generates new samples by linear interpolation of multiple samples and their labels. In the image domain, mixup has a proven record of better generalization than training with the traditional Empirical Risk Minimization (ERM) approach. At the same time, we lack an intuitive understanding of why mixup helps. In this work, we attempt to better understand the mixup phenomenon, particularly its impact on the difficulty of decision-making by neural networks. First, we conduct a series of experiments to gather the necessary knowledge about the nature of mixup. Next, we formulate a hypothesis that gives an in-depth understanding of why mixup improves generalization.
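For intuition, the core of mixup fits in a few lines of PyTorch. The snippet below is only a minimal sketch of the standard mixup formulation, not the training code used in this repository; the function name mixup_batch and the Beta(alpha, alpha) sampling follow the common mixup recipe and are otherwise assumptions.

import torch

def mixup_batch(x, y, alpha=1.0):
    # Draw the mixing coefficient from Beta(alpha, alpha); alpha = 0 degenerates to plain ERM.
    lam = torch.distributions.Beta(alpha, alpha).sample().item() if alpha > 0 else 1.0
    # Pair each sample with another sample from the same batch via a random permutation.
    index = torch.randperm(x.size(0))
    # Convexly combine both the inputs and their (one-hot or soft, float-valued) labels.
    mixed_x = lam * x + (1 - lam) * x[index]
    mixed_y = lam * y + (1 - lam) * y[index]
    return mixed_x, mixed_y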
To get access to the full PDF version of the thesis, please contact me at: piotr.helm.97@gmail.com
Install dependencies
# clone project
git clone https://github.com/piotrhm/master-thesis.git
cd master-thesis
# [OPTIONAL] create conda environment
conda create -n myenv python=3.8
conda activate myenv
# e.g. source activate myenv
# install pytorch according to instructions
# https://pytorch.org/get-started/
# install requirements
pip install -r requirements.txt
Train model with default configuration
# train on CPU
python train.py trainer.gpus=0
# train on GPU
python train.py trainer.gpus=1
Train model with chosen experiment configuration from configs/experiment/
python train.py experiment=experiment_name.yaml
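An experiment config bundles a set of overrides under one name. As a hypothetical illustration only (the actual keys and file names depend on the configs present in configs/experiment/), a Hydra-style experiment file could look like this:

# configs/experiment/example.yaml -- hypothetical sketch, not an actual file in this repo
# @package _global_
defaults:
  - override /datamodule: cifar10.yaml   # assumed datamodule config name
  - override /model: resnet.yaml         # assumed model config name

trainer:
  max_epochs: 100
  gpus: 1

datamodule:
  batch_size: 128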
You can override any parameter from the command line like this
python train.py trainer.max_epochs=20 datamodule.batch_size=64