mixup: Beyond Empirical Risk Minimization

By Hongyi Zhang, Moustapha Cisse, Yann Dauphin, David Lopez-Paz.

Facebook AI Research


Mixup is a generic and straightforward data augmentation principle. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples.
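
Concretely, given two training examples (xᵢ, yᵢ) and (xⱼ, yⱼ), mixup constructs the virtual example

x̃ = λ·xᵢ + (1 − λ)·xⱼ,   ỹ = λ·yᵢ + (1 − λ)·yⱼ,   with λ drawn from a Beta(α, α) distribution.

Here is a minimal PyTorch sketch of the augmentation (illustrative only; see train.py for the code used to produce the paper's results):

    import numpy as np
    import torch

    def mixup_data(x, y, alpha=1.0):
        """Mix a batch of inputs; return mixed inputs, both target batches, and lambda."""
        lam = np.random.beta(alpha, alpha) if alpha > 0 else 1.0
        index = torch.randperm(x.size(0))           # random pairing within the batch
        mixed_x = lam * x + (1.0 - lam) * x[index]  # convex combination of inputs
        return mixed_x, y, y[index], lam

Since the targets are class indices rather than one-hot vectors, the label mixing is applied through the loss: with a cross-entropy criterion, the training loss becomes lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b), which is equivalent to training against the mixed label ỹ.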

This repository contains the implementation used for the results in our paper (https://arxiv.org/abs/1710.09412).


Citation

If you use this method or this code in your research, please cite:

@article{zhang2018mixup,
  title={mixup: Beyond Empirical Risk Minimization},
  author={Hongyi Zhang and Moustapha Cisse and Yann N. Dauphin and David Lopez-Paz},
  journal={International Conference on Learning Representations},
  year={2018}
}

Requirements and Installation

  • A computer running macOS or Linux
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • Python version 3.6
  • A PyTorch installation (see the example command below)
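
PyTorch installation commands vary with your OS and CUDA version; a typical starting point (check pytorch.org for the command matching your setup) is:

$ pip install torch torchvision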


Training

Use train.py to train a new model. For example:

$ CUDA_VISIBLE_DEVICES=0 python train.py --lr=0.1 --seed=20170922 --decay=1e-4
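
The mixing coefficient λ is drawn from Beta(α, α), so α controls how strongly examples are interpolated (α = 0 disables mixing and recovers standard ERM training). Assuming the script exposes this as an --alpha flag (run python train.py --help to confirm the exact option names), a run with an explicit mixup strength might look like:

$ CUDA_VISIBLE_DEVICES=0 python train.py --lr=0.1 --seed=20170922 --decay=1e-4 --alpha=1.0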


License

This project is CC-BY-NC-licensed.


Acknowledgement

The CIFAR-10 reimplementation of mixup is adapted from the pytorch-cifar repository by kuangliu.