PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"
Efficient Neural Architecture Search (ENAS) in PyTorch

PyTorch implementation of Efficient Neural Architecture Search via Parameters Sharing.

[Figure: ENAS_rnn]

ENAS reduces the computational requirement (GPU-hours) of Neural Architecture Search (NAS) by 1000x via parameter sharing between models that are subgraphs within a large computational graph. It achieves state-of-the-art (SOTA) results on Penn Treebank language modeling.
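
As a rough illustration of what parameter sharing between subgraphs means, the following plain-Python sketch (not the repository's code) keeps one scalar weight per edge of a small graph. The names `SharedWeights`, `forward`, `arch_a`, and `arch_b` are hypothetical and exist only for this example. Two sampled architectures that both use edge `(0, 1)` read the same entry of the pool, so an update made while training one is immediately visible to the other.

```python
import random


class SharedWeights:
    """Hypothetical pool with one scalar weight per edge (src, dst) of the full graph."""

    def __init__(self, num_nodes, seed=0):
        rng = random.Random(seed)
        # One weight for every possible edge src -> dst with src < dst.
        self.w = {(src, dst): rng.uniform(-0.1, 0.1)
                  for dst in range(1, num_nodes) for src in range(dst)}


def forward(pool, edges, x):
    """Toy 'model': every edge of the sampled subgraph contributes weight * x."""
    return sum(pool.w[edge] * x for edge in edges)


pool = SharedWeights(num_nodes=3)
arch_a = [(0, 1), (1, 2)]   # two subgraphs that both use edge (0, 1)
arch_b = [(0, 1), (0, 2)]

before = forward(pool, arch_b, x=1.0)
pool.w[(0, 1)] += 0.5       # stand-in for a training step taken while running arch_a
after = forward(pool, arch_b, x=1.0)
print(after - before)       # close to 0.5: arch_b sees the update made through arch_a
```

In a real implementation the pool entries would be weight tensors rather than scalars, but the lookup-by-edge idea is the same.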

**[Caveat] Use official code from the authors: link**

Prerequisites

  • Python 3.6+
  • PyTorch
  • tqdm, scipy, imageio, graphviz, tensorboardX

Usage

Install prerequisites with:

conda install graphviz
pip install -r requirements.txt

To train ENAS to discover a recurrent cell for an RNN:

python main.py --network_type rnn --dataset ptb --controller_optim adam --controller_lr 0.00035 \
               --shared_optim sgd --shared_lr 20.0 --entropy_coeff 0.0001

python main.py --network_type rnn --dataset wikitext

To train ENAS to discover a CNN architecture (in progress):

python main.py --network_type cnn --dataset cifar --controller_optim momentum --controller_lr_cosine=True \
               --controller_lr_max 0.05 --controller_lr_min 0.0001 --entropy_coeff 0.1

Alternatively, you can use your own dataset by placing text files or images as follows:

data
├── YOUR_TEXT_DATASET
│   ├── test.txt
│   ├── train.txt
│   └── valid.txt
├── YOUR_IMAGE_DATASET
│   ├── test
│   │   ├── xxx.jpg (name doesn't matter)
│   │   ├── yyy.jpg (name doesn't matter)
│   │   └── ...
│   ├── train
│   │   ├── xxx.jpg
│   │   └── ...
│   └── valid
│       ├── xxx.jpg
│       └── ...
├── image.py
└── text.py
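
Before starting a long run, it can help to check that a custom dataset follows the layout above. The helper below is a minimal sketch and is not part of the repository: `missing_entries` and the two `REQUIRED_*` constants are hypothetical names, and the demo builds a throwaway directory with `tempfile` instead of touching `data/`.

```python
from pathlib import Path
import tempfile

# Entries taken from the directory tree above.
REQUIRED_TEXT_FILES = ("train.txt", "valid.txt", "test.txt")
REQUIRED_IMAGE_DIRS = ("train", "valid", "test")


def missing_entries(dataset_dir, kind):
    """Return the required files (kind='text') or folders (kind='image') that are absent."""
    root = Path(dataset_dir)
    if kind == "text":
        return [name for name in REQUIRED_TEXT_FILES if not (root / name).is_file()]
    if kind == "image":
        return [name for name in REQUIRED_IMAGE_DIRS if not (root / name).is_dir()]
    raise ValueError("kind must be 'text' or 'image'")


# Demo on a temporary directory that only contains train.txt.
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "YOUR_TEXT_DATASET"
    dataset.mkdir()
    (dataset / "train.txt").write_text("hello world\n")
    demo_missing = missing_entries(dataset, "text")

print(demo_missing)  # ['valid.txt', 'test.txt']
```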

To generate a GIF of the generated samples:

python generate_gif.py --model_name=ptb_2018-02-15_11-20-02 --output=sample.gif

More configurations can be found here.

Results

Efficient Neural Architecture Search (ENAS) is composed of two sets of learnable parameters: the controller LSTM parameters θ and the shared parameters ω. These two sets of parameters are trained alternately, and only the trained controller is used to derive novel architectures.
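
The alternating schedule can be sketched on a toy problem. This is an illustration under strong assumptions and not the code in `trainer.py`: the "controller" is a softmax over three logits instead of an LSTM, each "architecture" has a single shared scalar, the reward is the negative of a made-up loss, and the names `train_alternately`, `penalties`, and `baseline` are hypothetical. The controller update is a plain REINFORCE step with a moving-average baseline.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_alternately(epochs=300, steps=5, seed=0):
    rng = random.Random(seed)
    penalties = [1.0, 0.0, 0.5]   # toy per-architecture cost (assumption)
    omega = [0.0, 0.0, 0.0]       # shared parameters, one scalar per candidate
    theta = [0.0, 0.0, 0.0]       # controller logits
    baseline = 0.0

    def loss(k):
        return (omega[k] - 1.0) ** 2 + penalties[k]

    for _ in range(epochs):
        # Phase 1: controller fixed, train the shared parameters omega.
        probs = softmax(theta)
        for _ in range(steps):
            k = rng.choices(range(3), weights=probs)[0]
            omega[k] -= 0.1 * 2.0 * (omega[k] - 1.0)   # SGD step on loss(k)

        # Phase 2: omega fixed, train the controller theta with REINFORCE.
        for _ in range(steps):
            probs = softmax(theta)
            k = rng.choices(range(3), weights=probs)[0]
            reward = -loss(k)
            advantage = reward - baseline
            baseline = 0.9 * baseline + 0.1 * reward
            for i in range(3):
                grad_logp = (1.0 if i == k else 0.0) - probs[i]
                theta[i] += 0.05 * advantage * grad_logp
    return softmax(theta), omega


probs, omega = train_alternately()
print(probs)  # the candidate with penalty 0.0 should receive most of the probability
```

Only `theta` is needed to derive an architecture at the end, which mirrors the statement that only the trained controller is used.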

1. Discovering Recurrent Cells

[Figure: rnn]

The controller LSTM decides 1) which activation function to use and 2) which previous node to connect to.
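
The two decisions can be sketched as follows. This is a hedged illustration rather than the repository's controller: uniform random choices stand in for the LSTM's learned distributions, the activation list follows the four functions named in the ENAS paper, and `sample_cell` and `leaf_nodes` are hypothetical helper names.

```python
import random

ACTIVATIONS = ("tanh", "relu", "identity", "sigmoid")


def sample_cell(num_nodes, rng):
    """Sample one cell as a list of (previous_node, activation) pairs.

    Node 0 only picks an activation; node i > 0 also picks a previous node in [0, i).
    """
    cell = [(None, rng.choice(ACTIVATIONS))]
    for i in range(1, num_nodes):
        prev = rng.randrange(i)   # stand-in for the controller's "which node" decision
        cell.append((prev, rng.choice(ACTIVATIONS)))
    return cell


def leaf_nodes(cell):
    """Nodes that no other node reads from; their outputs form the cell output."""
    used = {prev for prev, _ in cell if prev is not None}
    return [i for i in range(len(cell)) if i not in used]


cell = sample_cell(4, random.Random(0))
for index, (prev, activation) in enumerate(cell):
    print(index, prev, activation)
print("leaves:", leaf_nodes(cell))
```

Because each node may only connect to a node with a smaller index, every sampled cell is a valid DAG by construction.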

The RNN cells ENAS discovered for the Penn Treebank and WikiText-2 datasets:

[Figures: ptb, wikitext]

Best discovered ENAS cell for Penn Treebank at epoch 27:

[Figure: ptb]

You can see the details of training (e.g. reward, entropy, loss) with:

tensorboard --logdir=logs --port=6006

2. Discovering Convolutional Neural Networks

[Figure: cnn]

The controller LSTM samples 1) which computation operation to use and 2) which previous node to connect to.
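
One way to picture the controller's output is as a flat sequence of tokens, two per layer. The decoder below is a sketch under assumptions and does not reflect the repository's in-progress CNN code: the `OPS` names are illustrative, `decode_layers` is a hypothetical helper, and index 0 is taken to mean the network input, so layer `i` (0-based) may connect to any index in `0..i`.

```python
# Illustrative operation names; the actual operation set is an assumption.
OPS = ("conv3x3", "conv5x5", "sep_conv3x3", "sep_conv5x5", "max_pool3x3", "avg_pool3x3")


def decode_layers(tokens):
    """Decode [op_0, prev_0, op_1, prev_1, ...] into a list of (operation, previous) pairs."""
    if len(tokens) % 2 != 0:
        raise ValueError("expected two tokens per layer")
    layers = []
    for layer, start in enumerate(range(0, len(tokens), 2)):
        op_id, prev = tokens[start], tokens[start + 1]
        if not 0 <= op_id < len(OPS):
            raise ValueError(f"layer {layer}: unknown operation id {op_id}")
        if not 0 <= prev <= layer:
            raise ValueError(f"layer {layer}: cannot connect to index {prev}")
        layers.append((OPS[op_id], prev))
    return layers


print(decode_layers([0, 0, 4, 1, 2, 0]))
# [('conv3x3', 0), ('max_pool3x3', 1), ('sep_conv3x3', 0)]
```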

The CNN that ENAS discovered for the CIFAR-10 dataset:

(in progress)

3. Designing Convolutional Cells

(in progress)

Reference

Author

Taehoon Kim / @carpedm20
