
Kaggle-NDSB

Code for the National Data Science Bowl on Kaggle. Ranked 10th of 1049.

Summary

An ensemble of deep CNNs trained with real-time data augmentation.

- Preprocessing: centering, padding to a square image, and conversion to a negative.
  [Figure: preprocessing example, Source and Destination images]
- Data augmentation: real-time data augmentation (random transformations are applied to each minibatch). Transformations include translation, scaling, rotation, perspective cropping, and contrast scaling.
- Neural network architecture: three CNN architectures for differently rescaled inputs: cnn_96x96, cnn_72x72, and cnn_48x48.
- Normalization: Global Contrast Normalization (GCN).
- Optimization method: minibatch SGD with Nesterov momentum.

Results

| Model | Public LB score |
| --- | --- |
| cnn_48x48 single model | 0.6718 |
| cnn_72x72 single model | 0.6487 |
| cnn_96x96 single model | 0.6561 |
| cnn_48x48 average of 8 models | 0.6507 |
| cnn_72x72 average of 8 models | 0.6279 |
| cnn_96x96 average of 8 models | 0.6311 |
| ensemble (cnn_48x48(x8) * 0.2292 + cnn_72x72(x8) * 0.3494 + cnn_96x96(x8) * 0.4212 + 9.828e-6) | 0.6160 |
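
The ensemble entry can be read as a weighted sum of the three per-architecture averages plus a small constant. The following is only a sketch of that reading. It assumes that "average of 8 models" means the arithmetic mean of the predicted class probabilities over eight models of the same architecture. The notation $p_m^{(s)}(c \mid x)$ (probability of class $c$ for image $x$ from the $s$-th model of architecture $m$) is introduced here for illustration and does not appear in the repository.

```latex
\bar{p}_m(c \mid x) = \frac{1}{8} \sum_{s=1}^{8} p_m^{(s)}(c \mid x),
\qquad m \in \{48, 72, 96\}

p_{\mathrm{ens}}(c \mid x) =
    0.2292\,\bar{p}_{48}(c \mid x)
  + 0.3494\,\bar{p}_{72}(c \mid x)
  + 0.4212\,\bar{p}_{96}(c \mid x)
  + 9.828 \times 10^{-6}
```

The three weights sum to 0.9998. The README does not state how the weights or the constant were chosen.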

Developer Environment

Installation

Install CUDA, Torch7, NVIDIA cuDNN, and cudnn.torch.

Checking CUDA environment

th cuda_test.lua

If this script fails, check your Torch7/CUDA installation.

Convert dataset

Place the data files into a subfolder ./data.

ls ./data
test  train  train.txt test.txt classess.txt

Then run the conversion script:

th convert_data.lua
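
Before converting, it may help to confirm that ./data contains the entries shown in the listing above. The function below is a minimal sketch: `check_data_dir` is a hypothetical helper that is not part of the repository, and it assumes that the five names in the `ls` output are exactly what convert_data.lua expects.

```shell
# Hypothetical helper (not part of the repository): report any entry from
# the README's `ls ./data` listing that is absent from the given directory.
check_data_dir() {
  dir="${1:-./data}"
  missing=0
  for name in test train train.txt test.txt classess.txt; do
    if [ ! -e "$dir/$name" ]; then
      echo "missing: $dir/$name" >&2
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

A possible use is `check_data_dir ./data && th convert_data.lua`, which runs the conversion only when nothing is reported missing.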

Training, Validation, Make submission

Train and validate a single cnn_48x48 model:

th train.lua -model 48 -seed 101
ls -la models/cnn*.t7

Make the submission file:

th predict.lua -model 48 -seed 101
ls -la models/submission*.txt

To use the cnn_72x72 model:

th train.lua -model 72 -seed 101
th predict.lua -model 72 -seed 101

To use the cnn_96x96 model:

th train.lua -model 96 -seed 101
th predict.lua -model 96 -seed 101

Ensemble

This task is computationally heavy. I used 20 g2.xlarge instances for it, and it took 4 days.

(A helper tool can be found in the ./appendix folder.)

th train.lua -model 48 -seed 101
th train.lua -model 48 -seed 102
th train.lua -model 48 -seed 103
th train.lua -model 48 -seed 104
th train.lua -model 48 -seed 105
th train.lua -model 48 -seed 106
th train.lua -model 48 -seed 107
th train.lua -model 48 -seed 108
th train.lua -model 72 -seed 101
th train.lua -model 72 -seed 102
th train.lua -model 72 -seed 103
th train.lua -model 72 -seed 104
th train.lua -model 72 -seed 105
th train.lua -model 72 -seed 106
th train.lua -model 72 -seed 107
th train.lua -model 72 -seed 108
th train.lua -model 96 -seed 101
th train.lua -model 96 -seed 102
th train.lua -model 96 -seed 103
th train.lua -model 96 -seed 104
th train.lua -model 96 -seed 105
th train.lua -model 96 -seed 106
th train.lua -model 96 -seed 107
th train.lua -model 96 -seed 108

th predict.lua -model 48 -seed 101
th predict.lua -model 48 -seed 102
th predict.lua -model 48 -seed 103
th predict.lua -model 48 -seed 104
th predict.lua -model 48 -seed 105
th predict.lua -model 48 -seed 106
th predict.lua -model 48 -seed 107
th predict.lua -model 48 -seed 108
th predict.lua -model 72 -seed 101
th predict.lua -model 72 -seed 102
th predict.lua -model 72 -seed 103
th predict.lua -model 72 -seed 104
th predict.lua -model 72 -seed 105
th predict.lua -model 72 -seed 106
th predict.lua -model 72 -seed 107
th predict.lua -model 72 -seed 108
th predict.lua -model 96 -seed 101
th predict.lua -model 96 -seed 102
th predict.lua -model 96 -seed 103
th predict.lua -model 96 -seed 104
th predict.lua -model 96 -seed 105
th predict.lua -model 96 -seed 106
th predict.lua -model 96 -seed 107
th predict.lua -model 96 -seed 108

th ensemble.lua > submission.txt
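
The command list above follows a regular pattern: three model sizes times eight seeds, first for train.lua and then for predict.lua, followed by the ensemble step. It could be generated with nested loops, as in the sketch below. The names `run`, `for_all_models`, `run_pipeline`, and the `DRY_RUN` variable are hypothetical and not part of the repository. By default the sketch only prints the commands. Running them sequentially on one machine would take far longer than the 20-instance setup described above.

```shell
# Hypothetical wrapper (not part of the repository) that reproduces the
# command sequence listed in this README.
MODELS="48 72 96"
SEEDS="101 102 103 104 105 106 107 108"

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run one script (train.lua or predict.lua) for every model/seed pair,
# in the same order as the README: all seeds of 48, then 72, then 96.
for_all_models() {
  script="$1"
  for model in $MODELS; do
    for seed in $SEEDS; do
      run th "$script" -model "$model" -seed "$seed" || return 1
    done
  done
}

# Train everything, then predict everything, then build the ensemble.
run_pipeline() {
  for_all_models train.lua || return 1
  for_all_models predict.lua || return 1
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "th ensemble.lua > submission.txt"
  else
    th ensemble.lua > submission.txt
  fi
}
```

Calling `run_pipeline` prints the 49 commands for review; `DRY_RUN=0 run_pipeline` would execute them, assuming `th` is on the PATH and the dataset has already been converted.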