- Background
- File Structure
- Usage
- Accuracy

This project is based on TensorFlow 1.0 and Python 2.7.
I spent quite a few days trying to improve the accuracy of this model; some changes worked and some did not. I have achieved a 4.8% error rate on sequential digit recognition. As a beginner in machine learning, I think that is quite good. There are still quite a lot of modern network structures I haven't tried, so I believe it should be possible to reach even higher accuracy.

| Path | Description |
| --- | --- |
| `ckpt/` | store checkpoint files |
| `train_data/` | |
| `train/` | extracted from `train.tar.gz` |
| `test/` | extracted from `test.tar.gz` |
| `extra/` | extracted from `extra.tar.gz` |
| `full_train_imgs.tfrecords` | generated using `svhn_data.py` |
| `full_test_imgs.tfrecords` | |
| `full_extra_imgs.tfrecords` | |
| `digit_struct.py` | data structure for reading original images |
| `svhn_data.py` | convert images to tfrecord files |
| `svhn.py` | model, training operation, loss operation |
| `svhn_input.py` | generate input queue for training and evaluation |
| `svhn_train.py` | |
| `svhn_eval.py` | |
| `multi_digit_reader.py` | |

- Download `train.tar.gz`, `test.tar.gz`, `extra.tar.gz`
- Extract them into `/train_data`
- Run `svhn_data.py` to generate the tfrecord files
- The data is ready now. You can run `svhn_train.py` to train from scratch, or copy everything from `ckpt-95.1%-acc/` to `ckpt/` (if there is no `ckpt/` folder, create one) and run `svhn_eval.py` to get the model accuracy
- If you want to train the model from scratch, make sure there is nothing in `ckpt/`, or it will load the checkpoints from `ckpt/`. Checkpoints are saved in `train_data/`; to continue from your last training run, just put your last checkpoint into `ckpt/`
- Run `python multi_digit_reader.py image-name.png` to read a complete image. This is not accurate at all yet; I'm trying to come up with a better way.
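
The checkpoint handling in the steps above can be sketched in plain Python. This is only an illustration using the standard library: the helper names `ckpt_is_empty` and `restore_checkpoint` are hypothetical and are not part of this repository, and the sketch assumes that checkpoints are ordinary files sitting directly inside the folders named above.

```python
import os
import shutil


def ckpt_is_empty(ckpt_dir="ckpt"):
    """Return True if ckpt_dir is missing or contains no entries.

    Hypothetical helper: an empty ckpt/ is what the steps above ask for
    before training from scratch.
    """
    return not os.path.isdir(ckpt_dir) or len(os.listdir(ckpt_dir)) == 0


def restore_checkpoint(src_dir, ckpt_dir="ckpt"):
    """Copy every regular file from src_dir into ckpt_dir.

    Creates ckpt_dir if it does not exist and returns the sorted list of
    copied file names. Subdirectories in src_dir are ignored (assumption:
    checkpoint files are stored flat).
    """
    if not os.path.isdir(ckpt_dir):
        os.makedirs(ckpt_dir)
    copied = []
    for fname in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, fname)
        if os.path.isfile(src_path):
            shutil.copy2(src_path, os.path.join(ckpt_dir, fname))
            copied.append(fname)
    return copied
```

For example, calling `restore_checkpoint("ckpt-95.1%-acc")` before running `svhn_eval.py` would mirror the copy step, and checking `ckpt_is_empty()` before running `svhn_train.py` guards against accidentally resuming from an old checkpoint.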

| Configuration | Accuracy |
| --- | --- |
| without extra images (70K training images set) | 76% |
| use extra images (600K training + extra images set) | 86% |
| extra + 6 conv + 1 fc | 89.8% |
| extra + 6 conv + 2 fc | 91.2% |
| extra + 7 conv + 2 fc | 92.2% |
| extra + 7 conv + 2 fc + densely connected | 92.2% |
| extra + 8 conv + 2 fc | cannot train |
| extra + 7 conv + 2 fc + inception block | cannot train |
| extra + 7 conv + 2 fc + spatial transformer | cannot train |
| extra + 7 conv + 2 fc + increase number of params | 93.3% |
| extra + 7 conv + 2 fc + increase number of params + batch normalization | 94.5% |
| extra + 7 conv + 2 fc + increase number of params + batch normalization + clear some comments(???) | 95.1% |
| extra + 7 conv + 2 fc + increase number of params + batch normalization + max-avg pooling | 95.2% |
| Update Nvidia driver from 375 -> 381 | 95.6% (WTF???) |
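
This README does not state how the accuracy figures are computed. As a rough sketch, assuming they are whole-sequence accuracies (a prediction counts as correct only when every digit in the sequence matches), the metric and its relation to the error rate could look like the following. The function names `sequence_accuracy` and `error_rate` are hypothetical and do not come from `svhn_eval.py`.

```python
def sequence_accuracy(predictions, labels):
    """Fraction of samples whose predicted digit sequence matches exactly.

    Assumption: each prediction and label is a sequence of digits, and a
    partially correct sequence counts as wrong.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        return 0.0
    correct = sum(
        1 for pred, true in zip(predictions, labels) if list(pred) == list(true)
    )
    return float(correct) / len(labels)


def error_rate(accuracy):
    """Error rate is the complement of accuracy."""
    return 1.0 - accuracy
```

Under this reading, an accuracy of 95.2% corresponds to an error rate of 4.8%.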