Deep Learning for Image Processing

This project detects multidigit sequences in natural scenes. The dataset for this project can be found at http://ufldl.stanford.edu/housenumbers/

This project has been updated from when I created it as the Machine Learning Capstone project for Udacity. I have added Batch normalization and a deeper network as well as creating 5 data points from each picture. The write-up included is out of date.

The model gets 98 precent training accuracy, 94 precent validation accuracy, and 95 precent test accurcay after 100 epochs. The model included Batch Norm, Dropout, and Weight Decay for regularization. Using so much regularization and still having so much difference between the training and validation error means that the training distribuation is very different than the validation distribution. To solve this more data is most likely needed.

The following libraries are required:

Tensorflow 0.9.0
numpy 1.10.4
scipy 0.17.0
matplotlib 1.5.1
PIL 1.1.7
opencv 3.1.0

The program is run with python Camera.py To download and process the data run the first option "Process The Datasets" This will download, extract, and process the images. It will save the data into the files "train_dataset.npy", "train_labels.npy", "valid_dataset.npy", "valid_labels.npy", test_dataset.npy", and "test_labels.py". The next step it to train the model using the second option "Train the model". This will train the model and save it into a file called model.ckpt. Analytics from the datasets and the training can be found using the third option "Display Analytics". The fouth option "Example Use" displays some example images along with the labels and predictions. The fifth option "Use the model" allows users to use the camera or to load an image from disk that will be feed to the network.

The trained model is included (along with the file that includes the Analytics and a test folder to allow for seeing some images in the example use) so users can run the Example use case or the camera without training the model. Before training the model the data will have to be fetched which should run automatically when running Camera.py

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
latex		latex
.gitignore		.gitignore
Analytics.pkl		Analytics.pkl
Analytics.py		Analytics.py
Camera.py		Camera.py
Data_Split.py		Data_Split.py
Digit_Struct_File.py		Digit_Struct_File.py
Download.py		Download.py
Extract.py		Extract.py
Generate_Dataset.py		Generate_Dataset.py
Graph.py		Graph.py
Model.py		Model.py
Network.py		Network.py
Preprocess.py		Preprocess.py
README.md		README.md
Report.pdf		Report.pdf
Report.tex		Report.tex
Save.py		Save.py
Visualize.py		Visualize.py

blester125/multi_digit_recognition

Folders and files

Latest commit

History

Repository files navigation

Deep Learning for Image Processing

About

Topics

Resources

Stars

Watchers

Forks

Languages