Skip to content

pitsios-s/SVHN

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SVHN

Street View House Numbers (SVHN) is a real-world dataset containing images of house numbers taken from Google's street view. This repository contains the source code needed to built machine learning algorithms that can "recognize" the numbers on the images.

For this purpose, we will use deep learning and neural network techniques through Tensorflow.

The dataset can be downloaded from here: http://ufldl.stanford.edu/housenumbers/

There are 2 formats available, one with full images and one with cropped. In our case we used the second one.

In order for the code to run, the following libraries are needed, which can be installed via pip:

  • numpy
  • matplotlib
  • scipy
  • tensorflow (either with GPU support enabled, or not. In our case we used the lib with the GPU support)

The source code resided under the src directory and contains the following files:

  • display_image.py: A small script to load the dataset and display the images in RGB and grayscale format

  • svhn.py: A class for loading and manipulating the dataset

  • mlp.py: Contains the code needed to build a Multilayer Perceptron baseline classifier. The highest accuracy achieved with this, was around 86%

  • cnn.py: Contains the code needed to build a deep convolution network. We were able to achieve an accuracy of 94.5%

Finally, for the code to run properly, the data files should be placed in a directory called res, at the same level of src directory. The dataset names are "train_32x32.mat" for the training set, "test_32x32.mat" for the test set and "extra_32x32.mat" for an additional training set. The third file is optional, but if used, it dramatically increases prediction accuracy.

Directory report contains a short report for our work on the project, while presentation directory contains the file of our final presentation.

Note: Current implementation implies that the whole dataset(including extra cases) can fit into RAM. This typically requires about 9GB RAM. This will be fixed in the future.

This project was conducted as part of the "Content Analytics" class of the M.Sc. in Business Analytics, Athens University of Economics & Business.

About

Number Recognition using Deep Learning

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages