## MNIST Dataset Notebook
A Jupyter notebook explaining how to read the MNIST dataset efficiently into memory in Python.

## Contents
- [MNIST database](#id1)
    - [Dataset](#id2)
    - [Performance](#id3)
    - [Classifiers](#id4)
- [References](#idr)

<a id="id1"></a>
## MNIST database
- The **MNIST** database (**Modified National Institute of Standards and Technology database**) is a large database of **handwritten digits** that is commonly used for training various image processing systems.
- The database is also widely used for training and testing in the field of **machine learning**.
- It was created by "re-mixing" the samples from **NIST's original datasets**.
- The creators felt that since **NIST's** training dataset was taken from **American Census Bureau employees**, while the testing dataset was taken from **American high school students**, it was not well-suited for machine learning experiments.
- Furthermore, the black and white images from **NIST** were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced **grayscale levels**.
- The **MNIST** database **contains 60,000 training images** and **10,000 testing images**.
- Half of the training set and half of the test set were taken from **NIST's** training dataset, while the other half of the training set and the other half of the test set were taken from **NIST's** testing dataset.
- An **extended dataset** similar to **MNIST**, called **EMNIST**, was published in 2017; it contains **240,000 training images** and **40,000 testing images** of handwritten digits and characters.

<img src="Images/Sample MNIST.png" alt="Sample MNIST" title="Sample MNIST" />

__*Sample images from MNIST test dataset.*__

<a id="id2"></a>
### Dataset
- The set of images in the **MNIST** database is a combination of two of **NIST's** databases: **Special Database 1**, which consists of digits written by high school students, and **Special Database 3**, which consists of digits written by employees of the **United States Census Bureau**.
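
Since the notebook is about reading MNIST into memory, the following is a minimal sketch of one way to do that with only the standard library. It assumes the files are in the IDX binary format used by the original MNIST distribution (a big-endian header with magic number 2051 for images and 2049 for labels, followed by unsigned bytes). The function names `read_idx_images` and `read_idx_labels` and the file paths in the usage comment are illustrative assumptions, not taken from the notebook.

```python
import gzip
import struct
from array import array
from pathlib import Path


def _open(path):
    """Open a plain or gzip-compressed IDX file for binary reading."""
    path = Path(path)
    return gzip.open(path, "rb") if path.suffix == ".gz" else open(path, "rb")


def read_idx_images(path):
    """Return (count, rows, cols, pixels); pixels is a flat array of uint8."""
    with _open(path) as f:
        # Header: four big-endian unsigned 32-bit integers.
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 2051:
            raise ValueError(f"not an IDX image file (magic={magic})")
        # Read the whole payload in one call instead of pixel by pixel.
        pixels = array("B", f.read(count * rows * cols))
    if len(pixels) != count * rows * cols:
        raise ValueError("image file is truncated")
    return count, rows, cols, pixels


def read_idx_labels(path):
    """Return the labels as a flat array of uint8."""
    with _open(path) as f:
        # Header: two big-endian unsigned 32-bit integers.
        magic, count = struct.unpack(">II", f.read(8))
        if magic != 2049:
            raise ValueError(f"not an IDX label file (magic={magic})")
        labels = array("B", f.read(count))
    if len(labels) != count:
        raise ValueError("label file is truncated")
    return labels


# Hypothetical usage; the paths below are assumptions about where the files live:
# count, rows, cols, pixels = read_idx_images("data/train-images-idx3-ubyte.gz")
# labels = read_idx_labels("data/train-labels-idx1-ubyte.gz")
# first_image = pixels[0:rows * cols]  # one 28x28 image as a flat slice
```

Reading the payload with a single `read` call and wrapping it in an `array` avoids a per-pixel Python loop, which is usually the main cost when loading the 60,000 training images.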

<a id="id3"></a>
### Performance
- Some researchers have achieved **"near-human performance"** on the **MNIST** database, using a committee of **neural networks**; in the same paper, the authors achieve performance double that of humans on other recognition tasks.
- The highest error rate listed on the original website of the database is **12 percent**, which is achieved using a **simple linear classifier with no preprocessing**.
- **In 2004**, researchers achieved a best-case error rate of **0.42 percent** on the database using a **new classifier** called **LIRA**, a **neural classifier with three neuron layers** based on **Rosenblatt's** perceptron principles.
- Some researchers have tested artificial intelligence systems on the database after applying random distortions to it. The systems in these cases are usually neural networks, and the distortions tend to be either affine or elastic. Some of these systems are very successful; one achieved an error rate on the database of **0.39 percent**.
- **In 2011**, researchers using a similar system of neural networks reported an error rate of **0.27 percent**, improving on the previous best result.
- **In 2013**, an approach based on regularization of neural networks using DropConnect was claimed to achieve a **0.21 percent** error rate.
- **Recently**, the best performance of a single convolutional neural network was a **0.31 percent** error rate.
- As of August 2018, the best performance of a single convolutional neural network trained on **MNIST** training data using real-time data augmentation is a **0.26 percent** error rate.
- The Parallel Computing Center (Khmelnitskiy, Ukraine) also obtained an ensemble of only 5 convolutional neural networks that performs on **MNIST** at a **0.21 percent** error rate.
- **Incorrect labeling of the testing dataset may prevent reaching test error rates of 0%**.
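
To make these percentages concrete, the sketch below converts an error rate into a number of misclassified test images, assuming the rate is measured on the full 10,000-image MNIST test set. The helper name `misclassified` is hypothetical and used only for illustration.

```python
def misclassified(error_rate_percent, test_set_size=10_000):
    """Convert an error rate in percent into a count of misclassified images."""
    # round() absorbs floating-point noise such as 21.000000000000004.
    return round(error_rate_percent / 100 * test_set_size)


# Error rates quoted in this section, from the linear baseline to the best ensemble.
for rate in (12, 0.42, 0.39, 0.27, 0.21):
    print(f"{rate} % error -> {misclassified(rate)} of 10,000 test images wrong")
```

Under that assumption, the gap between 0.27 percent and 0.21 percent amounts to only six test images, which is why a handful of mislabeled test samples can matter at this level.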

<a id="id4"></a>
### Classifiers
| Type | Classifier | Distortion | Preprocessing | Error rate (%) |
| ------------- | ------------ | ----------- | ------------ | ----------- |
|Deep neural network|2-layer 784-800-10|None|None|1.6|
|Deep neural network|2-layer 784-800-10|elastic distortions|None|0.7|
|Non-linear classifier|40 PCA + quadratic classifier|None|None|3.3|
|Deep neural network|6-layer 784-2500-2000-1500-1000-500-10|elastic distortions|None|0.35|
|Convolutional neural network|6-layer 784-40-80-500-1000-2000-10|None|Expansion of the training data|0.31|
|Convolutional neural network|6-layer 784-50-100-500-1000-10-10|None|Expansion of the training data|0.27|
|Convolutional neural network|Committee of 35 CNNs, 1-20-P-40-P-150-10|elastic distortions|Width normalizations|0.23|
|Convolutional neural network|Committee of 5 CNNs, 6-layer 784-50-100-500-1000-10-10|None|Expansion of the training data|0.21|
|K-Nearest Neighbors|K-NN with non-linear deformation (P2DHMDM)|None|Shiftable edges|0.52|
|Linear classifier|Pairwise linear classifier|None|Deskewing|7.6|
|Boosted Stumps|Product of stumps on Haar features|None|Haar features|0.87|
|Support vector machine|Virtual SVM, deg-9 poly, 2-pixel jittered|None|Deskewing|0.56|
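
The dash-separated numbers in the deep neural network rows can be read as layer widths, starting with the 784 input pixels (28x28) and ending with the 10 digit classes. As a rough illustration, the sketch below counts trainable parameters for such a network. It assumes every layer is fully connected with one bias per unit, which the table does not state, so it applies only to the deep neural network rows and not to the convolutional ones. The function names are hypothetical.

```python
def parse_architecture(spec):
    """Turn a string such as '784-800-10' into a list of layer widths."""
    return [int(width) for width in spec.split("-")]


def dense_parameter_count(layer_sizes):
    """Count weights plus biases, assuming fully connected layers with biases."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output unit
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


for spec in ("784-800-10", "784-2500-2000-1500-1000-500-10"):
    sizes = parse_architecture(spec)
    print(spec, "->", dense_parameter_count(sizes), "parameters")
```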

<a id="idr"></a>
## References
- __[MNIST database](https://en.wikipedia.org/wiki/MNIST_database)__