# Project - Convolutional Neural Networks: Street View House Numbers Digit Recognition

**Marks: 30**

Dear Learner,

Welcome to the project on classification using Convolutional Neural Networks. We will work with the same dataset that we used for the Neural Network project: the Street View House Numbers (SVHN) image dataset.

Please read the problem statement and the guidelines below carefully.

----
### Context: 
-------

The ability to process visual information using machine learning algorithms is useful in many applications. The Street View House Numbers (SVHN) dataset is one of the most popular image recognition datasets. It has been used in neural networks created by Google to read house numbers and match them to their geolocations. It is a good benchmark dataset for learning to train models that accurately identify street numbers, and such models can be incorporated into many kinds of projects.

---------
### Objective:
------------
The objective of this exercise is to build an image classification model that identifies the digit in each image, even when the image suffers from issues such as poor brightness or blurriness.

--------
### More about the dataset
------------
- The dataset is provided as a .h5 file. The basic preprocessing steps have been done.

---------------------------
### Guidelines
-----------------------------------------
- Note that some of the questions are similar to the ones asked in the previous project. Partially filled code is not provided in those cases.
- You need to download the dataset from the given link and add it to your drive. Use Google Colab for this exercise.
- You will need to mount the drive and provide the correct path to read the dataset.
- The exercise consists of partially written code blocks. You need to fill in the blocks as per the instructions to achieve the required results.
- To be able to complete the assessment in the expected time, do not change the variable names. The code might throw errors if the names are changed.
- The marks for each requirement are mentioned in the question.
- You can raise your issues on the Olympus discussion forum.
- Uncomment the code snippets and work on them.
--------------------------------------------
Wishing you all the best!





### Mount the drive
Let us start by mounting the drive.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


Let us check the installed version of TensorFlow.

In [None]:
import tensorflow as tf
print(tf.__version__)

2.3.0


### Load the dataset
- Let us now load the dataset, which is available as a .h5 file.
- Split the data into the train and test datasets.

In [None]:
import h5py
import numpy as np

# Open the file as readonly
# Make changes in path as required
h5f = h5py.File('/content/drive/My Drive/SVHN_single_grey1.h5', 'r')

# Load the training and the test set
X_train = h5f['X_train'][:]
y_train1 = h5f['y_train'][:]
X_test = h5f['X_test'][:]
y_test1 = h5f['y_test'][:]


# Close this file
h5f.close()

Let us import the required libraries now.

In [None]:
## Importing the required libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, BatchNormalization, Dropout, Flatten, LeakyReLU
from tensorflow.keras.utils import to_categorical

# Fix the random seeds (NumPy and TensorFlow) for reproducibility
seed = 7
np.random.seed(seed)
tf.random.set_seed(seed)

### Visualising images (1.5 marks)
- Use X_train to visualise the first 10 images. (1 mark)
- Use y_train1 to print the first 10 labels. (0.5 mark)
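
One possible way to lay out such a grid is sketched below. It runs on randomly generated placeholder arrays (`demo_images`, `demo_labels`) rather than the real data; the helper name `show_first_images` and the 32x32 image size are assumptions made only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data standing in for X_train / y_train1.
# The 32x32 greyscale shape is an assumption, not taken from the dataset.
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(10, 32, 32)).astype("float32")
demo_labels = rng.integers(0, 10, size=10)


def show_first_images(images, labels, n=10):
    """Hypothetical helper: plot the first n images in a row, titled by label."""
    fig, axes = plt.subplots(1, n, figsize=(n, 1.5))
    for ax, img, label in zip(axes, images[:n], labels[:n]):
        ax.imshow(img, cmap="gray")  # greyscale colour map for single-channel data
        ax.set_title(str(label))
        ax.axis("off")
    return fig


fig = show_first_images(demo_images, demo_labels)
print("labels:", demo_labels[:10])
```

In the notebook, the same pattern would be applied to the real image and label arrays instead of the placeholders.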

In [None]:
# visualizing the first 10 images in the dataset and their labels
# Your code here

### Data preparation (6 marks)

- Print the first image in the training set and figure out the shape of the images. (0.5 mark)
- Reshape the train and test datasets so that they fit the first convolutional layer that we will create later. Figure out the required shape. (3 marks)
- Normalise the train and test datasets by dividing by 255. (1 mark)
- Print the new shapes of the train and test sets. (0.5 mark)
- One-hot encode the target variables. (1 mark)
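
The three transformations can be illustrated on a tiny synthetic batch. This is a minimal sketch in plain NumPy: the `demo_` names, the `(5, 32, 32)` shape, and the choice of 10 classes are assumptions for illustration and should be checked against the real data.

```python
import numpy as np

# Placeholder batch standing in for X_train / y_train1.
# The (5, 32, 32) shape is an assumption for this sketch.
rng = np.random.default_rng(0)
demo_X = rng.integers(0, 256, size=(5, 32, 32)).astype("uint8")
demo_y = np.array([3, 0, 9, 1, 3])

# Add a trailing channel axis: (samples, height, width) -> (samples, height, width, 1).
demo_x = demo_X.reshape(demo_X.shape[0], demo_X.shape[1], demo_X.shape[2], 1)

# Scale pixel values from the 0-255 range to the 0-1 range.
demo_x = demo_x.astype("float32") / 255.0

# One-hot encode the labels by picking rows of an identity matrix.
num_classes = 10  # assumed: one class per digit 0-9
demo_y_onehot = np.eye(num_classes, dtype="float32")[demo_y]

print(demo_x.shape, demo_y_onehot.shape)
```

The `to_categorical` utility already imported in this notebook performs an equivalent one-hot encoding, so the identity-matrix trick is shown only to make the operation explicit.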

In [None]:
# Shape of the images and the first image
# Your code here

In [None]:
# Reshape the datasets to add a channel dimension. Remember that we are reshaping each 2D image into 3D data with just one channel

#Uncomment below to answer
# x_train = X_train.reshape(__________)
# x_test = X_test.reshape(___________)

In [None]:
# Normalize inputs from 0-255 to 0-1

# Your code here

In [None]:
# New shape 

#Your code here

In [None]:
# one hot encode output
# Your code here

### Model Building (7 marks)
- Write a function that returns a sequential model with the following architecture:
 - First convolutional layer with 16 filters and kernel size = 3. Use the 'same' padding and provide an appropriate input shape.
 - Add a LeakyReLU layer next with a slope of 0.1.
 - Second convolutional layer with 32 filters and kernel size = 3. Use the 'same' padding.
 - Another LeakyReLU layer, same as above.
 - A max-pooling layer with a pool size of 2.
 - Flatten the output from the previous layer.
 - Add a dense layer with 32 nodes.
 - Add a LeakyReLU layer with a slope of 0.1.
 - Add the final output layer with nodes equal to the number of classes and softmax activation.
 - Compile the model with the categorical_crossentropy loss, the Adam optimizer (learning rate = 0.001), and the accuracy metric.
- Do not fit the model here; just return the compiled model.
- Call the function and print the model summary.
- Fit the model on the training data with a validation split of 0.2, batch size = 32, verbose = 1, and 20 epochs. Store the training history to use it later for visualisation.
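
As a rough illustration of how such a layer stack maps onto Keras calls, consider the sketch below. The function name `sketch_cnn` and the default `input_shape=(32, 32, 1)` and `num_classes=10` are assumptions chosen for this example; verify them against the shapes you found during data preparation.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, Flatten, LeakyReLU, MaxPool2D
from tensorflow.keras.optimizers import Adam


def sketch_cnn(input_shape=(32, 32, 1), num_classes=10):
    """Hypothetical helper: two conv layers, one pooling layer, one hidden dense layer."""
    model = Sequential()
    # Convolutions use 'same' padding, so height and width are preserved.
    model.add(Conv2D(16, kernel_size=3, padding="same", input_shape=input_shape))
    model.add(LeakyReLU(0.1))
    model.add(Conv2D(32, kernel_size=3, padding="same"))
    model.add(LeakyReLU(0.1))
    # Pooling halves height and width before flattening.
    model.add(MaxPool2D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(32))
    model.add(LeakyReLU(0.1))
    # Softmax output: one probability per class.
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(
        loss="categorical_crossentropy",
        optimizer=Adam(learning_rate=0.001),
        metrics=["accuracy"],
    )
    return model


sketch_model = sketch_cnn()
sketch_model.summary()
```

Training would then follow the usual `fit(x, y, validation_split=0.2, batch_size=32, epochs=20, verbose=1)` pattern, whose return value carries the per-epoch metrics in its `history` attribute.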


In [None]:
# define model

from tensorflow.keras import losses
from tensorflow.keras import optimizers

# Uncomment below to answer

# def cnn_model_1():
#     model_1 = Sequential()
#     #Your code here
#     return model_1

In [None]:
# Call the function and print the model summary

In [None]:
# Fit the model and save the history
# Uncomment below to answer

# history_model_1 = model_1.fit()

### Plotting the validation and training accuracies (1.5 marks)
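
A minimal plotting sketch follows. It assumes the history dictionary uses the keys `accuracy` and `val_accuracy`, which is what TensorFlow 2.x records when the model is compiled with the `accuracy` metric. The numbers in `demo_history` are dummy values for illustration only, not results from this project, and `plot_accuracy` is a hypothetical helper name.

```python
import matplotlib.pyplot as plt

# Dummy values for illustration only; these are NOT results from this project.
demo_history = {
    "accuracy": [0.40, 0.60, 0.70],
    "val_accuracy": [0.45, 0.58, 0.66],
}


def plot_accuracy(history_dict):
    """Hypothetical helper: plot training and validation accuracy per epoch."""
    epochs = range(1, len(history_dict["accuracy"]) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, history_dict["accuracy"], label="train")
    ax.plot(epochs, history_dict["val_accuracy"], label="validation")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Accuracy")
    ax.legend()
    return fig


fig = plot_accuracy(demo_history)
```

With a real training run, the dictionary passed in would be the `history` attribute of the object returned by `fit`. A widening gap between the two curves is a common sign of overfitting.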

In [None]:
# plotting the accuracies
# Your code here


**Comments**


### Iteration 2 (12 marks)
- Experiment with adding dropout layers to make the model generalise better and report the results.
- Feel free to explore various architectures that can help you generalise better.
- Repeat all the steps done above and plot the results
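
One way dropout could be placed in such a network is sketched below. The helper name `sketch_cnn_with_dropout`, the dropout rates (0.25 and 0.5), the use of `BatchNormalization`, and the default input shape are all assumptions for illustration; they are starting points to experiment with, not recommended settings.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    BatchNormalization,
    Conv2D,
    Dense,
    Dropout,
    Flatten,
    LeakyReLU,
    MaxPool2D,
)


def sketch_cnn_with_dropout(input_shape=(32, 32, 1), num_classes=10):
    """Hypothetical helper: a small CNN with dropout after pooling and after the dense layer."""
    model = Sequential()
    model.add(Conv2D(16, kernel_size=3, padding="same", input_shape=input_shape))
    model.add(LeakyReLU(0.1))
    model.add(BatchNormalization())
    model.add(MaxPool2D(pool_size=2))
    # Dropout randomly zeroes a fraction of activations during training only.
    model.add(Dropout(0.25))  # assumed rate
    model.add(Flatten())
    model.add(Dense(32))
    model.add(LeakyReLU(0.1))
    model.add(Dropout(0.5))  # assumed rate
    model.add(Dense(num_classes, activation="softmax"))
    model.compile(
        loss="categorical_crossentropy",
        optimizer="adam",
        metrics=["accuracy"],
    )
    return model


dropout_model = sketch_cnn_with_dropout()
dropout_model.summary()
```

Because dropout is active only during training, training accuracy can sit below validation accuracy in the early epochs; this is expected behaviour rather than a bug.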

In [None]:
# Uncomment below and complete

# def cnn_model_2():
#     # initialized a sequential model
#     model_2 = Sequential()
#     # Your code here
#     return model_2

In [None]:
#Call the function and print model summary

#Your code here

In [None]:
# Fit the model
# Uncomment below and complete
# history_model_2 = model_2.fit()

In [None]:
# plotting the accuracies

#Your code here

#### Comments:


### Test set prediction and final comments (Using the better model of the two iterations) (2 marks)
- Predict on the test set and comment on the results obtained. (2 marks)
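
A softmax model outputs one probability per class, so predictions have to be converted to class labels before they can be compared with the true labels. The sketch below shows that step with NumPy on a made-up three-class example. The `demo_` arrays are placeholders, not model output; in the notebook they would be replaced by the predicted probabilities and the integer test labels.

```python
import numpy as np

# Placeholder arrays for illustration only (three classes, four samples).
demo_probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
])
demo_true = np.array([1, 0, 2, 1])

# The predicted class is the index of the largest probability in each row.
demo_pred = np.argmax(demo_probs, axis=1)

# Accuracy is the fraction of predictions that match the true labels.
accuracy = np.mean(demo_pred == demo_true)

# Confusion matrix: rows are true classes, columns are predicted classes.
num_classes = demo_probs.shape[1]
confusion = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(confusion, (demo_true, demo_pred), 1)

print(demo_pred, accuracy)
print(confusion)
```

Off-diagonal entries of the confusion matrix show which digits are mistaken for which, which is often more informative for the final comments than a single accuracy figure.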


In [None]:
# predict on the test dataset
# Your code here

#### Comments

