# Deep Learning Project 2 

* Part 1 due on Friday 2/11
* Part 2 due on Thursday 2/15

## Multi-class classification via neural networks

In this project, you will learn how to implement all the basic components of a neural network, including forward propagation, gradient computation, and back propagation, following the development framework presented by Professor Andrew Ng for [binary classification](https://www.youtube.com/watch?v=eqEc66RFY0I&list=PLkDaE6sCZn6Ec-XTbcX1uRg2_u4xOEky0&index=7). The activities in this project will help you gain valuable intuition about several of the fundamental programming techniques powering neural network libraries. Through hands-on application, you will also develop an appreciation for some of the practical considerations involved in training a neural network.

As in Project 1, we will use the MNIST dataset to experiment with binary and multi-class classification problems. 

### Learning outcomes
After completing project 2, you will be able to:
* Implement neural networks that use cross-entropy loss for binary classification 
* Apply ideas originating in binary classification to multi-class classification problems
* Describe and apply activation functions, such as sigmoid and softmax
* Understand the role of parameters and hyperparameter initialization

### Multi-class classification: the MNIST dataset
The MNIST dataset consists of 70,000 gray-scale images (samples) of hand-written digits 0 through 9. The multi-class classification problem consists of classifying each sample accurately as belonging to one of ten classes. The dataset is divided into a training set (60,000 samples) and a test set (10,000 samples).

### 1. Logistic regression using a neural network implementation framework (93% undergrad, 83% grad)
Prof. Andrew Ng's Coursera videos, assigned in Module 1, explain how logistic regression can be implemented as a single neuron that receives images as input and predicts their classification into one of two classes (cats vs. non-cats in his videos). He explains in detail how the process can be separated into a forward pass, calculation of a loss function, and numerical optimization using gradient descent in the back propagation step. 
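
For reference, the three steps can be summarized in equations. This is a sketch in commonly used notation, which is an assumption here rather than something fixed by this handout: $X$ is the (features, samples) data matrix with $m$ samples, $A$ and $Y$ are the $1 \times m$ row vectors of predictions and labels, and $\alpha$ is the learning rate.

$$
\begin{aligned}
z^{(i)} &= w^{T} x^{(i)} + b, \qquad a^{(i)} = \sigma(z^{(i)}) = \frac{1}{1 + e^{-z^{(i)}}} \\
J(w, b) &= -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + (1 - y^{(i)}) \log\left(1 - a^{(i)}\right) \right] \\
\frac{\partial J}{\partial w} &= \frac{1}{m} X (A - Y)^{T}, \qquad \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( a^{(i)} - y^{(i)} \right) \\
w &\leftarrow w - \alpha \frac{\partial J}{\partial w}, \qquad b \leftarrow b - \alpha \frac{\partial J}{\partial b}
\end{aligned}
$$

The first line is the forward pass, the second is the cross-entropy cost, the third gives the gradients computed in the backward pass, and the last is one gradient descent update.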

Your job for this part of the project is to implement the "logistic regression with a neural network mindset" approach described by Professor Ng. For this, you will use a Jupyter notebook provided as part of his Coursera course. A folder containing the notebook, as well as the other files and folders you need, can be found [here](https://drive.google.com/drive/folders/1E4vHPakEiAgdWVcNJnlEnYhHqPU-N35D?usp=sharing).

#### Implementation requirements (60%)
* The Jupyter notebook contains step-by-step implementation instructions. Follow these instructions carefully.
* Your code should use the vectorization techniques learned from Prof. Ng's videos. **Pay attention to the order of dimensions of the data matrix X: they are ordered as (features, samples).**
* You can use the cat/non-cat dataset to test your implementation, but it's not required.
* You do need to test your binary classification code on the MNIST dataset. To do this, set up a binary classification problem using the MNIST multi-class dataset (you might need to do some preprocessing) and then test your code (e.g., check that the loss is decreasing and that the accuracy is reasonable).
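
One possible way to set up the binary problem is sketched below. The function name `make_binary_problem`, the choice of "digit 0 vs. everything else", and the assumption that pixel values lie in 0..255 are illustrative only and are not part of the provided notebook. The example uses random stand-in arrays rather than the real MNIST files, so it only demonstrates the reshaping into the (features, samples) layout.

```python
import numpy as np

def make_binary_problem(images, labels, positive_digit=0):
    """Convert multi-class arrays into a binary problem.

    images: shape (m, 28, 28) or (m, 784), pixel values assumed in 0..255
    labels: shape (m,), digits 0..9
    Returns X with shape (features, samples) and Y with shape (1, samples).
    """
    m = images.shape[0]
    # Flatten each image, put samples in columns, scale pixels to [0, 1].
    X = images.reshape(m, -1).T.astype(np.float64) / 255.0
    # Label is 1 for the chosen digit and 0 for every other digit.
    Y = (labels == positive_digit).astype(np.float64).reshape(1, m)
    return X, Y

# Stand-in data with MNIST-like shapes (not the real dataset).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28))
fake_labels = np.array([0, 3, 0, 7, 9])

X, Y = make_binary_problem(fake_images, fake_labels, positive_digit=0)
print(X.shape, Y.shape)  # (784, 5) (1, 5)
```

Other setups are equally valid, for example keeping only two digits and discarding the rest. Whichever you choose, note how many samples end up in each class.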
    
**Suggestions:** 
* Always keep the size of your matrices and vectors in mind to avoid confusion.
* Include some tests or sanity checks as you've seen in our homework assignment and in Prof. Ng's notebook.
* Avoid loops. Learn to use vectorization.
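
As a small illustration of these suggestions, the sketch below computes the activations of a single neuron twice, once with explicit loops and once with a single matrix product, and then checks that the shapes and values agree. The data is random and the variable names are illustrative assumptions, not names required by the notebook.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_features, m = 784, 200
X = rng.random((n_features, m))                   # (features, samples)
w = rng.standard_normal((n_features, 1)) * 0.01   # column vector of weights
b = 0.0

# Loop version: one sample and one feature at a time.
A_loop = np.zeros((1, m))
for i in range(m):
    z_i = 0.0
    for j in range(n_features):
        z_i += w[j, 0] * X[j, i]
    A_loop[0, i] = sigmoid(z_i + b)

# Vectorized version: a single matrix product covers all samples.
A_vec = sigmoid(np.dot(w.T, X) + b)

# Sanity checks: expected shape, and both versions agree.
assert A_vec.shape == (1, m)
assert np.allclose(A_loop, A_vec)
```

Shape assertions like the ones at the end are cheap to add and catch most dimension mix-ups early.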

#### Analysis (40%)
Experiment with your code to provide answers and corresponding justification for the following questions:

* (15%) Are there digits in the MNIST dataset that seem harder to classify? If so, did you see similar trends when you applied unsupervised learning in Project 1?
* (15%) How do the learning curve, the train accuracy, and the test accuracy change as a function of the learning rate, assuming a fixed number of iterations (say, 2000 iterations)?
* (10%) What do you think about the accuracy results you are getting? Do you trust them? Why or why not? *Hint:* Think about how you preprocessed the MNIST data to set up your binary classification problem.
* (3% bonus) Your algorithm has learned a weight vector and a bias value. Is there something you can say about the weight vector? Is there meaning to it or is it just a useless byproduct?


### 2. Extending the framework for multi-class classification (7% undergrad, 17% grad)

There are various ways in which one can extend the idea behind binary classification for the purpose of multi-class classification. For example:
 * One can train 10 binary classification models in parallel (10 neurons with 10 outputs, one for each class), where 10 different weight vectors and 10 bias values can be learned simultaneously.  
 * A second, slightly different option is to put together all 10 weight vectors w into a matrix W and all 10 bias values into a vector b, and replace the sigmoid activation function with a softmax activation function. Your code will have to deal with a weight matrix W instead of vectors w. This would be a true neural network of one layer consisting of 10 nodes (neurons).
 * A third, more complicated, option is to implement a two-layer neural network. The first (hidden) layer can consist of N nodes (say 100) and the second layer would be the output layer consisting of 10 nodes and a softmax activation function. 
 
For this part, you must implement one of the three models described above and demonstrate that your code is producing accurate results. 
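
To make the second option more concrete, here is a minimal sketch of the forward pass and cross-entropy cost for a single softmax layer. It assumes W has shape (10, features), b has shape (10, 1), and labels are one-hot encoded as a (10, samples) matrix. The helper names (`softmax`, `one_hot`, `forward_and_cost`) are illustrative, and the data is random rather than real MNIST. It is not a complete solution, since the gradient computation and the training loop are left out.

```python
import numpy as np

def softmax(Z):
    """Column-wise softmax for Z of shape (classes, samples)."""
    # Subtracting the column maximum avoids overflow in exp.
    Z_shifted = Z - Z.max(axis=0, keepdims=True)
    expZ = np.exp(Z_shifted)
    return expZ / expZ.sum(axis=0, keepdims=True)

def one_hot(labels, num_classes=10):
    """Encode integer labels of shape (m,) as a (num_classes, m) matrix."""
    Y = np.zeros((num_classes, labels.size))
    Y[labels, np.arange(labels.size)] = 1.0
    return Y

def forward_and_cost(W, b, X, Y):
    """Forward pass and average cross-entropy cost."""
    m = X.shape[1]
    A = softmax(W @ X + b)                      # (classes, samples)
    cost = -np.sum(Y * np.log(A + 1e-12)) / m   # small constant guards log(0)
    return A, cost

rng = np.random.default_rng(2)
n_features, m, num_classes = 784, 6, 10
X = rng.random((n_features, m))                 # stand-in data
labels = np.array([3, 1, 4, 1, 5, 9])

W = np.zeros((num_classes, n_features))
b = np.zeros((num_classes, 1))
A, cost = forward_and_cost(W, b, X, one_hot(labels))

assert A.shape == (num_classes, m)
assert np.allclose(A.sum(axis=0), 1.0)          # each column is a distribution
```

With all-zero parameters every class gets probability 0.1, so the initial cost should be close to log(10), which is a handy sanity check before training.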



### What to turn in for Project 2

By Sunday 9/20 at noon, you should turn in:
* A Jupyter notebook with your code for Project 2 - Part 1 (include output cells you want me to look at!)
* Your analysis of binary classification via a single (logistic regression) neuron in a PDF or in Markdown cells in your Jupyter notebook.

By Friday 9/25 at noon, you should turn in:
* A Jupyter notebook with your code for Project 2 - Part 2 (include output cells you want me to look at!)
* Results of multi-class classification on the MNIST dataset and a brief analysis in a PDF or in Markdown cells in your Jupyter notebook.
    