# What is Machine Learning?

Machine learning is the study of computer algorithms that improve automatically through experience. These algorithms build a mathematical model (for example, a neural network) from sample data called the "training data".

# What are Neural Networks?

A neural network is a series of algorithms that finds patterns and relationships in a set of data, in a way that loosely mimics the working of the human brain.

<img src="assets/NeuralNetwork.png" width=600px>

Each of the nodes in the neural network is called a perceptron.

## Perceptrons

A perceptron is essentially an artificial neuron, and it is the basic unit of a neural network.

<img src="assets/simple_neuron.png" width=400px>

Each perceptron takes some number of inputs, multiplies each one by a weight, and sums the results together with a bias (a linear combination). That sum is then passed through an activation function to get the output.

Mathematically this looks like: 

$$
\begin{align}
y &= f(w_1 x_1 + w_2 x_2 + b) \\
y &= f\left(\sum_i w_i x_i +b \right)
\end{align}
$$

In vector form, the weighted sum is the dot (inner) product of the input vector and the weight vector:

$$
h = \begin{bmatrix}
x_1 \, x_2 \cdots  x_n
\end{bmatrix}
\cdot 
\begin{bmatrix}
           w_1 \\
           w_2 \\
           \vdots \\
           w_n
\end{bmatrix}
$$
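
As a minimal sketch of the weighted sum above, the snippet below computes the dot product with `torch.dot` and checks it against an explicit multiply-and-sum. The input and weight values are made up for illustration and are not part of the original material.

```python
import torch

# Hypothetical inputs and weights, chosen only for illustration
x = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([0.5, -1.0, 0.25])

# Dot product: multiply element-wise, then sum
h = torch.dot(x, w)

# The same value computed explicitly
h_manual = (x * w).sum()

print(h, h_manual)
```

Note that the bias and the activation function are not applied here; this is only the linear part of the perceptron.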

## Activation Functions

Activation functions are mathematical functions applied to the weighted sum of a unit to determine its output, and therefore the output of the neural network.

<img src="assets/Activation_Functions.png" width=700px>
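
To get a feel for what an activation function does, here is a small sketch that applies three common ones (sigmoid, tanh and ReLU) to the same values using PyTorch's built-in functions. The sample values are arbitrary, and the choice of these three functions is an assumption made for illustration.

```python
import torch

# A few sample pre-activation values (illustrative only)
z = torch.tensor([-2.0, 0.0, 2.0])

sigmoid_out = torch.sigmoid(z)  # squashes values into (0, 1)
tanh_out = torch.tanh(z)        # squashes values into (-1, 1)
relu_out = torch.relu(z)        # replaces negative values with 0

print(sigmoid_out)
print(tanh_out)
print(relu_out)
```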



## Tensors

Neural network computation is essentially just a bunch of linear algebra operations on *tensors*. Tensors are the data structures used for neural network computation, and PyTorch and other deep learning libraries are built around them.

<img src="assets/Tensors.png" width=500px>
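
As a rough illustration, the sketch below creates tensors with zero to three dimensions and prints the number of dimensions and the shape of each. The variable names and values are hypothetical.

```python
import torch

scalar = torch.tensor(3.0)                # 0-dimensional tensor
vector = torch.tensor([1.0, 2.0, 3.0])    # 1-dimensional tensor
matrix = torch.tensor([[1.0, 2.0],
                       [3.0, 4.0]])       # 2-dimensional tensor
cube = torch.zeros(2, 3, 4)               # 3-dimensional tensor filled with zeros

for t in (scalar, vector, matrix, cube):
    print(t.dim(), tuple(t.shape))
```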

## Single Perceptron 

Now let's create our first perceptron/neuron.

In [2]:
#Import pytorch
import torch

In [None]:
#Define activation function (sigmoid)


In [None]:
#Generate Data



In [None]:
#Obtaining the output

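
If you get stuck on the cells above, one possible way to fill them in is sketched below. It assumes a single sample with five features and a fixed random seed of 7; the seed and the variable names (`activation`, `features`, `weights`, `bias`) are assumptions, not something this notebook prescribes.

```python
import torch

def activation(x):
    """Sigmoid activation function."""
    return 1 / (1 + torch.exp(-x))

# Assumed seed so the random values are reproducible
torch.manual_seed(7)

# Assumed shapes: one sample with five features
features = torch.randn((1, 5))
weights = torch.randn_like(features)  # one weight per feature
bias = torch.randn((1, 1))

# Weighted sum plus bias, passed through the activation
y = activation(torch.sum(features * weights) + bias)
print(y)
```

The exact number printed depends on the seed and on the PyTorch version, but it always lies between 0 and 1 because of the sigmoid.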

You can do the multiplication and the sum in a single operation using matrix multiplication. Matrix multiplication is also much more efficient, because modern libraries and GPUs are optimized for it.

Here we'll use [`torch.mm()`](https://pytorch.org/docs/stable/torch.html#torch.mm) to perform matrix multiplication on features and weights.

**Note**: For matrix multiplication, the two matrices must have shapes [M x N] and [N x O], so that the number of columns of the first matches the number of rows of the second.

In our case both tensors are row vectors of shape [1 x 5], so they cannot be multiplied as they are.

To fix this, we need to reshape the weights tensor to [5 x 1]. We have a few options: [`weights.reshape()`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.reshape), [`weights.resize_()`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.resize_), and [`weights.view()`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.view).

* `weights.reshape(a, b)` will return a tensor of size `(a, b)`. Sometimes it shares the same data as `weights`, and sometimes it is a clone, meaning the data is copied to another part of memory.
* `weights.resize_(a, b)` returns the same tensor with a different shape. However, if the new shape results in fewer elements than the original tensor, some elements will be removed from the tensor (but not from memory). If the new shape results in more elements than the original tensor, new elements will be uninitialized in memory. Here I should note that the underscore at the end of the method denotes that this method is performed **in-place**. Here is a great forum thread to [read more about in-place operations](https://discuss.pytorch.org/t/what-is-in-place-operation/16244) in PyTorch.
* `weights.view(a, b)` will return a new tensor with the same data as `weights` with size `(a, b)`.
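
The differences between these three methods can be seen in a small sketch. The tensor values below are illustrative, and `resize_` is applied to a clone so that the original `weights` tensor keeps its shape.

```python
import torch

# Illustrative values 0..4 with shape (1, 5)
weights = torch.arange(5.0).reshape(1, 5)

viewed = weights.view(5, 1)       # shares memory with weights
reshaped = weights.reshape(5, 1)  # also shares memory here, since the data is contiguous

# Writing through the view changes the original tensor too
viewed[0, 0] = 100.0
print(weights)         # first element is now 100
print(reshaped[0, 0])  # also 100, because no copy was made

# resize_ changes the shape of the tensor itself (in-place)
resized = weights.clone()
resized.resize_(5, 1)
print(resized.shape)
```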

> In today's session we'll use `.view()`.

In [1]:
#Using torch.mm()


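
One way the cell above could be completed is sketched here. As before, the seed of 7 and the variable names are assumptions. The sketch also checks that the matrix-multiplication version agrees with the multiply-and-sum version.

```python
import torch

def activation(x):
    """Sigmoid activation function."""
    return 1 / (1 + torch.exp(-x))

torch.manual_seed(7)  # assumed seed
features = torch.randn((1, 5))
weights = torch.randn_like(features)
bias = torch.randn((1, 1))

# (1 x 5) times (5 x 1) gives a (1 x 1) result
y = activation(torch.mm(features, weights.view(5, 1)) + bias)

# Same computation using element-wise multiplication and a sum
y_sum = activation(torch.sum(features * weights) + bias)
print(y, y_sum)
```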

## Creating our first neural network

That was a single neuron, which alone doesn't do much. When neurons are stacked together into layers, they show their real potential.

<img src="assets/multilayer_diagram_weights.png" width=500px>

Here the bottom layer provides the inputs and is called the **Input layer**. The layer that follows is called the **Hidden layer**. The topmost layer provides us with the output and is fittingly called the **Output layer**.

This network can be represented mathematically with matrices as shown below.

$$
\vec{h} = [h_1 \, h_2] = 
\begin{bmatrix}
x_1 \, x_2 \cdots \, x_n
\end{bmatrix}
\cdot 
\begin{bmatrix}
           w_{11} & w_{12} \\
           w_{21} &w_{22} \\
           \vdots &\vdots \\
           w_{n1} &w_{n2}
\end{bmatrix}
$$

The output for this small network is found by treating the hidden layer as inputs for the output unit. The network output can be expressed simply as:

$$
y =  f_2 \! \left(\, f_1 \! \left(\vec{x} \, \mathbf{W_1}\right) \mathbf{W_2} \right)
$$
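
A minimal sketch of this forward pass is shown below, assuming sigmoid for both activation functions, three input features, two hidden units and one output unit, with a random seed of 7. The input size, the seed and the names `W1`, `W2`, `B1` and `B2` are assumptions. Bias terms are included even though the formula above leaves them out.

```python
import torch

def activation(x):
    """Sigmoid activation function."""
    return 1 / (1 + torch.exp(-x))

torch.manual_seed(7)  # assumed seed

# Assumed layer sizes
n_input, n_hidden, n_output = 3, 2, 1

# One sample with n_input features
features = torch.randn((1, n_input))

# Weights and biases for both layers
W1 = torch.randn(n_input, n_hidden)
W2 = torch.randn(n_hidden, n_output)
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

# Hidden layer output, then network output
h = activation(torch.mm(features, W1) + B1)
y = activation(torch.mm(h, W2) + B2)
print(y)
```

The value printed depends on the seed, the layer sizes and the order in which the random tensors are created.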

In [None]:
### Generate some data


# Features 


# Define the size of each layer in our network



# Weights for inputs to hidden layer


# Weights for hidden layer to output layer


# and bias terms for hidden and output layers



In [None]:
# Output


Once executed correctly you should see the output `tensor([[ 0.3171]])`. 