### Environment Setup

In [37]:
# ENV Setup
# For a dark theme, and to reset to the default theme
# !pip install jupyterthemes

# from jupyterthemes import get_themes
# from jupyterthemes.stylefx import set_nb_theme

# set_nb_theme('onedork')

# !jt -r
%config IPCompleter.greedy=True
print('Deep Learning With PyTorch')

Deep Learning With PyTorch


In [6]:
# !pip install jovian --upgrade

# !jovian clone aakashns/01-pytorch-basics

[jovian] NOTE: Jovian is currently in beta, so if you face any issues, 
               please report them here: https://github.com/JovianML/jovian-py/issues
[jovian] Fetching aakashns/01-pytorch-basics ..
[jovian] Downloading files..
[jovian] Cloned successfully to '01-pytorch-basics'

Next steps:
  $ cd 01-pytorch-basics
  $ jovian install
  $ conda activate <env_name>
  $ jupyter notebook

Replace <env_name> with the name of your environment (without the '<' & '>')
Jovian uses Anaconda ( https://conda.io/ ) under the hood,
so please make sure you have it installed and added to path.
* If you face issues with `jovian install`, try `conda env update`.
* If you face issues with `conda activate`, try `source activate <env_name>`
  or `activate <env_name>` to activate the virtual environment.


With this, we complete our discussion of tensors and gradients in PyTorch, and we're ready to move on to the next topic: *Linear regression*.

## Credits

The material in this series is heavily inspired by the following resources:

1. [PyTorch Tutorial for Deep Learning Researchers](https://github.com/yunjey/pytorch-tutorial) by Yunjey Choi

2. [FastAI development notebooks](https://github.com/fastai/fastai_docs/tree/master/dev_nb) by Jeremy Howard


# Chapter One [  PyTorch Basics  ]
### Tensor and Gradient

In [10]:
# !ls
# !pwd
# !cd 01-pytorch-basics/

# !pip install torch
import torch

In [14]:
# Scalar (a single number)
t1 = torch.tensor(4.)
print(t1)
print(type(t1))
t1.dtype

tensor(4.)
<class 'torch.Tensor'>


torch.float32

In [15]:
# Vector
t2 = torch.tensor([1.,2,3,4])
t2

tensor([1., 2., 3., 4.])

In [18]:
# Matrix
t3 = torch.tensor([[1.,2],[10,21],[65,56]])
t3

tensor([[ 1.,  2.],
        [10., 21.],
        [65., 56.]])

In [24]:
# 3-dimensional tensor
t4 = torch.tensor([
    [[11,12,13],
    [14,15,16]],
    [[17,18,19],
    [20.,21,22]]
])
# Every row must have the same number of elements; otherwise torch.tensor
# raises an error such as: expected sequence of length 3 at dim 2 (got 4)

t4

tensor([[[11., 12., 13.],
         [14., 15., 16.]],

        [[17., 18., 19.],
         [20., 21., 22.]]])

Tensors can have any number of dimensions, and a different length along each dimension. We can inspect the length along each dimension using the __.shape__ property of a tensor.

In [22]:
t1.shape

torch.Size([])

In [26]:
t2.shape

torch.Size([4])

In [27]:
t3.shape

torch.Size([3, 2])

In [28]:
t4.shape

torch.Size([2, 2, 3])
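As a small illustration of the equal-length rule, the sketch below builds one regular nested list and one ragged one. The names `regular` and `ragged_rows` are made up for this example and are not part of the original notebook.

```python
import torch

# A regular 2 x 3 nested list converts cleanly.
regular = torch.tensor([[1., 2, 3], [4, 5, 6]])
shape = tuple(regular.shape)   # torch.Size behaves like a tuple
ndim = regular.ndim            # number of dimensions
count = regular.numel()        # total number of elements
print(shape, ndim, count)

# A ragged nested list (rows of length 3 and 4) is rejected.
ragged_rows = [[1., 2, 3], [4, 5, 6, 7]]
try:
    torch.tensor(ragged_rows)
    ragged_ok = True
except ValueError as err:
    ragged_ok = False
    print('ragged input rejected:', err)
```

The length of the shape equals the number of dimensions, which is why the scalar `t1` reports an empty `torch.Size([])`.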

### Tensor Operations and Gradients
We can combine tensors with the usual arithmetic operations. Let's look at an example.

In [41]:
# Create Tensor
x = torch.tensor(2.)
w = torch.tensor(5., requires_grad=True)
b = torch.tensor(9., requires_grad=True)


We've created 3 tensors ```x```, ```w``` and ```b```, all numbers. ```w``` and ```b``` have an additional parameter __requires_grad__ set to __True__. We'll see what it does in just a moment.

Let's create a new Tensor ```y``` by combining these Tensors.

In [42]:
y = w * x + b

y

tensor(19., grad_fn=<AddBackward0>)

As expected, ```y``` is a tensor with the value ```5 * 2 + 9 = 19```. What makes __PyTorch__ special is that we can automatically compute the derivative of ```y``` w.r.t. the tensors that have ```requires_grad``` set to ```True``` i.e. ```w``` and ```b```. To compute the derivatives, we can call the ```.backward``` method on our result ```y```.

In [43]:
# Compute Derivative
y.backward()

The derivatives of ```y``` w.r.t. the input tensors are stored in the ```.grad``` property of the respective tensors.

In [44]:
# Display Gradients
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)

dy/dx: None
dy/dw: tensor(2.)
dy/db: tensor(1.)


As expected, `dy/dw` has the same value as `x`, i.e. `2`, and `dy/db` has the value `1`. Note that `x.grad` is `None` because `x` doesn't have `requires_grad` set to `True`.

The "grad" in `w.grad` stands for gradient, which is another term for derivative, used mainly when dealing with matrices. 
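To check the rule that `dy/dw` equals `x` and `dy/db` equals `1`, here is a minimal sketch that repeats the computation with different numbers. The variable names `x_demo`, `w_demo`, `b_demo` and `y_demo` are chosen for this illustration so they don't overwrite the tensors above.

```python
import torch

x_demo = torch.tensor(3.)
w_demo = torch.tensor(4., requires_grad=True)
b_demo = torch.tensor(5., requires_grad=True)

y_demo = w_demo * x_demo + b_demo   # 4 * 3 + 5 = 17
y_demo.backward()

# Analytically: dy/dw = x and dy/db = 1
print('dy/dw:', w_demo.grad)   # matches x_demo
print('dy/db:', b_demo.grad)   # always 1 for this expression
print('dy/dx:', x_demo.grad)   # None, since requires_grad was not set
```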

## Interoperability with ```Numpy```

[Numpy](http://www.numpy.org/) is a popular open source library used for mathematical and scientific computing in Python. It enables efficient operations on large multi-dimensional arrays, and has a large ecosystem of supporting libraries:

* [Matplotlib](https://matplotlib.org/) for plotting and visualization
* [OpenCV](https://opencv.org/) for image and video processing
* [Pandas](https://pandas.pydata.org/) for file I/O and data analysis

Instead of reinventing the wheel, PyTorch interoperates really well with Numpy to leverage its existing ecosystem of tools and libraries.

In [7]:
import numpy as np

In [12]:

x = np.array([[1,2], [3,4.]])
print(x, x.dtype)

[[1. 2.]
 [3. 4.]] float64


We can convert a Numpy array to a torch tensor using either ```torch.from_numpy()``` or ```torch.tensor()```.

```torch.from_numpy()``` returns a tensor that shares the same underlying memory as the Numpy array, whereas ```torch.tensor()``` creates a copy of the data.

In [11]:
# Convert numpy array to torch tensor:
y = torch.from_numpy(x)
# y = torch.tensor(x)
y

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)

In [13]:
x.dtype, y.dtype


(dtype('float64'), torch.float64)
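The difference between sharing memory and copying is easiest to see by modifying the Numpy array after the conversion. The following is a small sketch; the names `arr`, `shared` and `copied` are made up for this example.

```python
import numpy as np
import torch

arr = np.array([[1, 2], [3, 4.]])
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy of the data

arr[0, 0] = 100.                 # modify the Numpy array in place

print(shared[0, 0])   # reflects the change made to arr
print(copied[0, 0])   # still holds the original value
```

Sharing memory avoids a copy for large arrays, but it also means that in-place changes on one side are visible on the other.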

We can convert a PyTorch tensor to a Numpy array using the ```.numpy``` method of the tensor.

In [14]:
z = y.numpy()
print(z)
z.dtype

[[1. 2.]
 [3. 4.]]


dtype('float64')

The interoperability between PyTorch and Numpy is really important because most datasets you'll work with will likely be read and processed as Numpy arrays.

In [1]:
# import jovian
# jovian.commit()

# Chapter - II [  Linear Regression  ]

### Linear Regression and Gradient Descent

### Linear Regression with PyTorch

In this chapter we'll discuss one of the foundational algorithms of machine learning: *Linear regression*. We'll create a model that predicts crop yields for apples and oranges (*target variables*) by looking at the average temperature, rainfall and humidity (*input variables or features*) in a region. Here's the training data:

![linear-regression-training-data](https://i.imgur.com/6Ujttb4.png)

In a linear regression model, each target variable is estimated to be a weighted sum of the input variables, offset by some constant, known as a bias:

```
yield_apple  = w11 * temp + w12 * rainfall + w13 * humidity + b1
yield_orange = w21 * temp + w22 * rainfall + w23 * humidity + b2
```

Visually, it means that the yield of apples is a linear or planar function of temperature, rainfall and humidity:

![linear-regression-graph](https://i.imgur.com/4DJ9f8X.png)
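To make the weighted sum concrete, the sketch below evaluates the apple equation for a single region. The weights and bias here are hypothetical values picked only for illustration; they are not learned from the data.

```python
# One observation: average temperature, rainfall and humidity
temp, rainfall, humidity = 73.0, 67.0, 43.0

# Hypothetical weights and bias (assumptions for this example only)
w11, w12, w13 = 0.25, 0.5, 0.125
b1 = 1.0

# Weighted sum of the inputs, offset by the bias
yield_apple = w11 * temp + w12 * rainfall + w13 * humidity + b1
print(yield_apple)   # 18.25 + 33.5 + 5.375 + 1.0 = 58.125
```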

The *learning* part of linear regression is to figure out a set of weights `w11, w12,... w23, b1 & b2` by looking at the training data, to make accurate predictions for new data (i.e. to predict the yields for apples and oranges in a new region using the average temperature, rainfall and humidity). This is done by adjusting the weights slightly many times to make better predictions, using an optimization technique called *gradient descent*.

In [33]:
import numpy as np
import torch

## Training data

The training data can be represented using 2 matrices: `inputs` and `targets`, each with one row per observation, and one column per variable.

In [34]:
# Input Data (temp, rainfall, humidity)
inputs = np.array([
                   [73,67,43],
                   [91,88,64],
                   [87,134,58],
                   [102,43,37],
                   [69,96,70]
                  ], dtype='float32')

inputs.dtype

dtype('float32')

In [35]:
# Targets (Apple, Orange)
targets = np.array([
    [56,70],
    [81,101],
    [119,113],
    [22,37],
    [103,119]
],dtype='float32')
# targets.dtype

We've separated the input and target variables, because we'll operate on them separately. Also, we've created numpy arrays, because this is typically how you would work with training data: read some CSV files as numpy arrays, do some processing, and then convert them to PyTorch tensors as follows:

In [36]:
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
print(inputs)
print(targets)

tensor([[ 73.,  67.,  43.],
        [ 91.,  88.,  64.],
        [ 87., 134.,  58.],
        [102.,  43.,  37.],
        [ 69.,  96.,  70.]])
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 113.],
        [ 22.,  37.],
        [103., 119.]])
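For the CSV workflow mentioned above, a minimal sketch might look like the following. The CSV contents and the column order (three features followed by two targets) are assumptions for this example, and `io.StringIO` stands in for a real file on disk.

```python
import io
import numpy as np
import torch

# Hypothetical CSV contents: temp, rainfall, humidity, apples, oranges
csv_text = """73,67,43,56,70
91,88,64,81,101
87,134,58,119,113
"""

# np.loadtxt accepts a file-like object, so StringIO can replace a file path
data = np.loadtxt(io.StringIO(csv_text), delimiter=',', dtype='float32')

# First three columns are the features, the last two are the targets
features = torch.from_numpy(data[:, :3])
labels = torch.from_numpy(data[:, 3:])
print(features.shape, labels.shape)
```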


## Linear regression model from scratch

The weights and biases (`w11, w12,... w23, b1 & b2`) can also be represented as matrices, initialized as random values. The first row of `w` and the first element of `b` are used to predict the first target variable i.e. yield of apples, and similarly the second for oranges.

In [106]:
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)

print(w)
print(b)


tensor([[ 0.8659, -1.3779,  0.2994],
        [ 1.6028,  0.8040, -1.5446]], requires_grad=True)
tensor([-3.2363,  0.2313], requires_grad=True)


```torch.randn``` creates a tensor with the given shape, with elements picked randomly from a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution) with mean 0 and standard deviation 1.
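One way to sanity-check this is to draw a large sample and look at its mean and standard deviation. The sketch below fixes a seed with `torch.manual_seed` so the result is repeatable; the seed value and the sample size are arbitrary choices for this example.

```python
import torch

torch.manual_seed(0)                 # arbitrary seed, for repeatability
sample = torch.randn(100_000)

# For a large sample these should be close to 0 and 1 respectively
print(sample.mean(), sample.std())

torch.manual_seed(0)
repeat = torch.randn(100_000)        # same seed, so the same values
print(torch.equal(sample, repeat))
```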

_Our model_ is simply a function that performs a matrix multiplication of the ```inputs``` and the weights `w` (transposed), and adds the bias `b` (replicated for each observation).


![matrix-mult](https://i.imgur.com/WGXLFvA.png)

We can define the model as follows:

In [82]:
def model(x):
    return x @ w.t() + b

In [107]:
# Generate Prediction

preds = model(inputs)
print(preds)

tensor([[-19.4703, 104.6852],
        [-26.5326, 117.9829],
        [-95.1784, 157.8225],
        [ 36.9164, 141.1382],
        [-54.8105,  79.8857]], grad_fn=<AddBackward0>)


`@` represents matrix multiplication in PyTorch, and the `.t` method returns the transpose of a tensor.

The matrix obtained by passing the input data into the model is a set of predictions for the target variables.
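To see how the shapes line up, the sketch below recomputes one prediction by hand and compares it with the matrix form. The names `sample_inputs`, `w_demo` and `b_demo` are made up for this example, and the weights are random, so the actual numbers carry no meaning.

```python
import torch

torch.manual_seed(0)
sample_inputs = torch.tensor([[73., 67., 43.],
                              [91., 88., 64.]])   # 2 observations, 3 features
w_demo = torch.randn(2, 3)                         # 2 targets, 3 features
b_demo = torch.randn(2)                            # one bias per target

# (2 x 3) @ (3 x 2) gives (2 x 2); b_demo is broadcast across the rows
demo_preds = sample_inputs @ w_demo.t() + b_demo

# Observation 0, target 0, written out as a weighted sum plus bias
manual = (sample_inputs[0] * w_demo[0]).sum() + b_demo[0]

print(demo_preds.shape)
print(torch.allclose(demo_preds[0, 0], manual, rtol=1e-4, atol=1e-4))
```

Each row of the result corresponds to one observation and each column to one target variable.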

Let's compare the predictions of our model with the actual targets.


In [45]:
# compare with target
print(targets)

tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 113.],
        [ 22.,  37.],
        [103., 119.]])


You can see that there's a huge difference between the predictions of our model, and the actual values of the target variables. Obviously, this is because we've initialized our model with random weights and biases, and we can't expect it to *just work*.

## Loss function

Before we improve our model, we need a way to evaluate how well it is performing. We can compare the model's predictions with the actual targets using the following method:


   * Calculate the difference between the two matrices (`preds` and `targets`).
   * Square all elements of the difference matrix to remove negative values.
   * Calculate the average of the elements in the resulting matrix.

The result is a single number, known as the mean squared error (MSE).

In [119]:
diff = targets - preds
diffSquare = diff * diff

torch.sum(diffSquare) / diff.numel()

tensor(10413.3311, grad_fn=<DivBackward0>)

In [120]:
# Mean squared error (MSE) loss
def mse(t1, t2):
    diff = t1 - t2
    return torch.sum(diff *  diff) / diff.numel()

`torch.sum` returns the sum of all the elements in a tensor, and the `.numel` method returns the number of elements in a tensor. Let's compute the mean squared error for the current predictions of our model.

In [121]:
# compute loss
loss = mse(preds, targets)
print(loss)

tensor(10413.3311, grad_fn=<DivBackward0>)
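As a cross-check, the hand-written loss can be compared with PyTorch's built-in `torch.nn.functional.mse_loss` on a tiny example that is easy to verify by hand. The tensors `p` and `t` below are made-up values, and `mse` is repeated here only so the snippet runs on its own.

```python
import torch
import torch.nn.functional as F

def mse(t1, t2):
    # Mean of the squared element-wise differences
    diff = t1 - t2
    return torch.sum(diff * diff) / diff.numel()

p = torch.tensor([[1., 2.], [3., 4.]])
t = torch.tensor([[1., 0.], [0., 4.]])

ours = mse(p, t)              # (0 + 4 + 9 + 0) / 4 = 3.25
builtin = F.mse_loss(p, t)    # default reduction is 'mean'
print(ours, builtin)
```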
