In [1]:
from IPython.display import Audio, Image, YouTubeVideo

# LESSON 4: Introduction to PyTorch

## CHAPTER 1: Welcome!

![mat-headshot.png](attachment:mat-headshot.png)
Hi, I'm Mat!

### Welcome!

Welcome! In this lesson, you'll learn how to use PyTorch for building deep learning models. PyTorch was released in early 2017 and has been making a pretty big impact in the deep learning community. It's developed as an open source project by the [Facebook AI Research team](https://research.fb.com/category/facebook-ai-research-fair/), but is being adopted by teams everywhere in industry and academia. In my experience, it's the best framework for learning deep learning and just a delight to work with in general. By the end of this lesson, you'll have trained your own deep learning model that can classify images of cats and dogs.

I'll first give you a basic introduction to PyTorch, where we'll cover __tensors__ - the main data structure of PyTorch. I'll show you how to create tensors, how to do simple operations, and how tensors interact with NumPy.
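To give you a taste of what that looks like, here is a minimal sketch of creating a tensor, doing a simple operation, and moving data between NumPy and PyTorch. The variable names are just for illustration and are not taken from the lesson notebooks.

```python
import numpy as np
import torch

# Create a tensor directly from Python data
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# Arithmetic operators work element-wise
y = x * 2 + 1

# NumPy -> tensor: from_numpy shares memory with the source array
a = np.ones((2, 2))
t = torch.from_numpy(a)
t.mul_(3)  # in-place op on the tensor also changes `a`

# tensor -> NumPy
b = y.numpy()

print(y)
print(a)
```

Because `torch.from_numpy` shares memory with the array, in-place changes on one side show up on the other.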

Then you'll learn about a module called __autograd__ that PyTorch uses to calculate gradients for training neural networks. Autograd, in my opinion, is amazing. It does all the work of backpropagation for you by calculating the gradients at each operation in the network, which you can then use to update the network weights.
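As a small sketch of the idea (not code from the lesson notebooks), autograd can differentiate a simple expression for you. Here `y` is the sum of the squares of `x`, so the gradient of `y` with respect to `x` is `2 * x`.

```python
import torch

# requires_grad=True tells autograd to track operations on x
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = x1^2 + x2^2 + x3^2
y = (x ** 2).sum()

# Backpropagate from y; the gradient dy/dx is stored in x.grad
y.backward()
grad = x.grad

print(grad)  # equals 2 * x
```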

Next you'll use PyTorch to build a network and run data forward through it. After that, you'll define a loss and an optimization method to train the neural network on a dataset of handwritten digits. You'll also learn how to test that your network is able to generalize through __validation__.

However, you'll find that your network doesn't work too well with more complex images. You'll learn how to use pre-trained networks to improve the performance of your classifier, a technique known as __transfer learning__.
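The mechanics of transfer learning can be sketched roughly like this: freeze the feature extractor, attach a new classifier, and train only the classifier. To keep the example self-contained, `backbone` below is a hypothetical, randomly initialized stand-in; in the lesson you would use a real pre-trained network from torchvision in its place.

```python
import torch
from torch import nn, optim

# Hypothetical stand-in for a pre-trained feature extractor (assumption)
backbone = nn.Sequential(nn.Linear(3 * 32 * 32, 64), nn.ReLU())

# Freeze the feature extractor so its weights are not updated
for param in backbone.parameters():
    param.requires_grad = False

# New classifier for two classes (cat, dog); this is the only part we train
classifier = nn.Linear(64, 2)
model = nn.Sequential(backbone, classifier, nn.LogSoftmax(dim=1))

# Only hand the classifier's parameters to the optimizer
optimizer = optim.Adam(classifier.parameters(), lr=0.003)

images = torch.randn(4, 3 * 32 * 32)  # stand-in batch of flattened images
log_ps = model(images)

n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(log_ps.shape, n_trainable)
```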

Follow along with the videos and work through the exercises in your own notebooks. If you get stuck, check out my solution videos and notebooks.

### Get the notebooks

The notebooks for this lesson will be provided in the classroom, but if you wish to follow along on your local machine, then the instructions below will help you get set up and ready to learn!

All the notebooks for this lesson are available from [our deep learning repo on GitHub](https://github.com/udacity/deep-learning-v2-pytorch). Please clone the repo by typing
```
git clone https://github.com/udacity/deep-learning-v2-pytorch.git
```
in your terminal. Then navigate to the ``intro-to-pytorch`` directory in the repo.

Follow along in your notebooks to complete the exercises. I'll also be providing solutions to the exercises, both in videos and in the notebooks marked ``(Solution)``.


### Dependencies

These notebooks require PyTorch v0.4 or newer, and torchvision. The easiest way to install PyTorch and torchvision locally is by following [the instructions on the PyTorch site](https://pytorch.org/get-started/locally/). Choose the stable version, your appropriate OS and Python versions, and how you'd like to install it. You'll also need to install numpy and jupyter notebooks; the newest versions of these should work fine. Using the conda package manager is generally best for this:
```
conda install numpy jupyter notebook
```
If you haven't used conda before, please [read the documentation](https://conda.io/docs/) to learn how to create environments and install packages. I suggest installing Miniconda instead of the whole Anaconda distribution. The normal package manager pip also works well. If you have a preference, go with that.

The final part of the series has a soft requirement of a GPU used to accelerate network computations. Even if you don't have a GPU available, you'll still be able to run the code and finish the exercises. PyTorch uses a library called CUDA to accelerate operations using the GPU. If you have a GPU that [CUDA](https://developer.nvidia.com/cuda-zone) supports, you'll be able to install all the necessary libraries by installing PyTorch with conda. If you can't use a local GPU, you can use cloud platforms such as [AWS](https://docs.aws.amazon.com/dlami/latest/devguide/gpu.html), [GCP](https://cloud.google.com/gpu/), and [FloydHub](https://www.floydhub.com/) to train your networks on a GPU.

Our Nanodegree programs also provide GPU workspaces in the classroom, as well as credits for AWS.

### Feedback

If you have problems with the notebooks, please contact support or create an issue on the repo. We're also happy to incorporate your improvements through pull requests.


## CHAPTER 2: Single layer neural networks

In [2]:
id = '6Z7WntXays8'
YouTubeVideo(id=id, width=600)

Part 1 - 1
Task List
* Calculate the output of this single layer network using ``torch.sum()`` or ``.sum()``
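One possible way to approach this, as a sketch: multiply the features and weights element-wise, sum the result, add the bias, and pass it through an activation. The shapes, the seed, and the sigmoid activation below are assumptions for illustration; use whatever the exercise notebook defines.

```python
import torch

def sigmoid(x):
    """Sigmoid activation function."""
    return 1 / (1 + torch.exp(-x))

torch.manual_seed(7)                   # make the random values reproducible
features = torch.randn((1, 5))         # one example with five features
weights = torch.randn_like(features)   # same shape as features
bias = torch.randn((1, 1))

# Element-wise product, then sum over all elements, then add the bias
output = sigmoid(torch.sum(features * weights) + bias)

# The method form gives the same result
output_alt = sigmoid((features * weights).sum() + bias)
```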


## CHAPTER 3: Single layer neural networks solution

In [3]:
id = 'mNJ8CujTtpo'
YouTubeVideo(id=id, width=600)

Part 1 - 2

* Calculate the output of this single layer network using matrix multiplication.
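Here is a sketch of the matrix multiplication version. The shapes and seed are assumptions for illustration. The key step is reshaping the weights from `(1, 5)` to `(5, 1)` so that the inner dimensions line up for `torch.mm`.

```python
import torch

torch.manual_seed(7)
features = torch.randn((1, 5))
weights = torch.randn((1, 5))
bias = torch.randn((1, 1))

# (1, 5) x (5, 1) -> (1, 1); view returns a reshaped tensor without copying
output = torch.sigmoid(torch.mm(features, weights.view(5, 1)) + bias)

# Same value as the element-wise multiply-and-sum approach
expected = torch.sigmoid((features * weights).sum() + bias)
```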

## CHAPTER 4: Networks Using Matrix Multiplication

In [4]:
id = 'QLaGMz8Ca3E'
YouTubeVideo(id=id, width=600)

Part 1 - 3

* Calculate the output for the multi-layer network.
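A multi-layer network is the same idea applied twice: the output of the hidden layer becomes the input to the output layer. This is a sketch with made-up layer sizes (3 inputs, 2 hidden units, 1 output); the names `W1`, `W2`, `B1`, and `B2` are illustrative.

```python
import torch

torch.manual_seed(7)
features = torch.randn((1, 3))

n_input, n_hidden, n_output = 3, 2, 1

# Weights and biases for each layer
W1 = torch.randn(n_input, n_hidden)
W2 = torch.randn(n_hidden, n_output)
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

# Hidden layer activations, then the output layer
h = torch.sigmoid(torch.mm(features, W1) + B1)
output = torch.sigmoid(torch.mm(h, W2) + B2)
```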


## CHAPTER 5: Multilayer Networks Solution

In [5]:
id = 'iMIo9p5iSbE'
YouTubeVideo(id=id, width=600)

## CHAPTER 6: Neural Networks in PyTorch

In [6]:
id = 'CSQOdOb2mlg'
YouTubeVideo(id=id, width=600)

Part 2 - 1
Task List
* Build a multi-layer network to identify handwritten digits in an image
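As a rough sketch of what such a network can look like with `nn.Module`: the layer sizes below (784 inputs for flattened 28x28 images, 256 hidden units, 10 output classes) and the random stand-in batch are assumptions, not necessarily what the notebook uses.

```python
import torch
from torch import nn

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(784, 256)  # input -> hidden
        self.output = nn.Linear(256, 10)   # hidden -> one score per digit

    def forward(self, x):
        x = torch.sigmoid(self.hidden(x))
        # Softmax across the class dimension turns scores into probabilities
        return torch.softmax(self.output(x), dim=1)

model = Network()

images = torch.randn(64, 1, 28, 28)            # stand-in batch of images
flat = images.view(images.shape[0], -1)        # flatten to (64, 784)
probs = model(flat)
print(probs.shape)
```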


## CHAPTER 7: Neural Networks Solution

In [7]:
id = 'zym36ihtOMY'
YouTubeVideo(id=id, width=600)

Part 2 - 2
Task List
* Implement the softmax function
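One way to write softmax, as a sketch: exponentiate every score, then divide each row by the sum of its exponentials. The `.view(-1, 1)` reshapes the row sums into a column so the division broadcasts across each row.

```python
import torch

def softmax(x):
    """Row-wise softmax for a 2D tensor of shape (batch, classes)."""
    exp = torch.exp(x)
    return exp / torch.sum(exp, dim=1).view(-1, 1)

torch.manual_seed(0)
x = torch.randn(64, 10)   # stand-in scores for 64 examples, 10 classes
probs = softmax(x)

print(probs.shape)
print(probs.sum(dim=1))   # each row should sum to (approximately) 1
```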



## CHAPTER 8: Implementing Softmax Solution

In [8]:
id = '8KRX7HvqfP0'
YouTubeVideo(id=id, width=600)

Part 2 - 3
* Build a multi-layer network that utilizes the ReLU activation function
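A sketch of a network with ReLU activations on the hidden layers, using the functional API. The layer sizes (784, 128, 64, 10) are assumptions for illustration.

```python
import torch
from torch import nn
import torch.nn.functional as F

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        # ReLU on the hidden layers, softmax on the output layer
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.softmax(self.fc3(x), dim=1)

model = Network()
probs = model(torch.randn(16, 784))  # stand-in batch of flattened images
print(probs.shape)
```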


## CHAPTER 9: Network Architectures in PyTorch

In [9]:
id = '9ILiZwbi9dA'
YouTubeVideo(id=id, width=600)

Part 3 - 1
* Build a multi-layer network that utilizes log-softmax as the output activation function, and calculate the loss using the negative log likelihood loss.
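Here is a sketch of how these two pieces fit together: `nn.LogSoftmax` as the final layer and `nn.NLLLoss` as the criterion. The layer sizes and the random stand-in data are assumptions for illustration.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
    nn.LogSoftmax(dim=1),   # outputs log-probabilities
)

# NLLLoss expects log-probabilities and integer class labels
criterion = nn.NLLLoss()

images = torch.randn(64, 784)            # stand-in batch
labels = torch.randint(0, 10, (64,))     # stand-in labels

log_ps = model(images)
loss = criterion(log_ps, labels)

# Exponentiate to recover the actual probabilities
ps = torch.exp(log_ps)
print(loss)
```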


## CHAPTER 10: Network Architectures Solution

In [10]:
id = 'zBWlOeX2sQM'
YouTubeVideo(id=id, width=600)

Part 3 - 2
Task List
* Implement the training pass for our network.
* View its predictions!
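The training pass generally follows four steps: clear the gradients, run the forward pass to get the loss, backpropagate, and take an optimizer step. The sketch below repeats that on a single random stand-in batch; the model, learning rate, and data are assumptions, and in the notebook you would loop over a real data loader instead.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(784, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
    nn.LogSoftmax(dim=1),
)
criterion = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.05)

images = torch.randn(64, 784)            # stand-in batch
labels = torch.randint(0, 10, (64,))     # stand-in labels

losses = []
for step in range(20):
    optimizer.zero_grad()                      # 1. clear old gradients
    loss = criterion(model(images), labels)    # 2. forward pass and loss
    loss.backward()                            # 3. backpropagation
    optimizer.step()                           # 4. update the weights
    losses.append(loss.item())

# View predictions: turn log-probabilities into probabilities
with torch.no_grad():
    ps = torch.exp(model(images))
predicted = ps.argmax(dim=1)

print(losses[0], losses[-1])
```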


## CHAPTER 11: Training a Network Solution

In [11]:
id = 'ExyFG2MjsKs'
YouTubeVideo(id=id, width=600)

## CHAPTER 12: Classifying Fashion-MNIST

In [12]:
id = 'AEJV_RKZ7VU'
YouTubeVideo(id=id, width=600)

Part 4
Task List
* Build and train a neural network to classify clothing images.

## CHAPTER 13: Fashion-MNIST Solution

In [13]:
id = 'R6Y4hPLVQWM'
YouTubeVideo(id=id, width=600)

## CHAPTER 14: Inference and Validation

In [14]:
id = 'XACXlkIdS7Y'
YouTubeVideo(id=id, width=600)

Part 5 - 1
* Implement the validation loop and print out the total accuracy.
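A validation loop can be sketched like this: switch to evaluation mode, turn off gradient tracking, count the correct predictions over all batches, and switch back to training mode. The tiny model and the list of random batches below are stand-ins for your trained network and real validation loader.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(784, 10), nn.LogSoftmax(dim=1))

# Stand-in for a validation DataLoader: three batches of (images, labels)
val_batches = [
    (torch.randn(32, 784), torch.randint(0, 10, (32,))) for _ in range(3)
]

correct, total = 0, 0
model.eval()                      # evaluation mode (turns off dropout)
with torch.no_grad():             # no gradients needed for validation
    for images, labels in val_batches:
        log_ps = model(images)
        top_class = log_ps.argmax(dim=1)   # most likely class per example
        correct += (top_class == labels).sum().item()
        total += labels.shape[0]
model.train()                     # back to training mode

accuracy = correct / total
print(f"Accuracy: {accuracy:.3f}")
```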

## CHAPTER 15: Validation Solution

In [15]:
id = 'AjrXltxqsK4'
YouTubeVideo(id=id, width=600)

## CHAPTER 16: Dropout Solution

In [16]:
id = '3Py2SbtZLbc'
YouTubeVideo(id=id, width=600)

## CHAPTER 17: Saving and Loading Models

In [17]:
id = '3ZJfo2bR-uw'
YouTubeVideo(id=id, width=600)

## CHAPTER 18: Loading Image Data

In [18]:
id = 'hFu7GTfRWks'
YouTubeVideo(id=id, width=600)

## CHAPTER 19: Loading Image Data Solution

In [19]:
id = 'd_NhvI1yEf0'
YouTubeVideo(id=id, width=600)

## CHAPTER 20: Transfer Learning

In [20]:
id = 'S9F7MtJ5jls'
YouTubeVideo(id=id, width=600)

## CHAPTER 21: Transfer Learning Solution

In [21]:
id = '4n6T93hKRD4'
YouTubeVideo(id=id, width=600)

## CHAPTER 22: Tips, Tricks, and Other Notes

### Watch those shapes

In general, you'll want to check that the tensors going through your model and other code are the correct shapes. Make use of the ``.shape`` attribute during debugging and development.
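For example, a quick shape check before and after flattening a batch of images might look like this. The batch dimensions are assumptions for illustration.

```python
import torch

images = torch.randn(64, 1, 28, 28)   # stand-in batch: (batch, channel, H, W)
print(images.shape)

# Flatten each image into a vector, keeping the batch dimension
flat = images.view(images.shape[0], -1)
print(flat.shape)

# torch.Size compares like a tuple, so a quick sanity check is easy
assert flat.shape == (64, 784)
```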

### A few things to check if your network isn't training appropriately

Make sure you're clearing the gradients in the training loop with ``optimizer.zero_grad()``. If you're doing a validation loop, be sure to set the network to evaluation mode with ``model.eval()``, then back to training mode with ``model.train()``.
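To see why these matter, here is a small sketch. Gradients accumulate across calls to `backward()` unless they are cleared, and dropout behaves differently in training and evaluation mode. The tensors are made up for illustration.

```python
import torch
from torch import nn

# Gradients accumulate if they are not cleared
w = torch.ones(3, requires_grad=True)
(w * 2).sum().backward()
first = w.grad.clone()          # gradient from one backward pass
(w * 2).sum().backward()
accumulated = w.grad.clone()    # the second pass was added to the first
w.grad.zero_()                  # what optimizer.zero_grad() does for you

# Dropout is active in training mode and disabled in evaluation mode
drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()
train_out = drop(x)             # some elements zeroed, the rest scaled up

drop.eval()
eval_out = drop(x)              # input passes through unchanged

print(first, accumulated)
print(train_out, eval_out)
```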

### CUDA errors

Sometimes you'll see this error:
```
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #1 ‘mat1’
```
You'll notice the second type is ``torch.cuda.FloatTensor``, which means it's a tensor that has been moved to the GPU. The operation is expecting a tensor with type ``torch.FloatTensor`` (no ``.cuda`` there), which means the tensor should be on the CPU. PyTorch can only perform operations on tensors that are on the same device, so either both on the CPU or both on the GPU. If you're trying to run your network on the GPU, check to make sure you've moved the model and all necessary tensors to the GPU with ``.to(device)`` where ``device`` is either ``"cuda"`` or ``"cpu"``.
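A common pattern for avoiding this mismatch, sketched here with a small stand-in model, is to pick the device once and move both the model and every input batch to it.

```python
import torch
from torch import nn

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)      # move the model's parameters
inputs = torch.randn(8, 4).to(device)   # move the data to the same device

outputs = model(inputs)                 # both live on `device`, so this works
print(outputs.device)
```

Inside a training loop, the same ``.to(device)`` call goes on each batch of images and labels as it comes out of the data loader.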
