# Homework 3: Introduction to PyTorch

PyTorch is a framework for creating and training neural networks. It's one of the most common neural network libraries, alongside TensorFlow, and is used extensively in both academia and industry. In this homework, we'll explore the basic operations within PyTorch, and we'll design a neural network to classify images.

Let's start by importing the libraries that we'll need:

In [1]:
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn

import numpy as np

%matplotlib inline
import matplotlib.pyplot as plt

If you can't import torch, go to www.pytorch.org and follow the installation instructions there. You can set the CUDA version to None, since we won't be using any GPUs in this homework.

## PyTorch: Tensors

In PyTorch, data is stored as multidimensional arrays, called tensors. Tensors are very similar to numpy's ndarrays, and they support many of the same operations. We can define tensors by explicitly setting the values, using a Python list:

In [None]:
A = torch.tensor([[1, 2], [4, -3]])
B = torch.tensor([[3, 1], [-2, 3]])

print("A:")
print(A)

print('\n')

print("B:")
print(B)

Just like numpy, PyTorch supports operations like addition, multiplication, transposition, dot products, and concatenation of tensors. Look up and fill in the operations for the following:

In [None]:
print("Sum of A and B:")
### YOUR CODE HERE

print('\n')

print("Elementwise product of A and B:")
### YOUR CODE HERE

print('\n')

print("Matrix product of A and B:")
### YOUR CODE HERE

print('\n')

print("Transposition of A:")
### YOUR CODE HERE

print('\n')

print("Concatenation of A and B in the 0th dimension:")
### YOUR CODE HERE

print('\n')

print("Concatenation of A and B in the 1st dimension:")
### YOUR CODE HERE
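
As a rough illustration of the kinds of calls involved, here is a minimal sketch on two made-up tensors, `X` and `Y` (hypothetical names, deliberately not the `A` and `B` from the exercise). It is one possible way to write these operations, not the only one; the docs list several equivalent spellings.

```python
import torch

# Hypothetical example tensors (not the A and B from the exercise).
X = torch.tensor([[1, 0], [2, 1]])
Y = torch.tensor([[0, 3], [1, 1]])

total = X + Y                            # elementwise sum
hadamard = X * Y                         # elementwise product
matmul = X @ Y                           # matrix product, same as torch.matmul(X, Y)
transposed = X.t()                       # transpose of a 2-D tensor
stacked_rows = torch.cat([X, Y], dim=0)  # concatenate along dimension 0 -> shape (4, 2)
stacked_cols = torch.cat([X, Y], dim=1)  # concatenate along dimension 1 -> shape (2, 4)

print(matmul)
print(stacked_rows.shape, stacked_cols.shape)
```

Note that `*` and `@` give different results: the first multiplies matching entries, the second performs the usual row-by-column matrix product.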


PyTorch also has tools for creating large tensors automatically, without explicitly specifying the values. Find the corresponding tensor initializers and fill them in below. Your print statements should look like the following:

3x4x5 Tensor of Zeros:
```
tensor([[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]])
```


5x5 Tensor with random elements sampled from a standard normal distribution: (these should be randomly generated values)
```
tensor([[ 0.2850,  0.5033, -1.8570, -1.6525,  0.3613],
        [-0.7505,  0.4573, -0.2454,  0.1668,  0.7241],
        [ 0.2976,  0.9827, -0.4879, -1.1144, -1.8235],
        [-0.0264,  0.7341, -0.2235,  0.5306,  0.8385],
        [ 0.2740,  0.3522, -0.5244, -0.1132,  0.5135]])
```

Tensor created from a range:
```
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```


In [None]:
print("3x4x5 Tensor of Zeros:")
### YOUR CODE HERE

print('\n')

print("5x5 Tensor with random elements sampled from a standard normal distribution:")
### YOUR CODE HERE

print('\n')

print("Tensor created from a range:")
### YOUR CODE HERE
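
For orientation, the sketch below shows a few common initializers on small, made-up shapes (the shapes and variable names are assumptions chosen for illustration, not the ones requested above). The fixed seed is there only so that reruns produce the same random sample.

```python
import torch

zeros = torch.zeros(2, 3)     # 2x3 tensor filled with 0.
ones = torch.ones(2, 3)       # 2x3 tensor filled with 1.

torch.manual_seed(0)          # fixed seed, only to make reruns repeatable
noise = torch.randn(4, 4)     # samples from a standard normal distribution

steps = torch.arange(5)       # integers 0, 1, 2, 3, 4

print(zeros)
print(noise.shape)
print(steps)
```

`torch.zeros`, `torch.ones`, and `torch.randn` take the desired shape as arguments, while `torch.arange` takes an end value (exclusive) and optionally a start and a step.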


Now, use PyTorch tensors to complete the following computation:

Create a tensor of integers from the range 0 to 99, inclusive. Add 0.5 to each element in the tensor, and square each element of the result. Then, negate each element of the tensor, and apply the exponential to each element (i.e., change each element x into e^x). Now, sum all the elements of the tensor. Multiply this sum by 2, square it, and print your result.

If you're right, you should get something very close to $$\pi \approx 3.14 .$$
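
Why should this come out near $\pi$? A short sketch of the reasoning, assuming we read the sum as a midpoint-rule approximation (with step width 1) of the Gaussian integral:

$$\sum_{k=0}^{99} e^{-(k + 0.5)^2} \approx \int_0^{\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}, \qquad \left(2 \cdot \frac{\sqrt{\pi}}{2}\right)^2 = \pi .$$

Only the first few terms contribute noticeably ($e^{-0.25} \approx 0.779$, $e^{-2.25} \approx 0.105$, $e^{-6.25} \approx 0.002$), so the sum is about $0.886$, and the final value lands near $3.14$ rather than exactly on $\pi$.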

In [None]:
val = torch.arange(100).float()

### YOUR CODE HERE

print(val)

Now we'll try writing a computation that appears throughout deep learning algorithms: the softmax function.
$$\mathrm{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j = 0}^{n - 1} e^{x_j}}$$
Calculate the softmax function for the `val1` tensor below, where $n$ is the number of elements in `val1` and $x_i$ is its $i$-th element, and store the result in `result1`. DO NOT use the built-in softmax function. We should end up with a tensor that represents a probability distribution, so its elements sum to 1. (Hint: calculate the sum of the exponentials first.)

In [None]:
val1 = torch.arange(10).float()

### YOUR CODE HERE

print(result1)
print(torch.sum(result1))

To complete the exercises above, you'll need the PyTorch documentation at https://pytorch.org/docs/stable/torch.html. Luckily, PyTorch's docs are very well written.
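
To make the softmax formula concrete, here is a minimal sketch on a small made-up tensor called `scores` (a hypothetical name and input, not `val1`). The built-in `torch.softmax` appears only as a cross-check. The last two lines show a common variant that subtracts the maximum first, which leaves the result unchanged mathematically but avoids overflow for large inputs.

```python
import torch

scores = torch.tensor([2.0, 1.0, 0.1])   # hypothetical input, not val1

exps = torch.exp(scores)                 # e^{x_i} for every element
probs = exps / exps.sum()                # divide by the sum of the exponentials

# Cross-check against the built-in (used here only for verification).
reference = torch.softmax(scores, dim=0)

# Numerically safer variant: shifting by the max does not change the ratio.
shifted = torch.exp(scores - scores.max())
stable_probs = shifted / shifted.sum()

print(probs, probs.sum())
```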

## PyTorch: Autograd

Autograd is PyTorch's automatic differentiation tool: It allows us to compute gradients by keeping track of all the operations that have happened to a tensor. In the context of neural networks, we'll interpret these gradient calculations as backpropagating a loss through a network.

To understand how autograd works, we first need to understand the idea of a __computation graph__. A computation graph is a directed, acyclic graph (DAG) that contains a blueprint of a sequence of operations. For a neural network, these computations consist of matrix multiplications, bias additions, ReLUs, softmaxes, etc. Nodes in this graph consist of the operations themselves, while the edges represent tensors that flow forward along this graph.

In PyTorch, the creation of this graph is __dynamic__. This means that tensors themselves keep track of their own computational history, and this history is built as the tensors flow through the network; this is unlike graph-mode TensorFlow (the default in TensorFlow 1.x), where an external controller keeps track of the entire computation graph. This dynamic creation of the computation graph allows for lots of cool control flows that are not possible (or at least very difficult) with a static graph.

![alt text](https://raw.githubusercontent.com/pytorch/pytorch/master/docs/source/_static/img/dynamic_graph.gif)
<center>_Dynamic computation graphs are cool!_</center>
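
As a small sketch of what "dynamic" buys us, the hypothetical function `repeated_square` below uses an ordinary Python `while` loop whose length depends on the data. Autograd records whichever operations actually ran, so the gradient is still correct. The function name, the `limit` parameter, and the starting value are assumptions made up for this illustration.

```python
import torch

def repeated_square(x, limit=100.0):
    """Square x until its value exceeds `limit`; the loop length depends on the data."""
    steps = 0
    while x.item() <= limit:
        x = x * x
        steps += 1
    return x, steps

w = torch.tensor(2.0, requires_grad=True)
y, steps = repeated_square(w)   # 2 -> 4 -> 16 -> 256, so three squarings
y.backward()                    # here y = w**8, so dy/dw = 8 * w**7

print(steps, y.item(), w.grad.item())
```

With a different starting value the loop would run a different number of times, and the recorded graph would change accordingly.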

Let's take a look at a simple computation to see what autograd is doing. First, let's create two tensors and add them together. To signal to PyTorch that we want to build a computation graph, we must set the flag requires_grad to be True when creating a tensor.

In [None]:
a = torch.tensor([1, 2], dtype=torch.float, requires_grad=True)
b = torch.tensor([8, 3], dtype=torch.float, requires_grad=True)

c = a + b

Now, since a and b are both part of our computation graph, c will automatically be added:

In [None]:
c.requires_grad

When we add a tensor to our computation graph in this way, our tensor now has a grad_fn attribute. This attribute tells autograd how this tensor was generated, and what tensor(s) this particular node was created from.

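In the case of c, its grad_fn is an add-backward node, PyTorch's notation for a tensor that was created by adding two tensors together. Current releases print it as AddBackward0; the numeric suffix can differ between versions, and some older releases printed AddBackward1: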

In [None]:
c.grad_fn

Every grad_fn has an attribute called next_functions: This attribute lets the grad_fn pass on its gradient to the tensors that were used to compute it.

In [None]:
c.grad_fn.next_functions

If we extract the tensor values corresponding to each of these functions, we can see a and b! 

In [None]:
print(c.grad_fn.next_functions[0][0].variable)
print(c.grad_fn.next_functions[1][0].variable)

In this way, autograd allows a tensor to record its entire computational history, implicitly creating a computational graph, all dynamically and on the fly!
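
The recorded history is what makes gradient computation possible. As a minimal sketch, using two made-up tensors `p` and `q` (hypothetical names), calling `.backward()` on a scalar result walks the graph in reverse and fills in the `.grad` attribute of each leaf tensor:

```python
import torch

p = torch.tensor([1.0, 2.0], requires_grad=True)
q = torch.tensor([8.0, 3.0], requires_grad=True)

s = (p * q).sum()   # s = p0*q0 + p1*q1, a scalar
s.backward()        # walks the recorded graph from s back to p and q

# Since s is linear in each input, ds/dp = q and ds/dq = p.
print(p.grad)
print(q.grad)
```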

## PyTorch: Modules and Parameters

In PyTorch, collections of operations are encapsulated as __modules__. One way to visualize a module is to take a section of a computational graph and collapse it into a single node. Not only are modules useful for encapsulation, they can also keep track of the tensors contained inside them: to register a tensor with a module, wrap it in the class torch.nn.Parameter and assign it as an attribute of the module.

To define a module, we must subclass the type torch.nn.Module. In addition, we must define a _forward_ method that tells PyTorch how to traverse through a module.

For example, let's define a logistic regression module. This module will contain two parameters: The weight vector and the bias. Calling the _forward_ method will output a probability between zero and one.

In [None]:
class LogisticRegression(nn.Module):
    
    def __init__(self):
        
        super().__init__()
        self.weight = nn.Parameter(torch.randn(10))
        self.bias = nn.Parameter(torch.randn(1))
        self.sigmoid = nn.Sigmoid()
        
    def forward(self, vector):
        return self.sigmoid(torch.dot(vector, self.weight) + self.bias)
        

Note that we have fixed the dimension of our weight to be 10, so our module will only accept 10-dimensional data.
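
To see the bookkeeping that `nn.Parameter` provides, here is a small sketch with a made-up module called `Affine` (hypothetical, not part of the homework). Tensors wrapped in `nn.Parameter` show up in `named_parameters()`, while a plain tensor attribute does not.

```python
import torch
import torch.nn as nn

class Affine(nn.Module):  # hypothetical toy module
    def __init__(self, dim):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))
        self.shift = nn.Parameter(torch.zeros(dim))
        self.offset = torch.zeros(dim)  # plain tensor: NOT tracked as a parameter

    def forward(self, x):
        return x * self.scale + self.shift + self.offset

layer = Affine(3)
names = [name for name, _ in layer.named_parameters()]

print(names)
print(all(p.requires_grad for p in layer.parameters()))
```

This tracking is what later lets an optimizer receive every trainable tensor through a single call to `parameters()`.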

We can now create a random vector and pass it through the module:

In [None]:
module = LogisticRegression()
vector = torch.randn(10)
output = module(vector)

In [None]:
output

Now, say that our loss function is mean-squared-error and our target value is 1. We can then write our loss as:

In [None]:
loss = (output - 1) ** 2

In [None]:
loss

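To minimize this loss, we first need its gradient with respect to every parameter. We just call loss.backward(), and all the gradients will be computed for us! (Computing gradients does not change the parameters by itself; an optimizer would then use the gradients to update them.) Note that wrapping a tensor as a Parameter will automatically set requires_grad = True.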

In [None]:
loss.backward()

In [None]:
print(module.weight.grad)
print(module.bias.grad)
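
Once the gradients exist, an optimizer can use them to update the parameters. The sketch below shows a single gradient-descent step. It uses `nn.Linear(10, 1)` followed by a sigmoid as a stand-in for the logistic regression module, and the learning rate of 0.1, the helper `loss_for`, and the seed are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)  # stand-in for the logistic regression module
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(10)
target = torch.tensor([1.0])

def loss_for(m):
    # Squared error between the predicted probability and the target.
    return ((torch.sigmoid(m(x)) - target) ** 2).sum()

loss_before = loss_for(model)
optimizer.zero_grad()     # clear any stale gradients
loss_before.backward()    # fill .grad on every parameter
optimizer.step()          # parameter <- parameter - lr * grad
loss_after = loss_for(model)

print(loss_before.item(), loss_after.item())
```

Repeating these three calls (`zero_grad`, `backward`, `step`) over many batches is the core of a training loop.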

## Fully-connected Networks for Image Classification

Using this knowledge, you will create a neural network in PyTorch for image classification on the CIFAR-10 dataset. PyTorch uses the `DataLoader` class to load data in batches to feed to your learning algorithms. We highly suggest you familiarize yourself with it, as well as with the Dataset API, here: https://pytorch.org/docs/stable/data.html. Fill in the code below to instantiate three DataLoaders for your training, validation, and test sets. We would prefer that you NOT use the `torchvision.transforms` API, since we want you to get some practice in data preprocessing! Here are the transformations we want you to perform:
1. Split the `val_and_test_set` into two separate datasets (each with 5000 elements).
2. Convert all the `np.array` elements into `torch.tensor` elements.
3. Pixel values in our images are integers in the range [0, 255]. Normalize them so that each pixel is in the range [0, 1].
4. Flatten all images. All your images will be of shape (32, 32, 3); we need them as flat tensors of size 32 * 32 * 3 as input to our neural network.
5. Load everything into a DataLoader. (Check how this works in the PyTorch docs!)

Be sure to set `shuffle=True` (so that consecutive samples drawn from the dataset are not correlated) and `batch_size=32` or larger; 32 is a standard minibatch size. If you're curious about what batch size does (and are somewhat familiar with statistics), here's a great answer: https://stats.stackexchange.com/questions/316464/how-does-batch-size-affect-convergence-of-sgd-and-why.
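
The steps above can be sketched on synthetic data. Everything in the block below is an assumption made for illustration: the random arrays stand in for real images, the helper `make_loader` is a hypothetical name, and only 100 fake images are used instead of 10000. It is one possible arrangement, not the required solution.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 100 fake RGB images with integer pixels in [0, 255].
images = rng.integers(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)
labels = rng.integers(0, 10, size=100)

def make_loader(images, labels, batch_size=32, shuffle=True):
    x = torch.from_numpy(images).float() / 255.0   # convert and scale to [0, 1]
    x = x.reshape(x.shape[0], -1)                  # flatten to (N, 32 * 32 * 3)
    y = torch.from_numpy(labels).long()            # class indices as int64
    return DataLoader(TensorDataset(x, y), batch_size=batch_size, shuffle=shuffle)

# Split the held-out data in half: first half validation, second half test.
half = len(images) // 2
val_loader = make_loader(images[:half], labels[:half])
test_loader = make_loader(images[half:], labels[half:], shuffle=False)

batch_x, batch_y = next(iter(val_loader))
print(batch_x.shape, batch_y.shape)
```

Wrapping the tensors in a `TensorDataset` is a convenience for data that already fits in memory; a custom `Dataset` subclass would work just as well.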

In [2]:
# Convert PIL images to tensors in [0, 1], then shift and scale each channel to [-1, 1].
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./cifardata', train=True, download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(root='./cifardata', train=False, download=True, transform=transform)

# Training draws from indices 5000-24999 of the training set;
# validation draws from indices 0-4999 of the same set, so the two do not overlap.
Trsampler = torch.utils.data.SubsetRandomSampler(np.arange(5000, 25000, dtype=np.int64))
Vsampler = torch.utils.data.SubsetRandomSampler(np.arange(0, 5000, dtype=np.int64))
# Testing draws from the first 5000 images of the official test set.
Tsampler = torch.utils.data.SubsetRandomSampler(np.arange(0, 5000, dtype=np.int64))

# A sampler already randomizes the order, so shuffle stays at its default of False.
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4096, sampler=Trsampler, num_workers=2)
Valid_loader = torch.utils.data.DataLoader(trainset, batch_size=64, sampler=Vsampler, num_workers=2)
Test_loader = torch.utils.data.DataLoader(testset, batch_size=256, sampler=Tsampler, num_workers=2)



classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Files already downloaded and verified
Files already downloaded and verified


CIFAR-10 consists of 32 x 32 color images, each labeled with exactly one of 10 classes indicating the object present in the image. Use Matplotlib to display the first few images.

In [None]:
for images, x in trainloader:
    print(images.shape[0])

Now try to build and train a plain neural network that properly classifies images in the CIFAR-10 dataset. Try to achieve at least around 40% accuracy (the higher the better!).

Take a look at the PyTorch documentation for some help in how to do this.

Google is your friend -- Looking things up on the PyTorch docs and on StackOverflow will be helpful.

In [3]:
class NeuralNet(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, output_dim):
        
        super().__init__()
        self.weight_1 = nn.Parameter(torch.randn(input_dim, hidden_dim))
        self.weight_2 = nn.Parameter(torch.randn(hidden_dim, output_dim))
        self.bias_1 = nn.Parameter(torch.full((1, hidden_dim), 1.41414141))
        self.bias_2 = nn.Parameter(torch.full((1, output_dim), 1.41414141))
        self.sigmoid = nn.Sigmoid()
        ### YOUR CODE HERE
        
    def forward(self, data):
        
        ### YOUR CODE HERE
        data = data.view(data.size()[0], -1)
        
        hidden = self.sigmoid(torch.mm(data, self.weight_1) + self.bias_1)
        return self.sigmoid(torch.mm(hidden, self.weight_2) + self.bias_2)
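
For comparison, here is a sketch of the same two-layer idea written with `nn.Linear`, which creates and initializes the weight and bias parameters itself. The class name `MLP`, the ReLU activation, and the hidden size of 64 are assumptions chosen for illustration. The network returns raw scores (logits) without a final sigmoid, because `nn.CrossEntropyLoss` already applies a log-softmax internally.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):  # hypothetical alternative, not the graded solution
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, data):
        data = data.view(data.size(0), -1)   # flatten each image
        hidden = torch.relu(self.fc1(data))
        return self.fc2(hidden)              # raw logits, no final sigmoid

model = MLP(3 * 32 * 32, 64, 10)
batch = torch.randn(8, 3, 32, 32)            # fake batch of 8 images
targets = torch.randint(0, 10, (8,))         # fake class labels
logits = model(batch)
loss = nn.CrossEntropyLoss()(logits, targets)

print(logits.shape, loss.item())
```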
        

The following NN runs, but I only got about 11% accuracy with it :'(

In [None]:
class NeuralNet2(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, output_dim):
        
        super().__init__()
        self.conv1 = torch.nn.Conv2d(3, 24, kernel_size=3, stride=1, padding=1)
        self.pool = torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        self.fc1 = torch.nn.Linear(input_dim, hidden_dim)
        self.fc2 = torch.nn.Linear(hidden_dim, output_dim)
        
    def forward(self, data):
        data = nn.functional.relu(self.conv1(data))
        data = self.pool(data)
        data = data.view(-1, 24*16*16)
        data = torch.nn.functional.relu(self.fc1(data))
        data = self.fc2(data)
        return(data)
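
The constant `24*16*16` in the `view` call comes from the shapes produced by the convolution and pooling layers. A quick sketch that traces those shapes on a fake batch (the batch size of 4 is an arbitrary assumption):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 24, kernel_size=3, stride=1, padding=1)
pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)

x = torch.randn(4, 3, 32, 32)              # fake batch of 4 CIFAR-sized images
after_conv = conv(x)                       # padding=1 keeps 32 x 32: (4, 24, 32, 32)
after_pool = pool(after_conv)              # halves height and width: (4, 24, 16, 16)
flat = after_pool.view(-1, 24 * 16 * 16)   # (4, 6144), the input size of fc1

print(after_conv.shape, after_pool.shape, flat.shape)
```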

In [None]:
def reshape(images):
    '''
    Flattens a batch of images of shape (batch_size, ...) into the shape
    (batch_size, values_per_image) that the fully-connected model can accept.
    '''
    x = images.reshape(images.shape[0], -1).float()  # cast to float32
    return x

The following cell was for initial testing purposes; my first run of it only reached about 11% accuracy, which is roughly chance level for 10 classes :'(

In [None]:
from tqdm import tqdm
EPOCHS = 2
LEARNING_RATE = 0.01
INPUT_SIZE = 24*16*16  # 24 feature maps of size 16 x 16 after conv1 + pool
HIDDEN_SIZE = 640

OUTPUT_SIZE = 10

secondnet = NeuralNet2(INPUT_SIZE, HIDDEN_SIZE, OUTPUT_SIZE)
### Define an optimizer and a loss function here. We pass our network parameters to our optimizer here so we know
### which values to update by how much.
# The loop below trains secondnet, so the optimizer must receive secondnet's parameters;
# an optimizer built from another network's parameters would never update secondnet.
optimizer = torch.optim.SGD(secondnet.parameters(), lr=LEARNING_RATE)
loss_fn = nn.CrossEntropyLoss()

for epoch in tqdm(range(EPOCHS)):

    total_loss = 0

    for images, labels in trainloader:
        ### <YOUR CODE HERE> ####
        # NeuralNet2 starts with a convolution, so the images are passed in unflattened.
        outputs = secondnet(images)
        loss = loss_fn(outputs, labels)

        # Zero gradients, call .backward(), and step the optimizer.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        ### </YOUR CODE HERE> ###

        total_loss += loss.item()

    average_loss = total_loss / len(trainloader)

    # Validation needs no gradients, so skip building the computation graph.
    val_loss = 0
    with torch.no_grad():
        for inputs, labels in Valid_loader:
            val_outputs = secondnet(inputs)
            val_loss += loss_fn(val_outputs, labels).item()
    true_val_loss = val_loss/len(Valid_loader)

    print("(epoch, train_loss, val_loss) = ({0}, {1}, {2})".format(epoch, average_loss, true_val_loss))

In [7]:
from tqdm import tqdm
from torch.autograd import Variable
EPOCHS = 64
LEARNING_RATE = 0.01
INPUT_SIZE = 3*32*32
HIDDEN_SIZE = 640

OUTPUT_SIZE = 10

net = NeuralNet(INPUT_SIZE, HIDDEN_SIZE, OUTPUT_SIZE)
### Define an optimizer and a loss function here. We pass our network parameters to our optimizer here so we know
### which values to update by how much.
optimizer = torch.optim.Adam(net.parameters(), lr=LEARNING_RATE)
loss_fn = nn.CrossEntropyLoss()

for epoch in tqdm(range(EPOCHS)):
    
    total_loss = 0
    
    for images, labels in trainloader:
        #data = reshape(images)
        data = torch.as_tensor(images)
        ### <YOUR CODE HERE> ####
        outputs = net(data)
        loss = loss_fn(outputs, Variable(labels))
        
        # Zero gradients, call .backward(), and step the optimizer.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        ### </YOUR CODE HERE> ###
        
        total_loss += loss.item()
        
    average_loss = total_loss / len(trainloader)
    
    # Validation needs no gradients, so skip building the computation graph.
    val_loss = 0
    with torch.no_grad():
        for inputs, labels in Valid_loader:
            val_outputs = net(inputs)
            val_loss_size = loss_fn(val_outputs, labels)
            val_loss += val_loss_size.item()

    true_val_loss = val_loss/len(Valid_loader)
    print("(epoch, train_loss, val_loss) = ({0}, {1}, {2})".format(epoch, average_loss, true_val_loss))

(epoch, train_loss, val_loss) = (0, 2.276385450363159, 2.182380679287488)
(epoch, train_loss, val_loss) = (1, 2.1397921562194826, 2.1035599165324923)
(epoch, train_loss, val_loss) = (2, 2.083799457550049, 2.0726055646244483)
(epoch, train_loss, val_loss) = (3, 2.051314687728882, 2.0557260120971295)
(epoch, train_loss, val_loss) = (4, 2.0259772300720216, 2.0429058829440345)
(epoch, train_loss, val_loss) = (5, 2.0023616790771483, 2.0358708946010733)
(epoch, train_loss, val_loss) = (6, 1.9835288047790527, 2.0238905239708815)
(epoch, train_loss, val_loss) = (7, 1.9672231674194336, 2.0214270458945744)
(epoch, train_loss, val_loss) = (8, 1.9503082513809205, 2.016754176043257)
(epoch, train_loss, val_loss) = (9, 1.934702754020691, 2.014897545681724)
(epoch, train_loss, val_loss) = (10, 1.9218452215194701, 2.008464597448518)
(epoch, train_loss, val_loss) = (11, 1.910205340385437, 2.005813687662535)
(epoch, train_loss, val_loss) = (12, 1.898662304878235, 2.006450924692275)
(epoch, train_loss, val_loss) = (13, 1.8912533044815063, 2.0105047723915)
(epoch, train_loss, val_loss) = (14, 1.880330538749695, 2.008422359635558)
(epoch, train_loss, val_loss) = (15, 1.8726033687591552, 2.0110658256313467)
(epoch, train_loss, val_loss) = (16, 1.8626423120498656, 2.005887452560135)
(epoch, train_loss, val_loss) = (17, 1.8554442882537843, 2.0118831350833557)
(epoch, train_loss, val_loss) = (18, 1.847984004020691, 2.0131433221358286)
(epoch, train_loss, val_loss) = (19, 1.84121413230896, 2.0081227625472637)
(epoch, train_loss, val_loss) = (20, 1.835422968864441, 2.0099584814868394)
(epoch, train_loss, val_loss) = (21, 1.8270803451538087, 2.014195611205282)
(epoch, train_loss, val_loss) = (22, 1.8206854820251466, 2.0128201668775536)
(epoch, train_loss, val_loss) = (23, 1.8135206937789916, 2.0134127879444557)
(epoch, train_loss, val_loss) = (24, 1.8083905220031737, 2.0111940812460984)
(epoch, train_loss, val_loss) = (25, 1.8035419702529907, 2.0114160459252854)
(epoch, train_loss, val_loss) = (26, 1.7976524829864502, 2.018115884141077)
(epoch, train_loss, val_loss) = (27, 1.791428279876709, 2.0150814961783494)
(epoch, train_loss, val_loss) = (28, 1.7888504981994628, 2.01721406133869)
(epoch, train_loss, val_loss) = (29, 1.7840125799179076, 2.0151581613323355)
(epoch, train_loss, val_loss) = (30, 1.7795496940612794, 2.015768505349944)
(epoch, train_loss, val_loss) = (31, 1.7757586002349854, 2.0176706314086914)
(epoch, train_loss, val_loss) = (32, 1.7709418296813966, 2.0242208861097506)
(epoch, train_loss, val_loss) = (33, 1.7669003486633301, 2.018828206424472)
(epoch, train_loss, val_loss) = (34, 1.7632784128189087, 2.0237284762949885)
(epoch, train_loss, val_loss) = (35, 1.7601085662841798, 2.0214488959010644)
(epoch, train_loss, val_loss) = (36, 1.7578677892684937, 2.028432263603693)
(epoch, train_loss, val_loss) = (37, 1.753878402709961, 2.030676885496212)
(epoch, train_loss, val_loss) = (38, 1.7499709129333496, 2.0275689921801603)
(epoch, train_loss, val_loss) = (39, 1.746769332885742, 2.0297913521151)
(epoch, train_loss, val_loss) = (40, 1.7427177906036377, 2.040441191649135)
(epoch, train_loss, val_loss) = (41, 1.7399041175842285, 2.035642329650589)
(epoch, train_loss, val_loss) = (42, 1.7358195543289185, 2.0373106953463975)
(epoch, train_loss, val_loss) = (43, 1.7332677841186523, 2.038607635075533)
(epoch, train_loss, val_loss) = (44, 1.7303296327590942, 2.0413995694510545)
(epoch, train_loss, val_loss) = (45, 1.727752923965454, 2.0379853656020344)
(epoch, train_loss, val_loss) = (46, 1.7250313758850098, 2.042662754843507)
(epoch, train_loss, val_loss) = (47, 1.7217178106307984, 2.0390784725358215)
(epoch, train_loss, val_loss) = (48, 1.7195943593978882, 2.045817732810974)
(epoch, train_loss, val_loss) = (49, 1.7175128698348998, 2.042241274556027)
(epoch, train_loss, val_loss) = (50, 1.7153172254562379, 2.0452793655516226)
(epoch, train_loss, val_loss) = (51, 1.712898850440979, 2.052297255660914)
(epoch, train_loss, val_loss) = (52, 1.711170244216919, 2.0470517556878587)
(epoch, train_loss, val_loss) = (53, 1.7092689514160155, 2.0545096020155316)
(epoch, train_loss, val_loss) = (54, 1.7068809032440186, 2.0553509449657006)
(epoch, train_loss, val_loss) = (55, 1.7052963018417358, 2.0551333970661405)
(epoch, train_loss, val_loss) = (56, 1.7039803743362427, 2.0576768826834764)
(epoch, train_loss, val_loss) = (57, 1.7029662370681762, 2.0550389018239854)
(epoch, train_loss, val_loss) = (58, 1.7015573501586914, 2.0603972613057002)
(epoch, train_loss, val_loss) = (59, 1.6998630523681642, 2.059656016434295)
(epoch, train_loss, val_loss) = (60, 1.6981909275054932, 2.059574508968788)
(epoch, train_loss, val_loss) = (61, 1.6967071056365968, 2.0596172356907325)
(epoch, train_loss, val_loss) = (62, 1.695846724510193, 2.0619518455070787)
(epoch, train_loss, val_loss) = (63, 1.6948533058166504, 2.063190140301668)

100%|██████████| 64/64 [34:51<00:00, 38.03s/it]

In [8]:
### YOUR CODE HERE - Here, we test the overall accuracy of our model.
    
correct = 0
total = 0
with torch.no_grad():
    for images, labels in Test_loader:
        outputs = net(images)
        # The predicted class is the index of the largest output for each image.
        predicted = outputs.argmax(dim=1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 5000 test images: %d %%' % (
    100 * correct / total))

Accuracy of the network on the 5000 test images: 38 %
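
To make the accuracy computation concrete, here is a tiny worked sketch on hand-written scores. The `logits` and `labels` values are made up for illustration, with four samples and three classes.

```python
import torch

logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.1, 3.0],
                       [0.3, 1.5, 0.2],
                       [1.0, 0.9, 0.8]])
labels = torch.tensor([0, 2, 0, 0])

predicted = logits.argmax(dim=1)   # index of the largest score in each row
accuracy = (predicted == labels).float().mean().item()

print(predicted)   # the third sample is predicted as class 1, but its label is 0
print(accuracy)    # 3 of 4 predictions match the labels
```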


## Submission
To submit, please download this notebook as a `.py` file. To do so, click on `File -> Download as -> Python (.py)`. Put the downloaded `assignment3.py` into this folder and commit the file.

## Additional Resources
If you're interested in using PyTorch as a framework for deep learning (especially for your final projects, for which we highly recommend it!), check out the PyTorch tutorials: https://pytorch.org/tutorials/. They have tutorials for everything from image to text to reinforcement learning tasks.