# Minimizing the Loss over Training Data

When you have training data for classification, the loss measures how badly the classifier performs on a batch of that data: the lower the loss, the better the predictions match the labels.

The parameters of the neural network (the weights and biases) that minimize the loss are found by gradient descent, that is, by repeatedly moving the parameters a small step against the gradient of the loss, as you can see below.
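To illustrate the idea in isolation, here is a minimal sketch of gradient descent on a one-parameter toy loss. The loss function, its minimum at 3, the starting point and the learning rate are all assumptions made up for this illustration; they are not part of the toy problems.

```python
# Hypothetical toy loss with a known minimum at w = 3 (illustration only).
def loss(w):
    return (w - 3.0) ** 2

def loss_gradient(w):
    # Derivative of (w - 3)^2 with respect to w.
    return 2.0 * (w - 3.0)

w = 0.0               # arbitrary starting parameter
learning_rate = 0.1   # step size, picked by hand for this toy loss

for step in range(100):
    # Step against the gradient, i.e. downhill on the loss.
    w = w - learning_rate * loss_gradient(w)

print("w =", w, "loss =", loss(w))
```

Each step shrinks the distance to the minimum by a constant factor here, so w ends up very close to 3. The notebook applies the same update rule to a weight matrix, with the gradient supplied by PyTorch.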

We've provided a utility class 'Data' (in data_reader.py) to load the training data (it works for all the toy problems).

In [2]:
import torch
import torch.nn.functional as F
from data_reader import Data

data = Data("data/toy_problem_1_train.txt")

labels, features = data.get_sample()

print("Labels:\n"+str(labels))

print("Features:\n"+str(features))
    
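# Class labels must be an integer (long) tensor for F.cross_entropy.
# torch.autograd.Variable is deprecated; plain tensors are used instead.
target = torch.tensor(labels, dtype=torch.long)
#print("Labels Tensor:\n"+str(target))

# Features are converted to a float tensor.
features = torch.tensor(features, dtype=torch.float)
#print("Features Tensor:\n"+str(features))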

Labels:
[1, 1, 1, 0, 1, 0, 0, 1, 1, 1]
Features:
[[-30, -6], [-68, 93], [-88, 74], [12, -28], [-99, -35], [31, -45], [86, 49], [-63, 55], [-18, 100], [-77, 39]]


We initialize the weights randomly.

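The cell below runs 101 learning iterations (i from 0 to 100) and prints the loss every tenth iteration. You can re-run it as many times as you like; each run starts again from fresh random weights.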

Each learning iteration involves a forward pass and a backward pass.

The forward pass involves the computation of the loss from the training data and the current parameters.
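As a rough illustration of what the forward pass computes, the following dependency-free sketch multiplies the features by a weight matrix and takes the mean softmax cross-entropy, which is what torch.mm followed by F.cross_entropy (with its default mean reduction) does in the notebook. The helper names forward and cross_entropy are hypothetical, and the data is the sample printed earlier.

```python
import math

def forward(features, weights):
    """Scores for each row: features (N x 2) times weights (2 x 2)."""
    return [[sum(x[k] * weights[k][c] for k in range(len(weights)))
             for c in range(len(weights[0]))]
            for x in features]

def cross_entropy(logits, labels):
    """Mean negative log-probability of the correct class (softmax per row)."""
    total = 0.0
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[label]
    return total / len(labels)

# The sample printed earlier in this notebook.
labels = [1, 1, 1, 0, 1, 0, 0, 1, 1, 1]
features = [[-30, -6], [-68, 93], [-88, 74], [12, -28], [-99, -35],
            [31, -45], [86, 49], [-63, 55], [-18, 100], [-77, 39]]

zero_weights = [[0.0, 0.0], [0.0, 0.0]]      # both classes always score the same
identity_weights = [[1.0, 0.0], [0.0, 1.0]]  # scores are the features themselves

loss_zero = cross_entropy(forward(features, zero_weights), labels)
loss_identity = cross_entropy(forward(features, identity_weights), labels)
print("loss with zero weights:    ", loss_zero)
print("loss with identity weights:", loss_identity)
```

With all-zero weights the classifier is undecided between the two classes, so the loss is ln 2. The identity matrix separates this sample cleanly, so its loss is close to zero.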

The backward pass is performed automatically by PyTorch when you call loss.backward().

PyTorch calculates the gradient of the loss with respect to every parameter that contributed to it.

These gradients are stored in each parameter's 'grad' member variable.
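To make the gradient less of a black box, here is a sketch that computes it by hand for this particular model, a single weight matrix followed by softmax cross-entropy. For that case the gradient of the mean loss with respect to weight (k, c) is the average over samples of x[k] * (p[c] - 1 if c is the true class, else p[c]). This is a hand-derived special case under that assumption, not how PyTorch works internally; autograd obtains the same values by backpropagation. All function names are hypothetical, and the weights start at zero rather than random values so the run is reproducible.

```python
import math

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def logits_for(features, weights):
    return [[x[0] * weights[0][c] + x[1] * weights[1][c] for c in range(2)]
            for x in features]

def mean_cross_entropy(features, labels, weights):
    probs = [softmax(row) for row in logits_for(features, weights)]
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def gradient(features, labels, weights):
    """Gradient of the mean loss with respect to each entry of weights."""
    n = len(labels)
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for x, y, row in zip(features, labels, logits_for(features, weights)):
        p = softmax(row)
        for k in range(2):
            for c in range(2):
                grad[k][c] += x[k] * (p[c] - (1.0 if c == y else 0.0)) / n
    return grad

# The sample printed earlier in this notebook.
labels = [1, 1, 1, 0, 1, 0, 0, 1, 1, 1]
features = [[-30, -6], [-68, 93], [-88, 74], [12, -28], [-99, -35],
            [31, -45], [86, 49], [-63, 55], [-18, 100], [-77, 39]]

weights = [[0.0, 0.0], [0.0, 0.0]]  # zeros (an assumption) instead of random values
learning_rate = 0.01                # same value as in the notebook

loss_before = mean_cross_entropy(features, labels, weights)
grad = gradient(features, labels, weights)

# One gradient-descent step: weights - learning_rate * gradient.
weights = [[weights[k][c] - learning_rate * grad[k][c] for c in range(2)]
           for k in range(2)]
loss_after = mean_cross_entropy(features, labels, weights)

print("gradient:", grad)
print("loss before:", loss_before, "loss after:", loss_after)
```

On this small, cleanly separable sample a single step is already enough to push the loss from ln 2 to nearly zero. With random mini-batches, as in the notebook, the loss typically moves less smoothly.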

Notice that the code for the learning iterations is almost identical to that of exercise 510.

In [4]:
weights = torch.nn.Parameter(torch.rand(2, 2))

print(weights)

for i in range(101):
    
    if weights.grad is not None:
        weights.grad.data.zero_()

    # Forward pass
    
    labels, features = data.get_sample()
    
    # Plain tensors replace the deprecated torch.autograd.Variable wrapper.
    features = torch.tensor(features, dtype=torch.float)
    #print(features)
    
    target = torch.tensor(labels, dtype=torch.long)
    #print(target)
    
    result = torch.mm(features, weights)
    #print(result)
    
    loss = F.cross_entropy(result, target)
    #print("Cross entropy loss: "+str(loss))
    
    # Backward pass
    
    loss.backward()
    
    gradient = weights.grad
    
    learning_rate = 0.01
    
    weights.data = weights.data - learning_rate * gradient.data
    
    if i % 10 == 0:
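        # loss is a 0-dim tensor; .item() extracts its Python float
        # (indexing it with [0] fails on current PyTorch versions).
        print("The loss is now "+str(loss.item()))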
    
print("\tThe weights are now "+str(weights.data))

torch.save(weights, "models/toy_problem_1_trained_model.bin")

Parameter containing:
 0.0388  0.6214
 0.0345  0.1621
[torch.FloatTensor of size 2x2]

The loss is now 21.946632385253906
The loss is now 0.04520086571574211
The loss is now 0.010468387976288795
The loss is now 0.0038279653526842594
The loss is now 0.00018894045206252486
The loss is now 0.1948653757572174
The loss is now 0.015952305868268013
The loss is now 0.06670306622982025
The loss is now 0.47406989336013794
The loss is now 0.00046990858390927315
The loss is now 1.3463650248013437e-05
	The weights are now 
 0.7213 -0.0611
-0.3002  0.4968
[torch.FloatTensor of size 2x2]



## Parameters

As we know, the final parameters learnt by the algorithm should look something like this

$$\begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix}$$

or this

$$\begin{bmatrix}2 & 1 \\ 1 & 2\end{bmatrix}$$

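In other words, the weights at positions (0,0) and (1,1) should be larger than those at (1,0) and (0,1), as they are in the weights learned above (0.7213 and 0.4968 on the diagonal versus -0.3002 and -0.0611 off it).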
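A quick way to check this claim is to classify the sample printed earlier with each of these matrices and with the learned weights. The sketch below does so in plain Python; the predict helper is a hypothetical stand-in for taking the index of the larger score in each row of features times weights.

```python
def predict(features, weights):
    """For each row, return the index of the larger score in features x weights."""
    predictions = []
    for x in features:
        scores = [x[0] * weights[0][c] + x[1] * weights[1][c] for c in range(2)]
        predictions.append(scores.index(max(scores)))
    return predictions

# The sample printed earlier in this notebook.
labels = [1, 1, 1, 0, 1, 0, 0, 1, 1, 1]
features = [[-30, -6], [-68, 93], [-88, 74], [12, -28], [-99, -35],
            [31, -45], [86, 49], [-63, 55], [-18, 100], [-77, 39]]

candidates = {
    "identity": [[1, 0], [0, 1]],
    "two-one": [[2, 1], [1, 2]],
    "learned": [[0.7213, -0.0611], [-0.3002, 0.4968]],  # from the training output
}

results = {name: predict(features, w) for name, w in candidates.items()}
for name, predicted in results.items():
    print(name, "matches labels:", predicted == labels)
```

All three matrices classify this sample identically, which suggests that what matters is how the diagonal entries compare with the off-diagonal ones rather than their exact values.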

## Classifier for Toy Problem 1

We have just trained a classifier for Toy Problem 1.

You can evaluate the performance of the classifier on the test data.

In [5]:
data = Data("data/toy_problem_1_test.txt")

weights = torch.load("models/toy_problem_1_trained_model.bin")

print(weights)

labels, features = data.get_all()

# Plain tensors replace the deprecated torch.autograd.Variable wrapper.
features = torch.tensor(features, dtype=torch.float)
#print(features)

target = torch.tensor(labels, dtype=torch.long)
#print(target)

result = torch.mm(features, weights)
#print(result)

maxv, observed = torch.max(result, 1)

total = 0
correct = 0
for i in range(len(labels)):
    total += 1
    #print(str(target.data[i]) + " " + str(observed.data[i]))
    if target.data[i] == observed.data[i]:
        correct += 1
accuracy = correct / total
print("Accuracy: "+str(accuracy))

Parameter containing:
 0.7213 -0.0611
-0.3002  0.4968
[torch.FloatTensor of size 2x2]

Accuracy: 0.999
