# Classifier for Toy Problem 3

Now let's build a classifier for Toy Problem 3.

We've provided a utility class 'Data' (in data_reader.py) to load the training data (it works for all the toy problems).

In [1]:
import torch
import torch.nn.functional as F
from data_reader import Data

data = Data("data/toy_problem_3_train.txt")

labels, features = data.get_sample()

print("Labels:\n"+str(labels))

print("Features:\n"+str(features))
    
target = torch.autograd.Variable(torch.LongTensor(labels))
#print("Labels Tensor:\n"+str(target))

features = torch.autograd.Variable(torch.Tensor(features))
#print("Features Tensor:\n"+str(features))

Labels:
[1, 1, 1, 1, 1, 0, 1, 0, 0, 1]
Features:
[[59, -85], [-97, 67], [-93, 31], [31, -73], [-29, 69], [-31, -10], [-62, 64], [11, 21], [63, 69], [-21, 79]]


We initialize the weights randomly.

In [2]:
weights = torch.nn.Parameter(torch.rand(2, 2))
print(weights)

Parameter containing:
 0.5054  0.7487
 0.0225  0.4477
[torch.FloatTensor of size 2x2]



We can now run the learning iterations below as many times as we want. Each run performs 101 iterations, so the loss is printed at iterations 0, 10, ..., 100.

Notice that the code for the learning iterations is identical to that of exercise 530.

In [6]:
for i in range(101):
    labels, features = data.get_sample(10)
    
    features = torch.autograd.Variable(torch.Tensor(features))
    #print(data)
    
    target = torch.autograd.Variable(torch.LongTensor(labels))
    #print(target)
    
    result = torch.mm(features, weights)
    #print(result)
    
    loss = F.cross_entropy(result, target)
    #print("Cross entropy loss: "+str(loss))
    
    loss.backward()
    
    gradient = weights.grad
    
    learning_rate = 0.01
    
    weights.data = weights.data - learning_rate * gradient.data
    
    if i % 10 == 0:
        # loss.data[0] works on the old PyTorch version used here; newer versions use loss.item()
        print("The loss is now "+str(loss.data[0]))
    
    weights.grad.data.zero_()

print("\tThe weights are now "+str(weights.data))

torch.save(weights, "models/toy_problem_3_trained_model.bin")

The loss is now 2.601632833480835
The loss is now 2.004544734954834
The loss is now 12.074604034423828
The loss is now 11.238866806030273
The loss is now 5.32888650894165
The loss is now 6.22322416305542
The loss is now 10.873896598815918
The loss is now 1.5366473197937012
The loss is now 2.999246597290039
The loss is now 10.520666122436523
The loss is now 12.201106071472168
	The weights are now 
 0.5275  0.7267
 0.0379  0.4323
[torch.FloatTensor of size 2x2]
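
To make the loop above less of a black box, here is a minimal pure-Python sketch of what a single iteration computes for a 2x2 weight matrix: the scores (features times weights), the mean cross-entropy loss, its gradient, and the weight update. It assumes that F.cross_entropy averages the softmax negative log-likelihood over the batch, which is its default behaviour. The helper names (scores, softmax, mean_cross_entropy, gradient, sgd_step) are made up for this illustration and are not part of the course code; the three samples and the starting weights are copied from the outputs printed earlier.

```python
import math

def scores(x, w):
    """Multiply the row vector x (length 2) by the 2x2 matrix w."""
    return [x[0] * w[0][j] + x[1] * w[1][j] for j in range(2)]

def softmax(s):
    """Turn two scores into two probabilities (shifted by the max for stability)."""
    m = max(s)
    exps = [math.exp(v - m) for v in s]
    total = sum(exps)
    return [e / total for e in exps]

def mean_cross_entropy(features, labels, w):
    """Mean over the batch of -log(softmax(scores)[label])."""
    total = 0.0
    for x, y in zip(features, labels):
        s = scores(x, w)
        m = max(s)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in s))
        total += log_sum_exp - s[y]
    return total / len(labels)

def gradient(features, labels, w):
    """Gradient of the mean cross-entropy with respect to each weight."""
    g = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(features, labels):
        p = softmax(scores(x, w))
        for i in range(2):
            for j in range(2):
                g[i][j] += x[i] * (p[j] - (1.0 if j == y else 0.0))
    n = len(labels)
    return [[g[i][j] / n for j in range(2)] for i in range(2)]

def sgd_step(w, g, learning_rate):
    """Same update rule as in the loop: weights - learning_rate * gradient."""
    return [[w[i][j] - learning_rate * g[i][j] for j in range(2)] for i in range(2)]

# Three of the samples printed by In [1] and the weights printed by In [2].
features = [[59, -85], [-31, -10], [11, 21]]
labels = [1, 0, 0]
weights = [[0.5054, 0.7487], [0.0225, 0.4477]]

grad = gradient(features, labels, weights)
new_weights = sgd_step(weights, grad, 0.01)
print("Loss before the step: " + str(mean_cross_entropy(features, labels, weights)))
print("Gradient: " + str(grad))
print("Weights after the step: " + str(new_weights))
```

In the notebook, loss.backward() computes the gradient automatically, so none of these helpers are needed there; the sketch only spells out the arithmetic.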



## The Loss

Observe the loss, which is printed every 10 iterations.

No matter how many hundreds of times you run the learning iterations, the loss does not decrease. In the run above it jumps between roughly 1.5 and 12 without any downward trend.

This tells us that the machine learning algorithm is probably not learning much of anything.
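
Each printed value is the loss on a single batch of 10 samples, so individual numbers are noisy, and comparing averages is a simple way to check for a trend. The sketch below does this with the eleven loss values printed above. Splitting them into the first five and the last five is an arbitrary choice made for this illustration.

```python
from statistics import mean

# The eleven loss values printed by the training run above.
losses = [
    2.601632833480835, 2.004544734954834, 12.074604034423828,
    11.238866806030273, 5.32888650894165, 6.22322416305542,
    10.873896598815918, 1.5366473197937012, 2.999246597290039,
    10.520666122436523, 12.201106071472168,
]

first_mean = mean(losses[:5])    # iterations 0 to 40
last_mean = mean(losses[-5:])    # iterations 60 to 100

print("Mean of the first five losses: " + str(first_mean))
print("Mean of the last five losses:  " + str(last_mean))
if last_mean < first_mean:
    print("The loss went down on average.")
else:
    print("The loss did not go down on average.")
```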

## Parameters

You can try to find the parameters manually. We know that the weights form a 2x2 matrix like this:

$$\begin{bmatrix}w_{0,0} & w_{0,1} \\ w_{1,0} & w_{1,1} \end{bmatrix}$$

But we won't be able to come up with a good set of values that works well on this problem.
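
One way to convince yourself is to let the computer do the guessing. The sketch below tries many random 2x2 matrices on the ten samples printed at the top of this notebook and keeps the one with the highest accuracy. The sampling range (-1 to 1), the number of tries, and the helper names predict and accuracy are assumptions made for this illustration. The prediction rule mirrors the test code further down: pick the class with the larger score. This is only a small experiment on ten samples, not a proof.

```python
import random

# The ten samples and labels printed by In [1].
samples = [[59, -85], [-97, 67], [-93, 31], [31, -73], [-29, 69],
           [-31, -10], [-62, 64], [11, 21], [63, 69], [-21, 79]]
labels = [1, 1, 1, 1, 1, 0, 1, 0, 0, 1]

def predict(x, w):
    """Return the class (0 or 1) with the larger score; ties go to class 0."""
    score0 = x[0] * w[0][0] + x[1] * w[1][0]
    score1 = x[0] * w[0][1] + x[1] * w[1][1]
    return 1 if score1 > score0 else 0

def accuracy(w):
    """Fraction of the ten samples that w classifies correctly."""
    correct = sum(1 for x, y in zip(samples, labels) if predict(x, w) == y)
    return correct / len(labels)

random.seed(0)  # fixed seed so that the experiment is repeatable
best_w = None
best_accuracy = -1.0
for attempt in range(5000):
    w = [[random.uniform(-1, 1) for col in range(2)] for row in range(2)]
    acc = accuracy(w)
    if acc > best_accuracy:
        best_w = w
        best_accuracy = acc

print("Best accuracy found: " + str(best_accuracy))
print("Best weights found: " + str(best_w))
```

None of the random candidates classifies all ten samples correctly, which matches what you will find if you try values by hand.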

## Classifier Test - Toy Problem 3

We have just trained a classifier for Toy Problem 3.

It doesn't seem to be learning anything (the loss on the training data does not decrease).

But, to make sure, let us evaluate the performance of the classifier on the test data.

In [7]:
data = Data("data/toy_problem_3_test.txt")

weights = torch.load("models/toy_problem_3_trained_model.bin")
print(weights)

labels, features = data.get_all()

features = torch.autograd.Variable(torch.Tensor(features))
#print(features)

target = torch.autograd.Variable(torch.LongTensor(labels))
#print(target)

result = torch.mm(features, weights)
#print(result)

maxv, observed = torch.max(result, 1)

total = 0
correct = 0
for i in range(len(labels)):
    total += 1
    #print(str(target.data[i]) + " " + str(observed.data[i]))
    if target.data[i] == observed.data[i]:
        correct += 1
accuracy = correct / total
print("Accuracy: "+str(accuracy))

Parameter containing:
 0.5275  0.7267
 0.0379  0.4323
[torch.FloatTensor of size 2x2]

Accuracy: 0.517


As you can see, the accuracy is about 52%, which is very close to 50%.

An accuracy of about 50% on a 2-class classification problem is not a good score, because you can get that score by tossing a coin and using the result to pick the category.
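
The coin-toss baseline is easy to simulate. In the sketch below, the labels are randomly generated stand-ins (not the actual test data), and each guess is an independent fair coin toss. The function name coin_toss_accuracy and the sample count are made up for this illustration.

```python
import random

def coin_toss_accuracy(labels, rng):
    """Accuracy obtained by guessing 0 or 1 with equal probability for every sample."""
    correct = 0
    for label in labels:
        guess = rng.randint(0, 1)  # the coin toss
        if guess == label:
            correct += 1
    return correct / len(labels)

rng = random.Random(0)  # fixed seed so that the simulation is repeatable
# Hypothetical labels: 10000 made-up samples, each labelled 0 or 1 at random.
fake_labels = [rng.randint(0, 1) for sample in range(10000)]

baseline = coin_toss_accuracy(fake_labels, rng)
print("Coin-toss accuracy: " + str(baseline))
```

The result lands close to 0.5, so a classifier that scores 0.517 has done essentially no better than guessing.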

Why is the performance of the single-layer neural network so bad?

The slides of the course will tell you the reason (starting from slide 100).