# Lab 8-1: About XOR

**Jonathan Choi 2021**

**[Deep Learning By Torch] End-to-end study scripts for deep learning, implemented as code practice with PyTorch.**

If you run into any issue, please open a PR at the repository below.

[[Deep Learning By Torch] - Github @JonyChoi](https://github.com/jonychoi/Deep-Learning-By-Torch)

This lab, 'About XOR', deals with the XOR problem, which a single-layer ("1 Layer") perceptron has been proved unable to solve. We define the XOR problem as torch tensors, then build a single linear layer (roughly a 1-layer perceptron) to check for ourselves that it really cannot solve the problem.

## Imports

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [2]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(1)
if device == 'cuda':
    torch.cuda.manual_seed_all(1)

## Define the XOR Problem

In [8]:
X = torch.FloatTensor([[0, 0], [0, 1], [1, 0], [1, 1]]).to(device)
Y = torch.FloatTensor([[0], [1], [1], [0]]).to(device)
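As a quick sanity check, the labels in `Y` can be derived from the inputs with `torch.logical_xor`. This is a minimal sketch kept on the CPU for simplicity; the names `X_check` and `Y_check` are introduced here for illustration only.

```python
import torch

# Same four input pairs as X above, kept on the CPU for this check.
X_check = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)

# XOR is true exactly when the two inputs differ.
Y_check = torch.logical_xor(X_check[:, 0].bool(), X_check[:, 1].bool())
Y_check = Y_check.float().unsqueeze(1)  # shape (4, 1), the same layout as Y

print(Y_check.squeeze().tolist())  # [0.0, 1.0, 1.0, 0.0]
```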

## Create a 1-Layer Perceptron Model to Solve XOR

In [9]:
#nn layers
linear = torch.nn.Linear(2, 1)
sigmoid = torch.nn.Sigmoid()

In [10]:
#model
model = torch.nn.Sequential(linear, sigmoid).to(device)

In [11]:
#Define cost & optimizer
criterion = torch.nn.BCELoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1)
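A useful reference value for this cost: a model that outputs 0.5 for every input has a binary cross-entropy of ln 2 ≈ 0.693147. The sketch below writes the formula out by hand and compares it with `torch.nn.BCELoss`; `pred_half` and `target` are names assumed here for illustration.

```python
import math
import torch

# A model that outputs 0.5 for every XOR input (an "undecided" classifier).
pred_half = torch.full((4, 1), 0.5)
target = torch.tensor([[0.0], [1.0], [1.0], [0.0]])

# Binary cross-entropy written out: -mean(y * log(p) + (1 - y) * log(1 - p)).
manual_bce = -(target * torch.log(pred_half)
               + (1 - target) * torch.log(1 - pred_half)).mean()

# The built-in criterion should agree with the manual formula.
builtin_bce = torch.nn.BCELoss()(pred_half, target)

# All three printed values should be approximately 0.693147 (ln 2).
print(manual_bce.item(), builtin_bce.item(), math.log(2))
```

A cost that stays at this value during training suggests the model has learned nothing beyond "always answer 0.5".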

## Train the 1-Layer Perceptron

In [43]:
for step in range(1001):
    #prediction
    pred = model(X)

    #cost
    cost = criterion(pred, Y)

    #Reduce cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    if step % 100 == 0:
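        # Note: pred is already a sigmoid output, so this second sigmoid maps values near 0.5 to about 0.622
        result = torch.sigmoid(pred).squeeze().detach().cpu().numpy()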
        print('Epoch: {:2d}/1000 \n Prediction: {} Cost: {:.6f}'.format(step, result, cost.item()))

Epoch:  0/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 100/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 200/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 300/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 400/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 500/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 600/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 700/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 800/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 900/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147
Epoch: 1000/1000 
 Prediction: [0.62248445 0.62245303 0.6224656  0.6224342 ] Cost: 0.693147

## Result

As the accuracy check below shows, the 1-layer perceptron (a single linear layer) **cannot solve the XOR problem.** The cost stays at 0.693147 (ln 2) and every output is close to 0.5, so the model is effectively guessing: in this run the accuracy is 0.5. However long we train it, a single linear decision boundary can never classify all four XOR points correctly (at best three of the four).

This means that a 1-layer perceptron can solve only ***linear*** (linearly separable) problems, not ***non-linear*** ones such as XOR.
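The impossibility can be worked out directly. As a sketch, assume the unit predicts 1 exactly when its pre-activation $w_1 x_1 + w_2 x_2 + b$ is positive (equivalently, when the sigmoid output exceeds 0.5); the symbols $w_1$, $w_2$, $b$ stand for the two weights and the bias of the linear layer. Classifying all four XOR points correctly would require:

$$
\begin{aligned}
(0,0) \to 0 &: \quad b \le 0 \\
(0,1) \to 1 &: \quad w_2 + b > 0 \\
(1,0) \to 1 &: \quad w_1 + b > 0 \\
(1,1) \to 0 &: \quad w_1 + w_2 + b \le 0
\end{aligned}
$$

Adding the second and third conditions gives $w_1 + w_2 + 2b > 0$, so $w_1 + w_2 + b > -b \ge 0$, which contradicts the fourth condition. No choice of $w_1$, $w_2$, $b$ satisfies all four at once.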

![](https://miro.medium.com/max/962/1*YzgdEbLiB17x3jB28gSAdw.png)

Image source: https://towardsdatascience.com/the-magic-behind-the-perceptron-network-eaa461088367

In [42]:
#Accuracy computation
# True if hypothesis > 0.5 else False
with torch.no_grad():
    pred = model(X)
    predicted = (pred > 0.5).float()
    accuracy = (predicted == Y).float().mean()
    print('\n Hypothesis: ', pred.detach().cpu().numpy(), '\n Predicted: ', predicted.detach().cpu().numpy(), '\n Accuracy: ', accuracy.item())


 Hypothesis:  [[0.500107  ]
 [0.49997312]
 [0.5000269 ]
 [0.49989298]] 
 Predicted:  [[1.]
 [0.]
 [1.]
 [0.]] 
 Accuracy:  0.5
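One way to see why the hypothesis values hover around 0.5: with all weights and the bias at zero, every output is exactly 0.5 and the gradient of the cost vanishes, so gradient descent has nowhere better to go. Because the cost of a single sigmoid unit is convex in its parameters, a zero gradient marks a global minimum. The sketch below checks this with autograd; `X_cpu`, `Y_cpu`, and `layer` are names assumed here for illustration.

```python
import torch
import torch.nn.functional as F

X_cpu = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y_cpu = torch.tensor([[0.], [1.], [1.], [0.]])

# A single linear layer with every parameter set to zero,
# so each output is sigmoid(0) = 0.5.
layer = torch.nn.Linear(2, 1)
with torch.no_grad():
    layer.weight.zero_()
    layer.bias.zero_()

# Same cost as in the lab: binary cross-entropy on the sigmoid output.
cost = F.binary_cross_entropy(torch.sigmoid(layer(X_cpu)), Y_cpu)
cost.backward()

print(cost.item())        # approximately 0.693147 (ln 2)
print(layer.weight.grad)  # approximately zero for both weights
print(layer.bias.grad)    # approximately zero
```

With outputs this close to 0.5, the thresholded predictions depend on tiny numerical differences, which is why the accuracy carries no real signal.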


## High Level Implementation with ```nn.Module```

In [44]:
class XOR_SingleLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.linear(x)
        out = self.sigmoid(out)
        return out

In [49]:
model = XOR_SingleLayer().to(device)

In [50]:
optimizer = optim.SGD(model.parameters(), lr=1)

In [53]:
nb_epochs = 10001

for epoch in range(nb_epochs):

    #prediction
    pred = model(X)

    #cost
    cost = F.binary_cross_entropy(pred, Y)

    #Reduce the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    if epoch % 1000 == 0:
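        # Note: pred is already a sigmoid output, so this second sigmoid maps values near 0.5 to about 0.622
        result = torch.sigmoid(pred).squeeze().detach().cpu().numpy()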
        print('Epoch: {:2d} / 10000, Result: {}, Cost: {:.6f}'.format(epoch, result, cost.item()))

Epoch:  0 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 1000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 2000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 3000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 4000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 5000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 6000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 7000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 8000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 9000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147
Epoch: 10000 / 10000, Result: [0.6224185  0.62248766 0.6224309  0.6225001 ], Cost: 0.693147


In [54]:
#Accuracy computation
# True if hypothesis > 0.5 else False
with torch.no_grad():
    pred = model(X)
    predicted = (pred > 0.5).float()
    accuracy = (predicted == Y).float().mean()
    print('\n Hypothesis: ', pred.detach().cpu().numpy(), '\n Predicted: ', predicted.detach().cpu().numpy(), '\n Accuracy: ', accuracy.item())


 Hypothesis:  [[0.49982643]
 [0.5001209 ]
 [0.49987915]
 [0.5001736 ]] 
 Predicted:  [[0.]
 [1.]
 [0.]
 [1.]] 
 Accuracy:  0.5
