# Lab 8-2: XOR with Nerual Net

**Jonathan Choi 2021**

**[Deep Learning By Torch] End to End study scripts of Deep Learning by implementing code practice with Pytorch.**

If you have an any issue, please PR below.

[[Deep Learning By Torch] - Github @JonyChoi](https://github.com/jonychoi/Deep-Learning-By-Torch)

Here, we are going to learn about how the multi-layer perceptron can solve the XOR Problem.

## Imports

In [16]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [17]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(1)
if device == 'cuda':
    torch.cuda.manual_seed_all(1)

## Define the XOR Problem

In [18]:
X = torch.FloatTensor([[0, 0], [0, 1], [1, 0], [1, 1]]).to(device)
Y = torch.FloatTensor([[0], [1], [1], [0]]).to(device)

## Create the Multi Layer Perceptron (Multi Linear Layer)

In [19]:
# nn Layers
linear1 = nn.Linear(2, 2)
linear2 = nn.Linear(2, 1)
sigmoid = nn.Sigmoid()

In [20]:
#model
model = nn.Sequential(linear1, sigmoid, linear2, sigmoid).to(device)

In [21]:
#define cost & optimizer
criterion = nn.BCELoss().to(device)
optimizer = optim.SGD(model.parameters(), lr=1)

## Train the multi layer perceptron(Neural Network)

We can say multi layer(over 1 layer) perceptron with backpropagation as **'Neural Network'**.

Actually, the MLP (Multi Layer Perceptron) is the subset of the DNN (Neural Network).

About the DNN, NN, and MLP, please check additional writes at 08.0 - About the Neural Network.md

In [22]:
for step in range(10001):

    #prediction
    pred = model(X)

    #cost
    cost = criterion(pred, Y)

    #Reduce cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    if step % 1000 == 0:
        result = sigmoid(pred).squeeze().detach().cpu().numpy()
        print('Epoch:{:2d}/10000, result: {} cost: {:.6f}'.format(step, result, cost.item()))

Epoch: 0/10000, result: [0.6378827  0.63997304 0.6364925  0.63870925] cost: 0.702758
Epoch:1000/10000, result: [0.6133584  0.5975663  0.67112035 0.6092756 ] cost: 0.618071
Epoch:2000/10000, result: [0.5035969  0.72880906 0.7289416  0.5031296 ] cost: 0.012341
Epoch:3000/10000, result: [0.50156736 0.7300963  0.73012793 0.5013755 ] cost: 0.005362
Epoch:4000/10000, result: [0.50099635 0.730452   0.7304612  0.50087863] cost: 0.003411
Epoch:5000/10000, result: [0.500729   0.7306182  0.73061895 0.50064355] cost: 0.002494
Epoch:6000/10000, result: [0.5005743  0.73070943 0.73071426 0.500508  ] cost: 0.001966
Epoch:7000/10000, result: [0.5004734  0.7307734  0.7307743  0.50041914] cost: 0.001618
Epoch:8000/10000, result: [0.5004025  0.73081535 0.73081684 0.5003571 ] cost: 0.001377
Epoch:9000/10000, result: [0.5003502  0.7308465  0.7308482  0.50031084] cost: 0.001199
Epoch:10000/10000, result: [0.50030965 0.73087233 0.7308715  0.50027514] cost: 0.001060


## Results

As below, the prediction shows the multi layer perceptron solved the XOR problem.

We can say this as ***non-linear*** function, that multi layering can act as non-linear function, otherwise the single layer perceptron can only be as ***linear*** function.

In [23]:
#Accuracy computation
#True if hypothesis > 0.5 else False

with torch.no_grad():
    prediction = model(X)
    predicted = (prediction > 0.5).float()
    accuracy = (predicted == Y).float().mean()

    print('Prediction: {} \nPredicted: {}\nAccuracy: {}'.format(prediction.squeeze().detach().cpu().numpy(), predicted.squeeze().detach().cpu().numpy(), accuracy))

Prediction: [0.00123849 0.99905306 0.9990484  0.00110031] 
Predicted: [0. 1. 1. 0.]
Accuracy: 1.0


## High-level Implementation with ```nn.Module```

In [24]:
class XOR_MultiLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(2, 2)
        self.linear2 = nn.Linear(2, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return nn.Sequential(
            linear1,
            sigmoid,
            linear2,
            sigmoid
        )(x)

In [25]:
model = XOR_MultiLayer()

In [26]:
optimizer = optim.SGD(model.parameters(), lr=1)

### Take a Moment!

just writing as ```nn.Sigmoid(pred)``` makes an error of "TypeError: __init__() takes 1 positional argument but 2 were given."

=> You are using it as an instance method so you must include self as the first argument

https://stackoverflow.com/questions/50275814/sigmoid-takes-1-positional-argument-but-2-were-given

---

So we should use torch.sigmoid if we want to apply the sigmoid.

In [27]:
nb_epochs = 10001

for epoch in range(nb_epochs):

    #prediction
    pred = model(X)

    #cost function
    cost = F.binary_cross_entropy(pred, Y)

    #Reduce cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    if epoch % 1000 == 0:
        result = torch.sigmoid(pred).squeeze().detach().cpu().numpy()
        print('Epoch: {:2d}/10000, result: {}, cost: {:.6f}'.format(epoch, result, cost.item()))

Epoch:  0/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 1000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 2000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 3000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 4000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 5000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 6000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 7000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 8000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 9000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060
Epoch: 10000/10000, result: [0.50030965 0.73087233 0.7308715  0.5002751 ], cost: 0.001060


In [28]:
#Accuracy computation
#True if hypothesis > 0.5 else False

with torch.no_grad():
    prediction = model(X)
    predicted = (prediction > 0.5).float()
    accuracy = (predicted == Y).float().mean()

    print('Prediction: {} \nPredicted: {}\nAccuracy: {}'.format(prediction.squeeze().detach().cpu().numpy(), predicted.squeeze().detach().cpu().numpy(), accuracy))

Prediction: [0.00123849 0.99905306 0.9990484  0.00110031] 
Predicted: [0. 1. 1. 0.]
Accuracy: 1.0
