# Lab 8-3: XOR with Wide and Deep Neural Network

**Jonathan Choi 2021**

**[Deep Learning By Torch] End-to-end study scripts for deep learning, implemented as code practice with PyTorch.**

If you find any issues, please open a PR at the repository below.

[[Deep Learning By Torch] - Github @JonyChoi](https://github.com/jonychoi/Deep-Learning-By-Torch)

Here we look at how a wider and deeper neural network changes the learning process and reduces the cost on the XOR problem.

## Imports

In [12]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [13]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(1)
if device == 'cuda':
    torch.cuda.manual_seed_all(1)

## Define XOR Problem

In [14]:
X = torch.FloatTensor([[0, 0], [0, 1], [1, 0], [1, 1]]).to(device)
Y = torch.FloatTensor([[0], [1], [1], [0]]).to(device)
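As a quick sanity check, each label in `Y` is the XOR of the two input bits in the matching row of `X`. A minimal plain-Python sketch (no PyTorch needed; `inputs` and `labels` are illustrative names that mirror `X` and `Y`, not part of the notebook):

```python
# Plain-Python mirror of X and Y above (illustrative names only).
inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]

# Bitwise XOR of the two input bits gives the target for each row.
labels = [a ^ b for a, b in inputs]
print(labels)  # [0, 1, 1, 0]
```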

## Create the Wide and Deep Model

In [15]:
# nn layers
linear1 = nn.Linear(2, 10)
linear2 = nn.Linear(10, 10)
linear3 = nn.Linear(10, 10)
linear4 = nn.Linear(10, 1)
sigmoid = nn.Sigmoid()

In [16]:
model = nn.Sequential(linear1, sigmoid, linear2, sigmoid, linear3, sigmoid, linear4, sigmoid).to(device)
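To get a feel for how "wide and deep" this model is, the number of trainable parameters can be worked out by hand. The sketch below assumes only the layer sizes used above (2 -> 10 -> 10 -> 10 -> 1); `layer_sizes` and the other names are illustrative:

```python
# Layer sizes of the network above: 2 -> 10 -> 10 -> 10 -> 1.
layer_sizes = [2, 10, 10, 10, 1]

# Each nn.Linear(n_in, n_out) holds n_in * n_out weights plus n_out biases.
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)
print(params_per_layer, total_params)  # [30, 110, 110, 11] 261
```

On the PyTorch side, `sum(p.numel() for p in model.parameters())` should report the same total.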

In [17]:
criterion = nn.BCELoss().to(device)
optimizer = optim.SGD(model.parameters(), lr=1)
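For reference, binary cross-entropy with mean reduction (the default for `nn.BCELoss`) averages `-[y*log(p) + (1-y)*log(1-p)]` over the samples. The plain-Python sketch below is an illustration, not the library's implementation; `binary_cross_entropy` and `uninformed` are names chosen here. It shows that a model outputting 0.5 for every input scores ln 2, which is the value of about 0.6931 that the cost sits at in the training logs until the network leaves the plateau.

```python
import math

def binary_cross_entropy(preds, targets):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    losses = [
        -(y * math.log(p) + (1 - y) * math.log(1 - p))
        for p, y in zip(preds, targets)
    ]
    return sum(losses) / len(losses)

targets = [0, 1, 1, 0]

# A model that outputs 0.5 for every input scores ln(2).
uninformed = binary_cross_entropy([0.5, 0.5, 0.5, 0.5], targets)
print(round(uninformed, 6))  # 0.693147
```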

In [18]:
for step in range(10001):

    #prediction
    pred = model(X)

    #cost
    cost = criterion(pred, Y)

    #Reduce the cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    if step % 1000 == 0:
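        # NOTE: pred is already a sigmoid output, so this second sigmoid squashes
        # the displayed values into (0.5, 0.731); the true probabilities are in pred.
        result = sigmoid(pred).squeeze().detach().cpu().numpy()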
        print('Epoch: {:4d}/10000, Result: {}, Cost: {:.6f}'.format(step, result, cost.item()))

Epoch:    0/10000, Result: [0.6539865 0.6540312 0.6539376 0.6539956], Cost: 0.731973
Epoch: 1000/10000, Result: [0.62243706 0.62250036 0.6224063  0.62246627], Cost: 0.693140
Epoch: 2000/10000, Result: [0.62244695 0.6225462  0.6224133  0.6224938 ], Cost: 0.693107
Epoch: 3000/10000, Result: [0.6224198  0.6225931  0.62239367 0.62253016], Cost: 0.693069
Epoch: 4000/10000, Result: [0.62223357 0.6227078  0.62229025 0.6226171 ], Cost: 0.692835
Epoch: 5000/10000, Result: [0.58411795 0.6377132  0.62946147 0.6384127 ], Cost: 0.615173
Epoch: 6000/10000, Result: [0.50020653 0.7308674  0.7308163  0.5003069 ], Cost: 0.001065
Epoch: 7000/10000, Result: [0.5000864  0.7309806  0.73095864 0.5001244 ], Cost: 0.000437
Epoch: 8000/10000, Result: [0.5000536  0.7310111  0.73099756 0.500076  ], Cost: 0.000268
Epoch: 9000/10000, Result: [0.5000386  0.7310245  0.7310147  0.50005406], Cost: 0.000192
Epoch: 10000/10000, Result: [0.50003004 0.73103225 0.73102444 0.5000416 ], Cost: 0.000149
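The `Result` values in this log never leave the range 0.5 to 0.731 because the logging line applies `sigmoid` to `pred`, which is already a sigmoid output. A small plain-Python sketch of that effect (the `sigmoid` helper and `displayed` are defined here for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Map a true probability to the value the log would display for it.
displayed = {p: round(sigmoid(p), 4) for p in (0.0, 0.5, 1.0)}
print(displayed)  # {0.0: 0.5, 0.5: 0.6225, 1.0: 0.7311}
```

So a displayed 0.6224 corresponds to a real prediction near 0.5 (the plateau), while 0.5 and 0.731 correspond to real predictions near 0 and 1.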


In [22]:
# Accuracy computation
# True if hypothesis > 0.5 else False
with torch.no_grad():
    pred = model(X)
    predicted = (pred > 0.5).float()
    accuracy = (predicted == Y).float().mean() *100
    print('Prediction: {}, \nPredicted: {}, \nAccuracy: {}'.format(pred.detach().cpu().numpy(), predicted.detach().cpu().numpy(), accuracy))

Prediction: [[1.19966135e-04]
 [9.99865890e-01]
 [9.99826610e-01]
 [1.66115104e-04]], 
Predicted: [[0.]
 [1.]
 [1.]
 [0.]], 
Accuracy: 100.0
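The accuracy step can be retraced by hand. The sketch below copies the probabilities from the printed `Prediction` output above and applies the same 0.5 threshold in plain Python (`probs`, `targets`, and the other names are illustrative):

```python
# Probabilities copied from the printed "Prediction" output above.
probs = [1.19966135e-04, 9.99865890e-01, 9.99826610e-01, 1.66115104e-04]
targets = [0.0, 1.0, 1.0, 0.0]

# Threshold at 0.5, then take the percentage of matches with the targets.
predicted = [1.0 if p > 0.5 else 0.0 for p in probs]
accuracy = 100.0 * sum(p == t for p, t in zip(predicted, targets)) / len(targets)
print(predicted, accuracy)  # [0.0, 1.0, 1.0, 0.0] 100.0
```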


## High-Level Implementation with `nn.Module`

In [23]:
class XOR_Wide_Deep(nn.Module):
    def __init__(self):
        super().__init__()
        self.sq = nn.Sequential(
            nn.Linear(2, 10),
            nn.Sigmoid(),
            nn.Linear(10, 10),
            nn.Sigmoid(),
            nn.Linear(10, 10),
            nn.Sigmoid(),
            nn.Linear(10, 1),
            nn.Sigmoid()
        )
    
    def forward(self, x):
        return self.sq(x)

In [27]:
model = XOR_Wide_Deep().to(device)

In [28]:
optimizer = optim.SGD(model.parameters(), lr=1)

In [29]:
nb_epochs = 10001

for epoch in range(nb_epochs):

    #prediction
    pred = model(X)

    #cost
    cost = F.binary_cross_entropy(pred, Y)  # already on the same device as pred and Y

    #Reduce cost
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    if epoch % 1000 == 0:
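        # NOTE: pred is already a sigmoid output, so this second sigmoid squashes
        # the displayed values into (0.5, 0.731); the true probabilities are in pred.
        result = torch.sigmoid(pred).squeeze().detach().cpu().numpy()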
        print('Epoch: {:2d}/10000, result: {}, cost: {:.6f}'.format(epoch, result, cost.item()))

Epoch:  0/10000, result: [0.60209876 0.60213494 0.6020612  0.60210985], cost: 0.708109
Epoch: 1000/10000, result: [0.6224386  0.6224769  0.62242305 0.6224517 ], cost: 0.693127
Epoch: 2000/10000, result: [0.6224384  0.62247753 0.62241894 0.62245524], cost: 0.693141
Epoch: 3000/10000, result: [0.62243104 0.6224865  0.6224224  0.6224708 ], cost: 0.693133
Epoch: 4000/10000, result: [0.6224201  0.62248063 0.62240714 0.622475  ], cost: 0.693163
Epoch: 5000/10000, result: [0.6224164 0.6225073 0.6224153 0.6224915], cost: 0.693116
Epoch: 6000/10000, result: [0.6223871 0.6225118 0.6223942 0.622507 ], cost: 0.693122
Epoch: 7000/10000, result: [0.6223548  0.62254095 0.62237835 0.6225446 ], cost: 0.693105
Epoch: 8000/10000, result: [0.62221515 0.62258583 0.6223145  0.6226085 ], cost: 0.692985
Epoch: 9000/10000, result: [0.6205708  0.62311643 0.62223154 0.62352747], cost: 0.690540
Epoch: 10000/10000, result: [0.5002793  0.7306931  0.73068523 0.5004934 ], cost: 0.001713


In [30]:
# Accuracy computation
# True if hypothesis > 0.5 else False
with torch.no_grad():
    pred = model(X)
    predicted = (pred > 0.5).float()
    accuracy = (predicted == Y).float().mean() *100
    print('Prediction: {}, \nPredicted: {}, \nAccuracy: {}'.format(pred.detach().cpu().numpy(), predicted.detach().cpu().numpy(), accuracy))

Prediction: [[0.00111703]
 [0.99814284]
 [0.9980995 ]
 [0.00196971]], 
Predicted: [[0.]
 [1.]
 [1.]
 [0.]], 
Accuracy: 100.0
