# **Solve the XOR Problem with a Neural Network**

Source:  [https://github.com/d-insight/code-bank.git](https://github.com/d-insight/code-bank.git)  
License: [MIT License](https://opensource.org/licenses/MIT). See open source [license](LICENSE) in the Code Bank repository. 

-------------

## Overview

This notebook illustrates how a neural network solves the non-linear XOR problem. It can be shown alongside alternative methods, such as basis expansions, that warp the input space so that the classes become separable.
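
For comparison, here is a minimal sketch of the basis-expansion idea mentioned above. It is not part of the original notebook: the helper names `expand` and `linear_classifier` are hypothetical, and the weights are hand-picked for illustration rather than fitted. Appending the interaction term `x1 * x2` as a third feature lets a single plane separate the two XOR classes.

```python
# The four XOR cases and their labels
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

def expand(x1, x2):
    """Basis expansion: keep x1 and x2 and append the interaction term x1 * x2."""
    return (x1, x2, x1 * x2)

# Hand-picked plane in the expanded space (an assumption, not a fitted model)
weights, bias = (1.0, 1.0, -2.0), -0.5

def linear_classifier(features):
    """Return 1 if the point lies on the positive side of the plane, else 0."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return int(score > 0)

basis_predictions = [linear_classifier(expand(x1, x2)) for x1, x2 in points]
print(basis_predictions == labels)  # True
```

A neural network reaches a similar result without being given the extra feature: its hidden layer learns a warping of the input space on its own.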

-------------

## **Part 0**: Setup

In [None]:
# Import all packages 
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import matplotlib
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from mpl_toolkits.mplot3d import axes3d

# Suppress unimportant warnings
import warnings
warnings.filterwarnings('ignore')


In [None]:
# define all constants

# Part 1
EPOCHS_1 = 20000

# Part 2
N                     = 100
STD                   = 0.30
HIDDEN_LAYER1_NEURONS = 4
EPOCHS_2              = 500
FIGSIZE               = (10, 10)
FONTSIZE              = 16
PLOT_X1_LABEL         = '\nNo A                              Yes A'
PLOT_X2_LABEL         = '\nNo B                              Yes B'


## **Part 1**: Minimum number of layers and nodes for XOR

A fundamental question when designing a neural network is how many hidden layers, and how many neurons per hidden layer, to use. We explore this question in the context of the XOR problem. Because the training set contains only eight rows (the four XOR cases, each repeated twice), a high number of epochs is needed. A useful complementary article explains the intuition: https://towardsdatascience.com/beginners-ask-how-many-hidden-layers-neurons-to-use-in-artificial-neural-networks-51466afa0d3e

Will the network always achieve 100% accuracy? No: the result depends on the randomly initialized weights, so some runs may converge to a solution that misclassifies at least one of the four cases.
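
That one hidden layer with two neurons is enough in principle can be checked by hand. The sketch below is an illustration only: the function `tiny_xor_net` and its weights are hypothetical and hand-picked, not values learned by the Keras model. It evaluates a 2-2-1 sigmoid network whose hidden neurons approximate OR and NAND and whose output neuron approximates AND.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tiny_xor_net(x1, x2):
    """Forward pass of a 2-2-1 sigmoid network with hand-picked (not trained) weights."""
    h_or = sigmoid(20 * x1 + 20 * x2 - 10)        # hidden neuron 1: roughly OR
    h_nand = sigmoid(-20 * x1 - 20 * x2 + 30)     # hidden neuron 2: roughly NAND
    return sigmoid(20 * h_or + 20 * h_nand - 30)  # output neuron: roughly AND

hand_predictions = [round(tiny_xor_net(a, b)) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(hand_predictions)  # [0, 1, 1, 0]
```

Since such a weight setting exists, a run that ends below 100% accuracy reflects where the optimizer ended up from its random starting point, not a lack of capacity in the architecture.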

In [None]:
# generate toy data
FEATURES = np.array([[0,0], [0,1], [1,0], [1,1], [0,0], [0,1], [1,0], [1,1]], "float32")
LABELS   = np.array([ [0],   [1],   [1],   [0] ,  [0],   [1],   [1],   [0]],  "float32")  # must be same order

# define Model:  Keras Sequential Fully-Connected Feed-Forward Network
model = Sequential()
model.add(Dense(2, input_dim = 2, activation = 'sigmoid', kernel_initializer = 'glorot_uniform'))  # Hidden Layer
model.add(Dense(1, activation = 'sigmoid', kernel_initializer = 'glorot_uniform'))                 # Output Layer
model.compile(loss = 'mean_squared_error', optimizer = 'nadam', metrics = ['binary_accuracy'])

# fit Model
model.fit(FEATURES, LABELS, epochs = EPOCHS_1, verbose = 0)
loss, accuracy = model.evaluate(FEATURES, LABELS, verbose = 0)
print('Loss: {} - Accuracy: {}\n'.format(round(loss, 4), accuracy))

# Predict Outcome
predictions = model.predict(FEATURES).round()
for i in range(len(LABELS)):
    print('Label', LABELS[i], '  ', 'Prediction',predictions[i])
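
To put the size of this network in perspective, the following sketch counts the trainable parameters of a fully connected network from its layer sizes. The helper `dense_param_count` is a hypothetical name introduced here; the same totals should appear in the output of `model.summary()`.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network, e.g. [2, 2, 1]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(dense_param_count([2, 2, 1]))  # Part 1 network: 9
print(dense_param_count([2, 4, 1]))  # Part 2 network with 4 hidden neurons: 17
```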

## **Part 2**: Fuzzy XOR

We now generate a fuzzy XOR problem in which the classes can overlap. The purpose of this demo is to show how Keras can be used to classify a quasi-XOR relationship with noisy, dispersed data. Each coordinate of a point is drawn from a normal distribution centred on 0 or 1, depending on the region, with standard deviation `STD`; if `STD` is large, the classes overlap.
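
How much overlap a given `STD` produces can be estimated before any training. The sketch below is an illustration under stated assumptions, not part of the original notebook: `normal_cdf` and `quadrant_rule_accuracy` are hypothetical helpers, and the classifier considered is the simple rule that thresholds both inputs at 0.5. A point is labelled correctly if neither coordinate crosses 0.5, or if both do, because the diagonally opposite quadrant carries the same label.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def quadrant_rule_accuracy(std):
    """Expected accuracy of thresholding both inputs at 0.5 on the fuzzy XOR data."""
    # Probability that one coordinate stays on its own side of 0.5
    p_correct_side = normal_cdf(0.5 / std)
    # Correct if neither coordinate crosses, or if both cross
    return p_correct_side ** 2 + (1.0 - p_correct_side) ** 2

for std in (0.10, 0.30, 0.50):
    print(std, round(quadrant_rule_accuracy(std), 3))
```

With `STD = 0.30` the rule is right roughly 91% of the time, which gives a rough reference point for the accuracy the network reports on this data.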

There are four regions of similarity in outcome values:

        -----------
        | UL | UR |
        -----------
        | BL | BR |
        -----------

    There are two factors (one per dimension):

        Factor A is centred on 0 in the BL and UL regions
        Factor A is centred on 1 in the BR and UR regions
        Factor B is centred on 0 in the BL and BR regions
        Factor B is centred on 1 in the UL and UR regions
        In an XOR(A,B) relationship, the outcome is 1 in the BR and UL regions and 0 in the BL and UR regions

In [None]:
# Global Data Variables -----------------------------------------------------------------------
Y  = []
X1 = []
X2 = []
COLORS = []

# Generate data -------------------------------------------------------------------------------
for i in range(N):

    # Failure by neither (left bottom)
    Y.append(0)
    COLORS.append('red')
    X1.append(np.random.normal(loc=0.0, scale=STD))
    X2.append(np.random.normal(loc=0.0, scale=STD))

    # Success by A (right bottom)
    Y.append(1)
    COLORS.append('green')
    X1.append(np.random.normal(loc=1.0, scale=STD))
    X2.append(np.random.normal(loc=0.0, scale=STD))

    # Success by B (left upper)
    Y.append(1)
    COLORS.append('green')
    X1.append(np.random.normal(loc=0.0, scale=STD))
    X2.append(np.random.normal(loc=1.0, scale=STD))

    # Failure by both (right upper)
    Y.append(0)
    COLORS.append('red')
    X1.append(np.random.normal(loc=1.0, scale=STD))
    X2.append(np.random.normal(loc=1.0, scale=STD))

DATA = np.array(list(zip(X1, X2)))
Y    = np.array(Y)


# Plot the data in 2-D  ------------------------------------------------------------------------
fig = plt.figure(1, figsize=FIGSIZE)
ax = fig.add_subplot(111)
ax.tick_params(axis='both', which='major', labelsize=FONTSIZE)
ax.set_xlabel(PLOT_X1_LABEL, fontsize = FONTSIZE)
ax.set_ylabel(PLOT_X2_LABEL, fontsize = FONTSIZE)
for i in range(len(X1)):
    # The axes are 2-D, so only x and y are passed; the class is shown by colour
    ax.plot(X1[i], X2[i], c=COLORS[i], marker='o', markersize=10)
plt.hlines(0.5, xmin = min(X1), xmax = max(X1), colors='black', linewidth=1)
plt.vlines(0.5, ymin = min(X2), ymax = max(X2), colors='black', linewidth=1)
plt.show()

print('Neural network training.\n')

# Define Model: Keras sequential feed-forward neural network
model = Sequential()
model.add(Dense(HIDDEN_LAYER1_NEURONS, input_dim=2, activation='sigmoid'))
model.add(Dense(1, activation='sigmoid')) # Final output layer with one resulting neuron
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['binary_accuracy'])

# Fit
model.fit(DATA, Y, epochs=EPOCHS_2, verbose=0)
loss, accuracy = model.evaluate(DATA, Y, verbose = 0)
print('Loss: {} - Accuracy: {}\n'.format(round(loss, 4), accuracy))

# Predict
Y_HAT = [int(round(value[0])) for value in model.predict(DATA)]
HUE   = []
MARK  = []

for i in range(len(Y_HAT)):
    if bool(Y_HAT[i] == Y[i]):
        MARK.append('o')
        HUE.append(COLORS[i])
    else:
        MARK.append('x')
        HUE.append('red' if COLORS[i]=='green' else 'green')

# Plot Results
fig = plt.figure(2, figsize=FIGSIZE)
ax = fig.add_subplot(111)
ax.tick_params(axis='both', which='major', labelsize=FONTSIZE)
ax.set_xlabel(PLOT_X1_LABEL, fontsize = FONTSIZE)
ax.set_ylabel(PLOT_X2_LABEL, fontsize = FONTSIZE)
for i in range(len(X1)):
    # The axes are 2-D, so only x and y are passed; colour shows the predicted class, 'x' marks an error
    ax.plot(X1[i], X2[i], c=HUE[i], marker=MARK[i], markersize=10)
plt.hlines(0.5, xmin = min(X1), xmax = max(X1), colors='black', linewidth=1)
plt.vlines(0.5, ymin = min(X2), ymax = max(X2), colors='black', linewidth=1)
plt.show()
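
To see how the network has carved up the input space, the predictions can also be evaluated on a regular grid. The following is a minimal sketch, not part of the original notebook: `decision_grid` and `quadrant_rule` are hypothetical names, and the ideal quadrant rule stands in for the trained model so that the block runs on its own.

```python
import numpy as np

def decision_grid(predict_fn, x_range=(-1.0, 2.0), y_range=(-1.0, 2.0), steps=50):
    """Evaluate predict_fn on a regular grid and return (xx, yy, zz) for contour plotting."""
    xs = np.linspace(x_range[0], x_range[1], steps)
    ys = np.linspace(y_range[0], y_range[1], steps)
    xx, yy = np.meshgrid(xs, ys)
    grid_points = np.column_stack([xx.ravel(), yy.ravel()])
    zz = np.asarray(predict_fn(grid_points)).reshape(xx.shape)
    return xx, yy, zz

def quadrant_rule(points):
    """Stand-in classifier (assumption): 1 where exactly one input exceeds 0.5."""
    return ((points[:, 0] > 0.5) != (points[:, 1] > 0.5)).astype(float)

xx, yy, zz = decision_grid(quadrant_rule)
print(zz.shape)  # (50, 50)
```

With the trained network, one option would be to pass `lambda pts: model.predict(pts, verbose=0)` as `predict_fn` and draw the result behind the points with `ax.contourf(xx, yy, zz, levels=[0, 0.5, 1], alpha=0.2)`.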