# Data: we shall generate the OR data

x1 x2  --> y

0 0 --> 0

0 1 --> 1

1 0 --> 1

1 1 --> 1

In [1]:
import numpy as np
num_inputs = 4  # number of rows in the truth table

In [2]:
input_array = np.array([[0,0],[0,1],[1,0],[1,1]])
input_array

array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

In [3]:
# output data -- OR Function
output_array = np.array([0,1,1,1])
output_array

array([0, 1, 1, 1])
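The truth table is small enough to type by hand, but it can also be generated. The following is a minimal sketch, assuming only NumPy and the standard library; the names `inputs` and `or_labels` are illustrative and are not used elsewhere in this notebook.

```python
import itertools
import numpy as np

# All 2-bit input combinations, in the same order as the truth table
inputs = np.array(list(itertools.product([0, 1], repeat=2)))

# Logical OR of the two columns, cast back to 0/1 integers
or_labels = np.logical_or(inputs[:, 0], inputs[:, 1]).astype(int)

print(inputs.tolist())     # [[0, 0], [0, 1], [1, 0], [1, 1]]
print(or_labels.tolist())  # [0, 1, 1, 1]
```

Swapping `np.logical_or` for `np.logical_xor` would produce XOR labels in the same way.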

# Perceptron Model
## Importing libraries

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

1. **Importing Sequential from tensorflow.keras.models:**

Sequential is a way to create a model in TensorFlow. Think of it as a container where you add layers one by one in the order they should process data.
Example: Imagine building a robot. You start with the frame (Sequential), and then you'll add parts to it in a specific order.

2. **Importing Dense from tensorflow.keras.layers:**

Dense is a type of layer used in neural networks, where every neuron (basic unit of computation in neural networks) is connected to all neurons in the previous layer.
Example: Now, in our robot, we add a layer of sensors (Dense layer). Each sensor is connected to all parts of the previous layer, say the frame, to gather information.

3. **Importing SGD from tensorflow.keras.optimizers:**

SGD stands for Stochastic Gradient Descent. It's an optimizer that helps to minimize the errors the model makes during training. In simple terms, it's like a guide that tells the model how to improve itself.
Example: Think of SGD as a robot technician who adjusts and fine-tunes the robot (our model) to perform better based on the errors it makes during its tasks.

**Putting it all together:**

A. You first create a model (Sequential).

B. Then you add layers to it (Dense), like adding sensors and parts to our robot.

C. Finally, you use an optimizer (SGD) to improve its performance, like a technician who tunes the robot for better efficiency.
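To make the Dense layer less abstract, here is a rough NumPy sketch of what a single sigmoid unit computes: a weighted sum of its inputs plus a bias, passed through the sigmoid. The helper `dense_forward` and the weights are hypothetical and hand-picked for illustration; they are not what Keras learns, and this is not how Keras implements the layer internally.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(x, weights, bias):
    """One fully connected layer with sigmoid activation: sigmoid(x @ weights + bias)."""
    return sigmoid(x @ weights + bias)

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hand-picked, hypothetical parameters (not learned): one weight per input, one bias
weights = np.array([[1.0], [1.0]])   # shape (2 inputs, 1 unit)
bias = np.array([-0.5])              # shape (1 unit,)

probs = dense_forward(x, weights, bias)          # shape (4, 1), values in (0, 1)
hand_pred = (probs > 0.5).astype(int).ravel()    # threshold into class labels
print(hand_pred)  # [0 1 1 1]
```

With these particular weights the unit already reproduces the OR table, which hints that training only has to find some similar set of three numbers.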

# Model Definition

In [None]:
classifier = Sequential()
classifier.add(Dense(input_shape=(2,), units=1, activation='sigmoid'))
# bias term is added by default
classifier.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 1)                 3         
                                                                 
Total params: 3 (12.00 Byte)
Trainable params: 3 (12.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


The output shows a summary of a simple neural network model, named "sequential", created using TensorFlow's Keras API. This model consists of a single layer, which is a Dense layer. Key insights from this summary are:

1. Layer Type: The model has one layer, which is a Dense layer. This type of layer is fully connected, meaning each input neuron is connected to each output neuron.

2. Output Shape: The output shape of this layer is (None, 1). This means that for each input instance, the layer produces a single output. The None in the shape represents the batch size, which can vary.

3. Parameters: There are 3 parameters in total. In a Dense layer, parameters are generally the weights and biases. Since the input shape is (2,) (indicating two input features) and the layer has 1 neuron, there are 2 weights (one for each input feature) and 1 bias (added by default), totaling 3 parameters.

4. Trainable Parameters: All 3 parameters are trainable, meaning they will be updated during the training process to minimize the model's error.

5. Non-trainable Parameters: There are no non-trainable parameters in this model.

This model is a binary classifier over two input features. The sigmoid activation squashes the neuron's output into the range 0 to 1, so it can be read as the probability that the input belongs to class 1. With a single neuron and three parameters, it is about the simplest network possible, which makes it a good fit for demonstration purposes.
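The parameter count reported in the summary can be reproduced with simple arithmetic: one weight per (input, unit) pair plus one bias per unit. The helper `dense_param_count` below is a hypothetical name used only for this sketch.

```python
def dense_param_count(n_inputs, units, use_bias=True):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + (units if use_bias else 0)

print(dense_param_count(2, 1))    # 3: two weights and one bias
print(dense_param_count(2, 10))   # 30: a 10-unit layer on two inputs
print(dense_param_count(10, 1))   # 11: one unit fed by ten inputs
```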

In [None]:
classifier.compile(optimizer=SGD(learning_rate=0.5), loss='binary_crossentropy', metrics=['accuracy'])

This line of code is configuring the training process for the neural network model named classifier. Let's break it down:

1. Compile Method: The compile method is used to configure how the model learns during training. It sets up important aspects like the optimizer, loss function, and metrics.

2. Optimizer - SGD: Here, Stochastic Gradient Descent (SGD) is used as the optimizer with a learning rate of 0.5. The optimizer is an algorithm that adjusts the weights of the network to minimize the loss. The learning rate controls how much the weights are adjusted during training, with a higher rate potentially leading to faster learning but also a risk of overshooting the optimal solution.

3. Loss Function - binary_crossentropy: The loss function chosen is 'binary_crossentropy', which is commonly used for binary classification tasks. This function measures how far off the predictions are from the actual labels (0 or 1 in binary classification) and the model aims to minimize this value during training.

4. Metrics - ['accuracy']: The model will track 'accuracy' during training and evaluation. Accuracy is the fraction of correctly classified instances among the total instances.
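To show how the optimizer, loss, and metric fit together, here is a rough NumPy sketch of training one sigmoid neuron on the OR data. It is an approximation under stated assumptions, not what Keras does internally: it uses full-batch gradient descent (Keras' SGD works on mini-batches), starts from zero weights, and runs 1000 steps, a number chosen only for this sketch. All names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)   # OR labels

w = np.zeros(2)        # assumption: zero initialisation, unlike Keras' random default
b = 0.0
learning_rate = 0.5    # same value as passed to SGD

losses = []
for step in range(1000):
    p = sigmoid(x @ w + b)                 # forward pass
    losses.append(binary_crossentropy(y, p))
    error = p - y                          # per-sample gradient w.r.t. the weighted sum
    w -= learning_rate * (x.T @ error) / len(y)
    b -= learning_rate * error.mean()

manual_pred = (sigmoid(x @ w + b) > 0.5).astype(int)
manual_accuracy = np.mean(manual_pred == y)   # fraction of correct predictions
print(manual_pred, manual_accuracy)
```

A larger learning rate takes bigger steps along the gradient; on this tiny, linearly separable problem 0.5 is comfortably stable.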


In [None]:
classifier.fit(input_array, output_array, epochs=100)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

<keras.src.callbacks.History at 0x7ef4a80b9570>

# Model prediction and validation

In [None]:
y_pred = classifier.predict(input_array)
y_pred = (y_pred > 0.5)
y_pred



array([[False],
       [ True],
       [ True],
       [ True]])

This output shows the results of making predictions with the classifier neural network model on some input data (input_array). Here's a breakdown of what's happening:

1. Prediction (classifier.predict(input_array)): The model takes an array of inputs (input_array) and predicts outputs for them. These outputs are in the form of probabilities, given the use of a sigmoid activation function in the model's layer.

2. Thresholding (y_pred > 0.5): The predictions are then converted into boolean values (True or False). This is done by checking whether each predicted probability is greater than 0.5. If a prediction is greater than 0.5, it's considered True (typically representing class 1 in binary classification), otherwise False (representing class 0).

3. Output Array: The final output is an array showing these boolean results for each input. Here, the array [[False], [True], [True], [True]] indicates that the first input was classified into class 0 (False), and the next three inputs were classified into class 1 (True), which matches the OR table.

4. Progress line (1/1 [==============================] - 0s 97ms/step): Keras prints a line of this form while predicting. It shows that the prediction completed in a single batch (1/1), at about 97 milliseconds per step.

In summary, the model produced a probability for each instance in the input array, and thresholding at 0.5 turned each probability into a class label (True for class 1, False for class 0).
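The thresholding step can be tried in isolation. This is a small sketch with made-up probabilities (`example_probs` is hypothetical, not the model's actual output), shaped `(n_samples, 1)` like the array that `predict` returns for this model.

```python
import numpy as np

# Hypothetical sigmoid outputs, one row per input sample
example_probs = np.array([[0.12], [0.91], [0.88], [0.99]])

as_bool = example_probs > 0.5             # boolean column vector, shape (4, 1)
as_labels = as_bool.astype(int).ravel()   # flat array of 0/1 class labels

print(as_bool.ravel())   # [False  True  True  True]
print(as_labels)         # [0 1 1 1]
```

Flattening with `ravel()` is optional, but it gives the predictions the same 1-D shape as `output_array`, which makes element-wise comparison easier to read.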



In [None]:
from sklearn.metrics import accuracy_score
# accuracy_score expects (y_true, y_pred)
print("Accuracy", accuracy_score(output_array, y_pred))

Accuracy 1.0


##### We see that after training for 100 epochs the model's accuracy is 100%, which means it classifies all four points correctly

## The single-layer perceptron above dates back to 1957


# Perceptron Network for XOR data

### Data
#### We shall generate the XOR data
0 0 --> 0

0 1 --> 1

1 0 --> 1

1 1 --> 0

In [None]:
input_array = np.array([[0,0],[0,1],[1,0],[1,1]])
input_array

array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

In [None]:
# output data -- XOR Function
output_array = np.array([0,1,1,0])
output_array

array([0, 1, 1, 0])

In [None]:
classifier = Sequential()
classifier.add(Dense(input_shape=(2,), units=1, activation='sigmoid'))
# bias term is added by default
classifier.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_1 (Dense)             (None, 1)                 3         
                                                                 
Total params: 3 (12.00 Byte)
Trainable params: 3 (12.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [None]:
classifier.compile(optimizer=SGD(learning_rate=0.5), loss='binary_crossentropy', metrics=['accuracy'])

In [None]:
classifier.fit(input_array, output_array, epochs=100)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

<keras.src.callbacks.History at 0x7ef40d1de4a0>

In [None]:
y_pred = classifier.predict(input_array)
y_pred = (y_pred > 0.5)
# accuracy_score expects (y_true, y_pred)
print("Accuracy", accuracy_score(output_array, y_pred))

Accuracy 0.5


In [None]:
y_pred

array([[False],
       [False],
       [ True],
       [ True]])

##### We see that after 100 epochs the model's accuracy is only 50%, which means it misclassifies half of the points. XOR is not linearly separable, so a single perceptron cannot segregate the two classes.
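The limitation is not a matter of training longer. Thresholding a sigmoid at 0.5 is the same as checking whether the weighted sum `w1*x1 + w2*x2 + b` is positive, so the single unit can only draw a straight line through the input plane. The sketch below brute-forces a small, arbitrarily chosen grid of weights and biases to illustrate this; the function name and grid are assumptions made for this example, and a grid search is an illustration rather than a proof.

```python
import itertools
import numpy as np

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
or_labels = np.array([0, 1, 1, 1])
xor_labels = np.array([0, 1, 1, 0])

def best_linear_accuracy(labels, grid):
    """Best accuracy of any unit 'w1*x1 + w2*x2 + b > 0' with parameters on the grid."""
    best = 0.0
    for w1, w2, b in itertools.product(grid, repeat=3):
        pred = (x @ np.array([w1, w2]) + b > 0).astype(int)
        best = max(best, float(np.mean(pred == labels)))
    return best

grid = np.linspace(-2.0, 2.0, 9)   # -2.0, -1.5, ..., 2.0 (illustrative choice)

best_or = best_linear_accuracy(or_labels, grid)
best_xor = best_linear_accuracy(xor_labels, grid)
print(best_or, best_xor)  # 1.0 0.75
```

Some line separates the OR classes perfectly, while for XOR the best any line on this grid manages is three points out of four.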

# MultiLayer Perceptron (XOR Function)

In [None]:
input_array = np.array([[0,0],[0,1],[1,0],[1,1]])
output_array = np.array([0,1,1,0])

In [None]:
classifier = Sequential()
# Hidden layer: 10 ReLU units supply the non-linearity that XOR requires
classifier.add(Dense(input_shape=(2,), units=10, activation='relu'))
# Output layer: one sigmoid unit gives the probability of class 1
classifier.add(Dense(units=1, activation='sigmoid'))
# bias term is added by default
classifier.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_2 (Dense)             (None, 10)                30        
                                                                 
 dense_3 (Dense)             (None, 1)                 11        
                                                                 
Total params: 41 (164.00 Byte)
Trainable params: 41 (164.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [None]:
classifier.compile(optimizer=SGD(learning_rate=0.5), loss='binary_crossentropy', metrics=['accuracy'])

In [None]:
classifier.fit(input_array, output_array, epochs=100)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

<keras.src.callbacks.History at 0x7ef40ce4cdf0>

In [None]:
y_pred = classifier.predict(input_array)
y_pred = (y_pred > 0.5)
# accuracy_score expects (y_true, y_pred)
print("Accuracy", accuracy_score(output_array, y_pred))

Accuracy 1.0


# Conclusion

1. Perceptron Model for OR Data: This section shows a perceptron model trained on OR data (inputs producing a logical OR output). The model achieves 100% accuracy, indicating it successfully learned the OR operation.

2. Perceptron Network for XOR Data: This part uses the same single-neuron perceptron on XOR data. The model only reaches 50% accuracy, because a single perceptron can only draw a linear decision boundary and XOR is not linearly separable.

3. MultiLayer Perceptron for XOR Data: The final section uses a MultiLayer Perceptron with two layers (a 10-unit ReLU hidden layer followed by a sigmoid output unit) to model XOR data. This approach successfully learns the XOR operation, reaching 100% accuracy, demonstrating the capability of multi-layer networks to solve problems that single-layer perceptrons cannot.

The key difference observed is the inability of a single-layer perceptron to solve XOR problems and the effectiveness of multi-layer perceptrons in handling more complex, non-linearly separable data.
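The role of the hidden layer can also be checked by hand. The sketch below builds a tiny two-layer network with just two ReLU hidden units and hand-picked weights that compute XOR exactly. These weights are hypothetical and chosen for readability; they are not the ones Keras learned, and the sigmoid on the output is left out so the arithmetic stays visible.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
xor_labels = np.array([0, 1, 1, 0])

# Hypothetical hand-picked parameters (not learned)
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])        # shape (2 inputs, 2 hidden units)
b1 = np.array([0.0, -1.0])
W2 = np.array([[1.0], [-2.0]])     # shape (2 hidden units, 1 output)
b2 = np.array([0.0])

hidden = relu(x @ W1 + b1)             # h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1)
score = (hidden @ W2 + b2).ravel()     # h1 - 2*h2 gives 0, 1, 1, 0
mlp_pred = (score > 0.5).astype(int)
print(mlp_pred)  # [0 1 1 0]
```

The second hidden unit only activates for the input (1, 1) and cancels the first one there, which is exactly the bend in the decision boundary that a single linear unit cannot produce.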