
In [1]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam, SGD

# XOR Dataset

OR and AND are linearly separable, so a single linear model (for example, logistic regression) can solve them. XOR, however, cannot be separated by a single straight line, so a multilayer perceptron (MLP) is used to solve it.

![](https://i.imgur.com/llFchxI.png)
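
As a rough sanity check of the claim above, the sketch below tries a coarse grid of weights and biases for a single linear threshold unit and reports whether any combination reproduces each truth table. The grid range and the helper name `separable_on_grid` are my own choices for illustration, and a grid search is only a sanity check, not a proof.

```python
import itertools
import numpy as np

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
targets = {
    "AND": np.array([0, 0, 0, 1]),
    "OR": np.array([0, 1, 1, 1]),
    "XOR": np.array([0, 1, 1, 0]),
}

# Candidate values for w1, w2 and b: -2.0, -1.5, ..., 2.0 (assumed range)
grid = np.linspace(-2.0, 2.0, 9)

def separable_on_grid(y):
    """Return True if some (w1, w2, b) on the grid classifies all four points."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        pred = (x @ np.array([w1, w2]) + b > 0).astype(int)
        if np.array_equal(pred, y):
            return True
    return False

results = {name: separable_on_grid(y) for name, y in targets.items()}
print(results)  # {'AND': True, 'OR': True, 'XOR': False}
```

AND and OR are each matched by at least one line on the grid (for example, weights of 1 with a bias of -1.5 or -0.5), while no line matches XOR.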

In [2]:
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)

# XOR Binary Logistic Regression

I first tried to solve this problem with binary logistic regression to see why and how it fails on the XOR problem.

*verbose=0: This setting means that no output will be displayed during the training process. It runs silently without showing any progress or logs on the console*

In [4]:
model = Sequential([
  Dense(1, activation='sigmoid')
])

model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.1))

model.fit(x_data, y_data, epochs=1000, verbose=0)

<keras.src.callbacks.History at 0x7e62e8f8c280>

The model's predictions are poor. They should be close to 0, 1, 1, 0, but all four values are close to 0.5.

In [5]:
y_pred = model.predict(x_data)

print(y_pred)

[[0.5012214 ]
 [0.5006556 ]
 [0.4997278 ]
 [0.49916196]]
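
To see why the predictions settle near 0.5, here is a minimal NumPy sketch of the same idea: plain gradient descent on logistic regression for the XOR data. It is not the Keras model above. The starting weights, learning rate, and step count are arbitrary assumptions chosen for illustration.

```python
import numpy as np

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float64)
y = np.array([0, 1, 1, 0], dtype=np.float64)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary non-zero starting point (assumption for this sketch)
w = np.array([1.0, -1.0])
b = 0.5
lr = 0.5

for _ in range(5000):
    p = sigmoid(x @ w + b)
    grad_w = x.T @ (p - y) / len(y)  # gradient of mean binary cross-entropy w.r.t. w
    grad_b = np.mean(p - y)          # gradient w.r.t. the bias
    w -= lr * grad_w
    b -= lr * grad_b

pred = sigmoid(x @ w + b)
print(np.round(pred, 3))
```

With a single linear boundary, the loss is lowest when the weights and bias shrink toward zero, so every input ends up with a prediction of about 0.5.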


# XOR Deep Learning (MLP)

I used ReLU as the activation function for a hidden layer with 8 neurons.

In [6]:
model = Sequential([
  Dense(8, activation='relu'),
  Dense(1, activation='sigmoid'),
])

model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.1))

model.fit(x_data, y_data, epochs=1000, verbose=0)

<keras.src.callbacks.History at 0x7e62e8e14100>

In [7]:
y_pred = model.predict(x_data)

print(y_pred)

[[0.03667725]
 [0.9864899 ]
 [0.98732024]
 [0.00765591]]
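
To illustrate why a hidden layer is enough, the sketch below builds XOR by hand with only two ReLU units instead of training. The weights are hand-picked for illustration and are not the ones the trained model above learned. The output here is a plain linear combination rather than a sigmoid.

```python
import numpy as np

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)

# Hand-picked hidden layer with two ReLU units (assumed weights):
#   h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1)
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])  # shape (inputs, hidden units)
b1 = np.array([0.0, -1.0])

# Linear output: h1 - 2 * h2
W2 = np.array([[1.0], [-2.0]])

hidden = np.maximum(0.0, x @ W1 + b1)  # ReLU
out = hidden @ W2
print(out.ravel())  # [0. 1. 1. 0.]
```

The second hidden unit only activates for the input (1, 1), which lets the output subtract it back down to 0. A single linear layer cannot apply this kind of correction.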


# Keras Functional API

So far, I have been using the Sequential API in Keras. While the Sequential API is convenient for designing straightforward, sequential models, in practice the Functional API is often preferred because it allows for more complex network architectures.

So I rewrote the XOR deep learning problem using the Functional API below.


Note that I imported Model and Input from the Keras library for the Functional API.

In [8]:
import numpy as np
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam, SGD

I used model.summary() to check the structure of the model.

The Sequential models above were defined without an input shape, so they are not built until they first see data, and their structure cannot be fully inspected before training. A Functional API model declares its input shape up front, so model.summary() can show the full structure right after the model is created.


1. Defining Input Layer:
```
input = Input(shape=(2,))
```
This creates an input layer for the network. It specifies that the input data will have two features (shape=(2,)).

* (2,) is the shape of a single sample, with the batch dimension omitted. It matches the data, where each row has two features: x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)


2. Defining Hidden Layer:
```
hidden = Dense(8, activation='relu')(input)
```
This creates a hidden layer with 8 neurons and a ReLU activation function. The input to this layer is the input layer defined earlier.

3. Defining Output Layer:
```
output = Dense(1, activation='sigmoid')(hidden)
```
This creates the output layer with 1 neuron and a sigmoid activation function. The input to this layer is the hidden layer.

* The sigmoid activation function is commonly used in the output layer of binary classification problems. It squashes the output values between 0 and 1, which can be interpreted as probabilities.
* In multi-class classification, you might use a softmax activation function in the output layer. This extends the concept of sigmoid to handle multiple classes by providing a probability distribution over all the classes.

4. Creating the Model:
```
model = Model(inputs=input, outputs=output)
```
This statement establishes the overall structure of the neural network. It specifies that the input is the input layer, and the output is the output layer.

5. Compiling the Model:
```
model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.1))
```
This configures the model for training. It specifies the loss function (binary_crossentropy for binary classification), and the optimizer (SGD with a learning rate of 0.1).

6. Inspecting the Model:
```
model.summary()
```
This command prints a summary of the model's architecture, showing the number of parameters in each layer and the overall structure.
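
The note in step 3 about sigmoid and softmax can be checked numerically. The sketch below uses plain NumPy rather than Keras, and the helper functions and example logits are made up for illustration. It shows that a two-class softmax over the logits [z, 0] gives the same probability as sigmoid(z), and that softmax outputs sum to 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    shifted = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()

z = 1.5  # example logit (arbitrary)
two_class = softmax(np.array([z, 0.0]))
print(sigmoid(z), two_class[0])  # the two values agree

# Three-class example: a probability distribution over all classes
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())
```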


In [9]:
input = Input(shape=(2,))
hidden = Dense(8, activation='relu')(input)
output = Dense(1, activation='sigmoid')(hidden)

model = Model(inputs=input, outputs=output)

model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.1))

model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 2)]               0         
                                                                 
 dense_4 (Dense)             (None, 8)                 24        
                                                                 
 dense_5 (Dense)             (None, 1)                 9         
                                                                 
Total params: 33 (132.00 Byte)
Trainable params: 33 (132.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
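
The parameter counts in the summary can be reproduced by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The helper name `dense_params` below is my own, used only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden_params = dense_params(2, 8)  # 2 * 8 weights + 8 biases
output_params = dense_params(8, 1)  # 8 * 1 weights + 1 bias
total_params = hidden_params + output_params
total_bytes = total_params * 4      # float32 uses 4 bytes per parameter

print(hidden_params, output_params, total_params, total_bytes)  # 24 9 33 132
```

These match the 24, 9, and 33 (132 bytes) reported by model.summary().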


Finally, call model.fit and then model.predict(x_data) to see how well y is predicted.

In [10]:
model.fit(x_data, y_data, epochs=1000, verbose=0)

y_pred = model.predict(x_data)

print(y_pred)

[[0.07183885]
 [0.98805887]
 [0.9872562 ]
 [0.00680454]]
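
The sigmoid outputs are probabilities, so one common way to turn them into class labels is to threshold at 0.5. The sketch below copies the values printed above into an array instead of calling the model, and the 0.5 cutoff is a conventional choice rather than something fixed by Keras.

```python
import numpy as np

# Values copied from the printed predictions above
y_pred = np.array([[0.07183885], [0.98805887], [0.9872562], [0.00680454]])
y_true = np.array([[0], [1], [1], [0]])

labels = (y_pred > 0.5).astype(int)          # probability -> class label
accuracy = float(np.mean(labels == y_true))  # fraction of correct labels

print(labels.ravel(), accuracy)  # [0 1 1 0] 1.0
```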
