# Getting Started: Training a Neural Network to Act as an XOR Gate

This is a minimal example of how to use common data science libraries in Red Hat OpenShift AI (RHOAI) to train an AI model.

## Data Loading

Read the data from a CSV file by using Pandas.
Pandas loads the data into a `DataFrame` object:

In [None]:
import pandas as pd

data = pd.read_csv("xor.csv")

Inspect the data.

The XOR gate receives 2 input values and returns one output.
The gate returns `1` (or `true`) when the 2 input values are different.
Otherwise, the gate returns `0` (or `false`).

The first, unnamed column is the `DataFrame` index, which you can ignore in this example.

In [None]:
data
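
For reference, the four cases can be reproduced with Python's bitwise XOR operator (`^`). This is only a sketch: it assumes that `xor.csv` holds the standard XOR truth table, and the `truth_table` name is introduced here for illustration.

```python
from itertools import product

# Enumerate every combination of two binary inputs and compute a XOR b.
# Assumption: xor.csv contains these same four cases.
truth_table = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)]

for a, b, out in truth_table:
    print(f"{a} XOR {b} = {out}")
```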

## Data Inspection

After loading the data, you can explore it, for example, by computing summary statistics:

In [None]:
data.describe()

You can also create data visualizations with libraries such as `matplotlib` and `seaborn` to discover patterns and correlations:

In [None]:
%pip install seaborn==0.13.0

import seaborn as sns
import matplotlib.pyplot as plt

# Set graph style
sns.set(style="whitegrid")

# Create the plot
plt.figure(figsize=(4, 4))
plot = sns.scatterplot(
    x='Input 1',
    y='Input 2',
    hue='Output',
    data=data,
    palette={0: "red", 1: "green"},
    s=200
)
plt.xticks([0, 1])
plt.yticks([0, 1])
plt.title('XOR Gate Cases')
plt.show()

## Data Preparation

Typically, you must preprocess, clean, and normalize the data so that it is in a format suitable for the library and the model that you are training.
You must also split the data into input and output data.

In most cases, you would also split the data into training, validation, and test subsets, so that you can evaluate the performance of your model on data that it has not seen during training.
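
As a rough illustration of such a split, the following sketch shuffles row indices and cuts them into three groups. The `split_indices` helper and the 70/15/15 proportions are assumptions made for this example, not part of any library or of this exercise.

```python
import random


def split_indices(n_rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle row indices and cut them into train, validation, and test lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = round(n_rows * train_frac)
    n_val = round(n_rows * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains
    return train, val, test


# Hypothetical dataset of 100 rows
train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

With a pandas `DataFrame`, each index list could then be passed to `iloc`, for example `data.iloc[train_idx]`. The XOR dataset has too few rows for such a split to be meaningful.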

In this case, for the sake of simplicity, just split the data into inputs and output.

To select the inputs, select all rows (`:`) and the first two columns (`:2`):

In [None]:
inputs = data.iloc[:, :2]
inputs

To select the output, select all rows (`:`) and the last column (`-1:`):

In [None]:
output = data.iloc[:, -1:]
output
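
The slice `-1:` is used instead of the plain position `-1` because the two forms return different types. The following sketch shows the difference on a small stand-in `DataFrame`; the column names match the ones used in the plot above, but the rows are an assumption about what `xor.csv` contains.

```python
import pandas as pd

# Hypothetical stand-in for the contents of xor.csv
sample = pd.DataFrame({
    "Input 1": [0, 0, 1, 1],
    "Input 2": [0, 1, 0, 1],
    "Output": [0, 1, 1, 0],
})

as_frame = sample.iloc[:, -1:]  # slice: keeps a two-dimensional DataFrame
as_series = sample.iloc[:, -1]  # single position: returns a one-dimensional Series

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
```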

## Training

After your data is clean and ready, you can create and train your model.

This example trains a simple neural network with the `tensorflow` and `keras` libraries.

In [None]:
from keras.layers import Dense
from keras.models import Sequential

# Define a sequential neural network
model = Sequential([
    Dense(32, input_dim=2, activation="relu"),
    Dense(1, activation="sigmoid")
])

# Compile the model
model.compile(loss="mean_squared_error", optimizer='rmsprop', metrics=['accuracy'])

# Train the model in 10-epoch increments until it reaches 100% accuracy. Because the training data
# covers every possible input of the XOR gate, the model can reach 100% accuracy on it.
epochs = 0
while model.fit(inputs, output, epochs=epochs+10, initial_epoch=epochs).history['accuracy'][-1] != 1.0:
    epochs += 10

The model is now trained.
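
To build intuition for why the model includes a hidden layer: XOR is not linearly separable, so a single output neuron cannot represent it, but one small ReLU hidden layer can. The following sketch uses two hidden units with hand-picked weights and a linear output. It is not the network that Keras trained above, which has 32 hidden units and a sigmoid output; the `tiny_xor_network` function is introduced here for illustration only.

```python
def relu(x):
    """Rectified linear unit: pass positive values, clamp the rest to zero."""
    return max(0.0, x)


def tiny_xor_network(x1, x2):
    # Hidden layer with two ReLU units (weights picked by hand, not learned)
    h1 = relu(x1 + x2)        # counts how many inputs are on
    h2 = relu(x1 + x2 - 1.0)  # is nonzero only when both inputs are on
    # Linear output: subtracting 2 * h2 cancels the (1, 1) case
    return h1 - 2.0 * h2


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, tiny_xor_network(x1, x2))
```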

## Evaluation

After training, evaluate your model.
Typically, you should use a dedicated test subset to evaluate the model, but this simple case reuses the training data for testing.

In [None]:
loss, accuracy = model.evaluate(inputs, output)
print("Model accuracy:", accuracy)
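
To make the two numbers returned by `model.evaluate` more concrete, the following sketch computes a mean squared error and an accuracy by hand. The probabilities are made up for illustration and are not real model outputs. The sketch also assumes that accuracy is computed by thresholding the sigmoid output at 0.5, which is the usual behavior for a single-output binary model.

```python
# Made-up sigmoid outputs for the four XOR cases (not real model results)
targets = [0, 1, 1, 0]
probabilities = [0.1, 0.9, 0.8, 0.2]

# Mean squared error: average of the squared differences
mse = sum((p - t) ** 2 for p, t in zip(probabilities, targets)) / len(targets)

# Accuracy: fraction of cases where the thresholded output matches the target
predicted = [1 if p > 0.5 else 0 for p in probabilities]
accuracy_by_hand = sum(int(p == t) for p, t in zip(predicted, targets)) / len(targets)

print(f"MSE: {mse:.3f}, accuracy: {accuracy_by_hand:.2f}")
```

Note that the accuracy can reach 100% while the loss is still above zero, because accuracy only checks which side of the threshold each output falls on.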

Finally, you can also feed inputs into the trained model and verify the result.

In [None]:
predictions = model.predict(inputs)
expected_values = output.to_numpy()
rounded_predictions = [int(round(p[0])) for p in predictions]

# Print the expected and predicted values side by side
print("Predictions:")
print("Expected\tPredicted")
print("-------------------------")
for expected_val, predicted_val in zip(expected_values, rounded_predictions):
    print(f"{expected_val[0]}\t\t{predicted_val}")
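
Rounding works here because the trained outputs sit close to 0 or 1. If you prefer the decision rule to be explicit, a threshold function is one alternative. The sketch below is an assumption-labeled example: `to_label` is a hypothetical helper, and it differs from Python's built-in `round` only for exact ties, which `round` resolves to the nearest even integer.

```python
def to_label(probability, threshold=0.5):
    """Map a probability to 1 when it is strictly above the threshold, else 0."""
    return 1 if probability > threshold else 0


# Built-in round() sends exact ties to the nearest even integer
print(round(0.5), round(1.5))

# The explicit threshold treats a tie as class 0 and anything above it as class 1
print(to_label(0.5), to_label(0.51))
```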