# *Lab 3 Manual: Basics of Neural Networks with TensorFlow*

# **Library Imports and Data Cleaning**

**pandas:** the first library we learned, used for loading and handling datasets.

**numpy:** helps with math operations and working with arrays.

**tensorflow / keras:** used to build and train our neural network model.

**train_test_split:** splits the data into training and testing sets.

**StandardScaler:** prepares the data by scaling the features for better model performance.

In [17]:
#arsenal

In [18]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import Sequential, Input
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.metrics import Precision, Recall
import seaborn as sns

The next step is to load and prepare the data. Refer to Lab Manual 1 for more information about data cleaning.

In [19]:
df = pd.read_csv('/kaggle/input/fantasy-football/cleaned_merged_seasons.csv', low_memory=False)

# Drop irrelevant columns
df = df.drop(columns=['team_x','team_a_score','team_h_score','round','kickoff_time','threat','creativity','influence'])

# Encode position and the home/away flag as integers
df['position'] = df['position'].map({'GK': 0,'GKP': 0, 'DEF': 1, 'MID': 2, 'FWD' : 3})
df['was_home'] = df['was_home'].astype(int)
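
[Extra] map replaces any label that is missing from the dictionary with NaN, so it can be worth checking for unmapped values after encoding. The sketch below uses a small made-up Series to show the behaviour. The 'AM' label is hypothetical and is not claimed to appear in the dataset.

```python
import pandas as pd

# Toy position labels; 'AM' is a hypothetical label that is not in the mapping
toy_positions = pd.Series(['GK', 'GKP', 'DEF', 'MID', 'FWD', 'AM'])
position_codes = {'GK': 0, 'GKP': 0, 'DEF': 1, 'MID': 2, 'FWD': 3}

encoded = toy_positions.map(position_codes)

# Labels missing from the dictionary become NaN
n_unmapped = int(encoded.isna().sum())
print(encoded)
print("Unmapped labels:", n_unmapped)
```

If the count is greater than zero, the mapping dictionary probably needs another entry before the column is used as a feature.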

In [20]:
df.head()

Unnamed: 0,season_x,name,position,assists,bonus,bps,clean_sheets,element,fixture,goals_conceded,...,saves,selected,total_points,transfers_balance,transfers_in,transfers_out,value,was_home,yellow_cards,GW
0,2016-17,Aaron Cresswell,1,0,0,0,0,454,10,0,...,0,14023,0,0,0,0,55,0,0,1
1,2016-17,Aaron Lennon,2,0,0,6,0,142,3,0,...,0,13918,1,0,0,0,60,1,0,1
2,2016-17,Aaron Ramsey,2,0,0,5,0,16,8,3,...,0,163170,2,0,0,0,80,1,0,1
3,2016-17,Abdoulaye Doucouré,2,0,0,0,0,482,7,0,...,0,1051,0,0,0,0,50,0,0,1
4,2016-17,Adam Forshaw,2,0,0,3,0,286,6,1,...,0,2723,1,0,0,0,45,1,1,1


In [21]:

df['was_home'].value_counts()



was_home
0    48105
1    48064
Name: count, dtype: int64

In [22]:
df['position'].value_counts()

position
2    39163
1    33683
3    12669
0    10654
Name: count, dtype: int64

Let's take a look at the data after applying the preprocessing steps:

In [23]:
df.head(10)

Unnamed: 0,season_x,name,position,assists,bonus,bps,clean_sheets,element,fixture,goals_conceded,...,saves,selected,total_points,transfers_balance,transfers_in,transfers_out,value,was_home,yellow_cards,GW
0,2016-17,Aaron Cresswell,1,0,0,0,0,454,10,0,...,0,14023,0,0,0,0,55,0,0,1
1,2016-17,Aaron Lennon,2,0,0,6,0,142,3,0,...,0,13918,1,0,0,0,60,1,0,1
2,2016-17,Aaron Ramsey,2,0,0,5,0,16,8,3,...,0,163170,2,0,0,0,80,1,0,1
3,2016-17,Abdoulaye Doucouré,2,0,0,0,0,482,7,0,...,0,1051,0,0,0,0,50,0,0,1
4,2016-17,Adam Forshaw,2,0,0,3,0,286,6,1,...,0,2723,1,0,0,0,45,1,1,1
5,2016-17,Adam Lallana,2,1,2,33,0,205,8,3,...,0,155525,11,0,0,0,70,0,1,1
6,2016-17,Adam Smith,1,0,0,23,0,34,9,3,...,0,21505,7,0,0,0,45,1,0,1
7,2016-17,Adrián San Miguel del Castillo,0,0,0,16,0,450,10,2,...,4,94480,2,0,0,0,50,0,0,1
8,2016-17,Alex Iwobi,2,1,0,12,0,21,8,3,...,0,48146,3,0,0,0,60,1,1,1
9,2016-17,Alex McCarthy,0,0,0,0,0,101,7,0,...,0,8821,0,0,0,0,45,1,0,1


In [36]:
ORDER_COL = "GW" if "GW" in df.columns else "round"

def _form_last4(g: pd.DataFrame) -> pd.Series:
    ordered = g.sort_values(ORDER_COL)
    feat = ordered["total_points"].shift(1).rolling(window=4, min_periods=1).mean() / 10.0
    return feat.reindex(g.index)  # align back to the group's original order

df["form"] = (
    df.groupby(["season_x", "element"], sort=False, group_keys=False)
      .apply(_form_last4)
)
first_row_mask = df.groupby(["season_x","element"]).cumcount() == 0
df.loc[first_row_mask, "form"] = 0.0
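
[Extra] The form feature is a player's average score over up to four previous gameweeks, divided by 10. shift(1) moves each score down one row so that a gameweek only sees earlier results, and rolling(window=4, min_periods=1) averages whatever previous scores are available. Here is a minimal sketch for a single made-up player (the point values are invented for illustration):

```python
import pandas as pd

# One hypothetical player over six gameweeks (values are made up)
toy = pd.DataFrame({
    "GW": [1, 2, 3, 4, 5, 6],
    "total_points": [2, 6, 1, 9, 4, 8],
})

# Scores from earlier gameweeks only: the current row never sees its own score
previous_points = toy["total_points"].shift(1)

# Mean of up to four previous scores, scaled down by 10
toy["form"] = previous_points.rolling(window=4, min_periods=1).mean() / 10.0

# The first gameweek has no history, so its form is set to 0
toy["form"] = toy["form"].fillna(0.0)
print(toy)
```

In gameweek 5, for example, the form is the mean of the scores from gameweeks 1 to 4 (2, 6, 1, 9), which is 4.5, divided by 10.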




In [37]:
df.head(1000)

Unnamed: 0,season_x,name,position,assists,bonus,bps,clean_sheets,element,fixture,goals_conceded,...,selected,total_points,transfers_balance,transfers_in,transfers_out,value,was_home,yellow_cards,GW,form
0,2016-17,Aaron Cresswell,1,0,0,0,0,454,10,0,...,14023,0,0,0,0,55,0,0,1,0.000
1,2016-17,Aaron Lennon,2,0,0,6,0,142,3,0,...,13918,1,0,0,0,60,1,0,1,0.000
2,2016-17,Aaron Ramsey,2,0,0,5,0,16,8,3,...,163170,2,0,0,0,80,1,0,1,0.000
3,2016-17,Abdoulaye Doucouré,2,0,0,0,0,482,7,0,...,1051,0,0,0,0,50,0,0,1,0.000
4,2016-17,Adam Forshaw,2,0,0,3,0,286,6,1,...,2723,1,0,0,0,45,1,1,1,0.000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,2016-17,Matthew Lowton,1,0,0,13,0,55,122,2,...,34408,0,-956,541,1497,45,1,1,13,0.225
996,2016-17,Matt Phillips,2,1,0,15,0,446,124,1,...,125060,5,109844,111476,1632,52,0,0,13,0.700
997,2016-17,Matt Targett,1,0,0,0,0,299,128,0,...,15056,0,-678,14,692,43,1,0,13,0.000
998,2016-17,Mesut Özil,2,0,0,10,0,14,121,1,...,372585,2,-6773,9506,16279,96,1,0,13,0.375




In [25]:
df.head(5)[["season_x","element","name","GW","total_points"]]


Unnamed: 0,season_x,element,name,GW,total_points
0,2016-17,454,Aaron Cresswell,1,0
1,2016-17,142,Aaron Lennon,1,1
2,2016-17,16,Aaron Ramsey,1,2
3,2016-17,482,Abdoulaye Doucouré,1,0
4,2016-17,286,Adam Forshaw,1,1


# **Defining Features and Target** 
Before training a neural network model, we need to extract the features and the target (ground truth) from the dataset, depending on the task we want the model to learn.

**Features (X):** We select the columns Pclass, Sex, Age, Fare, SibSp, and Parch from the dataset. These variables will be used as inputs to the model.

**Target (y):** We choose the Survived column as the target variable, which represents the outcome we want the model to predict.

In [26]:
# Feature and target
X = df[['Pclass', 'Sex', 'Age', 'Fare', 'SibSp', 'Parch']]
y = df['Survived']

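KeyError: "None of [Index(['Pclass', 'Sex', 'Age', 'Fare', 'SibSp', 'Parch'], dtype='object')] are in the [columns]"

This cell raises a KeyError because Pclass, Sex, Age, Fare, SibSp, and Parch are Titanic columns, while df currently holds the fantasy football data loaded above. The remaining cells assume that df contains the Titanic columns, so load that dataset into df before running them.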

In [None]:
# Show the first 5 rows of features
print("Features (X):")
print(X.head())

# Show the first 5 rows of target values
print("\nTarget (y):")
print(y.head())

# Show the dimensions of both the features and the target
print("X and y dimensions:", X.shape, y.shape)

# **Features Scaling**

We usually scale features before training machine learning models because raw features can have very different ranges and units, which may cause problems.

For example, in the Titanic dataset:

* Age ranges from about 0.1–80 years

* Fare ranges from about 0–512 pounds

Without scaling, algorithms that rely on distances or weights may treat Fare as more important than Age simply because the numbers are larger.

Standardization solves this by rescaling each feature to have a mean of 0 and a standard deviation of 1. Only the features are scaled; the target labels are left unchanged.
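
[Extra] For each feature, standardization computes z = (x - mean) / std. The sketch below applies that formula with NumPy to a few made-up rows (the Age and Fare values are invented for illustration). StandardScaler uses the same population standard deviation, which is NumPy's default.

```python
import numpy as np

# Made-up rows with two features: [Age, Fare]
X_toy = np.array([
    [22.0, 7.25],
    [38.0, 71.28],
    [26.0, 7.93],
    [35.0, 53.10],
])

# Column-wise mean and population standard deviation (ddof=0)
mean = X_toy.mean(axis=0)
std = X_toy.std(axis=0)

# z = (x - mean) / std, applied to every column
X_toy_scaled = (X_toy - mean) / std

print("Mean per feature after scaling:", X_toy_scaled.mean(axis=0).round(6))
print("Std per feature after scaling: ", X_toy_scaled.std(axis=0).round(6))
```

After scaling, both columns are on the same scale, so neither dominates just because its raw numbers are larger.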

In [None]:
# Scale the features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Print first 5 rows of scaled features
print("Scaled Features (first 5 rows):")
print(X_scaled[:5])

print("Shape of 1 row:")
print(X_scaled[0].shape)

In [None]:
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

In [None]:
# Explore the size and dimensions of the train and test data after splitting
print("Train data shape:", X_train.shape, "Train label shape:", y_train.shape)
print("Test data shape:", X_test.shape, "Test label shape:", y_test.shape)
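
[Extra] The cells above scale all rows before splitting, so the test rows also contribute to the mean and standard deviation. A common alternative is to split first and compute the scaling statistics from the training rows only. The sketch below illustrates that idea with NumPy on randomly generated data. The array names and the random values are assumptions for illustration, not part of the lab's dataset.

```python
import numpy as np

rng = np.random.default_rng(42)

# 100 made-up rows with two features on very different scales
X_all = rng.normal(loc=[30.0, 35.0], scale=[14.0, 50.0], size=(100, 2))

# Shuffle the row indices and hold out 20% of them for testing
indices = rng.permutation(len(X_all))
n_test = int(0.2 * len(X_all))
test_idx, train_idx = indices[:n_test], indices[n_test:]
X_train_raw, X_test_raw = X_all[train_idx], X_all[test_idx]

# Scaling statistics come from the training rows only
train_mean = X_train_raw.mean(axis=0)
train_std = X_train_raw.std(axis=0)

# The same statistics are reused for the test rows
X_train_s = (X_train_raw - train_mean) / train_std
X_test_s = (X_test_raw - train_mean) / train_std

print("Train:", X_train_s.shape, "Test:", X_test_s.shape)
```

With scikit-learn, the equivalent pattern is to call fit_transform on the training features and transform on the test features.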

# **Build a Feedforward Neural Network model**

**Input Layer:** Input(shape=(X_train.shape[1],)) creates the input layer with one neuron for each feature in the training data.

**First Hidden Layer:** A dense (fully connected) layer with 16 neurons. Each neuron learns from all the inputs of the previous layer.

**ReLU activation:** Outputs the input value if it’s positive, or 0 if it’s negative.

**Fully Connected Network:** Every neuron in one layer connects to all neurons in the next, each with its own weight, enabling the model to learn complex feature relationships.
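
[Extra] A dense layer computes activation(x @ W + b). The sketch below runs one forward pass through layers of size 16, 8, and 1 using NumPy with small random weights, assuming 6 input features. It is only meant to show the shapes and the effect of ReLU and sigmoid; a trained Keras model would use learned weights instead.

```python
import numpy as np

def relu(z):
    # Keeps positive values and replaces negative values with 0
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# One made-up sample with 6 features (assumed input size)
x = rng.normal(size=(1, 6))

# Small random weights and zero biases for each layer
W1, b1 = 0.1 * rng.normal(size=(6, 16)), np.zeros(16)
W2, b2 = 0.1 * rng.normal(size=(16, 8)), np.zeros(8)
W3, b3 = 0.1 * rng.normal(size=(8, 1)), np.zeros(1)

h1 = relu(x @ W1 + b1)      # first hidden layer
h2 = relu(h1 @ W2 + b2)     # second hidden layer
p = sigmoid(h2 @ W3 + b3)   # output: a value between 0 and 1

print(h1.shape, h2.shape, p.shape)
```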

In [None]:
model = Sequential([
    Input(shape=(X_train.shape[1],)),
    layers.Dense(16, activation='relu'),
    layers.Dense(8, activation='relu'),
    layers.Dense(1, activation='sigmoid')  # binary classification
])

model.summary() displays a layer-by-layer summary of the neural network, including the number of parameters and how data flows through the model.

The summary shows that the model has a total of 257 parameters, all of which are trainable, meaning they will be updated during training. There are no non-trainable parameters, so every weight and bias in the network is learned from the data.

In [None]:
model.summary()

[Extra] If you are curious how our model reached 257 parameters, here's a breakdown:

| Connection                   | Weights        | Biases | Total Params |
| ---------------------------- | -------------- | ------ | ------------ |
| Input (6) → Dense(16, ReLU)  | 6 × 16 = 96    | 16     | 112          |
| Dense(16) → Dense(8, ReLU)   | 16 × 8 = 128   | 8      | 136          |
| Dense(8) → Dense(1, Sigmoid) | 8 × 1 = 8      | 1      | 9            |
| **Total**                    | 232            | 25     | **257**      |
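
The same count can be reproduced in a few lines of plain Python. This is a small sketch that assumes the layer sizes 6, 16, 8, and 1; each dense layer has inputs × outputs weights plus one bias per output.

```python
# Number of units per layer: input, two hidden layers, output (assumed sizes)
layer_sizes = [6, 16, 8, 1]

per_layer = []
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    weights = n_in * n_out   # one weight per connection
    biases = n_out           # one bias per neuron in the layer
    per_layer.append(weights + biases)

total_params = sum(per_layer)
print("Parameters per layer:", per_layer)
print("Total parameters:", total_params)
```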



# **Compiling the Model**

Before training, we need to compile the model by specifying how it will learn:

Optimizer = 'adam': An efficient variant of gradient descent that adapts the learning rate during training.

Loss = 'binary_crossentropy': We chose this because our Titanic task has only two possible outputs, either **survived** or **not survived**. This loss function is designed for binary classification problems.

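Accuracy, Precision, and Recall: evaluation metrics computed from the model's predictions. Accuracy is the fraction of all predictions that match the true labels, precision is the fraction of predicted positives that are truly positive, and recall is the fraction of actual positives that the model identifies.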
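
[Extra] To make the loss and the metrics concrete, the sketch below computes them by hand with NumPy on eight made-up labels and predicted probabilities (all values are invented for illustration). It assumes a decision threshold of 0.5, which is the default for the Keras Precision and Recall metrics.

```python
import numpy as np

# Made-up true labels and predicted probabilities
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.6, 0.3, 0.7, 0.1, 0.8, 0.6])

# Binary cross-entropy: average penalty for probability placed on the wrong class
bce = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Turn probabilities into class labels with a 0.5 threshold
y_pred = (y_prob > 0.5).astype(int)

tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives
tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # true negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print("Loss:", round(float(bce), 4))
print("Accuracy:", accuracy, "Precision:", precision, "Recall:", recall)
```

Here the three metrics differ (0.625, 0.6, and 0.75), which shows why it is useful to track more than accuracy alone.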

In [None]:
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy',
                       Precision(name='precision'),
                       Recall(name='recall')])

# **Train the Neural Network Model**

**Validation Split:** Part of the training data is set aside (e.g., 80% train, 20% val) to monitor learning.

**Epoch:** One full pass through the training data. More epochs give the model more chances to learn, but too many can lead to overfitting.

**Loss** measures how far predictions are from true labels. We want this value to be as small as possible, ideally approaching 0.

**Accuracy** is the percentage of correct predictions. Values range from 0 to 1 (or 0% to 100%), and higher is better.

**Precision** and **recall** follow the same convention: values range from 0 to 1 (or 0% to 100%), and higher is better.
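
[Extra] According to the Keras documentation, validation_split takes the validation samples from the end of the arrays passed to fit, before any shuffling. The sketch below is a rough NumPy equivalent on ten made-up rows. It is an illustration of the idea, not the Keras implementation.

```python
import numpy as np

# Ten made-up rows, numbered 0 to 9 so the split is easy to see
X_demo = np.arange(10).reshape(10, 1)

validation_split = 0.2
split_at = int(round(len(X_demo) * (1 - validation_split)))

# The first 80% is used for fitting, the last 20% for validation
X_fit, X_val = X_demo[:split_at], X_demo[split_at:]

print("Rows used for fitting:   ", X_fit.ravel())
print("Rows used for validation:", X_val.ravel())
```

Because the validation rows come from the end of the data, the data should be shuffled beforehand. train_test_split shuffles by default, so that holds for the split used in this lab.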

In [None]:
history = model.fit(X_train, y_train, epochs=20, validation_split=0.2, verbose=1)

To avoid overfitting when training with many epochs, we use **early stopping**, which automatically stops training once the validation performance stops improving.

In [None]:
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_accuracy', patience=3, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=50,
                    validation_split=0.2,
                    verbose=1,
                    callbacks=[early_stop])
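
[Extra] The patience rule can be sketched in plain Python: training stops once the monitored value has failed to improve for patience epochs in a row. The function name and the validation accuracies below are made up for illustration, and the sketch ignores other EarlyStopping options such as min_delta.

```python
def find_stopping_epoch(val_scores, patience=3):
    """Return (epoch where training stops, epoch with the best score)."""
    best_score = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, score in enumerate(val_scores):
        if score > best_score:
            # New best value: remember it and reset the counter
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    # Patience never ran out: training used every epoch
    return len(val_scores) - 1, best_epoch

# Made-up validation accuracies, one per epoch (epochs counted from 0)
val_accuracy = [0.70, 0.74, 0.78, 0.77, 0.78, 0.76, 0.79, 0.80]
stopped_at, best_epoch = find_stopping_epoch(val_accuracy, patience=3)
print("Stopped at epoch", stopped_at, "- best epoch was", best_epoch)
```

In this example training stops at epoch 5, so the later improvement at epochs 6 and 7 is never seen. A larger patience trades extra training time for a lower chance of stopping too early. With restore_best_weights=True, the model keeps the weights from the best epoch rather than the last one.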

Rather than feeding the entire dataset through the network in a single step, the training data is divided into smaller subsets called **batches**. The model processes one batch at a time and updates its weights after each batch.

In [None]:
history2 = model.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.2, verbose=1)

**Epochs**

One epoch = full pass through training data (all batches).

Too few: underfitting (model hasn’t learned enough).

Too many: overfitting + wasted compute.

Recommendation: start with moderate epochs (e.g., 30–100) and use early stopping.

**Batch Size**

Definition: the number of samples processed before each weight update.

Small (16–32): more updates per epoch, often better generalization, slower per epoch.

Large (128–256): fewer updates per epoch, faster per epoch, higher memory use, and a risk of weaker generalization.

Common defaults: 32 or 64.
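
[Extra] The number of weight updates in one epoch is the number of training samples divided by the batch size, rounded up. The sketch below uses a hypothetical count of 570 training samples. Note that Keras uses a batch size of 32 when batch_size is not passed to fit.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    # The last batch may be smaller, so round up
    return math.ceil(n_samples / batch_size)

n_train_samples = 570  # hypothetical number of training samples

for batch_size in (16, 32, 128):
    print("batch_size =", batch_size, "->",
          updates_per_epoch(n_train_samples, batch_size), "updates per epoch")
```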

# **Visualize the training progress.**

In [None]:
import matplotlib.pyplot as plt

In [None]:
# Plot accuracy, precision, and recall from training history
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Val Accuracy')

plt.plot(history.history['precision'], label='Train Precision')
plt.plot(history.history['val_precision'], label='Val Precision')

plt.plot(history.history['recall'], label='Train Recall')
plt.plot(history.history['val_recall'], label='Val Recall')

plt.xlabel('Epoch')
plt.ylabel('Score')
plt.title('Training Progress (Accuracy, Precision, Recall)')
plt.legend()
plt.show()


# **Evaluate on the test set.**

In [None]:
test_loss, test_accuracy, test_precision, test_recall = model.evaluate(X_test, y_test, verbose=0)

print("Test Loss:", test_loss)
print("Test Accuracy:", test_accuracy)
print("Test Precision:", test_precision)
print("Test Recall:", test_recall)