<font>
<div dir=ltr align=center>
<img src='Sharif_logo.png' width=250 height=250> <br>
<font color=0F5298 size=7>
Applied Data Science<br>
<font color=2565AE size=5>
Spring 2025<br>
<font color=3C99D size=5>
HW9 - Neural Networks <br>
<font color=696880 size=4>
Ali Mohammadzade Shabestari - 401106482 - Computer Engineering



# 1. Import Libraries

In [92]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, r2_score
from sklearn.neural_network import MLPClassifier, MLPRegressor
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import Adam
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

# 2. Loading & Preprocessing Dataset

## 2. 1. Loading

In [63]:
df = pd.read_csv('abalone.csv')

df.head()

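|   | Sex | Length | Diameter | Height | Whole_weight | Shucked_weight | Viscera_weight | Shell_weight | Rings |
|---|-----|--------|----------|--------|--------------|----------------|----------------|--------------|-------|
| 0 | M | 0.455 | 0.365 | 0.095 | 0.514 | 0.2245 | 0.101 | 0.15 | 15 |
| 1 | M | 0.35 | 0.265 | 0.09 | 0.2255 | 0.0995 | 0.0485 | 0.07 | 7 |
| 2 | F | 0.53 | 0.42 | 0.135 | 0.677 | 0.2565 | 0.1415 | 0.21 | 9 |
| 3 | M | 0.44 | 0.365 | 0.125 | 0.516 | 0.2155 | 0.114 | 0.155 | 10 |
| 4 | I | 0.33 | 0.255 | 0.08 | 0.205 | 0.0895 | 0.0395 | 0.055 | 7 |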


In [64]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4177 entries, 0 to 4176
Data columns (total 9 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Sex             4177 non-null   object 
 1   Length          4177 non-null   float64
 2   Diameter        4177 non-null   float64
 3   Height          4177 non-null   float64
 4   Whole_weight    4177 non-null   float64
 5   Shucked_weight  4177 non-null   float64
 6   Viscera_weight  4177 non-null   float64
 7   Shell_weight    4177 non-null   float64
 8   Rings           4177 non-null   int64  
dtypes: float64(7), int64(1), object(1)
memory usage: 293.8+ KB


## 2. 2. Preprocessing

Encode column `Sex` (one-hot encoding).

In [65]:
# Step 1: Convert 'Sex' column to dummy variables 
sex_dummies = pd.get_dummies(df['Sex'], prefix='Sex').astype(int)

# Step 2: Drop the original 'Sex' column and add the new dummies
df = pd.concat([df.drop('Sex', axis=1), sex_dummies], axis=1)

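Define bins and labels to group `Rings` into age categories. With `right=False` the intervals are half-open, so a value of 10 falls in `[10, 15)`.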

In [66]:
bins = [0, 8, 10, 15, 30]  # Example bins
labels = ['Young', 'Adult', 'Mature', 'Old']  # Example categories

# Add a new column with categorized values
df['Age'] = pd.cut(df['Rings'], bins=bins, labels=labels, right=False)

df['Age'] = df['Age'].cat.codes  # Convert categories to numerical codes

# Add a binary category column based on 'Age'
df['Binary_Age'] = (df['Age'] > 1).astype(int)

# Display the updated dataframe
df.head()

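|   | Length | Diameter | Height | Whole_weight | Shucked_weight | Viscera_weight | Shell_weight | Rings | Sex_F | Sex_I | Sex_M | Age | Binary_Age |
|---|--------|----------|--------|--------------|----------------|----------------|--------------|-------|-------|-------|-------|-----|------------|
| 0 | 0.455 | 0.365 | 0.095 | 0.514 | 0.2245 | 0.101 | 0.15 | 15 | 0 | 0 | 1 | 3 | 1 |
| 1 | 0.35 | 0.265 | 0.09 | 0.2255 | 0.0995 | 0.0485 | 0.07 | 7 | 0 | 0 | 1 | 0 | 0 |
| 2 | 0.53 | 0.42 | 0.135 | 0.677 | 0.2565 | 0.1415 | 0.21 | 9 | 1 | 0 | 0 | 1 | 0 |
| 3 | 0.44 | 0.365 | 0.125 | 0.516 | 0.2155 | 0.114 | 0.155 | 10 | 0 | 0 | 1 | 2 | 1 |
| 4 | 0.33 | 0.255 | 0.08 | 0.205 | 0.0895 | 0.0395 | 0.055 | 7 | 0 | 1 | 0 | 0 | 0 |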
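
To make the half-open intervals concrete, here is a small sketch that reproduces the same coding with `np.digitize` instead of `pd.cut`. The helper name `age_code` is hypothetical, and the equivalence only holds for ring counts inside `[0, 30)`; values outside that range are handled differently by `pd.cut`.

```python
import numpy as np

# Same edges as above: [0, 8), [8, 10), [10, 15), [15, 30)
bins = [0, 8, 10, 15, 30]

def age_code(rings):
    """Return 0..3 (Young, Adult, Mature, Old) for ring counts inside [0, 30)."""
    # np.digitize returns i with bins[i-1] <= x < bins[i]; subtract 1 for a 0-based code.
    return np.digitize(rings, bins, right=False) - 1

rings = np.array([15, 7, 9, 10, 7])   # Rings values of the first five rows
age = age_code(rings)
binary_age = (age > 1).astype(int)    # 1 for Mature/Old, which means Rings >= 10
print(age, binary_age)
```

For the first five rows this gives the codes 3, 0, 1, 2, 0 and the binary labels 1, 0, 0, 1, 0, matching the `Age` and `Binary_Age` columns shown above.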


Split dataframe into X and y vectors.

In [67]:
X = df.drop(columns=['Rings', 'Age', 'Binary_Age'])
yr = df['Rings']
yc = df['Age']
yb = df['Binary_Age']

Standardize X values.

In [68]:
scaler = StandardScaler()
X = scaler.fit_transform(X)

## 2. 3. Split

In [69]:
Xr_train, Xr_test, yr_train, yr_test = train_test_split(X, yr, test_size=0.2, random_state=42)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(X, yc, test_size=0.2, random_state=42)
Xb_train, Xb_test, yb_train, yb_test = train_test_split(X, yb, test_size=0.2, random_state=42)

## 2. 4. Metric Function

These helper functions print the weighted F1 score (for classification) or the R² score (for regression) of a model and report whether it exceeds the desired threshold.

In [70]:
def print_f1(model, true, prediction, threshold):
    print(f"🚀 {model}")
    f1 = f1_score(true, prediction, average='weighted')
    print(f"F1 Score: {f1:.4f}")
    print(f"Threshold: {threshold}")
    print(f"Meets threshold: {f1 > threshold}")

In [71]:
def print_r2(model, true, prediction, threshold):
    print(f"🚀 {model}")
    r2 = r2_score(true, prediction)
    print(f"R2 Score: {r2:.4f}")
    print(f"Threshold: {threshold}")
    print(f"Meets threshold: {r2 > threshold}")
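
As a sanity check on what `average='weighted'` means, the sketch below recomputes the weighted F1 by hand in plain Python: one F1 per class, averaged with weights equal to each class's share of the true labels. The function name `weighted_f1` and the toy labels are made up for illustration.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n / len(y_true)   # weight by class support
    return total

# Toy labels (illustrative only, not taken from the abalone data)
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
print(round(weighted_f1(y_true, y_pred), 4))
```

Here the per-class scores are 0.8, 0.8, and 1.0 with supports 3, 2, and 1, so the weighted average is 5/6.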

# 3. Multilayer Perceptron

A Multilayer Perceptron (MLP) is a type of neural network made of layers of connected neurons. It includes an input layer, one or more hidden layers, and an output layer. Each neuron processes inputs using weights and an activation function. MLPs learn patterns from data using backpropagation and are used for tasks like classification and regression.
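
The description above can be sketched as a forward pass in plain NumPy. This is only an illustration with random weights and hypothetical layer sizes (10 inputs, two hidden layers, one output); it shows how data flows through the layers, not how scikit-learn implements `MLPClassifier` or `MLPRegressor`, and it leaves out training.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, params):
    """Forward pass: tanh hidden layers followed by a linear output layer."""
    *hidden, (W_out, b_out) = params
    for W, b in hidden:
        x = np.tanh(x @ W + b)    # weighted sum of inputs, then activation
    return x @ W_out + b_out       # linear output, as used for regression

# Hypothetical layer sizes: 10 inputs -> 16 -> 8 -> 1 output
sizes = [10, 16, 8, 1]
params = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

batch = rng.normal(size=(4, 10))   # 4 made-up samples with 10 features each
out = forward(batch, params)
print(out.shape)
```

Backpropagation would then adjust every `W` and `b` to reduce a loss computed from `out`.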

## 3. 1. Multilayer Perceptron Classifier

In [72]:
# Train MLP Classifier
mlp_clf = MLPClassifier(learning_rate_init=0.0005, activation='tanh', hidden_layer_sizes=(100, 100), max_iter=3000, random_state=42)
mlp_clf.fit(Xc_train, yc_train)

# Predict
yc_pred = mlp_clf.predict(Xc_test)

# Evaluate
print_f1("Multilayer Perceptron", yc_test, yc_pred, 0.75)

🚀 Multilayer Perceptron
F1 Score: 0.6422
Threshold: 0.75
Meets threshold: False


## 3. 2. Multilayer Perceptron Regressor

In [73]:
# Train MLP Regressor
mlp_reg = MLPRegressor(hidden_layer_sizes=(100,), max_iter=1000, random_state=42)
mlp_reg.fit(Xr_train, yr_train)

# Make predictions
yr_pred = mlp_reg.predict(Xr_test)

# Evaluate
print_r2("Multilayer Perceptron", yr_test, yr_pred, 0.8)

🚀 Multilayer Perceptron
R2 Score: 0.5818
Threshold: 0.8
Meets threshold: False


I tried many hyperparameter settings, but neither the F1 nor the R² score improved beyond these values. ☹️

# 4. 4-Layer Feedforward Network with Keras

## 4. 1. Binary Classification

In [77]:
# Train Keras Classifier
clf_model = Sequential([
    Dense(64, activation='relu', input_shape=(Xb_train.shape[1],)),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

clf_model.compile(optimizer=Adam(0.001), loss='binary_crossentropy', metrics=[])

clf_model.fit(Xb_train, yb_train, epochs=30, batch_size=32, verbose=0)

# Predict
yb_pred_probs = clf_model.predict(Xb_test)
yb_pred = (yb_pred_probs > 0.5).astype(int)

# Evaluate
print_f1("Keras Classifier", yb_test, yb_pred, 0.75)

27/27 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
🚀 Keras Classifier
F1 Score: 0.8036
Threshold: 0.75
Meets threshold: True


## 4. 2. Regression

In [79]:
# Train Keras Regressor
reg_model = Sequential([
    Dense(64, activation='relu', input_shape=(Xr_train.shape[1],)),
    Dense(32, activation='tanh'),
    Dense(16, activation='sigmoid'),
    Dense(1)
])

reg_model.compile(optimizer=Adam(0.001), loss='mse')

# Train 
reg_model.fit(Xr_train, yr_train, epochs=150, batch_size=64, verbose=0)

# Predict
yr_pred = reg_model.predict(Xr_test).flatten()

# Evaluate
print_r2("Keras Regressor", yr_test, yr_pred, 0.8)

27/27 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
🚀 Keras Regressor
R2 Score: 0.5734
Threshold: 0.8
Meets threshold: False


# 5. 4-Layer Feedforward Network with PyTorch

Implementing a 4-layer feedforward network in `PyTorch` involves defining a custom neural network class that inherits from `nn.Module`. You define the layers in the constructor `__init__`, using `nn.Linear()` for fully connected layers with activation functions like `ReLU` or `Tanh` in between. The forward pass is implemented in the `forward()` method, where you specify how data flows through the layers. For binary classification, the final layer is followed by `nn.Sigmoid()`, while for regression the output is linear. You then initialize the model, define a loss function (e.g., `BCELoss` for classification, since the model already outputs probabilities, or `MSELoss` for regression), and an optimizer (e.g., `Adam`). The model is trained on the training data and evaluated with metrics like `F1-score` for classification or `R²` for regression.
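
The choice of loss depends on whether the model outputs probabilities or raw logits: `nn.BCELoss` expects probabilities (after a sigmoid), while `nn.BCEWithLogitsLoss` applies the sigmoid itself. The NumPy sketch below re-derives both forms on made-up numbers to show that they agree when used correctly, and that feeding probabilities into the logits version gives a different value. The function names are hypothetical, and PyTorch's own implementation additionally clamps the logarithms.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_from_probs(p, y):
    """Mean binary cross-entropy computed from probabilities."""
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

def bce_from_logits(z, y):
    """The same loss computed from raw logits in a numerically stable form."""
    return float(np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))))

z = np.array([2.0, -1.0, 0.5, -3.0])   # made-up logits
y = np.array([1.0, 0.0, 1.0, 0.0])     # made-up labels

correct_a = bce_from_probs(sigmoid(z), y)    # sigmoid output + probability loss
correct_b = bce_from_logits(z, y)            # raw logits + logits loss
mismatch = bce_from_logits(sigmoid(z), y)    # sigmoid effectively applied twice
print(correct_a, correct_b, mismatch)
```

The first two values coincide, while the third differs, which is why a model ending in `nn.Sigmoid()` should be paired with `nn.BCELoss`.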

In [83]:
class FeedforwardNN(nn.Module):
    def __init__(self, input_dim, output_dim=1, classification=True):
        super().__init__()
        self.classification = classification
        self.model = nn.Sequential(
            nn.Linear(input_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 16),
            nn.ReLU(),
            nn.Linear(16, output_dim),
        )
        if classification:
            self.activation = nn.Sigmoid()
        
    def forward(self, x):
        x = self.model(x)
        return self.activation(x) if self.classification else x

In [84]:
def train_model(model, dataloader, criterion, optimizer, epochs=100):
    model.train()
    for epoch in range(epochs):
        for xb, yb in dataloader:
            # squeeze(-1) drops only the output dimension, so a batch of size 1 keeps its shape
            pred = model(xb).squeeze(-1)
            loss = criterion(pred, yb)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

In [88]:
def to_tensor(x):
    # np.asarray accepts both NumPy arrays and pandas Series
    return torch.tensor(np.asarray(x), dtype=torch.float32)

## 5. 1. Binary Classification

In [90]:
# Convert data

train_ds = TensorDataset(to_tensor(Xb_train), to_tensor(yb_train))
test_x = to_tensor(Xb_test)

# Model
clf_model = FeedforwardNN(Xb_train.shape[1], classification=True)
clf_loss = nn.BCELoss()
clf_opt = optim.Adam(clf_model.parameters(), lr=0.001)

# Train
clf_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
train_model(clf_model, clf_loader, clf_loss, clf_opt, epochs=100)

# Evaluate
clf_model.eval()
with torch.no_grad():
    yb_pred_proba = clf_model(test_x).numpy()
    yb_pred = (yb_pred_proba > 0.5).astype(int)
    print_f1("PyTorch Classifier", yb_test, yb_pred, 0.75)

🚀 PyTorch Classifier
F1 Score: 0.8073
Threshold: 0.75
Meets threshold: True


## 5. 2. Regression

In [91]:
train_ds_r = TensorDataset(to_tensor(Xr_train), to_tensor(yr_train))
test_x_r = to_tensor(Xr_test)

# Model
reg_model = FeedforwardNN(Xr_train.shape[1], classification=False)
reg_loss = nn.MSELoss()
reg_opt = optim.Adam(reg_model.parameters(), lr=0.001)

# Train
reg_loader = DataLoader(train_ds_r, batch_size=64, shuffle=True)
train_model(reg_model, reg_loader, reg_loss, reg_opt, epochs=100)

# Evaluate
reg_model.eval()
with torch.no_grad():
    y_pred_r = reg_model(test_x_r).numpy()
    print_r2("PyTorch Regressor", yr_test, y_pred_r, 0.8)

🚀 PyTorch Regressor
R2 Score: 0.5862
Threshold: 0.8
Meets threshold: False


# 6. 4-Layer Non-Sequential Feedforward Network with Keras

Implementing a 4-layer non-sequential feedforward network in `Keras` involves using the Functional API, which allows for more flexibility than the `Sequential` model. First, you define the input layer using `Input()`, specifying the shape of the data. Then, you add hidden layers using the `Dense()` layer, with activations like `ReLU`, `Tanh`, or `Sigmoid`. Each layer is connected by passing the output of one layer as the input to the next. The output layer, for binary classification, uses a `Sigmoid` activation, while for regression, it uses a linear activation. The model is compiled with an optimizer (e.g., `Adam`) and a loss function (e.g., `binary_crossentropy` for classification or `mse` for regression). Finally, the model is trained using `.fit()` and evaluated based on appropriate metrics like `F1-score` for classification or `R²` for regression.

## 6. 1. Binary Classification

In [None]:
# Input layer
inputs = Input(shape=(Xb_train.shape[1],))

# Hidden layers
x = Dense(64, activation='relu')(inputs)
x = Dense(32, activation='tanh')(x)
x = Dense(16, activation='relu')(x)
x = Dense(8, activation='relu')(x)

# Output layer
outputs = Dense(1, activation='sigmoid')(x)

# Define model
clf_model = Model(inputs, outputs)
clf_model.compile(optimizer=Adam(0.001), loss='binary_crossentropy')

# Train
clf_model.fit(Xb_train, yb_train, epochs=50, batch_size=32, verbose=0)

# Predict
yb_pred_probs = clf_model.predict(Xb_test).flatten()
yb_pred = (yb_pred_probs > 0.5).astype(int)

# Evaluate
print_f1("Keras Functional API Classifier", yb_test, yb_pred, 0.75)

27/27 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
🚀 Keras Functional API Classifier
F1 Score: 0.7943
Threshold: 0.75
Meets threshold: True


## 6. 2. Regression

In [94]:
# Input
inputs = Input(shape=(Xr_train.shape[1],))

# Hidden layers
x = Dense(64, activation='relu')(inputs)
x = Dense(32, activation='tanh')(x)
x = Dense(16, activation='relu')(x)
x = Dense(8, activation='relu')(x)

# Output
outputs = Dense(1)(x)

# Define model
reg_model = Model(inputs, outputs)
reg_model.compile(optimizer=Adam(0.001), loss='mse')

# Train
reg_model.fit(Xr_train, yr_train, epochs=100, batch_size=32, verbose=0)

# Predict and evaluate
yr_pred = reg_model.predict(Xr_test).flatten()

print_r2("Keras Functional API Regressor", yr_test, yr_pred, 0.8)

27/27 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
🚀 Keras Functional API Regressor
R2 Score: 0.5860
Threshold: 0.8
Meets threshold: False


# 7. Question

Q: Why are neural networks so powerful, and what is the difficult part of designing them❓

A: Neural networks are powerful because they automatically learn intricate representations and patterns from data, which lets them solve a wide range of problems, from natural language processing to image classification. Their strengths are modeling non-linear interactions and stacking layers of neurons, each of which learns increasingly abstract features of the input. Designing them is difficult, however: one has to choose a suitable architecture (number of layers, neurons, and activation functions), tune hyperparameters (such as learning rate and regularization), and avoid problems such as overfitting or vanishing gradients. Furthermore, training deep networks requires a large amount of data and computing power, and finding the best model can be a lengthy, iterative process. ✅
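
A classic way to see why non-linearity matters is the XOR problem. The sketch below is a toy illustration, separate from the abalone experiments above: the best linear least-squares fit predicts 0.5 for every XOR input, while a single hidden layer of two ReLU units with hand-picked weights reproduces the targets exactly. The weights are chosen by hand for illustration, not learned.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)    # XOR targets

# Best linear fit (least squares with a bias column) cannot represent XOR.
A = np.hstack([X, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_pred = A @ w

# One hidden layer with two ReLU units and hand-picked weights solves it.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])
hidden = np.maximum(X @ W1 + b1, 0.0)       # ReLU activation
mlp_pred = hidden @ w2

print(linear_pred.round(3), mlp_pred)
```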