# Task 1: Artificial Neural Network (ANN)

In [1]:
## Import Libraries
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import accuracy_score

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense


In [13]:
# Load Dataset 
df = pd.read_csv("StudentsPerformance.csv")
df.head()


| | gender | race/ethnicity | parental level of education | lunch | test preparation course | math score | reading score | writing score |
|---|---|---|---|---|---|---|---|---|
| 0 | female | group B | bachelor's degree | standard | none | 72 | 72 | 74 |
| 1 | female | group C | some college | standard | completed | 69 | 90 | 88 |
| 2 | female | group B | master's degree | standard | none | 90 | 95 | 93 |
| 3 | male | group A | associate's degree | free/reduced | none | 47 | 57 | 44 |
| 4 | male | group C | some college | standard | none | 76 | 78 | 75 |


## Dataset Selection

For this task, the **Students Performance in Exams** dataset was selected from Kaggle.

Dataset Link:  
https://www.kaggle.com/datasets/spscientist/students-performance-in-exams

This tabular dataset contains three numerical exam scores (math, reading, and writing) and five categorical attributes (gender, race/ethnicity, parental level of education, lunch type, and test preparation course). Its small, structured feature set makes it a convenient case for a simple feed-forward Artificial Neural Network (ANN).

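The objective is to predict whether a student **passes or fails**, where a pass is defined as a math score of at least 50. The inputs are the reading and writing scores together with gender, lunch type, and test preparation course.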


In [14]:
# Check for missing values
df.isnull().sum()


gender                         0
race/ethnicity                 0
parental level of education    0
lunch                          0
test preparation course        0
math score                     0
reading score                  0
writing score                  0
dtype: int64

In [15]:
# Handle missing values (if any)
df.fillna(df.mean(numeric_only=True), inplace=True)


In [16]:
# Create target variable: 1 = pass (math score >= 50), 0 = fail
df['pass_fail'] = (df['math score'] >= 50).astype(int)


In [17]:
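# Encode categorical variables as integer codes (assigned in sorted label order).
# The same encoder is refit for each column, so only the last mapping is retained.
encoder = LabelEncoder()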

df['gender'] = encoder.fit_transform(df['gender'])
df['lunch'] = encoder.fit_transform(df['lunch'])
df['test preparation course'] = encoder.fit_transform(df['test preparation course'])


In [18]:
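# Feature selection and scaling
# 'math score' is excluded because it defines the target.
# Note: the scaler is fit on the full dataset here, before the train-test split.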
X = df[['gender', 'reading score', 'writing score', 'lunch', 'test preparation course']]
y = df['pass_fail']

scaler = StandardScaler()
X = scaler.fit_transform(X)


## Data Preprocessing

Data preprocessing was performed before training the ANN model.

- **Handling Missing Values:**  
  The dataset was checked for missing values and none were found, so the mean-imputation step leaves the data unchanged.

- **Encoding Categorical Variables:**  
  The three categorical features used as inputs (gender, lunch type, and test preparation course) were converted to integer codes with label encoding. Each has only two categories, so the encoding yields a 0/1 indicator.

- **Feature Scaling:**  
  Standardization was applied so that every input feature has zero mean and unit variance, which helps gradient-based training converge.
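
The cells above fit the scaler on the full dataset before splitting. A common alternative is to split first and fit the scaler on the training rows only, so that test-set statistics do not influence preprocessing. The sketch below illustrates that ordering on synthetic data; `X_demo`, `y_demo`, `scaler_demo`, and the random seed are placeholders, not part of the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix (hypothetical data, 5 features)
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=60.0, scale=15.0, size=(1000, 5))
y_demo = (X_demo[:, 1] >= 50).astype(int)

# Split first, so the test rows stay unseen during preprocessing
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then reuse its statistics
scaler_demo = StandardScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr)
X_te_scaled = scaler_demo.transform(X_te)

print(X_tr_scaled.mean(axis=0).round(3))  # close to 0 for every column
print(X_te_scaled.mean(axis=0).round(3))  # near 0, but generally not exactly 0
```

On a dataset of this size the difference in reported accuracy is likely small, but the split-then-scale order keeps the test set fully independent of preprocessing.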


In [19]:
# Train-test split: hold out 20% of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


## Build ANN Model

In [29]:

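from tensorflow.keras import Input

model = Sequential()
model.add(Input(shape=(5,)))  # five input features; Keras 3 prefers Input over input_shape=
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='sigmoid'))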


## ANN Architecture and Activation Functions

The Artificial Neural Network consists of the following layers:

- **Input Layer:**  
  The input layer receives five features: gender, reading score, writing score, lunch type, and test preparation course.

- **Hidden Layer:**  
  One hidden layer with sixteen neurons is used to learn complex and non-linear relationships in the data. The ReLU activation function is applied in this layer.

- **Output Layer:**  
  The output layer contains one neuron with a Sigmoid activation function to predict whether a student passes or fails.

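ReLU introduces non-linearity at low computational cost and avoids the saturating gradients of sigmoid or tanh hidden units, while the sigmoid output maps the final activation to a value between 0 and 1 that is interpreted as the probability of passing.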
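
To make the architecture concrete, the following NumPy sketch reproduces the forward pass of a 5-16-1 network with ReLU and sigmoid activations and counts its trainable parameters. The weights are random illustrative values, not the trained Keras weights, and the names `W1`, `b1`, `W2`, `b2`, and `forward` are introduced here only for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Randomly initialised parameters (illustrative values, not trained weights)
W1 = rng.normal(scale=0.1, size=(5, 16))   # input -> hidden
b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, 1))   # hidden -> output
b2 = np.zeros(1)

def forward(x):
    """Forward pass for a batch x of shape (n_samples, 5)."""
    hidden = relu(x @ W1 + b1)          # shape (n_samples, 16)
    return sigmoid(hidden @ W2 + b2)    # shape (n_samples, 1), values in (0, 1)

x_batch = rng.normal(size=(4, 5))       # four hypothetical standardized samples
probs = forward(x_batch)

n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape)   # (4, 1)
print(n_params)      # 113
```

The parameter count is 5 × 16 + 16 = 96 for the hidden layer plus 16 × 1 + 1 = 17 for the output layer, 113 in total.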


## Compile Model

In [30]:

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)


## Train Model

In [31]:

model.fit(X_train, y_train, epochs=30, batch_size=16)


Epoch 1/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 4ms/step - accuracy: 0.4050 - loss: 0.7561
Epoch 2/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.6612 - loss: 0.5925
Epoch 3/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - accuracy: 0.8612 - loss: 0.4831
Epoch 4/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.9150 - loss: 0.4077
Epoch 5/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.9162 - loss: 0.3532
Epoch 6/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 4ms/step - accuracy: 0.9100 - loss: 0.3124
Epoch 7/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - accuracy: 0.9112 - loss: 0.2803
Epoch 8/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.9125 - loss: 0.2550
Epoch 9/30
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[

<keras.src.callbacks.history.History at 0x2434d312aa0>

In [35]:
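import os

# Must be set before TensorFlow is first imported in the session to take effect
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # hide INFO and WARNING logs; errors are still shown

import tensorflow as tf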


## Evaluate Model

In [36]:

# Threshold the predicted probabilities at 0.5 and flatten to a 1-D label array
y_pred = (model.predict(X_test) > 0.5).astype(int).ravel()
accuracy = accuracy_score(y_test, y_pred)

print("Model Accuracy:", accuracy)


7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 20ms/step
Model Accuracy: 0.895


## Loss Function, Optimizer, and Evaluation

- **Loss Function:**  
  Binary cross-entropy was used because the task is binary classification (pass/fail) and the output layer produces a probability.

- **Optimizer:**  
  The Adam optimizer was selected because it adapts the learning rate per parameter and typically converges quickly with its default settings.

- **Evaluation Metric:**  
  Model performance was evaluated using accuracy, the proportion of correct predictions on the held-out test set. The model reached an accuracy of 0.895 on the 20% test split.
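
As a rough illustration of the two quantities reported during training and evaluation, the sketch below computes mean binary cross-entropy and accuracy by hand with NumPy. The labels and probabilities are hypothetical, and the helper names and the clipping constant `eps` are assumptions made for this example rather than values taken from the notebook.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy; eps clips probabilities away from 0 and 1."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    return float(np.mean((p_pred > threshold).astype(int) == y_true))

# Hypothetical labels and predicted probabilities
y_true = np.array([1, 0, 1, 1])
p_pred = np.array([0.9, 0.2, 0.6, 0.4])

print(binary_cross_entropy(y_true, p_pred))
print(accuracy(y_true, p_pred))  # 0.75
```

The loss penalizes confident wrong predictions more heavily than accuracy does: the last sample (label 1, probability 0.4) counts as one error for accuracy but contributes the largest term to the cross-entropy.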
