# Task#01: Student Performance Prediction using Artificial Neural Network (ANN)

This notebook implements an Artificial Neural Network (ANN) to predict whether a student **Passes or Fails** based on academic scores and demographic features.
    
The pipeline label-encodes the categorical columns, standardizes the inputs, and trains a multi-layer neural network for binary classification.

In [20]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import accuracy_score

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

## Load Dataset

In [21]:
data = pd.read_csv("StudentsPerformance.csv")
data.head()

|   | gender | race/ethnicity | parental level of education | lunch | test preparation course | math score | reading score | writing score |
|---|---|---|---|---|---|---|---|---|
| 0 | female | group B | bachelor's degree | standard | none | 72 | 72 | 74 |
| 1 | female | group C | some college | standard | completed | 69 | 90 | 88 |
| 2 | female | group B | master's degree | standard | none | 90 | 95 | 93 |
| 3 | male | group A | associate's degree | free/reduced | none | 47 | 57 | 44 |
| 4 | male | group C | some college | standard | none | 76 | 78 | 75 |


## Dataset Selection

For this task, the **Students Performance in Exams** dataset was selected from Kaggle.

Dataset Link:  
https://www.kaggle.com/datasets/spscientist/students-performance-in-exams

This is a tabular dataset containing both numerical and categorical attributes related to students’ academic performance. It is suitable for implementing an Artificial Neural Network (ANN) as it allows the model to learn patterns from structured data.

The objective is to predict whether a student **passes or fails** based on academic and demographic features.

In [22]:
# Create target variable: pass (1) if the three scores total at least 180
# (an average of 60 per subject), otherwise fail (0)
data['total_score'] = (
    data['math score'] +
    data['reading score'] +
    data['writing score']
)

data['pass_fail'] = np.where(data['total_score'] >= 180, 1, 0)
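
To make the labelling rule concrete, the sketch below applies the same threshold to the scores of the five rows shown in the `data.head()` output. The frame name `sample` is illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

# Scores of the five rows from data.head() (illustrative frame, not the full dataset)
sample = pd.DataFrame({
    "math score": [72, 69, 90, 47, 76],
    "reading score": [72, 90, 95, 57, 78],
    "writing score": [74, 88, 93, 44, 75],
})

# Same rule as the notebook: a total of 180 is an average of 60 across three subjects
score_cols = ["math score", "reading score", "writing score"]
sample["total_score"] = sample[score_cols].sum(axis=1)
sample["pass_fail"] = np.where(sample["total_score"] >= 180, 1, 0)

print(sample[["total_score", "pass_fail"]])
```

Only the fourth student (total 148) falls below the threshold; the other four totals (218, 247, 278, 229) are labelled as passes.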

In [23]:
# Encode categorical columns as integers, since the ANN only works with numeric inputs
encoder = LabelEncoder()
categorical_cols = [
    'gender',
    'race/ethnicity',
    'parental level of education',
    'lunch',
    'test preparation course'
]

for col in categorical_cols:
    data[col] = encoder.fit_transform(data[col])

In [24]:
# Prepare Features and Labels
X = data.drop(['pass_fail', 'total_score'], axis=1)
y = data['pass_fail']

# Feature Scaling
scaler = StandardScaler()
X = scaler.fit_transform(X)

## Data Preprocessing

Data preprocessing was performed before training the ANN model.

- **Handling Missing Values:**  
  The dataset was checked for missing values, and no missing values were found.

- **Encoding Categorical Variables:**  
  Categorical features such as gender, lunch type, and test preparation course were converted into numerical form using label encoding so that they can be processed by the neural network.

- **Feature Scaling:**  
  Standardization (zero mean, unit variance) was applied to every input column, including the encoded categorical ones, so that all inputs are on a comparable scale, which helps gradient-based training converge.
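
The three steps above can be sketched on a small stand-in frame. The frame `toy` and the `encoders` dictionary are hypothetical and not part of the notebook. The sketch also differs from the notebook in one respect: it fits the scaler on the training rows only, a common variant that keeps test-set statistics out of preprocessing, whereas the notebook scales all of `X` before splitting.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy frame standing in for StudentsPerformance.csv
toy = pd.DataFrame({
    "gender": ["female", "male", "female", "male", "female", "male"],
    "lunch": ["standard", "free/reduced", "standard",
              "standard", "free/reduced", "standard"],
    "math score": [72, 47, 90, 76, 55, 63],
})

# 1. Missing-value check: count NaNs per column
missing_per_column = toy.isnull().sum()

# 2. Label encoding: one encoder per column, so each mapping stays inspectable
encoders = {}
for col in ["gender", "lunch"]:
    encoders[col] = LabelEncoder()
    toy[col] = encoders[col].fit_transform(toy[col])

# 3. Standardization: fit on training rows, then reuse those statistics on test rows
train_rows, test_rows = train_test_split(toy, test_size=2, random_state=42)
scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_rows)
test_scaled = scaler.transform(test_rows)

print(missing_per_column.to_dict())
print(list(encoders["gender"].classes_))
```

Keeping a separate encoder per column makes it possible to map integer codes back to their original categories later with `inverse_transform`.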


In [25]:
# Train–Test Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

### Build ANN Model

In [26]:
model = Sequential()
model.add(Dense(16, activation='relu', input_shape=(X_train.shape[1],)))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))


In [27]:
model.summary()

## ANN Architecture and Activation Functions

The Artificial Neural Network consists of the following layers:

- **Input Layer:**  
  The input layer receives eight features: the five label-encoded categorical columns (gender, race/ethnicity, parental level of education, lunch type, and test preparation course) and the math, reading, and writing scores.

- **Hidden Layers:**  
  Two hidden layers, with sixteen and eight neurons, are used to learn non-linear relationships in the data. The ReLU activation function is applied in both layers.

- **Output Layer:**  
  The output layer contains one neuron with a Sigmoid activation function to predict whether a student passes or fails.

ReLU keeps the hidden layers non-linear while being cheap to compute, which tends to speed up training. Sigmoid maps the output to a value between 0 and 1 that can be read as the probability of passing.
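
As a rough illustration of how data flows through this architecture, the NumPy sketch below pushes a made-up batch through layers of the same sizes as the `Dense` stack in the model cell (16, 8, 1). It assumes eight input columns remain in `X`. The weights are random placeholders, so the outputs are not meaningful predictions.

```python
import numpy as np

def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

n_inputs = 8  # assumption: eight feature columns remain in X
layer_sizes = [n_inputs, 16, 8, 1]

rng = np.random.default_rng(0)
# Random placeholder parameters; a trained model would hold learned values
weights = [rng.normal(scale=0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """ReLU on both hidden layers, sigmoid on the single output unit."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return sigmoid(h @ weights[-1] + biases[-1])

batch = rng.normal(size=(4, n_inputs))  # four made-up, already standardized students
probs = forward(batch)

n_params = sum(W.size + b.size for W, b in zip(weights, biases))
print(probs.shape, n_params)
```

Under the eight-input assumption the parameter count is 144 + 136 + 9 = 289, which is the total `model.summary()` would be expected to report for this stack.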

In [28]:
# Compile the Model
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)
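
For reference, binary cross-entropy averages `-(y*log(p) + (1-y)*log(1-p))` over the samples, where `y` is the 0/1 label and `p` the sigmoid output. The NumPy function below is a minimal sketch of that formula, not Keras's implementation. The clipping constant `eps` is an illustrative choice to avoid `log(0)`.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log() never receives exactly 0 or 1
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    per_sample = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(per_sample))

y_true = [1, 0, 1, 1]
confident = binary_crossentropy(y_true, [0.9, 0.1, 0.8, 0.95])  # mostly right, confident
unsure = binary_crossentropy(y_true, [0.5, 0.5, 0.5, 0.5])      # always predicts 0.5

print(confident, unsure)
```

A model that outputs 0.5 for every student scores ln 2 (about 0.693), and the loss shrinks toward 0 as the predictions become both confident and correct.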

### Train Model

In [29]:
model.fit(X_train, y_train, epochs=50, batch_size=16)

Epoch 1/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 6ms/step - accuracy: 0.5763 - loss: 0.6689
Epoch 2/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 6ms/step - accuracy: 0.7975 - loss: 0.4900
Epoch 3/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 9ms/step - accuracy: 0.9000 - loss: 0.3673
Epoch 4/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.9287 - loss: 0.2792
Epoch 5/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.9413 - loss: 0.2170
Epoch 6/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.9600 - loss: 0.1726
Epoch 7/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.9675 - loss: 0.1406
Epoch 8/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - accuracy: 0.9712 - loss: 0.1179
Epoch 9/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[

<keras.src.callbacks.history.History at 0x2879fdf6270>
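
The `50/50` in each epoch line is the number of mini-batches per epoch. The arithmetic below reproduces it, assuming the Kaggle CSV has 1,000 rows (an assumption, since the notebook does not print the row count). The same reasoning gives the `7/7` shown in the evaluation output, because `model.evaluate` uses a batch size of 32 unless told otherwise.

```python
import math

n_rows = 1000        # assumption: row count of the Kaggle CSV
test_percent = 20    # test_size=0.2 in train_test_split

# train_test_split rounds the test share up and gives the remainder to training
n_test = math.ceil(n_rows * test_percent / 100)
n_train = n_rows - n_test

train_steps = math.ceil(n_train / 16)  # batch_size=16 passed to model.fit
eval_steps = math.ceil(n_test / 32)    # default batch size of model.evaluate

print(n_train, n_test, train_steps, eval_steps)
```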

### Evaluate the Model

In [30]:
loss, accuracy = model.evaluate(X_test, y_test)
print("Test Accuracy:", accuracy)

7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 16ms/step - accuracy: 0.9850 - loss: 0.0300
Test Accuracy: 0.9850000143051147
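
`accuracy_score` is imported at the top of the notebook but never called, since `model.evaluate` reports accuracy directly. The sketch below shows one way it could be used: threshold the sigmoid outputs at 0.5 and compare them with the true labels. The probabilities are made-up stand-ins for `model.predict(X_test)`, and the confusion matrix is an addition that separates the two kinds of error. Note that `pass_fail` is computed from the three scores that are also model inputs, so a high test accuracy is expected on this task.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up sigmoid outputs standing in for model.predict(X_test), shape (n, 1)
y_prob = np.array([[0.97], [0.08], [0.61], [0.45], [0.99], [0.30]])
y_true = np.array([1, 0, 1, 1, 1, 0])

# Probabilities of 0.5 or more are labelled as a pass (1)
y_pred = (y_prob.ravel() >= 0.5).astype(int)

acc = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted class

print(acc)
print(cm)
```

In this toy example five of six predictions are correct, and the single error is a passing student predicted as failing (bottom-left cell of the matrix).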


### Save the Model

In [31]:
model.save("student_ann_model.h5")

