# AI-BASED PERSONALIZED LEARNING SYSTEM

# 1. Problem Definition and Objective

In many classrooms, all students are taught in the same manner, even though everyone learns differently. Some students take more time to understand certain topics, while others understand them quickly and feel bored. Because of this, many students do not get the support they actually need. Personalized learning is important because it helps students learn at their own pace based on their abilities. The goal of this system is to study student performance and identify where each student is lagging. Using this information, the system suggests suitable learning content to help students improve.

# 2. Data Understanding and Preparation

The dataset used in this project is a synthetic dataset created for academic purposes. It contains one randomly generated score (between 40 and 95) for each of 20 students in each of five subjects, giving 100 rows. This data is used to analyze learning patterns and generate personalized recommendations.

In [1]:
import pandas as pd
import random

In [2]:
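# 20 synthetic student IDs (S1..S20) and the five subjects they are scored on
students = [f"S{i}" for i in range(1, 21)]
subjects = ["Math", "Science", "English", "History", "Computer"]
data = []  # one dict per (student, subject) pair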

In [3]:
for student in students:
    for subject in subjects:
        data.append({
            "Student_ID":student,
            "Subject":subject,
            "Score":random.randint(40,95)
        })
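
Because the scores come from `random.randint`, every run of the notebook produces a different table. A minimal sketch of a repeatable variant is shown below, assuming a fixed seed is acceptable for this project. The function name `build_dataset` and the seed value are illustrative choices, not part of the original notebook.

```python
import random

def build_dataset(seed=42, n_students=20):
    """Generate one random score per student and subject, reproducibly."""
    # A local generator keeps the global random state untouched.
    rng = random.Random(seed)
    subjects = ["Math", "Science", "English", "History", "Computer"]
    students = [f"S{i}" for i in range(1, n_students + 1)]
    return [
        {"Student_ID": student, "Subject": subject, "Score": rng.randint(40, 95)}
        for student in students
        for subject in subjects
    ]

first_run = build_dataset()
second_run = build_dataset()
print(len(first_run), first_run == second_run)  # 100 True
```

With a fixed seed, the tables and metrics shown later in the notebook would stay the same from run to run.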

# 3. Data Exploration

In this step, the dataset is explored to understand student performance across different subjects. Basic statistics are used to observe the score distribution and learning patterns.

In [4]:
df = pd.DataFrame(data)
df.head()

|   | Student_ID | Subject | Score |
|---|------------|---------|-------|
| 0 | S1 | Math | 68 |
| 1 | S1 | Science | 57 |
| 2 | S1 | English | 49 |
| 3 | S1 | History | 78 |
| 4 | S1 | Computer | 63 |


In [5]:
df.info()
df.describe()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Student_ID  100 non-null    object
 1   Subject     100 non-null    object
 2   Score       100 non-null    int64 
dtypes: int64(1), object(2)
memory usage: 2.5+ KB


|       | Score |
|-------|-------|
| count | 100.0 |
| mean | 66.5 |
| std | 15.607885 |
| min | 40.0 |
| 25% | 54.5 |
| 50% | 67.0 |
| 75% | 78.0 |
| max | 95.0 |
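
The overall statistics hide differences between subjects. The sketch below groups scores by subject and reports the mean, minimum, and maximum for each. It is written in plain Python over a list of row dictionaries shaped like the `data` list; the seed and the helper name `summarize_by_subject` are assumptions made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean

def summarize_by_subject(rows):
    """Return {subject: {"mean", "min", "max"}} for a list of score rows."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["Subject"]].append(row["Score"])
    return {
        subject: {"mean": round(mean(values), 1), "min": min(values), "max": max(values)}
        for subject, values in scores.items()
    }

# Illustrative data with the same shape as the notebook's rows (assumed seed).
rng = random.Random(0)
subjects = ["Math", "Science", "English", "History", "Computer"]
rows = [
    {"Student_ID": f"S{i}", "Subject": subject, "Score": rng.randint(40, 95)}
    for i in range(1, 21)
    for subject in subjects
]

for subject, stats in summarize_by_subject(rows).items():
    print(subject, stats)
```

On the DataFrame itself, the same summary can be obtained with `df.groupby("Subject")["Score"].agg(["mean", "min", "max"])`.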


# 4. Learning Level Identification

Student performance is categorized based on scores to understand each student's learning level in every subject. This classification helps in identifying weak and strong areas for every student. The learning levels are weak (score below 55), medium (55 to 79), and strong (80 and above).

In [6]:
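def classify_level(score):
    """Map a numeric score to a learning level using fixed thresholds."""
    if score < 55:
        return "Weak"      # below 55
    elif score < 80:
        return "Medium"    # 55 to 79
    else:
        return "Strong"    # 80 and above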

In [7]:
df["Learning_Level"] = df["Score"].apply(classify_level)
df.head()

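|   | Student_ID | Subject | Score | Learning_Level |
|---|------------|---------|-------|----------------|
| 0 | S1 | Math | 68 | Medium |
| 1 | S1 | Science | 57 | Medium |
| 2 | S1 | English | 49 | Weak |
| 3 | S1 | History | 78 | Medium |
| 4 | S1 | Computer | 63 | Medium |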
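
The behaviour at the threshold values is easy to get wrong, so it helps to check the boundary scores explicitly. The sketch below restates the classifier with the cut-offs exposed as parameters. The parameter names `weak_below` and `strong_from` are hypothetical and are not used elsewhere in the notebook.

```python
def classify_level(score, weak_below=55, strong_from=80):
    """Classify a score; weak_below and strong_from are illustrative parameters."""
    if score < weak_below:
        return "Weak"
    if score < strong_from:
        return "Medium"
    return "Strong"

# Scores on either side of each cut-off, plus the extremes of the data range.
for score in (40, 54, 55, 79, 80, 95):
    print(score, classify_level(score))
```

A score of exactly 55 is Medium and a score of exactly 80 is Strong, because both comparisons are strict.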


# 4.1 ML-Based Learning Level Prediction

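In this section, a supervised machine learning model (a decision tree) is trained to predict a student's learning level from two features: the score and the encoded subject. This shows how the system can learn the classification from data instead of relying on hand-written rules. Because the target labels were themselves produced by the score thresholds in Section 4, the model is effectively relearning those thresholds, so very high accuracy is expected.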

In [8]:
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report

In [9]:
print(df.columns.tolist())

['Student_ID', 'Subject', 'Score', 'Learning_Level']


In [10]:
# Encode Subject column
le = LabelEncoder()
df["Subject_Encoded"] = le.fit_transform(df["Subject"])

# Features and target
X = df[["Score","Subject_Encoded"]]
y = df["Learning_Level"]

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

In [11]:
# Fit a decision tree on the training split (fixed seed for repeatable results)
ml_model = DecisionTreeClassifier(random_state=42)
ml_model.fit(X_train, y_train)

In [12]:
# Evaluate on the held-out 20% of rows
y_pred = ml_model.predict(X_test)
print("ML Model Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

ML Model Accuracy: 1.0
              precision    recall  f1-score   support

      Medium       1.00      1.00      1.00        14
        Weak       1.00      1.00      1.00         6

    accuracy                           1.00        20
   macro avg       1.00      1.00      1.00        20
weighted avg       1.00      1.00      1.00        20
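
The report above lists only the Medium and Weak classes, which means the 20-row test split happened to contain no Strong rows, so the model was never evaluated on that class. One common remedy is a stratified split, which keeps each class's share roughly equal in the training and test parts. The pure-Python sketch below only illustrates the idea and is not scikit-learn's implementation; the function name `stratified_split` is an assumption. In scikit-learn, `train_test_split` accepts a `stratify=y` argument for the same purpose.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with each label represented in the test part."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take at least one row per class so no class is missing from the test part.
        n_test = max(1, round(len(indices) * test_size))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Label counts matching the notebook's learning level distribution (52 / 25 / 23).
labels = ["Medium"] * 52 + ["Weak"] * 25 + ["Strong"] * 23
train_idx, test_idx = stratified_split(labels)
print(Counter(labels[i] for i in test_idx))
```

With these counts, the test part receives 10 Medium, 5 Weak, and 5 Strong rows, so all three classes appear in the evaluation.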



# 5. Personalized Learning Recommendations


Based on the identified learning level, personalized learning recommendations are generated for each student. These recommendations aim to support weak areas, strengthen average performance, and challenge strong learners with advanced content.

In [13]:
def generate_recommendation(level):
    """Return a study recommendation for a learning level."""
    if level == "Weak":
        return "Revise basic concepts and practise more questions"
    elif level == "Medium":
        return "Practice standard problems and revise key topics"
    else:
        return "Try advanced problems and explore additional resources"

In [14]:
df["Recommendation"] = df["Learning_Level"].apply(generate_recommendation)
df.head()

|   | Student_ID | Subject | Score | Learning_Level | Subject_Encoded | Recommendation |
|---|------------|---------|-------|----------------|-----------------|----------------|
| 0 | S1 | Math | 68 | Medium | 3 | Practice standard problems and revise key topics |
| 1 | S1 | Science | 57 | Medium | 4 | Practice standard problems and revise key topics |
| 2 | S1 | English | 49 | Weak | 1 | Revise basic concepts and practise more questions |
| 3 | S1 | History | 78 | Medium | 2 | Practice standard problems and revise key topics |
| 4 | S1 | Computer | 63 | Medium | 0 | Practice standard problems and revise key topics |


# 6. Student-wise Personalized Learning Path

This section summarizes personalized learning recommendations for each student across different subjects. It provides a clear learning path highlighting areas that need improvement and areas of strength.

In [15]:
student_summary = df[["Student_ID", "Subject", "Learning_Level", "Recommendation"]]
student_summary.head(10)

|   | Student_ID | Subject | Learning_Level | Recommendation |
|---|------------|---------|----------------|----------------|
| 0 | S1 | Math | Medium | Practice standard problems and revise key topics |
| 1 | S1 | Science | Medium | Practice standard problems and revise key topics |
| 2 | S1 | English | Weak | Revise basic concepts and practise more questions |
| 3 | S1 | History | Medium | Practice standard problems and revise key topics |
| 4 | S1 | Computer | Medium | Practice standard problems and revise key topics |
| 5 | S2 | Math | Medium | Practice standard problems and revise key topics |
| 6 | S2 | Science | Medium | Practice standard problems and revise key topics |
| 7 | S2 | English | Medium | Practice standard problems and revise key topics |
| 8 | S2 | History | Strong | Try advanced problems and explore additional r... |
| 9 | S2 | Computer | Medium | Practice standard problems and revise key topics |
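
The summary table lists rows in subject order, which does not yet read as a path. One possible way to turn it into a path is to group rows by student and order each student's subjects from weakest to strongest, so the areas needing the most attention come first. This is a sketch under that assumption; `build_learning_paths` and `LEVEL_ORDER` are illustrative names, and the sample rows are copied from the table above.

```python
from collections import defaultdict

# Lower rank means the subject should be addressed earlier in the path.
LEVEL_ORDER = {"Weak": 0, "Medium": 1, "Strong": 2}

def build_learning_paths(rows):
    """Group (subject, level) pairs by student, weakest subjects first."""
    paths = defaultdict(list)
    for row in rows:
        paths[row["Student_ID"]].append((row["Subject"], row["Learning_Level"]))
    # sorted() is stable, so subjects with the same level keep their original order.
    return {
        student: sorted(items, key=lambda item: LEVEL_ORDER[item[1]])
        for student, items in paths.items()
    }

rows = [
    {"Student_ID": "S1", "Subject": "Math", "Learning_Level": "Medium"},
    {"Student_ID": "S1", "Subject": "Science", "Learning_Level": "Medium"},
    {"Student_ID": "S1", "Subject": "English", "Learning_Level": "Weak"},
    {"Student_ID": "S1", "Subject": "History", "Learning_Level": "Medium"},
    {"Student_ID": "S1", "Subject": "Computer", "Learning_Level": "Medium"},
    {"Student_ID": "S2", "Subject": "Math", "Learning_Level": "Medium"},
    {"Student_ID": "S2", "Subject": "Science", "Learning_Level": "Medium"},
    {"Student_ID": "S2", "Subject": "English", "Learning_Level": "Medium"},
    {"Student_ID": "S2", "Subject": "History", "Learning_Level": "Strong"},
    {"Student_ID": "S2", "Subject": "Computer", "Learning_Level": "Medium"},
]

for student, path in build_learning_paths(rows).items():
    print(student, path)
```

For S1 the path starts with English (Weak), and for S2 it ends with History (Strong).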


# 7. Evaluation and Analysis

The system was evaluated by inspecting the learning levels and recommendations generated for each student. Students with scores below 55 were identified as weak and given basic revision suggestions. Medium-level students received standard practice recommendations, while strong students were directed to advanced learning tasks. This shows that the system differentiates learning needs consistently based on performance data.

In [16]:
# Distribution of learning levels
df["Learning_Level"].value_counts()


Medium    52
Weak      25
Strong    23
Name: Learning_Level, dtype: int64
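
These counts can be compared with what the thresholds imply for uniformly random scores. Since `random.randint(40, 95)` draws each of the 56 integers with equal probability, the expected share of each level is simply the number of integers that fall into it divided by 56. The short calculation below is a sketch of that comparison; the variable names are illustrative.

```python
# Observed counts copied from the value_counts() output above.
level_counts = {"Medium": 52, "Weak": 25, "Strong": 23}
total = sum(level_counts.values())
observed = {level: round(100 * n / total, 1) for level, n in level_counts.items()}

# Count how many integer scores in 40..95 fall into each level under the thresholds.
all_scores = range(40, 96)
values_per_level = {
    "Weak": sum(1 for s in all_scores if s < 55),
    "Medium": sum(1 for s in all_scores if 55 <= s < 80),
    "Strong": sum(1 for s in all_scores if s >= 80),
}
expected = {
    level: round(100 * n / len(all_scores), 1)
    for level, n in values_per_level.items()
}

print("observed %:", observed)
print("expected %:", expected)
```

The expected shares are about 26.8% Weak, 44.6% Medium, and 28.6% Strong, against 25%, 52%, and 23% observed in this run of 100 rows.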

# 8. Ethical Considerations and Responsible AI

This system uses a synthetic dataset created only for academic purposes, so no real student data or personal information is involved. The same rules and evaluation criteria are applied to all students, ensuring fairness in learning level classification. However, since the dataset is synthetic and limited in size, the results may not fully represent real classroom scenarios. In real-world applications, larger and more diverse datasets would be required to improve fairness and accuracy.

# API Layer for Model Deployment

A lightweight API can expose the trained machine learning model so that external systems or user interfaces can request predictions from student performance data. The API is designed to be built with FastAPI and to use the trained model for inference. The cell below performs the first step by saving the trained model to learning_model.pkl with pickle.

In [17]:
import pickle

# Serialize the trained model so a separate service can load it for inference
with open("learning_model.pkl", "wb") as f:
    pickle.dump(ml_model, f)
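
The notebook stops at saving the model, so the following is only a sketch of what a FastAPI service around it might look like. The endpoint path `/predict`, the request fields, and the helper names `load_model`, `predict_level`, and `create_app` are all assumptions. The subject codes are copied from the Subject_Encoded values shown in the notebook output, because the pickle file contains the model but not the fitted `LabelEncoder`.

```python
import pickle

# Codes copied from the notebook's Subject_Encoded column
# (LabelEncoder assigns codes to labels in sorted order).
SUBJECT_CODES = {"Computer": 0, "English": 1, "History": 2, "Math": 3, "Science": 4}

def load_model(path="learning_model.pkl"):
    """Load the pickled classifier saved by the notebook."""
    with open(path, "rb") as f:
        return pickle.load(f)

def predict_level(model, subject, score):
    """Predict a learning level for one subject and score."""
    if subject not in SUBJECT_CODES:
        raise ValueError(f"Unknown subject: {subject}")
    # Same column order as in training: Score, Subject_Encoded.
    # scikit-learn may warn about missing feature names, since the
    # model was fitted on a DataFrame and this is a plain list.
    features = [[score, SUBJECT_CODES[subject]]]
    return str(model.predict(features)[0])

def create_app(model):
    """Build the FastAPI app (hypothetical layout, not from the notebook)."""
    # Imported lazily so the helpers above work without FastAPI installed.
    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    class PredictionRequest(BaseModel):
        subject: str
        score: int

    app = FastAPI(title="Learning level API")

    @app.post("/predict")
    def predict(request: PredictionRequest):
        try:
            level = predict_level(model, request.subject, request.score)
        except ValueError as exc:
            raise HTTPException(status_code=422, detail=str(exc))
        return {
            "subject": request.subject,
            "score": request.score,
            "learning_level": level,
        }

    return app
```

Assuming this is saved as `api.py` (a hypothetical file name) with `app = create_app(load_model())` at module level, it could be served with `uvicorn api:app`. Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.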

# RAG-Based Learning Content Support

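To enhance personalized learning, a Retrieval-Augmented Generation (RAG) approach is outlined to provide topic-specific explanations. The system retrieves relevant learning material from a custom knowledge base keyed by subject and difficulty. In this prototype the retrieved text is attached to each student record as it is, and a generation step that turns it into a tailored response would be layered on top.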

In [18]:
learning_material = {
    "Math_Basic": "Math basics include numbers, fractions, and simple operations.",
    "Math_Advanced": "Advanced math includes algebraic manipulation and problem-solving.",

    "Science_Basic": "Science basics include observation, experiments, and simple concepts.",
    "Science_Advanced": "Advanced science includes deeper understanding of physical and biological processes.",

    "English_Basic": "English basics include grammar, vocabulary, and sentence formation.",
    "English_Advanced": "Advanced English includes comprehension, writing, and critical analysis.",

    "History_Basic": "History basics focus on important past events and timelines.",
    "History_Advanced": "Advanced history involves analysis of historical causes and impacts.",

    "Computer_Basic": "Computer basics include understanding hardware, software, and simple programs.",
    "Computer_Advanced": "Advanced computer topics include algorithms, data structures, and programming concepts."
}


In [19]:
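def retrieve_content(subject, level):
    """Look up learning material for a subject; only Strong learners get advanced content."""
    if level == "Strong":
        key = f"{subject}_Advanced"
    else:
        # Weak and Medium learners both receive the basic material
        key = f"{subject}_Basic"
    return learning_material.get(key, "Content not available")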

In [20]:
df["Retrieved_Content"] = df.apply(
    lambda row: retrieve_content(row["Subject"], row["Learning_Level"]), axis=1
)

df[["Student_ID", "Subject", "Learning_Level", "Retrieved_Content"]].head()

|   | Student_ID | Subject | Learning_Level | Retrieved_Content |
|---|------------|---------|----------------|-------------------|
| 0 | S1 | Math | Medium | Math basics include numbers, fractions, and si... |
| 1 | S1 | Science | Medium | Science basics include observation, experiment... |
| 2 | S1 | English | Weak | English basics include grammar, vocabulary, an... |
| 3 | S1 | History | Medium | History basics focus on important past events ... |
| 4 | S1 | Computer | Medium | Computer basics include understanding hardware... |
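
The lookup above selects content by subject and level only. To move closer to retrieval in the RAG sense, the knowledge base could be searched with the learner's question. The sketch below ranks entries by simple word overlap and assembles a prompt for a language model. It is a toy stand-in: production RAG systems usually rely on embeddings and a vector index, and the helper names `tokenize`, `retrieve`, and `build_prompt` are assumptions. The knowledge base is a three-entry subset of `learning_material`.

```python
import re

# A subset of the notebook's learning_material, repeated so the block is self-contained.
knowledge_base = {
    "Math_Basic": "Math basics include numbers, fractions, and simple operations.",
    "English_Basic": "English basics include grammar, vocabulary, and sentence formation.",
    "Computer_Advanced": "Advanced computer topics include algorithms, data structures, and programming concepts.",
}

def tokenize(text):
    """Lowercase a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, documents, top_k=1):
    """Return the keys of the top_k documents sharing the most words with the query."""
    query_terms = tokenize(query)
    scored = []
    for key, text in documents.items():
        overlap = len(query_terms & tokenize(text))
        if overlap:
            scored.append((overlap, key))
    # Highest overlap first; ties are broken alphabetically by key.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [key for _, key in scored[:top_k]]

def build_prompt(query, documents):
    """Combine the retrieved text and the question into a prompt for a generator."""
    keys = retrieve(query, documents)
    context = "\n".join(documents[key] for key in keys) or "No matching material."
    return f"Context:\n{context}\n\nQuestion: {query}"

print(retrieve("How do fractions work?", knowledge_base))  # ['Math_Basic']
print(build_prompt("How do fractions work?", knowledge_base))
```

The prompt returned by `build_prompt` is what would be passed to a language model in the generation step, which this sketch does not include.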


# 9. Conclusion and Future Scope

This project demonstrates how student performance data can be used to create personalized learning recommendations. By identifying weak, medium, and strong learning areas, the system guides students toward suitable learning strategies. Although the current system is largely rule-based and uses synthetic data, it provides a foundation for personalized learning. In the future, the system can be improved by using real-world data, models trained on richer features than a single score, and adaptive content delivery.