### Student Data Analysis with Pandas

This notebook walks through loading, cleaning, and preparing student performance data with pandas. The dataset, read from a CSV file hosted on GitHub, contains each student's scores in Mathematics, Reading, and Writing.

In this project, we:

- Calculate each student's Total Score and Percentage.
- Add a new column for status tracking.
- Clean and enhance the dataset by renaming columns and removing unnecessary ones.
- Save the cleaned data into both CSV and Excel formats for further use.
 
This workflow provides a solid foundation for analyzing academic performance and preparing data for visualization or reporting.
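
As a compact overview, the steps listed above can be sketched as a single function. This is a minimal illustration, not the notebook's own code: the `prepare_scores` helper and the three-row `sample` frame are hypothetical, and the divisor of 300 assumes each subject is marked out of 100.

```python
import pandas as pd

# Hypothetical three-row sample that mimics the dataset's columns.
sample = pd.DataFrame({
    "StudentID": [1, 2, 3],
    "Gender": ["Female", "Male", "Male"],
    "Math_Score": [99, 60, 49],
    "Reading_Score": [77, 29, 46],
    "Writing_Score": [12, 98, 84],
    "School": ["Nelson Mandela School", "Ubuntu Academy", "Ubuntu Academy"],
})


def prepare_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of df; the input frame is left unchanged."""
    score_cols = ["Math_Score", "Reading_Score", "Writing_Score"]
    out = df.copy()
    out["status"] = "Pending"
    out["Total_Score"] = out[score_cols].sum(axis=1)
    # Assumption: each subject is marked out of 100, so the maximum total is 300.
    out["Percentage"] = (out["Total_Score"] / 300 * 100).round(2)
    out = out.rename(columns={"Math_Score": "Mathematics_Score", "Gender": "G"})
    return out.drop(columns=["School"])


prepared = prepare_scores(sample)
print(prepared)
```

Working on a copy keeps the raw data available for comparison, whereas the cells that follow modify `df` step by step so each intermediate result can be inspected.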
 

In [5]:
# Import pandas
import pandas as pd

In [7]:
# Load the student scores CSV directly from GitHub
url = "https://raw.githubusercontent.com/ritaafrica/data/refs/heads/main/student_scores.csv"
df = pd.read_csv(url)

In [11]:
df.head()

Unnamed: 0,StudentID,Name,Country,Gender,Age,Math_Score,Reading_Score,Writing_Score,School,Class
0,1,Ifeanyi Mugisha,Zimbabwe,Female,18,99,77,12,Nelson Mandela School,JSS1
1,2,Yemi Okeke,Tanzania,Male,16,60,29,98,Ubuntu Academy,JSS2
2,3,Fatou Mugisha,Zimbabwe,Male,15,49,46,84,Ubuntu Academy,SS3
3,4,Chinedu Okafor,Ethiopia,Female,17,34,57,45,Ubuntu Academy,SS2
4,5,Yemi Moyo,Senegal,Male,16,22,16,38,Ubuntu Academy,SS3


In [19]:
# Add a status column, set to "Pending" for every student
df["status"] = "Pending"

In [21]:
df.head()

Unnamed: 0,StudentID,Name,Country,Gender,Age,Math_Score,Reading_Score,Writing_Score,School,Class,status
0,1,Ifeanyi Mugisha,Zimbabwe,Female,18,99,77,12,Nelson Mandela School,JSS1,Pending
1,2,Yemi Okeke,Tanzania,Male,16,60,29,98,Ubuntu Academy,JSS2,Pending
2,3,Fatou Mugisha,Zimbabwe,Male,15,49,46,84,Ubuntu Academy,SS3,Pending
3,4,Chinedu Okafor,Ethiopia,Female,17,34,57,45,Ubuntu Academy,SS2,Pending
4,5,Yemi Moyo,Senegal,Male,16,22,16,38,Ubuntu Academy,SS3,Pending


In [27]:
# Sum the three subject scores into a Total_Score column
df["Total_Score"] = df["Math_Score"] + df["Reading_Score"] + df["Writing_Score"]

In [29]:
df.head()

Unnamed: 0,StudentID,Name,Country,Gender,Age,Math_Score,Reading_Score,Writing_Score,School,Class,status,Total_Score
0,1,Ifeanyi Mugisha,Zimbabwe,Female,18,99,77,12,Nelson Mandela School,JSS1,Pending,188
1,2,Yemi Okeke,Tanzania,Male,16,60,29,98,Ubuntu Academy,JSS2,Pending,187
2,3,Fatou Mugisha,Zimbabwe,Male,15,49,46,84,Ubuntu Academy,SS3,Pending,179
3,4,Chinedu Okafor,Ethiopia,Female,17,34,57,45,Ubuntu Academy,SS2,Pending,136
4,5,Yemi Moyo,Senegal,Male,16,22,16,38,Ubuntu Academy,SS3,Pending,76
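
The same total can also be computed by selecting the score columns and summing across each row, which is easier to extend if more subjects are added. The sketch below uses a hypothetical three-row frame; the `scores` data and its missing value are invented for illustration and are not taken from the dataset. It also shows how the two approaches differ when a score is missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the NaN is invented to show missing-value handling.
scores = pd.DataFrame({
    "Math_Score": [99, 60, np.nan],
    "Reading_Score": [77, 29, 46],
    "Writing_Score": [12, 98, 84],
})
score_cols = ["Math_Score", "Reading_Score", "Writing_Score"]

# Column-by-column addition: one missing score makes the whole total NaN.
total_added = scores["Math_Score"] + scores["Reading_Score"] + scores["Writing_Score"]

# Row-wise sum: missing scores are skipped by default (skipna=True).
total_summed = scores[score_cols].sum(axis=1)

# Row-wise sum that keeps NaN, matching the behavior of "+".
total_strict = scores[score_cols].sum(axis=1, skipna=False)

print(pd.DataFrame({"added": total_added, "summed": total_summed, "strict": total_strict}))
```

Which behavior is preferable depends on whether a missing score should count as zero or mark the total as unknown.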


In [35]:
# Convert the total to a percentage of the maximum possible score (300, assuming each of the three subjects is marked out of 100)
df["Percentage"] = (df["Total_Score"] / 300) * 100

In [37]:
print(df["Percentage"])

0     62.666667
1     62.333333
2     59.666667
3     45.333333
4     25.333333
        ...    
95    64.000000
96    46.666667
97    59.666667
98    46.000000
99    39.666667
Name: Percentage, Length: 100, dtype: float64


In [39]:
df.head()

Unnamed: 0,StudentID,Name,Country,Gender,Age,Math_Score,Reading_Score,Writing_Score,School,Class,status,Total_Score,Percentage
0,1,Ifeanyi Mugisha,Zimbabwe,Female,18,99,77,12,Nelson Mandela School,JSS1,Pending,188,62.666667
1,2,Yemi Okeke,Tanzania,Male,16,60,29,98,Ubuntu Academy,JSS2,Pending,187,62.333333
2,3,Fatou Mugisha,Zimbabwe,Male,15,49,46,84,Ubuntu Academy,SS3,Pending,179,59.666667
3,4,Chinedu Okafor,Ethiopia,Female,17,34,57,45,Ubuntu Academy,SS2,Pending,136,45.333333
4,5,Yemi Moyo,Senegal,Male,16,22,16,38,Ubuntu Academy,SS3,Pending,76,25.333333
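
The divisor 300 is hard-coded in the cell above. One possible alternative, sketched here, derives it from the number of score columns. The names `MAX_PER_SUBJECT` and `max_total` are hypothetical, and the value of 100 marks per subject is an assumption; the two rows are copied from the preview above.

```python
import pandas as pd

# Assumption: each of the three subjects is marked out of 100.
MAX_PER_SUBJECT = 100
score_cols = ["Math_Score", "Reading_Score", "Writing_Score"]
max_total = MAX_PER_SUBJECT * len(score_cols)  # 300 under the assumption above

# First two rows from the preview above.
sample = pd.DataFrame({
    "Math_Score": [99, 60],
    "Reading_Score": [77, 29],
    "Writing_Score": [12, 98],
})
sample["Total_Score"] = sample[score_cols].sum(axis=1)
sample["Percentage"] = sample["Total_Score"] / max_total * 100
print(sample)
```

For the first student this gives 188 / 300 * 100, roughly 62.67, in line with the output shown above.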


In [42]:
# Round the Percentage to two decimal places
df["Percentage"] = df["Percentage"].round(2)

In [44]:
df.head()

Unnamed: 0,StudentID,Name,Country,Gender,Age,Math_Score,Reading_Score,Writing_Score,School,Class,status,Total_Score,Percentage
0,1,Ifeanyi Mugisha,Zimbabwe,Female,18,99,77,12,Nelson Mandela School,JSS1,Pending,188,62.67
1,2,Yemi Okeke,Tanzania,Male,16,60,29,98,Ubuntu Academy,JSS2,Pending,187,62.33
2,3,Fatou Mugisha,Zimbabwe,Male,15,49,46,84,Ubuntu Academy,SS3,Pending,179,59.67
3,4,Chinedu Okafor,Ethiopia,Female,17,34,57,45,Ubuntu Academy,SS2,Pending,136,45.33
4,5,Yemi Moyo,Senegal,Male,16,22,16,38,Ubuntu Academy,SS3,Pending,76,25.33


In [46]:
# Renaming Math_Score to Mathematics_Score
df.rename(columns={"Math_Score": "Mathematics_Score"}, inplace=True)

In [48]:
df.head()

Unnamed: 0,StudentID,Name,Country,Gender,Age,Mathematics_Score,Reading_Score,Writing_Score,School,Class,status,Total_Score,Percentage
0,1,Ifeanyi Mugisha,Zimbabwe,Female,18,99,77,12,Nelson Mandela School,JSS1,Pending,188,62.67
1,2,Yemi Okeke,Tanzania,Male,16,60,29,98,Ubuntu Academy,JSS2,Pending,187,62.33
2,3,Fatou Mugisha,Zimbabwe,Male,15,49,46,84,Ubuntu Academy,SS3,Pending,179,59.67
3,4,Chinedu Okafor,Ethiopia,Female,17,34,57,45,Ubuntu Academy,SS2,Pending,136,45.33
4,5,Yemi Moyo,Senegal,Male,16,22,16,38,Ubuntu Academy,SS3,Pending,76,25.33


In [50]:
# Rename Gender to G
df.rename(columns={"Gender": "G"}, inplace=True)

In [52]:
df.head()

Unnamed: 0,StudentID,Name,Country,G,Age,Mathematics_Score,Reading_Score,Writing_Score,School,Class,status,Total_Score,Percentage
0,1,Ifeanyi Mugisha,Zimbabwe,Female,18,99,77,12,Nelson Mandela School,JSS1,Pending,188,62.67
1,2,Yemi Okeke,Tanzania,Male,16,60,29,98,Ubuntu Academy,JSS2,Pending,187,62.33
2,3,Fatou Mugisha,Zimbabwe,Male,15,49,46,84,Ubuntu Academy,SS3,Pending,179,59.67
3,4,Chinedu Okafor,Ethiopia,Female,17,34,57,45,Ubuntu Academy,SS2,Pending,136,45.33
4,5,Yemi Moyo,Senegal,Male,16,22,16,38,Ubuntu Academy,SS3,Pending,76,25.33
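
Both renames can also be done in one call by passing a single mapping. The sketch below uses a hypothetical one-row frame named `raw` and assigns the result to a new variable instead of using `inplace=True`, so the original column names stay available.

```python
import pandas as pd

# Hypothetical one-row frame using the original column names.
raw = pd.DataFrame({"Gender": ["Female"], "Math_Score": [99], "Reading_Score": [77]})

# Both renames in a single call; rename returns a new frame, so `raw` is untouched.
renamed = raw.rename(columns={"Math_Score": "Mathematics_Score", "Gender": "G"})

print(list(raw.columns))
print(list(renamed.columns))
```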


In [59]:
# Drop the School column
df.drop(columns=["School"], inplace=True)

In [61]:
df.head()

Unnamed: 0,StudentID,Name,Country,G,Age,Mathematics_Score,Reading_Score,Writing_Score,Class,status,Total_Score,Percentage
0,1,Ifeanyi Mugisha,Zimbabwe,Female,18,99,77,12,JSS1,Pending,188,62.67
1,2,Yemi Okeke,Tanzania,Male,16,60,29,98,JSS2,Pending,187,62.33
2,3,Fatou Mugisha,Zimbabwe,Male,15,49,46,84,SS3,Pending,179,59.67
3,4,Chinedu Okafor,Ethiopia,Female,17,34,57,45,SS2,Pending,136,45.33
4,5,Yemi Moyo,Senegal,Male,16,22,16,38,SS3,Pending,76,25.33
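
In a notebook, cells are often re-run. Dropping a column that has already been removed raises a `KeyError` by default; passing `errors="ignore"` is one way to make the cell safe to repeat. The sketch below shows both behaviors on a hypothetical one-row frame.

```python
import pandas as pd

# Hypothetical one-row frame with a School column.
frame = pd.DataFrame({"Name": ["Yemi Okeke"], "School": ["Ubuntu Academy"]})

# First drop succeeds.
frame = frame.drop(columns=["School"])

# Dropping the same column again raises KeyError by default.
try:
    frame.drop(columns=["School"])
    second_drop_failed = False
except KeyError:
    second_drop_failed = True

# errors="ignore" skips labels that are not present instead of raising.
frame_safe = frame.drop(columns=["School"], errors="ignore")

print(second_drop_failed)
print(list(frame_safe.columns))
```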


In [63]:
# Save the cleaned dataset to CSV without writing the row index
df.to_csv("modified_students_scores.csv", index=False)

In [65]:
# Save the cleaned dataset to Excel (needs an Excel writer engine such as openpyxl installed)
df.to_excel("modified_students.xlsx", index=False)
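
To see what `index=False` changes, the sketch below writes a small frame to CSV text in memory and reads it back both ways. The `cleaned` frame is a hypothetical two-row stand-in for the saved data, and `io.StringIO` is used only to avoid touching the file system.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for the cleaned dataset.
cleaned = pd.DataFrame({
    "StudentID": [1, 2],
    "Name": ["Ifeanyi Mugisha", "Yemi Okeke"],
    "Total_Score": [188, 187],
    "Percentage": [62.67, 62.33],
})

# With index=False only the data columns are written.
without_index = pd.read_csv(io.StringIO(cleaned.to_csv(index=False)))

# With the default index=True the row index is written as an unnamed
# first column, which read_csv labels "Unnamed: 0" when loading.
with_index = pd.read_csv(io.StringIO(cleaned.to_csv()))

print(list(without_index.columns))
print(list(with_index.columns))
```

Reading a saved file back like this is a quick check that the export contains the expected columns.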