# Data Analytics Project: Student Performance Analytics using NumPy

### Project Description:

You are given student performance data for 100 students. The dataset includes:

* Student ID

* Scores in 5 subjects: Math, English, Economics, Biology, and Civic Education

### Your task is to:

* Calculate the average score for each student.

* Find the top 5 students based on total scores.

* Analyze the performance in each subject (average, max, min).

* Classify students into performance categories (High, Average, Low) based on their total score.

### NOTE:

You are to classify the students into three categories based on their total score:

* High: Total score ≥ 450

* Average: Total score between 350 and 449

* Low: Total score < 350

# Step 1: Import Libraries

In [1]:
# Import libraries
import numpy as np

# Step 2: Create Synthetic Student Data

Let’s simulate scores for 100 students in 5 subjects:

In [2]:
# Set random seed for repeatability
np.random.seed(42)

# Generate synthetic data for 100 students
student_ids = np.arange(1, 101)  # Student IDs from 1 to 100
scores = np.random.randint(50, 100, size=(100, 5))  # Scores from 50 to 99 (the upper bound is exclusive)

# Display the student data
print("Student Performance Data (ID, Math, English, Economics, Biology, Civic Education):")
data = np.column_stack((student_ids, scores))
print(data)

Student Performance Data (ID, Math, English, Economics, Biology, Civic Education):
[[  1  88  78  64  92  57]
 [  2  70  88  68  72  60]
 [  3  60  73  85  89  73]
 [  4  52  71  51  73  93]
 [  5  79  87  51  70  82]
 [  6  61  71  93  74  98]
 [  7  76  91  77  65  64]
 [  8  96  93  52  86  56]
 [  9  70  58  88  67  53]
 [ 10  74  63  99  58  75]
 [ 11  51  69  77  96  56]
 [ 12  93  57  96  84  63]
 [ 13  66  85  99  89  53]
 [ 14  51  55  91  53  78]
 [ 15  67  75  93  83  59]
 [ 16  85  63  80  97  64]
 [ 17  57  63  72  89  70]
 [ 18  65  94  67  96  73]
 [ 19  75  74  94  90  78]
 [ 20  64  94  50  74  56]
 [ 21  58  73  50  93  57]
 [ 22  73  60  66  57  84]
 [ 23  84  82  54  91  88]
 [ 24  90  77  56  58  57]
 [ 25  61  83  82  97  72]
 [ 26  73  86  84  93  89]
 [ 27  71  76  84  50  84]
 [ 28  86  96  63  52  50]
 [ 29  54  75  63  88  76]
 [ 30  58  64  64  75  91]
 [ 31  62  81  88  98  81]
 [ 32  53  79  86  72  88]
 [ 33  94  64  92  78  85]
 [ 34  62  81  56  71  77]
 [ 35  51  91 
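
The seeding step above uses NumPy's legacy global random state. As a hedged alternative, the same kind of data can be drawn with the newer `Generator` API. This is a minimal sketch, not the notebook's method: the name `scores_alt` is hypothetical, and although the seed 42 is reused, the numbers drawn will differ from those printed above.

```python
import numpy as np

# Assumption: a Generator-based variant of the data-creation cell.
# The seed is reused for illustration only; the values differ from np.random.seed(42).
rng = np.random.default_rng(42)

# integers() excludes the upper bound by default, so scores fall in 50..99.
scores_alt = rng.integers(50, 100, size=(100, 5))

print(scores_alt.shape)                                  # (100, 5)
print(scores_alt.min() >= 50, scores_alt.max() <= 99)    # True True
```

A local generator keeps the randomness confined to one object, so other code that touches `np.random` cannot change the data you get.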

# Step 3: Calculate Average Score for Each Student

To calculate the average score for each student across all subjects:

In [3]:
# Calculate the average score for each student
average_scores = np.mean(scores, axis=1)

# Display the average score for each student
print("Average Scores for Each Student:")
print(average_scores)

Average Scores for Each Student:
[75.8 71.6 76.  68.  73.8 79.4 74.6 76.6 67.2 73.8 69.8 78.6 78.4 65.6
 75.4 77.8 70.2 79.  82.2 67.6 66.2 68.  79.8 67.6 79.  85.  73.  69.4
 71.2 70.4 82.  75.6 82.6 69.4 73.6 82.2 76.2 66.  66.8 85.  77.8 79.2
 80.  69.8 73.4 71.  69.4 67.2 72.4 63.8 69.6 75.6 76.6 76.4 78.  77.2
 69.6 76.2 71.2 74.  82.8 81.4 72.8 75.  76.2 62.6 79.  80.8 76.8 78.2
 80.  78.6 82.  84.8 77.8 73.2 77.8 83.8 68.8 77.4 85.8 78.4 70.8 68.2
 69.4 73.8 75.8 82.  68.6 86.8 81.2 71.6 73.2 68.  87.  65.6 66.2 75.
 84.4 66. ]
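
To make the role of the `axis` argument concrete, here is a small sketch that uses only the first three rows from the printed data above. The names `demo`, `per_student`, and `per_subject` are illustrative and not part of the original notebook.

```python
import numpy as np

# First three students from the printed output (rows = students, columns = subjects).
demo = np.array([
    [88, 78, 64, 92, 57],
    [70, 88, 68, 72, 60],
    [60, 73, 85, 89, 73],
])

# axis=1 collapses the columns: one mean per student (row).
per_student = demo.mean(axis=1)

# axis=0 collapses the rows: one mean per subject (column).
per_subject = demo.mean(axis=0)

print(per_student)         # [75.8 71.6 76. ]
print(per_subject.shape)   # (5,)
```

The first three per-student means match the first three values in the output above.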


# Step 4: Find the Top 5 Students Based on Total Score

### Total Score Calculation

* First, calculate the total score for each student (sum of all subject scores):

In [4]:
# Calculate the total score for each student
total_scores = np.sum(scores, axis=1)

# Display the total score for each student
print("Total Scores for Each Student:")
print(total_scores)

Total Scores for Each Student:
[379 358 380 340 369 397 373 383 336 369 349 393 392 328 377 389 351 395
 411 338 331 340 399 338 395 425 365 347 356 352 410 378 413 347 368 411
 381 330 334 425 389 396 400 349 367 355 347 336 362 319 348 378 383 382
 390 386 348 381 356 370 414 407 364 375 381 313 395 404 384 391 400 393
 410 424 389 366 389 419 344 387 429 392 354 341 347 369 379 410 343 434
 406 358 366 340 435 328 331 375 422 330]


### Find Top 5 Students

To find the top 5 students based on the highest total scores:

In [5]:
# Get the indices of the top 5 students
top_5_indices = np.argsort(total_scores)[::-1][:5]

# Get the top 5 student IDs and their total scores
top_5_students = student_ids[top_5_indices]
top_5_scores = total_scores[top_5_indices]

print("Top 5 Students Based on Total Scores:")
for i, student in enumerate(top_5_students):
    print(f"Rank {i + 1}: Student ID {student} with Total Score {top_5_scores[i]}")

Top 5 Students Based on Total Scores:
Rank 1: Student ID 95 with Total Score 435
Rank 2: Student ID 90 with Total Score 434
Rank 3: Student ID 81 with Total Score 429
Rank 4: Student ID 26 with Total Score 425
Rank 5: Student ID 40 with Total Score 425
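
Two students above share a total of 425, and the default sort behind `np.argsort` does not guarantee how ties are ordered. One possible way to make the ranking deterministic is a stable sort on the negated totals, sketched below on a handful of ID and total pairs taken from the output above. The names `ids`, `totals`, and `order` are illustrative only.

```python
import numpy as np

# A few (ID, total) pairs from the output above, including the tie at 425.
ids = np.array([26, 40, 81, 90, 95, 7])
totals = np.array([425, 425, 429, 434, 435, 373])

# Sorting the negated totals ascending gives descending totals.
# kind="stable" keeps tied entries in their original order (here, lower ID first).
order = np.argsort(-totals, kind="stable")[:5]

for rank, idx in enumerate(order, start=1):
    print(f"Rank {rank}: Student ID {ids[idx]} with Total Score {totals[idx]}")
```

With this approach, tied students are always ranked in the order they appear in the input array.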


# Step 5: Analyze Performance in Each Subject

* We will calculate the average, maximum, and minimum score for each subject.

### Performance in Each Subject

In [6]:
# Analyze performance in each subject (Math, English, Economics, Biology, Civic Education)
subject_averages = np.mean(scores, axis=0)
subject_max = np.max(scores, axis=0)
subject_min = np.min(scores, axis=0)

subject_names = ["Math", "English", "Economics", "Biology", "Civic Education"]

# Display performance summary
print("\nSubject Performance Summary:")
for i, subject in enumerate(subject_names):
    print(f"{subject}:")
    print(f"  Average Score: {subject_averages[i]:.2f}")
    print(f"  Max Score: {subject_max[i]}")
    print(f"  Min Score: {subject_min[i]}")



Subject Performance Summary:
Math:
  Average Score: 73.51
  Max Score: 99
  Min Score: 50
English:
  Average Score: 76.10
  Max Score: 99
  Min Score: 50
Economics:
  Average Score: 75.58
  Max Score: 99
  Min Score: 50
Biology:
  Average Score: 76.98
  Max Score: 99
  Min Score: 50
Civic Education:
  Average Score: 72.15
  Max Score: 98
  Min Score: 50
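
The summary above reports the highest score per subject but not who earned it. As an optional extension, `np.argmax` along `axis=0` returns the row index of the top score in each column. This is a sketch on the first three students from the printed data, and `demo`, `demo_ids`, and `top_ids` are hypothetical names.

```python
import numpy as np

subject_names = ["Math", "English", "Economics", "Biology", "Civic Education"]

# First three students from the printed data (IDs 1 to 3).
demo_ids = np.array([1, 2, 3])
demo = np.array([
    [88, 78, 64, 92, 57],
    [70, 88, 68, 72, 60],
    [60, 73, 85, 89, 73],
])

# Row index of the highest score in each subject column,
# mapped back to student IDs.
top_ids = demo_ids[demo.argmax(axis=0)]

for subject, sid in zip(subject_names, top_ids):
    print(f"{subject}: top scorer is Student ID {sid}")
```

Note that `argmax` returns only the first occurrence when several students share the maximum.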


# Step 6: Classify Students Based on Total Score

We can classify students into 3 categories based on their total score:

* High: Total score ≥ 450

* Average: Total score between 350 and 449

* Low: Total score < 350

### Classification

In [7]:
def classify_performance(score):
    if score >= 450:
        return "High"
    elif score >= 350:
        return "Average"
    else:
        return "Low"

# Apply classification for each student based on their total score
performance_categories = np.vectorize(classify_performance)(total_scores)

# Display the performance classification for each student
print("\nStudent Performance Categories (High, Average, Low):")
for student_id, performance in zip(student_ids, performance_categories):
    print(f"Student ID {student_id}: {performance}")


Student Performance Categories (High, Average, Low):
Student ID 1: Average
Student ID 2: Average
Student ID 3: Average
Student ID 4: Low
Student ID 5: Average
Student ID 6: Average
Student ID 7: Average
Student ID 8: Average
Student ID 9: Low
Student ID 10: Average
Student ID 11: Low
Student ID 12: Average
Student ID 13: Average
Student ID 14: Low
Student ID 15: Average
Student ID 16: Average
Student ID 17: Average
Student ID 18: Average
Student ID 19: Average
Student ID 20: Low
Student ID 21: Low
Student ID 22: Low
Student ID 23: Average
Student ID 24: Low
Student ID 25: Average
Student ID 26: Average
Student ID 27: Average
Student ID 28: Low
Student ID 29: Average
Student ID 30: Average
Student ID 31: Average
Student ID 32: Average
Student ID 33: Average
Student ID 34: Low
Student ID 35: Average
Student ID 36: Average
Student ID 37: Average
Student ID 38: Low
Student ID 39: Low
Student ID 40: Average
Student ID 41: Average
Student ID 42: Average
Student ID 43: Average
Student ID 44:
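
`np.vectorize` is a convenience wrapper that still calls the Python function once per element. The same thresholds can also be expressed with array operations, for example with `np.select`. The sketch below is one alternative under the thresholds stated above, not the notebook's method. The sample totals are hypothetical values chosen to sit on and around the boundaries.

```python
import numpy as np

# Hypothetical totals chosen to exercise the boundaries 350 and 450.
sample_totals = np.array([379, 340, 450, 349, 350, 449])

# Conditions are checked in order; the first match wins,
# so ">= 350" only applies to totals below 450.
conditions = [sample_totals >= 450, sample_totals >= 350]
labels = np.select(conditions, ["High", "Average"], default="Low")

print(labels.tolist())
# ['Average', 'Low', 'High', 'Low', 'Average', 'Average']
```

Because the conditions are evaluated in order, they mirror the `if`/`elif`/`else` chain in `classify_performance`.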

# Step 7: Summary and Insights

* Finally, we will display a summary of key insights:

In [8]:
print("\n--- Summary of Insights ---")
print(f"Average Total Score for All Students: {np.mean(total_scores):.2f}")
print("Average Score per Subject:")
for i, subject in enumerate(subject_names):
    print(f"  {subject}: {subject_averages[i]:.2f}")
print(f"\nTotal Students in High Performance Category: {np.sum(performance_categories == 'High')}")
print(f"Total Students in Average Performance Category: {np.sum(performance_categories == 'Average')}")
print(f"Total Students in Low Performance Category: {np.sum(performance_categories == 'Low')}")


--- Summary of Insights ---
Average Total Score for All Students: 374.32
Average Score per Subject:
  Math: 73.51
  English: 76.10
  Economics: 75.58
  Biology: 76.98
  Civic Education: 72.15

Total Students in High Performance Category: 0
Total Students in Average Performance Category: 73
Total Students in Low Performance Category: 27
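
The per-category counts can also be gathered in a single call with `np.unique(..., return_counts=True)`. This is a small sketch on a hypothetical label array, and the names `categories` and `summary` are illustrative only.

```python
import numpy as np

# Hypothetical category labels for five students.
categories = np.array(["Average", "Low", "Average", "Average", "Low"])

# np.unique returns the sorted distinct labels and how often each occurs.
names, counts = np.unique(categories, return_counts=True)
summary = dict(zip(names.tolist(), counts.tolist()))

print(summary)                 # {'Average': 3, 'Low': 2}
print(summary.get("High", 0))  # 0
```

A category with no members, such as High in the results above, does not appear in the result. Reading it with `summary.get("High", 0)` returns 0 in that case.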
