# Grade Analyzer

A step-by-step data transformation project that converts messy student grade strings into a structured pandas DataFrame.

## Objective
Transform raw string data into a clean DataFrame with calculated averages and letter grades, demonstrating the progression from Python basics to pandas.

## Skills Demonstrated
- Multi-step data cleaning pipeline
- Nested data structure manipulation
- Type conversion and validation
- Dictionary construction
- Grade calculation logic
- Pandas DataFrame conversion

---

## 1. Raw Data Input

Starting with messy student data in the format: `"name:score1,score2,score3,score4,score5"`

Issues to address:
- Inconsistent name capitalization
- Scores stored as strings
- Data not structured for analysis

In [1]:
# Raw student data: name and five test scores per student
students = [
    "Alice:85,92,78,90,88",
    "BOB:95,89,94,91,93",
    "charlie:72,68,75,70,71",
    "DIANA:88,85,90,87,89",
    "eve:60,65,58,62,61"
]

## 2. Data Parsing - Step 1

Splitting names from scores.

In [2]:
# Split each string at the colon to separate name from scores
for i in range(len(students)):
    students[i] = students[i].split(':')

students

[['Alice', '85,92,78,90,88'],
 ['BOB', '95,89,94,91,93'],
 ['charlie', '72,68,75,70,71'],
 ['DIANA', '88,85,90,87,89'],
 ['eve', '60,65,58,62,61']]

## 3. Data Parsing - Step 2

Splitting individual test scores.

In [3]:
# Split the comma-separated scores into individual strings
for i in range(len(students)):
    students[i][1] = students[i][1].split(',')

students

[['Alice', ['85', '92', '78', '90', '88']],
 ['BOB', ['95', '89', '94', '91', '93']],
 ['charlie', ['72', '68', '75', '70', '71']],
 ['DIANA', ['88', '85', '90', '87', '89']],
 ['eve', ['60', '65', '58', '62', '61']]]
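
The two parsing steps above mutate `students` in place, so re-running either cell on already-split data raises an error. A minimal sketch of a non-mutating alternative follows; the helper name `parse_record` and the variable `raw_students` are hypothetical and not part of the original notebook.

```python
def parse_record(record):
    """Split 'name:s1,s2,...' into [name, [score strings]]."""
    # partition() splits at the first colon only and leaves the input string unchanged
    name, _, raw_scores = record.partition(":")
    return [name, raw_scores.split(",")]


# Two records from the raw data, kept under a separate (hypothetical) name
raw_students = ["Alice:85,92,78,90,88", "BOB:95,89,94,91,93"]

# Build a new list instead of overwriting the raw input
parsed = [parse_record(record) for record in raw_students]
print(parsed)
```

Because the raw list is never overwritten, this version can be re-run safely while experimenting.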

## 4. Data Cleaning

Standardizing names and converting scores to integers.

In [4]:
# Clean names to title case and convert all scores to integers
for student in students:
    student[0] = student[0].title()
    for i in range(len(student[1])):
        student[1][i] = int(student[1][i])

students

[['Alice', [85, 92, 78, 90, 88]],
 ['Bob', [95, 89, 94, 91, 93]],
 ['Charlie', [72, 68, 75, 70, 71]],
 ['Diana', [88, 85, 90, 87, 89]],
 ['Eve', [60, 65, 58, 62, 61]]]
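
The cleaning cell assumes every score is a valid integer string. One way to add the validation mentioned under Skills Demonstrated is sketched below; `clean_record` is a hypothetical helper, and raising `ValueError` with the student's name is an assumed design choice, not something the original notebook does.

```python
def clean_record(name, raw_scores):
    """Return (title-cased name, list of int scores); raise ValueError on bad input."""
    cleaned = []
    for raw in raw_scores:
        try:
            # int() tolerates surrounding whitespace but rejects text such as 'n/a'
            cleaned.append(int(raw))
        except ValueError:
            raise ValueError(f"Non-numeric score {raw!r} for student {name!r}") from None
    return name.strip().title(), cleaned


print(clean_record("BOB", ["95", "89", "94", "91", "93"]))

try:
    clean_record("eve", ["60", "n/a"])
except ValueError as err:
    print(err)
```

Failing loudly with the student's name makes a malformed record easier to trace than a bare `int()` error would.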

## 5. Dictionary Construction

Converting list structure to dictionary for better organization.

In [5]:
# Create a dictionary with separate columns for names and scores
students_dict = {'student_name': [], 'test_scores': []}

for student in students:
    students_dict['student_name'].append(student[0])
    students_dict['test_scores'].append(student[1])

students_dict

{'student_name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
 'test_scores': [[85, 92, 78, 90, 88],
  [95, 89, 94, 91, 93],
  [72, 68, 75, 70, 71],
  [88, 85, 90, 87, 89],
  [60, 65, 58, 62, 61]]}

## 6. Calculate Score Averages

Computing the average test score for each student.

In [6]:
# Add score_avg column and calculate average for each student
students_dict['score_avg'] = []

for scores in students_dict['test_scores']:
    count_scores = 0
    total_scores = 0
    
    for score in scores:
        count_scores += 1
        total_scores += score
    
    score_avg = total_scores / count_scores
    students_dict['score_avg'].append(score_avg)

students_dict

{'student_name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
 'test_scores': [[85, 92, 78, 90, 88],
  [95, 89, 94, 91, 93],
  [72, 68, 75, 70, 71],
  [88, 85, 90, 87, 89],
  [60, 65, 58, 62, 61]],
 'score_avg': [86.6, 92.4, 71.2, 87.8, 61.2]}
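
The manual counter and running total make the arithmetic explicit. The same result can be reached with the built-ins `sum()` and `len()`, as in this sketch; the `average` helper and its empty-list behaviour (returning `None`) are assumptions added for illustration.

```python
def average(scores):
    """Arithmetic mean of a list of numbers, or None for an empty list."""
    if not scores:
        return None  # assumed behaviour; avoids ZeroDivisionError on empty input
    return sum(scores) / len(scores)


test_scores = [
    [85, 92, 78, 90, 88],
    [95, 89, 94, 91, 93],
    [72, 68, 75, 70, 71],
    [88, 85, 90, 87, 89],
    [60, 65, 58, 62, 61],
]

score_avg = [average(scores) for scores in test_scores]
print(score_avg)  # [86.6, 92.4, 71.2, 87.8, 61.2]
```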

## 7. Assign Letter Grades

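Converting numerical averages to letter grades using a standard grading scale. Averages are not rounded before grading, so each band includes its lower bound and stops just short of the next one (an average of 89.5 is a B):
- A: 90 and above
- B: 80 to below 90
- C: 70 to below 80
- D: 60 to below 70
- F: Below 60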

In [7]:
# Add letter_grade column and assign grades based on averages
students_dict['letter_grade'] = []

for score in students_dict['score_avg']:
    if score >= 90:
        students_dict['letter_grade'].append('A')
    elif score >= 80:
        students_dict['letter_grade'].append('B')
    elif score >= 70:
        students_dict['letter_grade'].append('C')
    elif score >= 60:
        students_dict['letter_grade'].append('D')
    else:
        students_dict['letter_grade'].append('F')

students_dict

{'student_name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
 'test_scores': [[85, 92, 78, 90, 88],
  [95, 89, 94, 91, 93],
  [72, 68, 75, 70, 71],
  [88, 85, 90, 87, 89],
  [60, 65, 58, 62, 61]],
 'score_avg': [86.6, 92.4, 71.2, 87.8, 61.2],
 'letter_grade': ['B', 'A', 'C', 'B', 'D']}
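
The `if`/`elif` chain can also be expressed as a lookup over cutoff pairs, which keeps the scale in one place if it ever changes. This is a sketch only; `GRADE_CUTOFFS` and `letter_grade` are hypothetical names.

```python
# (minimum average, letter) pairs, highest cutoff first
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def letter_grade(avg):
    """Return the letter for the first cutoff the average meets, else 'F'."""
    for cutoff, letter in GRADE_CUTOFFS:
        if avg >= cutoff:
            return letter
    return "F"


averages = [86.6, 92.4, 71.2, 87.8, 61.2]
print([letter_grade(avg) for avg in averages])  # ['B', 'A', 'C', 'B', 'D']
```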

## 8. Convert to Pandas DataFrame

Final transformation into a structured DataFrame for easy analysis and visualization.

In [8]:
import pandas as pd

# Convert dictionary to DataFrame
students_df = pd.DataFrame(students_dict)
students_df

| | student_name | test_scores | score_avg | letter_grade |
|---|---|---|---|---|
| 0 | Alice | [85, 92, 78, 90, 88] | 86.6 | B |
| 1 | Bob | [95, 89, 94, 91, 93] | 92.4 | A |
| 2 | Charlie | [72, 68, 75, 70, 71] | 71.2 | C |
| 3 | Diana | [88, 85, 90, 87, 89] | 87.8 | B |
| 4 | Eve | [60, 65, 58, 62, 61] | 61.2 | D |
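
Keeping each student's scores as a list inside a single cell limits what pandas can do with them. A possible follow-up, sketched here on two of the students, spreads the list into one column per test; the column names `test_1` through `test_5` are illustrative assumptions.

```python
import pandas as pd

students_df = pd.DataFrame({
    "student_name": ["Alice", "Bob"],
    "test_scores": [[85, 92, 78, 90, 88], [95, 89, 94, 91, 93]],
})

# One column per test; these names are hypothetical
score_cols = [f"test_{n}" for n in range(1, 6)]

# Turn the nested lists into a DataFrame aligned on the original index
scores_wide = pd.DataFrame(
    students_df["test_scores"].tolist(),
    columns=score_cols,
    index=students_df.index,
)

wide_df = pd.concat([students_df[["student_name"]], scores_wide], axis=1)

# Row-wise mean across the five test columns
wide_df["score_avg"] = wide_df[score_cols].mean(axis=1)
print(wide_df)
```

With one column per test, per-test statistics such as `wide_df["test_1"].mean()` become straightforward.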


## 9. Quick Analysis

Now that we have a DataFrame, we can easily perform analysis.

In [9]:
# Display summary statistics
print(f"Class Average: {students_df['score_avg'].mean():.1f}")
print("\nGrade Distribution:")
print(students_df['letter_grade'].value_counts().sort_index())

Class Average: 79.8

Grade Distribution:
letter_grade
A    1
B    2
C    1
D    1
Name: count, dtype: int64
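
As a cross-check on the loop-based grading in section 7, the same bands can be produced inside pandas with `pd.cut`. This is a sketch rather than part of the original notebook; it assumes left-closed bins (`right=False`) so that an average of exactly 90 counts as an A.

```python
import pandas as pd

score_avg = pd.Series([86.6, 92.4, 71.2, 87.8, 61.2], name="score_avg")

# Bin edges match the grading scale; right=False makes each bin [low, high)
bins = [float("-inf"), 60, 70, 80, 90, float("inf")]
labels = ["F", "D", "C", "B", "A"]

letter_grade = pd.cut(score_avg, bins=bins, labels=labels, right=False)
print(letter_grade.tolist())  # ['B', 'A', 'C', 'B', 'D']
```

The result is a categorical column, so `value_counts()` on it also reports grades with zero students, such as F here.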


## Summary

### Data Transformation Journey

This project demonstrated a complete data transformation pipeline:

1. **Raw String** → `"Alice:85,92,78,90,88"`
2. **Parsed and Cleaned Structure** → `["Alice", [85, 92, 78, 90, 88]]`
3. **Structured Dictionary** → `{"student_name": [...], "test_scores": [...]}`
4. **Calculated Metrics** → `score_avg` and `letter_grade` lists added to the dictionary (Alice: `86.6`, `"B"`)
5. **Pandas DataFrame** → Ready for analysis
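
For reuse outside the notebook, the stages listed above could be folded into a single function. The sketch below is one possible consolidation in plain Python; `build_grade_table` is a hypothetical name, and the function assumes well-formed input.

```python
def build_grade_table(records):
    """Turn 'name:s1,s2,...' strings into a column-oriented dictionary."""
    cutoffs = ((90, "A"), (80, "B"), (70, "C"), (60, "D"))
    table = {"student_name": [], "test_scores": [], "score_avg": [], "letter_grade": []}

    for record in records:
        # Parse and clean
        name, _, raw_scores = record.partition(":")
        scores = [int(score) for score in raw_scores.split(",")]

        # Calculate metrics
        avg = sum(scores) / len(scores)
        grade = next((letter for cutoff, letter in cutoffs if avg >= cutoff), "F")

        table["student_name"].append(name.title())
        table["test_scores"].append(scores)
        table["score_avg"].append(avg)
        table["letter_grade"].append(grade)

    return table


table = build_grade_table(["Alice:85,92,78,90,88", "eve:60,65,58,62,61"])
print(table)
```

The returned dictionary has the same shape as `students_dict`, so it can be passed to `pd.DataFrame` directly.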

### Key Skills Applied

- **Data Parsing**: Breaking down complex string formats
- **Nested Loops**: Iterating through nested lists
- **Type Conversion**: Converting score strings to integers
- **Conditional Logic**: Grade assignment rules
- **Data Structures**: Lists → Dictionary → DataFrame progression
- **Statistical Calculation**: Average computation

### Real-World Applications

This type of transformation is essential for:
- Student information systems
- Grade book automation
- Academic performance tracking
- Report card generation
- Educational data analysis