Q8 - Pivot Table Puzzles

Question: Welcome to Pivot Table Puzzles!
You are given a dataset of whimsical workshops attended by various mythical creatures.
Each creature attends multiple workshops, and you need to analyze their attendance and performance.
Your task is to use pivot tables to answer the following questions:

- Create a pivot table showing the total hours attended by each creature for each workshop.
- Calculate the average score of each creature for each workshop.
- Identify the workshop with the highest average score.
- Determine the total number of workshops attended by each creature.
- Find the creature with the highest total score across all workshops.
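
As a warm-up for the first task, here is a minimal sketch of how `pivot_table` with `aggfunc='sum'` collapses repeated (creature, workshop) pairs into a single cell. The `toy` frame and its values are hypothetical and exist only for illustration; they are not the dataset generated in this notebook.

```python
import pandas as pd

# Hypothetical toy data (not the notebook's dataset)
toy = pd.DataFrame({
    'creature_name': ['Elf', 'Elf', 'Elf', 'Goblin'],
    'workshop_name': ['Herbology', 'Herbology', 'Wand Making', 'Herbology'],
    'hours_attended': [2, 3, 4, 5],
})

# Rows become creatures, columns become workshops, cells hold summed hours.
# fill_value=0 marks combinations that never occur in the data.
toy_hours = toy.pivot_table(index='creature_name', columns='workshop_name',
                            values='hours_attended', aggfunc='sum', fill_value=0)
print(toy_hours)
```

The two Herbology rows for the Elf are summed into one cell (2 + 3 = 5), and the Goblin's missing Wand Making entry is filled with 0.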

Datasets:

workshop_attendance: Contains columns (creature_id, creature_name, workshop_name, hours_attended, score).

In [None]:
import pandas as pd
import numpy as np

# Seed for reproducibility
np.random.seed(404)

# Generate synthetic data
creature_ids = np.arange(1, 21)
creature_names = ['Unicorn Ulysses', 'Phoenix Phoebe', 'Dragon Draco', 'Goblin Greta', 'Elf Elrond']
workshops = ['Wand Making', 'Spell Weaving', 'Potion Brewing', 'Crystal Gazing', 'Herbology']
hours_options = np.arange(1, 11)
score_options = np.arange(50, 101)

data = []
for creature in creature_ids:
    creature_name = np.random.choice(creature_names)
    num_workshops = np.random.randint(1, len(workshops) + 1)
    attended_workshops = np.random.choice(workshops, num_workshops, replace=False)
    for workshop in attended_workshops:
        hours_attended = np.random.choice(hours_options)
        score = np.random.choice(score_options)
        data.append([creature, creature_name, workshop, hours_attended, score])

# Create DataFrame
workshop_attendance = pd.DataFrame(data, columns=['creature_id', 'creature_name', 'workshop_name', 'hours_attended', 'score'])

# Display the dataset
workshop_attendance.head()

In [None]:
# Create a pivot table showing the total hours attended by each creature for each workshop.
pivot_hours = workshop_attendance.pivot_table(index='creature_name', columns='workshop_name', values='hours_attended', aggfunc='sum', fill_value=0)
pivot_hours

In [None]:
# Calculate the average score of each creature for each workshop
# (fill_value=0 marks workshops with no attendance; real scores range from 50 to 100)
average_score = workshop_attendance.pivot_table(index='creature_name', columns='workshop_name', values='score', aggfunc='mean', fill_value=0)
average_score
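
One caveat when filling a table of means with 0: any later aggregation over that table treats the zeros as real scores. The sketch below uses a hypothetical `toy` frame (not the notebook's data) to compare the filled table against one that keeps missing combinations as `NaN`.

```python
import pandas as pd

# Hypothetical toy data: the Goblin never attended Wand Making
toy = pd.DataFrame({
    'creature_name': ['Elf', 'Elf', 'Goblin'],
    'workshop_name': ['Herbology', 'Wand Making', 'Herbology'],
    'score': [80, 90, 70],
})

# Without fill_value, missing combinations stay NaN
mean_nan = toy.pivot_table(index='creature_name', columns='workshop_name',
                           values='score', aggfunc='mean')
# With fill_value=0, missing combinations look like a score of 0
mean_zero = toy.pivot_table(index='creature_name', columns='workshop_name',
                            values='score', aggfunc='mean', fill_value=0)

# Row-wise means: NaN cells are skipped, zeros are averaged in
row_mean_nan = mean_nan.mean(axis=1)
row_mean_zero = mean_zero.mean(axis=1)
print(row_mean_nan)
print(row_mean_zero)
```

In this toy case the Goblin's row mean is 70 when the gap is kept as `NaN` and 35 when it is filled with 0, so the filled table suits display better than further arithmetic.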

In [None]:
# Identify the workshop with the highest average score
averagescore_per_workshop = workshop_attendance.groupby('workshop_name')['score'].mean().reset_index()
# Look up the winning row in the aggregated frame, not in the raw attendance data
highest_score_workshop = averagescore_per_workshop.loc[averagescore_per_workshop['score'].idxmax()]
highest_score_workshop
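
Since the exercise asks for pivot tables, the same question can also be answered with `pivot_table` instead of `groupby`. This is a sketch on a hypothetical `toy` frame; the variable names `avg_by_workshop`, `best_workshop`, and `best_average` are introduced here for illustration only.

```python
import pandas as pd

# Hypothetical toy data (not the notebook's dataset)
toy = pd.DataFrame({
    'workshop_name': ['Herbology', 'Herbology', 'Wand Making', 'Wand Making'],
    'score': [60, 80, 90, 95],
})

# With only index= set, pivot_table yields one row per workshop
avg_by_workshop = toy.pivot_table(index='workshop_name', values='score', aggfunc='mean')

# idxmax returns the index label (the workshop name) of the largest mean
best_workshop = avg_by_workshop['score'].idxmax()
best_average = avg_by_workshop['score'].max()
print(best_workshop, best_average)
```

Because the workshop name is the index of the pivoted frame, `idxmax` returns the name directly and no positional lookup is needed.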

In [None]:
# Determine the total number of workshops attended by each creature
# (counts attendance records per name; the 5 names are shared across the 20 creature_id values)
total_workshops_creature = workshop_attendance.groupby(['creature_name'])['workshop_name'].count().reset_index()
total_workshops_creature.columns = ['Creature Name', 'Workshops Attended']
total_workshops_creature

In [None]:
# Find the creature with the highest total score across all workshops
total_score_creature = workshop_attendance.groupby(['creature_name'])['score'].sum().reset_index()
highest_score_creature = total_score_creature.loc[total_score_creature['score'].idxmax()]
highest_score_creature
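
The generator assigns 5 names to 20 `creature_id` values, so several distinct creatures share a name and grouping by `creature_name` alone merges them. If "each creature" is meant to be one `creature_id`, which is an assumption about the intent of the exercise, the grouping could look like the sketch below. The `toy` frame and the column names `workshops_attended` and `total_score` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: two different creatures share the name 'Elf Elrond'
toy = pd.DataFrame({
    'creature_id': [1, 1, 2, 3],
    'creature_name': ['Elf Elrond', 'Elf Elrond', 'Elf Elrond', 'Goblin Greta'],
    'workshop_name': ['Herbology', 'Wand Making', 'Herbology', 'Herbology'],
    'score': [80, 90, 70, 95],
})

# Group by id (keeping the name for readability) so namesakes stay separate
per_creature = (
    toy.groupby(['creature_id', 'creature_name'])
       .agg(workshops_attended=('workshop_name', 'nunique'),
            total_score=('score', 'sum'))
       .reset_index()
)

# Row of the individual creature with the highest total score
top = per_creature.loc[per_creature['total_score'].idxmax()]

# For comparison: totals when namesakes are merged
by_name = toy.groupby('creature_name')['score'].sum()
print(per_creature)
print(top)
print(by_name)
```

In this toy case the name-level total for 'Elf Elrond' is 240, while the best individual creature (id 1) scores 170, which shows how the two groupings can give different answers.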