Q9 - Aggregation Aggravation

Question: Welcome to Aggregation Aggravation!
You are given a dataset of enchanted items and their properties.
Each item has one to three properties. Use aggregations to complete the following tasks:

- Calculate the total weight and total value of items for each item type.
- Identify the item type with the highest average value.
- Determine the item with the highest weight in each item type.
- Calculate the sum, mean, and standard deviation of the value of items for each item type.
- Find the top 3 most common properties across all items.

Datasets:

enchanted_items: Contains the columns item_id, item_type, item_name, weight, value, and properties (a comma-separated string of one to three property names).

In [None]:
import pandas as pd
import numpy as np

# Seed for reproducibility
np.random.seed(505)

# Generate synthetic data
item_ids = np.arange(1, 101)
item_types = ['Magic Wand', 'Potion Bottle', 'Enchanted Amulet', 'Flying Carpet', 'Invisibility Cloak']
item_names = ['Wand of Wonders', 'Bottle of Bliss', 'Amulet of Agility', 'Carpet of Comfort', 'Cloak of Concealment']
weight_options = np.arange(1, 11)
value_options = np.arange(100, 1001)
properties_options = ['Glows in the Dark', 'Indestructible', 'Floats on Water', 'Grants Invisibility', 'Sings Softly']

data = []
for item_id in item_ids:
    item_type = np.random.choice(item_types)
    # item_name is drawn independently of item_type, so names and types need not match
    item_name = np.random.choice(item_names)
    weight = np.random.choice(weight_options)
    value = np.random.choice(value_options)
    # Pick 1 to 3 distinct properties and join them into one comma-separated string
    properties = ', '.join(np.random.choice(properties_options, np.random.randint(1, 4), replace=False))
    data.append([item_id, item_type, item_name, weight, value, properties])

# Create DataFrame
enchanted_items = pd.DataFrame(data, columns=['item_id', 'item_type', 'item_name', 'weight', 'value', 'properties'])

# Display the dataset
enchanted_items.head()

In [None]:
# Calculate the total weight of items for each item type.
total_weight_item = enchanted_items.groupby('item_type')['weight'].sum().reset_index()
total_weight_item

In [None]:
# Calculate the total value of items for each item type.
total_value_item = enchanted_items.groupby('item_type')['value'].sum().reset_index()
total_value_item
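
The two totals computed above can also be produced in a single groupby pass. A minimal sketch, assuming a small hand-made frame (`sample_items` is a hypothetical stand-in, not the generated dataset):

```python
import pandas as pd

# Hypothetical sample standing in for enchanted_items
sample_items = pd.DataFrame({
    'item_type': ['Magic Wand', 'Magic Wand', 'Potion Bottle'],
    'weight': [2, 5, 3],
    'value': [100, 250, 400],
})

# Select both numeric columns and sum them per item type in one call
totals = sample_items.groupby('item_type')[['weight', 'value']].sum().reset_index()
print(totals)
```

Selecting the columns as a list keeps the result a DataFrame with one row per item type and one column per total.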

In [None]:
# Identify the item type with the highest average value.
average_value_item = enchanted_items.groupby('item_type')['value'].mean().reset_index()
highest_value_item = average_value_item.loc[average_value_item['value'].idxmax()]
highest_value_item
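
If only the name of the item type is needed, the lookup can be done directly on the grouped Series, since its index is the item type. A minimal sketch, assuming a hypothetical `sample_items` frame rather than the generated dataset:

```python
import pandas as pd

# Hypothetical sample standing in for enchanted_items
sample_items = pd.DataFrame({
    'item_type': ['Magic Wand', 'Magic Wand', 'Potion Bottle', 'Potion Bottle'],
    'value': [100, 300, 500, 700],
})

# Mean value per item type; the index of this Series is item_type
mean_value_by_type = sample_items.groupby('item_type')['value'].mean()

# idxmax returns the index label (the item type) of the largest mean
best_type = mean_value_by_type.idxmax()
best_mean = mean_value_by_type.max()
print(best_type, best_mean)
```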

In [None]:
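# Determine the item with the highest weight in each item type.
# idxmax returns the first row with the maximum weight in each group, so ties are resolved by row order.
max_weight_per_type = enchanted_items.loc[enchanted_items.groupby('item_type')['weight'].idxmax()].reset_index(drop=True)
# item_id is included because item_name is not unique across items
max_weight_per_type[['item_type', 'item_id', 'item_name', 'weight']]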
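
Because weights are whole numbers from 1 to 10, several items in a type can share the maximum weight, and `idxmax` reports only the first of them. A minimal sketch of one way to keep every tied item, assuming a hypothetical `sample_items` frame:

```python
import pandas as pd

# Hypothetical sample standing in for enchanted_items
sample_items = pd.DataFrame({
    'item_id': [1, 2, 3, 4],
    'item_type': ['Magic Wand', 'Magic Wand', 'Potion Bottle', 'Potion Bottle'],
    'item_name': ['Wand of Wonders', 'Bottle of Bliss', 'Amulet of Agility', 'Carpet of Comfort'],
    'weight': [10, 10, 4, 7],
})

# Maximum weight within each item type, broadcast back to every row
group_max = sample_items.groupby('item_type')['weight'].transform('max')

# Keep every row whose weight equals its group's maximum, ties included
heaviest_with_ties = sample_items[sample_items['weight'] == group_max].reset_index(drop=True)
print(heaviest_with_ties)
```

Here both Magic Wand rows are kept, whereas the `idxmax` approach would return only the one with item_id 1.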

In [None]:
# Calculate the sum, mean, and standard deviation of the value of items for each item type.
# 'std' here is the sample standard deviation (pandas default, ddof=1).
stats_value_item_type = enchanted_items.groupby('item_type')['value'].agg(['sum', 'mean', 'std']).reset_index()
stats_value_item_type
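
The standard deviation reported by pandas is the sample version (divides by n - 1). If the population version is wanted instead, `ddof=0` can be passed. A minimal sketch on a hypothetical single-type sample:

```python
import pandas as pd

# Hypothetical sample standing in for enchanted_items
sample_items = pd.DataFrame({
    'item_type': ['Magic Wand'] * 4,
    'value': [100, 200, 300, 400],
})

grouped = sample_items.groupby('item_type')['value']

# Sample standard deviation: squared deviations divided by n - 1
sample_std = grouped.std()

# Population standard deviation: squared deviations divided by n
population_std = grouped.std(ddof=0)

print(sample_std['Magic Wand'], population_std['Magic Wand'])
```

The difference shrinks as groups get larger, but it is visible for small groups like this one.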

In [None]:
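# Find the top 3 most common properties across all items.
# Split each comma-separated string into columns, then stack them into one long Series of single properties
properties_split = enchanted_items['properties'].str.split(', ', expand=True).stack()
# Count occurrences and keep the three most frequent
properties_count = properties_split.value_counts().head(3)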
properties_count
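
The split-and-stack step can also be written with `Series.explode`, which turns each list element into its own row. A minimal sketch, assuming a hypothetical `sample_items` frame with the same comma-separated format:

```python
import pandas as pd

# Hypothetical sample standing in for enchanted_items
sample_items = pd.DataFrame({
    'properties': [
        'Glows in the Dark, Indestructible',
        'Indestructible, Sings Softly',
        'Sings Softly, Indestructible, Glows in the Dark',
        'Indestructible, Glows in the Dark',
        'Floats on Water',
    ]
})

property_counts = (
    sample_items['properties']
    .str.split(', ')   # each cell becomes a list of property names
    .explode()         # one row per property
    .value_counts()    # sorted from most to least common
)
top_properties = property_counts.head(3)
print(top_properties)
```

Both forms give the same counts; `explode` avoids building the intermediate wide frame padded with missing values.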