# 4. Advanced Aggregations

Advanced aggregations in pandas let you apply several aggregation functions at once, use custom functions, and apply different logic to different columns of grouped data.

`.agg()` is an alias for `.aggregate()`, so the two methods are interchangeable. `.agg()` is shorter and is the more common spelling in practice.
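
As a quick check of that equivalence, the sketch below runs both spellings on a small made-up DataFrame (the `toy` values are illustrative assumptions, not rows from the COVID-19 dataset) and compares the results.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    'Location': ['A', 'A', 'B', 'B'],
    'New Cases': [10, 20, 5, 15],
})

grouped = toy.groupby('Location')['New Cases']

# Same aggregations, two spellings of the same method
via_agg = grouped.agg(['mean', 'sum'])
via_aggregate = grouped.aggregate(['mean', 'sum'])

# True when both results hold identical values
print(via_agg.equals(via_aggregate))
```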

## Performing Multiple Aggregations Simultaneously Using `.agg()`

The `.agg()` method enables you to apply multiple aggregations to grouped data, such as calculating the mean, sum, and count in a single step.



In [ ]:
import pandas as pd
# Load the COVID-19 dataset
data_path = '../DataSets/Data_COVID19_Indonesia.csv'
covid_data = pd.read_csv(data_path)
print('Dataset Preview:')
print(covid_data.head())

# Group by Location and apply multiple aggregations
multi_agg = covid_data.groupby('Location')['New Cases'].agg(['mean', 'sum', 'count'])
print('Multiple Aggregations:')
print(multi_agg)

## Passing a Dictionary to `.agg()` to Apply Different Aggregations to Different Columns

Pass a dictionary that maps column names to aggregation functions (or lists of functions) to aggregate each column differently.



In [ ]:
# Apply different aggregations to different columns
custom_agg = covid_data.groupby('Location').agg({
    'New Cases': 'mean',
    # If 'Total Cases' is a cumulative count, 'max' or 'last' may be more meaningful than 'sum'
    'Total Cases': 'sum',
    'New Deaths': 'max'
})
print('Custom Aggregations:')
print(custom_agg)
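
An alternative to the dictionary form is named aggregation (available in pandas 0.25 and later), which also lets you choose the output column names. The sketch below uses a small made-up DataFrame; the names `avg_new_cases` and `max_new_deaths` are illustrative choices, not part of the original notebook.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    'Location': ['A', 'A', 'B', 'B'],
    'New Cases': [10, 20, 5, 15],
    'New Deaths': [1, 3, 0, 2],
})

# Each keyword is an output column name; each tuple is (input column, function)
summary = toy.groupby('Location').agg(
    avg_new_cases=('New Cases', 'mean'),
    max_new_deaths=('New Deaths', 'max'),
)
print(summary)
```

The result has flat, descriptive column names, which can be easier to work with than the input column names the dictionary form keeps.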

## Using Custom Aggregation Functions

Custom functions cover aggregations that have no built-in name, such as the range of a column (max - min). Pass the function itself to `.agg()` (or `.aggregate()`); it receives each group's values as a Series and should return a single value.



In [ ]:
# Define a custom aggregation function
def range_agg(series):
    return series.max() - series.min()

# Apply the custom aggregation
custom_range = covid_data.groupby('Location')['New Cases'].agg(range_agg)
print('Custom Range Aggregation:')
print(custom_range)
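
A custom function can also be mixed with built-in aggregation names in a single list. The sketch below assumes a made-up `toy` DataFrame and a hypothetical helper named `value_range`; pandas uses the function's name as the output column label.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    'Location': ['A', 'A', 'B', 'B'],
    'New Cases': [10, 20, 5, 15],
})

def value_range(series):
    """Return the spread (max - min) of one group's values."""
    return series.max() - series.min()

# Built-in names and a custom function side by side
mixed = toy.groupby('Location')['New Cases'].agg(['min', 'max', value_range])
print(mixed)
```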

## Practical Examples
### Calculate Mean and Standard Deviation of Numerical Columns for Each Group


In [ ]:
# Calculate mean and standard deviation for numerical columns
stats_agg = covid_data.groupby('Location').agg({
    'New Cases': ['mean', 'std'],
    'New Deaths': ['mean', 'std']
})
print('Mean and Standard Deviation Aggregations:')
print(stats_agg)
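
Passing lists of functions in the dictionary produces two-level (MultiIndex) column labels such as `('New Cases', 'mean')`. One common way to flatten them, sketched here on made-up data, is to join the two levels with an underscore; the separator is an arbitrary choice.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    'Location': ['A', 'A', 'B', 'B'],
    'New Cases': [10, 20, 5, 15],
    'New Deaths': [1, 3, 0, 2],
})

stats = toy.groupby('Location').agg({
    'New Cases': ['mean', 'std'],
    'New Deaths': ['mean', 'std'],
})

# Columns are (column, function) tuples at this point
is_multi = isinstance(stats.columns, pd.MultiIndex)

# Join each tuple into a single label, e.g. 'New Cases_mean'
stats.columns = ['_'.join(col) for col in stats.columns]
print(stats)
```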

### Define a Custom Aggregation to Calculate the Range of a Column (Max - Min)


In [ ]:
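# Using the previously defined custom function.
# The result gets its own name so the range_agg function is not overwritten,
# which would break this cell when it is run a second time.
total_cases_range = covid_data.groupby('Location').agg({
    'Total Cases': range_agg
})
print('Range Aggregation:')
print(total_cases_range)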

# 5. Transforming Grouped Data

Transformations enable group-specific computations while retaining the original DataFrame structure.

## Using `.transform()` to Perform Group-Wise Operations
The `.transform()` method applies a function to each group and returns a result with the same index and length as the input, so it can be assigned back to the original DataFrame as a new column.



In [ ]:
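# Standardize (z-score) New Cases within each Location.
# Groups with a single row or zero spread produce NaN because their std is NaN or 0.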
covid_data['Normalized Cases'] = covid_data.groupby('Location')['New Cases'].transform(lambda x: (x - x.mean()) / x.std())
print('Data with Normalized Cases:')
print(covid_data[['Location', 'New Cases', 'Normalized Cases']].head())
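
The sketch below runs the same z-score transform on a small made-up DataFrame to illustrate two points: the result keeps the original row index, and groups with constant values or a single row come out as NaN. The `toy` values are assumptions chosen to trigger both cases.

```python
import pandas as pd

# Toy data (made up): A has spread, B is constant, C has a single row
toy = pd.DataFrame({
    'Location': ['A', 'A', 'A', 'B', 'B', 'C'],
    'New Cases': [10, 20, 30, 7, 7, 4],
})

z = toy.groupby('Location')['New Cases'].transform(
    lambda x: (x - x.mean()) / x.std()
)

# The result is aligned row-for-row with the original DataFrame
print(z.index.equals(toy.index))

toy['Normalized Cases'] = z
print(toy)
```

If NaN is not wanted for such groups, one option is to follow up with `.fillna(0)`, which treats them as sitting exactly at their group mean.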

### Calculate Percentage Contribution of Each Item Within Its Group


In [ ]:
# Calculate percentage contribution of New Cases within each Location
covid_data['Case Percentage'] = covid_data.groupby('Location')['New Cases'].transform(lambda x: x / x.sum() * 100)
print('Data with Case Percentage:')
print(covid_data[['Location', 'New Cases', 'Case Percentage']].head())
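
The same percentage can be computed without a lambda by broadcasting each group's total with `.transform('sum')` and dividing once. This is a sketch on made-up data, offered as an alternative that is usually faster on large frames, not as the notebook's own method.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    'Location': ['A', 'A', 'B', 'B'],
    'New Cases': [10, 30, 5, 15],
})

# One total per row, repeated for every member of the group
group_totals = toy.groupby('Location')['New Cases'].transform('sum')

# Vectorized division over the whole column
toy['Case Percentage'] = toy['New Cases'] / group_totals * 100
print(toy)
```

Within each group the percentages add up to 100. A group whose total is 0 yields NaN or infinity, so such groups may need separate handling.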

## Conclusion

Advanced aggregations and transformations are the core tools for group-wise analysis in pandas. `.agg()` (or `.aggregate()`) reduces each group to summary values, optionally with custom functions, while `.transform()` computes group-aware values that stay aligned with the original rows.