[Advanced Grouping and Aggregating](https://www.w3resource.com/python-exercises/pandas/advanced-grouping-and-aggregation/index.php)

<div class="alert alert-warning">

**1. Grouping by Multiple columns**


**Write a Pandas program to group data by multiple columns to perform complex data analysis and aggregations.**
</div>

In [1]:
import pandas as pd
# Sample DataFrame
data = {'category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'type': ['X', 'Y', 'X', 'Y', 'X', 'Y'],
        'value': [1, 2, 3, 4, 5, 6]}

df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,category,type,value
0,A,X,1
1,A,Y,2
2,B,X,3
3,B,Y,4
4,C,X,5
5,C,Y,6


<div class="alert alert-success">

**Solution 01:**
</div>

In [2]:
# Group by 'category' and 'type'
print("Group by 'category' and 'type':")
grouped = df.groupby(['category', 'type']).sum()
display(grouped)

Group by 'category' and 'type':


category,type,value
A,X,1
A,Y,2
B,X,3
B,Y,4
C,X,5
C,Y,6
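
The result above is indexed by a two-level MultiIndex (`category`, `type`). As a minimal sketch, assuming flat columns are preferred for further processing, `as_index=False` keeps the group keys as ordinary columns, and `.loc` with a tuple reads a single group from the MultiIndex version. The name `flat` is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'type': ['X', 'Y', 'X', 'Y', 'X', 'Y'],
                   'value': [1, 2, 3, 4, 5, 6]})

# as_index=False returns the group keys as regular columns
flat = df.groupby(['category', 'type'], as_index=False).sum()
print(flat)

# With the default MultiIndex result, a tuple selects one group
grouped = df.groupby(['category', 'type']).sum()
print(grouped.loc[('B', 'Y'), 'value'])
```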


<div class="alert alert-warning">

**2. Applying Multiple Aggregations**

**Write a Pandas program to apply multiple aggregation functions to grouped data for enhanced data insights.**
</div>

In [3]:
import pandas as pd

# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value': [10, 20, 30, 40, 50, 60]}

df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,Category,Value
0,A,10
1,A,20
2,B,30
3,B,40
4,C,50
5,C,60


<div class="alert alert-success">

**Solution 02:**
</div>

In [4]:
# Group by 'Category' and apply multiple aggregations
print("Group by 'Category' and apply multiple aggregations:")
grouped = df.groupby('Category').agg(
    total_value=('Value', 'sum'),
    average_value=('Value', 'mean'),
    max_value=('Value', 'max'),
    min_value=('Value', 'min'),
    count=('Value', 'count')
)

display(grouped)

Group by 'Category' and apply multiple aggregations:


Category,total_value,average_value,max_value,min_value,count
A,30,15.0,20,10,2
B,70,35.0,40,30,2
C,110,55.0,60,50,2
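
When custom output names are not needed, a shorter form may be enough. The sketch below assumes a single column is being summarised and passes a list of function names to `agg`; the result columns are then named after the functions. The name `summary` is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value': [10, 20, 30, 40, 50, 60]})

# Select one column, then pass a list of aggregation names
summary = df.groupby('Category')['Value'].agg(['sum', 'mean', 'max', 'min', 'count'])
print(summary)
```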


<div class="alert alert-warning">

**3. Custom Aggregation Functions**

**Write a Pandas program to implement custom aggregation functions within groupby for tailored data analysis.**
</div>

In [5]:
import pandas as pd

# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value': [5, 15, 25, 35, 45, 55]}

df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,Category,Value
0,A,5
1,A,15
2,B,25
3,B,35
4,C,45
5,C,55


<div class="alert alert-success">

**Solution 03:**
</div>

In [6]:
# Define custom aggregation function
def custom_agg(x):
    return (x.max() - x.min()) / (x.count() * 100)

# Group by 'Category' and apply custom aggregation
print("\nGroup by 'Category' and apply custom aggregation:")
grouped = df.groupby('Category').agg(
    custom_value=('Value', custom_agg),
    total_value=('Value', 'sum'),
    average_value=('Value', 'mean'),
    max_value=('Value', 'max'),
    min_value=('Value', 'min'),
    count=('Value', 'count')
)

display(grouped)


Group by 'Category' and apply custom aggregation:


Category,custom_value,total_value,average_value,max_value,min_value,count
A,0.05,20,10.0,15,5,2
B,0.05,60,30.0,35,25,2
C,0.05,100,50.0,55,45,2
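
Every group gives 0.05 here because each has a range of 10 and a count of 2. Since the custom function receives one group's `Value` column as a Series, it can be checked in isolation before being passed to `agg`. The sketch below assumes that workflow; `group_a` is a hypothetical stand-in for the values of category A.

```python
import pandas as pd

def custom_agg(x):
    # x is the 'Value' Series of a single group
    return (x.max() - x.min()) / (x.count() * 100)

# Check the function on one group's values: (15 - 5) / (2 * 100)
group_a = pd.Series([5, 15])
print(custom_agg(group_a))

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value': [5, 15, 25, 35, 45, 55]})

# The same function applied per group
result = df.groupby('Category')['Value'].agg(custom_agg)
print(result)
```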


<div class="alert alert-warning">

**4. Group by and Filter Groups**

**Write a Pandas program that implements the technique of grouping and filtering groups to refine your data analysis and insights.**
</div>

In [7]:
import pandas as pd
# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,Category,Value
0,A,1
1,A,2
2,B,3
3,B,4
4,C,5
5,C,6


<div class="alert alert-success">

**Solution 04:**
</div>

In [8]:
# Group by 'Category'
grouped = df.groupby(by=['Category'])
grouped

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x1080b4c20>

In [9]:
# Filter groups where the sum of 'Value' > 5
print("\nFilter groups where the sum of 'Value' > 5")
lf = lambda x: x['Value'].sum() > 5
filtered = grouped.filter(lf)

display(filtered)


Filter groups where the sum of 'Value' > 5


Unnamed: 0,Category,Value
2,B,3
3,B,4
4,C,5
5,C,6
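
Category A is dropped because its values sum to 3. A possible alternative, sketched here under the assumption that the condition depends only on a per-group aggregate, is to broadcast the group sum with `transform` and use it as a boolean mask. The names `group_sum` and `masked` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value': [1, 2, 3, 4, 5, 6]})

# filter() keeps every row of each group that passes the test
filtered = df.groupby('Category').filter(lambda x: x['Value'].sum() > 5)

# transform('sum') repeats each group's sum on every row of that group
group_sum = df.groupby('Category')['Value'].transform('sum')
masked = df[group_sum > 5]

print(masked)
```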


<div class="alert alert-warning">

**5. Group by and Apply function**

**Write a Pandas program to group data and apply custom functions to groups for flexible data transformations.**

</div>

In [10]:
import pandas as pd
# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value': [10, 20, 30, 40, 50, 60]}
df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,Category,Value
0,A,10
1,A,20
2,B,30
3,B,40
4,C,50
5,C,60


<div class="alert alert-success">

**Solution 05:**
</div>

In [11]:
# Define function to apply to each group
def scale_values(x):
    return x / x.max()
# Group by 'Category' and apply the function to each group with transform,
# which returns a result aligned with the original rows
print("\nGroup by 'Category' and apply function:")
grouped = df.groupby(by=['Category']).transform(scale_values)
display(grouped)


Group by 'Category' and apply function:


Unnamed: 0,Value
0,0.5
1,1.0
2,0.75
3,1.0
4,0.833333
5,1.0
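
Because `transform` returns a result aligned with the original rows, the scaled values can be stored next to the source data. This is a minimal sketch assuming a new column is wanted; the column name `scaled` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value': [10, 20, 30, 40, 50, 60]})

# Divide each value by the maximum of its own group
df['scaled'] = df.groupby('Category')['Value'].transform(lambda s: s / s.max())
print(df)
```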


<div class="alert alert-warning">

**6. Aggregating with different functions on different Columns**

**Write a Pandas program to use different aggregation functions on different columns for versatile data analysis.**

</div>

In [12]:
import pandas as pd
# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value1': [1, 2, 3, 4, 5, 6],
        'Value2': [10, 20, 30, 40, 50, 60]}
df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,Category,Value1,Value2
0,A,1,10
1,A,2,20
2,B,3,30
3,B,4,40
4,C,5,50
5,C,6,60


<div class="alert alert-success">

**Solution 06:**
</div>

In [13]:
# Group by 'Category' and apply different aggregations
print("\nGroup by 'Category' and apply different aggregations:")
grouped = df.groupby('Category').agg({'Value1': 'sum', 'Value2': 'mean'})
grouped_2 = df.groupby(by=['Category']).agg(
    total_value1=('Value1', 'sum'),
    average_value2=('Value2', 'mean'),
    max_value1=('Value1', 'max'),
    min_value2=('Value2', 'min'),
    count_value1=('Value1', 'count'),
    count_value2=('Value2', 'count')
)
display(grouped)
display(grouped_2)


Group by 'Category' and apply different aggregations:


Category,Value1,Value2
A,3,15.0
B,7,35.0
C,11,55.0


Category,total_value1,average_value2,max_value1,min_value2,count_value1,count_value2
A,3,15.0,2,10,2,2
B,7,35.0,4,30,2,2
C,11,55.0,6,50,2,2
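
The dictionary form can also map a column to a list of functions, in which case the result has two-level column labels. The sketch below assumes flat names are wanted afterwards and joins the two levels with an underscore; `multi` and `flat_cols` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value1': [1, 2, 3, 4, 5, 6],
                   'Value2': [10, 20, 30, 40, 50, 60]})

# A list per column produces MultiIndex columns: (column, function)
multi = df.groupby('Category').agg({'Value1': ['sum', 'max'],
                                    'Value2': ['mean', 'min']})
print(multi)

# Flatten the two levels into single names such as 'Value1_sum'
flat_cols = ['_'.join(col) for col in multi.columns]
flat = multi.copy()
flat.columns = flat_cols
print(flat)
```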


<div class="alert alert-warning">

**7. Using GroupBy with Lambda functions**

**Write a Pandas program to use lambda functions within groupby for flexible and efficient data transformations.**
</div>

In [14]:
import pandas as pd
# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value': [5, 15, 25, 35, 45, 55]}

df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,Category,Value
0,A,5
1,A,15
2,B,25
3,B,35
4,C,45
5,C,55


<div class="alert alert-success">

**Solution 07:**
</div>

In [15]:
# Group by 'Category' and apply lambda function
print("\nGroup by 'Category' and apply lambda function:")
lf = lambda x: (x.max() - x.min()) / (x.count() * 100)
grouped = df.groupby(by=['Category']).agg(lf)
display(grouped)


Group by 'Category' and apply lambda function:


Category,Value
A,0.05
B,0.05
C,0.05
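
A lambda can also be used inside named aggregation, which gives the result a descriptive column name and lets it sit beside built-in aggregations. This is a sketch under that assumption; the output names `scaled_range` and `total_value` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value': [5, 15, 25, 35, 45, 55]})

# Named aggregation: output_name=(column, function)
grouped = df.groupby('Category').agg(
    scaled_range=('Value', lambda x: (x.max() - x.min()) / (x.count() * 100)),
    total_value=('Value', 'sum'),
)
print(grouped)
```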


<div class="alert alert-warning">

**8. Grouping and Aggregating with Multiple Index Levels**

**Write a Pandas program to perform grouping and aggregation operations using multiple index levels.**
</div>

In [16]:
import pandas as pd
# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Type': ['X', 'Y', 'X', 'Y', 'X', 'Y'],
        'Value': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,Category,Type,Value
0,A,X,1
1,A,Y,2
2,B,X,3
3,B,Y,4
4,C,X,5
5,C,Y,6


<div class="alert alert-success">

**Solution 08:**
</div>

In [17]:
# Group by 'Category' and 'Type'
print("\nGroup by 'Category' and 'Type':")
grouped = df.groupby(['Category', 'Type']).agg(
    total_value=('Value', 'sum'),
    average_value=('Value', 'mean'),
    max_value=('Value', 'max'),
    min_value=('Value', 'min'),
    count=('Value', 'count')
)
display(grouped)


Group by 'Category' and 'Type':


Category,Type,total_value,average_value,max_value,min_value,count
A,X,1,1.0,1,1,1
A,Y,2,2.0,2,2,1
B,X,3,3.0,3,3,1
B,Y,4,4.0,4,4,1
C,X,5,5.0,5,5,1
C,Y,6,6.0,6,6,1
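
The solution above groups by columns and produces a MultiIndex. When the data is already indexed by several levels, grouping can target those levels directly with the `level` argument. The sketch below assumes the same data moved into a two-level index; `indexed`, `by_category`, and `by_type` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Type': ['X', 'Y', 'X', 'Y', 'X', 'Y'],
                   'Value': [1, 2, 3, 4, 5, 6]})

# Move the key columns into a two-level index
indexed = df.set_index(['Category', 'Type'])

# Aggregate over one index level at a time
by_category = indexed.groupby(level='Category').sum()
by_type = indexed.groupby(level='Type')['Value'].agg(['sum', 'mean'])

print(by_category)
print(by_type)
```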


<div class="alert alert-warning">

**9. Applying different functions to different columns with GroupBy**

**Write a Pandas program that applies different functions to different columns in Pandas GroupBy for tailored data analysis.**
</div>

In [18]:
import pandas as pd

# Sample DataFrame
data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Value1': [10, 20, 30, 40, 50, 60],
        'Value2': [100, 200, 300, 400, 500, 600]}

df = pd.DataFrame(data)
print("Sample DataFrame:")
display(df)

Sample DataFrame:


Unnamed: 0,Category,Value1,Value2
0,A,10,100
1,A,20,200
2,B,30,300
3,B,40,400
4,C,50,500
5,C,60,600


<div class="alert alert-success">

**Solution 09:**
</div>

In [19]:
# Group by 'Category' and apply different functions
print("\nGroup by 'Category' and apply different functions:")
grouped = df.groupby(by=['Category']).agg(
    total_value1=('Value1', 'sum'),
    average_value2=('Value2', 'mean'),
    max_value1=('Value1', 'max'),
    min_value2=('Value2', 'min'),
    count_value1=('Value1', 'count'),
    count_value2=('Value2', 'count')
)
display(grouped)


Group by 'Category' and apply different functions:


Category,total_value1,average_value2,max_value1,min_value2,count_value1,count_value2
A,30,150.0,20,100,2,2
B,70,350.0,40,300,2,2
C,110,550.0,60,500,2,2
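
The `(column, function)` tuples used above can also be written with `pd.NamedAgg`, which spells out the two fields by name. This is a minimal sketch covering two of the six aggregations, assuming the more explicit form is preferred for readability.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
                   'Value1': [10, 20, 30, 40, 50, 60],
                   'Value2': [100, 200, 300, 400, 500, 600]})

# pd.NamedAgg(column=..., aggfunc=...) is the explicit form of the tuple
grouped = df.groupby('Category').agg(
    total_value1=pd.NamedAgg(column='Value1', aggfunc='sum'),
    average_value2=pd.NamedAgg(column='Value2', aggfunc='mean'),
)
print(grouped)
```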
