In [5]:
import pandas as pd


You are a Product Analyst on the **WhatsApp** team investigating group messaging dynamics. Your team wants to understand how large groups are being used and their messaging patterns. You'll leverage data to uncover insights about group participation and communication behaviors.

In [6]:
# Load the CSV file into a DataFrame
dim_groups = pd.read_csv('dim_groups.csv')

# Display the DataFrame
print("DataFrame loaded from dim_groups.csv:")
print(dim_groups)


DataFrame loaded from dim_groups.csv:
    group_id created_date  total_messages  participant_count
0          1   2024-10-01             100                 25
1          2   2024-10-10             200                 55
2          3   2024-11-05             150                 40
3          4   2024-10-15             500                100
4          5   2024-12-01             120                 35
5          6   2024-10-20             300                 50
6          7   2024-10-25             400                 60
7          8   2024-11-10             220                 45
8          9   2024-10-30             450                 80
9         10   2024-12-15              80                 15
10        11   2024-10-05             600                 90
11        12   2024-10-12              50                 10


### Question 1 of 3

What is the maximum number of participants among WhatsApp groups that were created in October 2024? This metric will help us understand the largest group size observed among those groups.

In [7]:
# Ensure 'created_date' is a datetime type
dim_groups['created_date'] = pd.to_datetime(dim_groups['created_date'])

# Filter for October 2024
oct_2024 = dim_groups[
    (dim_groups['created_date'].dt.year == 2024) &
    (dim_groups['created_date'].dt.month == 10)
]

# Get the maximum participant_count
max_participants = oct_2024['participant_count'].max()

print("Maximum number of participants in October 2024 groups:")
print(max_participants)


Maximum number of participants in October 2024 groups:
100
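
The year-and-month filter above can also be written as a single monthly-period comparison. The following is a minimal sketch, not part of the original analysis: the `sample` frame is rebuilt by hand from the printed `dim_groups` output so the block runs without the CSV, and `groups_created_in` is a hypothetical helper name introduced only for illustration.

```python
import pandas as pd

# Rows copied by hand from the dim_groups output above (assumption: the
# printed table is the full contents of dim_groups.csv).
sample = pd.DataFrame({
    "group_id": list(range(1, 13)),
    "created_date": [
        "2024-10-01", "2024-10-10", "2024-11-05", "2024-10-15",
        "2024-12-01", "2024-10-20", "2024-10-25", "2024-11-10",
        "2024-10-30", "2024-12-15", "2024-10-05", "2024-10-12",
    ],
    "participant_count": [25, 55, 40, 100, 35, 50, 60, 45, 80, 15, 90, 10],
})
sample["created_date"] = pd.to_datetime(sample["created_date"])


def groups_created_in(df, month):
    """Hypothetical helper: rows whose created_date falls in a 'YYYY-MM' month."""
    target = pd.Period(month, freq="M")
    return df[df["created_date"].dt.to_period("M") == target]


oct_groups = groups_created_in(sample, "2024-10")
max_participants_check = oct_groups["participant_count"].max()
print(max_participants_check)
```

Comparing against one `pd.Period` keeps the year and month in a single value, which may be easier to reuse when the same question is asked for another month.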


### Question 2 of 3

What is the average number of participants in WhatsApp groups that were created in October 2024? This number will indicate the typical group size and inform our group messaging feature considerations.

In [8]:
# Filter for groups created in October 2024
oct_2024 = dim_groups[
    (dim_groups['created_date'].dt.year == 2024) &
    (dim_groups['created_date'].dt.month == 10)
]

# Calculate the average number of participants
avg_participants = oct_2024['participant_count'].mean()

print("Average number of participants in October 2024 groups:")
print(avg_participants)


Average number of participants in October 2024 groups:
58.75
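
As a quick sanity check, the average can be recomputed by hand from the eight October 2024 rows in the printed table. This is only an illustrative sketch; the list below is copied manually from the `dim_groups` output and `october_counts` is a name introduced here.

```python
# participant_count for the groups created in October 2024, copied from the
# dim_groups output (group_id 1, 2, 4, 6, 7, 9, 11, 12).
october_counts = [25, 55, 100, 50, 60, 80, 90, 10]

total = sum(october_counts)                 # sum of participants
hand_mean = total / len(october_counts)     # arithmetic mean

print(total, len(october_counts), hand_mean)  # prints: 470 8 58.75
```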


### Question 3 of 3

For WhatsApp groups with more than 50 participants that were created in October 2024, what is the average number of messages sent? This insight will help assess engagement in larger groups and support recommendations for group messaging features.

In [9]:
# Filter for groups created in October 2024 with more than 50 participants
large_oct_2024 = dim_groups[
    (dim_groups['created_date'].dt.year == 2024) &
    (dim_groups['created_date'].dt.month == 10) &
    (dim_groups['participant_count'] > 50)
]

# Calculate the average number of messages sent
avg_messages = large_oct_2024['total_messages'].mean()

print("Average number of messages sent in October 2024 groups with more than 50 participants:")
print(avg_messages)

Average number of messages sent in October 2024 groups with more than 50 participants:
430.0
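
The three answers share the same October 2024 filter, so they can be gathered in one place with reusable boolean masks. The sketch below is one possible arrangement rather than the notebook's own method; it assumes the printed table is the complete dataset and rebuilds it inline as `sample`, and the names `is_oct_2024`, `is_large`, and `summary` are introduced here for illustration.

```python
import pandas as pd

# Rows copied by hand from the dim_groups output (assumed to be complete).
sample = pd.DataFrame({
    "group_id": list(range(1, 13)),
    "created_date": pd.to_datetime([
        "2024-10-01", "2024-10-10", "2024-11-05", "2024-10-15",
        "2024-12-01", "2024-10-20", "2024-10-25", "2024-11-10",
        "2024-10-30", "2024-12-15", "2024-10-05", "2024-10-12",
    ]),
    "total_messages": [100, 200, 150, 500, 120, 300, 400, 220, 450, 80, 600, 50],
    "participant_count": [25, 55, 40, 100, 35, 50, 60, 45, 80, 15, 90, 10],
})

# Build each condition once and combine as needed.
is_oct_2024 = (
    (sample["created_date"].dt.year == 2024)
    & (sample["created_date"].dt.month == 10)
)
is_large = sample["participant_count"] > 50  # strictly more than 50

summary = pd.Series({
    "max_participants": sample.loc[is_oct_2024, "participant_count"].max(),
    "avg_participants": sample.loc[is_oct_2024, "participant_count"].mean(),
    "avg_messages_large_groups":
        sample.loc[is_oct_2024 & is_large, "total_messages"].mean(),
})
print(summary)
```

Note that group 6 has exactly 50 participants, so the strict `> 50` comparison leaves it out of the large-group average; using `>= 50` instead would change the result.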
