# Day 1: WhatsApp Group Size Engagement Analysis

You are a Product Analyst on the WhatsApp team investigating group messaging dynamics. Your team wants to understand how large groups are being used and their messaging patterns. You'll leverage data to uncover insights about group participation and communication behaviors.

In [None]:
import pandas as pd
import numpy as np

dim_groups_data = [
  {
    "group_id": 1,
    "created_date": "2024-10-01",
    "total_messages": 100,
    "participant_count": 25
  },
  {
    "group_id": 2,
    "created_date": "2024-10-10",
    "total_messages": 200,
    "participant_count": 55
  },
  {
    "group_id": 3,
    "created_date": "2024-11-05",
    "total_messages": 150,
    "participant_count": 40
  },
  {
    "group_id": 4,
    "created_date": "2024-10-15",
    "total_messages": 500,
    "participant_count": 100
  },
  {
    "group_id": 5,
    "created_date": "2024-12-01",
    "total_messages": 120,
    "participant_count": 35
  },
  {
    "group_id": 6,
    "created_date": "2024-10-20",
    "total_messages": 300,
    "participant_count": 50
  },
  {
    "group_id": 7,
    "created_date": "2024-10-25",
    "total_messages": 400,
    "participant_count": 60
  },
  {
    "group_id": 8,
    "created_date": "2024-11-10",
    "total_messages": 220,
    "participant_count": 45
  },
  {
    "group_id": 9,
    "created_date": "2024-10-30",
    "total_messages": 450,
    "participant_count": 80
  },
  {
    "group_id": 10,
    "created_date": "2024-12-15",
    "total_messages": 80,
    "participant_count": 15
  },
  {
    "group_id": 11,
    "created_date": "2024-10-05",
    "total_messages": 600,
    "participant_count": 90
  },
  {
    "group_id": 12,
    "created_date": "2024-10-12",
    "total_messages": 50,
    "participant_count": 10
  }
]
dim_groups = pd.DataFrame(dim_groups_data)


## Question 1

What is the maximum number of participants among WhatsApp groups that were created in October 2024? This metric will help us understand the largest group size available.

In [None]:
# Note: pandas and numpy are already imported as pd and np
# The following tables are loaded as pandas DataFrames with the same names: dim_groups
# Please print your final result or dataframe

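# Work on a copy so the original DataFrame is left untouched
df = dim_groups.copy()

# Inspect the structure and the first few rows
df.info()
print(df.head())

# Keep only the groups created in October 2024.
# created_date holds ISO-formatted (YYYY-MM-DD) strings, so string comparison follows date order.
df_october = df.query("created_date >= '2024-10-01' and created_date < '2024-11-01'")
print(df_october)

# Largest participant count among the October groups
max_participants_oct = df_october["participant_count"].max()
print("The maximum number of participants among WhatsApp groups that were created in October 2024 is", max_participants_oct, "participants")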
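
The filter above works because `created_date` is stored as ISO-formatted strings, where alphabetical order matches chronological order. As a hedged alternative, the sketch below parses the column with `pd.to_datetime` and selects October 2024 by year and month. The `rows` list re-creates the `dim_groups` data from the first cell so the snippet runs on its own; the names `rows`, `groups`, `is_oct_2024`, and `max_participants` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Same rows as dim_groups above:
# (group_id, created_date, total_messages, participant_count)
rows = [
    (1, "2024-10-01", 100, 25), (2, "2024-10-10", 200, 55),
    (3, "2024-11-05", 150, 40), (4, "2024-10-15", 500, 100),
    (5, "2024-12-01", 120, 35), (6, "2024-10-20", 300, 50),
    (7, "2024-10-25", 400, 60), (8, "2024-11-10", 220, 45),
    (9, "2024-10-30", 450, 80), (10, "2024-12-15", 80, 15),
    (11, "2024-10-05", 600, 90), (12, "2024-10-12", 50, 10),
]
groups = pd.DataFrame(
    rows,
    columns=["group_id", "created_date", "total_messages", "participant_count"],
)

# Parse the strings into timestamps so the filter does not depend on text ordering
created = pd.to_datetime(groups["created_date"])
is_oct_2024 = (created.dt.year == 2024) & (created.dt.month == 10)

max_participants = groups.loc[is_oct_2024, "participant_count"].max()
print(max_participants)  # 100
```

Parsing the dates first is mainly a safeguard: it would still behave correctly if the dates arrived in a format whose text order differs from date order.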

## Question 2

What is the average number of participants in WhatsApp groups that were created in October 2024? This number will indicate the typical group size and inform our group messaging feature considerations.

In [None]:
# Work on a copy so the original DataFrame is left untouched
df = dim_groups.copy()

# Inspect the first few rows
print(df.head())

# Keep only the groups created in October 2024
df_October = df.query("created_date >= '2024-10-01' and created_date < '2024-11-01'")
print(df_October)

# Mean of the participant_count column for the October groups
df_meanpart_Oct = df_October["participant_count"].mean()

# Round the mean to the nearest whole number
df_mean_rounded = df_meanpart_Oct.round()

# Print the answer
print("The average number of participants in WhatsApp groups that were created in October 2024 was", df_mean_rounded, "participants")
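
As a quick cross-check of the average, the sketch below recomputes it directly from the eight October participant counts in `dim_groups` (25, 55, 100, 50, 60, 80, 90, 10). The series `oct_participants` is typed in by hand from the data in the first cell, which is an assumption of this snippet rather than something the notebook does.

```python
import pandas as pd

# participant_count for the eight groups created in October 2024
# (group_id 1, 2, 4, 6, 7, 9, 11, 12 in dim_groups)
oct_participants = pd.Series([25, 55, 100, 50, 60, 80, 90, 10])

# agg computes several summaries in one call and returns them as a Series
summary = oct_participants.agg(["max", "mean"])

print(summary["mean"])                # 58.75  (470 / 8)
print(round(float(summary["mean"])))  # 59
```

Keeping the unrounded 58.75 next to the rounded value makes it clear how much precision the rounding step discards.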

## Question 3

For WhatsApp groups with more than 50 participants that were created in October 2024, what is the average number of messages sent? This insight will help assess engagement in larger groups and support recommendations for group messaging features.

In [None]:
# Work on a copy so the original DataFrame is left untouched
df = dim_groups.copy()

# Inspect the first few rows and the structure (df.info() prints its own output)
print(df.head())
df.info()

# Keep only the groups created in October 2024
df_October = df.query("created_date >= '2024-10-01' and created_date < '2024-11-01'")
print(df_October)

# Narrow the October groups to those with more than 50 participants
df_above50part_Oct = df_October.query("participant_count > 50")
print(df_above50part_Oct)

# Average total_messages across those groups, rounded to the nearest whole number
df_avgmsgsent = df_above50part_Oct["total_messages"].mean().round()
print("The average number of messages sent for WhatsApp groups with more than 50 participants that were created in October 2024 was", df_avgmsgsent, "messages")
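
The two filters above can also be combined into a single boolean mask. The sketch below does that and, as a side check, shows how the answer would shift if "more than 50" were read as "50 or more", since one October group has exactly 50 participants. The `rows` list repeats the relevant `dim_groups` columns so the snippet runs on its own; `rows`, `groups`, `in_october`, `avg_strict`, and `avg_inclusive` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# (created_date, total_messages, participant_count) for each row of dim_groups above
rows = [
    ("2024-10-01", 100, 25), ("2024-10-10", 200, 55), ("2024-11-05", 150, 40),
    ("2024-10-15", 500, 100), ("2024-12-01", 120, 35), ("2024-10-20", 300, 50),
    ("2024-10-25", 400, 60), ("2024-11-10", 220, 45), ("2024-10-30", 450, 80),
    ("2024-12-15", 80, 15), ("2024-10-05", 600, 90), ("2024-10-12", 50, 10),
]
groups = pd.DataFrame(rows, columns=["created_date", "total_messages", "participant_count"])

# ISO date strings for October 2024 all start with "2024-10"
in_october = groups["created_date"].str.startswith("2024-10")

# Strictly more than 50 participants, as the question is worded
avg_strict = groups.loc[in_october & (groups["participant_count"] > 50), "total_messages"].mean()
print(avg_strict)  # 430.0

# Alternative reading: 50 or more, which also admits the group with exactly 50 participants
avg_inclusive = groups.loc[in_october & (groups["participant_count"] >= 50), "total_messages"].mean()
print(avg_inclusive)
```

The strict comparison matches the wording of the question; the inclusive variant is shown only to illustrate that the threshold choice changes the result.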

Made with ❤️ by [Interview Master](https://www.interviewmaster.ai)