<a href="https://colab.research.google.com/github/Airbone25/phone-company-data-analysis-project/blob/main/Copy_of_Sample_EDA_Submission_Template.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    - Mobile Phones Usage Pattern Analysis Across Time: Monthly, Weekly, and Quarterly Trends



##### **Project Type**    - EDA
##### **Contribution**    - Individual
##### **Team Member 1 -** Keshav Mehra

# **Project Summary -**

This analysis investigates telecom service usage patterns by examining call durations, SMS counts, and data consumption across various timeframes—monthly, weekly, and quarterly—using detailed time-series data. The dataset includes records of telecom activities with timestamps, network identifiers, and usage metrics, which were enhanced with time-based features such as month, quarter, week of the year, day of the week, and hour of the day. This comprehensive exploration provides valuable insights into customer behavior and network utilization, enabling telecom providers to optimize their operations and marketing strategies.

By aggregating the data by month and year, the analysis reveals clear seasonal trends in how customers use telecom services. For example, call durations and SMS counts fluctuate throughout the year, with identifiable peaks in certain months that might correspond to holidays, festivals, or other special events. Data usage generally shows an increasing trend, reflecting the growing reliance on internet-based services and streaming. Breaking down usage by quarter offers a broader perspective on seasonal shifts and business cycles, helping providers forecast demand more accurately and align their infrastructure investments with expected usage.
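
As a minimal sketch of the month-and-year aggregation described above, the example below groups a few made-up records by calendar month using a pandas period. The sample rows are hypothetical; only the column names `date` and `duration` follow the notebook. Grouping on a year-month period keeps months from different years separate and in chronological order.

```python
import pandas as pd

# Hypothetical records; only the column names mirror the notebook.
records = pd.DataFrame({
    "date": pd.to_datetime(["2014-11-03", "2014-12-20", "2015-01-05", "2015-01-18"]),
    "duration": [5.0, 7.0, 3.0, 4.0],
})

# Group by year-month period so January 2015 sorts after December 2014.
monthly_total = records.groupby(records["date"].dt.to_period("M"))["duration"].sum()

print(monthly_total)
```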

A finer granularity analysis of usage by week of the year uncovers short-term patterns and anomalies that might be masked in monthly or quarterly aggregates. Weekly trends can expose the effects of marketing campaigns, network outages, or external factors like weather conditions or social events. Telecom companies can leverage this insight to adjust their operational planning dynamically, ensuring adequate capacity during high-demand weeks and performing maintenance during quieter periods.
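
The time-based features used for the weekly and other breakdowns can be derived from a parsed timestamp with pandas' `.dt` accessor. The sketch below is illustrative only: the sample timestamps are made up, and the feature names are chosen to match those used in the notebook.

```python
import pandas as pd

# Hypothetical timestamps; the column name 'date' mirrors the notebook.
sample = pd.DataFrame({
    "date": pd.to_datetime(["2014-10-15 06:58", "2014-12-29 18:30", "2015-03-01 23:05"]),
})

sample["month"] = sample["date"].dt.month
sample["quarter"] = sample["date"].dt.quarter
# ISO week number: late-December dates can belong to week 1 of the next ISO year.
sample["week_of_year"] = sample["date"].dt.isocalendar().week.astype(int)
sample["day_of_week"] = sample["date"].dt.dayofweek  # Monday=0 ... Sunday=6
sample["hour"] = sample["date"].dt.hour

print(sample)
```

Note that 29 December 2014 is assigned ISO week 1, so grouping by calendar year and ISO week together can split a week across two groups at year boundaries.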

Examining daily and hourly usage patterns by item type (calls, SMS, data) further enriches the understanding of customer behavior. The data shows that call durations and SMS activity peak on certain weekdays, typically during the workweek, while data usage tends to spike during evenings and weekends when users are more likely to stream media or browse online. Hourly usage profiles help identify critical load periods during the day, enabling better network resource allocation and more efficient staffing of customer support services. Additionally, this knowledge supports the design of time-based pricing models or promotions that incentivize usage during off-peak hours, helping to balance network loads.
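
One simple way to turn an hourly profile into peak and off-peak periods is to compare each hour against quartile cutoffs. The sketch below assumes hypothetical hourly record counts, and the quartile thresholds are an illustrative choice rather than the notebook's method.

```python
import pandas as pd

# Hypothetical record counts per hour of day (hour -> number of records).
hourly = pd.Series({8: 40, 9: 95, 10: 120, 14: 110, 20: 60, 23: 10}, name="records")

# Treat the top quartile of hours as peak and the bottom quartile as off-peak.
peak_cutoff = hourly.quantile(0.75)
off_peak_cutoff = hourly.quantile(0.25)

peak_hours = hourly[hourly >= peak_cutoff].index.tolist()
off_peak_hours = hourly[hourly <= off_peak_cutoff].index.tolist()

print("Peak hours:", peak_hours)
print("Off-peak hours:", off_peak_hours)
```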

Network and technology preferences were also analyzed by grouping usage data by network providers and network types (such as 2G, 3G, 4G, and 5G). Average call durations across these groups provide a proxy measure for service quality and user satisfaction. Longer average call durations on a provider's network might indicate better coverage or fewer dropped calls, suggesting areas where investment is paying off. Conversely, identifying networks or technologies with lower performance metrics can help prioritize infrastructure upgrades or customer service improvements. Understanding the distribution and number of unique networks in use also aids in planning roaming agreements and partnerships, ensuring seamless connectivity for users.
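
When average call duration is used as a proxy, it helps to report the number of calls behind each average, since a mean based on a handful of calls is unreliable. The sketch below uses made-up rows and hypothetical category labels (`mobile`, `landline`); only the column names follow the notebook.

```python
import pandas as pd

# Hypothetical call records; durations are in minutes.
calls = pd.DataFrame({
    "network_type": ["mobile", "mobile", "landline", "mobile", "landline"],
    "duration": [2.0, 4.0, 10.0, 6.0, 20.0],
})

# Named aggregation: average duration and number of calls per group.
summary = (
    calls.groupby("network_type")["duration"]
    .agg(average_minutes="mean", call_count="count")
    .sort_values("average_minutes", ascending=False)
)

print(summary)
```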

The analysis identified several challenges, including periods with zero or very low activity that could reflect data collection gaps, customer churn, or seasonal inactivity. Addressing these requires close collaboration between data teams and operations to ensure high-quality data and targeted interventions. Another limitation is the assumption that call duration directly correlates with customer satisfaction, which might not always hold, especially with increasing use of alternative communication channels like instant messaging and VoIP.
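
Periods with no activity are easy to miss in a grouped table because absent days simply do not appear. One possible check, sketched here on hypothetical daily counts, is to reindex onto a complete calendar so that missing days surface as zeros.

```python
import pandas as pd

# Hypothetical daily record counts with two calendar days missing.
daily_counts = pd.Series(
    [12, 9, 15, 11],
    index=pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-05", "2015-01-06"]),
    name="records",
)

# Reindex onto a complete daily calendar so absent days show up as zero activity.
full_range = pd.date_range(daily_counts.index.min(), daily_counts.index.max(), freq="D")
complete = daily_counts.reindex(full_range, fill_value=0)

# Days with no records: candidates for a collection gap or genuine inactivity.
zero_days = complete[complete == 0].index
print(zero_days.strftime("%Y-%m-%d").tolist())
```

Whether a zero day is a data gap or real inactivity cannot be decided from the counts alone; it needs to be checked against the collection process.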

Overall, the insights from this multi-dimensional analysis offer actionable intelligence for telecom companies aiming to enhance customer experience and optimize resource utilization. By aligning network capacity planning with detailed usage trends and customer behavior patterns, businesses can reduce costs, prevent network congestion, and boost revenue through targeted marketing. Furthermore, ongoing monitoring using these analytical frameworks enables rapid response to changing customer needs and market conditions, supporting sustained business growth in a highly competitive telecom sector.

# **GitHub Link -**



Provide your GitHub Link here.

# **Problem Statement**


Telecom companies face challenges in efficiently managing network resources and delivering quality service due to fluctuating user demand across time. Without a clear understanding of when and how customers use calls, SMS, and data, it is difficult to forecast capacity requirements or design effective promotional campaigns. This analysis aims to uncover temporal usage trends and network preferences from raw telecom data to support data-driven decision-making for network planning, customer engagement, and business growth.

#### **Define Your Business Objective?**

Answer Here.

# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import numpy as np
import pandas as pd
import math
import seaborn as sns
import matplotlib.pyplot as plt

from matplotlib import rcParams

import warnings
warnings.filterwarnings('ignore')

### Dataset Loading

In [None]:
# Load Dataset
dataset = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Dataset/phone_data.csv')

### Dataset First View

In [None]:
# Dataset First Look
dataset.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
dataset.shape

### Dataset Information

In [None]:
# Dataset Info
dataset.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
dataset.duplicated().sum()

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
print(dataset.isnull().sum())

In [None]:
# Visualizing the missing values
# checking null value using heatmap
sns.heatmap(dataset.isnull(), cbar=False)

### What did you know about your dataset?

Answer Here

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
dataset.columns

In [None]:
# Dataset Describe
dataset.describe()

### Variables Description

Answer Here

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
for i in dataset.columns.tolist():
  print("No. of unique values in ",i,"is",dataset[i].nunique(),".")

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
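# Write your code to make your dataset analysis ready.

# Parse the timestamp column so the .dt accessors below work
dataset['date'] = pd.to_datetime(dataset['date'])

# Derive basic time features
dataset['day_of_week'] = dataset['date'].dt.dayofweek  # Monday=0 ... Sunday=6
dataset['day_of_month'] = dataset['date'].dt.day
dataset['hour'] = dataset['date'].dt.hour

# Convert call durations to minutes by dividing by 60.
# Run this cell only once; re-running it divides the values again.
is_call = dataset['item'] == 'call'
dataset.loc[is_call, 'duration'] = dataset.loc[is_call, 'duration'] / 60

display(dataset.head())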

In [None]:

# Create a new column 'month' from the 'date' column
dataset['month'] = dataset['date'].dt.month

# Create a new column 'year' from the 'date' column
dataset['year'] = dataset['date'].dt.year

# Create a new column 'day_name' from the 'date' column
dataset['day_name'] = dataset['date'].dt.day_name()

# Create a new column 'month_name' from the 'date' column
dataset['month_name'] = dataset['date'].dt.month_name()

# Create a new column 'quarter' from the 'date' column
dataset['quarter'] = dataset['date'].dt.quarter

# Convert the 'network' column to categorical
dataset['network'] = dataset['network'].astype('category')

# Convert the 'network_type' column to categorical
dataset['network_type'] = dataset['network_type'].astype('category')

# Convert the 'item' column to categorical
dataset['item'] = dataset['item'].astype('category')

# Display the first few rows with the new columns
display(dataset.head())

In [None]:

# Group by 'month' and calculate the sum of 'duration' for calls
monthly_call_duration = dataset[dataset['item'] == 'call'].groupby('month')['duration'].sum().reset_index()
monthly_call_duration.rename(columns={'duration': 'total_call_duration_minutes'}, inplace=True)
display(monthly_call_duration.head())

# Group by 'month' and count the number of sms
monthly_sms_count = dataset[dataset['item'] == 'sms'].groupby('month')['item'].count().reset_index()
monthly_sms_count.rename(columns={'item': 'total_sms_count'}, inplace=True)
display(monthly_sms_count.head())

# Group by 'month' and sum the 'duration' for data usage
monthly_data_usage = dataset[dataset['item'] == 'data'].groupby('month')['duration'].sum().reset_index()
monthly_data_usage.rename(columns={'duration': 'total_data_usage_mb'}, inplace=True)
display(monthly_data_usage.head())

# Merge the monthly summary tables
monthly_summary = monthly_call_duration.merge(monthly_sms_count, on='month', how='left').merge(monthly_data_usage, on='month', how='left')
# Fill NaN values that occurred due to the merge (e.g., months with no data/sms) with 0
monthly_summary.fillna(0, inplace=True)
display(monthly_summary)

# Group by 'network' and count the number of calls
calls_by_network = dataset[dataset['item'] == 'call'].groupby('network')['item'].count().reset_index()
calls_by_network.rename(columns={'item': 'total_calls'}, inplace=True)
display(calls_by_network.head())

# Group by 'network_type' and count the number of calls
calls_by_network_type = dataset[dataset['item'] == 'call'].groupby('network_type')['item'].count().reset_index()
calls_by_network_type.rename(columns={'item': 'total_calls'}, inplace=True)
display(calls_by_network_type.head())

# Group by 'day_of_week' and sum the duration for calls
daily_call_duration = dataset[dataset['item'] == 'call'].groupby('day_of_week')['duration'].sum().reset_index()
daily_call_duration.rename(columns={'duration': 'total_call_duration_minutes'}, inplace=True)
# Map day of week integer to name for better readability
day_map = {0: 'Monday', 1: 'Tuesday', 2: 'Wednesday', 3: 'Thursday', 4: 'Friday', 5: 'Saturday', 6: 'Sunday'}
daily_call_duration['day_name'] = daily_call_duration['day_of_week'].map(day_map)
display(daily_call_duration)

# Group by 'hour' and count the number of calls
hourly_call_count = dataset[dataset['item'] == 'call'].groupby('hour')['item'].count().reset_index()
hourly_call_count.rename(columns={'item': 'total_calls'}, inplace=True)
display(hourly_call_count.head())

# Calculate the average call duration
average_call_duration = dataset[dataset['item'] == 'call']['duration'].mean()
print(f"Average Call Duration: {average_call_duration:.2f} minutes")

# Calculate the total number of SMS
total_sms = dataset[dataset['item'] == 'sms'].shape[0]
print(f"Total SMS: {total_sms}")

# Calculate the total data usage
total_data = dataset[dataset['item'] == 'data']['duration'].sum()
print(f"Total Data Usage: {total_data:.2f} MB")

In [None]:

# Further Data Wrangling: Exploring usage patterns over time and by item type

# Group by year and month and count the number of calls, sms, and data usage
monthly_usage_trend = dataset.groupby(['year', 'month', 'item'])['duration'].agg(['count', 'sum']).reset_index()
monthly_usage_trend.rename(columns={'count': 'count_of_item', 'sum': 'total_duration_or_usage'}, inplace=True)
display(monthly_usage_trend.head())

# Pivot the table to have items as columns for easier comparison across months
monthly_usage_pivot = monthly_usage_trend.pivot_table(index=['year', 'month'], columns='item', values=['count_of_item', 'total_duration_or_usage']).reset_index()
display(monthly_usage_pivot.head())

# Flatten the multi-level column index
monthly_usage_pivot.columns = ['_'.join(map(str, col)).strip() for col in monthly_usage_pivot.columns.values]
monthly_usage_pivot.rename(columns={'year_': 'year', 'month_': 'month'}, inplace=True)
display(monthly_usage_pivot.head())

# Fill NaN values with 0 where items might not be present in certain months
monthly_usage_pivot.fillna(0, inplace=True)
display(monthly_usage_pivot.head())

# Calculate total activity per month
monthly_usage_pivot['total_activity_count'] = monthly_usage_pivot['count_of_item_call'] + monthly_usage_pivot['count_of_item_data'] + monthly_usage_pivot['count_of_item_sms']
display(monthly_usage_pivot.head())

# Group by day of the week and item type to see daily patterns
daily_item_usage = dataset.groupby(['day_of_week', 'item'])['duration'].agg(['count', 'sum']).reset_index()
daily_item_usage.rename(columns={'count': 'count_of_item', 'sum': 'total_duration_or_usage'}, inplace=True)
daily_item_usage['day_name'] = daily_item_usage['day_of_week'].map(day_map) # Using the day_map defined earlier
display(daily_item_usage)

# Pivot the daily usage data
daily_item_usage_pivot = daily_item_usage.pivot_table(index=['day_of_week', 'day_name'], columns='item', values=['count_of_item', 'total_duration_or_usage']).reset_index()
daily_item_usage_pivot.columns = ['_'.join(map(str, col)).strip() for col in daily_item_usage_pivot.columns.values]
daily_item_usage_pivot.rename(columns={'day_of_week_': 'day_of_week', 'day_name_': 'day_name'}, inplace=True)
daily_item_usage_pivot.fillna(0, inplace=True)
display(daily_item_usage_pivot)

# Group by hour and item type to see hourly patterns
hourly_item_usage = dataset.groupby(['hour', 'item'])['duration'].agg(['count', 'sum']).reset_index()
hourly_item_usage.rename(columns={'count': 'count_of_item', 'sum': 'total_duration_or_usage'}, inplace=True)
display(hourly_item_usage.head())

# Pivot the hourly usage data
hourly_item_usage_pivot = hourly_item_usage.pivot_table(index='hour', columns='item', values=['count_of_item', 'total_duration_or_usage']).reset_index()
hourly_item_usage_pivot.columns = ['_'.join(map(str, col)).strip() for col in hourly_item_usage_pivot.columns.values]
hourly_item_usage_pivot.rename(columns={'hour_': 'hour'}, inplace=True)
hourly_item_usage_pivot.fillna(0, inplace=True)
display(hourly_item_usage_pivot.head())

# Calculate the average call duration by network type
average_call_duration_by_network_type = dataset[dataset['item'] == 'call'].groupby('network_type')['duration'].mean().reset_index()
average_call_duration_by_network_type.rename(columns={'duration': 'average_call_duration_minutes'}, inplace=True)
display(average_call_duration_by_network_type)

# Calculate the average call duration by network
average_call_duration_by_network = dataset[dataset['item'] == 'call'].groupby('network')['duration'].mean().reset_index()
average_call_duration_by_network.rename(columns={'duration': 'average_call_duration_minutes'}, inplace=True)
display(average_call_duration_by_network)

# Calculate the number of unique networks used
unique_networks_count = dataset['network'].nunique()
print(f"Number of unique networks used: {unique_networks_count}")

# Calculate the number of unique network types used
unique_network_types_count = dataset['network_type'].nunique()
print(f"Number of unique network types used: {unique_network_types_count}")

# Create a summary table for the dataset
dataset_summary = pd.DataFrame({
    'Column': dataset.columns,
    'DataType': dataset.dtypes,
    'Non-Null Count': dataset.count(),
    'Unique Count': dataset.nunique()
})
display(dataset_summary)

In [None]:


# Identify top used networks and network types
top_networks = dataset['network'].value_counts().reset_index()
top_networks.columns = ['network', 'count']
display(top_networks.head())

top_network_types = dataset['network_type'].value_counts().reset_index()
top_network_types.columns = ['network_type', 'count']
display(top_network_types.head())

### What all manipulations have you done and insights you found?

Answer Here.

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code
# Set style
sns.set(style="whitegrid")

# Filter only call records
calls = dataset[dataset['item'] == 'call']

# Plot average call duration by hour of the day
plt.figure(figsize=(12, 6))
# errorbar=None replaces the deprecated ci=None (seaborn >= 0.12)
sns.lineplot(data=calls, x='hour', y='duration', estimator='mean', errorbar=None, marker='o', color='royalblue')
plt.title('Average Call Duration by Hour of Day', fontsize=16)
plt.xlabel('Hour of Day (0–23)')
plt.ylabel('Average Duration (minutes)')
plt.xticks(range(0, 24))
plt.grid(True)
plt.show()

##### 1. Why did you pick the specific chart?

##### 2. What is/are the insight(s) found from the chart?

Call durations are higher during weekdays, particularly around working hours (10 AM to 4 PM), suggesting business-related usage.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 2

In [None]:
# Chart - 2 visualization code
plt.figure(figsize=(10,6))
sns.countplot(data=dataset, x='day_name', hue='item', order=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'])
plt.title('Telecom Item Usage by Day of Week')
plt.xlabel('Day of Week')
plt.ylabel('Count')
plt.legend(title='Item')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

A grouped count plot places the number of call, SMS, and data records for each day of the week side by side. This makes it easy to compare item types within a day and to see how activity shifts between weekdays and the weekend.

##### 2. What is/are the insight(s) found from the chart?

Most telecom activity occurs on weekdays, with "calls" being the most frequent item used, especially on Mondays. Data usage peaks around the end of each quarter, indicating possible bill cycle usage spikes or promotional periods.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

The day-of-week usage chart captures customer behavior patterns across the week. Understanding which days see more calls, messages, or data sessions allows companies to optimize customer support staffing and schedule maintenance during quieter periods to reduce disruptions. Missing this insight might mean poor service on busy days or ineffective marketing that doesn't reach customers when they are most engaged.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
plt.figure(figsize=(12,6))
sns.lineplot(data=monthly_summary, x='month', y='total_call_duration_minutes', label='Call Duration (min)')
sns.lineplot(data=monthly_summary, x='month', y='total_sms_count', label='SMS Count')
sns.lineplot(data=monthly_summary, x='month', y='total_data_usage_mb', label='Data Usage (MB)')
plt.title('Monthly Telecom Usage Summary')
plt.xlabel('Month')
plt.ylabel('Usage')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

The monthly summary chart showing call duration, SMS count, and data usage is essential because it helps reveal how telecom usage changes throughout the year. This allows businesses to spot peak demand months, plan network capacity accordingly, and time their marketing campaigns for maximum impact. If a company ignores these trends, they risk overloading their network during busy times or missing out on revenue opportunities by not running promotions when customers are most active.
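
Because the three series use different units (minutes, message counts, and MB), the largest-valued series can dominate a shared axis. One option, not used in the notebook, is to min-max scale each series to the 0 to 1 range before plotting. The sketch below uses made-up monthly values with the notebook's column names.

```python
import pandas as pd

# Hypothetical monthly summary; column names follow the notebook.
monthly = pd.DataFrame({
    "month": [1, 2, 3],
    "total_call_duration_minutes": [100.0, 150.0, 200.0],
    "total_sms_count": [10, 30, 20],
    "total_data_usage_mb": [3000.0, 3500.0, 4000.0],
})

metrics = ["total_call_duration_minutes", "total_sms_count", "total_data_usage_mb"]

# Min-max scaling per column: (value - min) / (max - min).
value_range = monthly[metrics].max() - monthly[metrics].min()
scaled = (monthly[metrics] - monthly[metrics].min()) / value_range
scaled.insert(0, "month", monthly["month"])

print(scaled)
```

A column whose values never change has a zero range and would come out as NaN, so constant series need separate handling. Scaling shows the shape of each trend but hides absolute magnitudes.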

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 4

In [None]:
# Chart - 4 visualization code
plt.figure(figsize=(14,7))
months = monthly_usage_pivot['month'].astype(int)
sns.lineplot(x=months, y=monthly_usage_pivot['count_of_item_call'], label='Call Count')
sns.lineplot(x=months, y=monthly_usage_pivot['count_of_item_sms'], label='SMS Count')
sns.lineplot(x=months, y=monthly_usage_pivot['count_of_item_data'], label='Data Count')
plt.title('Monthly Counts of Calls, SMS, and Data')
plt.xlabel('Month')
plt.ylabel('Count')
plt.xticks(months)
plt.legend()
plt.grid(True)
plt.show()

plt.figure(figsize=(14,7))
sns.lineplot(x=months, y=monthly_usage_pivot['total_duration_or_usage_call'], label='Call Duration (min)')
sns.lineplot(x=months, y=monthly_usage_pivot['total_duration_or_usage_sms'], label='SMS Duration (min)')  # Usually SMS duration may be small/NA
sns.lineplot(x=months, y=monthly_usage_pivot['total_duration_or_usage_data'], label='Data Usage (MB)')
plt.title('Monthly Duration/Usage of Calls, SMS, and Data')
plt.xlabel('Month')
plt.ylabel('Duration / Usage')
plt.xticks(months)
plt.legend()
plt.grid(True)
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 5

In [None]:
# Chart - 5 visualization code
plt.figure(figsize=(12,6))
daily_counts = daily_item_usage_pivot[['day_name', 'count_of_item_call', 'count_of_item_sms', 'count_of_item_data']].set_index('day_name')
sns.heatmap(daily_counts, annot=True, fmt='g', cmap='YlGnBu')
plt.title('Daily Usage Counts by Item')
plt.ylabel('Day of Week')
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 6

In [None]:
# Chart - 6 visualization code
plt.figure(figsize=(10,6))
sns.barplot(data=average_call_duration_by_network.sort_values('average_call_duration_minutes', ascending=False),
            x='network', y='average_call_duration_minutes', palette='viridis')
plt.title('Average Call Duration by Network')
plt.xlabel('Network')
plt.ylabel('Average Duration (Minutes)')
plt.xticks(rotation=45)
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 7

In [None]:
# Chart - 7 visualization code
plt.figure(figsize=(10,6))
sns.barplot(data=daily_call_duration.sort_values('day_of_week'), x='day_name', y='total_call_duration_minutes', palette='coolwarm')
plt.title('Total Call Duration by Day of Week')
plt.xlabel('Day')
plt.ylabel('Call Duration (Minutes)')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 8

In [None]:
# Chart - 8 visualization code
# Plot Top Networks
plt.figure(figsize=(10,5))
sns.barplot(data=top_networks.head(10), x='network', y='count', palette='Blues_d')
plt.title('Top 10 Networks by Usage Count')
plt.xlabel('Network')
plt.ylabel('Usage Count')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

# Plot Top Network Types
plt.figure(figsize=(6,4))
sns.barplot(data=top_network_types, x='network_type', y='count', palette='Set2')
plt.title('Network Type Usage Count')
plt.xlabel('Network Type')
plt.ylabel('Usage Count')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 9

In [None]:
# Chart - 9 visualization code
# Aggregate total call duration, SMS count, and data usage by quarter
quarterly_usage = dataset.groupby(['year', 'quarter', 'item'])['duration'].agg(['count', 'sum']).reset_index()
quarterly_pivot = quarterly_usage.pivot_table(index=['year', 'quarter'], columns='item', values='sum').fillna(0).reset_index()

plt.figure(figsize=(10,6))
sns.lineplot(data=quarterly_pivot, x='quarter', y='call', marker='o', label='Call Duration (min)')
sns.lineplot(data=quarterly_pivot, x='quarter', y='sms', marker='o', label='SMS Count')
sns.lineplot(data=quarterly_pivot, x='quarter', y='data', marker='o', label='Data Usage (MB)')
plt.title('Quarterly Telecom Usage Trends')
plt.xlabel('Quarter')
plt.ylabel('Total Usage')
plt.legend()
plt.grid(True)
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 10

In [None]:
# Chart - 10 visualization code
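# Aggregate total call duration, SMS count, and data usage by week of year
# 'week_of_year' is not created in the wrangling cells, so derive it here (ISO week number)
dataset['week_of_year'] = dataset['date'].dt.isocalendar().week.astype(int)
weekly_usage = dataset.groupby(['year', 'week_of_year', 'item'])['duration'].agg(['count', 'sum']).reset_index()
weekly_pivot = weekly_usage.pivot_table(index=['year', 'week_of_year'], columns='item', values='sum').fillna(0).reset_index()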

plt.figure(figsize=(14,7))
sns.lineplot(data=weekly_pivot, x='week_of_year', y='call', label='Call Duration (min)')
sns.lineplot(data=weekly_pivot, x='week_of_year', y='sms', label='SMS Count')
sns.lineplot(data=weekly_pivot, x='week_of_year', y='data', label='Data Usage (MB)')
plt.title('Weekly Telecom Usage Trends')
plt.xlabel('Week of Year')
plt.ylabel('Total Usage')
plt.legend()
plt.grid(True)
plt.show()

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 11

In [None]:
# Chart - 11 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 12

In [None]:
# Chart - 12 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 13

In [None]:
# Chart - 13 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

Answer Here.

# **Conclusion**

Write the conclusion here.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***

#### Chart - 1 - Distribution of item types and their total duration

In [None]:
# Calculate the total duration for each item type
item_duration = dataset.groupby('item')['duration'].sum().reset_index()

# Create a bar plot
plt.figure(figsize=(8, 6))
sns.barplot(x='item', y='duration', data=item_duration)
plt.title('Total Duration by Item Type')
plt.xlabel('Item Type')
plt.ylabel('Total Duration')
plt.show()

##### 1. Why did you pick the specific chart?
I chose a bar chart because it is effective in comparing the total duration across different discrete categories (item types: call, sms, data).

##### 2. What is/are the insight(s) found from the chart?
This chart shows the total duration for each item type. We can observe which activity (call, sms, or data) has the highest total duration in the dataset.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.
Understanding the distribution of activity duration can help in business decisions such as optimizing service plans, identifying peak usage patterns, or focusing marketing efforts on the most used services. For example, if data usage has a significantly higher duration, a business might focus on data-centric plans. There isn't a clear insight that directly leads to negative growth from this chart alone; rather, it provides information that can inform strategic decisions.
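
Since the notebook labels `duration` in minutes for calls, as a count for SMS, and in MB for data, summed durations are not directly comparable across item types. A unit-free complement is the share of records per item type, sketched here on a made-up `item` column.

```python
import pandas as pd

# Hypothetical item labels; values mirror the notebook's item types.
items = pd.Series(["call", "call", "data", "sms", "call", "sms", "data", "call"], name="item")

# Percentage of records per item type.
share = items.value_counts(normalize=True).mul(100).round(1)

print(share)
```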

#### Chart - 3 - Total duration by month

In [None]:
# Calculate the total duration for each month
month_duration = dataset.groupby('month')['duration'].sum().reset_index()

# Create a line plot to show trend over months
plt.figure(figsize=(10, 6))
sns.lineplot(x='month', y='duration', data=month_duration)
plt.title('Total Duration by Month')
plt.xlabel('Month')
plt.ylabel('Total Duration')
plt.xticks(rotation=45)
plt.show()

##### 1. Why did you pick the specific chart?
A line plot is suitable for visualizing trends over a continuous variable like time (months).

##### 2. What is/are the insight(s) found from the chart?
This chart shows how the total duration of activities changes over the months captured in the dataset. We can observe if there are any periods of higher or lower usage.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.
Analyzing monthly trends can help a business understand seasonality in usage and plan resources accordingly. For example, if there's a peak in usage during certain months, they might need to increase capacity or offer promotions. A decline in usage over months could indicate a need to investigate customer churn or changes in behavior.
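
To make rises and declines explicit rather than reading them off the line, month-over-month percentage change can be computed with `pct_change`. The sketch below uses hypothetical monthly totals with the same column names as the cell above.

```python
import pandas as pd

# Hypothetical monthly totals; column names follow the notebook.
month_duration = pd.DataFrame({
    "month": [1, 2, 3, 4],
    "duration": [200.0, 250.0, 200.0, 100.0],
})

# Percentage change relative to the previous month (first month has no predecessor).
month_duration["pct_change"] = (month_duration["duration"].pct_change() * 100).round(1)

# Months where usage fell compared with the month before.
declining_months = month_duration.loc[month_duration["pct_change"] < 0, "month"].tolist()

print(month_duration)
print("Declining months:", declining_months)
```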

#### Chart - 4 - Total duration by day of the week

In [None]:
# Calculate the total duration for each day of the week
dayofweek_duration = dataset.groupby('day_of_week')['duration'].sum().reset_index()

# Map day of the week numbers to names for better readability
dayofweek_duration['day_of_week'] = dayofweek_duration['day_of_week'].map({0: 'Mon', 1: 'Tue', 2: 'Wed', 3: 'Thu', 4: 'Fri', 5: 'Sat', 6: 'Sun'})

# Create a bar plot
plt.figure(figsize=(8, 6))
sns.barplot(x='day_of_week', y='duration', data=dayofweek_duration, order=['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'])
plt.title('Total Duration by Day of the Week')
plt.xlabel('Day of the Week')
plt.ylabel('Total Duration')
plt.show()

##### 1. Why did you pick the specific chart?
A bar chart is suitable for comparing the total duration across the distinct categories of days of the week.

##### 2. What is/are the insight(s) found from the chart?
This chart shows how the total duration of activities varies across the days of the week. We can identify which days have the highest or lowest usage.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.
Understanding daily usage patterns can help in resource allocation and marketing campaigns. For example, a business might offer promotions on days with lower usage to encourage activity or ensure sufficient capacity on peak days. There isn't a direct insight for negative growth from this chart, but a significant drop in usage on certain days could warrant further investigation into customer behavior or network performance.