
# Introduction

*An Introduction to Customer Behavior Analysis*

*Customer behavior analysis is a cornerstone of modern business strategy. It lets firms move beyond traditional tactics and adopt a data-driven mindset. As the business landscape evolves, understanding customer behavior has become critical for making informed decisions, improving customer experiences, and staying competitive over the long term.*


# Import Needed Libraries

In [None]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

# Load & Explore The Dataset

In [None]:
df = pd.read_csv("/content/drive/MyDrive/DATA/my projrects data set/ecommerce_customer_data.csv")


In [None]:
df.head()

Unnamed: 0,User_ID,Gender,Age,Location,Device_Type,Product_Browsing_Time,Total_Pages_Viewed,Items_Added_to_Cart,Total_Purchases
0,1,Female,23,Ahmedabad,Mobile,60,30,1,0
1,2,Male,25,Kolkata,Tablet,30,38,9,4
2,3,Male,32,Bangalore,Desktop,37,13,5,0
3,4,Male,35,Delhi,Mobile,7,20,10,3
4,5,Male,27,Bangalore,Tablet,35,20,8,2


In [None]:
df.columns

Index(['User_ID', 'Gender', 'Age', 'Location', 'Device_Type',
       'Product_Browsing_Time', 'Total_Pages_Viewed', 'Items_Added_to_Cart',
       'Total_Purchases'],
      dtype='object')

In [None]:
df.describe() # Summary statistics for numeric columns

Unnamed: 0,User_ID,Age,Product_Browsing_Time,Total_Pages_Viewed,Items_Added_to_Cart,Total_Purchases
count,500.0,500.0,500.0,500.0,500.0,500.0
mean,250.5,26.276,30.74,27.182,5.15,2.464
std,144.481833,5.114699,15.934246,13.071596,3.203127,1.740909
min,1.0,18.0,5.0,5.0,0.0,0.0
25%,125.75,22.0,16.0,16.0,2.0,1.0
50%,250.5,26.0,31.0,27.0,5.0,2.0
75%,375.25,31.0,44.0,38.0,8.0,4.0
max,500.0,35.0,60.0,50.0,10.0,5.0


In [None]:
df.describe(include="object") # Summary statistics for non-numeric (object) columns

Unnamed: 0,Gender,Location,Device_Type
count,500,500,500
unique,2,8,3
top,Male,Kolkata,Mobile
freq,261,71,178


# Data Visualization

In [None]:
fig = px.histogram(df, x="Age", title="Customer Age Distribution", color_discrete_sequence=["#17becf"])
# fig.update_traces(texttemplate='%{x}', textposition='outside')
fig.show()

*The histogram shows the age distribution of the firm's customers. Peaks at ages 21, 29, and 35 indicate high concentrations of customers at these ages. Knowing where customers cluster helps tailor marketing strategies, product offers, and engagement initiatives to the prevalent demographics, and it identifies focal points for targeted advertising aimed at the firm's largest customer segments.*
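
The peaks read off the histogram can be cross-checked with a plain frequency count. The following is a minimal sketch, assuming a small made-up `ages` series in place of `df["Age"]`; the values are illustrative and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical ages standing in for df["Age"]; not the real dataset.
ages = pd.Series([21, 21, 21, 24, 29, 29, 35, 35, 35, 35], name="Age")

# Count customers at each exact age, ordered by age like the histogram's x-axis.
age_counts = ages.value_counts().sort_index()

# The three most common ages, largest count first.
top_ages = age_counts.nlargest(3)

print(top_ages)
```

Counting exact ages avoids any dependence on how the histogram groups neighbouring ages into bins, so it is a useful sanity check before acting on the peaks.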


In [None]:
fig = px.histogram(df, x='Location', title='Customer Demographic by Location', color='Location', color_discrete_sequence=px.colors.qualitative.Set2)
fig.update_traces(texttemplate='%{y}', textposition='inside')
fig.show()

In [None]:
location_grouped = df.groupby('Location')['Total_Purchases'].mean().reset_index()
location_grouped['Average_Items_Purchased'] = location_grouped['Total_Purchases'].round()  # Round the average values
location_grouped = location_grouped[['Location', 'Average_Items_Purchased']]  # Keep only necessary columns

fig = px.bar(location_grouped, x='Location', y='Average_Items_Purchased',
             title='Average Items Purchased by Location', color='Location',
             color_discrete_sequence=px.colors.qualitative.Set2)
fig.update_traces(texttemplate='%{y}', textposition='inside')

fig.update_layout(yaxis_title='Average Items Purchased')  # Optional: add y-axis title
fig.show()


The first graph shows the number of customers in each location. Kolkata and Delhi have the largest customer bases. The second graph shows the average number of items purchased per location, rounded to the nearest whole number. Kolkata and Delhi lead here as well, joined by Chennai. Kolkata and Delhi therefore combine a large customer base with comparatively high purchasing, and Chennai shows similarly high purchasing. This makes all three locations candidates for targeted marketing, product promotions, or personalized strategies.

**Mean (Average): When we calculate the mean of 'Total_Purchases' for each group (for example, each location, gender, or device type), we get the average number of items purchased within that group.
For example, with data for three device types (A, B, and C), the mean for each type gives the typical number of items purchased by customers using that device.**
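
The group mean described above can be illustrated on a tiny example. This is a sketch under stated assumptions: the `toy` frame and its values are hypothetical, and only the column names `Device_Type` and `Total_Purchases` come from the dataset. It also shows what `.round()` does to the averages, since some charts in this notebook round before plotting.

```python
import pandas as pd

# Hypothetical purchases per customer; not taken from the real dataset.
toy = pd.DataFrame({
    "Device_Type": ["Mobile", "Mobile", "Mobile",
                    "Tablet", "Tablet", "Tablet", "Tablet",
                    "Desktop", "Desktop", "Desktop"],
    "Total_Purchases": [3, 3, 2, 4, 3, 2, 3, 2, 3, 2],
})

# Mean per group: sum of purchases in the group divided by the number of customers in it.
means = toy.groupby("Device_Type")["Total_Purchases"].mean()

# Rounding to whole numbers hides small differences between groups.
rounded = means.round()

print(pd.DataFrame({"mean": means, "rounded": rounded}))
```

In this toy data the Mobile and Tablet means differ slightly, yet both round to the same whole number, so a chart of rounded averages would show them as equal.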

In [None]:
fig = px.pie(df,
             names='Gender',
             hole=0.5,
             color_discrete_sequence=['#17becf', '#d62728'],
             title='Customer Gender Distribution')

fig.update_traces(textposition='inside', textinfo='percent+label')
fig.show()


In [None]:
gender_grouped = df.groupby('Gender')['Total_Purchases'].mean().reset_index()
gender_grouped['Average_Items_Purchased'] = gender_grouped['Total_Purchases'].round()  # Round the average values

gender_grouped = gender_grouped[['Gender', 'Average_Items_Purchased']]


fig = px.bar(gender_grouped, x='Gender', y='Average_Items_Purchased',
             title='Average Items Purchased by Gender', color='Gender',
             color_discrete_sequence=['#17becf', '#d62728'])
fig.update_traces(texttemplate='%{y}', textposition='inside')


fig.update_layout(yaxis_title='Average Items Purchased')  # Optional: add y-axis title
fig.show()


In [None]:
gender_grouped = df.groupby('Gender')['Product_Browsing_Time'].mean().reset_index()
gender_grouped.columns = ['Gender', 'Average_Product_Browsing_Time']

fig = px.pie(gender_grouped, values='Average_Product_Browsing_Time', names='Gender',
             title='Average Product Browsing Time by Gender',
             color='Gender', color_discrete_sequence=px.colors.qualitative.Set1)
fig.update_traces(textposition='inside', textinfo='percent+label')


fig.show()

In [None]:
devices_grouped = df.groupby('Device_Type')['Total_Purchases'].mean().reset_index()
devices_grouped.columns = ['Device_Type', 'Average_Items_Purchased']


fig = px.bar(devices_grouped, x='Device_Type', y='Average_Items_Purchased',
             title='Average Items Purchased by Devices', color='Device_Type',
             color_discrete_sequence=px.colors.qualitative.Set2)
fig.update_traces(texttemplate='%{y}', textposition='inside')


fig.update_layout(yaxis_title='Average Items Purchased')  # Optional: add y-axis title
fig.show()

In [None]:
devices_grouped = df.groupby('Device_Type')['Total_Pages_Viewed'].mean().reset_index()
devices_grouped.columns = ['Device_Type', 'Average_Pages_Viewed']


fig = px.bar(devices_grouped, x='Device_Type', y='Average_Pages_Viewed',
             title='Average Pages Viewed by Device', color='Device_Type',
             color_discrete_sequence=px.colors.qualitative.Set2)
fig.update_traces(texttemplate='%{y}', textposition='inside')


fig.show()

In [None]:
devices_grouped = df.groupby('Device_Type')['Product_Browsing_Time'].mean().reset_index()
devices_grouped.columns = ['Device_Type', 'Average_Product_Browsing_Time']


fig = px.bar(devices_grouped, x='Device_Type', y='Average_Product_Browsing_Time',
             title='Average Product Browsing Time by Device', color='Device_Type',
             color_discrete_sequence=px.colors.qualitative.Set2)
fig.update_traces(texttemplate='%{y}', textposition='inside')


fig.show()

*Analyzing customer behavior by device type shows that mobile and tablet users have slightly higher averages than desktop users in both items purchased and pages viewed. The differences are small, but their consistency across both metrics suggests somewhat stronger engagement on mobile and tablet devices.
Average browsing time, by contrast, is nearly the same on every device. Customers on mobile and tablet devices may purchase and view slightly more, but they spend about as much time on the platform as desktop users.*

*These findings underscore the significance of optimizing the user experience on mobile and tablet interfaces, recognizing the preferences of a substantial user base. Additionally, the comparable browsing times on different devices highlight the need for a seamless and efficient browsing experience across the entire spectrum of devices, ensuring a consistent and engaging interaction regardless of the chosen platform.*
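
One way to judge how large the device differences are is to express each device's averages as a percentage difference from the desktop average. The sketch below assumes a hypothetical `toy` frame with made-up values; only the column names match the dataset, and `pct_vs_desktop` is an illustrative name.

```python
import pandas as pd

# Hypothetical per-customer records; not taken from the real dataset.
toy = pd.DataFrame({
    "Device_Type": ["Mobile", "Mobile", "Tablet", "Tablet", "Desktop", "Desktop"],
    "Total_Purchases": [3, 3, 2, 4, 2, 2],
    "Total_Pages_Viewed": [30, 30, 25, 35, 20, 30],
    "Product_Browsing_Time": [30, 32, 31, 31, 30, 32],
})

metrics = ["Total_Purchases", "Total_Pages_Viewed", "Product_Browsing_Time"]

# Average of each metric per device type.
means = toy.groupby("Device_Type")[metrics].mean()

# Percentage difference of each device's average relative to the Desktop average.
pct_vs_desktop = (means / means.loc["Desktop"] - 1) * 100

print(pct_vs_desktop.round(1))
```

A table like this puts all three metrics on the same scale, which makes it easier to tell a consistent but small gap from a substantial one.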

In [None]:
browsingtime_grouped = df.groupby('Product_Browsing_Time')['Total_Pages_Viewed'].mean().reset_index()
browsingtime_grouped.columns = ['Product_Browsing_Time', 'Average_Total_Pages_Viewed']


fig = px.bar(browsingtime_grouped, x='Product_Browsing_Time', y='Average_Total_Pages_Viewed',
             title='Average Total Pages Viewed by Product Browsing Time', color='Product_Browsing_Time',
             color_discrete_sequence=px.colors.qualitative.Set3)
fig.update_traces(texttemplate='%{y}', textposition='inside')


fig.show()



In [None]:
pagesview_grouped = df.groupby('Total_Pages_Viewed')['Items_Added_to_Cart'].mean().reset_index()
pagesview_grouped.columns = ['Total_Pages_Viewed', 'Average_Items_Added_to_Cart']


fig = px.bar(pagesview_grouped, x='Total_Pages_Viewed', y='Average_Items_Added_to_Cart',
             title='Average Items Added to Cart in Relation to Total Pages Viewed', color='Average_Items_Added_to_Cart',
             color_discrete_sequence=px.colors.qualitative.Set3)
fig.update_traces(texttemplate='%{y}', textposition='inside')


fig.show()

*The charts suggest only a limited relationship between total pages viewed and either items added to the cart or browsing time. In other words, viewing more pages does not translate directly into more cart additions or longer browsing sessions.
Several factors could explain this. Customers might be exploring a wide range of products without a specific intent to purchase, which weakens the link between page views and cart additions. Browsing time might also be driven by activity unrelated to purchasing, such as informational searches or casual exploration.*

*To enhance the understanding of customer behavior and improve conversion rates, it's crucial to delve deeper into the specific actions that drive cart additions and prolonged browsing. This could involve analyzing the types of pages visited, the relevance of product recommendations, or the effectiveness of the user interface. By pinpointing the factors influencing customer actions, businesses can tailor their strategies to align with customer preferences and boost overall engagement.*
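
The limited correlation described above is read off bar charts of group means; it can also be measured directly with a correlation matrix. This is a minimal sketch, assuming a hypothetical `toy` frame with made-up values in place of `df`; only the column names come from the dataset.

```python
import pandas as pd

# Hypothetical per-customer records; not taken from the real dataset.
toy = pd.DataFrame({
    "Total_Pages_Viewed": [10, 20, 30, 40, 50],
    "Items_Added_to_Cart": [5, 1, 9, 2, 6],
    "Product_Browsing_Time": [40, 10, 55, 20, 35],
})

# Pairwise Pearson correlation (the pandas default) between the numeric columns.
corr = toy.corr()

# A value near 0 means little linear relationship; values near +1 or -1 mean a strong one.
pages_vs_cart = corr.loc["Total_Pages_Viewed", "Items_Added_to_Cart"]
pages_vs_time = corr.loc["Total_Pages_Viewed", "Product_Browsing_Time"]

print(corr.round(2))
```

Pearson correlation only captures linear relationships, so a low value here would support the "limited correlation" reading without ruling out more complex patterns.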

In [None]:
itempurchased_fromcart_grouped = df.groupby('Total_Purchases')['Items_Added_to_Cart'].mean().reset_index()
itempurchased_fromcart_grouped.columns = ['Total_Purchases', 'Average_Items_Added_to_Cart']


fig = px.bar(itempurchased_fromcart_grouped, x='Total_Purchases', y='Average_Items_Added_to_Cart',
             title='Average Items Added to Cart in Relation to Total Purchases', color='Total_Purchases',
             color_discrete_sequence=px.colors.qualitative.Set3)
fig.update_traces(texttemplate='%{y}', textposition='inside')


fig.show()

In [None]:
itempurchased_grouped = df.groupby('Total_Purchases')['Total_Pages_Viewed'].mean().reset_index()
itempurchased_grouped.columns = ['Total_Purchases', 'Average_Total_Pages_Viewed']


fig = px.bar(itempurchased_grouped, x='Total_Purchases', y='Average_Total_Pages_Viewed',
             title='Average Total Pages Viewed in Relation to Total Purchases', color='Total_Purchases',
             color_discrete_sequence=px.colors.qualitative.Set3)
fig.update_traces(texttemplate='%{y}', textposition='inside')


fig.show()

## Customer Lifetime Value

In [None]:
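# Calculate a simple customer lifetime value (CLV) proxy
df['CLV'] = (df['Total_Purchases'] * df['Total_Pages_Viewed']) / df['Age']

# Bins are right-closed: (1, 2.5], (2.5, 5], (5, inf].
# Customers with CLV <= 1 (including everyone with zero purchases) fall outside the bins and get NaN.
df['Segment'] = pd.cut(df['CLV'], bins=[1, 2.5, 5, float('inf')],
                         labels=['Low Value', 'Medium Value', 'High Value'])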

segment_counts = df['Segment'].value_counts().reset_index()
segment_counts.columns = ['Segment', 'Count']

# Create a bar chart to visualize the customer segments
fig = px.bar(segment_counts, x='Segment', y='Count',
             title='Customer Segmentation by CLV',color='Segment',color_discrete_sequence=px.colors.qualitative.Prism)
fig.update_xaxes(title='Segment')
fig.update_yaxes(title='Number of Customers')
fig.show()
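
How `pd.cut` assigns the segments above is easiest to see on a few values. The sketch below reuses the same bin edges and labels as the cell above, applied to a hypothetical `clv` series whose values are made up for illustration.

```python
import pandas as pd

# Hypothetical CLV values chosen to hit every bin edge; not taken from the real dataset.
clv = pd.Series([0.0, 1.0, 1.5, 2.5, 4.0, 7.0], name="CLV")

# Same edges and labels as the segmentation cell; intervals are right-closed by default.
segments = pd.cut(clv, bins=[1, 2.5, 5, float("inf")],
                  labels=["Low Value", "Medium Value", "High Value"])

# dropna=False keeps the customers that fall outside every bin visible in the counts.
counts = segments.value_counts(dropna=False)

print(pd.DataFrame({"CLV": clv, "Segment": segments}))
print(counts)
```

Values of 1 or below land in no bin and become `NaN`, so a plain `value_counts()` would silently leave those customers out of the chart. If they should be counted, one option is to lower the first edge, though whether that matches the intended segmentation is a judgment call.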

*A churn rate of 0.198, or 19.8%, means that 99 of the 500 customers are flagged as churned. In this analysis, a customer counts as churned when their total purchases equal zero, which serves as a proxy for customers who are no longer buying from the business. A share this large is a notable loss for the customer base and can hurt business growth and profitability.*

In [None]:
# Calculate churn rate
df['Churned'] = df['Total_Purchases'] == 0

churn_rate = df['Churned'].mean()
print(churn_rate)

0.198
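
The churn calculation above relies on the fact that the mean of a boolean column equals the share of `True` values. A minimal sketch with a hypothetical `purchases` series (made-up values, not the real data) makes the arithmetic explicit.

```python
import pandas as pd

# Hypothetical purchase counts for five customers; not taken from the real dataset.
purchases = pd.Series([0, 2, 0, 5, 1], name="Total_Purchases")

# True where the customer has made no purchases.
churned = purchases == 0

# True counts as 1 and False as 0, so the mean is churned customers / all customers.
churn_rate = churned.mean()
churned_count = int(churned.sum())

print(f"{churned_count} of {len(purchases)} customers churned ({churn_rate:.1%})")
```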


# Conclusion

In conclusion, our comprehensive analysis provides valuable insights into the dynamics of customer behavior within the examined firm. The age distribution reveals significant peaks at ages 21, 29, and 35, guiding the customization of marketing strategies and product offers to align with these prevalent demographics. Geographic representation highlights Kolkata and Delhi as key customer hubs, while a nuanced exploration of average items purchased identifies Chennai, Delhi, and Kolkata as leaders in purchasing behavior.
The examination of device-based behavior emphasizes the preference for mobile and tablet interfaces, showcasing higher averages in items purchased and pages viewed. However, the comparable browsing times across devices underline the importance of a seamless user experience across the entire spectrum.

The limited correlation between total pages viewed and cart additions or browsing time calls for a closer look at the specific factors that drive customer actions. Understanding these factors is pivotal for refining strategies and improving conversion rates.

Additionally, a churn rate of 19.8% signals the need for robust retention strategies to limit customer losses and sustain business growth. Notably, male customers outnumber female customers (261 of 500) and show higher purchasing activity, while female customers spend more time browsing.

In light of these findings, a holistic approach that tailors strategies to diverse demographics, optimizes user experiences across devices, and addresses the identified challenges is essential. Implementing such insights-driven initiatives will not only enhance customer satisfaction but also fortify the organization's position in a dynamic market landscape.

# **DONE BY : Rami Assaf**