# Shopology: Decoding Customer Shopping Trends
---

In [None]:
# installing necessary libraries (openpyxl is required by shop.to_excel for .xlsx output)
!pip install numpy pandas seaborn matplotlib plotly openpyxl

In [1]:
# importing libraries
import numpy as np  # array operations and mathematical functions
import pandas as pd  # loading and exploring tabular data
import seaborn as sns  # statistical plots
import matplotlib.pyplot as plt  # lower-level plot customization
import plotly.express as px  # interactive plots

In [3]:
# reading the data set
shop = pd.read_csv('shopping_trends_updated.csv')

In [None]:
shop.shape

In [None]:
shop.to_excel('shopping_trends_updated.xlsx')

In [None]:
shop.head()

In [None]:
shop.tail()

In [None]:
shop.dtypes

In [None]:
# it shows the names of the columns 
shop.columns

In [None]:
shop.info()

In [None]:
shop.isnull().sum()

In [None]:
# print the unique values of each categorical column, separated by blank lines
categorical_columns = ['Gender', 'Category', 'Size', 'Subscription Status',
                       'Shipping Type', 'Discount Applied', 'Promo Code Used', 'Payment Method']
print("\n\n".join(
    f"The unique values of the '{col}' column are: {shop[col].unique()}"
    for col in categorical_columns
))

## OBSERVATION:
Upon initial examination of the dataset, it is evident that we have a comprehensive and well-structured dataset with 3900 rows and 18 columns. The data is complete, with no missing values, which allows us to proceed confidently with our analysis.

Let's delve into the columns and their significance in understanding our customers:

-  **Customer ID:** This column serves as a unique identifier for each customer, enabling us to differentiate between individuals.
-  **Age:** The age column provides insights into the age demographics of our customers, helping us understand their preferences and behaviors.
-  **Gender:** This column showcases the gender of the customers, enabling us to analyze buying patterns based on gender.
-  **Item Purchased:** Here, we can identify the specific products that customers have bought, allowing us to gain an understanding of popular choices.
-  **Category:** The category column categorizes the products into different groups such as clothing, footwear, and more, aiding us in analyzing trends within specific product categories.
-  **Purchase Amount (USD):** This column reveals the amount customers spent on their purchases, providing insights into their spending habits.
-  **Location:** The location column indicates the geographical location of customers, which can help identify regional trends and preferences.
-  **Size:** This column denotes the size of the purchased products, assisting in understanding size preferences across different categories.
-  **Color:** Here, we can determine the color preferences of customers, aiding in analyzing color trends and their impact on purchasing decisions.
-  **Season:** The season column allows us to identify the season during which customers made their purchases, enabling us to explore seasonal shopping trends.
-  **Review Rating:** This column showcases the ratings given by customers, providing valuable feedback on product satisfaction and quality.
-  **Subscription Status:** This column indicates whether customers have opted for a subscription status, which can help us understand customer loyalty and engagement.
-  **Shipping Type:** Here, we can identify the different shipping methods used to deliver products to customers, shedding light on preferred shipping options.
-  **Discount Applied:** This column indicates whether a discount was applied to the purchased products, enabling us to analyze the impact of discounts on customer behavior.
-  **Promo Code Used:** Here, we can identify whether customers utilized promo codes during their purchases, helping us evaluate the effectiveness of promotional campaigns.
-  **Previous Purchases:** This column reveals the number of previous purchases made by customers, aiding in understanding customer loyalty and repeat business.
-  **Payment Method:** The payment method column showcases the various methods used by customers to make their purchases, allowing us to analyze preferred payment options.
-  **Frequency of Purchases:** This column provides insights into the frequency at which customers make purchases, helping us identify patterns and customer buying habits.

With this rich and diverse dataset, we are well-equipped to explore customer shopping trends, understand their preferences, and uncover valuable insights that can drive informed decision-making and enhance the overall customer experience. Let's embark on this exciting analysis journey!
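
The claims in the observation above (row and column counts, no missing values) can be checked programmatically rather than by eye. The following is a minimal sketch: `demo` is a small hypothetical stand-in for `shop`, and `summarize_quality` is an illustrative helper name, not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in for `shop`; in the notebook, pass the real DataFrame instead.
demo = pd.DataFrame({
    "Customer ID": [1, 2, 3, 4],
    "Age": [55, 19, 50, 21],
    "Gender": ["Male", "Male", "Female", "Female"],
    "Purchase Amount (USD)": [53, 64, 73, 90],
})

def summarize_quality(df: pd.DataFrame) -> dict:
    """Return row/column counts plus totals of missing cells and duplicated rows."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "missing_values": int(df.isnull().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }

quality = summarize_quality(demo)
print(quality)
```

Calling `summarize_quality(shop)` right after loading the CSV should report 3900 rows, 18 columns, and zero missing values if the observation holds.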



## 1 What is the overall distribution of customer ages in the dataset?

In [None]:
shop['Age'].value_counts()

In [None]:
shop['Age'].mean()

In [None]:
shop['Gender'].unique()

In [5]:
# bin ages into labelled groups; pd.cut intervals are right-inclusive, e.g. (15, 18]
shop['Age_category'] = pd.cut(
    shop['Age'],
    bins=[0, 15, 18, 30, 50, 70],
    labels=['child', 'teen', 'Young Adults', 'Middle-Aged Adults', 'old'],
)

In [None]:
# installing nbformat required to run fig.show()
%pip install nbformat

In [4]:
# importing nbformat
import nbformat

In [None]:
# count customers per age group (passing y='Age' would sum the ages instead)
fig = px.histogram(shop, x='Age_category')
fig.show()
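
A numeric summary can complement the histogram. The sketch below uses a small hypothetical `demo` frame in place of `shop` and reuses the same bins and labels as the `Age_category` cell; note that `pd.cut` intervals are right-inclusive, so an age of exactly 18 lands in `'teen'`.

```python
import pandas as pd

# Hypothetical ages standing in for shop['Age'].
demo = pd.DataFrame({"Age": [18, 22, 27, 35, 44, 52, 61, 70]})

bins = [0, 15, 18, 30, 50, 70]
labels = ['child', 'teen', 'Young Adults', 'Middle-Aged Adults', 'old']
demo['Age_category'] = pd.cut(demo['Age'], bins=bins, labels=labels)

# Summary statistics of the raw ages (count, mean, quartiles, min, max).
age_summary = demo['Age'].describe()

# Number of customers per age group, listed in bin order (empty groups show 0).
age_counts = demo['Age_category'].value_counts().sort_index()

print(age_summary)
print(age_counts)
```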

## 2 How does the average purchase amount vary across different product categories?

In [None]:
shop.columns

In [None]:
shop['Category'].unique()

In [None]:
shop['Category']

In [None]:
shop.groupby('Category')['Purchase Amount (USD)'].mean()

## 3 Which gender has the highest number of purchases?

In [None]:
shop.columns

In [None]:
# count the number of purchases (rows) per gender
sns.countplot(data=shop, x='Gender')
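
The number of purchases and the average amount per purchase are different quantities, so it can help to tabulate both side by side. This is a minimal sketch on a hypothetical `demo` frame; each row is assumed to represent one purchase, as in `shop`.

```python
import pandas as pd

# Hypothetical purchases standing in for `shop`.
demo = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Male"],
    "Purchase Amount (USD)": [53, 64, 73, 90],
})

# One row per gender: how many purchases were made and the mean amount spent.
by_gender = demo.groupby('Gender')['Purchase Amount (USD)'].agg(['count', 'mean'])

# Gender with the largest number of purchases.
top_gender = by_gender['count'].idxmax()

print(by_gender)
print(top_gender)
```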

## 4 What are the most commonly purchased items in each category?

In [None]:
shop.columns

In [None]:
shop.groupby('Category')['Item Purchased'].value_counts()

In [None]:
fig = px.histogram(shop , x = 'Item Purchased' , color = 'Category')
fig.show()

## 5 Are there any specific seasons or months where customer spending is significantly higher?

In [None]:
shop['Season'].unique()

In [None]:
shop[shop['Season'] == 'Summer'].value_counts().sum()

In [None]:
shop[shop['Season'] == 'Winter'].value_counts().sum()

In [None]:
shop[shop['Season'] == 'Spring'].value_counts().sum()

In [None]:
shop[shop['Season'] == 'Fall'].value_counts().sum()

In [None]:
fig = px.histogram(shop , x = 'Season' , range_y= [200 , 1500] )

fig.show()
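
The cells above count purchases per season, while the question asks about spending. A possible way to compare both is sketched below on a hypothetical `demo` frame. The dataset described earlier has a `Season` column but no month column, so only the seasonal part of the question can be answered this way.

```python
import pandas as pd

# Hypothetical purchases standing in for `shop`.
demo = pd.DataFrame({
    "Season": ["Winter", "Winter", "Spring", "Summer", "Fall", "Fall"],
    "Purchase Amount (USD)": [53, 64, 73, 90, 20, 30],
})

# Number of purchases, total spend, and mean spend per season, highest total first.
season_spend = (
    demo.groupby('Season')['Purchase Amount (USD)']
    .agg(['count', 'sum', 'mean'])
    .sort_values('sum', ascending=False)
)

print(season_spend)
```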

## 6 What is the average rating given by customers for each product category?

In [34]:
shop_groupby = shop.groupby('Category')['Review Rating'].mean().reset_index()

In [None]:
fig = px.bar(shop_groupby ,x= 'Category' , y = 'Review Rating' )
fig.show()

## 7 Are there any notable differences in purchase behavior between subscribed and non-subscribed customers?

In [None]:
shop.columns

In [None]:
shop['Subscription Status'].unique()

In [None]:
# mean purchase amount per subscription status (data passed by keyword for older seaborn versions)
sns.barplot(data=shop, x='Subscription Status', y='Purchase Amount (USD)')

In [None]:
shop['Purchase Amount (USD)'].sum()

In [None]:
shop.groupby('Subscription Status')['Purchase Amount (USD)'].mean()

## 8 Which payment method is the most popular among customers?

In [None]:
shop.groupby('Payment Method')['Purchase Amount (USD)'].mean().sort_values(ascending= False)

In [42]:
shop_groupby = shop.groupby('Payment Method')['Purchase Amount (USD)'].mean().reset_index()

In [None]:
fig = px.bar(shop_groupby , x = 'Payment Method' , y = 'Purchase Amount (USD)')
fig.show()

In [None]:
# mean purchase amount per payment method (data passed by keyword for older seaborn versions)
sns.barplot(data=shop, x='Payment Method', y='Purchase Amount (USD)')
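
The plots above show the average amount spent per payment method, which is not the same as popularity. Popularity is more directly measured by how often each method is used. A minimal sketch on hypothetical data (the method names in `demo` are illustrative):

```python
import pandas as pd

# Hypothetical payment methods standing in for shop['Payment Method'].
demo = pd.DataFrame({
    "Payment Method": ["PayPal", "Cash", "PayPal", "Credit Card", "PayPal", "Cash"],
})

# How many purchases used each method, most frequent first.
method_counts = demo['Payment Method'].value_counts()

# The same information as a share of all purchases.
method_share = demo['Payment Method'].value_counts(normalize=True)

most_popular = method_counts.idxmax()
print(method_counts)
print(most_popular)
```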

## 9 Do customers who use promo codes tend to spend more than those who don't?

In [45]:
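# compare the mean purchase amount, since totals also depend on how many customers are in each group
shop_groupby = shop.groupby('Promo Code Used')['Purchase Amount (USD)'].mean().reset_index()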

In [None]:
fig = px.sunburst(shop , path=['Gender' , 'Promo Code Used'] , values='Purchase Amount (USD)')
fig.show()

In [None]:
fig  =  px.bar(shop_groupby , x= 'Promo Code Used' , y = 'Purchase Amount (USD)')
fig.show()

## 10 How does the frequency of purchases vary across different age groups?

In [None]:
shop[['Age' , 'Age_category']]

In [None]:
shop['Age_category'].unique()

In [50]:
# number of customers in each age group for every purchase frequency
shop_group = pd.crosstab(shop['Age_category'], shop['Frequency of Purchases'])

In [None]:
px.sunburst(shop , path=['Frequency of Purchases','Age_category'] , values='Age')
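
A cross-tabulation is one way to see how purchase frequency varies by age group without summing ages. The sketch below uses a hypothetical `demo` frame; the frequency labels are illustrative and may not match the values in the real column.

```python
import pandas as pd

# Hypothetical rows standing in for `shop`.
demo = pd.DataFrame({
    "Age_category": ["Young Adults", "Young Adults", "old", "old", "old", "Middle-Aged Adults"],
    "Frequency of Purchases": ["Weekly", "Monthly", "Weekly", "Weekly", "Annually", "Monthly"],
})

# Raw counts: rows are age groups, columns are purchase frequencies.
freq_by_age = pd.crosstab(demo['Age_category'], demo['Frequency of Purchases'])

# Row-normalized shares, so age groups of different sizes can be compared.
freq_share = pd.crosstab(
    demo['Age_category'], demo['Frequency of Purchases'], normalize='index'
)

print(freq_by_age)
print(freq_share)
```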

## 11 Are there any correlations between the size of the product and the purchase amount?

In [None]:
shop.columns

In [53]:
shop_group = shop.groupby('Size')['Purchase Amount (USD)'].sum().reset_index()

In [None]:
fig  = px.bar(shop_group , x = 'Size' , y ='Purchase Amount (USD)'  )
fig.show()

## 12 Which shipping type is preferred by customers for different product categories?

In [None]:
shop.groupby('Category')['Shipping Type'].value_counts().sort_values(ascending= False)

In [56]:
shop['Shipping_Category'] =shop['Shipping Type'].map({'Express': 0, 'Free Shipping': 1, 'Next Day Air': 2,
                                                       'Standard': 3, '2-Day Shipping': 4, 'Store Pickup': 5})

In [None]:
shop['Category'].unique()

In [58]:
shop['Category_num'] =shop['Category'].map({'Clothing':1, 'Footwear':2, 'Outerwear':3, 'Accessories':4})
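
To read off the preferred shipping type for each category directly, the counts can be pivoted into a table and the most frequent column picked per row. A minimal sketch on a hypothetical `demo` frame:

```python
import pandas as pd

# Hypothetical rows standing in for `shop`.
demo = pd.DataFrame({
    "Category": ["Clothing", "Clothing", "Clothing", "Footwear", "Footwear", "Accessories"],
    "Shipping Type": ["Express", "Express", "Standard", "Store Pickup", "Store Pickup", "Free Shipping"],
})

# Rows are categories, columns are shipping types, cells are purchase counts.
shipping_counts = pd.crosstab(demo['Category'], demo['Shipping Type'])

# Most frequent shipping type per category (on ties, idxmax returns the first column).
preferred = shipping_counts.idxmax(axis=1)

print(shipping_counts)
print(preferred)
```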

## 13 How does the presence of a discount affect the purchase decision of customers?

In [None]:
shop.columns

In [60]:
shop_group = shop.groupby('Discount Applied')['Purchase Amount (USD)'].sum().reset_index()

In [None]:
px.histogram(shop_group , x = 'Discount Applied' , y = 'Purchase Amount (USD)')

In [None]:
fig = px.sunburst(shop , path = ['Gender' , 'Discount Applied'], values='Purchase Amount (USD)' , color= 'Gender')

fig.show()

## 14 Are there any specific colors that are more popular among customers?

In [None]:
px.histogram(shop , x = 'Color')

In [None]:
shop['Color'].value_counts().nlargest(5)

## 15 What is the average number of previous purchases made by customers?

In [None]:
shop['Previous Purchases'].mean()

## 16 Are there any noticeable differences in purchase behavior between different locations?

In [None]:
shop.groupby('Location')['Purchase Amount (USD)'].mean().sort_values(ascending = False)

In [67]:
shop_group = shop.groupby('Location')['Purchase Amount (USD)'].mean().reset_index()

In [None]:
fig = px.bar(shop_group, x = 'Location' , y = 'Purchase Amount (USD)')
fig.show()

## 17 Is there a relationship between customer age and the category of products they purchase?

In [69]:
shop_group = shop.groupby('Category')['Age'].mean().reset_index()

In [None]:
fig = px.bar(shop_group ,y = 'Age' , x= 'Category')
fig.show()
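
Mean age alone can hide differences in spread between categories. A slightly fuller summary per category is sketched below on a hypothetical `demo` frame; if the means, medians, and ranges are nearly identical across categories in the real data, that would suggest little relationship between age and category.

```python
import pandas as pd

# Hypothetical rows standing in for `shop`.
demo = pd.DataFrame({
    "Category": ["Clothing", "Clothing", "Clothing", "Footwear", "Footwear"],
    "Age": [20, 30, 40, 50, 60],
})

# Mean, median, minimum, and maximum customer age per product category.
age_by_category = demo.groupby('Category')['Age'].agg(['mean', 'median', 'min', 'max'])

print(age_by_category)
```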

## 18 How does the average purchase amount differ between male and female customers?

In [6]:
# the question asks for the average purchase amount, so use mean() rather than sum()
shop_group = shop.groupby('Gender')['Purchase Amount (USD)'].mean().reset_index()

In [None]:
fig = px.bar(shop_group , x = 'Gender' , y = 'Purchase Amount (USD)')
fig.show()

In [None]:
px.sunburst(data_frame= shop , path = ['Gender' ,'Age'] , values='Purchase Amount (USD)')