## Exploratory Data Analysis (EDA) on Retail Sales Data

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

## Data Loading and Cleaning

In [None]:
df = pd.read_csv("retail_sales_dataset.csv",parse_dates=['Date'])
df

In [None]:
df.info()
df.head()

In [None]:
df = df.rename(columns={'Total Amount': 'TotalAmount'})


In [None]:
# Count missing values per column (sum must be called, not just referenced)
df1 = df.isnull().sum()
df1

In [None]:
pd.isnull(df).sum()  #to check for null values

In [None]:
df.dtypes

In [None]:
df = df.drop_duplicates()

## Descriptive Statistics

In [None]:
mean_value = df['TotalAmount'].mean()
median_value = df['TotalAmount'].median()
# mode() returns a Series because several values can tie for most frequent
mode_value = df['TotalAmount'].mode().tolist()
std_value = df['TotalAmount'].std()
print(f"Mean: {mean_value}, Median: {median_value}, Mode: {mode_value}, Standard deviation: {std_value}")
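
`Series.mode()` returns a Series rather than a scalar, because several values can tie for most frequent. The sketch below is one possible way to collect the same statistics in a single table. The `amounts` Series is a made-up stand-in for `df['TotalAmount']`, and the names `modes` and `summary` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical stand-in for df['TotalAmount']; the values are made up.
amounts = pd.Series([50, 100, 100, 200, 500, 500], name="TotalAmount")

# mode() returns every value tied for most frequent, sorted ascending.
modes = amounts.mode().tolist()

summary = pd.Series({
    "mean": amounts.mean(),
    "median": amounts.median(),
    "std": amounts.std(),  # sample standard deviation (ddof=1)
})
print(summary.round(2))
print("modes:", modes)
```

Keeping the modes as a list makes ties visible instead of silently picking one value.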

## Time Series Analysis

Analysing how the average transaction amount changes by month and by year.

In [None]:
# Sort by 'Date' so the rows are in chronological order before it becomes the index
df.sort_values(by='Date', inplace=True)
print(df)

In [None]:
df.set_index('Date', inplace=True)

In [None]:
# Average transaction amount per month (month-end frequency)
monthly_sale=df['TotalAmount'].resample('ME').mean()
monthly_sale.plot()

In [None]:
# Average transaction amount per year (year-end frequency)
yearly_sale = df['TotalAmount'].resample('YE').mean()
yearly_sale.plot()
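
The cells above plot the mean transaction amount per period, which is not the same as growth. One way to look at growth is to sum sales per calendar month and take the month-over-month percentage change. The sketch below does this on a small made-up `sales` frame; the data and the names `monthly_total` and `monthly_growth` are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

# Hypothetical sample in place of the real dataset; the values are made up.
sales = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-05", "2023-01-20", "2023-02-10", "2023-03-15"]),
    "TotalAmount": [100, 300, 200, 300],
})

# Group by calendar month and sum, giving total sales per month.
monthly_total = sales.groupby(sales["Date"].dt.to_period("M"))["TotalAmount"].sum()

# Month-over-month growth as a fraction of the previous month's total.
monthly_growth = monthly_total.pct_change()

print(pd.DataFrame({"total": monthly_total, "growth": monthly_growth}))
```

The first month has no predecessor, so its growth is `NaN`. The same pattern with `to_period("Y")` would give yearly figures.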

## Customer and Product Analysis
We can analyze the demographics of customers and their purchasing behavior.

In [None]:
df.columns

In [None]:
gender = df.groupby('Gender')['TotalAmount'].mean()
print(gender)

In [None]:
Age= df.groupby('Age')['TotalAmount'].mean().sort_values(ascending=False).head(10)
print(Age)

In [None]:
products_amount = df.groupby('Product Category')['TotalAmount'].sum().sort_values(ascending=False)
print(products_amount)

In [None]:
# gender and product category
gender_product_purchased=pd.crosstab(df['Gender'], df['Product Category'])
print(gender_product_purchased)

## Visualization
You can visualize different insights, such as sales trends, customer demographics, or correlations between variables.

In [None]:
ax=sns.countplot(x= 'Gender', data=df)

for bars in ax.containers:
    ax.bar_label(bars)

In [None]:
# Group by Gender and calculate the total amount spent by each gender
amount_by_gender = df.groupby(['Gender'],as_index=False)['TotalAmount'].sum().sort_values(by='TotalAmount',ascending=False)
sns.barplot(x= 'Gender', y= 'TotalAmount', data= amount_by_gender)

The graph above shows that female customers spent more in total than male customers.

In [None]:
# Total quantity sold per product category
product_demographics= df.groupby(['Product Category'],as_index=False)['Quantity'].sum().sort_values(by='Quantity',ascending=False)
sns.set(rc={'figure.figsize':(15,6)})
sns.barplot(data= product_demographics, x='Product Category',y='Quantity',hue='Product Category', palette='viridis', legend=False)


The plots above show that female customers spend more than male customers and that, by product category, Electronics is purchased the most and Beauty the least.

In [None]:
Age.plot(kind='barh',color='g')
plt.title("Average amount spent by age (top 10)",fontsize=14)
plt.xlabel('TotalAmount')
plt.ylabel('Age')
plt.show()

The bar graph above shows that 37-year-old customers have the highest average spend, followed by 19-year-olds.
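
Averages for single ages can rest on only a few transactions, so grouping ages into bands may give a steadier picture. The sketch below uses `pd.cut` on made-up data. The band edges, the labels, and the names `customers` and `band_mean` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample; the ages and amounts are made up.
customers = pd.DataFrame({
    "Age": [19, 23, 37, 41, 58, 64],
    "TotalAmount": [500, 100, 900, 300, 200, 400],
})

# Assumed band edges; right=False makes each band include its lower edge.
bins = [18, 30, 45, 65]
labels = ["18-29", "30-44", "45-64"]
customers["AgeBand"] = pd.cut(customers["Age"], bins=bins, labels=labels, right=False)

# observed=True keeps only the bands that actually occur in the data.
band_mean = customers.groupby("AgeBand", observed=True)["TotalAmount"].mean()
print(band_mean)
```

On the real data, the edges would need to cover the full age range in `df['Age']`; any age outside the bins becomes `NaN` in `AgeBand`.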

In [None]:
# Compare the number of purchases in each product category by gender with a bar graph
gender_product_purchased.plot(kind='bar')
plt.title("Products purchased by Male and Female",fontsize=14)
plt.xlabel('Gender')
plt.ylabel('Number of purchases')  # bar heights are crosstab counts
plt.show()
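
Raw counts are hard to compare when one gender has more transactions than the other. A possible complement is to convert each gender's row into shares with the `normalize` argument of `pd.crosstab`. The `purchases` frame below is made up, and the name `share` is an illustrative assumption.

```python
import pandas as pd

# Hypothetical sample; the rows are made up.
purchases = pd.DataFrame({
    "Gender": ["Female", "Female", "Female", "Female", "Male", "Male"],
    "Product Category": ["Beauty", "Clothing", "Clothing", "Electronics",
                         "Clothing", "Electronics"],
})

# normalize="index" divides each row by its total, so each gender's row sums to 1.
share = pd.crosstab(purchases["Gender"], purchases["Product Category"], normalize="index")
print(share.round(2))
```

Each cell then reads as the fraction of that gender's purchases falling in a category, which is independent of how many customers of each gender are in the data.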


## Recommendation


##### Analyze Monthly Fluctuations:
Since sales fluctuated from January 2023 to January 2024, it is important to work out what caused these changes. To understand the drivers of the spikes, particularly in February 2023 and January 2024, take a close look at outside factors such as seasonality, special sales, or economic events.

##### Identify Outliers and Events:
The significant spikes in February 2023 and January 2024 suggest potential outliers or events that may have influenced spending patterns. Check whether these spikes are linked to marketing campaigns, sales, holidays, or changes in the economy. Understanding these events can improve predictions and planning for future sales.
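
Before looking for causes, it can help to check whether a month really stands out from the rest. The sketch below applies Tukey's 1.5 × IQR rule to monthly totals. This rule is a technique swapped in for illustration and is not used in the notebook; the figures and the names `monthly_total` and `is_outlier` are made up.

```python
import pandas as pd

# Hypothetical monthly totals; the values are made up.
monthly_total = pd.Series(
    [100, 110, 95, 105, 400, 98],
    index=pd.period_range("2023-01", periods=6, freq="M"),
)

# Tukey's rule: flag values more than 1.5 * IQR outside the quartiles.
q1, q3 = monthly_total.quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (monthly_total < q1 - 1.5 * iqr) | (monthly_total > q3 + 1.5 * iqr)

print(monthly_total[is_outlier])
```

With only about a year of monthly points the quartiles are rough, so a flagged month is a prompt to investigate rather than proof of an anomaly.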

#### Targeted Product Promotions by Gender:

##### Clothing as a Core Focus for Both Genders: 
Since both females (174 purchases) and males (177 purchases) show strong interest in Clothing, consider launching targeted, gender-specific Clothing promotions. Create personalized offers that appeal to the distinct preferences of each gender, such as exclusive designs, seasonal collections, or discounts tailored to these customer groups.

##### Expand Beauty Offerings for Females: 
The slightly higher interest in Beauty products among females (166 purchases) presents an opportunity to introduce new beauty product lines, bundled deals, or exclusive beauty promotions aimed at female shoppers.

##### Targeted Promotions for Males in Electronics:
Males showed significant interest in Electronics (172 purchases), which makes it a key area for growth. Consider special offers such as bundle deals, rewards, or flash sales in the Electronics category. Additionally, targeted marketing campaigns can be designed to highlight the latest tech products, gadgets, or seasonal promotions.

Overall, cross-category promotions can encourage customers to explore products across Clothing, Beauty, and Electronics. For instance, offer a discount on Beauty products with a Clothing purchase, or vice versa. Similarly, for males who purchase Electronics, consider recommending complementary items from the Clothing category, promoting a wider range of products across gender-based preferences.

##### Continuous Monitoring and Trend Adjustments:
The data shows that trends fluctuate over time, so purchasing behaviour needs continuous monitoring. Regularly review sales data and adjust product offerings, promotions, and marketing strategies in response to emerging trends and customer preferences.

##### Adapt to Changing Purchasing Patterns:
Use these insights to refine product assortments, inventory management, and marketing efforts for optimal engagement.

By applying these insights, businesses can better adjust their strategies to match changing customer preferences and market trends, leading to a more focused approach that boosts growth and enhances customer satisfaction.



