<a href="https://colab.research.google.com/github/akshat5002/business-reports-analysis-/blob/main/business_challenges.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Business Challenges: Identifying High vs. Low-Performing Stores

## Factors Differentiating Store Performance

- Location: Foot traffic, demographics, and competition.
- Store Layout: Effective use of space and product placement.
- Customer Service: Quality of staff interaction and service.
- Product Range: Variety and relevance of products offered.
- Marketing Efforts: Local promotions and advertising effectiveness.



## Impact of External Factors

- Markdown Strategies: Aggressive discounts may boost sales but reduce margins.
- Economic Conditions: A recession can reduce consumer spending; growth can increase it.
- Regional Variations: Local preferences and economic health can influence performance.
# Optimizing Store Strategies Based on Clustering

## Tailored Strategies for Clusters

1. Pricing Strategies: Adjust prices based on cluster performance and customer demographics.
2. Inventory Management: Stock different products based on local demand and sales trends.
3. Optimizing Markdowns:
   - Data Analysis: Use sales data to determine the best timing and depth of markdowns.
   - Profitability Focus: Balance between clearing inventory and maintaining profit margins.
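
The trade-off between markdown depth and margin can be made concrete with simple arithmetic. The sketch below is illustrative only: the function name `required_volume_uplift` and the price, cost, and markdown figures are hypothetical and do not come from the dataset. It estimates how many extra units must sell at the marked-down price to earn the same gross profit as at full price.

```python
def required_volume_uplift(price, unit_cost, markdown_pct):
    """Fractional increase in units sold needed to keep gross profit unchanged.

    Hypothetical helper: assumes a constant unit cost and a single markdown.
    """
    full_margin = price - unit_cost
    marked_margin = price * (1 - markdown_pct) - unit_cost
    if marked_margin <= 0:
        # Selling at or below cost can never recover the original profit
        raise ValueError("Markdown removes the entire unit margin")
    return full_margin / marked_margin - 1


# Illustrative numbers: 100 price, 60 cost, 20% markdown
uplift = required_volume_uplift(price=100.0, unit_cost=60.0, markdown_pct=0.20)
print(f"Units must rise by about {uplift:.0%} to keep gross profit flat")
```

With these made-up figures the unit margin halves, so volume has to roughly double before the markdown pays for itself.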
# Data-Driven Decision Making for Growth

## Grouping Stores for Targeted Strategies

- Performance Clusters: Identify high, medium, and low-performing stores for tailored strategies.
- Targeted Marketing: Create specific promotions based on cluster characteristics.

## Influence of External Factors on Clusters

- CPI (Consumer Price Index): Rising prices can affect consumer spending habits.
- Fuel Prices: Higher fuel costs may reduce disposable income for shopping.
- Unemployment Rates: Higher unemployment can lead to decreased sales in certain areas.


In [None]:
import pandas as pd

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns


In [None]:
data=pd.read_csv('/content/features.csv')

In [None]:
data.columns

In [None]:
# Load Data
data = pd.read_csv("/content/features.csv", parse_dates=["Date"])

# Display basic info
print(data.info())

# Show first few rows
print(data.head())

In [None]:
print(data.isnull().sum())  # Count missing values in each column

In [None]:
# Check missing values
print(data.isnull().sum())

# Fill missing values in numerical columns with median
num_cols = ['Temperature', 'Fuel_Price', 'MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5', 'CPI', 'Unemployment']
data[num_cols] = data[num_cols].fillna(data[num_cols].median())

# Fill categorical column
data['IsHoliday'] = data['IsHoliday'].fillna(data['IsHoliday'].mode()[0])
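
Median imputation hides which values were originally missing, which can matter for the markdown columns. A minimal sketch, on a hypothetical toy frame standing in for `features.csv`, that records a flag before filling; the `_missing` suffix and the toy values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for features.csv
toy = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2],
    "MarkDown1": [np.nan, 100.0, 300.0, np.nan, np.nan, 50.0],
    "CPI": [211.0, 211.5, np.nan, 130.0, 130.2, 130.4],
})

num_cols = ["MarkDown1", "CPI"]

# Record which values were missing before they are overwritten
for col in num_cols:
    toy[col + "_missing"] = toy[col].isna()

# Fill with the column median, as in the cell above
toy[num_cols] = toy[num_cols].fillna(toy[num_cols].median())

print(toy)
```

The flag columns let later steps (for example clustering) distinguish a real markdown from an imputed one.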


In [None]:
# Extract date features
data['Year'] = data['Date'].dt.year
data['Month'] = data['Date'].dt.month
data['Week'] = data['Date'].dt.isocalendar().week
data['DayOfWeek'] = data['Date'].dt.dayofweek  # Monday=0, Sunday=6

In [None]:
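# Lag features for temperature and fuel price. shift(n) moves values by n rows per store,
# so sort by date first. Assumes one row per store per week (shift(1) = previous week);
# use shift(7) only if the data has one row per store per day.
data = data.sort_values(['Store', 'Date'])
data['Prev_Week_Temp'] = data.groupby('Store')['Temperature'].shift(1)
data['Prev_Week_Fuel_Price'] = data.groupby('Store')['Fuel_Price'].shift(1)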
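
`groupby(...).shift(n)` moves values by `n` rows within each store, not by `n` days, so the right `n` depends on the row spacing. A small sketch on hypothetical toy data (the values and the `Prev_Row_Temp` and `Days_Since_Prev_Row` column names are made up) showing how to check the spacing alongside the lag:

```python
import pandas as pd

# Hypothetical toy data: two stores, one row per week
toy = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12", "2010-02-19"] * 2),
    "Temperature": [40.0, 42.0, 45.0, 60.0, 61.0, 63.0],
})

# Sort so that "previous row" means "previous date" within each store
toy = toy.sort_values(["Store", "Date"])

# Value from the previous row of the same store (NaN for each store's first row)
toy["Prev_Row_Temp"] = toy.groupby("Store")["Temperature"].shift(1)

# Gap in days between consecutive rows; 7 everywhere means weekly data
toy["Days_Since_Prev_Row"] = toy.groupby("Store")["Date"].diff().dt.days

print(toy)
```

If `Days_Since_Prev_Row` is 7 throughout, a one-row shift is a one-week lag.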

In [None]:
# Correlation heatmap (numeric_only=True skips non-numeric columns such as Date)
plt.figure(figsize=(12,6))
sns.heatmap(data.corr(numeric_only=True), annot=True, cmap='coolwarm', fmt='.2f')
plt.title("Feature Correlation")
plt.show()
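
A heatmap with many features can be hard to read, so it may help to list the strongest pairs as a table. The sketch below uses synthetic data (the relationship between `Fuel_Price` and `CPI` is invented purely to give the example a strong pair) and ranks the feature pairs by absolute correlation.

```python
import numpy as np
import pandas as pd

# Synthetic data: CPI is constructed to track Fuel_Price (illustration only)
rng = np.random.default_rng(0)
n = 200
fuel = rng.normal(3.0, 0.3, n)
toy = pd.DataFrame({
    "Fuel_Price": fuel,
    "CPI": 170 + 10 * fuel + rng.normal(0, 1.0, n),
    "Unemployment": rng.normal(8.0, 1.0, n),
    "Date": pd.date_range("2010-02-05", periods=n, freq="7D"),
})

# Correlate numeric columns only; Date is dropped by select_dtypes
corr = toy.select_dtypes(include="number").corr()

# Keep each pair once (upper triangle, excluding the diagonal)
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

# Rank pairs by the strength of the correlation, ignoring sign
pairs = pairs.sort_values(key=np.abs, ascending=False)
print(pairs)
```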


In [None]:
# Visualize markdown trends over time
plt.figure(figsize=(10,5))
sns.lineplot(x='Date', y='MarkDown1', data=data)
plt.title("Markdown 1 Trend Over Time")
plt.show()

# Advanced Store-Level Analysis

- Store Opening Date
- Closed Store Identification
- Sales Growth/Decline Analysis

In [None]:
# Compare markdowns on holidays vs non-holidays (one box of MarkDown1 values per IsHoliday group)
sns.boxplot(x='IsHoliday', y='MarkDown1', data=data)
plt.title("Markdown 1 on Holidays vs Non-Holidays")
plt.show()


**Identify Store Opening Dates**


In [None]:
# Get the first recorded date for each store
store_opening_dates = data.groupby("Store")["Date"].min().reset_index()
store_opening_dates.columns = ["Store", "Opening_Date"]

print(store_opening_dates.head())

**Identify Closed Stores**


In [None]:
# Get the last recorded date for each store
store_closing_dates = data.groupby("Store")["Date"].max().reset_index()
store_closing_dates.columns = ["Store", "Last_Active_Date"]

# Define threshold for closing (e.g., stores inactive for 1 year)
latest_date = data["Date"].max()
store_closing_dates["Days_Inactive"] = (latest_date - store_closing_dates["Last_Active_Date"]).dt.days

# Identify stores inactive for more than a year
closed_stores = store_closing_dates[store_closing_dates["Days_Inactive"] > 365]
if closed_stores.empty:
    print("No closed stores found")
else:
    print(closed_stores)
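
The inactivity rule only flags a store when its last record is well before the latest date in the whole dataset. A toy illustration with invented stores and dates, in which store 2 stops reporting more than a year before the others:

```python
import pandas as pd

# Hypothetical records: store 2 has no rows after mid-2011
toy = pd.DataFrame({
    "Store": [1, 1, 2, 2, 3],
    "Date": pd.to_datetime([
        "2010-02-05", "2012-10-26",
        "2010-02-05", "2011-06-03",
        "2012-10-26",
    ]),
})

# Last date on which each store appears
last_seen = toy.groupby("Store")["Date"].max()

# Days between each store's last record and the latest date overall
days_inactive = (toy["Date"].max() - last_seen).dt.days

# Same 365-day threshold as in the cell above
closed = days_inactive[days_inactive > 365]
print(closed)
```

If every store's last record falls on the same date, `days_inactive` is zero for all of them and the rule flags nothing.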

**Store Clustering: Grouping Stores into 2 Clusters**

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
# Selecting relevant store-level features for clustering
features = ['Temperature', 'Fuel_Price', 'MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5', 'CPI', 'Unemployment']

# Aggregate data at the store level (mean values)
store_data = data.groupby("Store")[features].mean().reset_index()

# Standardize the data
scaler = StandardScaler()
store_data_scaled = scaler.fit_transform(store_data[features])


In [None]:
# Apply K-Means clustering with 2 clusters (n_init set explicitly so results do not depend on the scikit-learn version's default)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
store_data["Cluster"] = kmeans.fit_predict(store_data_scaled)

# View cluster assignments
print(store_data[["Store", "Cluster"]].head())
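
The choice of two clusters can be checked against alternatives. One common option, not used in this notebook, is the silhouette score; the sketch below applies it to synthetic store-level data with three deliberately separated groups, so the feature values and group centres are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for store-level features: 45 stores in 3 separated groups
rng = np.random.default_rng(42)
centers = np.array([[130.0, 6.0], [170.0, 8.0], [215.0, 10.0]])  # (CPI, Unemployment)
X = np.vstack([c + rng.normal(0, [2.0, 0.3], size=(15, 2)) for c in centers])
X_scaled = StandardScaler().fit_transform(X)

# Fit K-Means for several values of k and score each clustering
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

# Higher silhouette means tighter, better separated clusters
best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("Best k by silhouette:", best_k)
```

On the real data, the same loop can run on `store_data_scaled` to see whether two clusters is a reasonable choice.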


In [None]:
plt.figure(figsize=(10,6))
sns.scatterplot(x=store_data["CPI"], y=store_data["Unemployment"], hue=store_data["Cluster"], palette="viridis")
plt.xlabel("CPI (Consumer Price Index)")
plt.ylabel("Unemployment Rate")
plt.title("Store Clustering Based on Economic Indicators")
plt.legend(title="Cluster")
plt.show()

# Business Interpretation of Clusters

K-Means label numbers are arbitrary and the clustering inputs contain no sales figures, so the assignments below should be confirmed against the per-cluster feature means.

🔹 Cluster 0: High-Performance Stores

Characteristics:
- Lower unemployment rates.
- Stable fuel prices.
- Fewer markdowns needed.

Implication: These stores are likely located in economically strong regions, which suggests a healthy customer base and effective sales strategies.

🔹 Cluster 1: Low-Performance Stores

Characteristics:
- Higher markdowns required.
- Often found in areas facing economic challenges.

Implication: These stores may need targeted promotions or changes in business strategy to improve performance and attract more customers.
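
Which cluster number corresponds to which description can be checked by profiling the clusters. A minimal sketch on a hypothetical store-level table (the values and the label assignment are invented) that compares feature means per cluster:

```python
import pandas as pd

# Hypothetical store-level table with cluster labels already assigned
store_toy = pd.DataFrame({
    "Store": [1, 2, 3, 4, 5, 6],
    "Unemployment": [6.1, 6.4, 5.9, 9.8, 10.2, 9.5],
    "MarkDown1": [2000.0, 2500.0, 1800.0, 7000.0, 6500.0, 7200.0],
    "Cluster": [1, 1, 1, 0, 0, 0],
})

# Mean of each feature within each cluster
profile = store_toy.groupby("Cluster")[["Unemployment", "MarkDown1"]].mean()
print(profile)

# Label of the cluster with the lowest average unemployment
low_unemployment_cluster = profile["Unemployment"].idxmin()
print("Cluster with lowest unemployment:", low_unemployment_cluster)
```

In this toy table the low-unemployment, low-markdown group happens to carry label 1, which shows why the label number alone should not be read as a performance ranking.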

# Business Goals & Outcomes

- Segment Stores: Divide stores into two distinct clusters based on economic and operational factors.
- Identify High-Performing Characteristics: Analyze and recognize traits of high-performing stores to replicate their successful strategies.
- Enhance Markdown Efficiency: Tailor promotions to specific store clusters to improve markdown effectiveness.
- Leverage Data-Driven Insights: Use data insights to make regional adjustments in pricing and inventory management.


