# **Project Description: Diagnostic Analysis of Sales Decline at XYZ Retail**
Project Title: Investigating the Causes of Sales Decline at XYZ Retail

Objective: This project conducts a diagnostic analysis to identify the underlying causes of a recent decline in sales at XYZ Retail. By analyzing several data sources, we aim to uncover the factors contributing to the decline and give management actionable insights for improvement strategies.

Background: Over the past six months, XYZ Retail has experienced a noticeable decline in sales, prompting management to seek a thorough understanding of the reasons behind the trend. The analysis identifies contributing factors and guides remedial actions to reverse the decline.


In [7]:
!pip install dash statsmodels

import pandas as pd
import numpy as np
import dash
from dash import dcc, html
import plotly.express as px
import statsmodels.api as sm

# Generate Synthetic Sales Data
np.random.seed(42)
dates = pd.date_range(start="2023-01-01", periods=365, freq='D')
categories = ["Electronics", "Clothing", "Groceries", "Furniture", "Toys"]
payment_methods = ["Credit Card", "Cash", "Digital Payment"]
locations = ["New York", "Los Angeles", "Chicago", "Houston", "Miami"]

sales_data = pd.DataFrame({
    "Date": np.random.choice(dates, 1500),
    "Product Category": np.random.choice(categories, 1500),
    "Transaction Amount": np.random.randint(10, 500, 1500),
    "Payment Method": np.random.choice(payment_methods, 1500),
    "Location": np.random.choice(locations, 1500),
})

# Convert Date to Month for Analysis
sales_data['Month'] = sales_data['Date'].dt.to_period('M').astype(str)

# 1️⃣ **Sales Trend Over Time (Decline)**
monthly_sales = sales_data.groupby("Month", as_index=False)["Transaction Amount"].sum()
monthly_sales["Transaction Amount"] *= np.linspace(1.2, 0.8, len(monthly_sales))  # Simulating a decline

# 2️⃣ **Correlation Analysis (Regression Model: Sales vs. Time)**
monthly_sales["Month_Num"] = range(1, len(monthly_sales) + 1)  # Convert months to numerical values
X = sm.add_constant(monthly_sales["Month_Num"])
y = monthly_sales["Transaction Amount"]
model = sm.OLS(y, X).fit()
correlation_results = model.summary()

# 3️⃣ **Best-Selling Categories**
category_sales = sales_data.groupby("Product Category", as_index=False)["Transaction Amount"].sum()
category_sales = category_sales.sort_values("Transaction Amount", ascending=False)

# 4️⃣ **Payment Method Preferences**
payment_distribution = sales_data.groupby("Payment Method", as_index=False)["Transaction Amount"].sum()
payment_distribution = payment_distribution.sort_values("Transaction Amount", ascending=False)

# 5️⃣ **Customer Segmentation: High vs. Low Spending Locations**
# The data has no customer ID, so average transaction amount per location is used as a proxy
customer_spending = sales_data.groupby("Location", as_index=False)["Transaction Amount"].mean()
customer_spending = customer_spending.sort_values("Transaction Amount", ascending=False)

# Initialize Dash App
app = dash.Dash(__name__)

app.layout = html.Div(children=[
    html.H1("📊 Diagnostic Analysis of Sales Decline at XYZ Retail", style={"text-align": "center"}),

    # 🔴 Sales Trend Over Time (Decline)
    html.H3("📉 Sales Trend Over Time"),
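    # Months stay in chronological order on the x-axis so the chart reads as a trend
    # (sorting bars by total, as in the other charts, would scramble the timeline)
    dcc.Graph(figure=px.bar(monthly_sales, x="Month", y="Transaction Amount",
                             title="📉 Declining Sales Trend Over Time", color_discrete_sequence=["red"])
              .update_xaxes(type="category")),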

    # 🔵 Best-Selling Product Categories
    html.H3("🏆 Best-Selling Product Categories"),
    dcc.Graph(figure=px.bar(category_sales, x="Transaction Amount", y="Product Category", orientation="h",
                             title="🔹 Top-Selling Product Categories", color_discrete_sequence=["blue"])
              .update_layout(yaxis={'categoryorder': 'total ascending'})),

    # 🟢 Payment Method Distribution
    html.H3("💳 Payment Method Preferences"),
    dcc.Graph(figure=px.bar(payment_distribution, x="Transaction Amount", y="Payment Method", orientation="h",
                             title="💰 Payment Method Preferences", color_discrete_sequence=["green"])
              .update_layout(yaxis={'categoryorder': 'total ascending'})),

    # 🟣 Customer Segmentation (Spending Behavior)
    html.H3("📍 Customer Segmentation by Location"),
    dcc.Graph(figure=px.bar(customer_spending, x="Transaction Amount", y="Location", orientation="h",
                             title="📌 High vs. Low Spending Customer Locations", color_discrete_sequence=["purple"])
              .update_layout(yaxis={'categoryorder': 'total ascending'})),
])

# Run the Dash App
if __name__ == '__main__':
    app.run(debug=True)





`kurtosistest` p-value may be inaccurate with fewer than 20 observations; only n=12 observations were given.




In [8]:
# 1️⃣ Print Sales Trend Over Time (Decline)
print("📉 Sales Decline Over Time (Monthly Sales Summary):")
print(monthly_sales)

# 2️⃣ Print Correlation Analysis (Regression Summary)
print("\n🔍 Correlation Analysis (Regression Model - Sales vs. Time):")
print(correlation_results)

# 3️⃣ Print Best-Selling Product Categories
print("\n🏆 Best-Selling Product Categories:")
print(category_sales)

# 4️⃣ Print Payment Method Preferences
print("\n💳 Payment Method Distribution:")
print(payment_distribution)

# 5️⃣ Print Customer Segmentation (High vs. Low Spending Customers)
print("\n📍 Customer Segmentation (Average Spending Per Location):")
print(customer_spending)


📉 Sales Decline Over Time (Monthly Sales Summary):
      Month  Transaction Amount  Month_Num
0   2023-01        38329.200000          1
1   2023-02        38985.309091          2
2   2023-03        27677.927273          3
3   2023-04        39440.727273          4
4   2023-05        38829.418182          5
5   2023-06        36767.563636          6
6   2023-07        23562.654545          7
7   2023-08        31593.309091          8
8   2023-09        31026.363636          9
9   2023-10        28628.945455         10
10  2023-11        23335.381818         11
11  2023-12        23271.200000         12

🔍 Correlation Analysis (Regression Model - Sales vs. Time):
                            OLS Regression Results                            
Dep. Variable:     Transaction Amount   R-squared:                       0.532
Model:                            OLS   Adj. R-squared:                  0.485
Method:                 Least Squares   F-statistic:                     11.36
Date:        

# **Synthesis of Findings**
The monthly sales figures show a clear overall decline across 2023, with total monthly sales falling from $38,329 in January to $23,271 in December. The decline is not uniform: March ($27,678) and July ($23,563) dip well below their neighboring months, and November ($23,335) and December close the year at its lowest levels, which may reflect seasonal influences or external market conditions. The regression of monthly sales on time supports this reading. The fitted slope is about -$1,317 per month (p = 0.007), and R-squared is 0.532, so a linear time trend accounts for roughly 53.2% of the variance in monthly sales. With only 12 monthly observations, these estimates are imprecise. The model shows that sales fell, not why: plausible drivers such as changing consumer preferences, competitive pressures, or ineffective marketing strategies would need additional data to test.
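The slope and the share of variance explained can be reproduced directly from the monthly totals printed in the output. The sketch below is a minimal cross-check that uses plain NumPy least squares rather than the notebook's `statsmodels` model; the array `monthly_totals` is copied by hand from the printed table, and the variable names are illustrative.

```python
import numpy as np

# Monthly totals copied from the printed monthly_sales table (Jan..Dec 2023).
monthly_totals = np.array([
    38329.200000, 38985.309091, 27677.927273, 39440.727273,
    38829.418182, 36767.563636, 23562.654545, 31593.309091,
    31026.363636, 28628.945455, 23335.381818, 23271.200000,
])
month_num = np.arange(1, len(monthly_totals) + 1)

# Fit sales = intercept + slope * month_num by ordinary least squares.
# np.polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(month_num, monthly_totals, deg=1)

# R-squared = 1 - (residual sum of squares / total sum of squares).
fitted = intercept + slope * month_num
ss_res = np.sum((monthly_totals - fitted) ** 2)
ss_tot = np.sum((monthly_totals - monthly_totals.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"slope per month: {slope:.1f}")
print(f"R-squared: {r_squared:.3f}")
```

The p-value is not computed here; in the notebook it is available as `model.pvalues` on the fitted `statsmodels` result.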

By product category, Furniture ($83,637) and Electronics ($77,682) are the best-selling categories, followed closely by Clothing ($76,740). Because furniture and electronics are typically high-ticket purchases, their sales may be more vulnerable to economic downturns or shifts in discretionary spending.

The payment method distribution shows Digital Payment with the largest share of sales ($134,767), followed by Cash ($125,362) and Credit Card ($120,292). These are totals for the full year, so they describe current preference rather than a change over time. The lead for digital payments suggests that enhancing digital payment options and offering targeted promotions for digital transactions could help boost sales.
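To put the three totals on a common scale, they can be converted to shares of overall sales value. This is a small sketch: the totals are hand-copied from the paragraph above, and the names `payment_totals`, `payment_share`, and `spread_pct_points` are illustrative rather than taken from the notebook.

```python
import pandas as pd

# Yearly totals per payment method, copied from the figures quoted above.
payment_totals = pd.Series(
    {"Digital Payment": 134767, "Cash": 125362, "Credit Card": 120292},
    name="Transaction Amount",
)

# Share of total sales value per method, as a fraction of the whole.
payment_share = payment_totals / payment_totals.sum()

# Gap between the largest and smallest share, in percentage points.
spread_pct_points = (payment_share.max() - payment_share.min()) * 100

print((payment_share * 100).round(1))
print(f"spread: {spread_pct_points:.1f} percentage points")
```

The three methods sit within about four percentage points of each other, so the lead for digital payments is modest.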

Segmenting by location shows that Chicago ($267.92) and Los Angeles ($261.03) have the highest average transaction amounts, while New York ($243.01) has the lowest. The gap could indicate stronger competition in New York, differences in customer purchasing power, or a need for localized marketing in that region. Note that this segmentation is by location rather than by individual customer, since the dataset has no customer identifier. Additionally, the dips in specific months hint at seasonality in consumer behavior, although a single year of data cannot confirm it. If the pattern holds, seasonal promotions, inventory adjustments, and demand forecasting would help maintain revenue.
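Before acting on the gap between locations, it helps to see how much uncertainty surrounds each average. The sketch below is one way to do that, assuming synthetic transactions generated in the same spirit as the notebook's first cell. It uses a different random generator and seed, so its means will not match the figures above, and the column names `std_err`, `ci_low`, and `ci_high` are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic transactions in the same spirit as the notebook's first cell.
# Different generator and seed, so the means differ from those in the text.
rng = np.random.default_rng(0)
locations = ["New York", "Los Angeles", "Chicago", "Houston", "Miami"]
transactions = pd.DataFrame({
    "Location": rng.choice(locations, 1500),
    "Transaction Amount": rng.integers(10, 500, 1500),
})

# Mean, standard deviation, and count of transaction amounts per location.
summary = (
    transactions.groupby("Location")["Transaction Amount"]
    .agg(["mean", "std", "count"])
)

# Standard error of each mean: std / sqrt(n).
summary["std_err"] = summary["std"] / np.sqrt(summary["count"])

# Rough 95% interval for each mean (normal approximation).
summary["ci_low"] = summary["mean"] - 1.96 * summary["std_err"]
summary["ci_high"] = summary["mean"] + 1.96 * summary["std_err"]

print(summary.sort_values("mean", ascending=False).round(2))
```

If the intervals for two locations overlap heavily, the difference in their averages may be noise rather than a real difference in spending.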

# **📊 Integrated Findings and Recommendations**
The findings suggest that the sales decline reflects a combination of structural, competitive, and seasonal factors. To address it, XYZ Retail should implement targeted pricing and inventory strategies that align with demand fluctuations. Strengthening digital marketing efforts and loyalty programs for high-spending customers in the top-performing locations (Chicago and Los Angeles) could help sustain revenue. Additionally, qualitative research through customer surveys or staff feedback may uncover further insights into changing consumer behaviors. Finally, a data-driven approach to promotional strategies, including A/B testing for discounts and seasonal sales events, could help reverse the declining trend and drive long-term growth. 🚀