# Walmart Sales Data Analysis
---



This notebook performs an exploratory data analysis on Walmart weekly sales data, integrating information from multiple datasets to understand factors influencing sales.

## 1. Data Loading and Consolidation

This section loads the three source datasets (`train.csv`, `features.csv`, and `stores.csv`), aggregates the department-level rows of `train.csv` to the store level, and merges the result first with `features.csv` and then with `stores.csv`.

In [19]:
import pandas as pd
import os

# ====================================================================================
# Step 1: Initial Data Consolidation
# ====================================================================================

'''

Upload the three csv files (train.csv, stores.csv, features.csv)
to the colab session before you run this cell.

This cell will merge the csv files and drop the unnecessary columns.

'''

print("Step 1: Initial Data Consolidation")
print("-" * 30)

# Read the uploaded CSV file into a pandas DataFrame.
# Check if the file exists in the current directory
if os.path.exists('train.csv'):
    train_df = pd.read_csv('train.csv')
    print("train.csv loaded successfully.")
else:
    print("Error: train.csv not found in the session files. Please ensure it is uploaded.")
    # Handle the error appropriately, e.g., exit or raise an exception
    raise FileNotFoundError("train.csv not found")


# Group by 'IsHoliday', 'Date', and 'Store' and sum 'Weekly_Sales'.
# This effectively consolidates department-level sales into store-level sales.
consolidated_sales_df = train_df.groupby(['IsHoliday', 'Date', 'Store'])['Weekly_Sales'].sum().reset_index()

# Print the head of the consolidated DataFrame to confirm the operation.
print("\nConsolidated Sales DataFrame Head:")
display(consolidated_sales_df.head())

# Save the consolidated data to a new CSV file.
consolidated_sales_filename = 'consolidated_sales.csv'
consolidated_sales_df.to_csv(consolidated_sales_filename, index=False)
print(f"\nConsolidated sales data saved to '{consolidated_sales_filename}'.")


print("\n" + "=" * 60 + "\n")

# ====================================================================================
# Step 2: Feature Data Integration
# ====================================================================================

print("Step 2: Feature Data Integration")
print("-" * 30)

# Read the uploaded features CSV file.
# Check if the file exists in the current directory
if os.path.exists('features.csv'):
    features_df = pd.read_csv('features.csv')
    print("features.csv loaded successfully.")
else:
    print("Error: features.csv not found in the session files. Please ensure it is uploaded.")
    # Handle the error appropriately
    raise FileNotFoundError("features.csv not found")


# Convert 'Date' columns to datetime objects for accurate merging.
# This ensures that the merge operation handles dates correctly, regardless of their original format.
consolidated_sales_df['Date'] = pd.to_datetime(consolidated_sales_df['Date'])
features_df['Date'] = pd.to_datetime(features_df['Date'])
print("\n'Date' columns converted to datetime format.")

# Drop the MarkDown1 to MarkDown5 columns, which are not used in this analysis.
markdown_cols = ['MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5']
features_df = features_df.drop(columns=markdown_cols)
print(f"Dropped {markdown_cols} from the features DataFrame.")

# Perform a left merge of consolidated sales with the features data.
# This merge brings in key information like Temperature, Fuel_Price, and CPI.
merged_df = pd.merge(consolidated_sales_df, features_df, on=['IsHoliday', 'Date', 'Store'], how='left')

# Print the head of the merged DataFrame to confirm the successful merge.
print("\nMerged DataFrame (with features) Head:")
display(merged_df.head())

# Save the resulting DataFrame to a new CSV file.
cleaned_filename = 'wallmart_sales_cleaned.csv'
merged_df.to_csv(cleaned_filename, index=False)
print(f"\nCleaned sales data saved to '{cleaned_filename}'.")


print("\n" + "=" * 60 + "\n")

# ====================================================================================
# Step 3: Final Data Merging
# ====================================================================================

print("Step 3: Final Data Merging")
print("-" * 30)

# Read the uploaded stores CSV file.
# Check if the file exists in the current directory
if os.path.exists('stores.csv'):
    stores_df = pd.read_csv('stores.csv')
    print("stores.csv loaded successfully.")
else:
    print("Error: stores.csv not found in the session files. Please ensure it is uploaded.")
    # Handle the error appropriately
    raise FileNotFoundError("stores.csv not found")

# Perform a left merge with the stores data to add 'Type' and 'Size' information.
final_df = pd.merge(merged_df, stores_df, on='Store', how='left')

# Print the head of the final DataFrame to confirm the successful merge.
print("\nFinal DataFrame (with stores data) Head:")
display(final_df.head())

# Save the final complete dataset to a new CSV file.
final_filename = 'wallmart_final_dataset.csv'
final_df.to_csv(final_filename, index=False)
print(f"\nFinal dataset saved to '{final_filename}'.")

print("\n🎉 Data processing complete!")

Step 1: Initial Data Consolidation
------------------------------
train.csv loaded successfully.

Consolidated Sales DataFrame Head:


| | IsHoliday | Date | Store | Weekly_Sales |
|---|---|---|---|---|
| 0 | False | 2010-02-05 | 1 | 1643690.9 |
| 1 | False | 2010-02-05 | 2 | 2136989.46 |
| 2 | False | 2010-02-05 | 3 | 461622.22 |
| 3 | False | 2010-02-05 | 4 | 2135143.87 |
| 4 | False | 2010-02-05 | 5 | 317173.1 |



Consolidated sales data saved to 'consolidated_sales.csv'.


Step 2: Feature Data Integration
------------------------------
features.csv loaded successfully.

'Date' columns converted to datetime format.
Dropped ['MarkDown1', 'MarkDown2', 'MarkDown3', 'MarkDown4', 'MarkDown5'] from the features DataFrame.

Merged DataFrame (with features) Head:


| | IsHoliday | Date | Store | Weekly_Sales | Temperature | Fuel_Price | CPI | Unemployment |
|---|---|---|---|---|---|---|---|---|
| 0 | False | 2010-02-05 | 1 | 1643690.9 | 42.31 | 2.572 | 211.096358 | 8.106 |
| 1 | False | 2010-02-05 | 2 | 2136989.46 | 40.19 | 2.572 | 210.752605 | 8.324 |
| 2 | False | 2010-02-05 | 3 | 461622.22 | 45.71 | 2.572 | 214.424881 | 7.368 |
| 3 | False | 2010-02-05 | 4 | 2135143.87 | 43.76 | 2.598 | 126.442065 | 8.623 |
| 4 | False | 2010-02-05 | 5 | 317173.1 | 39.7 | 2.572 | 211.653972 | 6.566 |



Cleaned sales data saved to 'wallmart_sales_cleaned.csv'.


Step 3: Final Data Merging
------------------------------
stores.csv loaded successfully.

Final DataFrame (with stores data) Head:


| | IsHoliday | Date | Store | Weekly_Sales | Temperature | Fuel_Price | CPI | Unemployment | Type | Size |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | 2010-02-05 | 1 | 1643690.9 | 42.31 | 2.572 | 211.096358 | 8.106 | A | 151315 |
| 1 | False | 2010-02-05 | 2 | 2136989.46 | 40.19 | 2.572 | 210.752605 | 8.324 | A | 202307 |
| 2 | False | 2010-02-05 | 3 | 461622.22 | 45.71 | 2.572 | 214.424881 | 7.368 | B | 37392 |
| 3 | False | 2010-02-05 | 4 | 2135143.87 | 43.76 | 2.598 | 126.442065 | 8.623 | A | 205863 |
| 4 | False | 2010-02-05 | 5 | 317173.1 | 39.7 | 2.572 | 211.653972 | 6.566 | B | 34875 |



Final dataset saved to 'wallmart_final_dataset.csv'.

🎉 Data processing complete!
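
The left merges above assume that each (`IsHoliday`, `Date`, `Store`) key appears at most once in each table and that every sales row finds a match. A minimal sketch of how those assumptions could be checked with the `validate` and `indicator` parameters of `pd.merge` follows. The two small DataFrames and their values are hypothetical stand-ins, not the notebook's data.

```python
import pandas as pd

# Hypothetical miniature versions of the consolidated sales and features tables.
sales = pd.DataFrame({
    "Store": [1, 1, 2],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12", "2010-02-05"]),
    "IsHoliday": [False, True, False],
    "Weekly_Sales": [100.0, 120.0, 80.0],
})
features = pd.DataFrame({
    "Store": [1, 1, 2],
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12", "2010-02-05"]),
    "IsHoliday": [False, True, False],
    "CPI": [211.1, 211.2, 210.8],
})

# validate="one_to_one" raises pandas.errors.MergeError if either side has
# duplicate keys; indicator=True adds a "_merge" column naming each row's origin.
checked = pd.merge(
    sales,
    features,
    on=["IsHoliday", "Date", "Store"],
    how="left",
    validate="one_to_one",
    indicator=True,
)

# Rows marked "left_only" had no matching features row.
unmatched = int((checked["_merge"] == "left_only").sum())
print(f"rows before: {len(sales)}, rows after: {len(checked)}, unmatched: {unmatched}")
```

If the row count changes or `unmatched` is non-zero, the merge keys deserve a closer look before the merged file is saved.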


## 2. Data Inspection and Cleaning

This section involves inspecting the merged dataset for missing values and checking data types to ensure data quality before further analysis.

In [20]:
# Load the final dataset
final_df = pd.read_csv('wallmart_final_dataset.csv')

# Print the head of the DataFrame
print("Final DataFrame Head:")
display(final_df.head())

# Print the tail of the DataFrame
print("\nFinal DataFrame Tail:")
display(final_df.tail())

# Check for missing values
print("\nMissing values in the Final DataFrame:")
display(final_df.isnull().sum())

Final DataFrame Head:


| | IsHoliday | Date | Store | Weekly_Sales | Temperature | Fuel_Price | CPI | Unemployment | Type | Size |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | 2010-02-05 | 1 | 1643690.9 | 42.31 | 2.572 | 211.096358 | 8.106 | A | 151315 |
| 1 | False | 2010-02-05 | 2 | 2136989.46 | 40.19 | 2.572 | 210.752605 | 8.324 | A | 202307 |
| 2 | False | 2010-02-05 | 3 | 461622.22 | 45.71 | 2.572 | 214.424881 | 7.368 | B | 37392 |
| 3 | False | 2010-02-05 | 4 | 2135143.87 | 43.76 | 2.598 | 126.442065 | 8.623 | A | 205863 |
| 4 | False | 2010-02-05 | 5 | 317173.1 | 39.7 | 2.572 | 211.653972 | 6.566 | B | 34875 |



Final DataFrame Tail:


| | IsHoliday | Date | Store | Weekly_Sales | Temperature | Fuel_Price | CPI | Unemployment | Type | Size |
|---|---|---|---|---|---|---|---|---|---|---|
| 6430 | True | 2012-09-07 | 41 | 1392143.82 | 67.41 | 3.596 | 198.095048 | 6.432 | A | 196321 |
| 6431 | True | 2012-09-07 | 42 | 617405.35 | 83.07 | 4.124 | 130.932548 | 7.17 | C | 39690 |
| 6432 | True | 2012-09-07 | 43 | 663814.18 | 84.99 | 3.73 | 213.799099 | 9.285 | C | 41062 |
| 6433 | True | 2012-09-07 | 44 | 338737.33 | 70.65 | 3.689 | 130.932548 | 5.407 | C | 39910 |
| 6434 | True | 2012-09-07 | 45 | 766512.66 | 75.7 | 3.911 | 191.577676 | 8.684 | B | 118221 |



Missing values in the Final DataFrame:


| Column | Missing values |
|---|---|
| IsHoliday | 0 |
| Date | 0 |
| Store | 0 |
| Weekly_Sales | 0 |
| Temperature | 0 |
| Fuel_Price | 0 |
| CPI | 0 |
| Unemployment | 0 |
| Type | 0 |
| Size | 0 |


In [21]:
print("Data types of each column:")
display(final_df.dtypes)

Data types of each column:


| Column | dtype |
|---|---|
| IsHoliday | bool |
| Date | object |
| Store | int64 |
| Weekly_Sales | float64 |
| Temperature | float64 |
| Fuel_Price | float64 |
| CPI | float64 |
| Unemployment | float64 |
| Type | object |
| Size | int64 |
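
The `Date` column comes back as `object` because the datetime type is not preserved when a DataFrame is written to CSV and read back. One way to avoid converting it again in later cells is to pass `parse_dates` to `pd.read_csv`. The sketch below uses a hypothetical two-row CSV held in memory rather than the notebook's file.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for the saved CSV file.
csv_text = (
    "IsHoliday,Date,Store,Weekly_Sales\n"
    "False,2010-02-05,1,100.0\n"
    "True,2010-02-12,1,120.0\n"
)

# Default read: the Date column is loaded as text.
as_text = pd.read_csv(io.StringIO(csv_text))

# With parse_dates, the Date column is converted to datetime64 while reading.
as_dates = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

print(as_text["Date"].dtype)
print(as_dates["Date"].dtype)
```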


## 3. Exploratory Data Analysis (EDA) - Weekly Sales

This section focuses on calculating and visualizing descriptive statistics for weekly sales by store to understand the distribution and variability of sales across different stores.

In [22]:
# Calculate mean, median, standard deviation, and variance for weekly sales by store
weekly_sales_stats = final_df.groupby('Store')['Weekly_Sales'].agg(['mean', 'median', 'std', 'var'])

# Calculate range for weekly sales by store
weekly_sales_stats['range'] = final_df.groupby('Store')['Weekly_Sales'].apply(lambda x: x.max() - x.min())

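# Calculate the mean within-store Z-score of weekly sales.
# Each store's sales are standardized against that store's own mean and standard deviation,
# so this mean is zero by construction (the ~1e-16 values in the output are floating-point noise).
# The column therefore serves only as a sanity check, not as a measure for comparing stores.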
weekly_sales_stats['z_score'] = final_df.groupby('Store')['Weekly_Sales'].transform(lambda x: (x - x.mean()) / x.std()).groupby(final_df['Store']).mean()


print("Weekly Sales Statistics by Store:")
display(weekly_sales_stats)

Weekly Sales Statistics by Store:


| Store | mean | median | std | var | range | z_score |
|---|---|---|---|---|---|---|
| 1 | 1555264.0 | 1534849.64 | 155980.767761 | 24330000000.0 | 1071050.89 | -4.650515e-16 |
| 2 | 1925751.0 | 1879107.31 | 237683.694682 | 56493540000.0 | 1785613.24 | -6.223654e-16 |
| 3 | 402704.4 | 395107.35 | 46319.631557 | 2145508000.0 | 266393.03 | -1.545093e-15 |
| 4 | 2094713.0 | 2073951.38 | 266201.442297 | 70863210000.0 | 1913849.68 | -1.435332e-16 |
| 5 | 318011.8 | 310338.17 | 37737.965745 | 1424154000.0 | 247263.36 | 1.846231e-15 |
| 6 | 1564728.0 | 1524390.07 | 212525.855862 | 45167240000.0 | 1466322.0 | -6.50218e-17 |
| 7 | 570617.3 | 557166.35 | 112585.46922 | 12675490000.0 | 687041.66 | -3.287968e-16 |
| 8 | 908749.5 | 893399.77 | 106280.829881 | 11295610000.0 | 739101.97 | -9.656223e-16 |
| 9 | 543980.6 | 536537.64 | 69028.666585 | 4764957000.0 | 452419.46 | 5.318201e-16 |
| 10 | 1899425.0 | 1827521.71 | 302262.062504 | 91362350000.0 | 2121350.38 | 4.145868e-16 |
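
The `z_score` values in the table are all on the order of 1e-16, which is what standardizing each store against its own mean and standard deviation produces. If the goal is to compare stores with each other, one possible alternative is to standardize each store's mean against the distribution of store means. The sketch below contrasts the two on hypothetical data. The name `between` is illustrative and not part of the notebook.

```python
import pandas as pd

# Hypothetical store-level weekly sales (three stores, three weeks each).
df = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Weekly_Sales": [10.0, 12.0, 14.0, 20.0, 25.0, 30.0, 5.0, 6.0, 7.0],
})

# Within-store z-scores: each store is standardized against its own mean and std,
# so the per-store average is zero up to floating-point error.
within = df.groupby("Store")["Weekly_Sales"].transform(
    lambda x: (x - x.mean()) / x.std()
)
within_mean = within.groupby(df["Store"]).mean()

# Between-store z-scores: standardize each store's mean against all store means.
store_means = df.groupby("Store")["Weekly_Sales"].mean()
between = (store_means - store_means.mean()) / store_means.std()

print(within_mean)
print(between)
```

A positive `between` value marks a store whose average sales sit above the average store, which is the kind of comparison a bar chart by store can actually show.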


In [23]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Get the store numbers
stores = weekly_sales_stats.index

# Create subplots
fig = make_subplots(rows=3, cols=2,
                    subplot_titles=('Mean Weekly Sales by Store', 'Median Weekly Sales by Store',
                                    'Std Dev of Weekly Sales by Store', 'Variance of Weekly Sales by Store',
                                    'Range of Weekly Sales by Store', 'Z-score of Weekly Sales by Store'))

# Add traces for each statistic
fig.add_trace(go.Bar(x=stores, y=weekly_sales_stats['mean'], name='Mean'), row=1, col=1)
fig.add_trace(go.Bar(x=stores, y=weekly_sales_stats['median'], name='Median'), row=1, col=2)
fig.add_trace(go.Bar(x=stores, y=weekly_sales_stats['std'], name='Standard Deviation'), row=2, col=1)
fig.add_trace(go.Bar(x=stores, y=weekly_sales_stats['var'], name='Variance'), row=2, col=2)
fig.add_trace(go.Bar(x=stores, y=weekly_sales_stats['range'], name='Range'), row=3, col=1)
fig.add_trace(go.Bar(x=stores, y=weekly_sales_stats['z_score'], name='Z-score'), row=3, col=2)


# Update layout
fig.update_layout(height=900, title_text="Weekly Sales Statistics by Store")
fig.show()

### Line chart for each store



In [24]:
import plotly.graph_objects as go

# Ensure 'Date' column is in datetime format
final_df['Date'] = pd.to_datetime(final_df['Date'])

# Get unique store numbers
stores = final_df['Store'].unique()

# Create a separate line chart for each store
for store in stores:
    # Filter data for the current store
    store_df = final_df[final_df['Store'] == store]

    # Sort by date to ensure the line chart is in chronological order
    store_df = store_df.sort_values(by='Date')

    # Create a line trace for the weekly sales over time
    fig = go.Figure(data=go.Scatter(x=store_df['Date'], y=store_df['Weekly_Sales'], mode='lines'))

    # Update layout
    fig.update_layout(title=f'Weekly Sales Over Time for Store {store}',
                      xaxis_title='Date',
                      yaxis_title='Weekly Sales')

    # Show the plot
    fig.show()

## Holiday and seasonal impact on sales
This section analyzes and visualizes the impact of holidays on weekly sales for each store, and of month and quarter on sales overall, using `final_df` and Plotly.

### Analyze holiday vs. non-holiday sales



In [25]:
# Group by 'Store' and 'IsHoliday' and calculate the mean of 'Weekly_Sales'
holiday_sales_impact = final_df.groupby(['Store', 'IsHoliday'])['Weekly_Sales'].mean().reset_index()

# Rename the 'Weekly_Sales' column for clarity
holiday_sales_impact = holiday_sales_impact.rename(columns={'Weekly_Sales': 'Average_Weekly_Sales'})

# Display the resulting DataFrame
print("Average Weekly Sales by Store and Holiday Status:")
display(holiday_sales_impact)

Average Weekly Sales by Store and Holiday Status:


| | Store | IsHoliday | Average_Weekly_Sales |
|---|---|---|---|
| 0 | 1 | False | 1.546957e+06 |
| 1 | 1 | True | 1.665748e+06 |
| 2 | 2 | False | 1.914209e+06 |
| 3 | 2 | True | 2.079267e+06 |
| 4 | 3 | False | 4.000648e+05 |
| ... | ... | ... | ... |
| 85 | 43 | True | 6.359463e+05 |
| 86 | 44 | False | 3.032536e+05 |
| 87 | 44 | True | 2.960356e+05 |
| 88 | 45 | False | 7.821985e+05 |
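
The long table above puts each store's holiday and non-holiday averages on separate rows, which makes the difference hard to read off. A minimal sketch of one way to express it as a percentage uplift per store is shown here. The numbers are hypothetical, and the column names `non_holiday`, `holiday`, and `uplift_pct` are illustrative choices.

```python
import pandas as pd

# Hypothetical averages in the same shape as holiday_sales_impact.
impact = pd.DataFrame({
    "Store": [1, 1, 2, 2],
    "IsHoliday": [False, True, False, True],
    "Average_Weekly_Sales": [1000.0, 1100.0, 2000.0, 1900.0],
})

# One row per store, with separate columns for non-holiday and holiday averages.
wide = impact.pivot(index="Store", columns="IsHoliday", values="Average_Weekly_Sales")
wide = wide.rename(columns={False: "non_holiday", True: "holiday"})

# Percentage change of the holiday average relative to the non-holiday average.
wide["uplift_pct"] = (wide["holiday"] / wide["non_holiday"] - 1) * 100

print(wide)
```

A negative `uplift_pct` flags a store whose holiday weeks average below its ordinary weeks, as the rows for store 44 in the table above suggest can happen.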


### Visualize holiday impact


In [26]:

# Filter the DataFrame for holiday and non-holiday weeks
holiday_sales_df = holiday_sales_impact[holiday_sales_impact['IsHoliday']]
non_holiday_sales_df = holiday_sales_impact[~holiday_sales_impact['IsHoliday']]

# Create the bar chart figure
fig = go.Figure()

# Add bar trace for non-holiday sales with tooltips
fig.add_trace(go.Bar(
    x=non_holiday_sales_df['Store'],
    y=non_holiday_sales_df['Average_Weekly_Sales'],
    name='Non-Holiday Sales',
    hoverinfo='text',
    hovertext='Store: ' + non_holiday_sales_df['Store'].astype(str) +
              '<br>Holiday: No' +
              '<br>Avg Sales: ' + non_holiday_sales_df['Average_Weekly_Sales'].round(2).astype(str)
))

# Add bar trace for holiday sales with tooltips
fig.add_trace(go.Bar(
    x=holiday_sales_df['Store'],
    y=holiday_sales_df['Average_Weekly_Sales'],
    name='Holiday Sales',
    hoverinfo='text',
    hovertext='Store: ' + holiday_sales_df['Store'].astype(str) +
              '<br>Holiday: Yes' +
              '<br>Avg Sales: ' + holiday_sales_df['Average_Weekly_Sales'].round(2).astype(str)
))

# Update layout with title and axis labels
fig.update_layout(
    title='Average Weekly Sales by Store: Holiday vs. Non-Holiday',
    xaxis_title='Store',
    yaxis_title='Average Weekly Sales',
    barmode='group' # This groups the bars for each store
)

# Show the plot
fig.show()

In [27]:
import plotly.express as px

# Create a boxplot of Weekly_Sales vs. IsHoliday
fig = px.box(final_df, x='IsHoliday', y='Weekly_Sales',
             title='Weekly Sales Distribution: Holiday vs. Non-Holiday')

# Update layout
fig.update_layout(xaxis_title='IsHoliday', yaxis_title='Weekly Sales')

# Show the plot
fig.show()

### Consider seasonal impact

In [28]:
# Ensure 'Date' column is in datetime format
final_df['Date'] = pd.to_datetime(final_df['Date'])

# Extract month and quarter
final_df['Month'] = final_df['Date'].dt.month
final_df['Quarter'] = final_df['Date'].dt.quarter

# Calculate average weekly sales by month
average_sales_by_month = final_df.groupby('Month')['Weekly_Sales'].mean().reset_index()

# Calculate average weekly sales by quarter
average_sales_by_quarter = final_df.groupby('Quarter')['Weekly_Sales'].mean().reset_index()

# Create a bar chart for average weekly sales by month
fig_month = go.Figure(data=go.Bar(x=average_sales_by_month['Month'], y=average_sales_by_month['Weekly_Sales']))
fig_month.update_layout(title='Average Weekly Sales by Month',
                        xaxis_title='Month',
                        yaxis_title='Average Weekly Sales')
fig_month.show()

# Create a bar chart for average weekly sales by quarter
fig_quarter = go.Figure(data=go.Bar(x=average_sales_by_quarter['Quarter'], y=average_sales_by_quarter['Weekly_Sales']))
fig_quarter.update_layout(title='Average Weekly Sales by Quarter',
                          xaxis_title='Quarter',
                          yaxis_title='Average Weekly Sales')
fig_quarter.show()
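
Averaging by calendar month pools all years together, so a month that appears in fewer years carries fewer observations (the earliest date visible in the outputs above is 2010-02-05). A rough sketch of how to show the row count behind each monthly mean, and how to split the means by year, is given below on hypothetical data. The `Year` column is an addition that the notebook does not create.

```python
import pandas as pd

# Hypothetical weekly rows; January appears in only one of the two years.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2010-02-05", "2010-02-12", "2011-01-07", "2011-02-04"]),
    "Weekly_Sales": [100.0, 110.0, 90.0, 120.0],
})
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month

# Mean sales per month together with the number of rows behind each mean.
by_month = df.groupby("Month")["Weekly_Sales"].agg(["mean", "count"])

# One row per year and one column per month; NaN where a month has no data.
by_year_month = df.pivot_table(
    index="Year", columns="Month", values="Weekly_Sales", aggfunc="mean"
)

print(by_month)
print(by_year_month)
```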

## Average weekly sales by store type

### Bar chart
A bar chart shows the average weekly sales for each store type.

In [29]:

# Calculate the average weekly sales by store type
average_sales_by_type = final_df.groupby('Type')['Weekly_Sales'].mean().reset_index()

# Create a bar chart for average weekly sales by store type
fig = go.Figure(data=go.Bar(x=average_sales_by_type['Type'], y=average_sales_by_type['Weekly_Sales']))

# Update layout
fig.update_layout(title='Average Weekly Sales by Store Type',
                  xaxis_title='Store Type',
                  yaxis_title='Average Weekly Sales')

# Show the plot
fig.show()
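
Store types differ in `Size` (see the `Type` and `Size` columns in the outputs above), so average sales by type partly reflect how large the stores are. A minimal sketch of a size-adjusted comparison follows, assuming it is meaningful to divide `Weekly_Sales` by `Size`. The notebook does not state the units of `Size`, and the data and the column name `Sales_per_Size` are hypothetical.

```python
import pandas as pd

# Hypothetical rows with the Type and Size columns that come from stores.csv.
df = pd.DataFrame({
    "Store": [1, 1, 2, 2],
    "Type": ["A", "A", "B", "B"],
    "Size": [200, 200, 50, 50],
    "Weekly_Sales": [1000.0, 1200.0, 400.0, 500.0],
})

# Sales divided by store size gives a size-adjusted measure.
df["Sales_per_Size"] = df["Weekly_Sales"] / df["Size"]

# Per type: raw average, size-adjusted average, and number of distinct stores.
by_type = df.groupby("Type").agg(
    avg_sales=("Weekly_Sales", "mean"),
    avg_sales_per_size=("Sales_per_Size", "mean"),
    n_stores=("Store", "nunique"),
)

print(by_type)
```

In this toy data type A leads on raw sales while type B leads once size is taken into account, which is the kind of reversal the adjusted column is meant to reveal.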

### Pie chart
A pie chart shows each store type's percentage contribution to total weekly sales.

In [30]:

# Calculate the total weekly sales
total_sales = final_df['Weekly_Sales'].sum()

# Calculate the sum of weekly sales by store type
sales_by_type = final_df.groupby('Type')['Weekly_Sales'].sum().reset_index()

# Calculate the percentage contribution of each store type
sales_by_type['Percentage'] = (sales_by_type['Weekly_Sales'] / total_sales) * 100

# Create a pie chart
fig = go.Figure(data=[go.Pie(labels=sales_by_type['Type'], values=sales_by_type['Percentage'], hole=.3)])

# Update layout
fig.update_layout(title='Percentage Contribution of Store Types to Total Weekly Sales')

# Show the plot
fig.show()

## Top 10 stores by average weekly sales

In [31]:

# Calculate the average weekly sales for each store
average_sales_by_store = final_df.groupby('Store')['Weekly_Sales'].mean().reset_index()

# Sort the stores by average weekly sales in descending order and get the top 10
top_10_stores = average_sales_by_store.sort_values(by='Weekly_Sales', ascending=False).head(10)

# Create a horizontal bar chart
fig = go.Figure(go.Bar(
    x=top_10_stores['Weekly_Sales'],
    y=top_10_stores['Store'].astype(str), # Convert store number to string for categorical axis
    orientation='h' # Set orientation to horizontal
))

# Update layout
fig.update_layout(
    title='Top 10 Stores by Average Weekly Sales',
    xaxis_title='Average Weekly Sales',
    yaxis_title='Store'
)

# Show the plot
fig.show()
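
The sort-then-head step in the cell above can also be written with `Series.nlargest`, which returns the largest values in descending order. This is only an equivalent shorthand, shown on hypothetical data with the two largest stores instead of ten.

```python
import pandas as pd

# Hypothetical weekly sales for three stores.
df = pd.DataFrame({
    "Store": [1, 1, 2, 2, 3, 3],
    "Weekly_Sales": [10.0, 20.0, 50.0, 70.0, 30.0, 30.0],
})

# Mean weekly sales per store, then the two largest means in descending order.
top_2 = df.groupby("Store")["Weekly_Sales"].mean().nlargest(2)

print(top_2)
```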

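## Unemployment and CPI impact on weekly sales

As context for the comparisons with unemployment and CPI, the first cell draws a heatmap of weekly sales by store and week number.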

In [32]:

# Ensure 'Date' column is in datetime format
final_df['Date'] = pd.to_datetime(final_df['Date'])

# Pivot the data to prepare for the heatmap
# We'll use the week number as the x-axis.
# First, extract the week number from the date.
final_df['Week_Number'] = final_df['Date'].dt.isocalendar().week

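# Pivot the table to have Stores as index, Week_Number as columns and Weekly_Sales as values.
# pivot_table aggregates with the mean by default, so each cell is the average over
# all years that contain that week number.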
heatmap_data = final_df.pivot_table(index='Store', columns='Week_Number', values='Weekly_Sales')

# Create the heatmap
fig = go.Figure(data=go.Heatmap(
    z=heatmap_data.values,
    x=heatmap_data.columns,
    y=heatmap_data.index,
    colorscale='Viridis' # You can choose a different colorscale
))

# Update layout
fig.update_layout(
    title='Weekly Sales Heatmap by Store and Week',
    xaxis_title='Week Number',
    yaxis_title='Store',
    xaxis=dict(tickmode='array', tickvals=heatmap_data.columns), # Ensure all week numbers are displayed
    yaxis=dict(tickmode='array', tickvals=heatmap_data.index) # Ensure all store numbers are displayed
)

# Show the plot
fig.show()
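
Because the heatmap is keyed on week number alone, values from different years that share a week number are averaged into one cell. The sketch below illustrates this on hypothetical data and shows one possible way to keep the years apart by also pivoting on the ISO year. The `ISO_Year` column is an assumption that the notebook does not create.

```python
import pandas as pd

# Hypothetical rows: store 1 has ISO week 5 in two different years.
df = pd.DataFrame({
    "Store": [1, 1, 2],
    "Date": pd.to_datetime(["2010-02-05", "2011-02-04", "2010-02-05"]),
    "Weekly_Sales": [100.0, 200.0, 50.0],
})
iso = df["Date"].dt.isocalendar()
df["ISO_Year"] = iso["year"].astype(int)
df["Week_Number"] = iso["week"].astype(int)

# Default aggfunc is the mean, so store 1, week 5 is the average of both years.
pooled = df.pivot_table(index="Store", columns="Week_Number", values="Weekly_Sales")

# Adding the ISO year to the columns keeps each year's value separate.
by_year = df.pivot_table(
    index="Store", columns=["ISO_Year", "Week_Number"], values="Weekly_Sales"
)

print(pooled)
print(by_year)
```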

### Unemployment rate impact

In [33]:
# Create a scatter plot of Weekly Sales vs. Unemployment
fig = go.Figure(data=go.Scatter(x=final_df['Unemployment'], y=final_df['Weekly_Sales'], mode='markers'))

# Update layout
fig.update_layout(title='Weekly Sales vs. Unemployment Rate (All Stores)',
                  xaxis_title='Unemployment Rate',
                  yaxis_title='Weekly Sales')

# Show the plot
fig.show()
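
A scatter plot over all stores mixes two effects: differences between stores and changes within a store over time. As a rough numerical companion, the sketch below computes a pooled Pearson correlation and a separate correlation inside each store. The data is hypothetical and deliberately built so that the two disagree.

```python
import pandas as pd

# Hypothetical data: two stores at different sales levels and unemployment rates.
df = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2],
    "Unemployment": [8.0, 8.1, 8.2, 6.0, 6.1, 6.2],
    "Weekly_Sales": [100.0, 101.0, 102.0, 200.0, 201.0, 202.0],
})

# Pearson correlation across all rows pooled together.
pooled_corr = df["Weekly_Sales"].corr(df["Unemployment"])

# Pearson correlation computed separately inside each store.
per_store_corr = df.groupby("Store")[["Weekly_Sales", "Unemployment"]].apply(
    lambda g: g["Weekly_Sales"].corr(g["Unemployment"])
)

print(f"pooled: {pooled_corr:.3f}")
print(per_store_corr)
```

Here the pooled correlation is negative while each store's own correlation is positive, so the pooled figure alone would be misleading about what happens inside a store.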

In [34]:
# Ensure 'Date' column is in datetime format
final_df['Date'] = pd.to_datetime(final_df['Date'])

# Get unique store numbers
stores = final_df['Store'].unique()

# Create a separate figure for each store
for store in stores:
    # Filter data for the current store
    store_df = final_df[final_df['Store'] == store].sort_values(by='Date')

    # Create figure with secondary y-axis
    fig = make_subplots(specs=[[{"secondary_y": True}]])

    # Add trace for Weekly Sales
    fig.add_trace(go.Scatter(x=store_df['Date'], y=store_df['Weekly_Sales'], mode='lines', name='Weekly Sales'),
                  secondary_y=False)

    # Add trace for Unemployment Rate
    fig.add_trace(go.Scatter(x=store_df['Date'], y=store_df['Unemployment'], mode='lines', name='Unemployment Rate'),
                  secondary_y=True)

    # Add figure title
    fig.update_layout(
        title_text=f"Weekly Sales and Unemployment Rate Over Time for Store {store}"
    )

    # Set x-axis title
    fig.update_xaxes(title_text="Date")

    # Set y-axes titles
    fig.update_yaxes(title_text="Weekly Sales", secondary_y=False)
    fig.update_yaxes(title_text="Unemployment Rate", secondary_y=True)

    # Show the plot
    fig.show()

### CPI impact

In [35]:
# Ensure 'Date' column is in datetime format
final_df['Date'] = pd.to_datetime(final_df['Date'])

# Get unique store numbers
stores = final_df['Store'].unique()

# Create a separate figure for each store
for store in stores:
    # Filter data for the current store
    store_df = final_df[final_df['Store'] == store].sort_values(by='Date')

    # Create figure with secondary y-axis
    fig = make_subplots(specs=[[{"secondary_y": True}]])

    # Add trace for Weekly Sales
    fig.add_trace(go.Scatter(x=store_df['Date'], y=store_df['Weekly_Sales'], mode='lines', name='Weekly Sales'),
                  secondary_y=False)

    # Add trace for CPI
    fig.add_trace(go.Scatter(x=store_df['Date'], y=store_df['CPI'], mode='lines', name='CPI'),
                  secondary_y=True)

    # Add figure title
    fig.update_layout(
        title_text=f"Weekly Sales and CPI Over Time for Store {store}"
    )

    # Set x-axis title
    fig.update_xaxes(title_text="Date")

    # Set y-axes titles
    fig.update_yaxes(title_text="Weekly Sales", secondary_y=False)
    fig.update_yaxes(title_text="CPI", secondary_y=True)

    # Show the plot
    fig.show()

In [36]:
# Create a scatter plot of Weekly Sales vs. CPI (All Stores)
fig = go.Figure(data=go.Scatter(x=final_df['CPI'], y=final_df['Weekly_Sales'], mode='markers'))

# Update layout
fig.update_layout(title='Weekly Sales vs. CPI (All Stores)',
                  xaxis_title='CPI',
                  yaxis_title='Weekly Sales')

# Show the plot
fig.show()
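
The scatter plots for unemployment and CPI can be complemented with a correlation matrix, which puts the pairwise Pearson correlations in one table. The following is a sketch on hypothetical values, not a result from the Walmart data.

```python
import pandas as pd

# Hypothetical values for three of the numeric columns in final_df.
df = pd.DataFrame({
    "Weekly_Sales": [100.0, 110.0, 105.0, 120.0],
    "CPI": [210.0, 211.0, 212.0, 213.0],
    "Unemployment": [8.0, 7.9, 7.8, 7.7],
})

# Pairwise Pearson correlations between the selected columns.
corr = df[["Weekly_Sales", "CPI", "Unemployment"]].corr()

print(corr.round(3))
```

In this toy data CPI and unemployment move in exactly opposite directions, so their correlations with sales carry the same information with opposite signs. A matrix like this makes such overlap between explanatory variables easy to spot.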