# CW2: UK Retail Market Analysis for Carrefour, Abhijeet Santhosh Kumar                                  
## Table of Links
### Table
| Description | Link |
| -- | -- |
| Market Share Dataset | 1. https://www.kantarworldpanel.com/grocery-market-share/great-britain/snapshot/28.01.24/ |
| UK supermarket product description | 2. https://www.kaggle.com/datasets/declanmcalinden/time-series-uk-supermarket-data |
| Retail Sales Index (RSI) | 3. https://www.ons.gov.uk/businessindustryandtrade/retailindustry/methodologies/retailsalesindexrsiqmi |
| Consumer Price Index (CPI) | 4. https://www.ons.gov.uk/datasets/cpih01/editions/time-series/versions/53 |


## Table of Contents
1. Introduction 
2. Aim, Objectives and Target Audience
3. UK Retail Market Analysis - Dashboard (Code)
4. Articulation of Decision Making Process
5. Review of Analytics Methods Chosen
6. Review of Available Tools
7. Review of Chosen Datasets 
8. Reflective Evaluation
9. Conclusion


## 1. Introduction

Several supermarket chains compete in the United Kingdom, each serving a distinct customer segment. This study considers the global retailer Carrefour, which wants to understand the UK retail market and its potential for business expansion. To support that judgement with data, an interactive dashboard is developed using the Dash Python framework. The dashboard analyses the market share of the big retailers, their price distributions and product variety, and national economic indices, namely the Retail Sales Index (RSI) and the Consumer Price Index (CPI).

## 2. Aim, Objectives and Target Audience

Aim: The main aim of this project is to design and develop an interactive dashboard using Dash to monitor the UK retail market for Carrefour. The insights derived from the dashboard should enable data-driven strategic planning for market entry and for positioning Carrefour relative to its competitors.

Objectives: The aim is fulfilled through the following objectives:

1.	Market Share Analysis: A stacked area graph and a pie chart show the distribution of market share among the biggest players in the UK retail market, to analyse the competition.

2.	Price Distribution Analysis: Box and whisker plots examine the price distribution of all products, and separately of own-brand products, across the top five UK retailers, filtered by product category, to understand pricing strategies and market segmentation.

3.	Product Category Comparison: Bar graphs show the number of unique products in each category, for all brands and for own brands, at the five largest UK retail chains. This gives insight into the breadth of each product range and into differentiation by brand.

4.	Retail Market Performance: Line graphs depict trends in the Retail Sales Index (RSI) and the Consumer Price Index (CPI), which indicate the overall performance of the retail market over time and illustrate trends in sales and inflation across product categories.
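To make the second objective concrete, a box and whisker plot condenses each retailer's prices into a five-number summary: minimum, lower quartile, median, upper quartile, and maximum. The sketch below computes that summary with the Python standard library only. The retailer names and prices are made up for illustration, and the quartile interpolation (`method="inclusive"`) is an assumption; a plotting library such as Plotly may interpolate quartiles slightly differently and may draw extreme values as separate outlier points rather than as whisker ends.

```python
from statistics import quantiles


def five_number_summary(prices):
    """Return (min, Q1, median, Q3, max) for a list of prices."""
    ordered = sorted(prices)
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical prices (£) for one product category at two retailers.
sample_prices = {
    "Retailer A": [0.5, 1.0, 1.5, 2.0, 2.5],
    "Retailer B": [1.0, 2.0, 3.0, 4.0, 9.0],
}

summaries = {name: five_number_summary(p) for name, p in sample_prices.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Comparing the medians and the spread between the quartiles across retailers is what lets a reader judge whether a chain leans towards budget or premium pricing within a category.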

Target Audience: The dashboard is personalised for the Carrefour decision-maker who has the most significant say on market expansions, especially on entering new countries such as the UK. Senior executives such as Chief Strategy Officer (CSO), business development managers, and market expansion directors could refer to this to help them make informed strategic decisions. This visualisation contributes to understanding and evaluating the market trend and competitor positioning and pricing strategies to know the competitive landscape and thus opportunities for growth.


## 3. UK Retail Market Analysis - Dashboard (Code)

“Please bear with me, as the code might take a few extra seconds to run. Thank you for your patience!”

#Importing Libraries
from dash import Dash, html, dash_table, dcc, Input, Output, callback
import dash_bootstrap_components as dbc
import plotly.express as px
import pandas as pd


############################################################################################################################################
#Loading Market Share Data
share = pd.read_csv("excel_cleaned_data/Market share.csv")

#Loading CPI Data
cpi = pd.read_csv("excel_cleaned_data/CPI.csv")

#Loading RSI Data
rsi = pd.read_csv("excel_cleaned_data/RSI.csv")

#Loading Kaggle supermarkets Data and concatenation
aldi = pd.read_csv("excel_cleaned_data/All_Data_Aldi.csv")
asda = pd.read_csv("excel_cleaned_data/All_Data_ASDA.csv")
morrisons = pd.read_csv("excel_cleaned_data/All_Data_Morrisons.csv")
sainsbury = pd.read_csv("excel_cleaned_data/All_Data_Sains.csv")
tesco = pd.read_csv("excel_cleaned_data/All_Data_Tesco.csv")
supermarkets = pd.concat([aldi, asda, morrisons, sainsbury, tesco], ignore_index=True)


############################################################################################################################################
#Cleaning Supermarket Data
supermarkets["date"] = pd.to_datetime(supermarkets["date"], format='%Y%m%d').dt.strftime('%d %m %Y')
supermarkets.drop(['prices_unit_(£)',"unit"], axis =1, inplace = True)

#Cleaning RSI data
rsi.rename(columns={"Time Period": "Year"}, inplace=True)
rsi["Year"] = pd.to_datetime(rsi["Year"], format='%Y %b').dt.strftime('%Y')
rsi = rsi.groupby(by = ["Year"]).mean(numeric_only=True).reset_index()  # Average the monthly values per year; skip any non-numeric columns
rsi["Year"] = rsi["Year"].astype(int)

#Melting Market share data
share_melt = pd.melt(share, ["Year"],
                       var_name="Store Name",
                       value_name="Market Share")
share_melt = share_melt.sort_values(by=["Store Name", "Year"])

#Melting CPI data
cpi_melt = pd.melt(cpi, ["Year"],
                       var_name="Category",
                       value_name="CPI")
cpi_melt = cpi_melt.sort_values(by=["Year", "Category"])

#Melting RSI data
rsi_melt = pd.melt(rsi, ["Year"],
                       var_name="Category",
                       value_name="RSI")
rsi_melt = rsi_melt.sort_values(by=["Year", "Category"])
rsi_melt = rsi_melt.dropna(subset=["RSI"])  # Drop rows with no RSI value and keep the result (dropna is not in-place)

############################################################################################################################################
#All Brands price distribution box and whiskers plot
# Dropdown for selecting a category
category_all_dropdown = dcc.Dropdown(
    id='category-all-dropdown',
    options=[{"label": str(category), "value": category} for category in sorted(supermarkets["category"].unique())],
    value=supermarkets["category"].iloc[0],  # Set default value to the first category
    placeholder="Select a Category",
    style={"margin-bottom": "20px"}
)
all_price_range_slider = dcc.RangeSlider(
        id='all-price-range-slider',
        min=0,
        max=500,
        step=1,
        marks={i: f"£{i}" for i in range(0, 501, 50)},  # Include the £500 maximum
        value=[0, 50]  # Default range
)

# All-brands box plot card
card_all_box = dbc.Card(
    [
        dbc.CardHeader("Price distribution of all products across UK Supermarkets", 
                       style={
                                "background": "linear-gradient(to right, #1e3c72, #2a5298)",  # Gradient
                                "color": "white",
                                "font-size": "18px",
                                "font-weight": "bold",
                                "text-align": "center",
                            }),
        dbc.CardBody([category_all_dropdown, all_price_range_slider, dcc.Graph(id="all-box")])
    ],
    outline=True,
    color="white"
)

# Callback for updating the box plot based on the selected category and price range
@callback(
    Output("all-box", "figure"),
    [Input("category-all-dropdown", "value"),
     Input("all-price-range-slider", "value")]
)
def update_all_category_box(selected_category, price_range):
    # Filter the data for the selected category
    filtered_data = supermarkets[
        (supermarkets["category"] == selected_category) & 
        (supermarkets["date"] == supermarkets.loc[0, "date"]) &
        (supermarkets["prices_(£)"] >= price_range[0]) & 
        (supermarkets["prices_(£)"] <= price_range[1])
    ]

    # Create the box and whiskers plot
    fig = px.box(
        filtered_data,  # Use the filtered dataframe
        x="supermarket",  # X-axis represents supermarkets
        y="prices_(£)",   # Y-axis represents prices in £
        color="supermarket",  # Color-coded by supermarket
        color_discrete_sequence=px.colors.qualitative.Dark24,  # Use the same color palette
        title=f"Price distribution of all products across supermarkets in {selected_category} (£{price_range[0]}-£{price_range[1]})",  # Dynamic title
        hover_data={"names": True, "prices_(£)": True}
    )
    
    return fig


############################################################################################################################################
#Own Brand price distribution box and whiskers plot
# Dropdown for selecting a category
category_own_dropdown = dcc.Dropdown(
    id='category-own-dropdown',
    options=[{"label": str(category), "value": category} for category in sorted(supermarkets["category"].unique())],
    value=supermarkets["category"].iloc[0],  # Set default value to the first category
    placeholder="Select a Category",
    style={"margin-bottom": "20px"}
)
own_price_range_slider = dcc.RangeSlider(
        id='own-price-range-slider',
        min=0,
        max=500,
        step=1,
        marks={i: f"£{i}" for i in range(0, 501, 50)},  # Include the £500 maximum
        value=[0, 50]  # Default range
)

# Own-brand box plot card
card_own_box = dbc.Card(
    [
        dbc.CardHeader("Price distribution of own brand products across UK Supermarkets", 
                       style={
                                "background": "linear-gradient(to right, #1e3c72, #2a5298)",  # Gradient
                                "color": "white",
                                "font-size": "18px",
                                "font-weight": "bold",
                                "text-align": "center",
                            }),
        dbc.CardBody([category_own_dropdown, own_price_range_slider, dcc.Graph(id="own-box")])
    ],
    outline=True,
    color="white"
)

# Callback for updating the box plot based on the selected category and price range
@callback(
    Output("own-box", "figure"),
    [Input("category-own-dropdown", "value"),
     Input("own-price-range-slider", "value")]
)
def update_own_category_box(selected_category, price_range):
    # Filter the data for the selected category
    filtered_data = supermarkets[
        (supermarkets["category"] == selected_category) & 
        (supermarkets["date"] == supermarkets.loc[0, "date"]) &
        (supermarkets["own_brand"] == True) &
        (supermarkets["prices_(£)"] >= price_range[0]) & 
        (supermarkets["prices_(£)"] <= price_range[1])
    ]

    # Create the box and whiskers plot
    fig = px.box(
        filtered_data,  # Use the filtered dataframe
        x="supermarket",  # X-axis represents supermarkets
        y="prices_(£)",   # Y-axis represents prices in £
        color="supermarket",  # Color-coded by supermarket
        color_discrete_sequence=px.colors.qualitative.Dark24,  # Use the same color palette
        title=f"Price distribution of own brand across Supermarkets in {selected_category} (£{price_range[0]}-£{price_range[1]})",  # Dynamic title
        hover_data={"names": True, "prices_(£)": True}
    )
    
    return fig

############################################################################################################################################
# All Brand No of Products based across UK supermarkets
# Dropdown for selecting a category
count_all_dropdown = dcc.Dropdown(
    id="count-all-dropdown",
    options=[{"label": str(cat), "value": cat} for cat in sorted(supermarkets["category"].unique())],
    value=supermarkets["category"].iloc[0],  # Default to the first category
    placeholder="Select a Category",
    style={"margin-bottom": "20px"},
)

# Bar graph card
card_all_bar = dbc.Card(
    [
        dbc.CardHeader("Number of Products by All Brands", 
                       style={
                                "background": "linear-gradient(to right, #1e3c72, #2a5298)",  # Gradient
                                "color": "white",
                                "font-size": "18px",
                                "font-weight": "bold",
                                "text-align": "center",
                            }),
        dbc.CardBody([count_all_dropdown, dcc.Graph(id="all-bar")]),
    ],
    outline=True,
    color="white",
)

# Callback for updating the bar graph based on selected category
@callback(
    Output("all-bar", "figure"),
    Input("count-all-dropdown", "value"),
)
def update_all_brand_bar(selected_category):
    filtered_data = supermarkets[(supermarkets["category"] == selected_category) & 
        (supermarkets["date"] == supermarkets.loc[0, "date"])
    ]
    
    
    # Count unique product names grouped by supermarket
    product_counts = filtered_data.groupby(["supermarket"])["names"].nunique().reset_index()
    product_counts.rename(columns={"names": "product_count"}, inplace=True)
    
    # Create the bar graph
    fig = px.bar(
        product_counts,
        x="supermarket",
        y="product_count",
        color="supermarket",
        text="product_count",
        title=f"Number of Unique Products (All Brands) in '{selected_category}'",
        labels={"supermarket": "Supermarket", "product_count": "Number of Products"},
        color_discrete_sequence=px.colors.qualitative.Pastel1,
    )

    fig.update_traces(textposition="outside")
    fig.update_layout(
        xaxis_title="Supermarket",
        yaxis_title="Number of Products",
        showlegend=False,
    )
    return fig

############################################################################################################################################
# Own Brand No of Products based across UK supermarkets
# Dropdown for selecting a category
count_own_dropdown = dcc.Dropdown(
    id="count-own-dropdown",
    options=[{"label": str(cat), "value": cat} for cat in sorted(supermarkets["category"].unique())],
    value=supermarkets["category"].iloc[0],  # Default to the first category
    placeholder="Select a Category",
    style={"margin-bottom": "20px"},
)

# Bar graph card
card_own_bar = dbc.Card(
    [
        dbc.CardHeader("Number of Products by Own Brand", 
                       style={
                                "background": "linear-gradient(to right, #1e3c72, #2a5298)",  # Gradient
                                "color": "white",
                                "font-size": "18px",
                                "font-weight": "bold",
                                "text-align": "center",
                            }),
        dbc.CardBody([count_own_dropdown, dcc.Graph(id="own-bar")]),
    ],
    outline=True,
    color="white",
)

# Callback for updating the bar graph based on selected category
@callback(
    Output("own-bar", "figure"),
    Input("count-own-dropdown", "value"),
)
def update_own_brand_bar(selected_category):
    filtered_data = supermarkets[(supermarkets["category"] == selected_category) & 
        (supermarkets["own_brand"] == True) &
        (supermarkets["date"] == supermarkets.loc[0, "date"])
    ]
    
    
    # Count unique own-brand product names grouped by supermarket
    product_counts = filtered_data.groupby(["supermarket"])["names"].nunique().reset_index()
    product_counts.rename(columns={"names": "product_count"}, inplace=True)
    
    # Create the bar graph
    fig = px.bar(
        product_counts,
        x="supermarket",
        y="product_count",
        color="supermarket",
        text="product_count",
        title=f"Number of Unique Products (Own Brand) in '{selected_category}'",
        labels={"supermarket": "Supermarket", "product_count": "Number of Products"},
        color_discrete_sequence=px.colors.qualitative.Pastel1,
    )

    fig.update_traces(textposition="outside")
    fig.update_layout(
        xaxis_title="Supermarket",
        yaxis_title="Number of Products",
        showlegend=False,
    )
    return fig


############################################################################################################################################
#Market Share Pie Chart
# Dropdown for Selecting Year
year_dropdown = dcc.Dropdown(
    id='year-dropdown',
    options=[{"label": str(year), "value": year} for year in sorted(share_melt["Year"].unique())],
    value=share_melt["Year"].max(),
    placeholder="Select a Year",
    style={"margin-bottom": "20px"}
)

# Market Share Pie Chart Graph Card
card_share_pie = dbc.Card(
    [
        dbc.CardHeader("Pie Chart showing Market Share among stores in UK", 
                       style={
                                "background": "linear-gradient(to right, #1e3c72, #2a5298)",  # Gradient
                                "color": "white",
                                "font-size": "18px",
                                "font-weight": "bold",
                                "text-align": "center",
                            }),
        dbc.CardBody([year_dropdown, dcc.Graph(id="pie-chart")])
    ],outline=True, color="white"
)

@callback(
    Output("pie-chart", "figure"),
    Input("year-dropdown", "value")
)
def update_pie_chart(selected_year):
    # Filter the data for the selected year
    share_filtered_data = share_melt[share_melt["Year"] == selected_year]

    # Create the pie (donut) chart
    fig = px.pie(
        share_filtered_data,
        values='Market Share',
        names='Store Name',
        color="Store Name",
        color_discrete_sequence=px.colors.qualitative.Dark24,
        hole=.3
    )
    return fig

############################################################################################################################################
#Market Share Area Graph
# Create a stacked area graph
fig_market_stacked_area = px.area(
    share_melt,
    x="Year",
    y="Market Share",
    color="Store Name",
    color_discrete_sequence=px.colors.qualitative.Dark24,
    width=1000
)

# Customize axes
fig_market_stacked_area.update_xaxes(tickangle=-45)
fig_market_stacked_area.update_xaxes(
    showline=True, linewidth=2, linecolor='black',
    showgrid=True, gridwidth=1, gridcolor='lightGray'
)
fig_market_stacked_area.update_yaxes(
    showline=True, linewidth=2, linecolor='black',
    showgrid=True, gridwidth=1, gridcolor='lightGray'
)

# Customize layout
fig_market_stacked_area.update_layout(
    xaxis_title="Year",
    yaxis_title="Market Share in (%)",
    title_text="Market Share of Supermarkets in the UK",
)

#Market share stacked area graph Card
card_market_stacked_area = dbc.Card(
    [   
        dbc.CardHeader('Stacked Area Graph showing Market Share among stores in UK',
                       style = {
                                "background": "linear-gradient(to right, #1e3c72, #2a5298)",  # Gradient
                                "color": "white",
                                "font-size": "18px",
                                "font-weight": "bold",
                                "text-align": "center",
                            }),
        dbc.CardBody([dcc.Graph(figure = fig_market_stacked_area)])
    ], outline=True, color="white"
)

############################################################################################################################################
#CPI Line Graph
# Dropdown for selecting the start year
cpi_start_year_dropdown = dcc.Dropdown(
    id='cpi-start-year-dropdown',
    options=[{"label": str(year), "value": year} for year in sorted(cpi_melt["Year"].unique())],
    value=cpi_melt["Year"].min(),
    placeholder="Select a Year",
    style={"margin-bottom": "20px"}
)

# Dropdown for selecting end-year
cpi_end_year_dropdown = dcc.Dropdown(
    id='cpi-end-year-dropdown',
    options=[{"label": str(year), "value": year} for year in sorted(cpi_melt["Year"].unique())],
    value=cpi_melt["Year"].max(),
    placeholder="Select a Year",
    style={"margin-bottom": "20px"}
)

# Card to contain the elements
card_cpi_line = dbc.Card(
    [
        dbc.CardHeader("Line Graph showing Consumer Price Index values across categories in UK", 
                       style={
                                "background": "linear-gradient(to right, #1e3c72, #2a5298)",  # Gradient
                                "color": "white",
                                "font-size": "18px",
                                "font-weight": "bold",
                                "text-align": "center",
                            }),
        dbc.CardBody([cpi_start_year_dropdown, cpi_end_year_dropdown, dcc.Graph(id="cpi-line-graph")])
    ],
    outline=True,
    color="white"
)

# Callback to update the line graph based on selected years
@callback(
    Output("cpi-line-graph", "figure"),
    Input("cpi-start-year-dropdown", "value"),
    Input("cpi-end-year-dropdown", "value")
)

def update_cpi_line_graph(start_year, end_year):
    # Validate inputs
    if start_year is None or end_year is None:
        return {}  # Return empty figure if inputs are invalid
    
    # Filter the data
    try:
        cpi_filtered_data = cpi_melt[(cpi_melt["Year"] >= start_year) & (cpi_melt["Year"] <= end_year)]
        if cpi_filtered_data.empty:
            raise ValueError("No data available for the selected range.")
    except Exception as e:
        print(f"Error in data filtering: {e}")
        return {}  # Return empty figure

    # Create a line graph of year against Consumer Price Index in the UK, using the filtered year range; only three selected categories are shown by default
    fig_cpi_line = px.line(cpi_filtered_data, x="Year", y="CPI", color='Category', color_discrete_sequence=px.colors.qualitative.Dark24, width=1000,).update_traces(
        visible="legendonly", selector=lambda t: t.name not in ["Food, alcoholic beverages & tobacco goods", "Processed food & non-alcoholic beverages GOODS", "Unprocessed food goods"])
    
    fig_cpi_line.update_xaxes(tickangle = -45)
    fig_cpi_line.update_xaxes(showline = True, linewidth = 2, linecolor ='black',
                          showgrid = True, gridwidth = 1, gridcolor ='lightGray')
    fig_cpi_line.update_yaxes(showline = True, linewidth = 2, linecolor ='black',
                          showgrid = True, gridwidth = 1, gridcolor ='lightGray')
    
    
    fig_cpi_line.update_layout(
        xaxis_title = "Year",
        yaxis_title = "Consumer Price Index",
        title_text = 'Consumer Price Index per Category of Goods',
        plot_bgcolor =  'rgba(0, 0, 0, 0)',
        paper_bgcolor = 'rgba(0, 0, 0, 0)',
    )
    return fig_cpi_line

############################################################################################################################################
#RSI line graph
# Dropdown for selecting start year
rsi_start_year_dropdown = dcc.Dropdown(
    id='rsi-start-year-dropdown',
    options=[{"label": str(year), "value": year} for year in sorted(rsi_melt["Year"].unique())],
    value=2000,
    placeholder="Select a Year",
    style={"margin-bottom": "20px"}
)

# Dropdown for selecting end year
rsi_end_year_dropdown = dcc.Dropdown(
    id='rsi-end-year-dropdown',
    options=[{"label": str(year), "value": year} for year in sorted(rsi_melt["Year"].unique())],
    value=rsi_melt["Year"].max(),
    placeholder="Select a Year",
    style={"margin-bottom": "20px"}
)

# Card to contain the elements
card_rsi_line = dbc.Card(
    [
        dbc.CardHeader("Line Graph showing Retail Sales Index values across categories in UK", 
                       style={
                                "background": "linear-gradient(to right, #1e3c72, #2a5298)",  # Gradient
                                "color": "white",
                                "font-size": "18px",
                                "font-weight": "bold",
                                "text-align": "center",
                            }),
        dbc.CardBody([rsi_start_year_dropdown, rsi_end_year_dropdown, dcc.Graph(id="rsi-line-graph")])
    ],
    outline=True,
    color="white"
)

# Callback to update the line graph based on selected years
@callback(
    Output("rsi-line-graph", "figure"),
    Input("rsi-start-year-dropdown", "value"),
    Input("rsi-end-year-dropdown", "value")
)
def update_rsi_line_graph(start_year, end_year):
    # Validate inputs
    if start_year is None or end_year is None:
        return {}  # Return empty figure if inputs are invalid
    
    # Filter the data
    try:
        rsi_filtered_data = rsi_melt[(rsi_melt["Year"] >= start_year) & (rsi_melt["Year"] <= end_year)]
        if rsi_filtered_data.empty:
            raise ValueError("No data available for the selected range.")
    except Exception as e:
        print(f"Error in data filtering: {e}")
        return {}  # Return empty figure

    # Create a line graph of year against Retail Sales Index in the UK; only three selected categories are shown by default
    fig_rsi_line = px.line(rsi_filtered_data, x="Year", y="RSI", color='Category', color_discrete_sequence=px.colors.qualitative.Dark24, width=1000,).update_traces(
        visible="legendonly", selector=lambda t: t.name not in ["All retail excl. fuel", "food stores", "non-food stores"])
    
    fig_rsi_line.update_xaxes(tickangle = -45)
    fig_rsi_line.update_xaxes(showline = True, linewidth = 2, linecolor ='black',
                          showgrid = True, gridwidth = 1, gridcolor ='lightGray')
    fig_rsi_line.update_yaxes(showline = True, linewidth = 2, linecolor ='black',
                          showgrid = True, gridwidth = 1, gridcolor ='lightGray')
    
    
    fig_rsi_line.update_layout(
        xaxis_title = "Year",
        yaxis_title = "Retail Sales Index",
        title_text = 'Retail Sales Index per Category of Goods',
        plot_bgcolor =  'rgba(0, 0, 0, 0)',
        paper_bgcolor = 'rgba(0, 0, 0, 0)',
    )
    return fig_rsi_line
###########################################################################################################################################

# Initialize the app
app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])


app.layout = html.Div([
    html.H1("UK Retail Market Analysis For Carrefour",style = {"color" : "#1e3c72"}),

       # Adding Carrefour logo image and header text
    html.P(
        """
       This interactive dashboard is designed to help Carrefour analyze the UK retail market and its competition for potential business 
       expansion. It includes four key sections: Market Share Analysis, All Products Information (price distribution and product count by 
       category across supermarkets), Own Brand Product Information (price distribution and product count by category), and CPI & Retail 
       Sales Index. With interactive dropdowns and sliders, users can filter data and gain insights into market dynamics. This tool supports 
       Carrefour in making informed decisions as they evaluate opportunities for entering the UK retail market. Scroll down for more details.
        """,
        style={
            "display": "inline-block",  # Display the description inline with the image
            "width": "70%",  
            "vertical-align": "bottom",  
            "padding-top": "50px",
            "padding-bottom": "50px",
            "padding-right": "10px",
            "padding-left": "10px",
            "color" : "#1e3c72"
            }),
    html.Img(src="/assets/carrefour_logo.png", style={
        "width": "300px",  
        "height": "auto", 
        "float": "right",  
        "margin": "1px",
        "padding-right" : "20px"
    }),
    
    #Market Stacked Area and Share Pie
    html.H2("Market Share Analysis", style = {"color" : "#1e3c72"}),
    html.P(
            """
            The first part provides insights into the market share of UK supermarkets through two interactive visualizations.
            The Stacked Area Chart shows how market shares have evolved (2020–2024), while the Pie Chart offers
            a detailed breakdown for a selected year. Use the dropdown to change years and the legend to isolate supermarkets
            for detailed analysis. Scroll down for more.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"}  
        ),
    dbc.Row([
        dbc.Col(card_market_stacked_area, width=6),
        dbc.Col(card_share_pie, width=6)
    ], className="mb-4"),

    # All Box and All Bar
    html.H2("Product Information in UK supermarkets", style = {"color" : "#1e3c72"}),
    html.P(
            """
            This section provides a detailed analysis of supermarket pricing and product variety through two visualizations. 
            The Box and Whisker Plot displays the price distribution of all products across supermarkets, allowing users to 
            analyze pricing strategies. Use the dropdown menu to select a product category and the slider to adjust the price range, 
            focusing on specific price brackets. This helps identify whether supermarkets cater to budget-conscious or premium customers.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"} 
        ),
    html.P(
            """  
            The Bar Graph, on the other hand, highlights the number of products each supermarket offers within the selected category. 
            It enables users to compare the diversity of product offerings across supermarkets.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"}  
        ),       
    html.P(
            """    
            Together, these graphs provide insights into market positioning. For example, a supermarket with a wide range of low-priced products 
            may target cost-sensitive shoppers, while another with fewer high-priced products may focus on premium quality. These combined 
            visualizations reveal pricing strategies and product variety trends in different categories. Scroll down for more.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"}  
        ),
    dbc.Row([
        dbc.Col(card_all_box, width=6),
        dbc.Col(card_all_bar, width=6)
    ], className="mb-4"),

    # Own Brand Box and Own Brand Bar
    html.H2("Own Brand Product Information in UK supermarkets", style = {"color" : "#1e3c72"}),
    html.P(
            """
            This section analyzes supermarket products sold under their brand name through two visualizations. The Box and Whisker Plot
            displays the price distribution of own-brand products across supermarkets, offering insights into how these private-label goods are priced.
            Users can select a specific product category via the dropdown and refine the analysis by adjusting the price range using the slider. 
            This allows you to explore whether own-brand products are positioned as budget-friendly alternatives or as premium offerings within a 
            particular category.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"}  
        ),            
    html.P(
            """            
            Complementing this, the Bar Graph highlights the number of own-brand products offered by each supermarket in the chosen category. 
            This graph provides a clear comparison of which supermarkets emphasize private-label goods and the extent of their product diversity within
            the category.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"} 
        ),
 

    html.P(
            """
            Together, these graphs reveal the role of own-brand products in each supermarket’s strategy. For example, a supermarket with a wide range of
            low-priced own-brand items may target cost-conscious customers, while another with fewer, higher-priced own-brand products may focus on
            quality or niche segments. These visualizations help uncover how supermarkets use their private-label brands to compete and differentiate
            themselves in the market. Scroll down for more.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"}  
        ),
    dbc.Row([
        dbc.Col(card_own_box, width=6),
        dbc.Col(card_own_bar, width=6)
    ], className="mb-4"),

    #CPI Line
    html.H2("Consumer Price Index (CPI) trends over the years", style = {"color" : "#1e3c72"}),
    html.P(
            """
            The Consumer Price Index (CPI) plot is an interactive line graph that visualizes inflation trends across various categories over
            time, with dropdown menus to select start and end years. The graph displays CPI data for categories like food, housing, and
            transportation, offering flexibility to analyze specific time periods. In retail market analysis, this tool helps track the
            impact of inflation on product pricing and consumer behavior. Retailers can adjust strategies based on CPI trends, forecasting 
            shifts in demand and cost of goods sold. This visualization supports informed, data-driven decision-making in pricing and market
            forecasting.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"}  
        ),     
    dbc.Row(card_cpi_line, className="mb-4"),

    # RSI Line
    html.H2("Retail Sales Index (RSI) trends over the years", style = {"color" : "#1e3c72"}),
    html.P(
            """
            The Retail Sales Index (RSI) plot is an interactive line graph that visualizes trends in retail sales across various categories over time, 
            with dropdown menus for selecting start and end years. The graph shows sales data for categories like electronics, clothing,
            and groceries, allowing users to focus on specific periods. In retail market analysis, the RSI plot helps track sales
            performance and identify patterns in consumer spending. Retailers can use this information to adjust inventory, forecast demand,
            and refine pricing strategies. The visualization supports data-driven decision-making, enhancing overall business strategy 
            and market forecasting.
            """,
            style={"font-size": "14px", "line-height": "1.5", "color" : "#1e3c72"}  
        ),     
    dbc.Row(card_rsi_line, className="mb-4")
])

# Here we want the app to run in a browser tab external to the notebook but to also allow us to debug
if __name__ == '__main__':
    app.run(jupyter_mode="external", debug=True)



Dash app running on http://127.0.0.1:8050/


## Articulation of Decision-Making Process 

### Key Visualization Principles Implemented

An essential part of the dashboard design was the application of basic data visualization principles, with the intent of clarity, accessibility, and relevance. These principles ensured that each individual chart communicated its meaning effectively, so that it enabled Carrefour to make smart, data-driven decisions.

One of the first considerations was the ink used. The headings and labels on the dashboard use Carrefour's signature blue colour in order to be consistent with the brand identity. Together with the incorporation of the brand logo itself, this aims to improve user engagement (Tufte, 2001). The design deliberately keeps non-data ink to a minimum. A white background allows the charts to stand out, and the only non-data ink is the headings, which have a blue background to draw focus to the graphs (Few, 2006).

The choice of chart types was guided by the principle of selecting the most effective visualization for the data type. For example, a box and whisker plot was selected to visualize price distribution across supermarket brands. This chart type is valuable for analyzing not only the average price but also variability, outliers, and price range, providing insights into the pricing strategies of competitors (Cleveland, 1994). Similarly, stacked area charts and pie charts were chosen to display market share trends and proportions over time, allowing users to easily track changes in the competitive landscape (Heer et al., 2010). Furthermore, the design adhered to user-centricity by ensuring that each chart was designed with the user’s needs in mind. Short explanatory paragraphs were included within each section to provide context, ensuring that users could navigate the data without requiring technical expertise (Shneiderman, 2003).

Study went into choosing the right type of visualization for each data type. A box and whisker plot, for example, is used to represent the distribution of prices among supermarket brands. This sort of chart is very useful in analyzing not only the average price but also its variation, any outliers, and the price range, and it provides valuable insights into the pricing strategies and customer segmentation of competitors (Cleveland, 1994). Stacked area charts and pie charts, on the other hand, are used to display market shares as proportions over time in order to facilitate tracking of how competitors have evolved (Heer et al., 2010). The dashboard is kept very user-centric by incorporating short text boxes in each section that give contextual information, so that users do not need deep technical knowledge to browse through the data (Shneiderman, 2003).

### Key Steps in the Process

The first step was to clearly define the project goals and then critically choose the data sources. Carrefour's strategic goal of analysing the UK retail market was boiled down into four key objectives, as mentioned earlier, covering market share, product description, and the economic indicators CPI and RSI. Once these were set, the next step was gathering data from reliable sources.

Data for market share was obtained through Kantar, which provided the necessary information on the standing of UK supermarkets and was instrumental in producing the visualizations of leading retailers such as Tesco, Sainsbury's and Aldi (Kantar, 2022). Price distribution and product count data originated from Kaggle, which provided the vital inputs for understanding the pricing strategies and the product diversity of the major UK supermarkets. This data was valuable in analyzing price variations through the box and whisker plots and comparing product counts with bar graphs (Kaggle, 2022). Further, RSI and CPI figures from the Office for National Statistics offered the economic context in which Carrefour can judge retail market performance and inflation over time (Office for National Statistics, 2023).


To develop the next portion of the dashboard, coding was done in Jupyter Notebook, and the first task was to prepare and clean the data. This included removing all irrelevant columns and focusing on the key metrics that would be used for analysis. Data from the different sources then had to be concatenated to create a unified dataset. Filtering was applied to ensure that only the relevant data was retrieved for each specific visualization, i.e. market share, pricing distribution, and product count by category. The data preprocessing step was important because it ensured that what went into the dashboard was accurate and could yield actionable insight on which Carrefour could base future strategic decisions. The cleansed dataset was then used to construct the interactive dashboard.
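
The preparation steps described above (dropping columns, concatenating sources, filtering per visualization) can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the column names (`supermarket`, `prices`, `category`, `own_brand`, `scrape_id`) and the small in-memory frames are assumptions standing in for the per-retailer CSV files.

```python
import pandas as pd

# Hypothetical stand-ins for two per-retailer files; in the notebook these
# would be loaded with pd.read_csv(...) from the cleaned CSV exports.
asda = pd.DataFrame({
    "supermarket": ["ASDA", "ASDA", "ASDA"],
    "prices": [1.20, 3.50, 0.85],
    "category": ["drinks", "bakery", "drinks"],
    "own_brand": [True, False, True],
    "scrape_id": [101, 102, 103],  # assumed to be irrelevant for the analysis
})
tesco = pd.DataFrame({
    "supermarket": ["Tesco", "Tesco"],
    "prices": [1.35, 2.10],
    "category": ["drinks", "bakery"],
    "own_brand": [False, True],
    "scrape_id": [201, 202],
})

# Columns the dashboard actually needs (assumed names).
KEEP = ["supermarket", "prices", "category", "own_brand"]

# Step 1: drop unneeded columns. Step 2: stack the retailers into one frame.
products = pd.concat(
    [frame[KEEP] for frame in (asda, tesco)],
    ignore_index=True,
)

# Step 3: filter down to what one visualization needs, e.g. a single category.
drinks = products[products["category"] == "drinks"]

# Product count per supermarket, as used by a bar chart.
drink_counts = drinks.groupby("supermarket").size()
print(drink_counts)
```

Selecting the needed columns before concatenating keeps the combined frame small, which matters when the real files run to many thousands of rows.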

### Visualization Layout and Design

The layout of the dashboard is constructed with a user-centric approach to provide a smooth, intuitive experience. It was divided into Market Share Analysis, Price Distribution and Product Count, Own Brand Product Analysis, and Performance in Retail Markets (RSI and CPI) divisions. This allows users to navigate the dashboard easily and focus on the insights most pertinent to their decisions.

Short explanatory paragraphs accompany each section and make it easier for users to comprehend the visualizations. Thus, Carrefour's decision-makers do not need to spend extra time on intricate data analysis to understand the key inputs (Few, 2006). Usability is further supported by interactive dropdown menus and sliders that let users filter or customize the information by product category or period. This allows the dashboard to suit different strategic needs and focus points.

The layout has a well-thought-out hierarchy that guides the user's attention to the most critical insights. The charts were given generous space so that they can be read easily, while the overall design was kept clear and simple. This follows the basic essentials of best-practice information design (Tufte, 2001), allowing users to recognize key insights quickly and improving the usability of the dashboard as a functional tool for strategic decision-making.


## Review of Analytics Methods Chosen 

The analytical methods used in constructing the dashboard were designed in line with the strategic goals of Carrefour's expansion plans for the UK. Descriptive analysis forms the basis of the dashboard, providing clear and to-the-point insights from the large volume of data being used. Stacked area charts and pie charts were chosen for analysing market shares because they represent proportions and trends over time flexibly. For instance, stacked area charts show changes in market share over time at a single glance, while pie charts illustrate individual market shares in a specific year (Heer et al., 2010). Such visualizations enable decision-makers to evaluate quickly who Carrefour's main competitors are in the UK market.
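
As a rough sketch of the data shapes behind these two chart types, the snippet below pivots long-format share data into one column per retailer (one band per column in a stacked area chart) and rescales a single year so the pie slices sum to 100%. The figures and the column names `year`, `retailer` and `share` are invented for illustration and are not Kantar's published values.

```python
import pandas as pd

# Illustrative figures only; these are NOT Kantar's published shares.
share_long = pd.DataFrame({
    "year": [2020, 2020, 2020, 2021, 2021, 2021],
    "retailer": ["Tesco", "Sainsbury's", "Aldi"] * 2,
    "share": [27.0, 15.0, 8.0, 27.5, 15.0, 8.5],
})

# Wide layout: one row per year, one column per retailer. Each column
# corresponds to one band of a stacked area chart.
share_wide = share_long.pivot(index="year", columns="retailer", values="share")

# A pie chart shows a single year; rescale so the slices sum to 100%.
year_2021 = share_wide.loc[2021]
pie_slices = (year_2021 / year_2021.sum() * 100).round(1)

print(share_wide)
print(pie_slices)
```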

Comparative analysis was pivotal for analysing differences in pricing strategy and product offerings between supermarkets. The box and whisker plot was employed to analyze product price distributions across the retail chains by illustrating the key price metrics. It portrays a five-number summary (minimum, first quartile (Q1), median, third quartile (Q3), and maximum) in addition to possible outliers in the distribution. Besides the spread of prices, the visualization captures central tendency and price variability (McClave et al., 2017). For Carrefour, the box and whisker plots reflect not only typical prices but also the price limits of most products, while also identifying any pricing anomalies or outliers. Outliers can reflect pricing strategies such as premium pricing or deep discounting, which helps Carrefour draw conclusions about competitor behaviour. By showing the spread and central tendency of prices for each product category, the price distribution plots also support the analysis of competitor pricing for an efficient market entry (McClave et al., 2017).
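
To make the five-number summary concrete, here is a small sketch that computes it with pandas and flags outliers using the common 1.5 × IQR convention. The prices are made up, and the exact quartile values a plotting library draws can differ slightly depending on its interpolation method, so treat this as an approximation of what a box plot displays rather than a reproduction of it.

```python
import pandas as pd

# Made-up prices (GBP) for one category at one supermarket.
prices = pd.Series([0.5, 0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.5, 9.0])

# Quartiles using pandas' default linear interpolation.
q1, median, q3 = prices.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Common convention: points more than 1.5 * IQR beyond the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = prices[(prices < lower_fence) | (prices > upper_fence)]

summary = {
    "min": prices.min(),
    "Q1": q1,
    "median": median,
    "Q3": q3,
    "max": prices.max(),
}
print(summary)
print(outliers.tolist())
```

In this toy series the single 9.0 price sits far above the upper fence, which is the kind of point that might indicate a premium product line.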

General consumer behaviour can be analysed using the time series of the Retail Sales Index (RSI) and the Consumer Price Index (CPI). Line graphs are used to understand how UK consumers react to seasonal changes and to the major socio-economic events of recent years (Cleveland, 1994). The split by category helps Carrefour focus on the categories that could yield the most benefit. Furthermore, the interactive components of the dashboard, such as the dropdowns and sliders, allow stakeholders to explore these trends dynamically, for example by isolating a particular category or a time range.
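
Isolating one category within a year range, as described above, amounts to a simple filter on long-format index data. The sketch below assumes hypothetical column names (`year`, `category`, `index_value`), invented category labels and toy values that are not ONS figures; it also adds a year-on-year percentage change as one possible way to compare movements.

```python
import pandas as pd

# Toy index values invented for illustration; not ONS figures.
rsi = pd.DataFrame({
    "year": [2019, 2020, 2021, 2022] * 2,
    "category": ["Food stores"] * 4 + ["Clothing"] * 4,
    "index_value": [100.0, 102.0, 101.0, 99.0, 100.0, 80.0, 90.0, 95.0],
})


def select_series(data: pd.DataFrame, category: str, start: int, end: int) -> pd.DataFrame:
    """Return the rows for one category with start <= year <= end."""
    mask = (data["category"] == category) & data["year"].between(start, end)
    return data.loc[mask].sort_values("year")


clothing = select_series(rsi, "Clothing", 2020, 2021)

# Year-on-year percentage change; the first selected year has no predecessor.
clothing = clothing.assign(yoy_pct=clothing["index_value"].pct_change() * 100)
print(clothing)
```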


## Review of Available Tools

Jupyter Notebook was used as the primary IDE to run and test the visualisation code. It was essential because it divides code into separate cells, which localises each section and makes debugging easier (Jupyter Development Team, n.d.). The Pandas and NumPy libraries for Python were used to load the .csv files retrieved from the aforementioned sources and to clean and manipulate the data (The Pandas Development Team, n.d.; The NumPy Developers, n.d.). The product pricing dataset for the five UK retailers from Kaggle was a large file of around 100,000 rows after concatenation; therefore, some irrelevant columns were removed to save memory.

The data was initially assessed visually by creating much simpler individual plots with the Matplotlib library. This library was also used to test how each plot handled the data. It was chosen for its ease of use and because of the author's prior familiarity with it (The Matplotlib Development Team, n.d.).


The dashboard was created with the help of the Dash library. Dash provides dynamic, interactive input elements such as dropdowns for categories and range sliders for selecting year ranges. With the help of callback functions, these input values are passed into the output visualisations. The html module was essential for designing the aesthetics of the user interface. It helped bring in design elements such as the logo and, together with card components, place each visualization in a fixed position (Plotly, n.d.). Finally, Python's plotly.express module was used to build the visualizations: bar graphs, pie charts, line graphs and box and whisker plots. plotly.express allowed easy integration with the other Dash elements and offered further features such as hover text boxes and colour selection. The decision to use Dash was further reinforced by its extensive documentation and active community support, which accelerated development and troubleshooting (Plotly, n.d.).


## Review of Chosen Datasets 

Market share data is sourced from Kantar Worldpanel (2024) and covers the shares of the largest UK retail players, such as Tesco, Sainsbury's, and Aldi, over the years 2020 to 2024. This dataset is well suited to stacked area graphs and pie charts, which show the changes in and proportions of market share over time. The data depicts competitive dynamics and retailer performance, responding to the project objective of assessing Carrefour's rivals in the UK market. By examining market share trends, Carrefour can determine which key players dominate the market and strategise its own market entry accordingly.

The product description dataset was created by McAlinden, D. (2024) and published on Kaggle. The dataset includes prices, product names, and information on own-brand products for UK supermarkets, including Tesco, Sainsbury's, Aldi, ASDA, and Morrisons. It supports the analysis of price distribution by category using box and whisker plots. Product count data allows the creation of a bar chart that compares the number of unique products each supermarket offers within a particular category. Through this analysis, Carrefour learns the breadth of products that competitors offer, enabling it to identify gaps or unique value propositions for its market entry strategy.


The Retail Sales Index (RSI) dataset from the Office for National Statistics (2024a) presents both monthly and annual retail sales index values reflecting the performance of the retail market in the UK. From this dataset, line graphs split by category are built to monitor sales trends over time and help Carrefour understand consumer purchasing behaviour. Visualizing changes in retail sales gives decision-makers insight into patterns created by seasonality, economic conditions, and competitive pressures. RSI data therefore serves Carrefour's aim of understanding overall market dynamics and basing its strategic decisions on broader economic trends.


The Consumer Price Index (CPI) dataset, also from the Office for National Statistics (2024b), gives data on inflation for the different product categories considered. A line graph comparing CPI trends across these product categories is built from this dataset. These trends give Carrefour an idea of how inflation affects the purchasing power of consumers and, hence, product pricing patterns. With CPI and RSI analysed together as key economic indicators, Carrefour can study the wider UK retail market environment, which sharpens its strategic decisions on pricing and product positioning.


## Reflective Evaluation 

The journey of creating the dashboard has been challenging yet rewarding, delivering valuable lessons and scope for growth. It was pivotal to understand and articulate the objectives that follow from the aim of the project. It became clear that competitive analysis of the target market should be the focal point from which the objectives are drawn.

A major hurdle in the beginning was the incorporation of a choropleth map showing the number of supermarkets in locations across the UK. Although it would have been a very good addition, this plot had to be discarded because the GeoJSON file holding the mapping details of the UK degraded the performance of the dashboard drastically. This decision was thus a trade-off between functionality and user experience in the project design. Processing the data volumes involved in the analysis was another challenge. Working with extensive data on market shares, product prices, and retail trends often resulted in code taking too long to execute. Several optimisation attempts were made, and several columns were dropped to keep the code workable.

Despite these challenges, the dashboard was completed, focusing on the most relevant details drawn from the datasets. As mentioned earlier, much attention was given to applying principles drawn from Tufte and Few to the aesthetics of the dashboard. The project has taught several lessons. First and foremost, one needs to be able to adapt under constraints; sometimes that means choosing efficiency over extensiveness. Besides that, it is important to test subsections individually in order to discover bottlenecks early. Finally, it pays to incorporate the art of storytelling, i.e., putting the most important and engaging information at the beginning and letting the story unfold from there. Overall, it has been a worthwhile learning process: it exposed the complexities that come with data-driven decision-making and highlighted important design considerations in creating impactful analytics solutions.


## Conclusion

To summarise, the project has successfully developed an interactive dashboard that gives Carrefour comprehensive insights into the UK retail market, with a focus on expansion. Built with the Python Dash framework, the dashboard presents metrics such as market share, price spread, product assortment, and economic indicators such as the Retail Sales Index (RSI) and Consumer Price Index (CPI) in an intuitive way. Carrefour's decision-makers can therefore interpret the competition and pricing dynamics in the UK market and make informed decisions about entering it.

The design of the dashboard is kept very user-centric. The textual content not only explains what each visualization contains but also describes how to use it and what insights can be derived from it. Dynamic and interactive features such as dropdowns and range sliders allow users to isolate categories, years, or a range of years and combine them for better analysis. Visualizations matched to the data type, such as box and whisker plots, help users grasp large volumes of data visually. Ultimately, this project provides Carrefour with the tools to make data-driven, strategic decisions in its expansion efforts while also demonstrating the value of interactive analytics in driving business success.


# References

1. Adams, R. (2022). Data visualization for strategic decision-making in retail markets. *Journal of Business Analytics*, 45(3), 65-78.
2. Cleveland, W. S. (1994). *The elements of graphing data*. Hobart Press.
3. Few, S. (2006). *Information dashboard design: The effective visual communication of data*. O’Reilly Media.
4. Heer, J., Bostock, M., & Ogievetsky, V. (2010). A tour through the visualization zoo. *Communications of the ACM*, 53(6), 59-67.
5. Johnson, M., & Lee, S. (2020). Impact of economic indicators on consumer behavior in the UK retail market. *International Journal of Retail Economics*, 12(2), 100-115.
6. Kantar. (2022). UK Retail Market Share Report. Retrieved from [https://www.kantar.com](https://www.kantar.com)
7. Kaggle. (2022). Supermarket Dataset. Retrieved from [https://www.kaggle.com](https://www.kaggle.com)
8. Kantar Worldpanel. (2024). Grocery market share: Great Britain snapshot 28.01.24. Retrieved December 20, 2024, from [https://www.kantarworldpanel.com/grocery-market-share/great-britain/snapshot/28.01.24/](https://www.kantarworldpanel.com/grocery-market-share/great-britain/snapshot/28.01.24/)
9. McAlinden, D. (2024). Time-series UK supermarket data. Retrieved December 20, 2024, from [https://www.kaggle.com/datasets/declanmcalinden/time-series-uk-supermarket-data](https://www.kaggle.com/datasets/declanmcalinden/time-series-uk-supermarket-data)
10. McClave, J. T., Benson, P. G., & Sincich, T. (2017). *Statistics for business and economics* (12th ed.). Pearson.
11. Office for National Statistics. (2023). Consumer Price Index and Retail Sales Index Data. Retrieved from [https://www.ons.gov.uk](https://www.ons.gov.uk)
12. Office for National Statistics. (2024a). Retail sales index reference tables. Retrieved December 20, 2024, from [https://www.ons.gov.uk/businessindustryandtrade/retailindustry/datasets/retailsalesindexreferencetables](https://www.ons.gov.uk/businessindustryandtrade/retailindustry/datasets/retailsalesindexreferencetables)
13. Office for National Statistics. (2024b). Consumer price indices. Retrieved December 20, 2024, from [https://www.ons.gov.uk/economy/inflationandpriceindices/datasets/consumerpriceindices](https://www.ons.gov.uk/economy/inflationandpriceindices/datasets/consumerpriceindices)
14. Plotly. (n.d.). Plotly Express. Retrieved December 20, 2024, from [https://plotly.com/python/plotly-express/](https://plotly.com/python/plotly-express/)
15. Plotly. (n.d.). Dash user guide and documentation. Retrieved December 20, 2024, from [https://dash.plotly.com/](https://dash.plotly.com/)
16. Shneiderman, B. (2003). Leonard: Visualizing and discovering the world of data. *IEEE Computer Graphics and Applications*, 23(3), 14-17.
17. Smith, T. (2021). Competitive analysis in the UK retail sector. *Retail Market Research Journal*, 33(4), 142-158.
18. Tufte, E. R. (2001). *The visual display of quantitative information*. Graphics Press.
19. The Matplotlib Development Team. (n.d.). Matplotlib documentation. Retrieved December 20, 2024, from [https://matplotlib.org/stable/index.html](https://matplotlib.org/stable/index.html)
20. The Pandas Development Team. (n.d.). Pandas documentation. Retrieved December 20, 2024, from [https://pandas.pydata.org/docs/](https://pandas.pydata.org/docs/)
21. The NumPy Developers. (n.d.). NumPy documentation. Retrieved December 20, 2024, from [https://numpy.org/doc/](https://numpy.org/doc/)
22. Jupyter Development Team. (n.d.). Project Jupyter documentation. Retrieved December 20, 2024, from [https://docs.jupyter.org/en/latest/](https://docs.jupyter.org/en/latest/)