# Sports Retail Performance Dashboard in Q3 from 2020 to 2023

## Introduction
The UK sportswear industry is highly active and a significant part of the retail sector. According to 2022 research by GlobalData, the British sportswear market was worth approximately £13.8 billion in 2020 and was expected to grow by more than 4% per annum until 2025. In this highly competitive business, numerous retailers are striving to enhance profitability and maintain consumer loyalty. Common strategies include improving marketing efforts, raising the quality of customer service, and strengthening the sportswear brand, among other initiatives. Based on data obtained from one sportswear retailer, I have prepared a performance analysis report for the company's senior managers, focusing on Q3 of 2020 to 2023. The analysis supports the company's decision-making by helping it select appropriate items and place them at the right time. A dashboard was developed to provide key information and direction for Q3 of 2024. This project uses visualization techniques to uncover patterns, correlations, and actionable insights that are crucial for implementing strategies in the sportswear retail market.

## Table of Links
### Table
| Description | Link |
| -- | -- |
| Reflective blog | https://ele.exeter.ac.uk/mod/oublog/view.php?id=2698275 |
| Chosen Dataset | https://www.kaggle.com/datasets/insightfuldataset/insightful-dataset-for-retail-market |

## Table of Contents
1. Executive Summary
2. Project Dashboard
3. Background to the Project
4. Articulation of Decision Making Process
5. Review of Analytics Methods Chosen
6. Review of Available Tools
7. Review of Chosen Datasets 
8. Visualisation of Data with Accompanying Code
9. Reflective Evaluation
10. Conclusion


## 1. Executive Summary

This project culminates in a dashboard that gives comprehensive insight into the Q3 performance of a sports clothing retail business from 2020 to 2023. Three CSV files (orders, products, and retailers) were cleaned and merged with the Pandas library, and the data was refined iteratively to make it usable. The resulting coherent dataset served as the basis for the dashboard.

In the decision-making process, several theories and methodologies guided the selection of charts. The objective was to produce refined charts that not only convey insightful data but also have aesthetic appeal. Drawing on Edward Tufte's principles, the emphasis was on maximizing data clarity while minimizing clutter, so that each chart serves as a window into the underlying insights. Additionally, Tamara Munzner's theories of visualization design helped identify the audience's needs and tasks, enabling charts tailored to specific user requirements. These approaches ensured that the dashboard does not just display data; it communicates meaningful narratives, making complex information more approachable and actionable.

Dash served as the primary design tool, laying out eight strategically positioned charts of six types. The dashboard was designed to combine clarity, simplicity, and aesthetic appeal. Relevant filters were included to support in-depth exploration by users and to increase engagement with the data.

## 2. Project Dashboard

In [None]:
# Import libraries
import pandas as pd
from dash import html,dcc,Dash
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output
import plotly.express as px

# Load the data
orders = 'https://raw.githubusercontent.com/khanhle08/final-assignment-for-BEMM461/main/Orders_2020_to_2023.csv'
products = 'https://raw.githubusercontent.com/khanhle08/final-assignment-for-BEMM461/main/Products.csv'
retailers = 'https://raw.githubusercontent.com/khanhle08/final-assignment-for-BEMM461/main/Retailers.csv'

# Read the CSV files into DataFrames and convert the currency columns to floats
orders_pd = pd.read_csv(orders,delimiter=';')
orders_column_with_pound = ['Product_Price','Product_Cost','Sales_Amount','Cost_of_Goods_Sold','Profit']
for column in orders_column_with_pound:
    orders_pd[column] = orders_pd[column].replace('£', '',regex=True)
    orders_pd[column] = orders_pd[column].replace(',', '.',regex=True).astype('float')
products_pd = pd.read_csv(products,delimiter=';')
retailers_pd = pd.read_csv(retailers,delimiter=';')

# Derive 'Year' and 'Quarter' from 'Order Date', then drop the redundant period columns
orders_pd['Order Date'] = pd.to_datetime(orders_pd['Order Date'], format='%d/%m/%Y')
orders_pd['Year'] = orders_pd['Order Date'].dt.year
orders_pd['Quarter'] = orders_pd['Order Date'].dt.quarter
orders_pd.drop('Order_YearQuarter',axis = 1,inplace=True)
orders_pd.drop('Order_YearMonth',axis = 1,inplace=True)

# Filter data for Q3 from 2020 to 2023
Q3_orders_pd = orders_pd[(orders_pd['Quarter'] == 3)]

# Calculate total order quantity by year in Q3
total_order_quantity_by_year = Q3_orders_pd.groupby('Year')['Order_Quantity'].sum().reset_index()

# Filter Q3 data for the years 2023 and 2022
Q3_2023_orders_pd = Q3_orders_pd[(Q3_orders_pd['Quarter'] == 3) & (Q3_orders_pd['Year'] == 2023)]
Q3_2022_orders_pd = Q3_orders_pd[(Q3_orders_pd['Quarter'] == 3) & (Q3_orders_pd['Year'] == 2022)]

# Merge DataFrames
merged_orders_retailers = pd.merge(Q3_orders_pd, retailers_pd, on='Retailer_ID')
merged_orders_products = pd.merge(Q3_orders_pd,products_pd,on='Product_SKU')
final = pd.merge(merged_orders_retailers, merged_orders_products,how="outer")
total_order_quantity_from_final = final.groupby(['Product_Gender','Year','Product_Size','Product_Category','City','Retailer_Channel','Country','Product_Name'])['Order_Quantity'].sum().reset_index()

# Calculate total order quantity, profit, sales amount in 2023
total_order_quantity_2023 = Q3_2023_orders_pd['Order_Quantity'].sum()
total_profit_2023 = Q3_2023_orders_pd['Profit'].sum()
total_sales_amount_2023 = Q3_2023_orders_pd['Sales_Amount'].sum()

# Calculate total order quantity, profit, sales amount in 2022
total_order_quantity_2022 = Q3_2022_orders_pd['Order_Quantity'].sum()
total_profit_2022 = Q3_2022_orders_pd['Profit'].sum()
total_sales_amount_2022 = Q3_2022_orders_pd['Sales_Amount'].sum()

# Calculate growth
order_quantity_growth_percentage_2023_2022 = ((total_order_quantity_2023 - total_order_quantity_2022) / total_order_quantity_2022) * 100
profit_growth_percentage_2023_2022 = ((total_profit_2023 - total_profit_2022) / total_profit_2022) * 100
sales_amount_growth_percentage_2023_2022 = ((total_sales_amount_2023 - total_sales_amount_2022) / total_sales_amount_2022) * 100

# Initialize the Dash app
app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

app.layout = html.Div([
    dbc.Row([
        dbc.Col(
            dbc.Card(
                dbc.CardBody([
                    html.H4('Total Order Quantity in Q3 2023', className='card-title text-center', style={'color': '#003c3b'}),
                    html.H2(f'{total_order_quantity_2023:,}', className='card-text text-center', style={'color': '#005654'}),
                    html.Link(href="https://cdn.jsdelivr.net/npm/bootstrap-icons/font/bootstrap-icons.css", rel='stylesheet'),
                    html.P([  
                        html.I(className='bi bi-arrow-up', style={'color': '#1a936f', 'font-size': '15px'}),  
                        html.Span(f'{order_quantity_growth_percentage_2023_2022:.2f}%', className='growth-percentage', style={'color': '#1a936f', 'font-size': '15px'})  
                    ], className='growth-info text-center'),
                    html.P('vs Q3 2022', className='growth-info text-center', style={'color': '#808080', 'font-size': '14px'})
                ]),
                className='custom-card',
            ),
            width=3, className='custom-col'
        ),
        dbc.Col(
            dbc.Card(
                dbc.CardBody([
                    html.H4('Total Profit in Q3 2023', className='card-title text-center', style={'color': '#003c3b'}),
                    html.H2(f'£{total_profit_2023:,.0f}', className='card-text text-center', style={'color': '#005654'}),
                    html.Link(href="https://cdn.jsdelivr.net/npm/bootstrap-icons/font/bootstrap-icons.css", rel='stylesheet'),
                    html.P([  
                        html.I(className='bi bi-arrow-up', style={'color': '#1a936f'}),
                        html.Span(f'{profit_growth_percentage_2023_2022:.2f}%', className='growth-percentage', style={'color': '#1a936f', 'font-size': '15px'}) 
                    ], className='growth-info text-center'),
                    html.P('vs Q3 2022', className='growth-info text-center', style={'color': '#808080', 'font-size': '14px'})
                ]),
                className='custom-card',
            ),
            width=3, className='custom-col'
        ),
        dbc.Col(
            dbc.Card(
                dbc.CardBody([
                    html.H4('Total Sales Amount in Q3 2023', className='card-title text-center', style={'color': '#003c3b'}),
                    html.H2(f'£{total_sales_amount_2023:,.0f}', className='card-text text-center', style={'color': '#005654'}),
                    html.Link(href="https://cdn.jsdelivr.net/npm/bootstrap-icons/font/bootstrap-icons.css", rel='stylesheet'),
                    html.P([  
                        html.I(className='bi bi-arrow-up', style={'color': '#1a936f'}),  
                        html.Span(f'{sales_amount_growth_percentage_2023_2022:.2f}%', className='growth-percentage', style={'color': '#1a936f', 'font-size': '15px'})  
                    ], className='growth-info text-center'),
                    html.P('vs Q3 2022', className='growth-info text-center', style={'color': '#808080', 'font-size': '14px'})
                ]),
                className='custom-card',
            ),
            width=3, className='custom-col'
        ),
        dbc.Col(width=1, className='custom-col'),
        dbc.Col(
            [
                html.Label('Select Year :', style={'font-weight': 'bold','color': '#003c3b'}),
                dcc.RadioItems(
                    id='year-dropdown',
                    options=[{'label': 'All', 'value': 'All'}] + [{'label': year, 'value': year} for year in total_order_quantity_from_final['Year'].unique()],
                    value='All',
                ),
            ],
            style={'display': 'block'},
            width=1, className='custom-col'
        ),
        dbc.Col(
            [
                html.Label('Select Gender :', style={'font-weight': 'bold','color': '#003c3b'}),
                dcc.RadioItems(
                    id='gender-dropdown',
                    options=[{'label': 'All', 'value': 'All'}] + [{'label': gender, 'value': gender} for gender in total_order_quantity_from_final['Product_Gender'].unique()],
                    value='All',
                ),
            ],
            style={'display': 'block'},
            width=1, className='custom-col'
        ),
    ]),
    dbc.Row([
        dbc.Col(
            dcc.Graph(id='quantity-line-chart'),
            width=7, className='custom-col'
        ),
        dbc.Col(
            dcc.Graph(id='city-bar-chart'),
            width=5, className='custom-col'
        )
    ]),
    dbc.Row([
        dbc.Col(
            dcc.Graph(id='product-name-bar-chart'),
            width=5, className='custom-col'
        ),
        dbc.Col(
            dcc.Graph(id='retailer-pie-chart'),
            width=3, className='custom-col'
        ),
        dbc.Col(
            dcc.Graph(id='product-size-pie-chart'),
            width=4, className='custom-col'
        )
    ]),
])

# Total Order Quantity in Q3 2020 to 2023
@app.callback(
    Output('quantity-line-chart', 'figure'),
    [Input('year-dropdown', 'value'),
     Input('gender-dropdown', 'value')]
)
def update_quantity_line_chart(selected_year, selected_gender):
    filtered_data = total_order_quantity_from_final
    if selected_year and selected_year != 'All':
        filtered_data = filtered_data[filtered_data['Year']== selected_year]
    if selected_gender and selected_gender != 'All':
        filtered_data = filtered_data[filtered_data['Product_Gender']== selected_gender]    
    order_quantity_by_year_category = filtered_data.groupby(['Year','Product_Category'])['Order_Quantity'].sum().reset_index()
    categories = order_quantity_by_year_category['Product_Category'].unique()
    custom_colors = px.colors.qualitative.Plotly[:len(categories)]

    fig = px.line(order_quantity_by_year_category, x='Year', y='Order_Quantity',color = 'Product_Category', title='Quantity by Year', color_discrete_sequence=custom_colors)
    fig.update_layout(
        title={'text': 'Total Order Quantity by Product Category in Q3 2020 to 2023', 'font': {'color': '#003c3b'}},  
        xaxis=None,  
        yaxis=None
    )
    fig.update_traces(line=dict(color='#005654'),hovertemplate='<b>Year</b> : %{x}<br><b>Order Quantity</b> : %{y}<extra></extra>')
    fig.update_layout(xaxis={'type': 'category'})
    for i, data in enumerate(fig.data):
        data.update(hoverinfo='name',line=dict(color=custom_colors[i]))

    if selected_year and selected_year != 'All':  
        for i, category in enumerate(categories):
            scatter_data = order_quantity_by_year_category[(order_quantity_by_year_category['Year'] == selected_year) &
                                                           (order_quantity_by_year_category['Product_Category'] == category)]
            fig.add_scatter(x=scatter_data['Year'], y=scatter_data['Order_Quantity'], mode='markers',
                            marker=dict(size=10, color=custom_colors[i]), name=category,
                            showlegend=False, hovertemplate='Order Quantity: %{y}')

    fig.update_layout(yaxis=dict(range=['', order_quantity_by_year_category['Order_Quantity'].max()+2000]))
    return fig

# Top 5 Cities by Order Quantity in Q3 2020 to 2023
@app.callback(
    Output('city-bar-chart', 'figure'),
    [Input('year-dropdown', 'value'),
     Input('gender-dropdown', 'value')]
)
def update_city_bar_chart(selected_year, selected_gender):
    filtered_data = total_order_quantity_from_final
    if selected_year and selected_year != 'All':
        filtered_data = filtered_data[filtered_data['Year']== selected_year]
    if selected_gender and selected_gender != 'All':
        filtered_data = filtered_data[filtered_data['Product_Gender']== selected_gender]
    
    order_quantity_by_city = filtered_data.groupby('City')['Order_Quantity'].sum().reset_index()
    top_5_cities = order_quantity_by_city.nlargest(5, 'Order_Quantity')    
    top_5_cities_sorted = top_5_cities.iloc[::-1]
    fig_city = px.bar(top_5_cities_sorted, x='Order_Quantity', y='City', orientation='h', title='Order Quantity by City')
    fig_city.update_layout(
        title={'text': 'Top 5 Cities by Order Quantity in Q3 2020 to 2023', 'font': {'color': '#003c3b'}},  
        xaxis=None,  
        yaxis=None
    )
    fig_city.update_traces(marker_color='#005654',hovertemplate='<b>City</b> : %{y}<br><b>Order Quantity</b> : %{x}<extra></extra>')
    return fig_city

# Top 5 Products by Order Quantity in Q3 2020 to 2023
@app.callback(
    Output('product-name-bar-chart', 'figure'),
    [Input('year-dropdown', 'value'),
     Input('gender-dropdown', 'value')]
)
def update_product_bar_chart(selected_year, selected_gender):
    filtered_data = total_order_quantity_from_final
    if selected_year and selected_year != 'All':
        filtered_data = filtered_data[filtered_data['Year']== selected_year]
    if selected_gender and selected_gender != 'All':
        filtered_data = filtered_data[filtered_data['Product_Gender']== selected_gender]
   
    order_quantity_by_product_name = filtered_data.groupby('Product_Name')['Order_Quantity'].sum().reset_index()
    top_5_product_name = order_quantity_by_product_name.nlargest(5, 'Order_Quantity')

    top_5_product_name_sorted = top_5_product_name.iloc[::-1].copy()
    max_name_length = 20
    top_5_product_name_sorted['Product Name'] = top_5_product_name_sorted['Product_Name'].apply(
        lambda x: x[:max_name_length] + ('...' if len(x) > max_name_length else ''))

    fig_product_name = px.bar(top_5_product_name_sorted, x='Order_Quantity', y='Product Name', orientation='h',
                              title='Top 5 Products by Order Quantity in Q3 2020 to 2023')

    fig_product_name.update_traces(
        marker_color='#005654',hovertemplate='<b>Product Name</b> : %{y}<br><b>Order Quantity</b> : %{x}<extra></extra>')

    fig_product_name.update_layout(
        title={'text': 'Top 5 Products by Order Quantity in Q3 2020 to 2023', 'font': {'color': '#003c3b'}},
        xaxis=None,
        yaxis=None
    )

    return fig_product_name

# Order Quantity by Retail Channel in Q3 2020 to 2023
@app.callback(
    Output('retailer-pie-chart', 'figure'),
    [Input('year-dropdown', 'value'),
     Input('gender-dropdown', 'value')]
)
def update_retailer_pie_chart(selected_year, selected_gender):
    filtered_data = total_order_quantity_from_final
    if selected_year and selected_year != 'All':
        filtered_data = filtered_data[filtered_data['Year']== selected_year]
    if selected_gender and selected_gender != 'All':
        filtered_data = filtered_data[filtered_data['Product_Gender']== selected_gender]
    
    order_quantity_by_retailer = filtered_data.groupby('Retailer_Channel')['Order_Quantity'].sum().reset_index()
    order_quantity_by_retailer = order_quantity_by_retailer.sort_values('Order_Quantity', ascending = False)
    fig_retailer = px.pie(order_quantity_by_retailer, values='Order_Quantity', names='Retailer_Channel', title='Total Order Quantity by Retailer Channel')
    fig_retailer.update_layout(title={'text': 'Order Quantity by Retail Channel in Q3 2020 to 2023', 'font': {'color': '#003c3b'}})

    custom_colors = ['#114b5f', '#1a936f', '#88d498', '#c6dabf','#f3e9d2'] 
    fig_retailer.update_traces(marker=dict(colors=custom_colors, line=dict(color='#FFFFFF', width=1)), showlegend=True)
    fig_retailer.update_traces(textposition='inside', textinfo='percent', hoverinfo='label+percent', 
                           hovertemplate='<b>%{label}</b><br>%{value} units',
                           textfont=dict(color='white'))
    return fig_retailer

# Order Quantity by Product Size in Q3 2020 to 2023
@app.callback(
    Output('product-size-pie-chart', 'figure'),
    [Input('year-dropdown', 'value'),
     Input('gender-dropdown', 'value')]
)
def update_size_bar_chart(selected_year, selected_gender):
    filtered_data = total_order_quantity_from_final
    if selected_year and selected_year != 'All':
        filtered_data = filtered_data[filtered_data['Year']== selected_year]
    if selected_gender and selected_gender != 'All':
        filtered_data = filtered_data[filtered_data['Product_Gender']== selected_gender]
    
    order_quantity_by_size = filtered_data.groupby('Product_Size', observed=True)['Order_Quantity'].sum().reset_index()
    order_quantity_by_size['Product_Size'] = pd.Categorical(order_quantity_by_size['Product_Size'], categories=['XS', 'S', 'M', 'L', 'XL'], ordered=True)
    order_quantity_by_size = order_quantity_by_size.sort_values('Product_Size')

    fig_size = px.bar(order_quantity_by_size, x='Product_Size', y='Order_Quantity', 
                      title='Order Quantity in Q3 by Product Size', 
                      labels={'Order_Quantity': 'Order Quantity', 'Product_Size': 'Product Size'})
    fig_size.update_layout(title={'text': 'Order Quantity by Product Size in Q3 2020 to 2023', 'font': {'color': '#003c3b'}})

    custom_colors = ['#114b5f', '#1a936f', '#88d498', '#c6dabf','#f5e1a4'] 
    fig_size.update_traces(marker=dict(color=custom_colors, line=dict(color='#FFFFFF', width=1)), 
                           showlegend=False,hovertemplate='<b>Product Size</b> : %{x}<br><b>Order Quantity</b> : %{y}<extra></extra>')
    fig_size.update_layout(xaxis=None, yaxis=None)
    fig_size.update_xaxes(categoryorder='array', categoryarray=['XS', 'S', 'M', 'L', 'XL'])
    return fig_size

# Run the app
if __name__ == '__main__':
    app.run(debug=True, port=8051)
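
The five callbacks above each repeat the same year and gender filter before aggregating. One way to express that step once is a small helper; the sketch below is an illustration only, where `apply_filters` and the sample frame are hypothetical names and values rather than part of the original notebook.

```python
import pandas as pd


def apply_filters(df, selected_year="All", selected_gender="All"):
    """Return the rows of df matching the selected year and gender.

    'All' (or None) leaves the corresponding column unfiltered.
    """
    filtered = df
    if selected_year not in (None, "All"):
        filtered = filtered[filtered["Year"] == selected_year]
    if selected_gender not in (None, "All"):
        filtered = filtered[filtered["Product_Gender"] == selected_gender]
    return filtered


# Tiny illustrative frame (made-up values, not the project data)
sample = pd.DataFrame({
    "Year": [2022, 2022, 2023, 2023],
    "Product_Gender": ["Men", "Women", "Men", "Women"],
    "Order_Quantity": [10, 20, 30, 40],
})

men_2023 = apply_filters(sample, selected_year=2023, selected_gender="Men")
print(men_2023["Order_Quantity"].sum())
```

Each callback could then start from `apply_filters(total_order_quantity_from_final, selected_year, selected_gender)` and keep only its own grouping and plotting logic.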

## 3. Background to the Project

The decision to undertake a performance analysis stemmed from recognizing the evolving dynamics and highly competitive landscape of the UK's sportswear market. This initiative was driven by the strategic goal of strengthening the company's competitive position and fostering sustained growth. The primary aim of this project is to examine the company's performance in the Q3 periods from 2020 to 2023. The analysis covers several aspects of the business's operations, with the aim of identifying successful strategies and areas that need improvement.

## 4. Articulation of Decision Making Process

To initiate this process, I employed Tamara Munzner's four-level nested model of visualization design and validation. This methodology helped identify the target audience, their specific tasks, the visualization techniques, and the algorithms used within the project. The primary objective of the project is to identify prevalent trends and extract actionable insights for a sportswear retailer in the UK. These insights are intended to support strategic planning for the upcoming Q3 of 2024 and are aimed specifically at senior managers. The analysis spans a four-year period (2020 to 2023) and focuses on several segments: Product Category, City, Product Name, Retailer Channel, and Product Size. Three datasets covering orders, retailers, and products were collected, cleaned, and transformed to serve as the foundation of the analysis. Further details on the datasets are available in the blog post dated 2nd December (https://ele.exeter.ac.uk/mod/oublog/viewpost.php?post=34468). To construct a comprehensive dashboard, a total of eight charts were created.

The concept of this dashboard is rooted in Wiley's (2010) performance dashboard theory, which holds that a performance dashboard communicates strategic objectives and enables business people to measure, monitor, and manage the key activities and processes needed to achieve their goals. I was also inspired by the design guidelines from Bach (2023): a dashboard should not overwhelm users or show too much data; should avoid visual clutter and poor visual design; should use carefully chosen KPIs; should align with existing workflows; should have both functional and visual features; should provide consistency and interaction affordances and manage complexity; and should organize charts symmetrically, group charts by attribute, clearly separate these groups, and order charts according to time.

The top row of the dashboard features three cards displaying the total quantities, profits, and sales amounts for Q3 2023. These cards serve as fundamental summaries of the company's current performance, offering insights into the achievement of set KPIs. Including the percentage growth compared to the previous year within the cards allows for a comprehensive overview of progress. Next to the cards are filters for Year and Gender for user interaction, enabling a deeper exploration of each year and gender for enhanced insights.
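
The growth figure on each card is the percentage change from Q3 2022 to Q3 2023. As a minimal sketch of that arithmetic, the helper below also guards against a zero baseline and picks the arrow icon from the sign of the result, since the cards in the dashboard code always show an upward green arrow. The function names, the sample totals, and the red and grey fallback colours are assumptions made for illustration.

```python
def growth_percentage(current, previous):
    """Percentage change from previous to current; None if previous is 0."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


def growth_style(growth):
    """Pick a Bootstrap icon class and colour from the sign of the growth."""
    if growth is None:
        return "bi bi-dash", "#808080"       # no baseline to compare against
    if growth >= 0:
        return "bi bi-arrow-up", "#1a936f"   # green used on the dashboard cards
    return "bi bi-arrow-down", "#c0392b"     # hypothetical red for a decline


# Made-up totals for illustration only
growth = growth_percentage(current=1200, previous=1000)
icon, colour = growth_style(growth)
print(f"{growth:.2f}% {icon} {colour}")
```

The returned class and colour could be passed to the `html.I` and `html.Span` elements in place of the hard-coded values.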

The second row contains two charts: a line chart of total order quantity by product category in Q3 of each year, and a horizontal bar chart of the top 5 cities by quantity sold during Q3. A line chart was chosen because it shows how quantitative values have changed over time for different categorical items (Kirk, 2019). The line chart makes it evident that the Hoodies & Sweatshirts category has consistently had the highest demand over the past three years. In 2023, its order quantity was roughly four times that of the Bras & Tops, Tank Tops, Shorts, and Pants categories. Conversely, Shorts, a leading category in 2020, declined significantly in subsequent years. This suggests a need to focus on promoting Hoodies & Sweatshirts to increase profitability and meet customer demand. A horizontal bar chart was chosen for Top 5 Cities by Order Quantity in Q3 from 2020 to 2023 because it makes it easy to see quickly which category is the biggest, which is the smallest, and the incremental difference between categories (Wiley, 2015). The analysis shows Abbey Ward to be the leading city for clothing purchases across the four years. Purchases in Abbey Ward reach nearly 12,000 units, about six times the quantity in Loundsley Green Ward and Parwich, even though both also rank among the top five cities. Notably, Abbey Ward holds the top position in every one of the four years, with a slight increase each year. This trend is a compelling insight for managers and suggests a strategic opportunity to allocate additional resources and investment to this city.
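
Comparisons such as "six times the quantity" can be read directly from the ranked totals. The sketch below shows one way to compute the top five and each city's multiple relative to the leader; all quantities and the placeholder city names ("City B" to "City F") are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Made-up quantities for illustration only
city_totals = pd.DataFrame({
    "City": ["Abbey Ward", "City B", "City C", "City D", "City E", "City F"],
    "Order_Quantity": [12000, 6000, 4000, 3000, 2000, 500],
})

# Keep the five largest cities, ordered from largest to smallest
top_5 = city_totals.nlargest(5, "Order_Quantity").reset_index(drop=True)

# How many times larger the leader is than each city in the top five
leader_quantity = top_5.loc[0, "Order_Quantity"]
top_5["Leader_Multiple"] = leader_quantity / top_5["Order_Quantity"]
print(top_5)
```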

The last row of the dashboard contains three charts: a horizontal bar chart of the top five products by total order quantity in Q3 from 2020 to 2023, a pie chart of total order quantity by retailer channel over the same period, and a column chart of total order quantity by product size. The product chart highlights the best-selling items across the entire timeframe, revealing unexpected popularity for certain items within the Tank Tops, Tees, Jackets, Shorts, and Hoodies & Sweatshirts categories. Although Tank Tops and Shorts ranked poorly as categories, specific items within them gained substantial traction. Filtering the products by gender shows that men's product quantities hover around the 1,000-unit mark, with two categories reaching nearly 1,200 units, slightly above those for women. Next to the bar chart is a pie chart with four slices, showing how the quantities for different retailer channels make up the whole. Notably, the Franchise channel consistently holds the highest share, accounting for 33-35% of total sales each year. Conversely, the small chain store channel shows a declining trend in orders, with a 5% decrease in 2023 compared to 2022. The final chart, a column chart, shows the most purchased sizes. Size S is the favored choice across genders at nearly 15,000 units, three times higher than size XS. Among men's sizes, however, purchases of M and XL rise in 2022 and 2023 to levels comparable to those of size S.
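
The channel shares quoted above are each channel's quantity divided by the total for that year. A minimal sketch of that calculation follows, using made-up quantities; the "Online" channel name and the `Share_Pct` and `Change_Pts` column names are assumptions introduced for the example.

```python
import pandas as pd

# Made-up quantities for illustration only
orders = pd.DataFrame({
    "Year": [2022, 2022, 2022, 2023, 2023, 2023],
    "Retailer_Channel": ["Franchise", "Small Chain", "Online"] * 2,
    "Order_Quantity": [350, 250, 400, 340, 200, 460],
})

# Total quantity per year and channel
by_channel = (
    orders.groupby(["Year", "Retailer_Channel"], as_index=False)["Order_Quantity"].sum()
)

# Share of each channel within its own year, in percent
year_total = by_channel.groupby("Year")["Order_Quantity"].transform("sum")
by_channel["Share_Pct"] = by_channel["Order_Quantity"] / year_total * 100

# Wide table: one row per channel, one column per year, plus the change in points
share_table = by_channel.pivot(index="Retailer_Channel", columns="Year", values="Share_Pct")
share_table["Change_Pts"] = share_table[2023] - share_table[2022]
print(share_table.round(1))
```

Reporting the change in percentage points keeps it distinct from a relative change in order quantity, which the two readings of "a 5% decrease" could otherwise blur.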

Achieving the best possible data visualization required careful consideration of Edward Tufte's effectiveness principles. This approach places a high priority on transparent manipulation and careful data collection, so that the information shared is correct and precise. Key insights were presented clearly and concisely by eliminating extraneous data ink and emphasizing important features. Following the concepts of color harmony, a green-yellow color scheme was used consistently in most of the charts to provide a visually appealing display. Additionally, complementary colors were used in the line chart, which has several legend entries, to improve user comprehension. This choice was made to produce distinct visual contrasts that help users differentiate between the data series. The dashboard's adherence to these design principles promotes efficient data analysis and interpretation, improves comprehension and accessibility for the audience, and presents the data in an aesthetically pleasing way.

## 5. Review of Analytics Methods Chosen

To make this dashboard informative and functional for the audience, the following methods were chosen for my analytics:
- Data Collection and Integration: I collected raw data from three distinct datasets: orders, retailers, and products. I then checked the data for missing values and inconsistencies and addressed any issues found. After cleaning, the three datasets were carefully combined into one. The data was also transformed so that it displays better in the visualizations.
- Metrics Selection and KPIs: The identification of key metrics (total quantities, profits, sales amounts) for Q3 2023 is a critical part of my analytics approach. These KPIs offer a snapshot of the company's performance, enabling quick assessment against targeted objectives.
- Trend Analysis: A line chart showing total order quantity by product category in Q3 of each year is used to track trends. This approach visualizes patterns and identifies shifts in demand across product categories, aiding strategic decision-making.
- Comparative Analysis: Horizontal bar charts representing the top 5 cities by sales quantity provide a comparative view. This visualization helps in quickly identifying high-performing cities and potential growth opportunities, enabling informed managerial action.
- Categorical Analysis: Column charts and pie charts display order quantity by product size and retailer channel, respectively, allowing for categorical analysis. These visualizations help in understanding customer preferences and channel performance, aiding targeted marketing and resource allocation decisions.
- Interaction and Filter Mechanisms: Interactive filters for gender and year let users engage with the data dynamically. Users can drill down into particular segments, allowing for deeper research and the extraction of insights.
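
The cleaning and integration step in the first bullet can be sketched on a few made-up rows. Note that this sketch swaps in a different merge strategy from the notebook: it chains two left merges with `validate="many_to_one"` instead of performing two separate merges followed by an outer merge, so that each order row appears exactly once and a duplicated lookup key raises an error. The `Order_ID` column and all values are hypothetical.

```python
import pandas as pd

# Made-up rows standing in for the three CSV files
orders = pd.DataFrame({
    "Order_ID": [1, 2, 3],
    "Retailer_ID": ["R1", "R2", "R1"],
    "Product_SKU": ["A", "B", "A"],
    "Order_Quantity": [2, 1, 5],
    "Sales_Amount": ["£25,00", "£12,50", "£62,50"],
})
retailers = pd.DataFrame({
    "Retailer_ID": ["R1", "R2"],
    "City": ["Abbey Ward", "Parwich"],
})
products = pd.DataFrame({
    "Product_SKU": ["A", "B"],
    "Product_Category": ["Shorts", "Pants"],
})

# Strip the pound sign and turn the decimal comma into a decimal point
orders["Sales_Amount"] = (
    orders["Sales_Amount"]
    .str.replace("£", "", regex=False)
    .str.replace(",", ".", regex=False)
    .astype(float)
)

# Report missing values per column before merging
print(orders.isna().sum())

# Chain two left merges; validate raises if a lookup key is duplicated
final = (
    orders
    .merge(retailers, on="Retailer_ID", how="left", validate="many_to_one")
    .merge(products, on="Product_SKU", how="left", validate="many_to_one")
)
print(final)
```

Comparing `len(final)` with `len(orders)` after the merge is a quick check that no order rows were dropped or duplicated.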

## 6. Review of Available Tools

When I started the project, I turned to Power BI because it is effective at transforming data into visually engaging charts that can be modified quickly. Having used Power BI before, I was accustomed to its intuitive drag-and-drop functionality, which made it easy to navigate across multiple databases and establish relationships between them. The application was vital for visualization: its intuitive UI and interactive features helped me explore key ideas and understand the data better. While developing the dashboard, I also cross-checked that my figures matched those in Power BI. The one limitation I found is that Power BI may struggle when performing complicated operations on large datasets, resulting in slow processing and poor performance.

## 7. Review of Chosen Datasets 

I selected my datasets out of an initial interest in sportswear, driven by its dynamic evolution within a rapidly advancing retail market. In the contemporary landscape, retailers must harness technological advancements to glean consumer-driven insights, a pivotal factor in competing for market share. The datasets I chose cover a diverse array of information relevant to retail markets, collected between 2020 and 2023. Each file provides rich information, giving me the opportunity to explore retail data analysis in depth.

## 8. Visualisation of Data with Accompanying Code

The dashboard details the retailer's performance in Q3 from 2020 to 2023, offering a comprehensive overview of sales distribution. This overview prompted a deeper dive into the proportional breakdown across cities, focusing on those with the highest sales volume throughout the year.

In [None]:
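# Imports repeated here so the cell can run on its own
from dash import Dash, dcc, html, Input, Output
import plotly.express as px

app = Dash(__name__)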

cities = total_order_quantity_from_final['City'].unique()

app.layout = html.Div([
    dcc.Dropdown(
        id='city-dropdown',
        options=[{'label': city, 'value': city} for city in cities],
        value=None,
        placeholder='Select City',
        style={'width': '500px', 'height' : '40px'}  
    ),
    dcc.Graph(
        id='order-quantity-pie-chart-city',
        style={'height': '600px'}  
    )
])

@app.callback(
    Output('order-quantity-pie-chart-city', 'figure'),
    [Input('city-dropdown', 'value')]
)
def update_pie_chart(selected_city):
    filtered_city_data = total_order_quantity_from_final.copy()
    if selected_city is not None:
        filtered_city_data = filtered_city_data[filtered_city_data['City'] == selected_city]

    order_quantity_by_retailer_city = filtered_city_data.groupby('Retailer_Channel')['Order_Quantity'].sum().reset_index()
    order_quantity_by_retailer_city = order_quantity_by_retailer_city.sort_values('Order_Quantity', ascending=False)
    fig_retailer = px.pie(order_quantity_by_retailer_city, values='Order_Quantity', names='Retailer_Channel')
    fig_retailer.update_layout(
        title={'text': 'Total Order Quantity by Retail Channel in Q3 2020 to 2023',
               'font': {'color': '#003c3b'},
               'x': 0.5, 'y': 0.95, 'xanchor': 'center', 'yanchor': 'top'},
        legend={'x': 1, 'y': 0.7, 'title': {'font': {'size': 15}}}  
    )

    custom_colors = ['#114b5f', '#1a936f', '#88d498', '#c6dabf', '#f3e9d2']
    fig_retailer.update_traces(marker=dict(colors=custom_colors, line=dict(color='#FFFFFF', width=1)),
                               showlegend=True)
    # hovertemplate takes precedence over hoverinfo, so only the template is set
    fig_retailer.update_traces(textposition='inside', textinfo='percent',
                               hovertemplate='<b>%{label}</b><br>%{value} units',
                               textfont=dict(color='white'))

    return fig_retailer

if __name__ == '__main__':
    # app.run replaces the deprecated app.run_server in recent Dash releases
    app.run(debug=True, port=8052)

A closer look at Abbey Ward reveals a conspicuous trend: local stores are entirely absent, while sales remain visible across three key retailer channels. This gives managers two questions to consider: whether there is an opportunity to broaden product availability through local stores and reach a wider customer base, or why sales strategies have not been implemented in this channel. In contrast, Carbrooke shows a pronounced preference for local stores, which dominate with a 61.7% share, significantly exceeding the other sales channels. This highlights the importance of doubling down on local store strategies there. Similar patterns appear in Broxburn, Uphall, and Winchburg, where local stores capture more than a quarter of the market share. Loundsley Green Ward, meanwhile, shows stronger sales through small retail chains. Overall, the analysis highlights the need for the company to adapt its product distribution to different retailer channels, adjusting its strategy to the consumer behavior observed in each location.
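
The channel shares quoted above are read from the pie chart, but the same percentages can be tabulated for every city at once. The following is a minimal sketch on made-up rows; the city labels and channel names are placeholders, and only the column names `City`, `Retailer_Channel`, and `Order_Quantity` come from the dashboard code.

```python
import pandas as pd

# Toy rows for illustration only; cities, channels, and values are made up.
sales = pd.DataFrame({
    "City": ["A", "A", "A", "B", "B"],
    "Retailer_Channel": ["Local Store", "Online", "Local Store",
                         "Online", "Department Store"],
    "Order_Quantity": [30, 50, 20, 40, 60],
})

# Total units per (city, channel) pair.
totals = sales.groupby(["City", "Retailer_Channel"])["Order_Quantity"].sum()

# Divide each pair by its city total to get the channel's share in percent.
city_totals = totals.groupby(level="City").transform("sum")
share_pct = (totals / city_totals * 100).round(1)

# One row per city, one column per channel; 0.0 marks a channel with no sales.
share_table = share_pct.unstack(fill_value=0.0)
print(share_table)
```

A zero in this table corresponds to a missing channel in a city's pie chart, which is the kind of gap noted for local stores in Abbey Ward.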

In [None]:
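# Imports repeated here so the cell can run on its own
import pandas as pd
from dash import Dash, dcc, html, Input, Output
import plotly.express as px

app = Dash(__name__)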

cities = total_order_quantity_from_final['City'].unique()
genders = total_order_quantity_from_final['Product_Gender'].unique()

app.layout = html.Div([
    dcc.Dropdown(
        id='city-dropdown',
        options=[{'label': city, 'value': city} for city in cities],
        value=None,
        placeholder='Select City',
        style={'width': '500px', 'marginBottom': '10px'} 
    ),
    dcc.Dropdown(
        id='gender-dropdown',
        options=[{'label': gender, 'value': gender} for gender in genders],
        value=None,
        placeholder='Select Gender',
        style={'width': '500px', 'marginBottom': '10px'}  
    ),
    dcc.Graph(id='order-quantity-column-chart-size')
])

@app.callback(
    Output('order-quantity-column-chart-size', 'figure'),
    [Input('city-dropdown', 'value'),
     Input('gender-dropdown', 'value')]
)
def update_column_chart(selected_city, selected_gender):
    filtered_data = total_order_quantity_from_final.copy()
    if selected_city:
        filtered_data = filtered_data[filtered_data['City'] == selected_city]
    if selected_gender:
        filtered_data = filtered_data[filtered_data['Product_Gender'] == selected_gender]

    order_quantity_by_size = filtered_data.groupby('Product_Size', observed=True)['Order_Quantity'].sum().reset_index()
    order_quantity_by_size['Product_Size'] = pd.Categorical(order_quantity_by_size['Product_Size'], categories=['XS','S','M','L','XL'], ordered = True)
    order_quantity_by_size = order_quantity_by_size.sort_values('Product_Size')

    # The title is set once in update_layout below
    fig = px.bar(order_quantity_by_size, x='Product_Size', y='Order_Quantity', color='Product_Size',
                 color_discrete_sequence=['#114b5f', '#1a936f', '#88d498', '#c6dabf', '#f3e9d2'])
    fig.update_layout(title={'text': 'Total Order Quantity by Product Size in Q3 2020 to 2023', 'x': 0.5}, 
                      xaxis_title=None,
                      yaxis_title=None, height=800)
    fig.update_traces(width=0.5, hovertemplate = '<b>Product Size</b> : %{x}<br><b>Order Quantity</b> : %{y}<extra></extra>')

    return fig

if __name__ == '__main__':
    # app.run replaces the deprecated app.run_server in recent Dash releases
    app.run(debug=True, port=8053)

Further investigation of product size, using two filters (City and Gender), revealed notable consumption trends. Demand for men's products in larger sizes in Abbey Ward greatly exceeds the overall city average: demand for sizes M, L, and XL is almost the same, at a level similar to that of size S. In contrast, menswear in Carbrooke, Broxburn, Uphall, and Winchburg predominantly favors size S, indicating a noticeable difference in sizing preferences. Loundsley Green Ward stands out for its high demand for size L. For women's products, Abbey Ward shows a clear preference for sizes S and M, with 1160 and 1218 units sold, respectively, each more than three times the sales of size XS. In Broxburn, Uphall, and Winchburg, there is a marked preference for size S in women's products. Parwich has the highest demand for size XL, with 281 units. Together, these results show how size preferences vary across cities and between men's and women's products.
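
Rather than switching the dropdowns city by city, the same comparison can be laid out as a city-by-size table. This is a minimal sketch on made-up rows, assuming the column names used in the dashboard code; the city labels and quantities are placeholders.

```python
import pandas as pd

SIZE_ORDER = ["XS", "S", "M", "L", "XL"]

# Toy rows for illustration only; cities and values are made up.
toy = pd.DataFrame({
    "City": ["A", "A", "B", "B", "B"],
    "Product_Gender": ["Women", "Women", "Women", "Men", "Women"],
    "Product_Size": ["S", "XS", "XL", "L", "S"],
    "Order_Quantity": [12, 4, 9, 6, 3],
})

# Apply the gender filter, then total the units per (city, size) pair.
women = toy[toy["Product_Gender"] == "Women"]
size_by_city = (
    women.groupby(["City", "Product_Size"])["Order_Quantity"].sum()
    .unstack(fill_value=0)
    .reindex(columns=SIZE_ORDER, fill_value=0)  # fixed XS..XL column order
)

# Size with the highest order quantity in each city.
top_size = size_by_city.idxmax(axis=1)
print(size_by_city)
print(top_size)
```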

## 9. Reflective Evaluation
This project brought obstacles and insights that have shaped my journey. Managing the three CSV files (orders, products, and retailers) required multiple phases of cleaning and merging using Pandas. The data went through several rounds of refinement, each improving its usability, culminating in the development of the dashboards.
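
The merge-and-filter step described here can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual pipeline: the key columns `Product_ID` and `Retailer_ID`, the `Order_Date` column, and all values are hypothetical.

```python
import pandas as pd

# Toy tables for illustration only; keys, dates, and values are made up.
orders = pd.DataFrame({
    "Order_ID": [1, 2, 3],
    "Product_ID": [10, 11, 10],      # hypothetical key
    "Retailer_ID": [100, 100, 101],  # hypothetical key
    "Order_Date": ["2020-07-15", "2021-02-01", "2023-09-30"],
    "Order_Quantity": [5, 2, 8],
})
products = pd.DataFrame({"Product_ID": [10, 11], "Product_Size": ["M", "S"]})
retailers = pd.DataFrame({
    "Retailer_ID": [100, 101],
    "Retailer_Channel": ["Online", "Local Store"],
})

orders["Order_Date"] = pd.to_datetime(orders["Order_Date"])

# Left joins keep every order; validate raises if a lookup key is duplicated.
merged = (
    orders
    .merge(products, on="Product_ID", how="left", validate="many_to_one")
    .merge(retailers, on="Retailer_ID", how="left", validate="many_to_one")
)

# Keep only third-quarter orders (July to September).
q3 = merged[merged["Order_Date"].dt.quarter == 3]
print(q3)
```

Using `validate="many_to_one"` is a cheap guard against duplicated lookup rows silently inflating order quantities after a merge.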

During this journey, I discovered the importance of selecting suitable visualization techniques. It became evident that different types of graphs offer distinct perspectives on the information, and that balancing comprehensive information against visual clarity is delicate. The process showed how carefully chosen charts make complex information easy to understand for a wide audience.

## 10. Conclusion
This project aimed to provide a robust and insightful dashboard presenting detailed Q3 performance data from 2020 to 2023 for a sports apparel firm. It entailed preparing clean data, designing aesthetically pleasing dashboards, and building functional ways of interacting with the tool. The project has confirmed once again that data preparation, deliberate visualization design, and user-centric interaction are indispensable in creating meaningful analytical tools for managers to use in decision-making.


## References
- Bach, B., Freeman, E., Abdul-Rahman, A., Turkay, C., Khan, S., Fan, Y., & Chen, M. (2023). Dashboard Design Patterns. IEEE Transactions on Visualization and Computer Graphics, 29(1), 342–352. https://doi.org/10.1109/TVCG.2022.3209448
- Eckerson, W. W. (2010). Performance dashboards: measuring, monitoring, and managing your business (2nd ed.). Wiley. https://doi.org/10.1002/9781119199984
- GlobalData UK Ltd. (2023). United Kingdom (UK) Sportswear Market Size and Forecast Analytics by Category (Apparel, Footwear, Accessories), Segments (Gender, Positioning, Activity), Retail Channel and Key Brands, 2020-2025. Market Research Reports & Consulting. https://www.globaldata.com/store/report/uk-sportswear-market-analysis/
- Kirk, A. (2019). Data Visualisation. SAGE.
- Knaflic, C. N. (2015). Storytelling with Data. John Wiley & Sons.
- Morton, J. (2015). Color Matters. https://www.colormatters.com/color-and-design/basic-color-theory
- Munzner, T. (2014). Visualization analysis and design (1st ed.). Taylor and Francis.
- Tufte, E. R. (2001). The visual display of quantitative information (2nd ed.). Graphics Press.
