## 1.0 Import Libraries 

In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go

from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)

## 2.0 Load data 

In [2]:
df = pd.read_csv('dataset/data-2.csv')
df.head(5)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,12/1/2010 8:26,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,12/1/2010 8:26,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,12/1/2010 8:26,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,12/1/2010 8:26,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,12/1/2010 8:26,3.39,17850.0,United Kingdom


## 3.0 Data Understanding

### 3.1 Shape

In [3]:
df.shape

(541909, 8)

> <div class='alert alert-block alert-secondary' style='font-size:18px'>
    <li>The dataset has 541,909 rows and 8 columns.</li></div>

### 3.2 Info

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 541909 entries, 0 to 541908
Data columns (total 8 columns):
 #   Column       Non-Null Count   Dtype  
---  ------       --------------   -----  
 0   InvoiceNo    541909 non-null  object 
 1   StockCode    541909 non-null  object 
 2   Description  540455 non-null  object 
 3   Quantity     541909 non-null  int64  
 4   InvoiceDate  541909 non-null  object 
 5   UnitPrice    541909 non-null  float64
 6   CustomerID   406829 non-null  float64
 7   Country      541909 non-null  object 
dtypes: float64(2), int64(1), object(5)
memory usage: 33.1+ MB


> <div class="alert alert-block alert-secondary" style='font-size:18px'>
  <li>This output shows that <b>InvoiceDate</b> (object) and <b>CustomerID</b> (float64) are of the wrong types.</li>
  <li>Also, <b>Description</b> and <b>CustomerID</b> have missing values.</li></div>

### 3.3 Describe

In [5]:
df.describe()

Unnamed: 0,Quantity,UnitPrice,CustomerID
count,541909.0,541909.0,406829.0
mean,9.55225,4.611114,15287.69057
std,218.081158,96.759853,1713.600303
min,-80995.0,-11062.06,12346.0
25%,1.0,1.25,13953.0
50%,3.0,2.08,15152.0
75%,10.0,4.13,16791.0
max,80995.0,38970.0,18287.0


> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>Here we can see that <b>Quantity</b> and <b>UnitPrice</b> have negative values. We will investigate why.</li></div>

## 4.0 Data Preparation and Cleaning

### 4.1 Missing values

In [4]:
df_msn = pd.DataFrame(df.isna().sum())
df_msn['Features'] = df_msn.index
df_msn['MissingValues'] = df_msn[0]
fig = px.bar(df_msn, 
            title='Missing Values by Features',
             x = 'Features', 
             y = 'MissingValues', 
             color='Features',
            text='MissingValues',
            text_auto='.2s')
fig.update_yaxes(title='Count of Missing Values')

What % of Customer and Description data is missing?

In [7]:
customer_msn = (df['CustomerID'].isna().sum() / len(df['CustomerID'])) * 100
description_msn = (df['Description'].isna().sum() / len(df['Description'])) * 100
print('{:.2f}%, {:.2f}%'.format(customer_msn,description_msn))

24.93%, 0.27%


> <div class="alert alert-block alert-secondary">
    <span style='font-size:18px'><li>Approximately 135k transactions (135,080) have a missing CustomerID. That is 24.93% of all rows.</li></span>
<span style='font-size:18px'><li>About 1.5k items (1,454) have missing descriptions. That is 0.27% of all rows.</li></span></div>
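The same percentages can be computed for every column at once, because the mean of a boolean mask is the fraction of `True` values. A minimal sketch on a small made-up frame (the `toy` data is illustrative only, not the notebook's dataset):

```python
import numpy as np
import pandas as pd

# Illustrative toy frame (hypothetical values, not the real dataset)
toy = pd.DataFrame({
    'Description': ['LANTERN', None, 'HOLDER', 'JAR'],
    'CustomerID': [17850.0, np.nan, np.nan, 13047.0],
    'Quantity': [6, 1, 2, 3],
})

# isna() gives a boolean mask; its column-wise mean is the missing fraction
missing_pct = (toy.isna().mean() * 100).round(2)
print(missing_pct)
```

Applied to `df`, this would give one percentage per column without repeating the division for each feature.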

Checking missing values in Description Column

In [8]:
df[df['Description'].isna()]

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
622,536414,22139,,56,12/1/2010 11:52,0.0,,United Kingdom
1970,536545,21134,,1,12/1/2010 14:32,0.0,,United Kingdom
1971,536546,22145,,1,12/1/2010 14:33,0.0,,United Kingdom
1972,536547,37509,,1,12/1/2010 14:33,0.0,,United Kingdom
1987,536549,85226A,,1,12/1/2010 14:34,0.0,,United Kingdom
...,...,...,...,...,...,...,...,...
535322,581199,84581,,-2,12/7/2011 18:26,0.0,,United Kingdom
535326,581203,23406,,15,12/7/2011 18:31,0.0,,United Kingdom
535332,581209,21620,,6,12/7/2011 18:35,0.0,,United Kingdom
536981,581234,72817,,27,12/8/2011 10:33,0.0,,United Kingdom


> <div class="alert alert-block alert-secondary">
    <span style='font-size:18px'><li>Here it appears that where <b>Description</b> is missing, <b>UnitPrice</b> is 0.0 and <b>CustomerID</b> is also missing. </li></span>
<span style='font-size:18px'><li>It makes more sense to delete these rows. </li></span></div>
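The displayed output is truncated, so the pattern is only visible for a handful of rows. One way to confirm it for all rows is to test the conditions explicitly. A hedged sketch on made-up data (the `toy` frame and the variable names are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Illustrative toy frame (hypothetical values, not the real dataset)
toy = pd.DataFrame({
    'Description': ['LANTERN', None, None, 'JAR'],
    'UnitPrice': [3.39, 0.0, 0.0, 1.25],
    'CustomerID': [17850.0, np.nan, np.nan, np.nan],
})

# Rows with no description
no_desc = toy[toy['Description'].isna()]

# True only if every row without a description also lacks a customer
all_missing_customer = no_desc['CustomerID'].isna().all()
# True only if every row without a description has a zero unit price
all_zero_price = (no_desc['UnitPrice'] == 0).all()

print(all_missing_customer, all_zero_price)
```

If either check came back `False` on the real data, the rows that break the pattern would be worth inspecting before dropping anything.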

Drop NaN from CustomerID and Description

In [9]:
df.drop(df[df['CustomerID'].isna() | df['Description'].isna()].index, inplace=True)

Drop the rows where Country is "Unspecified"

In [10]:
df.drop(df.query('Country == "Unspecified"').index, inplace=True)

### 4.2 Datatype issues

Converting InvoiceDate and CustomerID to datetime and int respectively

In [11]:
df = df.astype({'InvoiceDate':'datetime64[ns]','CustomerID':'int'})
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 406585 entries, 0 to 541908
Data columns (total 8 columns):
 #   Column       Non-Null Count   Dtype         
---  ------       --------------   -----         
 0   InvoiceNo    406585 non-null  object        
 1   StockCode    406585 non-null  object        
 2   Description  406585 non-null  object        
 3   Quantity     406585 non-null  int64         
 4   InvoiceDate  406585 non-null  datetime64[ns]
 5   UnitPrice    406585 non-null  float64       
 6   CustomerID   406585 non-null  int64         
 7   Country      406585 non-null  object        
dtypes: datetime64[ns](1), float64(1), int64(2), object(4)
memory usage: 27.9+ MB
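The raw dates are written month first (`12/1/2010 8:26` is 1 December 2010). Passing an explicit format to `pd.to_datetime` makes that assumption visible instead of relying on inference. A small sketch, using two made-up strings in the same layout as the raw column:

```python
import pandas as pd

# Two sample strings in the same layout as the raw InvoiceDate column
raw = pd.Series(['12/1/2010 8:26', '12/9/2011 12:50'])

# Explicit month/day/year format avoids any day-first ambiguity
parsed = pd.to_datetime(raw, format='%m/%d/%Y %H:%M')
print(parsed.dt.month.tolist(), parsed.dt.day.tolist())
```

With an explicit format, a string that does not match raises an error rather than being silently parsed in a different order.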


Investigating negative values from Quantity

In [12]:
df.query('Quantity < 0')

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
141,C536379,D,Discount,-1,2010-12-01 09:41:00,27.50,14527,United Kingdom
154,C536383,35004C,SET OF 3 COLOURED FLYING DUCKS,-1,2010-12-01 09:49:00,4.65,15311,United Kingdom
235,C536391,22556,PLASTERS IN TIN CIRCUS PARADE,-12,2010-12-01 10:24:00,1.65,17548,United Kingdom
236,C536391,21984,PACK OF 12 PINK PAISLEY TISSUES,-24,2010-12-01 10:24:00,0.29,17548,United Kingdom
237,C536391,21983,PACK OF 12 BLUE PAISLEY TISSUES,-24,2010-12-01 10:24:00,0.29,17548,United Kingdom
...,...,...,...,...,...,...,...,...
540449,C581490,23144,ZINC T-LIGHT HOLDER STARS SMALL,-11,2011-12-09 09:57:00,0.83,14397,United Kingdom
541541,C581499,M,Manual,-1,2011-12-09 10:28:00,224.69,15498,United Kingdom
541715,C581568,21258,VICTORIAN SEWING BOX LARGE,-5,2011-12-09 11:57:00,10.95,15311,United Kingdom
541716,C581569,84978,HANGING HEART JAR T-LIGHT HOLDER,-1,2011-12-09 11:58:00,1.25,17315,United Kingdom


> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>Here we can see that where <b>InvoiceNo</b> begins with a 'C', the <b>Quantity</b> is negative.</li>
    <li>This likely indicates canceled orders.</li></div>
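Whether the 'C' prefix and a negative quantity always occur together can be checked with a cross-tabulation of the two conditions. A minimal sketch on made-up rows (the `toy` frame is illustrative only):

```python
import pandas as pd

# Illustrative toy frame (hypothetical values, not the real dataset)
toy = pd.DataFrame({
    'InvoiceNo': ['536365', 'C536379', 'C536383', '536390'],
    'Quantity': [6, -1, -1, 12],
})

is_cancel = toy['InvoiceNo'].str.startswith('C')
is_negative = toy['Quantity'] < 0

# Cross-tabulate the two flags; off-diagonal cells would be exceptions
table = pd.crosstab(is_cancel, is_negative)
print(table)

# True only if the two flags agree on every row
print((is_cancel == is_negative).all())
```

On the real data, any non-zero off-diagonal count would mean the two conditions are not equivalent.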

Removing items not necessary for this analysis

In [13]:
df.drop(df.query('Description in ["POSTAGE", "CARRIAGE", "Discount", "DOTCOM POSTAGE", "CRUK Commission", "Manual"]').index, axis=0, inplace=True)

### 4.3 Irregular Entries

Displaying the Country column

In [14]:
df.Country.unique()

array(['United Kingdom', 'France', 'Australia', 'Netherlands', 'Germany',
       'Norway', 'EIRE', 'Switzerland', 'Spain', 'Poland', 'Portugal',
       'Italy', 'Belgium', 'Lithuania', 'Japan', 'Iceland',
       'Channel Islands', 'Denmark', 'Cyprus', 'Sweden', 'Austria',
       'Israel', 'Finland', 'Greece', 'Singapore', 'Lebanon',
       'United Arab Emirates', 'Saudi Arabia', 'Czech Republic', 'Canada',
       'Brazil', 'USA', 'European Community', 'Bahrain', 'Malta', 'RSA'],
      dtype=object)

> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>The entries EIRE, USA, and RSA are abbreviations or non-standard names, unlike the other countries.</li></div>

Replacing irregular entries

In [15]:
df.loc[df.query('Country == "EIRE"').index, 'Country'] = 'Ireland'
df.loc[df.query('Country == "USA"').index, 'Country'] = 'United States of America'
df.loc[df.query('Country == "RSA"').index, 'Country'] = 'South Africa'

df.Country.unique()

array(['United Kingdom', 'France', 'Australia', 'Netherlands', 'Germany',
       'Norway', 'Ireland', 'Switzerland', 'Spain', 'Poland', 'Portugal',
       'Italy', 'Belgium', 'Lithuania', 'Japan', 'Iceland',
       'Channel Islands', 'Denmark', 'Cyprus', 'Sweden', 'Austria',
       'Israel', 'Finland', 'Greece', 'Singapore', 'Lebanon',
       'United Arab Emirates', 'Saudi Arabia', 'Czech Republic', 'Canada',
       'Brazil', 'United States of America', 'European Community',
       'Bahrain', 'Malta', 'South Africa'], dtype=object)
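The three replacements can also be expressed as a single mapping passed to `Series.replace`, which keeps the corrections in one place. A small sketch (the constant name `COUNTRY_FIXES` and the sample series are assumptions for illustration):

```python
import pandas as pd

# Mapping taken from the three replacements above
COUNTRY_FIXES = {
    'EIRE': 'Ireland',
    'USA': 'United States of America',
    'RSA': 'South Africa',
}

# Small sample series standing in for df['Country']
countries = pd.Series(['United Kingdom', 'EIRE', 'USA', 'RSA', 'France'])

# Values not present in the mapping are left unchanged
cleaned = countries.replace(COUNTRY_FIXES)
print(cleaned.tolist())
```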

## 5.0 Feature Engineering

Add a Revenue column

In [16]:
df['Revenue'] = abs(df['UnitPrice'] * df['Quantity'])
df.columns

Index(['InvoiceNo', 'StockCode', 'Description', 'Quantity', 'InvoiceDate',
       'UnitPrice', 'CustomerID', 'Country', 'Revenue'],
      dtype='object')

Derive Date, Year, Month, Day, and Hour columns from the InvoiceDate column

In [17]:
df['Year'] = df['InvoiceDate'].dt.year
df['Month'] = df['InvoiceDate'].dt.month_name()
df['Date'] = df[['Month','Year']].astype(str).apply('-'.join, axis=1)
df['Day'] = df['InvoiceDate'].dt.day_name()
df['Hour'] = df['InvoiceDate'].dt.hour

df.head(5)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country,Revenue,Year,Month,Date,Day,Hour
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850,United Kingdom,15.3,2010,December,December-2010,Wednesday,8
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850,United Kingdom,20.34,2010,December,December-2010,Wednesday,8
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850,United Kingdom,22.0,2010,December,December-2010,Wednesday,8
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850,United Kingdom,20.34,2010,December,December-2010,Wednesday,8
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850,United Kingdom,20.34,2010,December,December-2010,Wednesday,8


In [18]:
def formatTime(h):
    # Expects a row whose 'Hour' value is already a string, e.g. '8' -> '8:00'
    return h.Hour+':00'

## 6.0 Analysis 

CONSTANTS

In [19]:
CURRENCY = '£'
TICKANGLE = 30

Create another dataframe - Sales, excluding all canceled orders 

In [20]:
Sales = df[~df['InvoiceNo'].str.contains('C')]
Sales.head(5)

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country,Revenue,Year,Month,Date,Day,Hour
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850,United Kingdom,15.3,2010,December,December-2010,Wednesday,8
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850,United Kingdom,20.34,2010,December,December-2010,Wednesday,8
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850,United Kingdom,22.0,2010,December,December-2010,Wednesday,8
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850,United Kingdom,20.34,2010,December,December-2010,Wednesday,8
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850,United Kingdom,20.34,2010,December,December-2010,Wednesday,8


What is the total revenue for the available period?

In [21]:
total_revenue = Sales.groupby('Date', as_index=False)['Revenue'].sum().round(2)
order_list = ["05","09","01","13","03","02","08","07","04","06","12","11","10"]
total_revenue['Order'] = order_list
total_revenue.sort_values(by='Order', inplace=True)
total_revenue.reset_index(drop=True, inplace=True)
total_revenue.drop('Order', axis=1, inplace=True)

fig = go.Figure()
fig.add_traces(go.Scatter(x=total_revenue.Date,y=total_revenue.Revenue,line=dict(color='crimson')))
fig.update_traces(name='Revenue')
fig.update_layout(title='Total Sales Revenue')
fig.update_xaxes(tickangle=TICKANGLE)
fig.update_yaxes(tickprefix=CURRENCY, title='Revenue')

fig.show()

total_revenue

Unnamed: 0,Date,Revenue
0,December-2010,567505.72
1,January-2011,564041.64
2,February-2011,443346.02
3,March-2011,584562.85
4,April-2011,454982.81
5,May-2011,659644.22
6,June-2011,654246.78
7,July-2011,591933.42
8,August-2011,636276.21
9,September-2011,940930.81


> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>November 2011 had the highest sales revenue, at £1.15M.</li>
    <li>February 2011 saw the lowest, at £443.35k.</li></div>
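The hard-coded `order_list` depends on the alphabetical order of the month labels and breaks if a month is added or missing. An alternative is to parse the `Month-Year` labels into dates and sort on those. A hedged sketch using three rows from the table above (the `SortKey` column name is an assumption):

```python
import pandas as pd

# Revenue values copied from the table above (three months only)
monthly = pd.DataFrame({
    'Date': ['February-2011', 'December-2010', 'January-2011'],
    'Revenue': [443346.02, 567505.72, 564041.64],
})

# Parse "Month-Year" labels into real dates and sort on them
monthly['SortKey'] = pd.to_datetime(monthly['Date'], format='%B-%Y')
monthly = (monthly.sort_values('SortKey')
                  .drop(columns='SortKey')
                  .reset_index(drop=True))
print(monthly['Date'].tolist())
```

Note that `%B` matches full month names in the current locale, which here is assumed to be English, the same as the output of `dt.month_name()`.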

Which 10 countries generated the most revenue?

In [22]:
country_revenue = Sales.groupby('Country', as_index=False)['Revenue'].sum().round(2)
country_revenue.sort_values(by=['Revenue'], ascending=False, inplace=True)

fig = px.bar(country_revenue[:10].reset_index(),
             title='Top 10 Countries by Revenue',
             x='Country',
             y='Revenue',
             text='Revenue',
             text_auto='.3s',
             color='Country')
fig.update_yaxes(title='Revenue', tickprefix=CURRENCY)
fig.show()

country_revenue[:10]

Unnamed: 0,Country,Revenue
34,United Kingdom,7266027.23
23,Netherlands,283889.34
16,Ireland,257296.56
13,Germany,205569.89
12,France,183891.68
0,Australia,138171.31
30,Spain,55725.11
32,Switzerland,52441.95
19,Japan,37416.37
3,Belgium,36927.34


> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>The United Kingdom generated the most revenue, £7.27M, from sales between 2010 and 2011.</li></div>

What are the top 20 products in terms of revenue?

In [23]:
products_revenue = Sales.groupby(['StockCode','Description'], as_index=False)['Revenue'] \
.sum().round(2) \
.sort_values('Revenue', ascending=False) \
.head(20) \
.reset_index(drop=True)

fig = px.bar(products_revenue,
            title='Top 20 Products by Revenue',
            x='StockCode',
            y='Revenue',
            color='StockCode',
            text='Revenue',
            text_auto='.4s')
fig.update_yaxes(tickprefix=CURRENCY)
fig.show()

> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li><b>PAPER CRAFT, LITTLE BIRDIE</b> is the top selling product generating a revenue of £168.5k between 2010 - 2011.</li></div>

What is the sales quantity by product between 2010 - 2011?

In [24]:
product_sales = Sales.groupby('StockCode', as_index=False)['Quantity'].sum()
product_sales.rename(columns={'Quantity':'Sales'}, inplace=True)

product_sales_perf = products_revenue.merge(product_sales, on='StockCode')

fig = px.bar(product_sales_perf.reset_index(),
            title='Sales Quantity for Top 20 Products',
            x='StockCode',
            y='Sales',
            text='Sales',
            color='StockCode',
            text_auto='.2s')
fig.update_yaxes(title='Quantity')
fig.show()

What is the average unit price of the top 20 products sold?

In [25]:
unitPrice_grouped = Sales.groupby(['StockCode'], as_index=False)['UnitPrice'].mean()
unit_price = product_sales_perf.merge(unitPrice_grouped, on='StockCode')

fig = px.bar(unit_price,
            title='Average Unit Price of Top 20 Products Sold',
            x='StockCode',
            y='UnitPrice',
            color='StockCode',
            text='UnitPrice',
            text_auto='.2f')
fig.update_yaxes(tickprefix = '£', title='Unit Price')
fig.show()

unit_price

Unnamed: 0,StockCode,Description,Revenue,Sales,UnitPrice
0,23843,"PAPER CRAFT , LITTLE BIRDIE",168469.6,80995,2.08
1,22423,REGENCY CAKESTAND 3 TIER,142567.45,12410,12.475842
2,85123A,WHITE HANGING HEART T-LIGHT HOLDER,100448.15,36782,2.893106
3,85099B,JUMBO BAG RED RETROSPOT,85220.78,46181,2.015878
4,23166,MEDIUM CERAMIC TOP STORAGE JAR,81416.73,77916,1.220303
5,47566,PARTY BUNTING,68844.33,15295,4.872885
6,84879,ASSORTED COLOUR BIRD ORNAMENT,56543.16,35340,1.680776
7,23084,RABBIT NIGHT LIGHT,51346.2,27202,2.013943
8,79321,CHILLI LIGHTS,46286.51,9650,5.42876
9,22086,PAPER CHAIN KIT 50'S CHRISTMAS,42660.83,15617,2.937203


> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li><b>PICNIC BASKET WICKER 60 PIECES</b> and <b>REGENCY CAKESTAND 3 TIER</b> have comparatively high unit prices and significantly lower quantities sold.</li>
    </div>

Which customers made the largest purchases?

In [26]:
top_customers = Sales.groupby('CustomerID', as_index=False)['Revenue'].sum() \
.sort_values(by='Revenue', ascending=False) \
.reset_index(drop=True)
top_customers['CustomerID'] = top_customers['CustomerID'].astype(str)

fig = px.bar(top_customers[:10],
            title='Top 10 Customers by Revenue',
            x='CustomerID',
            y='Revenue',
            color='CustomerID',
            text='Revenue',
            text_auto='.4s')
fig.update_yaxes(tickprefix=CURRENCY)
fig.show()

top_customers[:10]

Unnamed: 0,CustomerID,Revenue
0,14646,279138.02
1,18102,259657.3
2,17450,194550.79
3,16446,168472.5
4,14911,136275.72
5,12415,124564.53
6,14156,116729.63
7,17511,91062.38
8,12346,77183.6
9,16029,72882.09


> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>The customer with ID <b>14646</b> spent the most, with purchases amounting to £279.1k between 2010 and 2011.</li></div>

What is the sales performance by Day of week?

In [27]:
daily = Sales.groupby('Day', as_index=False)['Revenue'].sum()
# daily.rename(columns={'Quantity':'Sales'}, inplace=True)
daily['Order'] = ['5','1','6','4','2','3']
daily.sort_values(by=['Order'], inplace=True)

fig = px.bar(daily.reset_index(),
            title='Sales Performance by Day',
            x='Day',
            y='Revenue',
            text='Day',
            color='Day',
            text_auto='.2s')
fig.update_yaxes(tickprefix = '£', title='Revenue')
fig.show()

> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>Here we can see that the most sales are made on Thursdays and the fewest on Sundays.</li></div>
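The manual `Order` column can be avoided by turning `Day` into an ordered categorical, so sorting follows the calendar week regardless of which days are present. A minimal sketch with made-up revenue figures (the `DAY_ORDER` constant and the values are assumptions for illustration):

```python
import pandas as pd

DAY_ORDER = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
             'Friday', 'Saturday', 'Sunday']

# Hypothetical revenue figures, for illustration only
daily = pd.DataFrame({
    'Day': ['Friday', 'Monday', 'Sunday', 'Thursday'],
    'Revenue': [100.0, 80.0, 40.0, 150.0],
})

# An ordered categorical sorts by position in DAY_ORDER, not alphabetically
daily['Day'] = pd.Categorical(daily['Day'], categories=DAY_ORDER, ordered=True)
daily = daily.sort_values('Day').reset_index(drop=True)
print(daily['Day'].tolist())
```

Days missing from the data simply do not appear in the result, so the list of order labels never has to match the number of rows.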

What is the sales performance by hour?

In [28]:
product_sales = Sales.groupby('Hour', as_index=False)['Revenue'].sum()
# product_sales.rename(columns={'Quantity':'Sales'}, inplace=True)
product_sales['Hour'] = product_sales['Hour'].astype(str)
product_sales['Hour'] = product_sales.apply(lambda h:formatTime(h), axis=1)

fig = px.line(product_sales,
            title='Sales Performance by Hour',
            x='Hour',
            y='Revenue')
fig.update_yaxes(tickprefix = '£', title='Revenue')
fig.show()

> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>The peak hours for sales are between 10:00 and 12:00</li></div>

What is the total sales revenue lost from canceled orders? 

In [29]:
products_canceled = df.query('(InvoiceNo.str.contains("C"))') \
.groupby(['Date'], as_index=False) \
.Revenue.agg(['count','sum']) \
.reset_index()

products_canceled.rename(columns={'count':'CanceledOrders','sum':'RevenueLost'}, inplace=True)
order_list = ["05","09","01","13","03","02","08","07","04","06","12","11","10"]
products_canceled['Order'] = order_list
products_canceled.sort_values(by='Order', ascending=True, inplace=True)
products_canceled.reset_index(drop=True, inplace=True)
products_canceled.drop('Order', axis=1,inplace=True)

fig = go.Figure()
fig.add_traces(go.Scatter(name='RevenueLost',x=products_canceled.Date,y=products_canceled.RevenueLost,line=dict(color='crimson')))
fig.update_layout(title='Total Sales Revenue Lost (Canceled Orders)')
fig.update_yaxes(tickprefix=CURRENCY, title='Revenue')
fig.show()


products_canceled

Unnamed: 0,Date,CanceledOrders,RevenueLost
0,December-2010,675,17365.64
1,January-2011,667,91253.82
2,February-2011,414,8315.77
3,March-2011,609,10036.4
4,April-2011,530,33313.96
5,May-2011,552,8928.25
6,June-2011,629,12652.71
7,July-2011,638,11391.01
8,August-2011,614,22912.2
9,September-2011,758,17007.93


> <div class="alert alert-block alert-secondary" style='font-size:18px'>
    <li>December 2011 had the highest revenue loss of £175k from canceled orders.</li>
    <li>The highest number of order cancelations was recorded in October 2011 at 1124 orders.</li></div>

## 7.0 Findings and Recommendations

<div class="alert alert-block alert-secondary" style='font-size:18px'>
    <ul>
    <li>Revenue rose sharply in September 2011 to £940.9k, up from £636.3k in August, a growth of approximately 47.9%. This upward trend continued until November 2011, which recorded the highest monthly sales revenue (£1,145.2k) in the period from Dec. 2010 to Dec. 2011. December 2011 then dropped to £513.7k, a decline of roughly 55% from November. The latest invoices shown in the data are dated 9 December 2011, so that month appears to be only partially covered. Further investigation is needed to determine the causes behind the surge in revenue and the sharp decline.</li>
    <li>The UK generated the highest revenue, £7.27M, from sales between Dec. 2010 and Dec. 2011, which is over 2000% higher than the next best performer at £283.9k. This indicates the need for additional marketing efforts in other countries.</li>
    <li>Between Dec. 2010 and Dec. 2011, the top-grossing product was "PAPER CRAFT, LITTLE BIRDIE," yielding a revenue of £168.5k.</li>
    <li>The products "REGENCY CAKESTAND 3 TIER" and "PICNIC BASKET WICKER 60 PIECES" had relatively high unit prices per product, at £12.48 and £11.20 respectively. These prices should be reviewed.</li>
    <li>The customer with ID 14646 made the most purchases between 2010 and 2011, amounting to £279.1k.</li>
    <li>Customers tend to make more purchases on Thursdays, with the peak hours occurring between 10:00 AM and 12:00 PM. The sales and marketing teams should target their efforts accordingly.</li>
    </ul></div>
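The month-over-month percentages quoted in the findings follow from the usual relative-change formula, (new - old) / old × 100. A small sketch that reproduces the arithmetic, using the August and September 2011 values from the monthly revenue table and the November and December 2011 figures as quoted in the findings (the helper name `percent_change` is an assumption):

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

# August -> September 2011, values from the monthly revenue table
growth = percent_change(636276.21, 940930.81)

# November -> December 2011, figures in thousands as quoted in the findings
decline = percent_change(1145.2, 513.7)

print(round(growth, 1), round(decline, 1))
```

A positive result is growth and a negative result is a decline relative to the earlier month.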