UK MONETARY POLICY 2019-2025

## Objectives

* The objectives of this project are to:
  * Collect official UK macroeconomic data from reliable public sources
  * Apply Python-based ETL (Extract, Transform, Load) techniques
  * Clean and preprocess time-series data for analysis
  * Convert datasets to a common quarterly frequency for alignment
  * Analyse trends in inflation, Official Bank Rate and GDP between 2019 and 2025

## Inputs

* The following data inputs are required:
1. Bank of England policy interest rate: time-series data on the UK nominal policy interest rate (Official Bank Rate)
2. Consumer Price Index (CPI): UK CPI, used as a measure of inflation
3. Gross Domestic Product (GDP): UK GDP data, used to represent economic activity

## Outputs

* Cleaned CSV files written by this notebook: `cpi_cleaned.csv` (CPI, filtered to 2019-2025) and `boe_rate_cleaned.csv` (Official Bank Rate, quarterly mean, 2019-2025)

## Additional Comments

* If you have any additional comments that don't fit in the previous bullets, please state them here. 



---

# Change working directory

* We are assuming you will store the notebooks in a subfolder, therefore when running the notebook in the editor, you will need to change the working directory

We need to change the working directory from its current folder to its parent folder
* We access the current directory with os.getcwd()

In [8]:
import os
current_dir = os.getcwd()  # folder the notebook is currently running from
current_dir

'\\\\talktalk\\redirectedfolders\\F.Afolabi\\Documents\\VSCode1\\UK_MonetaryPolicy_2019'

We want to make the parent of the current directory the new current directory
* os.path.dirname() gets the parent directory
* os.chdir() defines the new current directory

In [73]:
os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")

You set a new current directory


Confirm the new current directory

In [74]:
current_dir = os.getcwd()
current_dir

'\\\\talktalk\\redirectedfolders'
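
Because `os.chdir(os.path.dirname(current_dir))` moves one level up every time the cell is executed, re-running it can climb past the project folder, which may be why the path above ends at `redirectedfolders`. A minimal sketch of a re-runnable alternative, assuming the project root can be recognised by a marker file; the helper `find_project_root` and the marker name `README.md` are hypothetical and not part of the original notebook:

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "README.md") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`."""
    start = start.resolve()
    # Check the starting folder first, then each parent in turn
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No directory containing {marker!r} found at or above {start}")


# Usage (commented out so the sketch has no side effects):
# import os
# os.chdir(find_project_root(Path.cwd()))
```

Running the usage lines a second time leaves the working directory unchanged, because the search starts from the folder that already contains the marker.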

# Section 1
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns 

Section 1 Extraction: Load the Dataset

In [1]:
import pandas as pd

cpi = pd.read_csv(r"\\talktalk\redirectedfolders\F.Afolabi\Documents\VSCode1\inflation.csv")
print(cpi.head())

# Display basic information about the dataset. Each variable is handled in turn, starting with CPI.
cpi.info() 

print(cpi.shape)  # Looking at the shape of CPI

   Date  CPI_Inflation
0  1996           68.8
1  1997           70.1
2  1998           71.2
3  1999           72.1
4  2000           72.7
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1280 entries, 0 to 1279
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Date           1280 non-null   object 
 1   CPI_Inflation  634 non-null    float64
dtypes: float64(1), object(1)
memory usage: 20.1+ KB
(1280, 2)


In [2]:
# Generate a summary of the statistics
print(cpi.describe())

       CPI_Inflation
count     634.000000
mean       86.401577
std        22.090846
min        48.400000
25%        70.500000
50%        81.150000
75%       100.550000
max       139.800000


In [3]:
# Checking for missing values and data type
print(cpi.isnull().sum())
print(cpi.dtypes)

Date               0
CPI_Inflation    646
dtype: int64
Date              object
CPI_Inflation    float64
dtype: object


In [4]:
# Ensure the Date values are strings, then convert them to quarter-start timestamps
cpi['Date'] = cpi['Date'].astype(str)
cpi['Date'] = cpi['Date'].str.replace(' Q', '-Q')
cpi['Date'] = pd.PeriodIndex(cpi['Date'], freq='Q').to_timestamp()

In [5]:
print(cpi.isnull().sum())

Date               0
CPI_Inflation    646
dtype: int64


In [6]:
# Restrict CPI to 2019-2025, the scope of this project; .copy() avoids a SettingWithCopyWarning when the column is modified later
cpi = cpi[(cpi['Date'] >= '2019-01-01') & (cpi['Date'] <= '2025-12-31')].copy()
print(cpi.head())
cpi.info()

         Date  CPI_Inflation
23 2019-01-01          107.8
24 2020-01-01          108.7
25 2021-01-01          111.6
26 2022-01-01          121.7
27 2023-01-01          130.5
<class 'pandas.core.frame.DataFrame'>
Index: 116 entries, 23 to 1279
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   Date           116 non-null    datetime64[ns]
 1   CPI_Inflation  115 non-null    float64       
dtypes: datetime64[ns](1), float64(1)
memory usage: 2.7 KB
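
The first rows of the filtered output step a year at a time (2019-01-01, 2020-01-01, ...), which suggests the raw file mixes annual labels such as `2019` with quarterly labels such as `2019 Q1`; after conversion both land on the same quarter-start timestamp. If that is the case, one option is to keep only the quarterly labels before converting. A minimal sketch on made-up data, assuming quarterly labels have the form `YYYY Qn` (the sample values and the name `quarterly_only` are illustrative, not taken from the dataset):

```python
import pandas as pd

# Illustrative sample mixing annual, quarterly and monthly labels (values are made up)
sample = pd.DataFrame({
    "Date": ["2019", "2019 Q1", "2019 Q2", "2019 JAN", "2019 Q3", "2019 Q4"],
    "CPI_Inflation": [100.0, 99.0, 100.0, 98.5, 100.5, 101.0],
})

# Keep only rows whose label looks like "2019 Q1"
is_quarter = sample["Date"].str.fullmatch(r"\d{4} Q[1-4]")
quarterly_only = sample[is_quarter].copy()

# Convert "2019 Q1" -> "2019-Q1" -> quarter-start timestamp
labels = quarterly_only["Date"].str.replace(" Q", "-Q", regex=False)
quarterly_only["Date"] = pd.PeriodIndex(labels, freq="Q").to_timestamp()

print(quarterly_only)
```

Filtering before the conversion keeps one row per quarter, so later steps do not mix annual or monthly values into the quarterly series.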


In [7]:
# Fill the missing value by linear interpolation between neighbouring observations; this is reasonable because CPI typically changes gradually over time.
cpi['CPI_Inflation'] = cpi['CPI_Inflation'].interpolate()
print(cpi.isnull().sum())

Date             0
CPI_Inflation    0
dtype: int64
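
By default `interpolate()` is linear and treats rows as equally spaced, so a gap is filled with the straight-line value between its neighbours. A small sketch on made-up numbers (not values from the dataset) showing what that fill looks like:

```python
import numpy as np
import pandas as pd

# Made-up quarterly series with one missing observation
s = pd.Series(
    [110.0, np.nan, 114.0],
    index=pd.to_datetime(["2021-01-01", "2021-04-01", "2021-07-01"]),
)

filled = s.interpolate()  # default method="linear": the gap becomes the midpoint
print(filled)
```

If observations are unevenly spaced, `method="time"` on a `DatetimeIndex` weights the fill by elapsed time instead of by row position.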


In [8]:
# Save the cleaned CPI data to a new CSV file for further analysis
# (to_csv returns None when given a path, so the result is not assigned)
cpi.to_csv('cpi_cleaned.csv', index=False)

Loading the Official Bank Rate to be cleaned. 

In [9]:
boe_rate = pd.read_csv('../Dataset/Raw/Bank_rate_boe.csv')
boe_rate.head()
    

        Date  Official Bank Rate [a] [b] IUDBEDR
0  15-Dec-25                                 4.0
1  12-Dec-25                                 4.0
2  11-Dec-25                                 4.0
3  10-Dec-25                                 4.0
4  09-Dec-25                                 4.0


In [10]:
boe_rate.info() #Looking at the data types and missing values

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7823 entries, 0 to 7822
Data columns (total 2 columns):
 #   Column                                                       Non-Null Count  Dtype  
---  ------                                                       --------------  -----  
 0   Date                                                         7823 non-null   object 
 1   Official Bank Rate              [a] [b]             IUDBEDR  7823 non-null   float64
dtypes: float64(1), object(1)
memory usage: 122.4+ KB


In [11]:
# Rename the columns so that naming is consistent across the datasets

boe_rate.columns = ['Date', 'Bank_Rate']

In [12]:
boe_rate.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7823 entries, 0 to 7822
Data columns (total 2 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Date       7823 non-null   object 
 1   Bank_Rate  7823 non-null   float64
dtypes: float64(1), object(1)
memory usage: 122.4+ KB


In [13]:
# Convert Date to datetime and resample to quarterly frequency (quarter-start labels), consistent with the CPI and GDP data
# An explicit format (dates look like 15-Dec-25) avoids the format-inference warning from pd.to_datetime
boe_rate['Date'] = pd.to_datetime(boe_rate['Date'], format='%d-%b-%y')
boe_rate.set_index('Date', inplace=True)
boe_rate = boe_rate.resample('QS').mean()
print(boe_rate.head())


            Bank_Rate
Date                 
1995-01-01   6.453125
1995-04-01   6.625000
1995-07-01   6.625000
1995-10-01   6.581349
1996-01-01   6.125000
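
`resample('QS').mean()` groups the daily observations into calendar quarters labelled by their start date and averages them, so a quarter that contains a rate change gets a value between the old and the new rate. A sketch with made-up daily data (the rates and dates are illustrative) that contrasts the quarterly mean with the end-of-quarter value; the latter is an alternative convention, not what this notebook uses:

```python
import pandas as pd

# Made-up daily series: 0.75 up to 10 March 2020, then 0.10
days = pd.date_range("2020-01-01", "2020-06-30", freq="D")
rate = pd.Series(0.75, index=days)
rate.loc["2020-03-11":] = 0.10

quarterly_mean = rate.resample("QS").mean()  # average over each quarter
quarterly_last = rate.resample("QS").last()  # last observation in each quarter

print(pd.DataFrame({"mean": quarterly_mean, "last": quarterly_last}))
```

The mean reflects how long each rate was in force during the quarter, while the last value reflects the policy stance at quarter end; which one suits the analysis depends on the question being asked.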


In [18]:
# Restrict Bank Rate to 2019-2025, the scope of this project; the index is reset so that Date can be filtered as a column
boe_rate.reset_index(inplace=True)
boe_rate_filtered = boe_rate[(boe_rate['Date'] >= '2019-01-01') & (boe_rate['Date'] <= '2025-12-31')].copy()
print(boe_rate_filtered.head())
boe_rate_filtered.info()

          Date  Bank_Rate
96  2019-01-01   0.750000
97  2019-04-01   0.750000
98  2019-07-01   0.750000
99  2019-10-01   0.750000
100 2020-01-01   0.611719
<class 'pandas.core.frame.DataFrame'>
Index: 28 entries, 96 to 123
Data columns (total 2 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Date       28 non-null     datetime64[ns]
 1   Bank_Rate  28 non-null     float64       
dtypes: datetime64[ns](1), float64(1)
memory usage: 672.0 bytes


In [20]:
# setting Date as index
boe_rate_filtered.set_index('Date', inplace=True)

In [24]:
# The data are already quarterly; resampling again leaves the values unchanged and gives the index an explicit quarterly (QS) frequency
boe_rate_filtered = boe_rate_filtered.resample('QS').mean()
print(boe_rate_filtered.head())


            Bank_Rate
Date                 
2019-01-01   0.750000
2019-04-01   0.750000
2019-07-01   0.750000
2019-10-01   0.750000
2020-01-01   0.611719


In [26]:
boe_rate_filtered.info()


<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 28 entries, 2019-01-01 to 2025-10-01
Freq: QS-JAN
Data columns (total 1 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Bank_Rate  28 non-null     float64
dtypes: float64(1)
memory usage: 448.0 bytes


In [28]:
# Checking for missing values
print(boe_rate_filtered.isnull().sum())

Bank_Rate    0
dtype: int64


In [30]:
# Checking for missing quarterly periods
boe_rate_filtered = boe_rate_filtered.asfreq('QS')
print(boe_rate_filtered.isnull().sum())


Bank_Rate    0
dtype: int64
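
Because the frame was already produced by `resample('QS')`, every quarter in the range exists as a row, so `asfreq('QS')` cannot introduce new gaps here. A more explicit check is to compare the index with the full set of expected quarter starts. A sketch, assuming the 2019 Q1 to 2025 Q4 window used above; the toy frame and the name `missing_quarters` are illustrative:

```python
import pandas as pd

# Expected quarter starts for 2019 Q1 .. 2025 Q4 (7 years x 4 quarters = 28)
expected = pd.date_range("2019-01-01", "2025-10-01", freq="QS")

# Toy frame with one quarter deliberately dropped
toy = pd.DataFrame({"Bank_Rate": 0.75}, index=expected).drop(pd.Timestamp("2021-04-01"))

# Quarter starts that should be present but are not
missing_quarters = expected.difference(toy.index)
print(missing_quarters)
```

An empty result means the series covers every quarter in the window.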


In [32]:
# Print Bank Rate info
boe_rate_filtered.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 28 entries, 2019-01-01 to 2025-10-01
Freq: QS-JAN
Data columns (total 1 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   Bank_Rate  28 non-null     float64
dtypes: float64(1)
memory usage: 448.0 bytes


In [34]:
# Save the cleaned Bank Rate data to a new CSV file for further analysis
# (to_csv returns None when given a path, so the result is not assigned)
boe_rate_filtered.to_csv('boe_rate_cleaned.csv', index=True)
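
When a cleaned file is read back for analysis, the dates arrive as plain strings unless they are parsed again. A sketch of the round trip using a temporary folder and a made-up two-row frame (the file location and values are illustrative):

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up frame shaped like the cleaned Bank Rate data
frame = pd.DataFrame(
    {"Bank_Rate": [0.75, 0.75]},
    index=pd.DatetimeIndex(["2019-01-01", "2019-04-01"], name="Date"),
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "boe_rate_cleaned.csv"
    frame.to_csv(path, index=True)  # writes the Date index as the first column
    # Parse Date on load and restore it as the index
    reloaded = pd.read_csv(path, parse_dates=["Date"], index_col="Date")

print(reloaded.index.dtype)
```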

---

# Section 2

Section 2 content

---

NOTE

* You may add as many sections as you want, as long as it supports your project workflow.
* All notebook cells should run top-down: no step should require going back to an earlier cell to execute a task, such as refreshing the content of a variable.

---

# Push files to Repo

* In cases where you don't need to push files to Repo, you may replace this section with "Conclusions and Next Steps" and state your conclusions and next steps.

In [35]:
import os
#try:
  # create your folder here
  # os.makedirs(name='')
#except Exception as e:
 # print(e)
