# Predictive Time Series Model: Case Study of the NSE Kenya 20 Share Index

## 1. Business Understanding

### a) Problem Statement

The Nairobi Securities Exchange (NSE) is the main securities exchange in Kenya, providing a platform for trading a range of financial instruments including equities, bonds, and derivatives. The NSE 20 Share Index tracks the 20 largest and most actively traded stocks on the NSE and is designed as a benchmark for the performance of the Kenyan stock market.

The ability to predict stock prices is of great interest to investors, traders, and financial analysts, as it can help them make informed decisions about buying and selling securities. Accurate stock price predictions can also be useful for companies seeking to raise capital through stock offerings, as they can better understand the potential value of their shares.

The Breakfast Club Consultancy aims to use machine learning techniques to forecast the NSE 20 Share Index. Models will be trained on historical index data together with other economic and financial indicators, and their forecast accuracy will be evaluated before they are used for prediction. The goal is to give investors, traders, and financial analysts a tool for making more informed decisions about buying and selling securities on the NSE.

### b) Main Objective

To develop and deploy a predictive time series model that leverages machine learning techniques to accurately forecast the stock prices of the Kenya NSE 20 Share Index, taking into account market-specific factors and historical data.


### c) Specific Objectives

### d) Experimental Design
1. Data Collection
2. Read and check the data
3. Cleaning the data
4. Exploratory Data Analysis
5. Data modelling and model performance evaluation
6. Use the model to make predictions
7. Conclusions and Recommendations
8. Deploy the model

### e) Data Understanding
The data used in this project was downloaded from [Investing.com](https://www.investing.com/indices/kenya-nse-20-historical-data) and the [CBK website](https://www.centralbank.go.ke/inflation-rates/).

The NSE 20 dataset contains 4531 rows and 6 columns with the following information:

|No.|Column|Description|
|---|---|---|
|1|Date|Trading date|
|2|Price|Closing value of the index on that date|
|3|Open|Value of the index when the exchange opens for the day|
|4|High|Highest value the index reached on that date|
|5|Low|Lowest value the index reached on that date|
|6|Change %|Percentage change in the index value from the previous trading day|

The raw file also contains a Vol. column, which is empty in the rows displayed in Section 3 and is not listed above.


The datasets from the CBK website cover macroeconomic factors that may affect share prices. The following datasets have been downloaded:

1. Inflation rates: This has 219 rows and contains the percentage change in the monthly consumer price index (CPI).
2. Annual GDP: This has 23 rows and contains the Kenyan GDP from 2000 to 2021.
3. Quarterly GDP: This dataset is in wide format with 95 rows and 56 columns. It contains the quarterly Kenyan GDP from 2009 to 2021 and forecasted GDP for 2022.
4. Monthly exchange rate: This dataset has 362 rows and 31 columns and contains the average exchange rate per month for different currencies. Only the USD column is used in this project.
5. Central Bank Rate: This is spread across two datasets, one covering 2008-2023 and the other 1991-2016. Both contain the interest rate that the Central Bank of Kenya charges on loans to banks; the 1991-2016 file also includes repo, interbank, and Treasury bill rates.
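Because the index is recorded daily while the macroeconomic series are monthly, quarterly, or annual, the datasets need to be aligned on date before modelling. The following is a minimal sketch of one possible approach, not necessarily the one used later in this project: `pd.merge_asof` attaches to each trading day the most recent macro observation on or before that day. The `Date` column on the inflation frame is an assumption (the raw file has separate Year and Month columns), and the sample values are copied from the outputs shown in Section 3.

```python
import pandas as pd

# Daily index values (sample rows taken from the NSE 20 file).
prices = pd.DataFrame({
    "Date": pd.to_datetime(["2023-03-24", "2023-03-27", "2023-03-28"]),
    "Price": [1564.16, 1568.94, 1581.11],
})

# Monthly inflation, dated to the first day of each month (assumed convention).
inflation = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-01", "2023-02-01"]),
    "12-Month Inflation": [8.98, 9.23],
})

# Both frames must be sorted by the key. direction="backward" picks the
# latest inflation row whose Date is on or before each trading day.
merged = pd.merge_asof(prices, inflation, on="Date", direction="backward")
print(merged)
```

Macro figures are usually published with a delay, so a real pipeline may need to shift the macro dates forward to avoid using information that was not yet available on a given trading day.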

## 2. Importing Libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

## 3. Reading the Data

In [16]:
nse_20 = pd.read_csv("Datasets/Kenya NSE 20 Historical Data.csv")
nse_20.head()

| |Date|Price|Open|High|Low|Vol.|Change %|
|---|---|---|---|---|---|---|---|
|0|03/28/2023|1581.11|1581.11|1581.11|1581.11|NaN|0.78%|
|1|03/27/2023|1568.94|1568.94|1568.94|1568.94|NaN|0.31%|
|2|03/24/2023|1564.16|1564.16|1564.16|1564.16|NaN|0.34%|
|3|03/23/2023|1558.87|1558.87|1558.87|1558.87|NaN|1.14%|
|4|03/22/2023|1541.26|1541.26|1541.26|1541.26|NaN|0.96%|
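The Date and Change % columns in this output are text, and the rows run from newest to oldest. Below is a minimal sketch of one way to prepare such a frame for time series work, assuming the date strings are month/day/year as displayed. The sample rows are copied from the output above, and `tidy_index_frame` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd

# Three sample rows copied from the head() output; the real frame comes from read_csv.
sample = pd.DataFrame({
    "Date": ["03/28/2023", "03/27/2023", "03/24/2023"],
    "Price": [1581.11, 1568.94, 1564.16],
    "Change %": ["0.78%", "0.31%", "0.34%"],
})

def tidy_index_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Parse dates and percentage strings, then sort oldest-first (hypothetical helper)."""
    out = df.copy()
    # Dates are assumed to be MM/DD/YYYY, matching the displayed rows.
    out["Date"] = pd.to_datetime(out["Date"], format="%m/%d/%Y")
    # "0.78%" -> 0.78 (still expressed in percent, not as a fraction).
    out["Change %"] = out["Change %"].str.rstrip("%").astype(float)
    return out.sort_values("Date").set_index("Date")

tidy = tidy_index_frame(sample)
print(tidy)
```

Sorting oldest-first and indexing by date is a common convention for time series libraries; whether the later sections of this project rely on it is not stated in the source.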


In [3]:
annual_gdp = pd.read_csv("Datasets/Annual GDP.csv")
annual_gdp.head()

| |Year|Nominal GDP prices (Ksh Million)|Annual GDP growth (%)|Real GDP prices (Ksh Million)|
|---|---|---|---|---|
|0|2021|12098200|7.5|9391684|
|1|2020|10716034|-0.3|8735040|
|2|2019|10237727|5.1|8756946|
|3|2018|9340307|5.6|8330891|
|4|2017|8483396|3.8|7885521|


In [9]:
# skiprows=2 skips the two lines above the header row; the first column becomes the index
quarterly_gdp = pd.read_csv("Datasets/QuarterlyGDP.csv", skiprows=2, index_col=[0])
quarterly_gdp.head()

| |NEW QUARTERLY ESTIMATES - After rebasing|2009|Unnamed: 3|Unnamed: 4|Unnamed: 5|2010|Unnamed: 7|Unnamed: 8|Unnamed: 9|2011|...|Unnamed: 47|Unnamed: 48|Unnamed: 49|2021|Unnamed: 51|Unnamed: 52|Unnamed: 53|2022*|Unnamed: 55|Unnamed: 56|
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|NaN|Quarter|Q1|Q2|Q3|Q4|Q1|Q2|Q3|Q4|Q1|...|Q2|Q3|Q4|Q1|Q2|Q3|Q4|Q1|Q2|Q3|
|NaN|Agriculture|312740|319490|277360|267871|342859|363316|296407|296749|354088|...|496572|354080|391968|465677|494225|356169|387264|462271|487229|354111|
|NaN|Mining & Quarrying|9316|10046|10562|11791|11546|13013|13704|14115|13542|...|22528|20260|22167|25463|24983|23576|29805|31515|30624|23048|
|NaN|Manufacturing|126473|127715|134156|149223|133161|131723|142909|159650|143083|...|178073|180176|201024|199445|198150|198496|210910|206804|205209|203342|
|NaN|Electricity & water supply|35149|35483|36352|36745|37304|36847|39997|39821|39860|...|51453|55635|57561|56879|55153|59194|59184|57982|57916|61969|
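In this wide layout the year appears only above the first quarter of each year, and the quarter labels sit in the first data row. Below is a minimal sketch of one way to reshape such a frame into long format (one row per sector, year, and quarter). The small `wide` frame mimics the layout above using values from the output, and `quarterly_to_long` and the output column names are hypothetical, not part of the original notebook.

```python
import pandas as pd

# Stand-in with the same layout as the head() output above (two sectors, six quarters).
wide = pd.DataFrame(
    [
        ["Quarter", "Q1", "Q2", "Q3", "Q4", "Q1", "Q2"],
        ["Agriculture", 312740, 319490, 277360, 267871, 342859, 363316],
        ["Manufacturing", 126473, 127715, 134156, 149223, 133161, 131723],
    ],
    columns=["NEW QUARTERLY ESTIMATES - After rebasing", "2009", "Unnamed: 3",
             "Unnamed: 4", "Unnamed: 5", "2010", "Unnamed: 7"],
)

def quarterly_to_long(df: pd.DataFrame) -> pd.DataFrame:
    """Reshape the wide quarterly table to long format (hypothetical helper)."""
    labels = pd.Series(df.columns[1:], dtype="object")
    # Blank out the "Unnamed: n" headers, carry each year forward over them,
    # and drop the trailing "*" that marks the 2022 column.
    years = labels.mask(labels.str.startswith("Unnamed")).ffill().str.rstrip("*")
    quarters = df.iloc[0, 1:].tolist()  # first data row holds Q1..Q4

    body = df.iloc[1:].copy()
    body.columns = ["Sector"] + [f"{y} {q}" for y, q in zip(years, quarters)]

    long = body.melt(id_vars="Sector", var_name="Period", value_name="GDP")
    long[["Year", "Quarter"]] = long["Period"].str.split(" ", expand=True)
    long["Year"] = long["Year"].astype(int)
    long["GDP"] = pd.to_numeric(long["GDP"])
    return long[["Sector", "Year", "Quarter", "GDP"]]

long_gdp = quarterly_to_long(wide)
print(long_gdp.head())
```

The sketch assumes every value column is either a year label or an "Unnamed" placeholder and that sector rows contain only numbers; any notes or total rows in the full file would need to be filtered out first.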


In [11]:
inflation = pd.read_csv("Datasets/Inflation Rates.csv")
inflation.head()

| |Year|Month|Annual Average Inflation|12-Month Inflation|
|---|---|---|---|---|
|0|2023|February|8.3|9.23|
|1|2023|January|7.95|8.98|
|2|2022|December|7.66|9.06|
|3|2022|November|7.38|9.48|
|4|2022|October|7.48|9.59|
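The inflation file stores the period as a numeric Year and a month name rather than a single date. A minimal sketch of one way to build a monthly date index from the two columns, assuming English month names as displayed and dating each observation to the first day of its month (an assumed convention); the sample rows are copied from the output above.

```python
import pandas as pd

# Sample rows copied from the head() output above.
sample = pd.DataFrame({
    "Year": [2023, 2023, 2022],
    "Month": ["February", "January", "December"],
    "12-Month Inflation": [9.23, 8.98, 9.06],
})

# "2023 February" -> 2023-02-01; %B matches full English month names.
dates = pd.to_datetime(
    sample["Year"].astype(str) + " " + sample["Month"].str.strip(),
    format="%Y %B",
)

monthly_inflation = (
    sample.assign(Date=dates)
    .sort_values("Date")          # oldest first
    .set_index("Date")
    .drop(columns=["Year", "Month"])
)
print(monthly_inflation)
```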


In [12]:
cbk_rate_to2023 = pd.read_csv("Datasets/Central Bank Rate (2008-2023) .csv")
cbk_rate_to2023.head()

| |Date|Rate|
|---|---|---|
|0|29/03/2023|9.5|
|1|30/01/2023|8.75|
|2|23/11/2022|8.75|
|3|29/09/2022|8.25|
|4|27/07/2022|7.5|


In [13]:
cbk_rate_to2016 = pd.read_csv("Datasets/Central Bank Rates (1991-2016).csv")
cbk_rate_to2016.head()

| |Year|Month|Repo|Reverse Repo|Interbank Rate|91-Day T-bill|182-Day T-bill|364-Day T-bill|Cash Reserve Requirement|Central Bank Rate|
|---|---|---|---|---|---|---|---|---|---|---|
|0|2016|July|9.76|10.57|5.88|6.16|9.79|10.88|5.25|10.5|
|1|2016|June|10.04|10.59|4.56|7.25|9.56|10.84|5.25|10.5|
|2|2016|May|6.0|11.55|3.82|8.15|10.25|11.6|5.25|10.5|
|3|2016|April|5.23|12.49|4.01|8.92|10.87|11.84|5.25|11.5|
|4|2016|March|4.31|11.63|4.1|8.72|10.83|12.26|5.25|11.5|
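The two Central Bank Rate files use different layouts: the 2008-2023 file has one row per dated rate entry (day/month/year), while the 1991-2016 file has one row per month. Below is a minimal sketch of one way to combine them into a single monthly series, under the assumptions that the dates are day-first as displayed and that a rate stays in force until the next entry. The sample rows are copied from the two outputs above, and the variable names are illustrative.

```python
import pandas as pd

# Sample rows from the 1991-2016 file (monthly layout).
old = pd.DataFrame({
    "Year": [2016, 2016, 2016],
    "Month": ["July", "June", "May"],
    "Central Bank Rate": [10.5, 10.5, 10.5],
})
# Sample rows from the 2008-2023 file (dated entries, DD/MM/YYYY).
new = pd.DataFrame({
    "Date": ["29/03/2023", "30/01/2023", "23/11/2022"],
    "Rate": [9.5, 8.75, 8.75],
})

old_dates = pd.to_datetime(old["Year"].astype(str) + " " + old["Month"], format="%Y %B")
old_series = pd.Series(old["Central Bank Rate"].to_numpy(), index=pd.DatetimeIndex(old_dates))

new_dates = pd.to_datetime(new["Date"], format="%d/%m/%Y")
new_series = pd.Series(new["Rate"].to_numpy(), index=pd.DatetimeIndex(new_dates))

combined = pd.concat([old_series, new_series]).sort_index()

# One value per month: the latest entry in that month, carried forward
# through months with no entry (assumes the rate is unchanged in between).
cbr_monthly = combined.resample("MS").last().ffill().rename("Central Bank Rate")
print(cbr_monthly.tail())
```

With only these sample rows the forward fill spans the gap between 2016 and 2022, which is an artifact of the sample. Taking the last entry in a month is also a simplification, since a mid-month change is then applied to the whole month.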


In [15]:
# skiprows=1 skips the line above the header row
exchange_rate = pd.read_csv("Datasets/Monthly Exchange rate (period average).csv", skiprows=1)
exchange_rate.head()

| |Year|Month|United States dollar|Sterling pound|Euro|South Africa Rand|Uganda shilling\2|Tanzania shilling\2|Rwanda Franc|Burundi Franc|...|Danish kroner|Austrian schilling|Finn marka|Spanish peseta|Indian rupee|Hong kong dollar|Singapore dollar|Saudi riyal|Chinese Yuan|Australian dollar|
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|0|1993|1|36.23|55.62|NaN|NaN|NaN|NaN|NaN|NaN|...|5.74|3.19|6.71|31.64|1.4|NaN|NaN|NaN|NaN|NaN|
|1|1993|2|36.56|52.68|NaN|NaN|NaN|NaN|NaN|NaN|...|5.78|3.17|6.29|31.16|1.41|NaN|NaN|NaN|NaN|NaN|
|2|1993|3|43.12|62.92|NaN|NaN|NaN|NaN|NaN|NaN|...|6.78|3.72|7.21|36.62|1.44|NaN|NaN|NaN|NaN|NaN|
|3|1993|4|51.88|80.34|NaN|NaN|NaN|NaN|NaN|NaN|...|8.48|4.63|9.47|45.9|1.65|NaN|NaN|NaN|NaN|NaN|
|4|1993|5|62.16|96.38|NaN|NaN|NaN|NaN|NaN|NaN|...|10.08|5.5|11.33|51.43|1.98|NaN|NaN|NaN|NaN|NaN|


In [17]:
exchange_rate.shape

(362, 31)
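Only the United States dollar column of these 31 is needed for this project. A minimal sketch of extracting it as a dated monthly series, assuming Month is the numeric month as displayed and dating each average to the first day of its month (an assumed convention); the sample rows are copied from the head() output above and the series name is illustrative.

```python
import pandas as pd

# Sample rows copied from the exchange rate output above.
sample = pd.DataFrame({
    "Year": [1993, 1993, 1993],
    "Month": [1, 2, 3],
    "United States dollar": [36.23, 36.56, 43.12],
    "Sterling pound": [55.62, 52.68, 62.92],
})

# pd.to_datetime can assemble dates from columns named year, month, day.
parts = sample[["Year", "Month"]].rename(columns=str.lower).assign(day=1)
dates = pd.to_datetime(parts)

usd_rate = pd.Series(
    sample["United States dollar"].to_numpy(),
    index=pd.DatetimeIndex(dates),
    name="USD",  # illustrative name for the shilling-per-dollar series
).sort_index()
print(usd_rate)
```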

## 4. Checking the Data

## 5. Tidying the Dataset

## 6. Exploratory Data Analysis

## 7. Data Modelling