# 📈 Time Series Forecasting: Real GDP of Kenya
This notebook follows the complete time series forecasting process as outlined in the Capstone Project Guide. The target indicator is **Real GDP (constant 2015 US$)** for Kenya, sourced via the **World Bank API**.


## 🌐 Data Acquisition
We fetch Real GDP data from the World Bank API (`NY.GDP.MKTP.KD`).

In [None]:
import pandas as pd
import requests

# API endpoint for Real GDP (constant 2015 US$)
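url = 'https://api.worldbank.org/v2/country/KE/indicator/NY.GDP.MKTP.KD?format=json&per_page=1000'
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early on HTTP errors
data = response.json()
records = data[1]  # element 0 is paging metadata; element 1 holds the observations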

# Convert to DataFrame
df = pd.DataFrame(records)[['date', 'value']].dropna()
df.columns = ['Year', 'Real_GDP']
df['Year'] = pd.to_datetime(df['Year'], format='%Y')
df.set_index('Year', inplace=True)
df.sort_index(inplace=True)
df.head()

## 🧹 Preprocessing
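The acquisition cell above already performs the preprocessing steps:
- Convert `Year` to datetime format
- Set it as the index and sort chronologically
- Drop rows with missing values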
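
The steps above can be sketched as a small reusable helper that works on an already-downloaded list of records, which makes the cleaning logic testable without a network call. The function name `records_to_frame` and the sample records are hypothetical; the values are placeholders, not real GDP figures.

```python
import pandas as pd

def records_to_frame(records):
    """Turn World Bank-style records into a year-indexed Real_GDP frame."""
    df = (
        pd.DataFrame(records)[["date", "value"]]
        .dropna(subset=["value"])  # drop years with no reported value
        .rename(columns={"date": "Year", "value": "Real_GDP"})
    )
    df["Year"] = pd.to_datetime(df["Year"], format="%Y")
    return df.set_index("Year").sort_index()

# Placeholder records for illustration only (not real GDP figures).
sample_records = [
    {"date": "2002", "value": 3.0},
    {"date": "2001", "value": None},
    {"date": "2000", "value": 1.0},
]
clean = records_to_frame(sample_records)
print(clean)
```

The API returns the most recent year first, which is why the helper sorts the index after parsing the dates.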

## 📊 Exploratory Data Analysis (EDA)
We explore trends, check for stationarity, and visualize components.

In [None]:
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller

# Plot GDP trend
df.plot(figsize=(12, 6), title='Kenya Real GDP (constant 2015 US$)')
plt.grid(True)
plt.show()

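# Decompose into trend, seasonal, and residual components.
# Note: the data is annual, so there is no within-year seasonality;
# period=4 only looks for a repeating 4-year pattern.
decomposition = seasonal_decompose(df['Real_GDP'], model='additive', period=4)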
decomposition.plot()
plt.tight_layout()
plt.show()

# ADF test (null hypothesis: the series has a unit root, i.e. is non-stationary)
result = adfuller(df['Real_GDP'])
print("ADF Statistic:", result[0])
print("p-value:", result[1])
for key, value in result[4].items():
    print(f"Critical Value ({key}): {value}")
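
To read the ADF output printed above, a small helper along these lines may be useful. It is a sketch that only interprets the tuple `adfuller` returns (statistic, p-value, lags used, observations, critical values, information criterion); the name `summarize_adf`, the 0.05 threshold, and the example tuples are assumptions made for illustration, not results from this dataset.

```python
def summarize_adf(result, alpha=0.05):
    """Interpret the tuple returned by statsmodels' adfuller()."""
    statistic, p_value, critical_values = result[0], result[1], result[4]
    return {
        "statistic": statistic,
        "p_value": p_value,
        # ADF null hypothesis: the series has a unit root (is non-stationary).
        "reject_unit_root": p_value < alpha,
        # Significance levels whose critical value the statistic falls below.
        "levels_passed": [lvl for lvl, cv in critical_values.items() if statistic < cv],
    }

# Hand-written, hypothetical adfuller-style outputs (not computed from real data).
non_stationary = (-1.2, 0.67, 1, 50, {"1%": -3.57, "5%": -2.92, "10%": -2.60}, 0.0)
stationary = (-4.0, 0.001, 0, 50, {"1%": -3.57, "5%": -2.92, "10%": -2.60}, 0.0)

print(summarize_adf(non_stationary))
print(summarize_adf(stationary))
```

If `reject_unit_root` is false for the level series, differencing is the usual next step.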

## 🔁 Differencing to Achieve Stationarity (if required)

In [None]:
df_diff = df.diff().dropna()
df_diff.plot(figsize=(12, 5), title='First-order Differenced GDP')
plt.grid(True)
plt.show()
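
As a worked illustration of what first-order differencing does, the sketch below differences a toy series and then rebuilds the original levels with a cumulative sum. The values are placeholders, not real GDP figures.

```python
import pandas as pd

# Toy series with placeholder values (not real GDP figures).
levels = pd.Series(
    [100.0, 104.0, 110.0, 113.0],
    index=pd.date_range("2000-01-01", periods=4, freq="YS"),
)

# First difference: y_t - y_{t-1}; the first observation is lost.
diffed = levels.diff().dropna()

# Undo the differencing by accumulating the changes from the first level.
rebuilt = diffed.cumsum() + levels.iloc[0]

print(diffed.tolist())
print(rebuilt.tolist())
```

An ARIMA model with `d=1` applies this differencing internally, so it is fitted on the undifferenced series.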

## 🤖 ARIMA Forecasting

In [None]:
from statsmodels.tsa.arima.model import ARIMA

model_arima = ARIMA(df['Real_GDP'], order=(1, 1, 1))
model_fit = model_arima.fit()

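forecast_steps = 10
forecast_arima = model_fit.forecast(steps=forecast_steps)
# Observations are stamped on 1 January, so build the horizon with year-start ('YS') dates
forecast_index = pd.date_range(start=df.index[-1] + pd.DateOffset(years=1), periods=forecast_steps, freq='YS')

# Pass raw values: a Series would be re-aligned to forecast_index and could turn into NaN
forecast_df = pd.DataFrame({'Forecasted GDP': forecast_arima.to_numpy()}, index=forecast_index)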

plt.figure(figsize=(12, 6))
plt.plot(df, label='Historical GDP')
plt.plot(forecast_df, label='ARIMA Forecast', color='red')
plt.title('ARIMA GDP Forecast')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
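
One way to judge a forecast like the one above is to hold out the last few observations and compare the model's error with a simple baseline. The sketch below is not ARIMA: it uses a naive random-walk-with-drift baseline and the mean absolute percentage error (MAPE), with hypothetical helper names and placeholder values rather than real GDP figures.

```python
import numpy as np

def drift_forecast(train, steps):
    """Naive baseline: extend the average historical change per period."""
    train = np.asarray(train, dtype=float)
    drift = (train[-1] - train[0]) / (len(train) - 1)
    return train[-1] + drift * np.arange(1, steps + 1)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

# Placeholder series (not real GDP figures); hold out the last 3 points.
series = np.array([100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0])
train, test = series[:-3], series[-3:]

baseline = drift_forecast(train, steps=len(test))
print(baseline, mape(test, baseline))
```

To compare against ARIMA, one would fit the model on the training portion only and pass its forecast for the held-out years to `mape`; a model that cannot beat the drift baseline is probably not adding much.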

## 🔮 Prophet Forecasting

In [None]:
from prophet import Prophet

df_prophet = df.reset_index().rename(columns={'Year': 'ds', 'Real_GDP': 'y'})
model = Prophet()
model.fit(df_prophet)

# Use year-start ('YS') dates so future rows match the 1 January timestamps of the history
future = model.make_future_dataframe(periods=10, freq='YS')
forecast = model.predict(future)

fig = model.plot(forecast)
plt.title("Prophet GDP Forecast")
plt.show()

## 🌐 Streamlit Integration (Optional)
This notebook can be adapted into a Streamlit app that lets users choose the forecast horizon and explore the results through interactive plots.

## ✅ Conclusion
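- Real GDP shows consistent growth over the decades.
- Both ARIMA(1, 1, 1) and Prophet produce forecasts that look plausible on the plots, but this notebook does not evaluate either model against held-out data.
- Further tuning (for example, of the ARIMA order) and analysis of the economic context are recommended.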