<a href="https://colab.research.google.com/github/loyedan/ELAIS-QST-Mini-Project_Assignments/blob/main/Forecasting_the_Health_of_Ghana's_Economy_Using_Macroeconomic_Indicators.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Forecasting the Health of Ghana's Economy Using Macroeconomic Indicators

# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA  # statsmodels.tsa.arima_model was removed in statsmodels 0.13
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import statsmodels.api as sm

# Set visualization aesthetics
sns.set(style='whitegrid')

# Section 1: Data Collection and Preprocessing

# 1.1 Load the dataset
# The demo below generates random data. To use real data, load a CSV file
# containing the macroeconomic indicators instead, for example:
# df = pd.read_csv('path_to_your_data.csv')

# Sample data generation for demo purposes
np.random.seed(0)
dates = pd.date_range(start='2000-01-01', periods=100, freq='Q')
data = {
    'Date': dates,
    'GDP': np.random.normal(5, 0.5, size=len(dates)),
    'Inflation': np.random.normal(10, 2, size=len(dates)),
    'Unemployment': np.random.normal(7, 0.3, size=len(dates)),
    'Interest_Rate': np.random.normal(15, 1, size=len(dates)),
    'Exchange_Rate': np.random.normal(1, 0.1, size=len(dates)),
    'Govt_Debt': np.random.normal(70, 5, size=len(dates))
}

df = pd.DataFrame(data)
df.set_index('Date', inplace=True)

# Display the first few rows of the data
# (print is needed because only the last expression in a cell is displayed)
print(df.head())

# 1.2 Check for missing values
print(df.isnull().sum())

# Section 2: Exploratory Data Analysis (EDA)

# 2.1 Plot the time series data for each macroeconomic indicator
fig, axes = plt.subplots(nrows=3, ncols=2, figsize=(14, 10))

df['GDP'].plot(ax=axes[0, 0], title='GDP Growth (%)', color='blue')
df['Inflation'].plot(ax=axes[0, 1], title='Inflation Rate (%)', color='green')
df['Unemployment'].plot(ax=axes[1, 0], title='Unemployment Rate (%)', color='red')
df['Interest_Rate'].plot(ax=axes[1, 1], title='Interest Rate (%)', color='purple')
df['Exchange_Rate'].plot(ax=axes[2, 0], title='Exchange Rate (GHS/USD)', color='orange')
df['Govt_Debt'].plot(ax=axes[2, 1], title='Government Debt (% of GDP)', color='brown')

plt.tight_layout()
plt.show()

# 2.2 Check for stationarity using Augmented Dickey-Fuller (ADF) Test
def adf_test(series):
    result = adfuller(series, autolag='AIC')
    print(f'ADF Statistic: {result[0]}')
    print(f'p-value: {result[1]}')
    if result[1] <= 0.05:
        print("Reject the null hypothesis (stationary)")
    else:
        print("Fail to reject the null hypothesis (non-stationary)")

# Test stationarity for each macroeconomic indicator
for column in df.columns:
    print(f"\nStationarity Test for {column}:")
    adf_test(df[column])

# Section 3: Forecasting Model Development (ARIMA)

# 3.1 Difference the data to make it stationary (if required)
df_diff = df.diff().dropna()

# 3.2 Split the data into training and testing sets
train_data, test_data = train_test_split(df_diff, test_size=0.2, shuffle=False)

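# 3.3 Build an ARIMA model for one indicator (GDP, as an example)
# train_data is already first-differenced (step 3.1), so d=0 here;
# order=(1, 1, 1) would difference the series a second time.
model = ARIMA(train_data['GDP'], order=(1, 0, 1))
model_fit = model.fit()  # the current ARIMA.fit() has no `disp` argument
print(model_fit.summary())

# 3.4 Forecasting
# forecast() returns the point forecasts directly in the current API
forecast = model_fit.forecast(steps=len(test_data))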

# 3.5 Evaluate the model's accuracy (using Mean Squared Error)
mse = mean_squared_error(test_data['GDP'], forecast)
print(f'Mean Squared Error: {mse}')

# Plot actual vs forecasted values
plt.figure(figsize=(10, 6))
plt.plot(test_data.index, test_data['GDP'], label='Actual')
plt.plot(test_data.index, forecast, label='Forecast', color='red')
plt.title('GDP Forecast vs Actual (First Differences)')
plt.legend()
plt.show()

# Section 4: Model for Other Indicators

# You can apply similar steps for Inflation, Unemployment, etc.
# Build ARIMA models for all other macroeconomic indicators and forecast them.

# Example for Inflation (train_data is already differenced, so d=0)
model_inflation = ARIMA(train_data['Inflation'], order=(1, 0, 1))
model_fit_inflation = model_inflation.fit()
forecast_inflation = model_fit_inflation.forecast(steps=len(test_data))

# Plot Inflation Forecast
plt.figure(figsize=(10, 6))
plt.plot(test_data.index, test_data['Inflation'], label='Actual')
plt.plot(test_data.index, forecast_inflation, label='Forecast', color='green')
plt.title('Inflation Forecast vs Actual (First Differences)')
plt.legend()
plt.show()

# Section 5: Conclusion and Policy Recommendations

# 5.1 Policy Implications
# Based on the forecasts, provide recommendations for policymakers on handling future trends
# For instance, what measures can be taken if inflation is expected to rise?

# 5.2 Conclusion
# Summarize key findings, model performance, and potential policy responses.


Explanation of Notebook Sections:

Data Collection and Preprocessing:
Replace the sample data with actual macroeconomic data from reliable sources.
Preprocess the data (check for missing values, clean the data, etc.).

Exploratory Data Analysis (EDA):
Visualize the historical trends of macroeconomic indicators like GDP, inflation, etc.
Check for stationarity using the Augmented Dickey-Fuller (ADF) test.

Forecasting Model Development:
Use ARIMA models to predict future values of each macroeconomic indicator.
Evaluate the model using metrics like Mean Squared Error (MSE).

Forecast Other Indicators:
Repeat the same process (ARIMA) for other key indicators, e.g., inflation, unemployment, exchange rates, etc.

Conclusions and Recommendations:
Analyze the results and suggest policy implications based on the forecasts.

This notebook provides a structured way to forecast the health of Ghana's economy using time series models. You can extend it by incorporating more advanced machine learning models or by applying it to other countries or datasets.
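
Because the ARIMA models in Section 3 are fitted to `df.diff()`, their forecasts are quarter-to-quarter changes, not levels. The sketch below shows one way to convert such forecasts back to levels by cumulatively summing them onto the last observed training level. The function name `undo_difference` and the toy numbers are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def undo_difference(last_level, diff_forecast):
    """Rebuild level forecasts from first-difference forecasts.

    level[t] = last_level + sum of the forecast changes up to step t.
    """
    diff_forecast = pd.Series(diff_forecast, dtype=float)
    return last_level + diff_forecast.cumsum()


# Toy example: the last observed level is 5.0 and the model predicts
# changes of +0.2, -0.1, and +0.3 over the next three periods.
levels = undo_difference(5.0, [0.2, -0.1, 0.3])
print(levels.round(2).tolist())  # [5.2, 5.1, 5.4]
```

In the notebook, `last_level` would be the level at the final training date, for example `df['GDP'].loc[train_data.index[-1]]`, and `diff_forecast` would be the GDP forecast.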

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, coint
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.api import VAR

# Replace with actual data paths
data_path = "your_data_path.csv"

# Load data
df = pd.read_csv(data_path)

# Explore the data
print(df.head())
print(df.describe())

# Visualize the data
plt.figure(figsize=(12, 8))
plt.plot(df['GDP_Growth'])
plt.title('GDP Growth')
plt.show()

# Check for stationarity using ADF test
result = adfuller(df['GDP_Growth'])
print('ADF Statistic:', result[0])
print('p-value:', result[1])
print('Critical Values:')
for key, value in result[4].items():
    print('\t%s: %.3f' % (key, value))

# If data is non-stationary, try differencing or other transformations

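# Example: Engle-Granger cointegration test (if applicable)
# coint() returns the test statistic, the p-value, and the critical values
coint_stat, coint_pvalue, coint_crit = coint(df['GDP_Growth'], df['Inflation'])
print('Cointegration statistic:', coint_stat)
print('p-value:', coint_pvalue)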

# Model selection and forecasting
# Example: ARIMA model
model = ARIMA(df['GDP_Growth'], order=(1, 1, 1))
model_fit = model.fit()
forecast = model_fit.forecast(steps=12)
print(forecast)

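# Example: VAR model
var_data = df[['GDP_Growth', 'Inflation']]
model = VAR(endog=var_data)
model_fit = model.fit()
# VARResults.forecast() needs the last k_ar observations as its starting point
forecast = model_fit.forecast(var_data.values[-model_fit.k_ar:], steps=12)
print(forecast)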

# Evaluate model performance using metrics like MSE, RMSE, MAE, and MAPE

Explanation:
Import necessary libraries: Pandas for data manipulation, NumPy for numerical operations, Matplotlib for visualization, Statsmodels for statistical analysis and time series modeling.
Load data: Replace your_data_path.csv with the actual path to your CSV file containing the macroeconomic indicators.
Explore the data: Get a basic understanding of the data using head() and describe().
Visualize the data: Create plots to visualize trends and patterns in the data.
Check for stationarity: Use the Augmented Dickey-Fuller test to determine if the data is stationary.
Cointegration test (if applicable): If the data is non-stationary, check for cointegration relationships between variables.
Model selection and forecasting: Choose appropriate models (ARIMA, VAR, or others) based on the data characteristics and forecasting objectives. Fit the models and generate forecasts.
Evaluate model performance: Assess the accuracy of the forecasts using relevant metrics.
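
The four metrics named in the evaluation step can be computed with a few lines of NumPy. The sketch below is one possible version; the function name `forecast_metrics` and the sample values are illustrative assumptions. MAPE divides by the actual values, so it is undefined when any actual value is zero, which matters for differenced series that fluctuate around zero.

```python
import numpy as np


def forecast_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, and MAPE (in percent) for a forecast."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    mse = np.mean(errors ** 2)
    return {
        'MSE': mse,
        'RMSE': np.sqrt(mse),
        'MAE': np.mean(np.abs(errors)),
        'MAPE': np.mean(np.abs(errors / actual)) * 100,
    }


# Made-up actual and forecast values, for illustration only
metrics = forecast_metrics([10.0, 20.0, 40.0], [12.0, 18.0, 44.0])
for name, value in metrics.items():
    print(f'{name}: {value:.3f}')
```

RMSE and MAE are in the same units as the series, which usually makes them easier to interpret than MSE when comparing models for a single indicator.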
Note: This is a basic example. You may need to experiment with different models, parameter settings, and data transformations to find the best-performing model for your specific forecasting task. Additionally, consider incorporating external factors or events that might influence the economy into your analysis.