# Project report

## Introduction
As residents of one of the richest countries in the world, Norwegians have long been spared serious economic worry. However, recent renewable energy trends, a global pandemic, and an all-time high cost of living have shaken that confidence, and the oil that once anchored our expectations of financial security no longer adequately shields us from an uncertain future. Although this future will inevitably arrive, we may choose to meet it with realistic expectations and a strong empirical foundation. But how? The NOK exchange rate continues to fall, but its decline may reveal important information and serve as a barometer for the general state of the Norwegian economy.

How do macroeconomic factors influence the Norwegian Krone (NOK) exchange rate? This study aims to analyze the impact of key macroeconomic indicators such as interest rates, inflation, oil and gas prices, and GDP growth on the NOK exchange rate. The research encompasses highly relevant and timely topics and aligns well with the curriculum of TIØ4317 on multivariate time series models.


## Foundation and scope
This report aims to explore how various macroeconomic factors influence the NOK/EUR exchange rate over time. This is accomplished by modeling the exchange rate as the effect variable of several chosen macroeconomic causal variables.
A model may only be as accurate as its foundation. A key indicator of a model’s predictive power is its choice of variables. The NOK exchange rate is a complex entity, but to avoid going beyond the scope of the research question, the number of variables must be limited. After thorough consideration, the following have been selected as the most relevant variables to base the model on.
> •	**Interest rates:** Interest rates are governed by monetary policy, which also impacts currency exchange rates. Increases in interest rates, ceteris paribus, will provide better returns on government and corporate bonds and make these a more attractive opportunity for foreign investors. To comply with the law of one price, these effects must carry over into exchange rates to prevent arbitrage opportunities.

> •	**Trade balance:** As the global economy grows increasingly interdependent, a well-functioning trade system is vital to any economy. Fluctuations in exchange rates affect the willingness to import and export goods, and vice versa; if a country’s goods are in high demand, the demand for that country’s currency will increase correspondingly.

> •	**Oil prices:** The oil industry has historically had a significant positive impact on Norway’s economic trajectory, and it continues to do so today, though to a smaller degree. Oil has put Norway on the economic world map and is therefore an obvious inclusion among the selected variables.

> •	**Gas prices:** Like oil, gas is a staple among Norwegian export articles. Therefore, it is reasonable to assume that fluctuations in gas price will affect the Norwegian currency. Both oil and gas, being vital to Norway’s trade balance, will furthermore have an inherent correlation with the exchange rate. This symbiotic relationship was exemplified during Russia’s invasion of Ukraine and the consequent boycott of Russian oil.

> •	**GDP growth:** The gross domestic product (GDP) is a measure of the strength of an economy, while the exchange rate measures the strength of an economy in relation to others. While these need not necessarily correlate, their relationship would be interesting to investigate.

> •	**CPI growth:** Although changes in the consumer price index (CPI) may not directly translate to changes in exchange rates, market microstructures and trends will accumulate and be reflected in the overall state of the economy, exchange rates included.

## Theory
Time series analysis is the foundation upon which modern quantitative finance is built. Multivariate time series models give us key insights into how financial markets move over time, as well as the predictive power to use as a basis for investment decisions. 

The Autoregressive Integrated Moving Average (ARIMA) model is a unit-root nonstationary model with strong memory, because the coefficients of its MA representation do not decay to zero over time. What separates an ARIMA model from a standard ARMA model is differencing, the act of transforming a nonstationary series into a stationary one. $c_t = y_t - y_{t-1}$ is known as the first (d=1) differenced series of $y_t$. The ARIMA(p,d,q) model comprises three terms: p (previous values), the AR term; d (differencing), the I term; and q (previous errors), the MA term. These terms interact to produce the model, which may generally be expressed as

> $x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}$

Where $x_t$ is the $d$ times differenced series $y_t$: 

> $ x_t = \Delta^d y_t $

*   $\phi_1, \dots, \phi_p$: Autoregressive coefficients
*   $\theta_1, \dots, \theta_q$: Moving average coefficients
*   $\epsilon_t$: Error term
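
To make the differencing step concrete, the following minimal sketch applies $x_t = \Delta^d y_t$ to a short list of numbers. The function name `difference` and the sample values are illustrative assumptions, not part of the report's code or data.

```python
def difference(series, d=1):
    """Apply first differencing d times, i.e. x_t = y_t - y_{t-1} repeated d times."""
    values = list(series)
    for _ in range(d):
        # Each pass shortens the series by one observation
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Hypothetical monthly exchange-rate levels, for illustration only
y = [10.0, 10.4, 10.3, 10.9, 11.2]

x1 = difference(y, d=1)  # first differences (d=1)
x2 = difference(y, d=2)  # second differences (d=2)

print(x1)
print(x2)
```

With pandas, the same transformation could be obtained by calling `Series.diff()` d times and dropping the leading missing values.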

To validate the residuals as white noise (and investigate the possibility of serial correlation), the Ljung-Box test is an appropriate tool. The test statistic may be formulated mathematically as follows:

> $Q = T(T+2) \sum \limits _{k=1} ^{h} \frac{\hat{\rho}_k^2}{T-k}$

Where $h$ is the chosen maximum lag, $T$ is the number of observations and $\hat{\rho}_k$ is the sample autocorrelation at lag $k$. This yields $Q$, which under the null hypothesis of no serial correlation is asymptotically chi-square distributed with $h$ degrees of freedom [1].
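
The statistic can be computed directly from its definition. The sketch below is a plain-Python illustration under stated assumptions: the helper names `sample_autocorrelation` and `ljung_box_q` and the alternating test series are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def sample_autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denominator = sum((value - mean) ** 2 for value in x)
    numerator = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return numerator / denominator


def ljung_box_q(x, h):
    """Ljung-Box statistic Q = T(T+2) * sum_{k=1..h} rho_k^2 / (T-k)."""
    T = len(x)
    return T * (T + 2) * sum(
        sample_autocorrelation(x, k) ** 2 / (T - k) for k in range(1, h + 1)
    )


# Hypothetical, strongly autocorrelated series: +1, -1, +1, -1, ...
alternating = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]
q_alternating = ljung_box_q(alternating, h=1)

# 3.841 is the 5% critical value of a chi-square distribution with 1 degree of freedom
print(q_alternating, q_alternating > 3.841)
```

For this series the lag-1 autocorrelation is close to -1, so $Q$ lies far above the critical value and the white-noise hypothesis would be rejected.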

## Data
Data is the alpha and omega of any model, as it provides the necessary empirical foundation. The analysis will use publicly available data from reliable sources:
 *	Exchange rate data (NOK/USD, NOK/EUR) – Norges Bank, Yahoo Finance, LSEG
 *	Macroeconomic indicators:
    *  Interest rates – Norges Bank
    * 	Inflation (CPI) – Statistics Norway (SSB)
    * 	GDP growth – Statistics Norway (SSB)
    * 	Oil prices (Brent Crude) and gas prices (TTF Rotterdam) – U.S. Energy Information Administration (EIA)
    * 	Trade balance – Statistics Norway (SSB)

## Methodology
This research topic lends itself to an Autoregressive Integrated Moving Average (ARIMA) model for analyzing the NOK exchange rate. ARIMA combines three components, autoregression (AR), differencing (I), and moving average (MA), and is a widely used method in time series analysis. As opposed to many other models, ARIMA captures both short- and long-term trends due to its AR and MA components, respectively. Additionally, it is not dependent on a specific probability distribution, which makes it a good fit for our purposes.

Furthermore, the Box-Jenkins methodology will be followed in the estimation of the model and will serve as an overarching guide in our process. This methodology is an iterative approach comprising four steps: identification of the process, estimation of parameters, verification, and forecasting.

We will then assess the performance of the model using metrics such as AIC, BIC, and Log-Likelihood. Additionally, we will ensure residuals are approximately white noise using the Ljung-Box test.
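
As a reminder of how the information criteria trade fit against complexity, the sketch below evaluates the standard definitions $AIC = 2k - 2\ln L$ and $BIC = k \ln n - 2\ln L$ for two hypothetical candidate models. The log-likelihoods, parameter counts, and sample size are made-up illustration values, not results from this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


n_obs = 72  # hypothetical: six years of monthly observations

# Hypothetical candidates: model B fits slightly better but uses more parameters
aic_a, bic_a = aic(20.0, 5), bic(20.0, 5, n_obs)
aic_b, bic_b = aic(21.0, 8), bic(21.0, 8, n_obs)

print(f"Model A: AIC={aic_a:.3f}, BIC={bic_a:.3f}")
print(f"Model B: AIC={aic_b:.3f}, BIC={bic_b:.3f}")
```

In this constructed case both criteria prefer the smaller model A, even though model B has the higher log-likelihood, which is why log-likelihood alone is an incomplete basis for choosing (p, d, q).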

## Model
The Python code implements an ARIMA model to forecast the NOK/EUR exchange rate based on the key macroeconomic indicators mentioned above. This forecast has its foundation in exogenous variables and utilizes historical time series data to make predictions for future exchange rates. 

First, relevant libraries are imported:


```python
import pandas as pd
import requests
import sys
import yfinance as yf
import math
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox
```

Subsequently, time series data is retrieved from various reliable sources, as described in the *Data* section above. This data includes both training data, used to fit the model, and future data, used to evaluate the predictions.

```python
def get_request_dataframe(url, params):
    """Download an SDMX-JSON series and return it as monthly means."""
    # Make request
    r = requests.get(url, params=params)
    if r.status_code != 200:
        print(f'ERROR: Request to {url} failed with status code {r.status_code}.')
        sys.exit(1)

    # Parse the response body once and reuse it
    payload = r.json()['data']

    # Make dataframe from the first series in the data set
    series = payload['dataSets'][0]['series']
    temp_key = list(series.keys())[0]
    data = pd.DataFrame(series[temp_key]['observations']).transpose().astype(float)

    # Make timestamp index
    idx = pd.DatetimeIndex(pd.DataFrame(payload['structure']['dimensions']['observation'][0]['values'])['start'])
    data.set_index(idx, drop=True, inplace=True)
    data.index.name = 'index'

    # Aggregate the observations to month-start frequency
    return data.resample('MS').mean()
```


```python
def get_exchange_rates(start_date, end_date, rate):
    """Fetch the exchange rate of `rate` (e.g. 'EUR') against NOK from Norges Bank."""
    url = f"https://data.norges-bank.no/api/data/EXR/B.{rate.upper()}.NOK.SP?"
    params = {
        'format': 'sdmx-json',
        'startPeriod': start_date.strftime('%Y-%m-%d'),
        'endPeriod': end_date.strftime('%Y-%m-%d'),
        'locale': 'en'
    }
    df = get_request_dataframe(url, params)
    df.rename(columns={0: 'exchange_rate'}, inplace=True)
    return df


def get_interest_rates(start_date, end_date):
    """Fetch the interest rate series with key B.KPRA.SD from Norges Bank."""
    url = "https://data.norges-bank.no/api/data/IR/B.KPRA.SD.?"
    params = {
        'format': 'sdmx-json',
        'startPeriod': start_date.strftime('%Y-%m-%d'),
        'endPeriod': end_date.strftime('%Y-%m-%d'),
        'locale': 'en'
    }
    df = get_request_dataframe(url, params)
    df.rename(columns={0: 'interest_rate'}, inplace=True)
    # get_request_dataframe already returns monthly means, so no further resampling is needed
    return df
```


```python
def get_cpi(start_date, end_date):
    """Fetch the monthly CPI change (per cent) from Statistics Norway (SSB)."""
    df = pd.read_csv("https://data.ssb.no/api/v0/dataset/1086.csv?lang=en", encoding="ISO-8859-1")
    # Available contents: Consumer Price Index (2015=100), 12-month rate (percent), Monthly change (per cent)
    df = df[df['contents'] == 'Monthly change (per cent)']
    # Turn month labels into month-start timestamps by replacing 'M' with '-' and appending '-01'
    df.index = df['month'].apply(lambda row: pd.Timestamp(row.replace('M', '-') + '-01'))
    df = df[(df.index >= start_date) & (df.index <= end_date)]
    df = df.drop(columns=['consumption group', 'month', 'contents'])
    df.columns = ['cpi']
    return df.astype(float)


def get_gdp(start_date, end_date):
    """Fetch the monthly GDP change (per cent, seasonally adjusted) from SSB."""
    df = pd.read_csv("https://data.ssb.no/api/v0/dataset/615167.csv?lang=en", encoding="ISO-8859-1")
    df = df[df['macroeconomic indicator'] == 'bnpb.nr23_9 Gross domestic product, market values']
    # Alternative contents: 'Change in <value/volume> from the previous month, seasonally adjusted (per cent)'
    df = df[df['contents'] == 'Change in value from the previous month, seasonally adjusted (per cent)']
    df.index = df['month'].apply(lambda row: pd.Timestamp(row.replace('M', '-') + '-01'))
    df = df[(df.index >= start_date) & (df.index <= end_date)]
    df = df.drop(columns=['macroeconomic indicator', 'month', 'contents'])
    df.columns = ['gdp']
    return df.astype(float)


def get_trade_balance(start_date, end_date):
    """Fetch the seasonally adjusted trade balance from SSB as a month-to-month change rate."""
    df = pd.read_csv("https://data.ssb.no/api/v0/dataset/179421.csv?lang=en", encoding="ISO-8859-1")
    df = df[df['trade flow'] == 'Hbtot Trade balance, goods (Total exports - total imports)']
    df = df[df['contents'] == 'Seasonal adjusted']  # Alternative: 'Unadjusted'
    df.index = df['month'].apply(lambda row: pd.Timestamp(row.replace('M', '-') + '-01'))
    df = df.drop(columns=['trade flow', 'month', 'contents'])
    df.columns = ['trade_balance']
    # Compute the change rate before filtering on dates, so the first month in range has a value
    df = df.pct_change() * 100
    df = df[(df.index >= start_date) & (df.index <= end_date)]
    return df.astype(float)
```


```python
def get_yahoo_data(ticker_code, start_date, end_date):
    """Fetch daily closing prices from Yahoo Finance and return monthly means."""
    c = yf.Ticker(ticker_code)
    # Request the date range directly; one day is added because yfinance treats `end` as exclusive
    df = c.history(start=start_date, end=end_date + pd.Timedelta(days=1))
    df.index = df.index.tz_localize(None)
    df = df[(df.index >= start_date) & (df.index <= end_date)]
    df = df[['Close']].rename(columns={'Close': 0})
    return df.resample('MS').mean()


def get_oil_price(start_date, end_date):
    return get_yahoo_data('BZ=F', start_date, end_date).rename(columns={0: 'oil_price'})


def get_gas_price(start_date, end_date):
    return get_yahoo_data('TTF=F', start_date, end_date).rename(columns={0: 'gas_price'})
```


```python
def get_historical_data(start_date, end_date, rate):
    """Collect all series and join them column-wise on their shared month-start index."""
    frames = [get_exchange_rates(start_date, end_date, rate),
              get_interest_rates(start_date, end_date),
              get_cpi(start_date, end_date),
              get_gdp(start_date, end_date),
              get_trade_balance(start_date, end_date),
              get_oil_price(start_date, end_date),
              get_gas_price(start_date, end_date)]
    return pd.concat(frames, axis=1)
```
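
Because the series come from different sources, it may be worth checking how they line up after the column-wise join. The following sketch uses small hypothetical frames (the values and dates are invented for illustration) to show how `resample('MS').mean()` and `pd.concat(..., axis=1)` behave, and how gaps can be detected before fitting a model.

```python
import pandas as pd

# Hypothetical daily quotes, for illustration only
daily = pd.DataFrame(
    {'exchange_rate': [11.0, 11.2, 11.4, 11.6]},
    index=pd.to_datetime(['2024-01-02', '2024-01-03', '2024-02-01', '2024-02-02']),
)
# Daily observations are averaged and stamped with the first day of each month
monthly_rate = daily.resample('MS').mean()

# Hypothetical monthly indicator that covers one month more than the quotes
monthly_cpi = pd.DataFrame(
    {'cpi': [0.1, 0.2, 0.3]},
    index=pd.to_datetime(['2024-01-01', '2024-02-01', '2024-03-01']),
)

# The join keeps every month found in any frame and fills the gaps with NaN
combined = pd.concat([monthly_rate, monthly_cpi], axis=1)
missing_per_column = combined.isna().sum()

# One possible remedy (an assumption, not the report's method): keep only complete months
complete = combined.dropna()

print(combined)
print(missing_per_column)
```

Whether to drop, interpolate, or otherwise fill incomplete months is a modeling decision; the sketch only shows how to make such gaps visible.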

Then, the ARIMA model is fitted to the data to produce predictions. Based on these, several diagnostics tests are conducted to certify the model’s reliability and validate whether it is statistically sound. This includes evaluations of the significance of coefficients, residuals, and white noise and autocorrelation.

```python
from historical_data import get_historical_data

# Set parameters
rate = 'EUR'
training_start = pd.Timestamp('2018-01-01')
training_end = pd.Timestamp('2023-12-31')
prediction_end = pd.Timestamp('2024-12-31')
p, d, q = 3, 1, 1  # Order of the ARIMA model (p, d, q); adjust during model identification
run_name = f'{rate}_{training_start.year}_{training_end.year}_{prediction_end.year}_p{p}_d{d}_q{q}'.lower()

# Collect input data and split it into a training and a test period
df = get_historical_data(start_date=training_start, end_date=prediction_end, rate=rate)
training_df = df.loc[:training_end]
test_df = df.loc[training_end:]

# Define an ARIMA model with the macroeconomic indicators as exogenous regressors
model = ARIMA(training_df['exchange_rate'], exog=training_df.drop(columns=['exchange_rate']), order=(p, d, q))

# Fit the model
model_fit = model.fit()

# Output model summary and save it to file ("w" creates the file or overwrites an existing one)
print(model_fit.summary())
with open(f"test_figures_and_results/{run_name}_result.txt", "w") as f:
    f.write(str(model_fit.summary()))
```

## Results 



```python
# Diagnostic plots
model_fit.plot_diagnostics(figsize=(12, 8))
plt.savefig(f"test_figures_and_results/{run_name}_diagnostics.png")
plt.show()

# Forecast the test period using the observed exogenous variables
forecast_index = test_df.index
forecast = model_fit.forecast(steps=len(forecast_index), exog=test_df.drop(columns=['exchange_rate']))
forecast_df = pd.DataFrame(list(forecast), columns=['Forecast'], index=forecast_index)

# Plot actual and forecasted exchange rates together
plt.figure(figsize=(12, 6))
plt.plot(df['exchange_rate'], label='Actual Exchange Rate', color='blue')
plt.plot(forecast_df, label='Forecasted Exchange Rate', color='orange', linestyle='--')
plt.title('Exchange Rate Forecast using ARIMA Model')
plt.xlabel('Date')
plt.ylabel('Exchange Rate')
plt.legend()
plt.savefig(f"test_figures_and_results/{run_name}_forecast.png")
plt.show()
```

## Empirical analysis
The most significant finding illustrated by the graphs above is that the trade balance appears to be a redundant regressor in this model; no clear causal relationship between it and the NOK/EUR exchange rate can be established by this model.

The fan chart above shows a comparison between the forecasted (orange) and the actual (blue) exchange rates throughout 2024, along with various confidence intervals (50%, 65%, 80%, and 95%). As illustrated by the graph, the predicted line fluctuates significantly less than the actual line and remains rather flat around 11.5 throughout the year. The exchange rate remains within the 50% confidence interval of the forecast for the entire year, with the exception of April, in which it moves up to the 60% confidence interval.
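
The bands of such a fan chart can be derived from the forecast means and their standard errors under a normal approximation. The sketch below is one way this might be done; the function name `interval_bands` and the input numbers are hypothetical. In statsmodels, the inputs could presumably be taken from the `predicted_mean` and `se_mean` attributes of the object returned by `model_fit.get_forecast(...)`.

```python
from statistics import NormalDist


def interval_bands(means, std_errors, levels=(0.50, 0.65, 0.80, 0.95)):
    """Return {level: [(lower, upper), ...]} assuming normally distributed forecast errors."""
    bands = {}
    for level in levels:
        # Two-sided interval: e.g. a 95% level uses the 97.5% quantile of the standard normal
        z = NormalDist().inv_cdf(0.5 + level / 2)
        bands[level] = [(m - z * s, m + z * s) for m, s in zip(means, std_errors)]
    return bands


# Hypothetical forecast means and standard errors for three months
means = [11.5, 11.5, 11.6]
std_errors = [0.2, 0.3, 0.4]

bands = interval_bands(means, std_errors)
for level, intervals in bands.items():
    print(f"{level:.0%}:", [(round(lo, 3), round(hi, 3)) for lo, hi in intervals])
```

Each band could then be drawn with `plt.fill_between`, widest first, so that the narrower intervals are layered on top.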

The highest log-likelihood obtained so far is 23.806, with p=3, d=1, q=1.

## Weaknesses and limitations
Like all predictive models, ours is not exempt from inherent weaknesses or possible implicit biases. Whether it be the selection of variables, the construction of the model, or the interpretation of the results, several steps in the process are vulnerable to errors – both computational and human.

The greatest potential for errors lies in the selection of variables. Variable selection is the bread and butter of modeling and serves as the very foundation upon which the model is constructed. The chosen regressors are essentially assumptions we make about our research question, and if our assumptions are wrong, the errors may ripple through the entire model; it is difficult to draw a correct conclusion from false premises. It is highly unlikely that six factors alone determine the NOK/EUR exchange rate. However, our limited resources are reflected in the scope of this study: the drivers of the NOK exchange rate are presumably richer in number and nuance than what is presented in this report, but this nuance would be impossible to capture within the timeframe we were given to complete this task.

While the ARIMA model is a good fit for this assignment, ARIMA models also have a few inherent limitations that must be accounted for. Although the framework in theory supports both short- and long-term modeling of time series, its ability to accurately predict values in the long run may be inadequate. Additionally, ARIMA may be sensitive to outliers, and its linear assumptions can cause errors during extraordinary economic events such as crashes or booms. ARIMA also requires time-consuming tuning, which may be a disadvantage.

Modern technology equips us with the computational power to accurately predict movements and behaviors in financial markets through simulations and modeling, yet there are limits to what we can accomplish. It would represent a great fallacy to view the economy as an internally virtuous, self-regulating entity – the economy is essentially a man-made structure at the mercy of human behavior, which can be erratic, irrational and unpredictable.


## Use of LLMs
Large language models (LLMs) have not been utilized in the writing of this report.


## Bibliography

[1] Tsay, Ruey S. *Financial Time Series* (2010). New Jersey: Wiley & Sons.

[2] Brooks, Chris. *Introductory Econometrics For Finance* (2019). Cambridge: Cambridge University Press.