# Remittance to the Philippines – Forecasting Analysis

**Dataset Source:**  
https://www.kaggle.com/datasets/joshbuttler/remittance-to-the-philippines

**Input File:**  
data/processed/remittance_cleaned.csv

**Purpose:**  
Forecast future remittance inflows by:
- Decomposing historical time series
- Applying ARIMA/SARIMA models
- Using Prophet for trend + seasonality modeling
- Evaluating forecast accuracy

In [None]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns

from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX

from sklearn.metrics import mean_absolute_error, mean_squared_error

sns.set(style="whitegrid")
plt.rcParams["figure.figsize"] = (12, 6)

pd.set_option("display.float_format", "{:,.2f}".format)

In [None]:
from prophet import Prophet

In [None]:
DATA_PATH = "../data/processed/remittance_cleaned.csv"
df = pd.read_csv(DATA_PATH)

df.head()


In [None]:
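# Build a DatetimeIndex; the resample() call further down requires one
if "date" in df.columns:
    df["date"] = pd.to_datetime(df["date"])
    df.set_index("date", inplace=True)
elif "year" in df.columns:
    # Convert integer years to timestamps (January 1 of each year)
    df["year"] = pd.to_datetime(df["year"].astype(int).astype(str), format="%Y")
    df.set_index("year", inplace=True)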

df.head()

In [None]:
# Identify target variable
amount_col = "amount" if "amount" in df.columns else df.select_dtypes(np.number).columns[0]

# Aggregate to yearly totals, which smooths out sub-annual noise.
# Note: pandas 2.2+ deprecates the "Y" alias in favor of "YE".
ts = df[amount_col].resample("Y").sum()
ts.head()
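
The yearly aggregation relies on the index being a `DatetimeIndex`. The sketch below illustrates that step on a small made-up table; the column names `year` and `amount` and all values are assumptions for illustration, not the Kaggle data. It uses the `"YS"` (year start) alias, which is accepted by both older and newer pandas versions.

```python
import pandas as pd

# Hypothetical long-format data with several rows per year (values are made up)
raw = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020, 2021],
    "amount": [10.0, 5.0, 7.0, 8.0, 20.0],
})

# Turn integer years into timestamps so that time-based resampling works
raw["date"] = pd.to_datetime(raw["year"].astype(str), format="%Y")

# One total per calendar year, labeled with the first day of that year
yearly = raw.set_index("date")["amount"].resample("YS").sum()

print(yearly)
```

Labeling by year start or year end does not change the totals; it only changes the timestamp attached to each year.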

In [None]:
ts.plot(title="Total Remittance Inflows Over Time")
plt.xlabel("Year")
plt.ylabel("Total Remittance Amount")
plt.show()

In [None]:
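# With yearly data, period=1 gives a trivial decomposition: the trend equals
# the series and the seasonal and residual components are zero. A meaningful
# seasonal split needs sub-annual data (e.g. monthly with period=12).
decomposition = seasonal_decompose(ts, model="additive", period=1)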

decomposition.plot()
plt.show()
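
For a yearly series, one simple alternative for separating trend from short-term deviations is a centered moving average. This is a different technique from `seasonal_decompose`, shown here only as a sketch on made-up values; the 3-year window is an arbitrary assumption.

```python
import pandas as pd

# Synthetic yearly series (illustrative values only, not the Kaggle data)
years = pd.date_range("2000-01-01", periods=10, freq="YS")
annual = pd.Series(
    [100.0, 108.0, 115.0, 130.0, 128.0, 140.0, 155.0, 150.0, 165.0, 180.0],
    index=years,
)

# Centered 3-year moving average as a rough trend estimate
trend = annual.rolling(window=3, center=True).mean()

# Whatever the smoothed trend does not explain
residual = annual - trend

print(pd.DataFrame({"value": annual, "trend": trend, "residual": residual}))
```

The first and last years have no trend value because the centered window needs a neighbor on each side.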

In [None]:
adf_result = adfuller(ts.dropna())

adf_summary = {
    "ADF Statistic": adf_result[0],
    "p-value": adf_result[1],
    "Critical Values": adf_result[4]
}

adf_summary
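
The ADF test's null hypothesis is that the series has a unit root, so a large p-value means non-stationarity cannot be rejected. The helper below is a hypothetical convenience function (`interpret_adf` is not part of statsmodels) that turns the result tuple into a verdict; the example tuple contains made-up numbers in the layout `adfuller` returns, not results from this dataset.

```python
def interpret_adf(adf_result, alpha=0.05):
    """Summarize an adfuller() result tuple as a stationarity verdict."""
    statistic, p_value = adf_result[0], adf_result[1]
    critical_values = adf_result[4]

    # Rejecting the null (unit root) is evidence that the series is stationary
    reject_unit_root = p_value < alpha

    return {
        "statistic": statistic,
        "p_value": p_value,
        "below_5pct_critical": statistic < critical_values["5%"],
        "stationary": reject_unit_root,
        # Heuristic starting point only, not a substitute for model selection
        "suggested_d": 0 if reject_unit_root else 1,
    }


# Made-up tuple in the same layout adfuller() returns (not real results)
example = (-1.2, 0.67, 0, 20, {"1%": -3.8, "5%": -3.0, "10%": -2.6}, 0.0)
verdict = interpret_adf(example)
print(verdict)
```

In the notebook, the call would be `interpret_adf(adf_result)`; a non-stationary verdict is consistent with using `d=1` in the ARIMA order.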

In [None]:
train_size = int(len(ts) * 0.8)

train, test = ts.iloc[:train_size], ts.iloc[train_size:]

train.shape, test.shape

In [None]:
arima_model = ARIMA(train, order=(1, 1, 1))
arima_fit = arima_model.fit()

arima_fit.summary()

In [None]:
arima_forecast = arima_fit.forecast(steps=len(test))

In [None]:
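# statsmodels rejects a seasonal period of 1, and yearly data has no
# sub-annual seasonality, so the seasonal terms are switched off here.
# For monthly data, skip the yearly aggregation and try e.g. (1, 1, 1, 12).
sarima_model = SARIMAX(
    train,
    order=(1, 1, 1),
    seasonal_order=(0, 0, 0, 0),
    enforce_stationarity=False,
    enforce_invertibility=False
)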

sarima_fit = sarima_model.fit()
sarima_fit.summary()

In [None]:
sarima_forecast = sarima_fit.forecast(steps=len(test))

In [None]:
def evaluate_forecast(true, pred):
    return {
        "MAE": mean_absolute_error(true, pred),
        "RMSE": np.sqrt(mean_squared_error(true, pred))
    }

arima_metrics = evaluate_forecast(test, arima_forecast)
sarima_metrics = evaluate_forecast(test, sarima_forecast)

arima_metrics, sarima_metrics
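
MAE and RMSE are expressed in the units of the series, so they are hard to judge without a reference point. A possible extension, sketched below on made-up numbers, adds a percentage error (MAPE) and a naive last-value baseline; `mape` and `naive_forecast` are hypothetical helpers, not functions used elsewhere in this notebook.

```python
import numpy as np


def mape(true, pred):
    """Mean absolute percentage error, in percent (assumes no zeros in `true`)."""
    true = np.asarray(true, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.mean(np.abs((true - pred) / true)) * 100)


def naive_forecast(train, horizon):
    """Repeat the last observed training value for every forecast step."""
    last_value = np.asarray(train, dtype=float)[-1]
    return np.repeat(last_value, horizon)


# Made-up values for illustration only
train_demo = [100.0, 110.0, 120.0]
test_demo = [125.0, 130.0]
model_demo = [126.0, 128.0]  # stand-in for a model's forecast

baseline_demo = naive_forecast(train_demo, len(test_demo))
model_mape = mape(test_demo, model_demo)
baseline_mape = mape(test_demo, baseline_demo)

print(f"model MAPE: {model_mape:.2f}%, naive MAPE: {baseline_mape:.2f}%")
```

A model that cannot beat the naive baseline on the held-out years adds little over simply carrying the last value forward.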

In [None]:
plt.plot(train.index, train, label="Train")
plt.plot(test.index, test, label="Actual", marker="o")
plt.plot(test.index, arima_forecast, label="ARIMA Forecast", marker="o")
plt.plot(test.index, sarima_forecast, label="SARIMA Forecast", marker="o")

plt.legend()
plt.title("Remittance Forecast Comparison")
plt.show()

In [None]:
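# Prophet expects a DataFrame with columns "ds" (date) and "y" (value)
prophet_df = ts.reset_index()
prophet_df.columns = ["ds", "y"]

# Yearly data carries no within-year signal, so yearly seasonality is disabled
model = Prophet(yearly_seasonality=False)
model.fit(prophet_df)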

future = model.make_future_dataframe(periods=5, freq="Y")
forecast = model.predict(future)

model.plot(forecast)
plt.show()

In [None]:
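# Refit on the full series. The seasonal terms are switched off because
# statsmodels rejects a seasonal period of 1 and yearly data has no
# sub-annual seasonality.
final_model = SARIMAX(
    ts,
    order=(1, 1, 1),
    seasonal_order=(0, 0, 0, 0)
).fit()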

future_forecast = final_model.forecast(steps=5)
future_forecast

## Forecast Interpretation

- Remittance inflows exhibit a persistent long-term trend.
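- SARIMA can improve on ARIMA only when the series retains seasonal structure, as in monthly or quarterly data; yearly aggregates have no seasonal cycle to capture.
- Forecast uncertainty widens with the horizon and reflects the exposure of remittance flows to macroeconomic conditions.
- Results can support fiscal planning, Bangko Sentral ng Pilipinas (BSP) projections, and policy simulation.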