## Task 2: Develop Time Series Forecasting Models for TSLA

This notebook fetches daily TSLA prices from yfinance (2015-07-01 to 2025-07-31), splits them chronologically (train: 2015-07-01 to 2023-12-31; test: 2024-01-01 to 2025-07-31), and trains ARIMA, SARIMA, and LSTM models. Each model is evaluated with MAE, RMSE, and MAPE, and its forecast is plotted against the actual prices.
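
For reference, the three metrics can be computed by hand as in the sketch below. The function name `forecast_metrics` is hypothetical, and whether the project's own `evaluate` methods report MAPE as a percentage or as a fraction is not shown in this notebook; the sketch assumes a percentage.

```python
import numpy as np

def forecast_metrics(actual, predicted):
    """Return MAE, RMSE, and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    # MAPE divides by the actual values, so it is undefined if any of them is zero.
    mape = np.mean(np.abs(errors / actual)) * 100
    return {'mae': float(mae), 'rmse': float(rmse), 'mape': float(mape)}
```

RMSE penalizes large misses more heavily than MAE, while MAPE is scale-free, which helps when the price level changes a lot over the test period.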


In [None]:
import warnings
# Silences all warnings, including model convergence warnings; remove when debugging fits
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from src.data_manager import DataManager
from models.forecasting_models import ARIMAForecaster, SARIMAForecaster, LSTMForecaster, ForecastingEngine
from src.model_selection import chronological_split, arima_order_grid_search

plt.style.use('seaborn-v0_8')
%matplotlib inline


### 1) Fetch TSLA data from yfinance


In [None]:
START_DATE = '2015-07-01'
END_DATE = '2025-07-31'
TICKER = ['TSLA']

dm = DataManager(data_source='yfinance')
raw = dm.fetch_data(TICKER, start_date=START_DATE, end_date=END_DATE, frequency='1d')
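# Extract the adjusted close for TSLA. This assumes DataManager returns an 'Adj Close'
# column; recent yfinance releases adjust prices by default and may expose only 'Close'.
tsla_close = raw['TSLA']['Adj Close'].rename('TSLA').dropna()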
tsla_close.head()


### 2) Chronological split: train vs test (no shuffling)
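
The `chronological_split` helper used below is project code whose implementation is not shown here. A minimal sketch of the same idea, assuming a tz-naive `DatetimeIndex` and a cutoff that is inclusive on the train side, might look like this (`split_by_date` and the toy series are hypothetical):

```python
import pandas as pd

def split_by_date(series, train_end_date):
    """Split a date-indexed Series into (train, test); train keeps dates <= train_end_date."""
    series = series.sort_index()
    cutoff = pd.Timestamp(train_end_date)
    train = series.loc[series.index <= cutoff]
    test = series.loc[series.index > cutoff]
    return train, test

# Toy example with synthetic values (not TSLA prices)
idx = pd.date_range('2023-12-27', periods=8, freq='D')
toy = pd.Series(range(8), index=idx)
toy_train, toy_test = split_by_date(toy, '2023-12-31')
```

Because the split is by date rather than by random sampling, every test observation comes strictly after every training observation, so the models never see future prices during fitting.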


In [None]:
TRAIN_END_DATE = '2023-12-31'
train, test = chronological_split(tsla_close, train_end_date=TRAIN_END_DATE)
len(train), len(test), train.index.min(), train.index.max(), test.index.min(), test.index.max()


### 3) Baseline visualization


In [None]:
fig, ax = plt.subplots(figsize=(12,4))
train.plot(ax=ax, label='Train')
test.plot(ax=ax, label='Test')
ax.set_title('TSLA Adjusted Close (Train/Test)')
ax.legend();


### 4) ARIMA order selection via grid search
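
The `arima_order_grid_search` helper is also project code. As a rough illustration of what an AIC-driven grid search does, the sketch below tries every `(p, d, q)` combination and keeps the one with the lowest score. The names `grid_search_order` and `aic_score` are hypothetical, and the real helper may differ in its return values and error handling.

```python
import math
from itertools import product

def grid_search_order(series, p_values, d_values, q_values, score_fn):
    """Return (best_order, scores), where best_order minimizes score_fn(series, order)."""
    scores = {}
    for order in product(p_values, d_values, q_values):
        try:
            score = score_fn(series, order)
        except Exception:
            # Some orders fail to converge or cannot be estimated; skip them.
            continue
        if math.isfinite(score):
            scores[order] = score
    if not scores:
        raise ValueError('no candidate order could be fitted')
    best_order = min(scores, key=scores.get)
    return best_order, scores

def aic_score(series, order):
    """AIC of a statsmodels ARIMA fit (imported lazily, so statsmodels is only needed here)."""
    from statsmodels.tsa.arima.model import ARIMA
    return ARIMA(series, order=order).fit().aic
```

With statsmodels installed, a call such as `grid_search_order(train, range(4), range(2), range(4), aic_score)` would cover the same grid as the cell below. Passing the scoring function as an argument makes it easy to swap AIC for BIC or for an out-of-sample error.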


In [None]:
best_order, metrics = arima_order_grid_search(
    train, p_values=range(0, 4), d_values=range(0, 2), q_values=range(0, 4), criterion='aic'
)
best_order, metrics


### 5) Train ARIMA and forecast over the test horizon


In [None]:
arima_model = ARIMAForecaster(order=best_order).fit(train)
arima_pred = arima_model.predict(steps=len(test))
arima_eval = arima_model.evaluate(test, arima_pred)
arima_eval


### 6) Optional: SARIMA quick baseline


In [None]:
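# The seasonal period of 12 is a placeholder; for daily trading data, 5 (weekly) or
# about 21 (monthly) trading days may be a more natural choice.
sarima = SARIMAForecaster(order=(1,1,1), seasonal_order=(1,1,1,12)).fit(train)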
sarima_pred = sarima.predict(steps=len(test))
sarima_eval = sarima.evaluate(test, sarima_pred)
sarima_eval


### 7) LSTM model
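
The `lookback=60` argument in the cell below suggests that the model predicts each value from the 60 preceding ones. How `LSTMForecaster` builds its training samples internally is not shown, so the following is only a sketch of the usual sliding-window construction (`make_windows` is a hypothetical name):

```python
import numpy as np

def make_windows(values, lookback):
    """Build (X, y): each row of X holds `lookback` consecutive values, y is the value that follows."""
    values = np.asarray(values, dtype=float)
    if len(values) <= lookback:
        raise ValueError('need more observations than the lookback length')
    X = np.array([values[i:i + lookback] for i in range(len(values) - lookback)])
    y = values[lookback:]
    # Recurrent layers conventionally expect input shaped (samples, timesteps, features).
    return X[..., np.newaxis], y

# Small demo on the integers 0..9 with a lookback of 3
X_demo, y_demo = make_windows(np.arange(10), lookback=3)
```

A series of length `n` yields `n - lookback` training samples, so a longer lookback gives the network more context per sample but fewer samples overall.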


In [None]:
try:
    lstm = LSTMForecaster(units=64, dropout=0.2, epochs=10, batch_size=32, lookback=60)
    lstm.fit(train)
    lstm_pred = lstm.predict(steps=len(test))
    lstm_eval = lstm.evaluate(test, lstm_pred)
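except Exception as e:
    # Report the failure instead of hiding it, then fall back to NaN metrics
    print(f'LSTM step failed: {e!r}')
    lstm_pred, lstm_eval = None, {'mse': np.nan, 'mae': np.nan, 'rmse': np.nan, 'mape': np.nan}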
lstm_eval


### 8) Compare metrics


In [None]:
results = pd.DataFrame({
    'ARIMA': arima_eval,
    'SARIMA': sarima_eval,
    'LSTM': lstm_eval
}).T
results[['mae','rmse','mape']].sort_values('rmse')


### 9) Plot forecasts vs actuals


In [None]:
fig, ax = plt.subplots(figsize=(12,4))
test.plot(ax=ax, label='Actual', color='black')
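# np.ravel aligns predictions by position; a Series with its own index would be reindexed to NaN
pd.Series(np.ravel(arima_pred), index=test.index).plot(ax=ax, label='ARIMA')
pd.Series(np.ravel(sarima_pred), index=test.index).plot(ax=ax, label='SARIMA')
if lstm_pred is not None:
    pd.Series(np.ravel(lstm_pred), index=test.index).plot(ax=ax, label='LSTM')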
ax.set_title('TSLA Forecasts vs Actuals (Test)')
ax.legend();


### 10) Brief Discussion

- ARIMA and SARIMA are interpretable and make strong baselines for many financial series once the series has been differenced to stationarity.
- LSTM can capture nonlinear patterns, but it needs careful tuning and more data, and it is sensitive to scaling and lookback choices.
- Compare the MAE, RMSE, and MAPE values above to decide which model performed best on the 2024-01-01 to 2025-07-31 test period.
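
On the scaling point: one common precaution is to fit the scaler on the training data only and reuse its parameters on the test data, so no information from the test period leaks into training. Whether `LSTMForecaster` scales its inputs internally is not shown in this notebook; the helpers below (`fit_minmax`, `scale`, `unscale`) are hypothetical and only illustrate the idea.

```python
import numpy as np

def fit_minmax(train_values):
    """Learn min-max parameters from the training data only."""
    lo, hi = float(np.min(train_values)), float(np.max(train_values))
    if hi == lo:
        raise ValueError('training data is constant; min-max scaling is undefined')
    return lo, hi

def scale(values, lo, hi):
    """Map values so that the training range becomes [0, 1]."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def unscale(scaled, lo, hi):
    """Invert `scale` to get back to the original units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Synthetic numbers: the test value lies above the training range
lo, hi = fit_minmax([10.0, 20.0, 30.0])
scaled_train = scale([10.0, 20.0, 30.0], lo, hi)
scaled_test = scale([40.0], lo, hi)
```

Test values outside the training range map outside [0, 1], as `scaled_test` does here. For a trending stock this is expected, and it is one reason an LSTM trained on price levels can struggle over a long test horizon.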
