# ARIMA Forecasting on Stock Price Data

An ARIMA(0,1,0) model, which is equivalent to a random walk, is fitted to stock closing prices. The workflow covers missing-value handling, technical indicator computation, model fitting, forecast visualization, and trading signal generation.

## Library Imports

Required libraries for data manipulation, modeling, and visualization are imported.

In [None]:
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error
from statsmodels.tsa.arima.model import ARIMA

## Function: `dt(loc)`

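A CSV file containing stock data is read, and its date column is parsed into datetime values. Passing `parse_dates=True` without an `index_col` only attempts to parse the index, which would leave the dates as plain strings, so the column is named explicitly. The column name `Date` is an assumption about the file layout and should be adjusted if the header differs.

In [None]:
def dt(loc):
    # Assumes the CSV has a 'Date' column; change the name if the header differs
    stock = pd.read_csv(loc, parse_dates=['Date'])
    return stock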
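
The difference between the two `parse_dates` forms can be checked on a tiny in-memory file. The sketch below is illustrative only: the sample text, the `Date` and `Close` headers, and the helper name `date_column_is_parsed` are assumptions, not part of the dataset.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-row sample; the 'Date' and 'Close' headers are assumed
CSV_TEXT = "Date,Close\n2020-01-02,94.90\n2020-01-03,93.75\n"


def date_column_is_parsed(**read_csv_kwargs):
    """Return True if read_csv produced a datetime64 'Date' column."""
    frame = pd.read_csv(StringIO(CSV_TEXT), **read_csv_kwargs)
    return pd.api.types.is_datetime64_any_dtype(frame['Date'])


# parse_dates=True only targets the index, so the column stays unparsed
print(date_column_is_parsed(parse_dates=True))
# Naming the column asks pandas to convert it
print(date_column_is_parsed(parse_dates=['Date']))
```

The first call reports that the column was not converted, the second that it was.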

## Function: `fillin(loc)`

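Missing values in the 'Close' column are filled by linear interpolation when a gap spans at most three consecutive rows. Longer gaps, and any missing values at the start of the series, are backfilled with the next available price.

In [None]:
def fillin(loc):
    stock = dt(loc)
    missing = stock['Close'].isna()
    # Label each run of consecutive NaNs: notna().cumsum() is constant within a gap
    g = stock['Close'].notna().cumsum().where(missing)
    # Keep NaN where the gap is longer than 3 rows; use interpolated values elsewhere
    stock['Close'] = stock['Close'].where(g.map(g.value_counts()) > 3, stock['Close'].interpolate('linear'))
    # Backfill the long gaps and any leading NaNs
    stock['Close'] = stock['Close'].bfill()
    return stock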
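
The gap-length rule is easier to follow on a small series. The following is a minimal sketch under stated assumptions: the helper name `fill_small_gaps`, its `max_gap` parameter, and the sample prices are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd


def fill_small_gaps(close, max_gap=3):
    """Interpolate NaN runs of length <= max_gap, then backfill the rest."""
    is_na = close.isna()
    # The count of valid values seen so far is constant inside a NaN run,
    # so it serves as an identifier for that run
    gap_id = close.notna().cumsum().where(is_na)
    # Length of the run each NaN belongs to (NaN for non-missing rows)
    gap_len = gap_id.map(gap_id.value_counts())
    interpolated = close.interpolate('linear')
    # Long gaps keep their NaN; every other row takes the interpolated value
    filled = close.where(gap_len > max_gap, interpolated)
    return filled.bfill()


# One single-row gap and one four-row gap
prices = pd.Series([10, np.nan, 12, np.nan, np.nan, np.nan, np.nan, 20, 21])
print(fill_small_gaps(prices).tolist())
```

In this sample the single missing row is interpolated to 11, while the four-row gap is backfilled with 20.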

## Function: `markers(loc)`

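Technical indicators are computed and added to the dataset: the 7-day and 30-day moving averages, the daily percentage return, and a 14-day Relative Strength Index (RSI). Each indicator is built from values shifted by one row, so a given row only uses prices that were available before that day's close. The RSI here is based on simple 14-day sums of gains and losses rather than Wilder's smoothed averages.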

In [None]:
def markers(loc):
    stock = fillin(loc)
    stock['7DMA'] = stock['Close'].shift(1).rolling(window=7, min_periods=0).mean()
    stock['30DMA'] = stock['Close'].shift(1).rolling(window=30, min_periods=0).mean()
    stock['Return'] = stock['Close'].shift(1).pct_change() * 100
    change = stock['Close'].diff()
    avgain = change.where(change > 0, 0).shift(1).rolling(window=14, min_periods=1).sum() / 14
    avloss = abs(change.where(change < 0, 0).shift(1).rolling(window=14, min_periods=1).sum()) / 14
    stock['RSI'] = (avgain / (avgain + avloss)) * 100
    return stock
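
The RSI expression used above, gains divided by gains plus losses, is algebraically the same as the more familiar form 100 - 100 / (1 + RS) with RS = gains / losses. A minimal sketch to check this on one window of price changes follows; the function names and sample values are hypothetical.

```python
import math


def simple_rsi(changes):
    """RSI as gains / (gains + losses) * 100 over one window of changes."""
    gains = sum(c for c in changes if c > 0)
    losses = abs(sum(c for c in changes if c < 0))
    if gains + losses == 0:
        return float('nan')  # flat window: RSI is undefined
    return gains / (gains + losses) * 100


def classic_rsi(changes):
    """RSI as 100 - 100 / (1 + RS), with RS = gains / losses."""
    gains = sum(c for c in changes if c > 0)
    losses = abs(sum(c for c in changes if c < 0))
    if losses == 0:
        return 100.0 if gains > 0 else float('nan')
    rs = gains / losses
    return 100 - 100 / (1 + rs)


window = [1.0, -0.5, 2.0, -1.5]  # total gains 3.0, total losses 2.0
print(simple_rsi(window), classic_rsi(window))
```

Dividing both sums by 14, as the notebook does, cancels in the ratio and does not change the result.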

## Function: `ins(loc)`

The 'Close' column is extracted for use as the input series in ARIMA modeling.

In [None]:
def ins(loc):
    stock = markers(loc)
    inp = pd.DataFrame()
    inp['Close'] = stock['Close']
    return inp

## Function: `arima(loc)`

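An ARIMA(0,1,0) model is fitted to the first 80% of the series, and the remaining 20% is held out as the test period. Forecasts for the test period are plotted with their confidence intervals, and accuracy is reported as Mean Absolute Error (MAE). Because ARIMA(0,1,0) without a drift term is a random walk, the point forecast should stay essentially flat at the last training value while the interval widens with the horizon.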

In [None]:
def arima(loc):
    inp = ins(loc)
    stock = markers(loc)
    # Chronological 80/20 split: no shuffling, so the test rows are strictly later
    split = int(len(inp) * 0.8)
    dtrain, dtest = inp[:split], inp[split:]
    makem = ARIMA(dtrain, order=(0, 1, 0))
    fitm = makem.fit()
    pred = fitm.get_forecast(steps=len(dtest))
    stock['Predicted'] = np.nan
    predvalues = pred.predicted_mean.values
    stock.loc[dtest.index, 'Predicted'] = predvalues
    confint = pred.conf_int()  # 95% interval by default (alpha=0.05)
    mae = mean_absolute_error(dtest['Close'], predvalues)
    print(f'MAE: {mae:.2f}')
    plt.figure(figsize=(12, 6))
    plt.plot(dtrain.index, dtrain['Close'], label='Train')
    plt.plot(dtest.index, dtest['Close'], label='Actual')
    plt.plot(dtest.index, predvalues, color='red', label='Prediction')
    plt.fill_between(dtest.index, confint.iloc[:, 0], confint.iloc[:, 1], color='pink', alpha=0.3, label='95% confidence interval')
    plt.xlabel('Index')
    plt.ylabel('Close Price')
    plt.title('ARIMA(0,1,0) Forecast vs Actual')
    plt.legend()
    plt.show()
    return stock
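
What an ARIMA(0,1,0) forecast without drift looks like can be sketched by hand, since the model reduces to a random walk: every future point forecast equals the last observed value, and the forecast variance grows linearly with the horizon. The sketch below is an approximation under stated assumptions, not the statsmodels implementation: the function name `random_walk_forecast` is hypothetical, the innovation standard deviation is estimated from zero-mean differences, and the fitted model's variance estimate may differ slightly.

```python
import numpy as np


def random_walk_forecast(y, steps, z=1.96):
    """Flat point forecast with an interval that widens as sqrt(horizon)."""
    y = np.asarray(y, dtype=float)
    diffs = np.diff(y)
    # Innovation standard deviation, assuming zero-mean differences (no drift)
    sigma = np.sqrt(np.mean(diffs ** 2))
    horizon = np.arange(1, steps + 1)
    mean = np.full(steps, y[-1])
    half_width = z * sigma * np.sqrt(horizon)
    return mean, mean - half_width, mean + half_width


mean, lower, upper = random_walk_forecast([100, 101, 99, 102], steps=3)
print(mean)           # constant at the last observed value
print(upper - lower)  # interval width increases with each step
```

This explains why the red prediction line in the plot is horizontal and why the pink band fans out over the test period.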

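## Function: `signal(loc)`

Trading signals are derived by comparing each forecast value with the last known training close: 'Buy' when the forecast is higher, 'Sell' when it is lower, and '--' when the two are equal. The absolute difference is recorded as 'Profit'. Since an ARIMA(0,1,0) forecast is expected to sit at or very near that last close, the signals and profit figures carry little information for this model order.

In [None]: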
def signal(loc):
    stock = arima(loc)
    trainp = stock[stock['Predicted'].isna()]
    testp = stock[stock['Predicted'].notna()].copy()
    testp['Predicted'] = testp['Predicted'].astype(float)
    lastknown = trainp['Close'].iloc[-1]
    testp['Signal'] = np.where(testp['Predicted'] > lastknown, 'Buy', np.where(testp['Predicted'] < lastknown, 'Sell', '--'))
    testp['Profit'] = abs(testp['Predicted'] - lastknown)
    print('Maximum Possible Profit in Test Period:', testp['Predicted'].max() - lastknown)
    # Prints the full row (all columns) at the highest predicted close
    print('Best Date to Sell:', testp.loc[testp['Predicted'].idxmax()])
    return testp
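
The nested `np.where` rule can be tried in isolation. This is a minimal sketch; the helper name `label_signals` and the sample values are hypothetical.

```python
import numpy as np


def label_signals(predicted, last_known):
    """Label each forecast as 'Buy', 'Sell', or '--' relative to last_known."""
    predicted = np.asarray(predicted, dtype=float)
    return np.where(predicted > last_known, 'Buy',
                    np.where(predicted < last_known, 'Sell', '--'))


# A varying forecast produces a mix of labels
print(label_signals([101.0, 99.0, 100.0], last_known=100.0))
# A flat forecast equal to the last close produces only '--'
print(label_signals([100.0, 100.0, 100.0], last_known=100.0))
```

The second call mirrors the flat forecast that a random-walk model would be expected to produce.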

## Running the ARIMA Model

The ARIMA workflow is executed on the specified dataset. Model evaluation metrics, trading signals, and profit calculations are displayed.

In [None]:
# Replace this path with your own
print(signal(Path.home() / 'Downloads' / 'Amazon' / 'amzn.us.csv'))
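
If the dataset is not at hand, the workflow could be tried on synthetic data. The sketch below writes a small random-walk price series with a few missing values to a temporary CSV file. Everything in it is an assumption made for illustration: the helper name `make_sample_csv`, the `Date` and `Close` headers, the row count, and the positions of the gaps.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def make_sample_csv(n_rows=120, seed=0):
    """Write a synthetic Date/Close CSV with a short and a long gap."""
    rng = np.random.default_rng(seed)
    dates = pd.bdate_range('2020-01-01', periods=n_rows)
    # Random-walk prices starting near 100
    close = 100 + np.cumsum(rng.normal(0, 1, n_rows))
    # One single-row gap and one five-row gap
    close[[10, 40, 41, 42, 43, 44]] = np.nan
    path = Path(tempfile.mkdtemp()) / 'sample_stock.csv'
    pd.DataFrame({'Date': dates, 'Close': close}).to_csv(path, index=False)
    return path


sample_path = make_sample_csv()
print(sample_path)
```

The returned path can then be passed to `signal` in place of the Amazon file.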