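# References
- [What is the AR model in time series analysis?](https://ai-trend.jp/basic-study/time-series-analysis/time-series-analysis-armodel/)
  - The model implemented in this notebook is an ARX model: AR plus exogenous variables.
- [Taking on "TV commercial effectiveness measurement" with a SARIMAX model in Python](https://www.lifull.blog/entry/2019/12/25/151030#%E3%81%AF%E3%81%98%E3%82%81%E3%81%AB-SARIMAX%E3%83%A2%E3%83%87%E3%83%AB%E3%81%A8%E3%81%AF)
  - The X stands for exogenous variables. In this notebook's example, DIA and SPY are assumed to be the exogenous variables.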
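To make the ARX idea concrete, here is a minimal sketch on synthetic data, using only NumPy. It regresses the current value on its own lags plus one exogenous variable. The helper `build_arx_design`, the lag count, and the use of the same-period exogenous value are all assumptions made for illustration; the notebook's `Datamart` class may build its features differently.

```python
import numpy as np


def build_arx_design(y, x, num_lag):
    """Return (X, target) for an ARX(num_lag) regression.

    Hypothetical helper: each row of X holds a constant, the previous
    num_lag values of y (most recent first), and the exogenous values
    observed at the same time step as the target.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    rows = []
    for t in range(num_lag, len(y)):
        lags = y[t - num_lag:t][::-1]  # y[t-1], y[t-2], ..., y[t-num_lag]
        rows.append(np.concatenate(([1.0], lags, x[t])))
    return np.array(rows), y[num_lag:]


# Synthetic series: y_t = 0.5 + 0.6*y_{t-1} - 0.2*y_{t-2} + 0.8*x_t + noise
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 + 0.6 * y[t - 1] - 0.2 * y[t - 2] + 0.8 * x[t] + rng.normal(scale=0.1)

X, target = build_arx_design(y, x, num_lag=2)

# Ordinary least squares; coefficients are ordered as
# [const, lag 1, lag 2, exogenous].
coef, *_ = np.linalg.lstsq(X, target, rcond=None)
print(coef)
```

The estimated coefficients should land close to the values used to generate the series, which is the same kind of fit that `sm.OLS` performs later in this notebook.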

# Environment

## Libraries

In [2]:
import sys

sys.path.append("../")
import pandas as pd
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

from datamart import Datamart
from feature import Feature
from model import Model
from name import Name
from raw_data import RawData
from symbol_data import SymbolData

## Functions & Classes

In [5]:
def create_datamart(
    ticker: str,
    num_lag: int = 5,
    days_before: int = 1,
    single_values: str = "close",
    nation: str = "US",
) -> pd.DataFrame:
    """Build the datamart for the given ticker."""
    name = Name(ticker, nation)
    ticker = name.ticker
    symbol_data = SymbolData(ticker).symbol_data
    raw_data = RawData(symbol_data).raw_data
    return Datamart(raw_data, single_values, num_lag, days_before, ticker).datamart

In [60]:
class ModelARX:
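    """ARX model that predicts the future price movement (up or down).
    Uses an ordinary least squares (OLS) regression from statsmodels as the predictor.
    Args:
        datamart: datamart of the stock to predict; must contain a "target" column
        feature: exogenous explanatory variables (datamarts of other symbols)
    """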

    def __init__(self, datamart, feature):
        self._datamart = datamart
        self._feature = feature

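    def fit(self):
        # Join the target stock's datamart with the exogenous features column-wise.
        self.df = pd.concat([self._datamart, self._feature], axis=1)
        # Use every column except the first as regressors, plus an intercept.
        self.X = self.df.iloc[:, 1:]
        self.X = sm.add_constant(self.X)
        self.y = self.df["target"]
        # Note: train_test_split shuffles rows by default, so this is a random
        # hold-out rather than a chronological one.
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(
            self.X, self.y, test_size=0.3, random_state=0
        )
        self.reg = sm.OLS(self.y_train, self.X_train).fit()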

    def predict(self):
        self.y_pred = self.reg.predict(self.X_test)
        return self.y_pred

    def summary(self):
        print(self.reg.summary())
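`ModelARX.fit` relies on `train_test_split`, which shuffles rows by default. For time series this means some test rows can precede training rows in time. The sketch below shows one possible alternative, a chronological split that keeps the most recent rows for testing. The helper name `chronological_split` and the toy data are hypothetical and not part of the original notebook.

```python
import pandas as pd


def chronological_split(X, y, test_size=0.3):
    """Split X and y so that the test set holds the most recent rows.

    Hypothetical helper; assumes X and y share a unique, sortable index.
    """
    X = X.sort_index()
    y = y.loc[X.index]
    n_test = round(len(X) * test_size)
    n_train = len(X) - n_test
    return X.iloc[:n_train], X.iloc[n_train:], y.iloc[:n_train], y.iloc[n_train:]


# Toy data standing in for the datamart (illustrative values only).
index = pd.date_range("2021-01-01", periods=10, freq="D")
X = pd.DataFrame({"lag_1": range(10)}, index=index)
y = pd.Series(range(10), index=index, name="target")

X_train, X_test, y_train, y_test = chronological_split(X, y)
print(len(X_train), len(X_test))
print(X_train.index.max() < X_test.index.min())
```

When the rows are already sorted by time, passing `shuffle=False` to `train_test_split` should give a comparable split without a custom helper.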

# Main

In [61]:
datamart_msft = create_datamart("msft")
datamart_dia = create_datamart("dia")
datamart_spy = create_datamart("spy")

In [7]:
features = Feature([datamart_dia, datamart_spy]).concat_datamarts()

In [62]:
model = ModelARX(datamart_msft, features)

In [63]:
model.fit()

In [64]:
model.predict()

timestamp
1611671400000    0.807774
1643207400000    1.358820
1631021400000    0.423499
1634563800000    0.723877
1613572200000    0.523351
                   ...   
1637764200000    0.468806
1625059800000    0.524280
1642516200000   -0.172654
1623936600000    0.794442
1616592600000    0.303118
Length: 84, dtype: float64
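The index of the prediction output looks like Unix epoch timestamps in milliseconds. Assuming that is the case, and that they are in UTC, a short sketch for turning them into readable, sorted datetimes could look like this. The three values are copied from the output above; everything else is illustrative.

```python
import pandas as pd

# First three predictions copied from the output above.
y_pred = pd.Series(
    [0.807774, 1.358820, 0.423499],
    index=pd.Index([1611671400000, 1643207400000, 1631021400000], name="timestamp"),
)

# Assumption: the index holds Unix epoch milliseconds in UTC.
y_pred.index = pd.to_datetime(y_pred.index, unit="ms", utc=True)

# Sort chronologically, since the random train/test split shuffled the rows.
y_pred = y_pred.sort_index()
print(y_pred)
```

Under this assumption the first timestamp corresponds to 2021-01-26 14:30 UTC, which matches the US market open (09:30 Eastern time).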

In [65]:
model.summary()

                            OLS Regression Results                            
Dep. Variable:                 target   R-squared:                       0.580
Model:                            OLS   Adj. R-squared:                  0.537
Method:                 Least Squares   F-statistic:                     13.58
Date:                Sat, 12 Feb 2022   Prob (F-statistic):           8.45e-25
Time:                        16:27:35   Log-Likelihood:                -57.055
No. Observations:                 196   AIC:                             152.1
Df Residuals:                     177   BIC:                             214.4
Df Model:                          18                                         
Covariance Type:            nonrobust                                         
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
const              0.1592      0.758      0.