## Create and Train the Statsmodel

## Steps

### Import Libraries

Start by importing the libraries we will need to train the model.

In [7]:
import pandas as pd
import datetime
import os

from statsmodels.tsa.arima.model import ARIMA
from resources import simdb as simdb

### Train the Model

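Load the `day.csv` data and prepare it for training. The cell below pulls it through the `simdb` connection, selecting the `cnt` values for the one-month window that ends on the seed day.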

In [8]:
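# pull the data the same way it will be pulled in "production"

def mk_dt_range_query(*, tablename: str, seed_day: str) -> str:
    """Build a query for the `cnt` values in the month ending on `seed_day`."""
    assert isinstance(tablename, str)
    assert isinstance(seed_day, str)
    # window: later than one month before seed_day (exclusive), up to seed_day (inclusive)
    query = f"select cnt from {tablename} where date > DATE(DATE('{seed_day}'), '-1 month') AND date <= DATE('{seed_day}')"
    return query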

conn = simdb.get_db_connection()

# create the query
query = mk_dt_range_query(tablename=simdb.tablename, seed_day='2011-03-01')
print(query)

# read in the data
training_frame = pd.read_sql_query(query, conn)
training_frame

select cnt from bikerentals where date > DATE(DATE('2011-03-01'), '-1 month') AND date <= DATE('2011-03-01')


| | cnt |
|---|---|
| 0 | 1526 |
| 1 | 1550 |
| 2 | 1708 |
| 3 | 1005 |
| 4 | 1623 |
| 5 | 1712 |
| 6 | 1530 |
| 7 | 1605 |
| 8 | 1538 |
| 9 | 1746 |
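
The window this query selects can be checked in isolation. The sketch below assumes the simulated database is SQLite (the `DATE(..., '-1 month')` modifier is SQLite syntax) and uses a throwaway in-memory table with made-up counts. The table contents and the helper name `window_for` are illustrative and are not part of `simdb`. The sketch also binds `seed_day` as a query parameter instead of formatting it into the string, which is a choice made here rather than something the original query does.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical stand-in for the simulated database: an in-memory SQLite table
# with one row per day and placeholder counts (not the real rental data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bikerentals (date TEXT, cnt INTEGER)")
start = date(2011, 1, 1)
rows = [((start + timedelta(days=i)).isoformat(), 1000 + i) for i in range(90)]
conn.executemany("INSERT INTO bikerentals VALUES (?, ?)", rows)


def window_for(seed_day: str) -> list:
    """Return the dates matched by the same range predicate as the query above."""
    query = (
        "select date from bikerentals "
        "where date > DATE(DATE(?), '-1 month') AND date <= DATE(?) "
        "order by date"
    )
    return [r[0] for r in conn.execute(query, (seed_day, seed_day))]


dates = window_for("2011-03-01")
print(len(dates), dates[0], dates[-1])
```

For a seed day of 2011-03-01 the lower bound 2011-02-01 is exclusive, so the window runs from 2011-02-02 through 2011-03-01. That is 28 days, which agrees with the 28 values in the JSON payload printed by the next cell.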


In [9]:
# quick local test of the forecast module
import forecast
import json

# serialize the frame in the {"cnt": [...]} form passed to the forecast function
jsonstr = json.dumps(training_frame.to_dict(orient='list'))
print(jsonstr)

forecast.wallaroo_json(jsonstr)

{"cnt": [1526, 1550, 1708, 1005, 1623, 1712, 1530, 1605, 1538, 1746, 1472, 1589, 1913, 1815, 2115, 2475, 2927, 1635, 1812, 1107, 1450, 1917, 1807, 1461, 1969, 2402, 1446, 1851]}


{'forecast': [1764, 1749, 1743, 1741, 1740, 1740, 1740]}

In [10]:
import importlib
importlib.reload(forecast)

<module 'forecast' from '/Users/jhansarick/Storage/github/WallarooLabs/Wallaroo_Tutorials/wallaroo-features/pipeline_multiple_replicas_forecast_tutorial/forecast.py'>

### Prepare Evaluation Data

To make inference easier later, save the evaluation data to a separate JSON file.

In [11]:
# also save the evaluation frame as JSON
import json
with open("testdata_dict.json", "w") as f:
    json.dump(training_frame.to_dict(orient='list'), f)
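
A quick way to confirm that a file written this way reads back unchanged is a round-trip check. The sketch below uses only the first few values of the payload and writes to a temporary directory, so it does not touch the real `testdata_dict.json`. The variable name `evaluation_dict` is illustrative.

```python
import json
import os
import tempfile

# first few values of the payload, for illustration only
evaluation_dict = {"cnt": [1526, 1550, 1708, 1005, 1623]}

# write to a temporary directory so the real testdata_dict.json is untouched
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "testdata_dict.json")
    with open(path, "w") as f:
        json.dump(evaluation_dict, f)
    with open(path) as f:
        reloaded = json.load(f)

# True when the file content matches the dictionary that was written
print(reloaded == evaluation_dict)
```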
