# Time Series Prediction with BQML and AutoML

Objectives
 - Learn how BQML and AutoML Tables can be used when building time series models
 
In this lab we will explore building a classification model and a regression model with BQML, and then turn to AutoML Tables.

In [None]:
# Ensure that the TensorFlow 2.0 nightly preview is installed.
!pip3 freeze | grep tf-nightly-2.0-preview || pip3 install tf-nightly-2.0-preview

## Set up environment variables and load necessary libraries

In [None]:
PROJECT = "munn-sandbox"  # Replace with your PROJECT
REGION = "us-east1"            # Choose an available region for Cloud MLE

In [None]:
import os
os.environ["PROJECT"] = PROJECT
os.environ["REGION"] = REGION

In [None]:
!pip freeze | grep google-cloud-bigquery==1.6.1 || pip install google-cloud-bigquery==1.6.1

In [None]:
# Cell magic that lets Python variables be used inside a SQL query.
from IPython.core.magic import register_cell_magic
from IPython import get_ipython

@register_cell_magic('with_globals')
def with_globals(line, cell):
    contents = cell.format(**globals())
    if 'print' in line:
        print(contents)
    get_ipython().run_cell(contents)
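
To see what the magic does before it hands the cell to IPython, the sketch below applies the same `str.format` substitution to a query string outside of a notebook. The `render` helper, the `LIMIT` variable, and the project id are hypothetical names introduced only for this illustration; in the notebook the dictionary of variables is `globals()`.

```python
def render(cell, variables):
    """Fill {name} placeholders in a cell body from a dict of variables."""
    return cell.format(**variables)


# Hypothetical values standing in for the notebook's global variables.
variables = {"PROJECT": "my-project", "LIMIT": 10}

cell = "SELECT * FROM `{PROJECT}.stock_market.percent_change_sp500` LIMIT {LIMIT}"

# Same substitution that with_globals performs on the cell body.
rendered = render(cell, variables)
print(rendered)
```

Because the substitution relies on `str.format`, any literal braces in the SQL have to be doubled (`{{` and `}}`) so that they are not treated as placeholders.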

## Review the dataset

In the previous lab we created the training dataset we will use for modeling. The cells below query it from the `stock_market.percent_change_sp500` table.

In [None]:
%load_ext google.cloud.bigquery

In [None]:
%%bigquery --project $PROJECT
SELECT *
FROM stock_market.percent_change_sp500
LIMIT 10

## Using BQML

### Create classification model for `direction`

To create a model:
1. Use `CREATE MODEL` and provide a destination for the resulting model. Alternatively, use `CREATE OR REPLACE MODEL`, which allows overwriting an existing model.
2. Use `OPTIONS` to specify the model type (`linear_reg` or `logistic_reg`). There are many more options [we could specify](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create#model_option_list), such as regularization and learning rate, but we'll accept the defaults.
3. Provide the query that fetches the training data.
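
The three steps above can be sketched as a small Python helper that assembles the statement text. This is only an illustration of how the pieces fit together: `build_create_model_sql` is a hypothetical function, not part of any BigQuery library, and it only builds a string without running anything.

```python
def build_create_model_sql(model_name, model_type, label_col, training_query):
    """Assemble a BQML CREATE OR REPLACE MODEL statement (hypothetical helper)."""
    # Step 2: this lab only uses these two model types.
    allowed = {"linear_reg", "logistic_reg"}
    if model_type not in allowed:
        raise ValueError(f"model_type must be one of {sorted(allowed)}")
    return (
        # Step 1: destination for the resulting model.
        f"CREATE OR REPLACE MODEL {model_name}\n"
        # Step 2: model options; everything else is left at its default.
        f'OPTIONS(model_type = "{model_type}",\n'
        f'        input_label_cols = ["{label_col}"]) AS\n'
        # Step 3: the query that fetches the training data.
        f"{training_query}"
    )


sql = build_create_model_sql(
    model_name="stock_market.direction_model",
    model_type="logistic_reg",
    label_col="direction",
    training_query="SELECT * FROM stock_market.percent_change_sp500",
)
print(sql)
```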

Have a look at [Step Two of this tutorial](https://cloud.google.com/bigquery/docs/bigqueryml-natality) to see another example.

**The query will take about two minutes to complete**


We'll start with creating a classification model to predict the `direction` of each stock. 

We'll take a repeatable, hash-based sample using the `symbol` value. With about 500 distinct symbols, the filter `MOD(ABS(FARM_FINGERPRINT(symbol)), 25) = 1` keeps roughly 1 in 25 of them, or about 20 distinct `symbol` values, which still corresponds to ______ training examples.
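
The idea behind the hash-based filter can be sketched in plain Python. Note the assumptions: MD5 from `hashlib` is used here only as a stand-in for `FARM_FINGERPRINT` (it is a different hash and will select different symbols), and the ticker names are made up. The point is that every row for a given symbol lands in the same bucket, and that keeping one bucket out of 25 retains roughly 1/25 of the symbols.

```python
import hashlib


def bucket(symbol, num_buckets=25):
    """Map a symbol to a stable bucket in [0, num_buckets)."""
    # Stand-in for FARM_FINGERPRINT: MD5 is NOT the same hash function.
    digest = hashlib.md5(symbol.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


# Hypothetical ticker symbols SYM000 .. SYM499 (about 500, as in the lab).
symbols = [f"SYM{i:03d}" for i in range(500)]

# Keep only the symbols that fall into bucket 1, mirroring "... = 1" in SQL.
selected = [s for s in symbols if bucket(s) == 1]
print(len(selected), "of", len(symbols), "symbols selected")
```

Since the bucket depends only on the symbol, rerunning the query selects the same symbols each time, which keeps the sample reproducible.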

In [None]:
%%bigquery --project $PROJECT
CREATE OR REPLACE MODEL
  stock_market.direction_model OPTIONS(model_type = "logistic_reg",
    input_label_cols = ["direction"]) AS
  -- query to fetch training data
SELECT
  symbol,
  Date,
  Open,
  Close,
  tomorrow_close,
  tomo_close_m_close,
  close_MIN_prior_5_days,
  close_MIN_prior_20_days,
  close_MIN_prior_260_days,
  close_MAX_prior_5_days,
  close_MAX_prior_20_days,
  close_MAX_prior_260_days,
  close_AVG_prior_5_days,
  close_AVG_prior_20_days,
  close_AVG_prior_260_days,
  close_STDDEV_prior_5_days,
  close_STDDEV_prior_20_days,
  close_STDDEV_prior_260_days,
  direction
FROM
  `stock_market.percent_change_sp500`
WHERE
  normalized_change IS NOT NULL
  AND tomorrow_close IS NOT NULL
  AND tomo_close_m_close IS NOT NULL
  AND MOD(ABS(FARM_FINGERPRINT(symbol)), 25) = 1

### Get training statistics

To get the training results we use the [`ML.TRAINING_INFO`](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-train) function. After that, `ML.EVALUATE` reports evaluation metrics for the trained model.

In [None]:
%%bigquery --project $PROJECT
SELECT
    *
FROM
    ML.TRAINING_INFO(MODEL `stock_market.direction_model`)

In [None]:
%%bigquery --project $PROJECT
SELECT
  *
FROM
  ml.EVALUATE(MODEL `stock_market.direction_model`)
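
For a `logistic_reg` model, `ML.EVALUATE` reports classification metrics such as precision, recall, accuracy, and F1 score. As a rough illustration of what those numbers mean, the sketch below computes them by hand for a tiny made-up set of labels. The `UP`/`DOWN` values and the binary setup are assumptions for illustration only and are not taken from the dataset.

```python
def classification_metrics(actual, predicted, positive):
    """Compute precision, recall, accuracy and F1 for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1_score": f1}


# Made-up labels, purely for illustration.
actual = ["UP", "UP", "DOWN", "DOWN", "UP", "DOWN", "UP", "DOWN"]
predicted = ["UP", "DOWN", "DOWN", "UP", "UP", "DOWN", "UP", "DOWN"]

metrics = classification_metrics(actual, predicted, positive="UP")
print(metrics)
```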

### Create regression model for `normalized_change`

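As above, we use `CREATE OR REPLACE MODEL`, this time with `model_type = "linear_reg"` and `normalized_change` as the label column.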


In [None]:
%%bigquery --project $PROJECT
CREATE OR REPLACE MODEL
  stock_market.price_model OPTIONS(model_type = "linear_reg",
    input_label_cols = ["normalized_change"]) AS
  -- query to fetch training data
SELECT
  symbol,
  Date,
  Open,
  Close,
  tomorrow_close,
  tomo_close_m_close,
  close_MIN_prior_5_days,
  close_MIN_prior_20_days,
  close_MIN_prior_260_days,
  close_MAX_prior_5_days,
  close_MAX_prior_20_days,
  close_MAX_prior_260_days,
  close_AVG_prior_5_days,
  close_AVG_prior_20_days,
  close_AVG_prior_260_days,
  close_STDDEV_prior_5_days,
  close_STDDEV_prior_20_days,
  close_STDDEV_prior_260_days,
  normalized_change
FROM
  `stock_market.percent_change_sp500`
WHERE
  normalized_change IS NOT NULL
  AND tomorrow_close IS NOT NULL
  AND tomo_close_m_close IS NOT NULL
  AND MOD(ABS(FARM_FINGERPRINT(symbol)), 30) = 1

In [None]:
%%bigquery --project $PROJECT
SELECT
    *
FROM
    ML.TRAINING_INFO(MODEL `stock_market.price_model`)

In [None]:
%%bigquery --project $PROJECT
SELECT
  *
FROM
  ml.EVALUATE(MODEL `stock_market.price_model`)
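
For a `linear_reg` model, `ML.EVALUATE` reports regression metrics including mean absolute error, mean squared error, and R² score. The sketch below computes these three by hand on a handful of made-up numbers, just to show how they are defined; the values are not from the dataset.

```python
def regression_metrics(actual, predicted):
    """Compute mean absolute error, mean squared error and R^2."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n

    # R^2 compares the residual error with the variance around the mean.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"mean_absolute_error": mae,
            "mean_squared_error": mse,
            "r2_score": r2}


# Made-up values, purely for illustration.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

reg_metrics = regression_metrics(actual, predicted)
print(reg_metrics)
```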

## Using AutoML Tables

We will look at either classification or regression.