This notebook follows [this tutorial](https://cloud.google.com/architecture/build-visualize-demand-forecast-prediction-datastream-dataflow-bigqueryml-looker#analyze-your-data-in-bigquery) and walks through the steps in the following section:

# Analyze your data in BigQuery

### Run queries against your operational data

The following query returns the three best-selling products by total quantity sold:

In [None]:
SELECT product_name, SUM(quantity) AS total_sales
FROM `retail.ORDERS`
GROUP BY product_name
ORDER BY total_sales DESC
LIMIT 3

To count the rows in both the `ORDERS` and `ORDERS_log` tables:

In [None]:
SELECT COUNT(*) FROM `hackfast.retail.ORDERS_log`;
SELECT COUNT(*) FROM `hackfast.retail.ORDERS`;
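
The two counts above come back as separate result sets. A single query that returns both counts side by side can be easier to compare. This is a minimal sketch that assumes the same `hackfast.retail` dataset; the `table_name` and `row_count` aliases are illustrative names, not part of the tutorial:

In [None]:
-- Return one row per table so the two counts can be compared directly
SELECT 'ORDERS' AS table_name, COUNT(*) AS row_count
FROM `hackfast.retail.ORDERS`
UNION ALL
SELECT 'ORDERS_log' AS table_name, COUNT(*) AS row_count
FROM `hackfast.retail.ORDERS_log`;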

## Build a demand forecasting model in BigQuery ML

### Create the training data and save it to a new table named `training_data`

In [None]:
CREATE OR REPLACE TABLE `retail.training_data`
AS
   -- Aggregate the quantity sold per product per hour
   SELECT
       TIMESTAMP_TRUNC(time_of_sale, HOUR) AS hourly_timestamp,
       product_name,
       SUM(quantity) AS total_sold
   FROM `retail.ORDERS`
   GROUP BY hourly_timestamp, product_name
   -- Keep only the hours that fall inside the training window
   HAVING hourly_timestamp BETWEEN TIMESTAMP_TRUNC('2021-11-22', HOUR)
       AND TIMESTAMP_TRUNC('2021-11-28', HOUR)
   ORDER BY hourly_timestamp

In [None]:
# Verify that the training table was populated
SELECT * FROM `retail.training_data` LIMIT 10;
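
Beyond eyeballing ten rows, it can be useful to confirm that each product covers the expected time window before training. The query below is a minimal sketch of such a check; the `first_hour`, `last_hour`, and `hourly_rows` aliases are illustrative names introduced here:

In [None]:
-- Summarize the time range and number of hourly rows per product
SELECT
    product_name,
    MIN(hourly_timestamp) AS first_hour,
    MAX(hourly_timestamp) AS last_hour,
    COUNT(*) AS hourly_rows
FROM `retail.training_data`
GROUP BY product_name
ORDER BY product_name;

A product with far fewer `hourly_rows` than the others has hours with no recorded sales, which may be worth knowing when reading its forecast.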

## Forecast demand with BigQuery ML ARIMA_PLUS

In [None]:
CREATE OR REPLACE MODEL `retail.arima_plus_model`
       OPTIONS(
         MODEL_TYPE='ARIMA_PLUS',
         TIME_SERIES_TIMESTAMP_COL='hourly_timestamp',
         TIME_SERIES_DATA_COL='total_sold',
         TIME_SERIES_ID_COL='product_name'
       ) AS
SELECT
   hourly_timestamp,
   product_name,
   total_sold
FROM
 `retail.training_data`
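
Once training finishes, it can help to inspect what was fitted for each product before forecasting. A minimal sketch using `ML.ARIMA_EVALUATE`, assuming the model above was created successfully:

In [None]:
-- One row per product_name, describing the ARIMA model selected for that series
SELECT *
FROM ML.ARIMA_EVALUATE(MODEL `retail.arima_plus_model`);

The result should include the fitted ARIMA orders, the AIC, and any detected seasonal periods for each time series.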

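The `ML.FORECAST` function forecasts expected demand over a horizon of n time points. Because the training data is aggregated hourly, the horizon is measured in hours.

To forecast demand over the next 30 days, set the horizon to 720 hours (30 days × 24 hours). The tutorial frames this step around organic bananas, but the query below has no product filter, so it returns forecasts for every `product_name` in the model: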

In [None]:
SELECT * FROM ML.FORECAST(MODEL `retail.arima_plus_model`,  STRUCT(720 AS horizon))
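
To narrow the result to a single product, filter on the `product_name` column that `ML.FORECAST` returns for each time series. This is a minimal sketch: the product name `'Bag of Organic Bananas'` is an assumed value and should be replaced with a name that actually appears in `retail.ORDERS`, and the 0.9 confidence level is an illustrative choice.

In [None]:
-- Forecast 720 hours (30 days) ahead and keep a single product
SELECT
    product_name,
    forecast_timestamp,
    forecast_value,
    prediction_interval_lower_bound,
    prediction_interval_upper_bound
FROM ML.FORECAST(MODEL `retail.arima_plus_model`,
                 STRUCT(720 AS horizon, 0.9 AS confidence_level))
WHERE product_name = 'Bag of Organic Bananas'  -- assumed product name
ORDER BY forecast_timestamp;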