<h1 style="text-align:center;font-weight: bold;"><font color = "0077A7" size = "10px">Snowpark Time-Series Demo
  </font></h1>
<p style="text-align:center;">The <code>FORECAST</code> object creation step does not currently work, but the steps below show how the object would be created. Two approaches are provided:</p>
</br>
<b>
<ol>
    <li>With data from the AtScale semantic layer</li>
    <li>With data in Snowflake using pure SQL to access it </li></ol></b>

# AtScale Connection

In [5]:
from atscale.client import Client
from atscale.data_model import DataModel
from atscale.project import Project
from atscale.db.connections import Snowflake
import json

In [None]:
with open("/permissions") as file:
    permissions = json.load(file)
    
with open("/requirements") as f:
    packages_version = json.load(f)
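
The connection cells in this notebook index into `permissions` directly, so a missing key only surfaces as a `KeyError` when a connection is attempted. A minimal sketch of an up-front check follows. It assumes the file is a flat JSON object containing the keys this notebook reads, and the helper name `missing_permission_keys` is hypothetical.

```python
# Keys this notebook reads from the permissions file.
REQUIRED_KEYS = [
    "atscale_server", "atscale_organization", "atscale_username", "atscale_password",
    "snowflake_account", "snowflake_username", "snowflake_password", "snowflake_role",
    "snowflake_warehouse", "snowflake_database", "snowflake_schema",
]


def missing_permission_keys(permissions, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in their listed order."""
    return [key for key in required if not permissions.get(key)]


# Example with a deliberately incomplete dictionary (placeholder values only).
example = {"atscale_server": "https://example.invalid", "snowflake_account": ""}
missing = missing_permission_keys(example)
print(missing)
```

Running the check right after `json.load` and stopping when the list is non-empty gives a clearer failure than an error raised later inside a client constructor.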

In [6]:
client = Client(
    server = permissions["atscale_server"],
    organization = permissions["atscale_organization"],
    username = permissions["atscale_username"],
    password = permissions["atscale_password"]
)

In [7]:
client.connect()

In [8]:
project = client.select_project(name_contains="M5 Walmart Sales")

Please choose a project:
Automatically selecting only option: "ID: 3e0c5e50-66d9-4ac4-6f3f-1fe7064ea269: Name: M5 Walmart Sales GBQ"
Please choose a published project:
Automatically selecting only option: "ID: 8af58fec-6832-4b46-76a3-f768cdbe4852: Name: M5 Walmart Sales GBQ"


In [9]:
dm = project.select_data_model()

Please choose a data model:
Automatically selecting only option: "ID: f99aafe3-66ac-4df6-5096-1305cef92aa1: Name: m5_walmart_sales"


# Snowpark Set Up

In [1]:
from snowflake.snowpark.functions import col
from snowflake.snowpark import Session

import pkg_resources
import pandas as pd

In [3]:
connection_parameters = {
    "account": permissions["snowflake_account"],
    "user": permissions["snowflake_username"],
    "password": permissions["snowflake_password"],
    "role": permissions["snowflake_role"],
    "warehouse": permissions["snowflake_warehouse"],
    "database": permissions["snowflake_database"],
    "schema": permissions["snowflake_schema"]
}

In [4]:
session = Session.builder.configs(connection_parameters).create()

</br></br><h1 style="text-align:center;font-weight: bold;"><font color = "0077A7" size = "10px"> USING ATSCALE x SNOWFLAKE</font></h1>

<p style="text-align:center;"><font color = "0077A7">This will bring the data from AtScale into Snowflake with <b>Python</b> and use <b>SQL</b> in Snowflake.</font></p></br></br>

# Feature Engineering

In [10]:
# Get numeric features 
num_features = dm.get_all_numeric_feature_names()
num_features

['average_sales',
 'average_units_sold',
 'm_UNITS_SOLD_stddev_pop',
 'max_sales',
 'max_units_sold',
 'new_measure',
 'population_variance_sales',
 'population_variance_units_sold',
 'sample_standard_deviation_sales',
 'sample_standard_deviation_units_sold',
 'sample_variance_units_sold',
 'total_categories',
 'total_departments',
 'total_items',
 'total_sales',
 'total_states',
 'total_stores',
 'total_transactions',
 'total_units_sold',
 'day_over_day_units_sold',
 'new_calculated_measure',
 'previous_days_units_sold',
 'previous_weeks_units_sold',
 'total_sales_30_prd_mv_avg',
 'total_units_sold_30_prd_mv_avg',
 'week_over_week_units_sold']

In [11]:
# Get categorical features 
cat_features = dm.get_all_categorical_feature_names()
cat_features

['year',
 'month',
 'date',
 'day_name',
 'day_of_week',
 'event_name_1',
 'event_name_2',
 'event_type_1',
 'event_type_2',
 'weekday',
 'category',
 'department',
 'item',
 'state',
 'store']

In [12]:
df = dm.get_data(['date', 'total_units_sold', 'weekday', 'day_of_week', 'event_name_1'], limit = 50000)
df.head()

# Using Snowflake

In [13]:
# Write table to Snowflake
table_name = 'TS_DEMO'
session.write_pandas(df, table_name, auto_create_table=True, overwrite=True)

<snowflake.snowpark.table.Table at 0x14be03040>

To create a forecasting object, you must specify a timestamp column whose type is a timestamp. Here we use Snowflake's <code>TO_TIMESTAMP_NTZ()</code> to convert the date column to a timestamp.

In [14]:
# Cast the date column to a timestamp and create a new table.
# The source columns are lowercase and case-sensitive, so they are double-quoted.
timestamp_sql = (
    'CREATE OR REPLACE TABLE timestamp_included '
    '(timestamp, date, total_units_sold, weekday, day_of_week, event_name_1) '
    'AS SELECT TO_TIMESTAMP_NTZ("date"), "date", "total_units_sold", "weekday", "day_of_week", "event_name_1" '
    'FROM TS_DEMO'
)
session.sql(timestamp_sql).show()

--------------------------------------------------
|"status"                                        |
--------------------------------------------------
|Table TIMESTAMP_INCLUDED successfully created.  |
--------------------------------------------------
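
The statement above can also be generated from the column list, which keeps the quoted source names and the unquoted target names in sync. This is only a sketch: `build_timestamp_table_sql` is a hypothetical helper, not part of Snowpark, and it assumes the source columns carry the lowercase, case-sensitive names of the uploaded DataFrame.

```python
def build_timestamp_table_sql(source, target, date_col, other_cols):
    """Build a CREATE TABLE ... AS SELECT that adds a TIMESTAMP_NTZ column."""
    source_cols = [date_col, *other_cols]
    # Target column names are unquoted, so Snowflake stores them in upper case.
    target_cols = ", ".join(["timestamp", *source_cols])
    # Source column names are double-quoted to match their lowercase spelling.
    quoted = ", ".join(f'"{c}"' for c in source_cols)
    return (
        f"CREATE OR REPLACE TABLE {target} ({target_cols}) "
        f'AS SELECT TO_TIMESTAMP_NTZ("{date_col}"), {quoted} '
        f"FROM {source}"
    )


timestamp_sql = build_timestamp_table_sql(
    source="TS_DEMO",
    target="timestamp_included",
    date_col="date",
    other_cols=["total_units_sold", "weekday", "day_of_week", "event_name_1"],
)
print(timestamp_sql)
```

The resulting string can be passed to `session.sql(...)` in the same way as the hand-written statement.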



In [15]:
# Create a view to use as a path for INPUT_DATA
view_sql = "CREATE OR REPLACE VIEW v0 AS SELECT * FROM timestamp_included"
session.sql(view_sql).show()

---------------------------------
|"status"                       |
---------------------------------
|View V0 successfully created.  |
---------------------------------



In [16]:
# See new Table
session.sql('SELECT * FROM timestamp_included').show()

------------------------------------------------------------------------------------------------------
|"TIMESTAMP"          |"DATE"      |"TOTAL_UNITS_SOLD"  |"WEEKDAY"  |"DAY_OF_WEEK"  |"EVENT_NAME_1"  |
------------------------------------------------------------------------------------------------------
|2011-01-29 00:00:00  |2011-01-29  |3933                |Saturday   |1              |                |
|2011-01-30 00:00:00  |2011-01-30  |3841                |Sunday     |2              |                |
|2011-01-31 00:00:00  |2011-01-31  |2709                |Monday     |3              |                |
|2011-02-01 00:00:00  |2011-02-01  |2905                |Tuesday    |4              |                |
|2011-02-02 00:00:00  |2011-02-02  |2289                |Wednesday  |5              |                |
|2011-02-03 00:00:00  |2011-02-03  |3546                |Thursday   |6              |                |
|2011-02-04 00:00:00  |2011-02-04  |3473                |Friday     |7   

In [27]:
# Column Info
p = pd.DataFrame(session.sql('SHOW COLUMNS IN TABLE timestamp_included').collect())
p

Unnamed: 0,table_name,schema_name,column_name,data_type,null?,default,kind,expression,comment,database_name,autoincrement
0,TIMESTAMP_INCLUDED,SNOWPARK_TESTING,TIMESTAMP,"{""type"":""TIMESTAMP_NTZ"",""precision"":0,""scale"":...",True,,COLUMN,,,AI_LINK,
1,TIMESTAMP_INCLUDED,SNOWPARK_TESTING,DATE,"{""type"":""DATE"",""nullable"":true}",True,,COLUMN,,,AI_LINK,
2,TIMESTAMP_INCLUDED,SNOWPARK_TESTING,TOTAL_UNITS_SOLD,"{""type"":""FIXED"",""precision"":38,""scale"":0,""null...",True,,COLUMN,,,AI_LINK,
3,TIMESTAMP_INCLUDED,SNOWPARK_TESTING,WEEKDAY,"{""type"":""TEXT"",""length"":16777216,""byteLength"":...",True,,COLUMN,,,AI_LINK,
4,TIMESTAMP_INCLUDED,SNOWPARK_TESTING,DAY_OF_WEEK,"{""type"":""FIXED"",""precision"":38,""scale"":0,""null...",True,,COLUMN,,,AI_LINK,
5,TIMESTAMP_INCLUDED,SNOWPARK_TESTING,EVENT_NAME_1,"{""type"":""TEXT"",""length"":16777216,""byteLength"":...",True,,COLUMN,,,AI_LINK,


<p style="text-align:center;"><font color = "red">Internal errors currently prevent this part of the demo from running, but this is where we would create the time-series object that acts as our model.</font></p>

In [None]:
%%time
time_col = 'TIMESTAMP'
target_col = 'TOTAL_UNITS_SOLD'

# Create SNOWFLAKE.ML.FORECAST object
sql = (
    "CREATE OR REPLACE SNOWFLAKE.ML.FORECAST demo_time_series( "
    "INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'v0'), "
    f"TIMESTAMP_COLNAME => '{time_col}', "
    f"TARGET_COLNAME => '{target_col}')"
)

session.sql(sql).show()
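
Because the statement is assembled from Python strings, a stray quote or an unexpected column name ends up directly in the SQL. The sketch below validates the names before building the same statement. `build_forecast_sql` and its conservative identifier pattern are assumptions made for illustration, and the check does not address the internal error described above.

```python
import re

# Conservative pattern for unquoted identifiers: letters, digits, underscore, $.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def build_forecast_sql(model_name, view_name, time_col, target_col):
    """Build the CREATE SNOWFLAKE.ML.FORECAST statement used in this notebook."""
    for name in (model_name, view_name, time_col, target_col):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"Unexpected identifier: {name!r}")
    return (
        f"CREATE OR REPLACE SNOWFLAKE.ML.FORECAST {model_name}("
        f"INPUT_DATA => SYSTEM$REFERENCE('VIEW', '{view_name}'), "
        f"TIMESTAMP_COLNAME => '{time_col}', "
        f"TARGET_COLNAME => '{target_col}')"
    )


forecast_sql = build_forecast_sql("demo_time_series", "v0", "TIMESTAMP", "TOTAL_UNITS_SOLD")
print(forecast_sql)
```

Rejecting anything outside the pattern is stricter than Snowflake itself, which also accepts quoted identifiers, but it keeps the generated statement predictable for a demo.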

</br></br><h1 style="text-align:center;font-weight: bold;"><font color = "0077A7" size = "10px"> PURE SQL APPROACH
  </font></h1>
  <p style="text-align:center;"><font color = "0077A7">This will only use Snowflake and the data already inside of it through <b>SQL</b>.</font></p></br></br>


Get the data

In [None]:
# Cast DT to a timestamp and create a table
session.sql('CREATE OR REPLACE TABLE timestamp_included (timestamp, date, cat_id, dept_id, item_id, units_sold) '
            'AS SELECT TO_TIMESTAMP(dt), dt, cat_id, dept_id, item_id, units_sold '
            'FROM M5_TIME_SERIES_THIN').show()

In [None]:
## Turn into a dataframe
df = session.table('timestamp_included')
df.show()

Make a View

In [None]:
# Create a view
view_sql = "CREATE OR REPLACE VIEW v2 AS SELECT * FROM timestamp_included"
session.sql(view_sql).show()

In [None]:
## See the View
df = session.sql('SELECT * from v2').to_pandas()
df

Make a Forecast Object

In [None]:
# Create SNOWFLAKE.ML.FORECAST object
sql ="CREATE OR REPLACE SNOWFLAKE.ML.FORECAST demo_time_series(" + \
     "INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'v2')," + \
     "TIMESTAMP_COLNAME => 'TIMESTAMP'," + \
     "TARGET_COLNAME => 'UNITS_SOLD')"
session.sql(sql).show()
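
Once a forecast object exists, predictions are requested by calling its `FORECAST` method. Since object creation currently fails in this demo, the call below has not been run here. It is a sketch that follows Snowflake's `<model_name>!FORECAST(FORECASTING_PERIODS => n)` form, and `build_forecast_call` is a hypothetical helper name.

```python
def build_forecast_call(model_name, periods):
    """Build a CALL statement requesting `periods` future steps from a forecast model."""
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(periods, bool) or not isinstance(periods, int) or periods < 1:
        raise ValueError("periods must be a positive integer")
    return f"CALL {model_name}!FORECAST(FORECASTING_PERIODS => {periods})"


call_sql = build_forecast_call("demo_time_series", 14)
print(call_sql)
```

With a working model, the string would be passed to `session.sql(call_sql)` like the other statements in this notebook.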