# 2. Building Features with Tecton and Snowflake

In this tutorial, we'll use Tecton and Snowflake to build features for machine learning. We'll cover:
* How features are written in Tecton
* How to use Notebook Driven Development (NDD) to declare and test out a new feature in local code
* How to register features with Tecton
* How to use Tecton Aggregations to do easy window aggregations

## 0. Setup

✅ Run the cells below.

In [1]:
import logging
import os
import tecton
from dotenv import load_dotenv
import pandas as pd
import snowflake.connector
from datetime import date, datetime, timedelta
from pprint import pprint

In [2]:
# Connection details were sent in an email.
# Replace the password placeholder with your own; avoid committing real credentials to a notebook.
%env SNOWFLAKE_USER=bot
%env SNOWFLAKE_PASSWORD=<your-snowflake-password>
%env SNOWFLAKE_ACCOUNT=tectonpartner-tecton_demo_buddy

env: SNOWFLAKE_USER=bot
env: SNOWFLAKE_PASSWORD=<your-snowflake-password>
env: SNOWFLAKE_ACCOUNT=tectonpartner-tecton_demo_buddy
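
Before connecting, it can help to confirm that the three variables set above are actually present. The following is a minimal sketch using only the standard library; the helper name `missing_snowflake_vars` and the example values are hypothetical and not part of Tecton or the Snowflake connector.

```python
import os

# The three variables this notebook reads when building the connection.
REQUIRED_VARS = ("SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD", "SNOWFLAKE_ACCOUNT")


def missing_snowflake_vars(environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper: pass a mapping to test it, or nothing to check os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping, so the check can be tried without real credentials.
example_env = {"SNOWFLAKE_USER": "my_user", "SNOWFLAKE_ACCOUNT": ""}
print(missing_snowflake_vars(example_env))
```

In the notebook itself, calling `missing_snowflake_vars()` with no arguments checks the real environment; an empty list means all three values are available.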


In [18]:
load_dotenv()  # take environment variables from .env.
logging.getLogger('snowflake.connector').setLevel(logging.WARNING)
logging.getLogger('snowflake.snowpark').setLevel(logging.WARNING)

connection_parameters = {
    "user": os.environ['SNOWFLAKE_USER'],
    "password": os.environ['SNOWFLAKE_PASSWORD'],
    "account": os.environ['SNOWFLAKE_ACCOUNT'],
    "warehouse": "TRIAL_WAREHOUSE",
    # Tecton needs a database and schema in which to create its temporary objects
    "database": "BILLIE_SAMPLE_DATA",
    "schema": "PUBLIC"
}
conn = snowflake.connector.connect(**connection_parameters)
tecton.snowflake_context.set_connection(conn) # Tecton will use this Snowflake connection for all interactive queries

# Quick helper function to query snowflake from a notebook
# Make sure to replace with the appropriate connection details for your own account
def query_snowflake(query):
    df = conn.cursor().execute(query).fetch_pandas_all()
    return df

ws = tecton.get_workspace('prod')
tecton.version.summary()

Version: 0.7.0b42
Git Commit: d242a4371078ba54913e2a60e7169bfba4d01623
Build Datetime: 2023-07-06T21:07:50
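
The `query_snowflake` helper above accepts any SQL string. As a hedged sketch, a small builder like the one below can assemble a preview query while rejecting unexpected identifiers. The function name `preview_query` and its validation rules are assumptions made for illustration (the pattern only covers simple unquoted identifiers); the table name is the one previewed later in this tutorial.

```python
import re

# Simple unquoted identifiers only: letters, digits, underscores, and dollar signs.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def preview_query(database, schema, table, limit=10):
    """Build a SELECT * ... LIMIT n query for a fully qualified table.

    Hypothetical helper: it raises ValueError rather than interpolating
    anything that does not look like a plain identifier or a positive integer.
    """
    for part in (database, schema, table):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"unsupported identifier: {part!r}")
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {database}.{schema}.{table} LIMIT {limit}"


query = preview_query("BILLIE_SAMPLE_DATA", "PUBLIC", "AUTH_REQUESTS_UNPACKED")
print(query)
```

The resulting string can be passed to `query_snowflake(query)` once the connection is set up.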


### ❓ Before we start -- Tecton Workspaces

[Workspaces](https://docs.tecton.ai/overviews/workspaces.html) are sandbox environments for experimenting with a Feature Repo without affecting the production environment. Changes made in one Workspace have no effect on other Workspaces.

By default, new "development" workspaces do not have access to materialization and storage resources. Instead, transformations can be run ad-hoc in your Snowflake Warehouse. This means that the Tecton SDK builds a query that reads directly from your raw data tables, and executes it in your Snowflake Warehouse.

This ad-hoc computation functionality can be used in any workspace and allows you to easily test features without needing to backfill and materialize data to the Feature Store.

New workspaces with full materialization and storage resources can be created by adding the `--live` flag to the `tecton workspace create` command shown below. This is useful for creating staging environments to test features online before pushing changes to prod, or for isolating different teams from one another.

**In this tutorial, we'll create a new workspace so that our changes don't affect anyone else's workloads.**

### ✅ Create your own Tecton Workspace
In this tutorial, we'll create a new [Workspace](https://docs.tecton.ai/docs/setting-up-tecton/administration-setup/creating-a-workspace-and-adding-users-to-the-workspace) to test our changes.

Workspaces are created with the Tecton CLI. Create one now by running `tecton workspace create MY_WORKSPACE`:

```
$ tecton workspace create MY_WORKSPACE
Created workspace "MY_WORKSPACE".
Switched to workspace "MY_WORKSPACE".

You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "tecton plan" Tecton will not see any existing state
for this configuration.
```

> 💡 **Tip:** For a complete list of workspace commands, run `tecton workspace -h`.

Then grab a reference to your new Workspace; we'll use it later in the tutorial.

In [5]:
ws = tecton.get_workspace("MY_WORKSPACE")

### ✅ Clone the Sample Feature Repo
In Tecton, a [feature repository](https://docs.tecton.ai/docs/introduction/tecton-concepts#feature-repository) is a collection of declarative Python files that define feature pipelines. In this tutorial, we'll clone a pre-populated feature repository to use as a starting point.

The [sample feature repository for this demo can be found here](https://github.com/tecton-ai-ext/tecton-snowflake-feature-repo). If you checked out that git repository to get a copy of this tutorial, you already have the files you need. If not, clone the sample repository now.

### ✅ Apply the Sample Feature Repo

To register a local feature repository with Tecton, [you'll use the Tecton CLI.](https://docs.tecton.ai/examples/managing-feature-repos.html) Because your new Workspace has nothing registered yet, the first apply should be straightforward.

Navigate to the feature repository's directory in the command line:
```
cd feature_repo
```


Then run the following command to register your feature definitions with Tecton:
```
tecton apply
```


Check that the workspace you are applying to is the correct one, then confirm the plan with `y`.

> 💡 **Tip:** You can always compare your local Feature Repo to the remote Feature Registry before applying it by running `tecton plan`.

## 1. Constructing a new feature
Let's start by building a simple feature -- **the amount of the last transaction a user made**. First, let's run a query against the raw data in Snowflake (feel free to run this yourself in a Snowflake worksheet as well).

In [5]:
# Preview the raw data directly
user_transaction_amount_query = '''
SELECT
    *
FROM
    BILLIE_SAMPLE_DATA.PUBLIC.AUTH_REQUESTS_UNPACKED
LIMIT 10
'''
auth_requests = query_snowflake(user_transaction_amount_query)
auth_requests.head(10)

| | ID | CREATED_AT | MERCHANT_ID | VISITORID | IP | EMAIL | BILLING_ADDRESS | SHIPPING_ADDRESS | LINE_ITEMS |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1921606 | 1679486188000 | -1074003659029159318 | -8636565255437836707 | 6960582723091238351 | -33295019794765792 | {'id': 1921606, 'country': 'DE', 'postal_code'... | {'id': 1921606, 'country': 'DE', 'postal_code'... | [{'title': 'HAZET Adventskalender Santa Tools ... |
| 1 | 1921607 | 1679486197000 | -1074003659029159318 | 2305843009213693951 | 2305843009213693951 | -553434567889759477 | {'id': 1921607, 'country': 'DE', 'postal_code'... | {'id': 1921607, 'country': 'DE', 'postal_code'... | [{'title': 'Fridavo Spiralfeder-Türband Modell... |
| 2 | 1921610 | 1679486266000 | -1074003659029159318 | -589824405368207078 | -1215157676276439474 | -611886056677952324 | {'id': 1921610, 'country': 'DE', 'postal_code'... | {'id': 1921610, 'country': 'DE', 'postal_code'... | [{'title': 'DIN 580 Ringschraube VG M20 1.1141... |
| 3 | 1921613 | 1679486301000 | -1074003659029159318 | 2305843009213693951 | 2305843009213693951 | -6301389037114712840 | {'id': 1921613, 'country': 'DE', 'postal_code'... | {'id': 1921613, 'country': 'DE', 'postal_code'... | [{'title': 'STIER Plattform-Stapler Tragkraft ... |
| 4 | 1921616 | 1679486326000 | -1074003659029159318 | 2305843009213693951 | 2305843009213693951 | -183223317835027590 | {'id': 1921616, 'country': 'DE', 'postal_code'... | {'id': 1921616, 'country': 'DE', 'postal_code'... | [{'title': 'Bosch Akku-Drehschlagschrauber GDR... |
| 5 | 1921617 | 1679486328000 | -1074003659029159318 | -4223637637177766309 | -5342899456393251989 | -5112902425357246269 | {'id': 1921617, 'country': 'DE', 'postal_code'... | {'id': 1921617, 'country': 'DE', 'postal_code'... | [{'title': 'STIER Großer Hygiene- und Reinigun... |
| 6 | 1921621 | 1679486389000 | -1074003659029159318 | 6455987449927227743 | -5190251482354478166 | -6432019298146586702 | {'id': 1921621, 'country': 'DE', 'postal_code'... | {'id': 1921621, 'country': 'DE', 'postal_code'... | [{'title': 'Schutzoverall MICROGARD® 1500 PLUS... |
| 7 | 1921624 | 1679486474000 | -1074003659029159318 | 1813400384698639650 | -5968779026022807797 | -2659081159483958942 | {'id': 1921624, 'country': 'DE', 'postal_code'... | {'id': 1921624, 'country': 'DE', 'postal_code'... | [{'title': 'KNIPEX 10 99 I220 Ohrklemmenzange ... |
| 8 | 1921625 | 1679486482000 | -1074003659029159318 | 6391209288703425623 | 6101083029686041587 | 6492279592149134372 | {'id': 1921625, 'country': 'DE', 'postal_code'... | {'id': 1921625, 'country': 'DE', 'postal_code'... | [{'title': 'Makita Akku-Gebläse 2x18 V DUB361Z... |
| 9 | 1921626 | 1679486556000 | -1074003659029159318 | -8044194553508927271 | 4391889581518918425 | 5480111861921493877 | {'id': 1921626, 'country': 'DE', 'postal_code'... | {'id': 1921626, 'country': 'DE', 'postal_code'... | [{'title': 'Nitrouniversalverdünner 6l Kaniste... |
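
To make the idea of a "last transaction per user" feature concrete, here is a minimal pure-Python sketch over a few rows copied from the preview. It rests on two assumptions that the source does not confirm: that `CREATED_AT` holds milliseconds since the Unix epoch, and that `VISITORID` identifies the user. The helper names are hypothetical, and this is not how Tecton computes the feature; it only illustrates the logic.

```python
from datetime import datetime, timezone

# (ID, CREATED_AT, VISITORID) triples copied from rows 0, 1, 3, and 4 of the preview.
rows = [
    (1921606, 1679486188000, -8636565255437836707),
    (1921607, 1679486197000, 2305843009213693951),
    (1921613, 1679486301000, 2305843009213693951),
    (1921616, 1679486326000, 2305843009213693951),
]


def to_utc(epoch_ms):
    """Convert epoch milliseconds to a UTC datetime (assumed unit: milliseconds)."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)


def latest_per_visitor(rows):
    """Keep, for each VISITORID, the (ID, CREATED_AT) pair with the largest CREATED_AT."""
    latest = {}
    for request_id, created_at, visitor_id in rows:
        if visitor_id not in latest or created_at > latest[visitor_id][1]:
            latest[visitor_id] = (request_id, created_at)
    return latest


latest = latest_per_visitor(rows)
for visitor_id, (request_id, created_at) in latest.items():
    print(visitor_id, request_id, to_utc(created_at).isoformat())
```

A feature view expresses the same "most recent row per key" idea declaratively, so that Tecton can run it against the full table in Snowflake.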


In [19]:
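# "daniel-dev" is the workspace this notebook was originally run in; use the name of your own workspace here.
ws = tecton.get_workspace("daniel-dev")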

In [21]:
from dateutil.parser import parse

fv = ws.get_feature_view('test_feature')
# Alternative: df = fv.get_historical_features(from_source=True)

# Run the feature view over a one-day window, passing a small DataFrame
# as input for the source named authorization_requests.
start_time = parse('2023-06-02')
end_time = parse('2023-06-03')
auth_df = fv.run(
    start_time=start_time,
    end_time=end_time,
    authorization_requests=pd.DataFrame({'ID': [1, 2], 'CREATED_AT': [start_time, end_time]}),
)




In [16]:
auth_df.to_pandas().head(10)

| | ID | VALUE | TIMESTAMP |
|---|---|---|---|
| 0 | 2042003 | 2042003 | 2023-06-02 00:06:45 |
| 1 | 2042004 | 2042004 | 2023-06-02 00:07:21 |
| 2 | 2042005 | 2042005 | 2023-06-02 00:07:28 |
| 3 | 2042008 | 2042008 | 2023-06-02 00:23:29 |
| 4 | 2042010 | 2042010 | 2023-06-02 01:33:34 |
| 5 | 2042015 | 2042015 | 2023-06-02 02:13:49 |
| 6 | 2042016 | 2042016 | 2023-06-02 02:14:48 |
| 7 | 2042049 | 2042049 | 2023-06-02 05:07:45 |
| 8 | 2042050 | 2042050 | 2023-06-02 05:11:00 |
| 9 | 2042051 | 2042051 | 2023-06-02 05:13:01 |
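
A quick sanity check on output like this is to confirm that every `TIMESTAMP` falls inside the requested window. The sketch below uses only the standard library and a few values copied from the output above. Treating the window as half-open (`start <= t < end`) is an assumption made for this check, not a statement about how Tecton defines the range, and `outside_window` is a hypothetical helper.

```python
from datetime import datetime

start_time = datetime(2023, 6, 2)
end_time = datetime(2023, 6, 3)

# TIMESTAMP values copied from rows 0, 1, and 9 of the output above.
timestamps = [
    "2023-06-02 00:06:45",
    "2023-06-02 00:07:21",
    "2023-06-02 05:13:01",
]


def outside_window(values, start, end):
    """Return the parsed timestamps that do not satisfy start <= t < end."""
    parsed = [datetime.strptime(v, "%Y-%m-%d %H:%M:%S") for v in values]
    return [t for t in parsed if not (start <= t < end)]


print(outside_window(timestamps, start_time, end_time))  # prints []
```

With the real result, the same check can be applied to the `TIMESTAMP` column of `auth_df.to_pandas()`.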
