In [None]:
# import tecton and other libraries
import os
import tecton
import pandas as pd
from datetime import datetime, timedelta

ws = tecton.get_workspace('prod')

# Data Sources
First, we will connect Tecton to a batch data source. Once a data source is defined in Tecton, it can be used to build features. The code below allows Tecton to read a data set of example transactions. It has already been added to your workspace.

```
from tecton import BatchSource, FileConfig
from datetime import datetime, timedelta

transactions = BatchSource(
    name="transactions",
    batch_config=FileConfig(
        uri="s3://tecton.ai.public/tutorials/fraud_demo/transactions/data.pq",
        file_format="parquet",
        timestamp_field="timestamp",
    ),
)
```

Use the example above to create a customers BatchSource with:
* The **name** set to ***customers***
* The **uri** set to ***s3://tecton.ai.public/tutorials/fraud_demo/customers/data.pq***
* The same **file_format** as above
* The **timestamp_field** set to ***signup_timestamp***

In [None]:
## add import statements here


## add BatchSource definition here
customers = BatchSource()

# Feature Views
With some data sources defined, we can build feature views off them. *Feature views* are used by Tecton to define how, when, and where it will materialize features. Feature views can be Batch Feature Views, Stream Feature Views, or On-Demand Feature Views. Below is an example of an incomplete Batch Feature View.

```
from tecton import Entity, BatchSource, FileConfig, batch_feature_view, FilteredSource
from datetime import datetime, timedelta

user = Entity(name="user", join_keys=["user_id"])

@batch_feature_view(
    sources=[FilteredSource(customers)],
    entities=[user],
    mode="spark_sql",
    batch_schedule=timedelta(days=1),
    ttl=timedelta(days=3650),
)
def user_credit_card_issuer(customers):
    return f"""
        SELECT
            user_id,
            signup_timestamp,
            CASE SUBSTRING(CAST(cc_num AS STRING), 0, 1)
                WHEN '4' THEN 'Visa'
                ELSE 'other'
            END as credit_card_issuer
        FROM
            {customers}
        """
```

In the empty cell below, redefine the feature so that a first digit of **'5'** maps to ***'MasterCard'*** and a first digit of **'6'** maps to ***'Discover'***.

Afterwards, running the validate() cell will ensure that Tecton can reach the specified data source and that it has the proper schema.
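
The `CASE` expression maps the first digit of the card number to an issuer name. As a plain-Python sketch of the same lookup with the two extra cases added (the function name `issuer_from_card_number` and the dictionary are illustrative and not part of the Tecton SDK), the logic might look like this:

```python
# Illustrative only: mirrors the SQL CASE expression in plain Python.
ISSUER_BY_FIRST_DIGIT = {
    "4": "Visa",
    "5": "MasterCard",
    "6": "Discover",
}

def issuer_from_card_number(cc_num):
    """Return the issuer name for a card number, or 'other' if unrecognized."""
    first_digit = str(cc_num)[:1]
    return ISSUER_BY_FIRST_DIGIT.get(first_digit, "other")

print(issuer_from_card_number(4111111111111111))
print(issuer_from_card_number(5500000000000004))
print(issuer_from_card_number(340000000000009))
```

In the feature view itself, the same mapping is expressed as additional `WHEN` branches ahead of the `ELSE 'other'` fallback.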

In [None]:
user_credit_card_issuer.validate()

# Feature Views with Aggregations
Aggregations can simplify the implementation of common, powerful features. For this feature view, we will perform several aggregations to show the average and total transaction amounts for each user over given time periods. We will use just two aggregation functions, but Tecton provides [many different aggregations](https://docs.tecton.ai/docs/sdk-reference/time-window-aggregation-functions#docusaurus_skipToContent_fallback).

```
from tecton import Aggregation, batch_feature_view, Entity, FilteredSource
from datetime import datetime, timedelta

transactions = ws.get_data_source("transactions")

@batch_feature_view(
    sources=[FilteredSource(transactions)],
    entities=[user],
    mode="spark_sql",
    batch_schedule=timedelta(days=1),
    aggregation_interval=timedelta(minutes=5),
    aggregations=[
        Aggregation(column="amt", function="count", time_window=timedelta(minutes=15)),
        Aggregation(column="amt", function="count", time_window=timedelta(minutes=30)),
        Aggregation(column="amt", function="count", time_window=timedelta(hours=1)),
        Aggregation(column="amt", function="sum", time_window=timedelta(minutes=15)),
        Aggregation(column="amt", function="sum", time_window=timedelta(minutes=30)),
        Aggregation(column="amt", function="sum", time_window=timedelta(hours=1)),
    ],
    online=True,
    offline=True,
    feature_start_time=datetime(2020, 10, 10),
    tags={'release': 'production'},
    owner='kevin@tecton.ai',
    description='Transaction amount statistics and total over a series of time windows, updated daily.'
)
def user_transaction_amount_metrics(transactions):
    return f'''
        SELECT
            user_id,
            amt,
            timestamp
        FROM
            {transactions}
        '''
```

In the empty cell below, create a batch_feature_view with the following changes:

* Change the **aggregation_interval** from *5 minutes* to ***days=1***.
  * This will match our batch_schedule
* Change the **15 minute**, **30 minute** and **1 hour** time windows to ***days=1***, ***days=7***, ***days=28***
  * Because this batch source is refreshed only once a day, aggregation windows shorter than one day will not be helpful.
* Change the aggregations using **count** to use ***mean***
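
To make the effect of these settings concrete, here is a minimal plain-Python sketch of trailing-window aggregation. It does not use Tecton: the helper `trailing_window_stats` and the sample rows are hypothetical, the window boundaries are an assumption rather than Tecton's exact semantics, and the feature-name pattern `<column>_<function>_<window>_<interval>` is inferred from the name `amt_mean_7d_1d` that appears later in this lab.

```python
from datetime import datetime, timedelta

# Hypothetical sample transactions for one user: (timestamp, amt).
rows = [
    (datetime(2022, 12, 20, 12), 10.0),
    (datetime(2023, 1, 5, 12), 30.0),
    (datetime(2023, 1, 7, 12), 50.0),
]

def trailing_window_stats(rows, as_of, window):
    """Sum and mean of amt for rows with as_of - window <= timestamp < as_of."""
    amounts = [amt for ts, amt in rows if as_of - window <= ts < as_of]
    total = sum(amounts)
    mean = total / len(amounts) if amounts else None
    return total, mean

as_of = datetime(2023, 1, 8)
features = {}
for days in (1, 7, 28):
    total, mean = trailing_window_stats(rows, as_of, timedelta(days=days))
    # Name pattern assumed from 'amt_mean_7d_1d': column_function_window_interval.
    features[f"amt_sum_{days}d_1d"] = total
    features[f"amt_mean_{days}d_1d"] = mean

print(features)
```

Each longer window picks up more of the sample rows, which is why the 1-day, 7-day, and 28-day values differ for the same user.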

In [None]:
user_transaction_amount_metrics.validate()

# On-Demand Feature Views
An On-Demand Feature View is used to run row-level, request-time transformations on data from Request Sources, Batch Feature Views, or Stream Feature Views. Unlike Batch and Stream Feature Views, On-Demand Feature Views do not precompute and materialize data to the Feature Store, but instead run transformations both online and offline at the time of the request.

We can build on top of the feature view we just made to compare historical values to requests happening at transaction time. Run the cell below, which defines the input and output Tecton should expect at transaction time.

In [None]:
from tecton import RequestSource
from tecton.types import Float64, Field, Bool

request_schema = [Field('amt', Float64)]
transaction_request = RequestSource(schema=request_schema)
output_schema = [Field('transaction_amount_is_higher_than_7d_average', Bool)]

We can use this new source to create an on_demand_feature_view below.

The @on_demand_feature_view decorator takes the following parameters:
* **sources**: a list of 2 sources, the incoming **transaction_request** and the previously built feature view **user_transaction_amount_metrics**.
* **mode**: ***'python'***
* **schema**: the **output_schema** that was just defined.

We will define the feature, whether the incoming transaction amount is higher than the current 7-day average for a particular user, as follows:
```{python}
    amount_mean = 0 if user_transaction_amount_metrics['amt_mean_7d_1d'] is None else user_transaction_amount_metrics['amt_mean_7d_1d']
    return {'transaction_amount_is_higher_than_7d_average': transaction_request['amt'] > amount_mean}
```

Use this as the feature definition and validate the on_demand_feature_view.
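
Before wiring the logic into Tecton, it can help to check it against plain dictionaries. The sketch below is not Tecton code: the function name `is_higher_than_7d_average` is hypothetical, and it assumes both inputs arrive as dict-like objects, as the feature definition above suggests.

```python
def is_higher_than_7d_average(transaction_request, user_transaction_amount_metrics):
    """Plain-Python stand-in for the on-demand feature body."""
    amount_mean = user_transaction_amount_metrics["amt_mean_7d_1d"]
    if amount_mean is None:
        # No 7-day average available: fall back to 0, as in the lab definition.
        amount_mean = 0
    return {
        "transaction_amount_is_higher_than_7d_average": transaction_request["amt"] > amount_mean
    }

print(is_higher_than_7d_average({"amt": 123.0}, {"amt_mean_7d_1d": 40.0}))
print(is_higher_than_7d_average({"amt": 5.0}, {"amt_mean_7d_1d": 40.0}))
print(is_higher_than_7d_average({"amt": 5.0}, {"amt_mean_7d_1d": None}))
```

Note the consequence of the fallback: a user with no 7-day average is compared against 0, so any positive amount is flagged as higher than average.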

In [None]:
from tecton import on_demand_feature_view
from tecton import RequestSource
from tecton.types import Float64, Field, Bool

user_transaction_amount_metrics = ws.get_feature_view('user_transaction_amount_metrics')

@on_demand_feature_view(
   #fill in with parameters described above

)
def transaction_amount_is_higher_than_7d_average(transaction_request, user_transaction_amount_metrics):
    #definition goes here


transaction_amount_is_higher_than_7d_average.validate()

## Go to [lab.tecton.ai](https://lab.tecton.ai)
Tecton automatically builds on top of pre-existing features to materialize this one as new data comes in, orchestrates it together with the other feature views, and creates and runs the Spark and Python jobs necessary to keep every feature view updated.

# Feature Services

A Feature Service references a set of features that are exposed as an API. It is generally recommended that each machine learning model have an associated Feature Service. We will create a Feature Service with 3 of the Feature Views we just built:
* user_credit_card_issuer
* user_transaction_amount_metrics[['amt_mean_7d_1d']]
* transaction_amount_is_higher_than_7d_average

In [None]:
from tecton import FeatureService

lab_fs = FeatureService(
  name = 'lab_fs',
  features = [
    #add features here
  ]
)

lab_fs.validate()

# Feature Retrieval
Now that all the features are defined in a Feature Service, we can generate data to send to machine learning models for training and serving.

Features are retrieved from the online store with HTTP requests for low latency. The cell below constructs an example request.

Try changing the ***amt*** given in the ***request_data***.
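
One way to make `amt` easy to change is to build the request body from a Python dictionary and serialize it with `json.dumps`, rather than editing a string by hand. This is a sketch and not part of the lab: the helper `build_request_data` is hypothetical, and the field values are copied from the request cell that follows.

```python
import json

def build_request_data(amt, user_id="user_469998441571"):
    """Serialize a get-features request body (hypothetical helper)."""
    payload = {
        "params": {
            "feature_service_name": "lab_feature_service",
            "join_key_map": {"user_id": user_id},
            "metadata_options": {"include_names": True},
            "request_context_map": {"amt": amt},
            "workspace_name": "prod",
        }
    }
    return json.dumps(payload)

request_data = build_request_data(amt=500)
print(request_data)
```

The resulting string can be passed as `data=request_data` in the same way as the hand-written version.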

In [None]:
import requests

headers = {"Authorization": "Tecton-key " + dbutils.secrets.get(scope='tecton-lab', key='TECTON_API_KEY')}

request_data = """{
  "params": {
    "feature_service_name": "lab_feature_service",
    "join_key_map": {
     "user_id": "user_469998441571"
    },
    "metadata_options": {
      "include_names": true
    },
    "request_context_map": {
      "amt": 123
    },
    "workspace_name": "prod"
  }
}"""

inference_feature_data = requests.request(
    method="POST",
    headers=headers,
    url="https://lab.tecton.ai/api/v1/feature-service/get-features",
    data=request_data,
)
print(inference_feature_data.text)