# Quick Start Tutorial: Reusing Features

## Learning Objectives

In this tutorial you will learn:
1. How to access catalogs of data, entities, features, and feature lists
2. How to search for features suitable for the unit of analysis
3. How to understand an existing feature
4. How to create new features from existing features
5. How to create a new feature list from existing features

## Set up the prerequisites

Learning Objectives

In this section you will:
* connect to the remote featurebyte server
* import libraries
* learn about catalogs
* activate a pre-built catalog

### Load the featurebyte library and connect to the remote featurebyte server

In [None]:
import urllib.request

# install featurebyte package and download supporting library
%pip install --no-warn-conflicts featurebyte
urllib.request.urlretrieve("https://raw.githubusercontent.com/featurebyte/featurebyte-hosted-tutorials/main/tutorials/notebooks/prebuilt_catalogs.py", "prebuilt_catalogs.py")

In [1]:
# library imports
import pandas as pd
import numpy as np
import random

# load the featurebyte SDK
import featurebyte as fb

# replace <api_token> with the API token you received after registering
fb.register_tutorial_api_token("<api_token>")

# define the database name for this tutorial
TUTORIAL_DATABASE = "TUTORIAL_DATASETS"

14:08:53 | INFO     | Using configuration file at: /Users/smillet/.featurebyte/config.yaml
14:08:53 | INFO     | Using profile: tutorial
14:08:53 | INFO     | Active profile: tutorial (https://tutorials.featurebyte.com/api/v1)
14:08:54 | INFO     | No catalog activated.
14:08:54 | INFO     | 2 feature lists, 9 features deployed


### Create a pre-built catalog for this tutorial, with the data, metadata, and features already set up

Note that creating a pre-built catalog is not a step you will perform in real life. It is a helper specific to this quick-start tutorial that skips many of the preparatory steps and gets you to a point where you can materialize features.

In a real-life project you would do data modeling, declaring the tables, entities, and the associated metadata. This would not be a frequent task, but forms the basis for best-practice feature engineering.

In [2]:
# get the functions to create a pre-built catalog
from prebuilt_catalogs import *

# create a new catalog for this tutorial
catalog = create_tutorial_catalog(PrebuiltCatalog.QuickStartReusingFeatures)

Cleaning up existing tutorial catalogs
Cleaning catalog: quick start model training 20230726:1402
  1 historical feature tables
  1 observation tables


14:08:57 | INFO     | Catalog activated: quick start model training 20230726:1402


Building a quick start catalog for reusing features named [quick start reusing features 20230726:1409]
Creating new catalog
Catalog created


14:09:27 | INFO     | Catalog activated: quick start reusing features 20230726:1409


Registering the source tables
Registering the entities
Tagging the entities to columns in the data tables
Populating the feature store with example features

### Example: Load the tables and views

In [3]:
# get the tables for this workspace
grocery_customer_table = catalog.get_table("GROCERYCUSTOMER")
grocery_items_table = catalog.get_table("INVOICEITEMS")
grocery_invoice_table = catalog.get_table("GROCERYINVOICE")
grocery_product_table = catalog.get_table("GROCERYPRODUCT")

# create the views
grocery_customer_view = grocery_customer_table.get_view()
grocery_invoice_view = grocery_invoice_table.get_view()
grocery_items_view = grocery_items_table.get_view()
grocery_product_view = grocery_product_table.get_view()

## Accessing Catalogs

Learning Objectives:

In this section you will learn how to display catalogs of:
* tables
* entities
* features
* feature lists

### Example: A catalog of tables

In [4]:
# list the tables in the catalog
catalog.list_tables()

Unnamed: 0,id,name,type,status,entities,created_at
0,64c16161660f7100dc1af5f1,GROCERYPRODUCT,dimension_table,PUBLIC_DRAFT,[groceryproduct],2023-07-26 18:09:38.267
1,64c1615e660f7100dc1af5f0,INVOICEITEMS,item_table,PUBLIC_DRAFT,"[groceryinvoice, groceryproduct]",2023-07-26 18:09:35.685
2,64c1615b660f7100dc1af5ef,GROCERYINVOICE,event_table,PUBLIC_DRAFT,"[groceryinvoice, grocerycustomer]",2023-07-26 18:09:32.411
3,64c16159660f7100dc1af5ee,GROCERYCUSTOMER,scd_table,PUBLIC_DRAFT,"[grocerycustomer, frenchstate]",2023-07-26 18:09:30.265


In [5]:
# load a table
grocery_customer_table = catalog.get_table("GROCERYCUSTOMER")

# show the metadata
grocery_customer_table.info()

name,GROCERYCUSTOMER
created_at,2023-07-26 18:09:30
updated_at,2023-07-26 18:09:42
status,PUBLIC_DRAFT
catalog_name,quick start reusing features 20230726:1409
record_creation_timestamp_column,record_available_at

table_details
database_name,TUTORIAL_DATASETS
schema_name,GROCERY
table_name,GROCERYCUSTOMER

entities
name,serving_names,catalog_name
frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409
grocerycustomer,[GROCERYCUSTOMERGUID],quick start reusing features 20230726:1409


### Example: A catalog of entities

In [6]:
# list the entities in the catalog
catalog.list_entities()

Unnamed: 0,id,name,serving_names,created_at
0,64c16165660f7100dc1af5f5,frenchstate,[FRENCHSTATE],2023-07-26 18:09:41.796
1,64c16164660f7100dc1af5f4,groceryproduct,[GROCERYPRODUCTGUID],2023-07-26 18:09:40.895
2,64c16163660f7100dc1af5f3,groceryinvoice,[GROCERYINVOICEGUID],2023-07-26 18:09:40.041
3,64c16162660f7100dc1af5f2,grocerycustomer,[GROCERYCUSTOMERGUID],2023-07-26 18:09:39.123


In [7]:
# list the entity relationships in the catalog
catalog.list_relationships()

Unnamed: 0,id,relationship_type,entity,related_entity,relation_table,relation_table_type,enabled,created_at,updated_at
0,64c16168d3f396eb49f62b63,child_parent,groceryinvoice,grocerycustomer,GROCERYINVOICE,event_table,True,2023-07-26 18:09:44.185,
1,64c16166a20ddc7a8ee9c845,child_parent,grocerycustomer,frenchstate,GROCERYCUSTOMER,scd_table,True,2023-07-26 18:09:42.979,


In [8]:
# load an entity
customer_entity = catalog.get_entity("grocerycustomer")

# show the metadata
customer_entity.info()

0,1
name,grocerycustomer
created_at,2023-07-26 18:09:39
updated_at,2023-07-26 18:09:44
serving_names,['GROCERYCUSTOMERGUID']
catalog_name,quick start reusing features 20230726:1409


### Example: A catalog of features

In [9]:
# list the features in the catalog
catalog.list_features()

Unnamed: 0,id,name,dtype,readiness,online_enabled,tables,primary_tables,entities,primary_entities,created_at
0,64c161ce660f7100dc1af60a,InvoiceUniqueProductGroupCount,FLOAT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[groceryinvoice],[groceryinvoice],2023-07-26 18:11:29.471
1,64c161c6660f7100dc1af608,InvoiceDiscountAmount,FLOAT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS]",[INVOICEITEMS],[groceryinvoice],[groceryinvoice],2023-07-26 18:11:21.051
2,64c161bf660f7100dc1af607,InvoiceItemCount,FLOAT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS]",[INVOICEITEMS],[groceryinvoice],[groceryinvoice],2023-07-26 18:11:14.414
3,64c161b7660f7100dc1af606,CustomerYearOfBirth,INT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[grocerycustomer],[grocerycustomer],2023-07-26 18:11:05.293
4,64c161af660f7100dc1af603,CustomerSpend_14d,FLOAT,DRAFT,False,[GROCERYINVOICE],[GROCERYINVOICE],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:57.482
5,64c161a7660f7100dc1af601,CustomerInventory_24w,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:50.288
6,64c1619f660f7100dc1af5ff,CustomerInventory_28d,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:42.223
7,64c16197660f7100dc1af5fe,StateMeanLongitude,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:10:33.475
8,64c16190660f7100dc1af5fd,StateMeanLatitude,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:10:26.073
9,64c16188660f7100dc1af5fb,StateAvgInvoiceAmount_28d,FLOAT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE]",[GROCERYINVOICE],[frenchstate],[frenchstate],2023-07-26 18:10:18.423


In [10]:
# load a feature
state_population = catalog.get_feature("StatePopulation")

# show the metadata
state_population.info()

Unnamed: 0_level_0,name,serving_names,catalog_name
Unnamed: 0_level_1,name,serving_names,catalog_name
Unnamed: 0_level_2,name,status,catalog_name
Unnamed: 0_level_3,name,status,catalog_name
0,frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409
0,frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409
0,GROCERYCUSTOMER,PUBLIC_DRAFT,quick start reusing features 20230726:1409
0,GROCERYCUSTOMER,PUBLIC_DRAFT,quick start reusing features 20230726:1409
name,StatePopulation,,
created_at,2023-07-26 18:09:59,,
updated_at,2023-07-26 18:09:59,,
entities,name  serving_names  catalog_name  0  frenchstate  [FRENCHSTATE]  quick start reusing features 20230726:1409,,
primary_entity,name  serving_names  catalog_name  0  frenchstate  [FRENCHSTATE]  quick start reusing features 20230726:1409,,
tables,name  status  catalog_name  0  GROCERYCUSTOMER  PUBLIC_DRAFT  quick start reusing features 20230726:1409,,

Unnamed: 0,name,serving_names,catalog_name
0,frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409

Unnamed: 0,name,serving_names,catalog_name
0,frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409

Unnamed: 0,name,status,catalog_name
0,GROCERYCUSTOMER,PUBLIC_DRAFT,quick start reusing features 20230726:1409

Unnamed: 0,name,status,catalog_name
0,GROCERYCUSTOMER,PUBLIC_DRAFT,quick start reusing features 20230726:1409

0,1
this,V230726
default,V230726

0,1
this,DRAFT
default,DRAFT

0,1
this,[]
default,[]

0,1
this,[]
default,[]

0,1
input_columns,Input0  data  GROCERYCUSTOMER  column_name  GroceryCustomerGuid  semantic  scd_natural_key_id  Input1  data  GROCERYCUSTOMER  column_name  ValidFrom  semantic  None
derived_columns,
aggregations,F0  name  StatePopulation  column  None  function  count  keys  ['State']  window  None  category  None  filter  False
post_aggregation,"name  StatePopulation  inputs  ['F0']  transforms  ['is_null', 'conditional']"

0,1
Input0,data  GROCERYCUSTOMER  column_name  GroceryCustomerGuid  semantic  scd_natural_key_id
Input1,data  GROCERYCUSTOMER  column_name  ValidFrom  semantic  None

0,1
data,GROCERYCUSTOMER
column_name,GroceryCustomerGuid
semantic,scd_natural_key_id

0,1
data,GROCERYCUSTOMER
column_name,ValidFrom
semantic,

0,1
F0,name  StatePopulation  column  None  function  count  keys  ['State']  window  None  category  None  filter  False

0,1
name,StatePopulation
column,
function,count
keys,['State']
window,
category,
filter,False

0,1
name,StatePopulation
inputs,['F0']
transforms,"['is_null', 'conditional']"


In [11]:
# show the feature lineage for the state population feature
display(state_population.definition)

### Example: A catalog of feature lists

In [12]:
# list the feature lists in the catalog
catalog.list_feature_lists()

Unnamed: 0,id,name,num_feature,status,deployed,readiness_frac,online_frac,tables,entities,primary_entities,created_at
0,64c161d5660f7100dc1af60b,StateFeatureList,5,DRAFT,False,0.0,0.0,"[GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS...",[frenchstate],[frenchstate],2023-07-26 18:11:34.464


In [13]:
# load the feature list
state_features = catalog.get_feature_list("StateFeatureList")

# show the metadata
state_features.info()

Loading Feature(s) |████████████████████████████████████████| 5/5 [100%] in 0.5s


Unnamed: 0_level_0,name,serving_names,catalog_name
Unnamed: 0_level_1,name,serving_names,catalog_name
Unnamed: 0_level_2,name,status,catalog_name
Unnamed: 0_level_3,dtype,count,Unnamed: 3_level_3
0,frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409
0,frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409
0,GROCERYPRODUCT,PUBLIC_DRAFT,quick start reusing features 20230726:1409
1,INVOICEITEMS,PUBLIC_DRAFT,quick start reusing features 20230726:1409
2,GROCERYINVOICE,PUBLIC_DRAFT,quick start reusing features 20230726:1409
3,GROCERYCUSTOMER,PUBLIC_DRAFT,quick start reusing features 20230726:1409
0,OBJECT,1,
1,FLOAT,4,
name,StateFeatureList,,
created_at,2023-07-26 18:11:34,,

Unnamed: 0,name,serving_names,catalog_name
0,frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409

Unnamed: 0,name,serving_names,catalog_name
0,frenchstate,[FRENCHSTATE],quick start reusing features 20230726:1409

Unnamed: 0,name,status,catalog_name
0,GROCERYPRODUCT,PUBLIC_DRAFT,quick start reusing features 20230726:1409
1,INVOICEITEMS,PUBLIC_DRAFT,quick start reusing features 20230726:1409
2,GROCERYINVOICE,PUBLIC_DRAFT,quick start reusing features 20230726:1409
3,GROCERYCUSTOMER,PUBLIC_DRAFT,quick start reusing features 20230726:1409

Unnamed: 0,dtype,count
0,OBJECT,1
1,FLOAT,4

0,1
this,V230726
default,V230726

0,1
this,0.0
default,0.0

0,1
this,1.0
default,1.0


In [14]:
# list the features in the feature list
state_features.list_features()

Unnamed: 0,id,name,version,dtype,readiness,online_enabled,tables,primary_tables,entities,primary_entities,created_at,is_default
0,64c16197660f7100dc1af5fe,StateMeanLongitude,V230726,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:10:33.462,True
1,64c16190660f7100dc1af5fd,StateMeanLatitude,V230726,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:10:26.058,True
2,64c16188660f7100dc1af5fb,StateAvgInvoiceAmount_28d,V230726,FLOAT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE]",[GROCERYINVOICE],[frenchstate],[frenchstate],2023-07-26 18:10:18.406,True
3,64c1617d660f7100dc1af5f9,StateInventory_28d,V230726,OBJECT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS...",[INVOICEITEMS],[frenchstate],[frenchstate],2023-07-26 18:10:09.321,True
4,64c16175660f7100dc1af5f7,StatePopulation,V230726,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:09:59.465,True


## Search for Features

Learning Objectives

In this section, you will learn:
* what a primary entity is
* how to search for suitable features

### Concept: Primary entity

<b>Feature primary entity:</b> The primary entity of a feature defines the level of analysis for that feature. 
When a feature is a result of an aggregation grouped by multiple entities, the primary entity is a tuple of those entities. For instance, if a feature quantifies the interaction between a customer entity and a merchant entity in the past, such as the sum of transaction amounts grouped by customer and merchant in the past 4 weeks, the primary entity is the tuple of customer and merchant.

When a feature is derived from features with different primary entities, the primary entity is determined by the entity relationships, and the lowest-level entity is selected as the primary entity. If the underlying entities have no relationship, the primary entity becomes a tuple of those entities. For example, if a feature compares the basket of a customer with the average basket of customers in the same city, the primary entity is the customer, since the customer entity is a child of the customer city entity. However, if the feature is the distance between the customer location and the merchant location, the primary entity becomes the tuple of customer and merchant, since these entities do not have any child-parent relationship.
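The rule described above can be sketched in plain Python. This is only an illustration of the "lowest-level entity, else a tuple" logic, not FeatureByte's implementation; the `parent_of` mapping and both function names are hypothetical.

```python
def ancestors(entity, parent_of):
    """Return every entity reachable by following child -> parent links."""
    seen = set()
    stack = list(parent_of.get(entity, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parent_of.get(current, ()))
    return seen


def primary_entity(entities, parent_of):
    """Keep only the lowest-level entities: drop any entity that is an
    ancestor of another entity in the same set."""
    entities = set(entities)
    covered = set()
    for entity in entities:
        covered |= ancestors(entity, parent_of)
    return tuple(sorted(entities - covered))


# Hypothetical relationships: a customer belongs to a city; merchants are unrelated.
parent_of = {"customer": ["city"]}

print(primary_entity({"customer", "city"}, parent_of))      # ('customer',)
print(primary_entity({"customer", "merchant"}, parent_of))  # ('customer', 'merchant')
```

The first call collapses to the customer because city is its parent. The second keeps both entities because no child-parent link connects them.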

<b>Feature List primary entity:</b> The main focus of a feature list is determined by its primary entity, which typically corresponds to the primary entity of the Use Case that the feature list was created for.

If the features within the list pertain to different primary entities, the primary entity of the feature list is selected based on the entity relationships, with the lowest-level entity chosen as the primary entity. In cases where there are no relationships between entities, the primary entity may become a tuple comprising those entities.
To illustrate, consider a feature list comprising features related to card, customer, and customer city. In this case, the primary entity is the card entity since it is a child of both the customer and customer city entities. However, if the feature list also contains features for merchant and merchant city, the primary entity is a tuple of card and merchant.

<b>Use Case primary entity:</b> In a Use Case, the primary entity is the object or concept that defines its problem statement. Usually, this entity is singular, but in cases such as the recommendation engine use case, it can be a tuple of entities that interact with each other.

### Case study: Predicting customer spend

Consider a use case to predict customer spend. The unit of analysis, and therefore the primary entity, is the grocery customer. You can use features whose primary entity is either grocery customer or French state, because state is a parent entity of customer.
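To make the reasoning concrete, the sketch below walks the child-parent pairs reported by `catalog.list_relationships()` earlier in this tutorial and collects every entity that can serve the chosen unit of analysis. The helper name `usable_primary_entities` is hypothetical, and the relationship pairs are copied by hand rather than read from the catalog.

```python
# Child -> parent pairs as listed by catalog.list_relationships() in this tutorial.
relationships = [
    ("groceryinvoice", "grocerycustomer"),
    ("grocerycustomer", "frenchstate"),
]


def usable_primary_entities(unit_of_analysis, relationships):
    """Return the unit of analysis plus all of its ancestors."""
    parent_of = {}
    for child, parent in relationships:
        parent_of.setdefault(child, []).append(parent)

    usable = {unit_of_analysis}
    frontier = [unit_of_analysis]
    while frontier:
        entity = frontier.pop()
        for parent in parent_of.get(entity, []):
            if parent not in usable:
                usable.add(parent)
                frontier.append(parent)
    return usable


print(sorted(usable_primary_entities("grocerycustomer", relationships)))
# ['frenchstate', 'grocerycustomer']
```

Invoice-level features are excluded because the invoice is a child of the customer, not a parent.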

### Example: Search for suitable features

In [15]:
# get a list of all the features in the catalog
all_features = catalog.list_features()

# keep only customer-level or state-level features by dropping
# any feature that involves the invoice or product entities
excluded_entities = {"groceryinvoice", "groceryproduct"}
suitable_features = all_features.loc[
    [excluded_entities.isdisjoint(x) for x in all_features.entities.values]
]

# show the features
display(suitable_features)

Unnamed: 0,id,name,dtype,readiness,online_enabled,tables,primary_tables,entities,primary_entities,created_at
3,64c161b7660f7100dc1af606,CustomerYearOfBirth,INT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[grocerycustomer],[grocerycustomer],2023-07-26 18:11:05.293
4,64c161af660f7100dc1af603,CustomerSpend_14d,FLOAT,DRAFT,False,[GROCERYINVOICE],[GROCERYINVOICE],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:57.482
5,64c161a7660f7100dc1af601,CustomerInventory_24w,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:50.288
6,64c1619f660f7100dc1af5ff,CustomerInventory_28d,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:42.223
7,64c16197660f7100dc1af5fe,StateMeanLongitude,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:10:33.475
8,64c16190660f7100dc1af5fd,StateMeanLatitude,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:10:26.073
9,64c16188660f7100dc1af5fb,StateAvgInvoiceAmount_28d,FLOAT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE]",[GROCERYINVOICE],[frenchstate],[frenchstate],2023-07-26 18:10:18.423
10,64c1617d660f7100dc1af5f9,StateInventory_28d,OBJECT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS...",[INVOICEITEMS],[frenchstate],[frenchstate],2023-07-26 18:10:09.339
11,64c16175660f7100dc1af5f7,StatePopulation,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:09:59.482
12,64c1616d660f7100dc1af5f6,StateName,VARCHAR,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[grocerycustomer],[grocerycustomer],2023-07-26 18:09:51.427
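The search above works by exclusion. An inclusion-based variant, which keeps a feature only when all of its primary entities are in an allowed set, may be easier to adapt to other use cases. The sketch below shows that logic on plain dictionaries so it runs without a FeatureByte connection; the three records are a hand-copied subset of the catalog listing, and `is_suitable` is a hypothetical helper.

```python
# Records mirroring the `name` and `primary_entities` columns of the listing.
features = [
    {"name": "InvoiceItemCount", "primary_entities": ["groceryinvoice"]},
    {"name": "CustomerSpend_14d", "primary_entities": ["grocerycustomer"]},
    {"name": "StatePopulation", "primary_entities": ["frenchstate"]},
]

# Entities that can serve a customer-level use case.
allowed = {"grocerycustomer", "frenchstate"}


def is_suitable(record, allowed):
    """A feature is suitable when all of its primary entities are allowed."""
    return set(record["primary_entities"]) <= allowed


suitable_names = [f["name"] for f in features if is_suitable(f, allowed)]
print(suitable_names)  # ['CustomerSpend_14d', 'StatePopulation']
```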


In [16]:
# find suitable features that use the grocery invoice items table
grocery_items_features = suitable_features.loc[
    ["INVOICEITEMS" in x for x in suitable_features.tables.values]
]

# show the features
display(grocery_items_features)

Unnamed: 0,id,name,dtype,readiness,online_enabled,tables,primary_tables,entities,primary_entities,created_at
5,64c161a7660f7100dc1af601,CustomerInventory_24w,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:50.288
6,64c1619f660f7100dc1af5ff,CustomerInventory_28d,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:42.223
10,64c1617d660f7100dc1af5f9,StateInventory_28d,OBJECT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS...",[INVOICEITEMS],[frenchstate],[frenchstate],2023-07-26 18:10:09.339


## Understand an Existing Feature

Learning Objectives

In this section you will learn how to:
* load a feature from the catalog
* view the metadata of a feature
* materialize feature values
* view feature lineage as a definition file

### Example: Load a feature from the catalog

In [17]:
# get the CustomerInventory_28d feature
customer_inventory_28d = catalog.get_feature("CustomerInventory_28d")

### Example: View the metadata of a feature

In [18]:
# get a list of all the features in the catalog
all_features = catalog.list_features()

# display the current feature
display(all_features.loc[all_features.name == customer_inventory_28d.name])

Unnamed: 0,id,name,dtype,readiness,online_enabled,tables,primary_tables,entities,primary_entities,created_at
6,64c1619f660f7100dc1af5ff,CustomerInventory_28d,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:42.223


In [19]:
# view the detailed metadata
customer_inventory_28d.info()

Unnamed: 0_level_0,name,serving_names,catalog_name
Unnamed: 0_level_1,name,serving_names,catalog_name
Unnamed: 0_level_2,name,status,catalog_name
Unnamed: 0_level_3,name,status,catalog_name
Unnamed: 0_level_4,table_name,feature_job_setting,Unnamed: 3_level_4
Unnamed: 0_level_5,table_name,feature_job_setting,Unnamed: 3_level_5
0,grocerycustomer,[GROCERYCUSTOMERGUID],quick start reusing features 20230726:1409
0,grocerycustomer,[GROCERYCUSTOMERGUID],quick start reusing features 20230726:1409
0,GROCERYPRODUCT,PUBLIC_DRAFT,quick start reusing features 20230726:1409
1,INVOICEITEMS,PUBLIC_DRAFT,quick start reusing features 20230726:1409
2,GROCERYINVOICE,PUBLIC_DRAFT,quick start reusing features 20230726:1409
0,INVOICEITEMS,PUBLIC_DRAFT,quick start reusing features 20230726:1409
0,GROCERYINVOICE,"{'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}",
0,GROCERYINVOICE,"{'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}",
name,CustomerInventory_28d,,
created_at,2023-07-26 18:10:42,,

Unnamed: 0,name,serving_names,catalog_name
0,grocerycustomer,[GROCERYCUSTOMERGUID],quick start reusing features 20230726:1409

Unnamed: 0,name,serving_names,catalog_name
0,grocerycustomer,[GROCERYCUSTOMERGUID],quick start reusing features 20230726:1409

Unnamed: 0,name,status,catalog_name
0,GROCERYPRODUCT,PUBLIC_DRAFT,quick start reusing features 20230726:1409
1,INVOICEITEMS,PUBLIC_DRAFT,quick start reusing features 20230726:1409
2,GROCERYINVOICE,PUBLIC_DRAFT,quick start reusing features 20230726:1409

Unnamed: 0,name,status,catalog_name
0,INVOICEITEMS,PUBLIC_DRAFT,quick start reusing features 20230726:1409

0,1
this,V230726
default,V230726

0,1
this,DRAFT
default,DRAFT

Unnamed: 0_level_0,table_name,feature_job_setting
Unnamed: 0_level_1,table_name,feature_job_setting
0,GROCERYINVOICE,"{'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}"
0,GROCERYINVOICE,"{'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}"
this,"table_name  feature_job_setting  0  GROCERYINVOICE  {'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}",
default,"table_name  feature_job_setting  0  GROCERYINVOICE  {'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}",

Unnamed: 0,table_name,feature_job_setting
0,GROCERYINVOICE,"{'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}"

Unnamed: 0,table_name,feature_job_setting
0,GROCERYINVOICE,"{'blind_spot': '0s', 'frequency': '3600s', 'time_modulo_frequency': '90s'}"

0,1
this,[]
default,[]

0,1
input_columns,Input0  data  GROCERYPRODUCT  column_name  ProductGroup  semantic  None
derived_columns,
aggregations,F0  name  CustomerInventory_28d  column  None  function  count  keys  ['GroceryCustomerGuid']  window  28d  category  ProductGroup  filter  False
post_aggregation,

0,1
Input0,data  GROCERYPRODUCT  column_name  ProductGroup  semantic  None

0,1
data,GROCERYPRODUCT
column_name,ProductGroup
semantic,

0,1
F0,name  CustomerInventory_28d  column  None  function  count  keys  ['GroceryCustomerGuid']  window  28d  category  ProductGroup  filter  False

0,1
name,CustomerInventory_28d
column,
function,count
keys,['GroceryCustomerGuid']
window,28d
category,ProductGroup
filter,False
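The aggregation metadata above (function `count`, window `28d`, category `ProductGroup`, keyed by customer) describes a per-category count over a trailing window. The sketch below illustrates that idea for a single customer in plain Python. The events are hypothetical, and the exact window boundary handling is an assumption rather than FeatureByte's documented behavior.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical purchase events for one customer: (timestamp, product group).
events = [
    (datetime(2022, 11, 1, 9, 0), "Biscuits"),
    (datetime(2022, 11, 20, 18, 30), "Biscuits"),
    (datetime(2022, 11, 25, 12, 0), "Beurre"),
    (datetime(2022, 12, 10, 8, 15), "Boucherie"),
]


def count_by_category(events, point_in_time, window=timedelta(days=28)):
    """Count events per category in [point_in_time - window, point_in_time).

    Assumption: the window is closed on the left and open on the right.
    """
    start = point_in_time - window
    counts = Counter(
        category
        for timestamp, category in events
        if start <= timestamp < point_in_time
    )
    return dict(counts)


print(count_by_category(events, datetime(2022, 12, 1)))
# {'Biscuits': 1, 'Beurre': 1}
```

Only the two late-November events fall inside the 28 days before the point in time, so the result is a small dictionary of product groups and counts.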


### Example: Materialize sample values

In [20]:
# get some customer IDs and invoice event timestamps from Q4 2022
# (named q4_2022_filter to avoid shadowing Python's built-in filter)
q4_2022_filter = (grocery_invoice_view["Timestamp"].dt.year == 2022) & (
    grocery_invoice_view["Timestamp"].dt.month >= 10
)

observation_set = (
    grocery_invoice_view[q4_2022_filter]
    .sample(10)[["GroceryCustomerGuid", "Timestamp"]]
    .rename(
        {
            "Timestamp": "POINT_IN_TIME",
            "GroceryCustomerGuid": "GROCERYCUSTOMERGUID",
        },
        axis=1,
    )
)
display(observation_set)

Unnamed: 0,GROCERYCUSTOMERGUID,POINT_IN_TIME
0,3019bdbf-667c-4081-acb5-26cd2d559c5e,2022-12-19 19:01:19
1,24196ecb-be71-42b2-a748-89ed1960e4fc,2022-10-27 10:26:56
2,f761a5d1-3b66-4faf-82f1-6cd59e2e28f8,2022-12-27 21:09:40
3,d0251d4c-f16a-4db2-a4d2-f025cb90b3be,2022-11-14 10:16:57
4,9eb1b37c-a1f8-498c-b201-55c948a5887f,2022-12-30 19:49:02
5,24196ecb-be71-42b2-a748-89ed1960e4fc,2022-10-09 13:46:30
6,e490ab6d-c699-44c3-a284-41a7bbb1ee6f,2022-11-08 23:06:38
7,8497f78a-c60a-4da7-ab28-26514fede8e2,2022-10-11 16:59:23
8,325682f0-e8ab-445d-87bd-d979b561b2ce,2022-12-07 15:11:13
9,7daf3909-e8c4-487a-b8d1-cb6817d4e04d,2022-10-29 17:21:35


In [21]:
# display the feature values
display(customer_inventory_28d.preview(observation_set))

Unnamed: 0,GROCERYCUSTOMERGUID,POINT_IN_TIME,CustomerInventory_28d
0,3019bdbf-667c-4081-acb5-26cd2d559c5e,2022-12-19 19:01:19,"{\n ""Animalerie, Soins et Hygiène"": 1,\n ""Au..."
1,24196ecb-be71-42b2-a748-89ed1960e4fc,2022-10-27 10:26:56,"{\n ""Aide à la Pâtisserie"": 1,\n ""Beurre"": 2..."
2,f761a5d1-3b66-4faf-82f1-6cd59e2e28f8,2022-12-27 21:09:40,"{\n ""Boucherie"": 7,\n ""Chips et Tortillas"": ..."
3,d0251d4c-f16a-4db2-a4d2-f025cb90b3be,2022-11-14 10:16:57,"{\n ""Adoucissants et Soin du linge"": 1,\n ""A..."
4,9eb1b37c-a1f8-498c-b201-55c948a5887f,2022-12-30 19:49:02,"{\n ""Biscuits apéritifs"": 2,\n ""Chips et Tor..."
5,24196ecb-be71-42b2-a748-89ed1960e4fc,2022-10-09 13:46:30,"{\n ""Biscuits"": 2,\n ""Chips et Tortillas"": 1..."
6,e490ab6d-c699-44c3-a284-41a7bbb1ee6f,2022-11-08 23:06:38,"{\n ""Biscuits"": 2,\n ""Biscuits apéritifs"": 1..."
7,8497f78a-c60a-4da7-ab28-26514fede8e2,2022-10-11 16:59:23,"{\n ""Chips et Tortillas"": 5,\n ""Colas, Thés ..."
8,325682f0-e8ab-445d-87bd-d979b561b2ce,2022-12-07 15:11:13,"{\n ""Biscuits"": 2,\n ""Biscuits apéritifs"": 2..."
9,7daf3909-e8c4-487a-b8d1-cb6817d4e04d,2022-10-29 17:21:35,"{\n ""Animalerie, Soins et Hygiène"": 1,\n ""Ca..."
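Each `CustomerInventory_28d` value in the preview appears to be a JSON string mapping product groups to counts. Assuming that is the case, a cell can be turned into a Python dictionary with the standard `json` module for closer inspection. The `raw_value` below is a hypothetical example in the same shape, not a value taken from the data.

```python
import json

# Hypothetical cell value in the same shape as the previewed column.
raw_value = '{\n  "Biscuits": 2,\n  "Chips et Tortillas": 1,\n  "Boucherie": 7\n}'

inventory = json.loads(raw_value)            # dict: product group -> count
top_group = max(inventory, key=inventory.get)
total_items = sum(inventory.values())

print(top_group, total_items)  # Boucherie 10
```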


### Example: View the feature lineage

In [22]:
# display the feature lineage for the feature we just loaded from the feature store
display(customer_inventory_28d.definition)

## Create New Features from Existing Features

You can use existing features as inputs to new features.

Learning objectives

In this section you will learn how to:
* create a new feature from two existing features

### Example: Create a new similarity feature from two existing features

In [23]:
# get the StateInventory_28d feature
state_inventory_28d = catalog.get_feature("StateInventory_28d")

# get the CustomerInventory_28d feature
customer_inventory_28d = catalog.get_feature("CustomerInventory_28d")

# create a new feature that is the cosine similarity of the two features
customer_state_items_similarity_28d = customer_inventory_28d.cd.cosine_similarity(
    state_inventory_28d
)
customer_state_items_similarity_28d.name = "CustomerStateItemsSimilarity_28d"
customer_state_items_similarity_28d.save()

# display the feature lineage for the feature we just created
display(customer_state_items_similarity_28d.definition)

Done! |████████████████████████████████████████| 100% in 9.7s (0.10%/s)         
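For intuition about what a cosine similarity between two count dictionaries measures, here is a minimal plain-Python sketch. It is not FeatureByte's implementation: the function below, its handling of empty dictionaries, and the sample baskets are all assumptions made for illustration.

```python
import math


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity of two count dictionaries, treating missing keys as 0."""
    dot = sum(value * counts_b.get(key, 0) for key, value in counts_a.items())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: similarity with an empty dictionary is 0
    return dot / (norm_a * norm_b)


# Hypothetical baskets: one customer versus the whole state.
customer = {"Biscuits": 2, "Beurre": 1}
state = {"Biscuits": 20, "Beurre": 10, "Boucherie": 5}

print(round(cosine_similarity(customer, state), 4))
```

A value near 1 means the customer buys product groups in roughly the same proportions as the state as a whole. A value near 0 means the two baskets barely overlap.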


## Create a New Feature List From Existing Features

Learning objectives

In this section you will learn how to:
* create a feature list with a primary entity suited to your use case

### Example: Create a customer level feature list

In [24]:
# get a list of all the features in the catalog
all_features = catalog.list_features()

# keep only customer-level or state-level features by dropping
# any feature that involves the invoice or product entities
excluded_entities = {"groceryinvoice", "groceryproduct"}
suitable_features = all_features.loc[
    [excluded_entities.isdisjoint(x) for x in all_features.entities.values]
]

# show the features
display(suitable_features)

Unnamed: 0,id,name,dtype,readiness,online_enabled,tables,primary_tables,entities,primary_entities,created_at
0,64c161f9660f7100dc1af61d,CustomerStateItemsSimilarity_28d,FLOAT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS...",[INVOICEITEMS],"[grocerycustomer, frenchstate]",[grocerycustomer],2023-07-26 18:12:14.394
4,64c161b7660f7100dc1af606,CustomerYearOfBirth,INT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[grocerycustomer],[grocerycustomer],2023-07-26 18:11:05.293
5,64c161af660f7100dc1af603,CustomerSpend_14d,FLOAT,DRAFT,False,[GROCERYINVOICE],[GROCERYINVOICE],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:57.482
6,64c161a7660f7100dc1af601,CustomerInventory_24w,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:50.288
7,64c1619f660f7100dc1af5ff,CustomerInventory_28d,OBJECT,DRAFT,False,"[GROCERYINVOICE, INVOICEITEMS, GROCERYPRODUCT]",[INVOICEITEMS],[grocerycustomer],[grocerycustomer],2023-07-26 18:10:42.223
8,64c16197660f7100dc1af5fe,StateMeanLongitude,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:10:33.475
9,64c16190660f7100dc1af5fd,StateMeanLatitude,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:10:26.073
10,64c16188660f7100dc1af5fb,StateAvgInvoiceAmount_28d,FLOAT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE]",[GROCERYINVOICE],[frenchstate],[frenchstate],2023-07-26 18:10:18.423
11,64c1617d660f7100dc1af5f9,StateInventory_28d,OBJECT,DRAFT,False,"[GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS...",[INVOICEITEMS],[frenchstate],[frenchstate],2023-07-26 18:10:09.339
12,64c16175660f7100dc1af5f7,StatePopulation,FLOAT,DRAFT,False,[GROCERYCUSTOMER],[GROCERYCUSTOMER],[frenchstate],[frenchstate],2023-07-26 18:09:59.482
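
The two-step filter above can also be written as a single pass over a set of excluded entities. The following is a minimal sketch, not the tutorial's own method: it uses a small hand-built DataFrame as a stand-in for `catalog.list_features()`, and the two rows named `Hypothetical...` are invented purely to show rows being dropped.

```python
import pandas as pd

# Illustrative stand-in for catalog.list_features(), with only the columns used here.
# The last two rows are hypothetical and exist only to demonstrate exclusion.
all_features = pd.DataFrame(
    {
        "name": [
            "CustomerSpend_14d",
            "StatePopulation",
            "CustomerStateItemsSimilarity_28d",
            "HypotheticalInvoiceFeature",
            "HypotheticalProductFeature",
        ],
        "entities": [
            ["grocerycustomer"],
            ["frenchstate"],
            ["grocerycustomer", "frenchstate"],
            ["groceryinvoice"],
            ["groceryproduct"],
        ],
    }
)

# entities that make a feature unsuitable for a customer-level feature list
excluded_entities = {"groceryinvoice", "groceryproduct"}

# keep a feature only if none of its entities is in the excluded set
is_suitable = all_features["entities"].apply(excluded_entities.isdisjoint)
suitable_features = all_features.loc[is_suitable]

print(suitable_features["name"].tolist())
```

Collecting the excluded entities in one set means that excluding a further entity only requires adding its name to the set, rather than adding another filtering step.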


In [25]:
# create a new feature list from the features we just filtered
customer_features = fb.FeatureList(
    [catalog.get_feature(x) for x in suitable_features.name.values], name="CustomerFeatures"
)
customer_features.save()

# display a sample of the feature list values
display(customer_features.preview(observation_set))

Done! |████████████████████████████████████████| 100% in 6.5s (0.15%/s)         
Loading Feature(s) |████████████████████████████████████████| 11/11 [100%] in 0.


Unnamed: 0,GROCERYCUSTOMERGUID,POINT_IN_TIME,CustomerStateItemsSimilarity_28d,CustomerYearOfBirth,CustomerSpend_14d,CustomerInventory_24w,CustomerInventory_28d,StateMeanLongitude,StateMeanLatitude,StateAvgInvoiceAmount_28d,StateInventory_28d,StatePopulation,StateName
0,3019bdbf-667c-4081-acb5-26cd2d559c5e,2022-12-19 19:01:19,0.884429,1984,126.56,"{\n ""Animalerie, Soins et Hygiène"": 1,\n ""Au...","{\n ""Animalerie, Soins et Hygiène"": 1,\n ""Au...",45.189819,-12.713308,9.200755,"{\n ""Animalerie, Soins et Hygiène"": 2,\n ""Au...",3,Mayotte
1,24196ecb-be71-42b2-a748-89ed1960e4fc,2022-10-27 10:26:56,0.713778,2001,68.99,"{\n ""Adoucissants et Soin du linge"": 1,\n ""A...","{\n ""Aide à la Pâtisserie"": 1,\n ""Beurre"": 2...",2.241215,48.738384,18.900232,"{\n ""Adoucissants et Soin du linge"": 23,\n ""...",181,Île-de-France
2,f761a5d1-3b66-4faf-82f1-6cd59e2e28f8,2022-12-27 21:09:40,0.614661,1968,24.42,"{\n ""Aide à la Pâtisserie"": 3,\n ""Bonbons"": ...","{\n ""Boucherie"": 7,\n ""Chips et Tortillas"": ...",0.934599,49.391777,23.185862,"{\n ""Adoucissants et Soin du linge"": 5,\n ""A...",13,Haute-Normandie
3,d0251d4c-f16a-4db2-a4d2-f025cb90b3be,2022-11-14 10:16:57,0.819754,1978,218.27,"{\n ""Adoucissants et Soin du linge"": 1,\n ""A...","{\n ""Adoucissants et Soin du linge"": 1,\n ""A...",1.569134,43.706807,20.673288,"{\n ""Adoucissants et Soin du linge"": 1,\n ""A...",19,Midi-Pyrénées
4,9eb1b37c-a1f8-498c-b201-55c948a5887f,2022-12-30 19:49:02,0.616378,1977,46.33,"{\n ""Aide à la Pâtisserie"": 4,\n ""Biscuits"":...","{\n ""Biscuits apéritifs"": 2,\n ""Chips et Tor...",2.242254,48.739038,19.305905,"{\n ""Adoucissants et Soin du linge"": 19,\n ""...",181,Île-de-France
5,24196ecb-be71-42b2-a748-89ed1960e4fc,2022-10-09 13:46:30,0.495916,2001,25.22,"{\n ""Adoucissants et Soin du linge"": 1,\n ""A...","{\n ""Biscuits"": 2,\n ""Chips et Tortillas"": 1...",2.240549,48.737227,18.387097,"{\n ""Adoucissants et Soin du linge"": 19,\n ""...",180,Île-de-France
6,e490ab6d-c699-44c3-a284-41a7bbb1ee6f,2022-11-08 23:06:38,0.583475,1976,17.41,"{\n ""Aide à la Pâtisserie"": 1,\n ""Biscuits"":...","{\n ""Biscuits"": 2,\n ""Biscuits apéritifs"": 1...",2.241215,48.738384,18.646525,"{\n ""Adoucissants et Soin du linge"": 14,\n ""...",181,Île-de-France
7,8497f78a-c60a-4da7-ab28-26514fede8e2,2022-10-11 16:59:23,0.465368,1938,42.05,"{\n ""Autres"": 1,\n ""Chips et Tortillas"": 37,...","{\n ""Chips et Tortillas"": 5,\n ""Colas, Thés ...",-0.494788,44.676056,16.189659,"{\n ""Aide à la Pâtisserie"": 7,\n ""Animalerie...",25,Aquitaine
8,325682f0-e8ab-445d-87bd-d979b561b2ce,2022-12-07 15:11:13,0.491572,1995,44.03,"{\n ""Autres"": 1,\n ""Autres Produits Laitiers...","{\n ""Biscuits"": 2,\n ""Biscuits apéritifs"": 2...",2.241215,48.738384,20.503329,"{\n ""Adoucissants et Soin du linge"": 18,\n ""...",181,Île-de-France
9,7daf3909-e8c4-487a-b8d1-cb6817d4e04d,2022-10-29 17:21:35,0.778244,1949,50.73,"{\n ""Adoucissants et Soin du linge"": 2,\n ""A...","{\n ""Animalerie, Soins et Hygiène"": 1,\n ""Ca...",5.879949,43.450736,14.697778,"{\n ""Adoucissants et Soin du linge"": 4,\n ""A...",54,Provence-Alpes-Côte d'Azur
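
The inventory columns in the preview above hold dictionary-like values. Assuming these values arrive as JSON-formatted strings (an assumption based on how they are rendered here, not something the tutorial states), they could be parsed with the standard library as sketched below. The sample string is illustrative only, because the preview truncates the real cell contents.

```python
import json

# Illustrative value only: the preview above truncates the real cell contents.
raw_value = '{\n  "Aide à la Pâtisserie": 1,\n  "Beurre": 2\n}'

# parse the JSON text into a regular Python dict
inventory = json.loads(raw_value)

# key with the largest value, and the sum of all values
top_key = max(inventory, key=inventory.get)
total_count = sum(inventory.values())

print(top_key, total_count)
```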


In [26]:
# list the feature lists in the catalog
catalog.list_feature_lists()

Unnamed: 0,id,name,num_feature,status,deployed,readiness_frac,online_frac,tables,entities,primary_entities,created_at
0,64c1620f660f7100dc1af61e,CustomerFeatures,11,DRAFT,False,0.0,0.0,"[GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS...","[grocerycustomer, frenchstate]",[grocerycustomer],2023-07-26 18:12:33.812
1,64c161d5660f7100dc1af60b,StateFeatureList,5,DRAFT,False,0.0,0.0,"[GROCERYCUSTOMER, GROCERYINVOICE, INVOICEITEMS...",[frenchstate],[frenchstate],2023-07-26 18:11:34.464
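
To pick out the feature lists whose primary entity matches a use case, the listing can be filtered on its `primary_entities` column. This is a minimal sketch that rebuilds the two rows shown above as a plain DataFrame instead of calling `catalog.list_feature_lists()`; the variable name `target_primary_entities` is introduced here for illustration.

```python
import pandas as pd

# Stand-in for catalog.list_feature_lists(), using the two rows shown above
feature_lists = pd.DataFrame(
    {
        "name": ["CustomerFeatures", "StateFeatureList"],
        "num_feature": [11, 5],
        "primary_entities": [["grocerycustomer"], ["frenchstate"]],
    }
)

# primary entity required by a customer-level use case
target_primary_entities = ["grocerycustomer"]

# keep feature lists whose primary entities match the target exactly
is_match = feature_lists["primary_entities"].apply(
    lambda entities: sorted(entities) == sorted(target_primary_entities)
)
matching_lists = feature_lists.loc[is_match]

print(matching_lists["name"].tolist())
```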


## Next Steps

Now that you've completed the quick start tutorial on reusing features, you can put your knowledge into practice or learn more:
1. Learn more about materializing features via the "Deep Dive Materializing Features" tutorial
2. Learn more about feature engineering via the "Deep Dive Feature Engineering" tutorial
3. Learn about data modeling via the "Deep Dive Data Modeling" tutorial