# EFS - Feature Store

## Disclaimer
The sample code (“Sample Code”) provided is not covered by any Teradata agreements. Please be aware that Teradata has no control over the model responses to such sample code and such response may vary. The use of the model by Teradata is strictly for demonstration purposes and does not constitute any form of certification or endorsement. The sample code is provided “AS IS” and any express or implied warranties, including the implied warranties of merchantability and fitness for a particular purpose, are disclaimed. In no event shall Teradata be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) sustained by you or a third party, however caused and on any theory of liability, whether in contract, strict liability, or tort arising in any way out of the use of this sample code, even if advised of the possibility of such damage.

## Context 
**Enterprise Feature Store**

This notebook demonstrates how to build, manage, and utilize an enterprise feature store using TeradataML. It covers the end-to-end workflow for feature engineering, ingestion, cataloging, and governance of features derived from business data such as sales, marketing, and transactions.

**Notebook Purpose:**
- Show how to create and manage a centralized feature repository for analytics and machine learning.
- Demonstrate feature engineering and ingestion from raw business datasets.
- Illustrate feature lineage, versioning, and governance for reproducible ML workflows.

## 1. Import the required libraries

In [None]:
from teradataml import create_context, DataFrame, FeatureStore, \
load_example_data, remove_context, in_schema, \
 db_drop_table, db_drop_view, read_csv, execute_sql
from getpass import getpass
from collections import OrderedDict
from teradatasqlalchemy import INTEGER, FLOAT, VARCHAR, DATE
from sqlalchemy import literal_column

## 2. Connect to Vantage with Admin user

Connecting to Vantage with an Admin user is required for initial setup tasks such as creating the feature store, configuring storage, and granting permissions to other users. These operations typically require elevated privileges.

In [5]:
context=create_context(config_file='admin_config_file.env')

## 3. Setup a Feature Store Repository

### 3.1. Create the FeatureStore

In [8]:
# Create a repo. 
fs = FeatureStore(repo='enterprise_feature_repo')

Repo enterprise_feature_repo does not exist. Run FeatureStore.setup() to create the repo and setup FeatureStore.


### 3.2. List the available feature stores. 

In [11]:
FeatureStore.list_repos()



repos
efs_demo
test_repo_delete_no


### 3.3. Setup the FeatureStore

In [13]:
fs.setup()

True

### 3.4. Grant the access to user

**Note:** 
Granting read/write access to a user is necessary so they can create, modify, and manage features and metadata within the feature store. This ensures the specified user has the required permissions to work with the feature store objects. If needed, you can later revoke these rights using `fs.revoke.read_write(username)`.

In [16]:
username = getpass(prompt = 'username: ')
fs.grant.read_write(username)

username:  ········


True

## 4. Connect to a Vantage system with user that has granted the permissions

### 4.1. Remove context with Admin user

In [22]:
# Disconnect from the system and connect to system again from the 'user'.
remove_context()

True

### 4.2. Create context with non-admin user

In [25]:
context=create_context(config_file='non_admin_config_file.env')

### 4.3. Create Feature Store object with non-admin user to ingest feature values

In [29]:
# First initiate feature store for the repo enterprise_feature_repo. Keep the data domain as analytics.
fs = FeatureStore(repo='enterprise_feature_repo', data_domain='Analytics')

FeatureStore is ready to use.


### Feature Engineering

<p>In this feature engineering process, we will use the <code>teradataml</code> package to create Teradata DataFrames that implement the required computations. The two main feature engineering tasks are:</p>

<ol>
    <li>
        <strong>Statistics on Customer:</strong> For each <code>customer ID</code>, we will compute:
        <ul>
            <li>The sum of all transaction amounts.</li>
            <li>The average amount per transaction.</li>
            <li>The total number of transactions.</li>
            <li>The number of days since the last transaction.</li>
        </ul>
    </li>
    <li>
        <strong>Spending Category Distribution:</strong> For each transaction category, we will compute:
        <ul>
            <li>The sum of transaction amounts.</li>
            <li>The mean, standard deviation, maximum, and median of transaction amounts.</li>
        </ul>
    </li>
</ol>

<p>These computations will result in two Teradata DataFrames:</p>
<ul>
    <li><code>df_eng_feat_cust</code>: Features computed per customer.</li>
    <li><code>df_eng_feat_cat</code>: Features computed for spending category distribution.</li>
</ul>

<p>Note that these DataFrames only implement the processing logic and do not generate data until explicitly stored or exported. When displaying the content of these DataFrames, only a sample of the results will be shown. To generate the actual data, you would need to either:</p>
<ul>
    <li>Store the results in another table within the database.</li>
    <li>Export the results to a <code>pandas</code> DataFrame, files, ...</li>
</ul>

## 5. Get Data For demo

### 5.1. Load the transaction Data

In [None]:
t_types = OrderedDict(CustomerID=INTEGER, Transaction_Amount=FLOAT, Date_transaction=DATE, Category=VARCHAR(200), MerchantID=INTEGER)
df = read_csv(table_name='transactions',
               filepath=r"../data/transactions.csv",
              types=t_types)

### 5.2. Perform Data Transformation

In [39]:
df_eng_feat_cust = df.groupby('CustomerID').agg({'Transaction_Amount' : ['sum','mean','count'], 'Date_transaction':['max']})

#### Statistics on customers

In [42]:
df_eng_feat_cust = df.groupby('CustomerID').assign(total_Transaction_Amount=df.Transaction_Amount.sum(),
                                                   avg_Transaction_Amount=df.Transaction_Amount.mean(),
                                                   count_Transaction_Amount=df.Transaction_Amount.count(),
                                                   max_Date_transaction=df.Date_transaction.max()
                                                  )
df_eng_feat_cust = df_eng_feat_cust.assign(nb_days_since_last_transactions = literal_column('INTERVAL(PERIOD(max_Date_transaction, CURRENT_DATE)) DAY(4)',type_= INTEGER))
df_eng_feat_cust = df_eng_feat_cust[['CustomerID','total_Transaction_Amount','avg_Transaction_Amount','count_Transaction_Amount','nb_days_since_last_transactions']]
df_eng_feat_cust



CustomerID,total_Transaction_Amount,avg_Transaction_Amount,count_Transaction_Amount,nb_days_since_last_transactions
329746,295.67,295.67,1,693
495867,56.49,56.49,1,835
486735,72.83,72.83,1,753
729641,132.64,132.64,1,955
639481,76.89,76.89,1,712
513867,92.15,92.15,1,803
739482,53.91,53.91,1,828
728136,287.39,287.39,1,863
417283,115.67,115.67,1,817
574328,145.52,145.52,1,943


#### Spending Category Distribution

In [44]:
df_eng_feat_cat = df.groupby('Category').agg({'Transaction_Amount':['sum','mean','std','min','max','median']})
df_eng_feat_cat = df_eng_feat_cat.join(
    df[['Category','Transaction_Amount']].groupby("Category").percentile(0.25),
    on = 'Category',
    how = 'inner',
    rprefix = 'r'
)[df_eng_feat_cat.columns + ['percentile_Transaction_Amount']]
df_eng_feat_cat = df_eng_feat_cat.assign(quartile_1_Transaction_Amount=df_eng_feat_cat.percentile_Transaction_Amount)
df_eng_feat_cat = df_eng_feat_cat[[c for c in df_eng_feat_cat.columns if c not in ['percentile_Transaction_Amount']]]
df_eng_feat_cat = df_eng_feat_cat.join(
    df[['Category','Transaction_Amount']].groupby("Category").percentile(0.25),
    on = 'Category',
    how = 'inner',
    rprefix = 'r'
)[df_eng_feat_cat.columns + ['percentile_Transaction_Amount']]
df_eng_feat_cat = df_eng_feat_cat.assign(quartile_3_Transaction_Amount=df_eng_feat_cat.percentile_Transaction_Amount)
df_eng_feat_cat = df_eng_feat_cat[[c for c in df_eng_feat_cat.columns if c not in ['percentile_Transaction_Amount']]]
df_eng_feat_cat = df_eng_feat_cat.assign(var_feature = 'test')
df_eng_feat_cat



Category,sum_Transaction_Amount,mean_Transaction_Amount,std_Transaction_Amount,min_Transaction_Amount,max_Transaction_Amount,median_Transaction_Amount,quartile_1_Transaction_Amount,quartile_3_Transaction_Amount,var_feature
Cosmetic,2653.4,156.08235294117648,25.111340199552156,91.58,198.45,154.89,148.52,148.52,test
Market,1130.4899999999998,62.80499999999999,20.27050167695156,38.76,132.15,58.605,54.2875,54.2875,test
Travel,3127.0400000000004,173.72444444444446,245.85946313774272,72.94,1134.47,100.005,94.76,94.76,test
Restaurant,1455.43,76.60157894736842,15.967152137072596,57.44,134.76,75.42,70.6,70.6,test
Electronics,5242.81,308.4005882352941,65.75844522479508,198.67,456.83,295.73,278.34,278.34,test
Clothing,2426.08,142.71058823529413,51.6305623481127,89.42,287.39,125.69,118.73,118.73,test


## 6. Store the data transformations

We are storing the transformation here. So, even if underlying data varies, the data transformation steps remain same.

In [51]:
df_eng_feat_cust = df_eng_feat_cust.create_view('FEAT_ENG_CUST')
df_eng_feat_cust



CustomerID,total_Transaction_Amount,avg_Transaction_Amount,count_Transaction_Amount,nb_days_since_last_transactions
639481,76.89,76.89,1,712
495867,56.49,56.49,1,835
486735,72.83,72.83,1,753
642109,198.73,198.73,1,758
329746,295.67,295.67,1,693
513867,92.15,92.15,1,803
739482,53.91,53.91,1,828
728136,287.39,287.39,1,863
475918,74.36,74.36,1,746
574328,145.52,145.52,1,943


In [53]:
df_eng_feat_cat = df_eng_feat_cat.create_view('FEAT_ENG_CAT')
df_eng_feat_cat



Category,sum_Transaction_Amount,mean_Transaction_Amount,std_Transaction_Amount,min_Transaction_Amount,max_Transaction_Amount,median_Transaction_Amount,quartile_1_Transaction_Amount,quartile_3_Transaction_Amount,var_feature
Travel,3127.0400000000004,173.72444444444446,245.85946313774272,72.94,1134.47,100.005,94.76,94.76,test
Clothing,2426.08,142.71058823529413,51.6305623481127,89.42,287.39,125.69,118.73,118.73,test
Electronics,5242.809999999999,308.40058823529404,65.75844522479535,198.67,456.83,295.73,278.34,278.34,test
Restaurant,1455.4299999999998,76.60157894736841,15.967152137072656,57.44,134.76,75.42,70.6,70.6,test
Market,1130.4899999999998,62.80499999999999,20.27050167695156,38.76,132.15,58.605,54.2875,54.2875,test
Cosmetic,2653.4,156.08235294117648,25.111340199552156,91.58,198.45,154.89,148.52,148.52,test


## 7. Ingest the features from `df_eng_feat_cust` datasource

### 7.1. See the mind_map for Feature Store

Before ingesting the features, first check the mind map of feature store.
Since no features are ingested and no datasets are built, it should be empty.

In [58]:
fs.mind_map()

### 7.2. Ingest the features

In [61]:
# Ingest the features for data source `df_eng_feat_cust`.
fp = fs.get_feature_process(object=df_eng_feat_cust,
                            entity='CustomerID',
                            features=[
                                'total_Transaction_Amount', 
                                'avg_Transaction_Amount', 
                                'count_Transaction_Amount', 
                                'nb_days_since_last_transactions'],
                            description='Feature Process for Customers'
                           )
# Run it to ingest the features.
fp.run()

Process 'c85cec83-8c6e-11f0-9841-b0dcef8381ea' started.
Process 'c85cec83-8c6e-11f0-9841-b0dcef8381ea' completed.


True

### 7.3. See the mind_map for Feature Store

We ingested three features—`total_Transaction_Amount`, `avg_Transaction_Amount`, `count_Transaction_Amount` and `nb_days_since_last_transactions`—from a single feature process. This demonstrates how multiple related features can be managed and tracked together within the feature store, maintaining their lineage to the originating process.

In [64]:
# Features are ingested. Before we move to other APIs, look at mind map.
fs.mind_map()

### 7.4. Explore FeatureStore for CustomerID entity

#### 7.4.1. list feature_processes

In [69]:
# List the feature processes.
fs.list_feature_processes()



process_id,description,data_domain,process_type,data_source,entity_id,feature_names,feature_ids,valid_start,valid_end
c85cec83-8c6e-11f0-9841-b0dcef8381ea,Feature Process for Customers,Analytics,denormalized view,"""ALICE"".""FEAT_ENG_CUST""",CustomerID,"avg_Transaction_Amount, count_Transaction_Amount, nb_days_since_last_transactions, total_Transaction_Amount",,2025-09-08 04:46:43.330000+00:,9999-12-31 23:59:59.999999+00:


#### 7.4.2. list feature_catalogs

In [71]:
# Look at feature catalogs.
fs.list_feature_catalogs()



entity_name,data_domain,feature_id,table_name,valid_start,valid_end
CustomerID,Analytics,2,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
CustomerID,Analytics,1,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
CustomerID,Analytics,4,FS_T_3937d0af_b408_517a_e99a_b6463b1e8e38,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
CustomerID,Analytics,3,FS_T_22063ce7_d537_561d_5cc0_b06120f5c2f0,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:


#### 7.4.3. list feature versions

##### 7.4.3.1. Get the feature catalog

In [74]:
fc = fs.get_feature_catalog()

##### 7.4.3.2. list feature versions

In [76]:
# Get the feature catalog and examine the versions.
fc.list_feature_versions()



entity_id,data_domain,id,name,table_name,feature_version
CustomerID,Analytics,2,avg_Transaction_Amount,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,1,total_Transaction_Amount,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,4,nb_days_since_last_transactions,FS_T_3937d0af_b408_517a_e99a_b6463b1e8e38,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,3,count_Transaction_Amount,FS_T_22063ce7_d537_561d_5cc0_b06120f5c2f0,c85cec83-8c6e-11f0-9841-b0dcef8381ea


## 8. Ingest the features from `df_eng_feat_cat` datasource

### 8.1. Ingest the features

In [83]:
# Let's ingest features from another data source `df_eng_feat_cat`.
fp2 = fs.get_feature_process(object=df_eng_feat_cat,
                            entity='Category',
                            features=['sum_Transaction_Amount',
                                      'mean_Transaction_Amount',
                                      'std_Transaction_Amount',
                                      'min_Transaction_Amount',
                                      'max_Transaction_Amount',
                                      'median_Transaction_Amount',
                                      'quartile_1_Transaction_Amount',
                                      'quartile_3_Transaction_Amount',
                                      'var_feature'],
                            description='Feature Process for Category'
                           )
fp2.run()

Process 'f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea' started.
Process 'f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea' completed.


True

### 8.2. See the mind_map for Feature Store

In [85]:
# Let's look at mind map now.
fs.mind_map()

### 8.3. Explore the FeatureStore for the operations that have been done so far

#### 8.3.1. List feture processes

In [88]:
# List the feature process'es.
fs.list_feature_processes()



process_id,description,data_domain,process_type,data_source,entity_id,feature_names,feature_ids,valid_start,valid_end
f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea,Feature Process for Category,Analytics,denormalized view,"""ALICE"".""FEAT_ENG_CAT""",Category,"max_Transaction_Amount, mean_Transaction_Amount, median_Transaction_Amount, min_Transaction_Amount, quartile_1_Transaction_Amount, quartile_3_Transaction_Amount, std_Transaction_Amount, sum_Transaction_Amount, var_feature",,2025-09-08 04:48:11.340000+00:,9999-12-31 23:59:59.999999+00:
c85cec83-8c6e-11f0-9841-b0dcef8381ea,Feature Process for Customers,Analytics,denormalized view,"""ALICE"".""FEAT_ENG_CUST""",CustomerID,"avg_Transaction_Amount, count_Transaction_Amount, nb_days_since_last_transactions, total_Transaction_Amount",,2025-09-08 04:46:43.330000+00:,9999-12-31 23:59:59.999999+00:


#### 8.3.2. List feature catalogs

In [90]:
# Look at feature catalogs.
display.max_rows=20
fs.list_feature_catalogs()



entity_name,data_domain,feature_id,table_name,valid_start,valid_end
CustomerID,Analytics,2,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,7,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,9,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,10,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,13,FS_T_44b9b0fd_ee25_32c3_6ae6_c0df59351fd7,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,12,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,8,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,6,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,5,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
Category,Analytics,11,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:


#### 8.3.3. List feature versions

In [96]:
# Get the feature catalog and examine the versions.
fc.list_feature_versions()



entity_id,data_domain,id,name,table_name,feature_version
Category,Analytics,10,median_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,13,var_feature,FS_T_44b9b0fd_ee25_32c3_6ae6_c0df59351fd7,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,12,quartile_3_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,8,min_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,5,sum_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
CustomerID,Analytics,3,count_Transaction_Amount,FS_T_22063ce7_d537_561d_5cc0_b06120f5c2f0,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,4,nb_days_since_last_transactions,FS_T_3937d0af_b408_517a_e99a_b6463b1e8e38,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,2,avg_Transaction_Amount,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,1,total_Transaction_Amount,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,c85cec83-8c6e-11f0-9841-b0dcef8381ea
Category,Analytics,6,mean_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea


#### 8.3.4. List dataset catalogs

In [99]:
# Let's look at dataset catalogs. Since no dataset is created, it should be empty.
fs.list_dataset_catalogs()



id,data_domain,name,entity_name,database_name,description,valid_start,valid_end


## 9. Build Dataset

### 9.1. Get the dataset catalog

In [103]:
# Let's build dataset in Dataset catalog. First get the pointer for Dataset catalog.
dc = fs.get_dataset_catalog()

### 9.2. Build the dataset for 'df_eng_feat_cust'

In [106]:
df = dc.build_dataset(entity='CustomerID',
                      selected_features={'avg_Transaction_Amount': fp.process_id,
                                         'total_Transaction_Amount': fp.process_id},
                      view_name='CustID_Transactions_1'
                     )
df



CustomerID,avg_Transaction_Amount,total_Transaction_Amount
817592,198.67,198.67
861131,57.44,57.44
228043,118.87,118.87
486735,72.83,72.83
357264,173.94,173.94
739526,67.52,67.52
546372,52.91,52.91
684935,125.89,125.89
329574,295.73,295.73
467220,129.35,129.35


##### Verify the data. 

In [108]:
df_eng_feat_cust[df.CustomerID.isin([467220, 684935, 642109])]



CustomerID,total_Transaction_Amount,avg_Transaction_Amount,count_Transaction_Amount,nb_days_since_last_transactions
684935,125.89,125.89,1,877
642109,198.73,198.73,1,758
467220,129.35,129.35,1,837


### 9.3. See the mind_map for Feature Store

In [110]:
# Let's look at mind map.
fs.mind_map()

### 9.4. Build the time series dataset for 'df_eng_feat_cust'

In [115]:
# Let's build a time series. 
df = dc.build_time_series(entity='Category',
                          selected_features={'std_Transaction_Amount': fp2.process_id,
                                             'max_Transaction_Amount': fp2.process_id,
                                             'median_Transaction_Amount': fp2.process_id,
                                             'quartile_3_Transaction_Amount': fp2.process_id},
                          view_name='Cat_Transactions_1'
                         )
df



Category,std_Transaction_Amount,std_Transaction_Amount_start_time,std_Transaction_Amount_end_time,max_Transaction_Amount,max_Transaction_Amount_start_time,max_Transaction_Amount_end_time,median_Transaction_Amount,median_Transaction_Amount_start_time,median_Transaction_Amount_end_time,quartile_3_Transaction_Amount,quartile_3_Transaction_Amount_start_time,quartile_3_Transaction_Amount_end_time
Clothing,51.6305623481127,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,287.39,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,125.69,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,118.73,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:
Electronics,65.75844522479508,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,456.83,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,295.73,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,278.34,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:
Cosmetic,25.111340199552156,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,198.45,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,154.89,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,148.52,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:
Market,20.27050167695156,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,132.15,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,58.605,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,54.2875,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:
Travel,245.85946313774275,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,1134.47,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,100.005,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,94.76,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:
Restaurant,15.967152137072596,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,134.76,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,75.42,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:,70.6,2025-09-08 04:48:07.120000+00:,9999-12-31 23:59:59.999999+00:


### 9.5. See the mind_map for Feature Store

In [117]:
# Let's look at mind map.
fs.mind_map()

### 9.6. List dataset catalogs 

In [119]:
# Let's look at Dataset catalog. Two datasets created. Hence both should be available.
fs.list_dataset_catalogs()



id,data_domain,name,entity_name,database_name,description,valid_start,valid_end
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,CustID_Transactions_1,CustomerID,enterprise_feature_repo,,2025-09-08 04:48:34.330000+00:,9999-12-31 23:59:59.999999+00:
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,Cat_Transactions_1,Category,enterprise_feature_repo,,2025-09-08 04:48:48.270000+00:,9999-12-31 23:59:59.999999+00:


## 10. Feature Ingestion with filters

### 10.1. Load the sales data 

In [125]:
# Ingest the features with filters.
load_example_data('dataframe', 'sales')
df = DataFrame("sales")
df





accounts,Feb,Jan,Mar,Apr,datetime
Yellow Inc,360.0,,,,04/01/2017
Jones LLC,800.0,600.0,560.0,720.0,04/01/2017
Orange Inc,840.0,,,1000.0,04/01/2017
Alpha Co,840.0,800.0,860.0,1000.0,04/01/2017
Blue Inc,360.0,200.0,380.0,404.0,04/01/2017
Red Inc,800.0,600.0,560.0,,04/01/2017


### 10.2. Ingest the features

In [127]:
# Ingest the features only when entities has 'Inc' in it.
fp3 = fs.get_feature_process(object=df, 
                             entity='accounts', 
                             features=['Jan', 'Feb'])
fp3.run(filters=df.accounts.str.contains('Inc')==1)

Process '2555081b-8c6f-11f0-a7d8-b0dcef8381ea' started.
Ingesting the features for filter 'CASE WHEN (accounts IS NULL) THEN NULL ELSE CASE WHEN (regexp_substr(accounts, 'Inc', 1, 1, 'c') IS NULL) THEN 0 ELSE 1 END END = 1' to catalog.
Process '2555081b-8c6f-11f0-a7d8-b0dcef8381ea' completed.


True

### 10.3. See the mind_map for Feature Store

In [129]:
# Let's look at mind map to see features are ingested or not. 
fs.mind_map()

### 10.4. Look at the feature table 

In [134]:
# Let's verify whether features ingested for that features or not. Look at catalog table.
fc.list_feature_versions()



entity_id,data_domain,id,name,table_name,feature_version
Category,Analytics,10,median_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,13,var_feature,FS_T_44b9b0fd_ee25_32c3_6ae6_c0df59351fd7,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,12,quartile_3_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,8,min_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,5,sum_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
CustomerID,Analytics,3,count_Transaction_Amount,FS_T_22063ce7_d537_561d_5cc0_b06120f5c2f0,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,4,nb_days_since_last_transactions,FS_T_3937d0af_b408_517a_e99a_b6463b1e8e38,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,2,avg_Transaction_Amount,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,c85cec83-8c6e-11f0-9841-b0dcef8381ea
CustomerID,Analytics,1,total_Transaction_Amount,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,c85cec83-8c6e-11f0-9841-b0dcef8381ea
accounts,Analytics,15,Feb,FS_T_62c9e14a_e0ca_d196_bca1_cc3436ebc70d,2555081b-8c6f-11f0-a7d8-b0dcef8381ea


In [135]:
DataFrame(in_schema(fs.repo, "FS_T_4fb46368_0c60_21bf_20a1_b0a3948fedde"))



accounts,feature_id,feature_value,feature_version,valid_start,valid_end,ValidPeriod
Yellow Inc,14,,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:15.640000+00:,9999-12-31 23:59:59.999999+00:,('2025-09-08 04:49:15.640000+0
Red Inc,14,600.0,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:15.640000+00:,9999-12-31 23:59:59.999999+00:,('2025-09-08 04:49:15.640000+0
Blue Inc,14,200.0,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:15.640000+00:,9999-12-31 23:59:59.999999+00:,('2025-09-08 04:49:15.640000+0
Orange Inc,14,,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:15.640000+00:,9999-12-31 23:59:59.999999+00:,('2025-09-08 04:49:15.640000+0


### 10.5. Ingest the features for an incremental load.

#### 10.5.1. let's first update the values of features. Otherwise, update is not visible.

In [139]:
execute_sql('update sales set Feb=Feb*2')

TeradataCursor uRowsHandle=589 bClosed=False

#### 10.5.2. Store the values for increamental load

In [141]:
fp4 = fs.get_feature_process(object=df, 
                             entity='accounts', 
                             features=['Jan', 'Feb']
                            )
fp4.run()

Process '2555081b-8c6f-11f0-a7d8-b0dcef8381ea' started.
Process '2555081b-8c6f-11f0-a7d8-b0dcef8381ea' completed.


True

### 10.6. Build the dataset

In [146]:
# Let's first look at the data. For that, let's build a dataset.
df_sales_jan_feb = dc.build_dataset(entity='accounts',
                                    selected_features={
                                        'Jan': fp4.process_id,
                                        'Feb': fp4.process_id
                                    },
                                   view_name='sales_jan_feb')

In [147]:
df_sales_jan_feb.sort('accounts')



accounts,Jan,Feb
Alpha Co,800.0,1680.0
Blue Inc,200.0,720.0
Jones LLC,600.0,1600.0
Orange Inc,,1680.0
Red Inc,600.0,1600.0
Yellow Inc,,720.0


## 11. Ingest the features at specific time

### 11.1. Create new data domain for specific time ingestion

In [152]:
# For that, let's create another data domain.
fs_sales_analytics = FeatureStore(repo='enterprise_feature_repo', data_domain='Sales_Analytics')

FeatureStore is ready to use.


### 11.2. See the FeatureStore for 'Sales_Analytics' data domain

In [154]:
# Let's first see what this data domain has. It should be empty.
fs_sales_analytics.mind_map()

### 11.3. Drop and load the sales data

In [157]:
# Let's drop the sales table and load it again.
db_drop_table('sales')
load_example_data('dataframe', 'sales')
df = DataFrame("sales")

### 11.4. Ingest the features

In [160]:
# Let's first ingest the features.
fp5 = fs_sales_analytics.get_feature_process(object=df, 
                                             entity='accounts', 
                                             features=['Jan', 'Feb', 'Mar', 'Apr']
                                             )
fp5.run()

Process '51acd5c4-8c6f-11f0-9396-b0dcef8381ea' started.
Process '51acd5c4-8c6f-11f0-9396-b0dcef8381ea' completed.


True

In [161]:
fs_sales_analytics.mind_map()

### 11.5. Look at Jan's values first.

In [166]:
fc_sales_analytics = fs_sales_analytics.get_feature_catalog()
fc_sales_analytics.list_feature_versions()



entity_id,data_domain,id,name,table_name,feature_version
accounts,Sales_Analytics,17,Feb,FS_T_9a644a9e_8cf7_5ac4_8c06_a621b7111619,51acd5c4-8c6f-11f0-9396-b0dcef8381ea
accounts,Sales_Analytics,16,Jan,FS_T_feb531d5_375e_5831_05d7_3ac2ac2a7965,51acd5c4-8c6f-11f0-9396-b0dcef8381ea
accounts,Sales_Analytics,19,Apr,FS_T_feb531d5_375e_5831_05d7_3ac2ac2a7965,51acd5c4-8c6f-11f0-9396-b0dcef8381ea
accounts,Sales_Analytics,18,Mar,FS_T_feb531d5_375e_5831_05d7_3ac2ac2a7965,51acd5c4-8c6f-11f0-9396-b0dcef8381ea


In [168]:
# Let's look at Jan's values first.
sales_jan = DataFrame(in_schema(fs.repo, "FS_T_feb531d5_375e_5831_05d7_3ac2ac2a7965")).as_of(valid_time='current')
sales_jan[sales_jan.feature_id == 16]



accounts,feature_id,feature_value,feature_version,valid_start,valid_end
Yellow Inc,16,,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:
Jones LLC,16,150.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:
Orange Inc,16,,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:
Alpha Co,16,200.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:
Blue Inc,16,50.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:
Red Inc,16,150.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:


### 11.5. Update the feature values to simulate the data change. 

In [170]:
execute_sql('update sales set Jan=Jan*2, Feb=Feb*2, Mar=Mar*2, Apr=Apr*2')

TeradataCursor uRowsHandle=858 bClosed=False

### 11.6. Ingest the same features again

In [172]:
fp6 = fs_sales_analytics.get_feature_process(object=df, 
                                             entity='accounts', 
                                             features=['Jan', 'Feb', 'Mar', 'Apr']
                                             )
fp6.run()

Process '51acd5c4-8c6f-11f0-9396-b0dcef8381ea' started.
Process '51acd5c4-8c6f-11f0-9396-b0dcef8381ea' completed.


True

In [173]:
# Let's look at mind map. No new feature process initiated. Hence it should be same as previous state.
fs_sales_analytics.mind_map()

#### look at features again. Observe that only the new values appeared.

**Note:** Old values are not visible since function as_of selected only current records.

In [180]:
sales_jan = DataFrame(in_schema(fs.repo, "FS_T_feb531d5_375e_5831_05d7_3ac2ac2a7965")).as_of(valid_time='current')
sales_jan[sales_jan.feature_id == 16]



accounts,feature_id,feature_value,feature_version,valid_start,valid_end
Yellow Inc,16,,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:
Jones LLC,16,300.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,9999-12-31 23:59:59.999999+00:
Red Inc,16,300.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,9999-12-31 23:59:59.999999+00:
Alpha Co,16,400.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,9999-12-31 23:59:59.999999+00:
Blue Inc,16,100.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,9999-12-31 23:59:59.999999+00:
Orange Inc,16,,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:


#### Select all values. Then notice the difference.

In [183]:
sales_jan = DataFrame(in_schema(fs.repo, "FS_T_feb531d5_375e_5831_05d7_3ac2ac2a7965")).as_of(valid_time=None)
sales_jan[sales_jan.feature_id == 16].sort(['accounts', 'valid_start'])



accounts,feature_id,feature_value,feature_version,valid_start,valid_end
Alpha Co,16,200.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,2025-09-08 04:51:32.020000+00:
Alpha Co,16,400.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,9999-12-31 23:59:59.999999+00:
Blue Inc,16,50.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,2025-09-08 04:51:32.020000+00:
Blue Inc,16,100.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,9999-12-31 23:59:59.999999+00:
Jones LLC,16,150.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,2025-09-08 04:51:32.020000+00:
Jones LLC,16,300.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,9999-12-31 23:59:59.999999+00:
Orange Inc,16,,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:
Red Inc,16,150.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,2025-09-08 04:51:32.020000+00:
Red Inc,16,300.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,9999-12-31 23:59:59.999999+00:
Yellow Inc,16,,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:


### 11.7. Update the data again to ingest for a specific time.

In [185]:
# Double the data again. This time ingest for a specific time.
# Update the feature values to simulate the data change. 
execute_sql('update sales set Jan=Jan*2, Feb=Feb*2, Mar=Mar*2, Apr=Apr*2')

TeradataCursor uRowsHandle=940 bClosed=False

### 11.8. Ingest feature value for specific time

In [188]:
# Let's first ingest the features.
fp7 = fs_sales_analytics.get_feature_process(object=df, 
                                             entity='accounts', 
                                             features=['Jan', 'Feb', 'Mar', 'Apr'],
                                             )
fp7.run(as_of='2025-09-08 04:51:59.000000+00:00')

Process '51acd5c4-8c6f-11f0-9396-b0dcef8381ea' started.
Process '51acd5c4-8c6f-11f0-9396-b0dcef8381ea' completed.


True

In [189]:
# Let's verify the data again.
sales_jan = DataFrame(in_schema(fs.repo, "FS_T_feb531d5_375e_5831_05d7_3ac2ac2a7965")).as_of(valid_time=None)
sales_jan[sales_jan.feature_id == 16].sort(['accounts', 'valid_start'])



accounts,feature_id,feature_value,feature_version,valid_start,valid_end
Alpha Co,16,200.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,2025-09-08 04:51:32.020000+00:
Alpha Co,16,400.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,2025-09-08 04:51:59.000000+00:
Alpha Co,16,800.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:59.000000+00:,9999-12-31 23:59:59.999999+00:
Blue Inc,16,50.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,2025-09-08 04:51:32.020000+00:
Blue Inc,16,100.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,2025-09-08 04:51:59.000000+00:
Blue Inc,16,200.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:59.000000+00:,9999-12-31 23:59:59.999999+00:
Jones LLC,16,150.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,2025-09-08 04:51:32.020000+00:
Jones LLC,16,300.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:32.020000+00:,2025-09-08 04:51:59.000000+00:
Jones LLC,16,600.0,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:51:59.000000+00:,9999-12-31 23:59:59.999999+00:
Orange Inc,16,,51acd5c4-8c6f-11f0-9396-b0dcef8381ea,2025-09-08 04:50:31.840000+00:,9999-12-31 23:59:59.999999+00:


### 11.9. See the mind_map for fs_sales_analytics FeatureStore

In [193]:
# Remove the whole data domain.
# Before removing it, Let's first look at feature store for this data domain.
fs_sales_analytics.mind_map()

## 12. Remove 'Sales_Analytics' DataDomain

### 12.1. Remove DataDomain

In [197]:
# Remove it and then again look at mind map.
fs_sales_analytics.remove_data_domain()

The function will remove the data domain 'Sales_Analytics' and all associated objects. Are you sure you want to proceed? (Y/N):  y


Data domain 'Sales_Analytics' is removed from the FeatureStore.


True

### 12.2. See the FeatureStore for Sales_analytics data domain

In [200]:
# Verify the objects.
fs_sales_analytics.mind_map()

### 12.3. See the FeatureStore Analytics data domain

In [203]:
# Also verify the objects of fs. Since only data domain sales analytics is removed, fs objects should remain as it is.
fs.mind_map()

## 13. Dataset Management

### 13.1. List datasets 

In [207]:
dc.list_datasets()



id,data_domain,name,entity_name,description,valid_start,valid_end
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,CustID_Transactions_1,CustomerID,,2025-09-08 04:48:34.330000+00:,9999-12-31 23:59:59.999999+00:
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,Cat_Transactions_1,Category,,2025-09-08 04:48:48.270000+00:,9999-12-31 23:59:59.999999+00:
aefc2592-176f-441d-b81b-4a3fda586c8e,Analytics,sales_jan_feb,accounts,,2025-09-08 04:49:46.700000+00:,9999-12-31 23:59:59.999999+00:


### 13.2. List features

In [210]:
dc.list_features()



dataset_id,data_domain,feature_id,feature_name,feature_view
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,10,median_Transaction_Amount,Cat_Transactions_1
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,9,max_Transaction_Amount,Cat_Transactions_1
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,7,std_Transaction_Amount,Cat_Transactions_1
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,1,total_Transaction_Amount,CustID_Transactions_1
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,2,avg_Transaction_Amount,CustID_Transactions_1
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,12,quartile_3_Transaction_Amount,Cat_Transactions_1
aefc2592-176f-441d-b81b-4a3fda586c8e,Analytics,14,Jan,sales_jan_feb
aefc2592-176f-441d-b81b-4a3fda586c8e,Analytics,15,Feb,sales_jan_feb


### 13.2. List Entities 

In [213]:
dc.list_entities()



id,data_domain,name,entity_name,description
aefc2592-176f-441d-b81b-4a3fda586c8e,Analytics,sales_jan_feb,accounts,
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,CustID_Transactions_1,CustomerID,
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,Cat_Transactions_1,Category,


### 13.4.Acrhive datasets

In [215]:
dc.archive_datasets('aefc2592-176f-441d-b81b-4a3fda586c8e')

Dataset id(s) 'aefc2592-176f-441d-b81b-4a3fda586c8e' is/are archived from the dataset catalog.


True

In [217]:
dc.list_datasets()



id,data_domain,name,entity_name,description,valid_start,valid_end
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,CustID_Transactions_1,CustomerID,,2025-09-08 04:48:34.330000+00:,9999-12-31 23:59:59.999999+00:
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,Cat_Transactions_1,Category,,2025-09-08 04:48:48.270000+00:,9999-12-31 23:59:59.999999+00:
aefc2592-176f-441d-b81b-4a3fda586c8e,Analytics,sales_jan_feb,accounts,,2025-09-08 04:49:46.700000+00:,2025-09-08 04:53:20.060000+00:


### 13.5. Delete datasets

In [219]:
# Delete the archived dataset.
dc.delete_datasets('aefc2592-176f-441d-b81b-4a3fda586c8e')

Dataset id(s) 'aefc2592-176f-441d-b81b-4a3fda586c8e' is/are deleted from the dataset catalog.


True

In [220]:
# Verify the datasets after deleting dataset 02eadbbb-aa37-431c-9489-92fc30f386b8.
dc.list_datasets()



id,data_domain,name,entity_name,description,valid_start,valid_end
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,CustID_Transactions_1,CustomerID,,2025-09-08 04:48:34.330000+00:,9999-12-31 23:59:59.999999+00:
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,Cat_Transactions_1,Category,,2025-09-08 04:48:48.270000+00:,9999-12-31 23:59:59.999999+00:


### 13.6. List features

In [221]:
# Verify the features after deleting dataset 02eadbbb-aa37-431c-9489-92fc30f386b8.
dc.list_features()



dataset_id,data_domain,feature_id,feature_name,feature_view
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,9,max_Transaction_Amount,Cat_Transactions_1
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,1,total_Transaction_Amount,CustID_Transactions_1
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,2,avg_Transaction_Amount,CustID_Transactions_1
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,7,std_Transaction_Amount,Cat_Transactions_1
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,12,quartile_3_Transaction_Amount,Cat_Transactions_1
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,10,median_Transaction_Amount,Cat_Transactions_1


### 13.7. List entities

In [225]:
# Verify the Entities after deleting dataset 02eadbbb-aa37-431c-9489-92fc30f386b8.
dc.list_entities()



id,data_domain,name,entity_name,description
a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff,Analytics,CustID_Transactions_1,CustomerID,
b85a6e1a-5813-4dd5-9114-04afe98ea68b,Analytics,Cat_Transactions_1,Category,


### 13.8. See mind_map for FeatureStore after deleting dataset

In [227]:
# Let's look at mind map of Feature store now. Deleted dataset should not appear.
fs.mind_map()

## 14. Dataset

### 14.1. Explore Dataset methods

#### 14.1.1. Get the dataset

In [230]:
ds = dc.get_dataset('a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff')

### 14.2. Explore Properties

#### 14.2.1. features

In [234]:
ds.features

[Feature(name=total_Transaction_Amount), Feature(name=avg_Transaction_Amount)]

#### 14.2.2. entity

In [236]:
ds.entity

[Entity(name=CustomerID)]

#### 14.2.3. id

In [239]:
ds.id

'a4ad22e3-3572-46aa-bbff-cb5cfe9b5fff'

#### 14.2.4. view_name

In [241]:
ds.view_name

'CustID_Transactions_1'

## 15. Feature Management

### 15.1. List features

In [244]:
fc.list_features()



feature_id,name,entity_name,data_type,feature_type,valid_start,valid_end
9,max_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
3,count_Transaction_Amount,CustomerID,INTEGER,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
1,total_Transaction_Amount,CustomerID,FLOAT,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
7,std_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
4,nb_days_since_last_transactions,CustomerID,INTERVAL_DAY,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
15,Feb,accounts,FLOAT,CONTINUOUS,2025-09-08 04:49:12.050000+00:,9999-12-31 23:59:59.999999+00:
14,Jan,accounts,BIGINT,CONTINUOUS,2025-09-08 04:49:12.050000+00:,9999-12-31 23:59:59.999999+00:
6,mean_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
12,quartile_3_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
5,sum_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:


### 15.2. list feature versions

In [246]:
fc.list_feature_versions()



entity_id,data_domain,id,name,table_name,feature_version
CustomerID,Analytics,2,avg_Transaction_Amount,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,c85cec83-8c6e-11f0-9841-b0dcef8381ea
accounts,Analytics,15,Feb,FS_T_62c9e14a_e0ca_d196_bca1_cc3436ebc70d,2555081b-8c6f-11f0-a7d8-b0dcef8381ea
accounts,Analytics,14,Jan,FS_T_4fb46368_0c60_21bf_20a1_b0a3948fedde,2555081b-8c6f-11f0-a7d8-b0dcef8381ea
Category,Analytics,7,std_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,10,median_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,11,quartile_1_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,13,var_feature,FS_T_44b9b0fd_ee25_32c3_6ae6_c0df59351fd7,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,12,quartile_3_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,8,min_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,6,mean_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea


In [249]:
# Looks to be a bug. Investigation is in progress.
fc.features

[Feature(name=median_Transaction_Amount),
 Feature(name=Feb),
 Feature(name=count_Transaction_Amount),
 Feature(name=quartile_1_Transaction_Amount),
 Feature(name=var_feature),
 Feature(name=Jan),
 Feature(name=nb_days_since_last_transactions),
 Feature(name=total_Transaction_Amount),
 Feature(name=quartile_3_Transaction_Amount),
 Feature(name=avg_Transaction_Amount),
 Feature(name=mean_Transaction_Amount),
 Feature(name=max_Transaction_Amount),
 Feature(name=std_Transaction_Amount),
 Feature(name=sum_Transaction_Amount),
 Feature(name=min_Transaction_Amount)]

In [251]:
fc.entities

[Entity(name=Category), Entity(name=accounts), Entity(name=CustomerID)]

### 15.3. Archive features

In [261]:
# Let's archive a feature which is associated with any dataset. 
fc.archive_features('Jan')

Feature 'Jan' is archived from table 'FS_T_4fb46368_0c60_21bf_20a1_b0a3948fedde'.
Feature 'Jan' is archived from metadata.


True

In [262]:
# Observe that the Jan record is closed.
fc.list_features()



feature_id,name,entity_name,data_type,feature_type,valid_start,valid_end
1,total_Transaction_Amount,CustomerID,FLOAT,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
14,Jan,accounts,BIGINT,CONTINUOUS,2025-09-08 04:49:12.050000+00:,2025-09-08 04:54:52.950000+00:
6,mean_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
13,var_feature,Category,VARCHAR,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
9,max_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
7,std_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
12,quartile_3_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
4,nb_days_since_last_transactions,CustomerID,INTERVAL_DAY,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
11,quartile_1_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
15,Feb,accounts,FLOAT,CONTINUOUS,2025-09-08 04:49:12.050000+00:,9999-12-31 23:59:59.999999+00:


### 15.3. Look at the feature values also. Even they all should be closed.

In [266]:
# Look at the feature values also. Even they all should be closed.
display.max_rows=30
fc.list_feature_versions()



entity_id,data_domain,id,name,table_name,feature_version
CustomerID,Analytics,2,avg_Transaction_Amount,FS_T_14603143_aa22_6a4d_42f7_adfb07356c0c,c85cec83-8c6e-11f0-9841-b0dcef8381ea
accounts,Analytics,15,Feb,FS_T_62c9e14a_e0ca_d196_bca1_cc3436ebc70d,2555081b-8c6f-11f0-a7d8-b0dcef8381ea
Category,Analytics,7,std_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,9,max_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,11,quartile_1_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,13,var_feature,FS_T_44b9b0fd_ee25_32c3_6ae6_c0df59351fd7,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,12,quartile_3_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,8,min_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,6,mean_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea
Category,Analytics,5,sum_Transaction_Amount,FS_T_9e4a4f56_14ff_7629_68eb_ce1cfa28c901,f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea


In [272]:
ndf = DataFrame(in_schema(fs.repo, "FS_T_4fb46368_0c60_21bf_20a1_b0a3948fedde")).as_of(valid_time='current')
ndf[ndf.feature_id==14]



accounts,feature_id,feature_value,feature_version,valid_start,valid_end


In [278]:
# Retrieve all the records. Since Jan is still not deleted, as_of with valid_time as None should still show results.
# However, all the records should be closed.
ndf = DataFrame(in_schema(fs.repo, "FS_T_4fb46368_0c60_21bf_20a1_b0a3948fedde")).as_of(valid_time=None)
ndf[ndf.feature_id==14]



accounts,feature_id,feature_value,feature_version,valid_start,valid_end
Yellow Inc,14,,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:15.640000+00:,2025-09-08 04:54:52.950000+00:
Alpha Co,14,800.0,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:38.600000+00:,2025-09-08 04:54:52.950000+00:
Jones LLC,14,600.0,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:38.600000+00:,2025-09-08 04:54:52.950000+00:
Orange Inc,14,,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:15.640000+00:,2025-09-08 04:54:52.950000+00:
Blue Inc,14,200.0,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:15.640000+00:,2025-09-08 04:54:52.950000+00:
Red Inc,14,600.0,2555081b-8c6f-11f0-a7d8-b0dcef8381ea,2025-09-08 04:49:15.640000+00:,2025-09-08 04:54:52.950000+00:


### 15.4. Deletes features

In [281]:
# Let's delete the archived feature.
fc.delete_features('Jan')

Feature 'Jan' is deleted from table 'FS_T_4fb46368_0c60_21bf_20a1_b0a3948fedde'.
Feature 'Jan' is deleted from metadata.
Table 'FS_T_4fb46368_0c60_21bf_20a1_b0a3948fedde' is dropped as it is not referenced in metadata.


True

In [282]:
# Verify Jan is completly removed or not.
fc.list_features()



feature_id,name,entity_name,data_type,feature_type,valid_start,valid_end
7,std_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
4,nb_days_since_last_transactions,CustomerID,INTERVAL_DAY,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
13,var_feature,Category,VARCHAR,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
11,quartile_1_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
5,sum_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
3,count_Transaction_Amount,CustomerID,INTEGER,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
1,total_Transaction_Amount,CustomerID,FLOAT,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
9,max_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
12,quartile_3_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
6,mean_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:


### 15.5. Let's look at the mind_map now

In [286]:
# Let's look at mind_map now. 
fs.mind_map()

### 15.6. List feature processes

In [289]:
# Let's remove the feature process. First look at feature processes.
fs.list_feature_processes()



process_id,description,data_domain,process_type,data_source,entity_id,feature_names,feature_ids,valid_start,valid_end
2555081b-8c6f-11f0-a7d8-b0dcef8381ea,,Analytics,denormalized view,"""sales""",accounts,"Feb, Jan",,2025-09-08 04:49:17.020000+00:,9999-12-31 23:59:59.999999+00:
c85cec83-8c6e-11f0-9841-b0dcef8381ea,Feature Process for Customers,Analytics,denormalized view,"""ALICE"".""FEAT_ENG_CUST""",CustomerID,"avg_Transaction_Amount, count_Transaction_Amount, nb_days_since_last_transactions, total_Transaction_Amount",,2025-09-08 04:46:43.330000+00:,9999-12-31 23:59:59.999999+00:
f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea,Feature Process for Category,Analytics,denormalized view,"""ALICE"".""FEAT_ENG_CAT""",Category,"max_Transaction_Amount, mean_Transaction_Amount, median_Transaction_Amount, min_Transaction_Amount, quartile_1_Transaction_Amount, quartile_3_Transaction_Amount, std_Transaction_Amount, sum_Transaction_Amount, var_feature",,2025-09-08 04:48:11.340000+00:,9999-12-31 23:59:59.999999+00:


### 15.7. Achive feature process

In [291]:
fs.archive_feature_process('2555081b-8c6f-11f0-a7d8-b0dcef8381ea')

Feature 'Feb' is archived from table 'FS_T_62c9e14a_e0ca_d196_bca1_cc3436ebc70d'.
Feature 'Feb' is archived from metadata.
Feature 'Jan' does not exist in feature catalog.
FeatureProcess with process id '2555081b-8c6f-11f0-a7d8-b0dcef8381ea' is archived.


False

In [292]:
# Let's look at feature processes. Observe that feature process '85dc42e8-670f-11f0-bb98-c934b24a960f' is also archived. It is closed.
fs.list_feature_processes()



process_id,description,data_domain,process_type,data_source,entity_id,feature_names,feature_ids,valid_start,valid_end
f8ff0505-8c6e-11f0-b8d3-b0dcef8381ea,Feature Process for Category,Analytics,denormalized view,"""ALICE"".""FEAT_ENG_CAT""",Category,"max_Transaction_Amount, mean_Transaction_Amount, median_Transaction_Amount, min_Transaction_Amount, quartile_1_Transaction_Amount, quartile_3_Transaction_Amount, std_Transaction_Amount, sum_Transaction_Amount, var_feature",,2025-09-08 04:48:11.340000+00:,9999-12-31 23:59:59.999999+00:
c85cec83-8c6e-11f0-9841-b0dcef8381ea,Feature Process for Customers,Analytics,denormalized view,"""ALICE"".""FEAT_ENG_CUST""",CustomerID,"avg_Transaction_Amount, count_Transaction_Amount, nb_days_since_last_transactions, total_Transaction_Amount",,2025-09-08 04:46:43.330000+00:,9999-12-31 23:59:59.999999+00:
2555081b-8c6f-11f0-a7d8-b0dcef8381ea,,Analytics,denormalized view,"""sales""",accounts,"Feb, Jan",,2025-09-08 04:49:17.020000+00:,2025-09-08 04:58:14.350000+00:


In [293]:
# Let's look at catalog data. Observe that feature Feb is also archived. It is closed.
fc.list_features()



feature_id,name,entity_name,data_type,feature_type,valid_start,valid_end
1,total_Transaction_Amount,CustomerID,FLOAT,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
6,mean_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
7,std_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
12,quartile_3_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
13,var_feature,Category,VARCHAR,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
11,quartile_1_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
9,max_Transaction_Amount,Category,FLOAT,CONTINUOUS,2025-09-08 04:48:03.760000+00:,9999-12-31 23:59:59.999999+00:
4,nb_days_since_last_transactions,CustomerID,INTERVAL_DAY,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:
15,Feb,accounts,FLOAT,CONTINUOUS,2025-09-08 04:49:12.050000+00:,2025-09-08 04:58:14.350000+00:
3,count_Transaction_Amount,CustomerID,INTEGER,CONTINUOUS,2025-09-08 04:46:37.800000+00:,9999-12-31 23:59:59.999999+00:


### 15.8. Delete feature process

In [297]:
# Delete the archived feature process.
fs.delete_feature_process('2555081b-8c6f-11f0-a7d8-b0dcef8381ea')

Feature 'Feb' is deleted from table 'FS_T_62c9e14a_e0ca_d196_bca1_cc3436ebc70d'.
Feature 'Feb' is deleted from metadata.
Feature 'Jan' does not exist in feature catalog.
Table 'FS_T_62c9e14a_e0ca_d196_bca1_cc3436ebc70d' is dropped as it is not referenced in metadata.
FeatureProcess with process id '2555081b-8c6f-11f0-a7d8-b0dcef8381ea' is deleted.


False

### 15.9. Let's look at mind map now.

In [300]:
fs.mind_map()

In [305]:
# look at mind map now. All should be empty.
fs_sales_analytics.mind_map()

### 15.10. Remove data domain

In [308]:
# Let's remove the whole data domain. That should remove all objects.
fs.remove_data_domain()

The function will remove the data domain 'Analytics' and all associated objects. Are you sure you want to proceed? (Y/N):  y


Data domain 'Analytics' is removed from the FeatureStore.


True

## 16. Cleanup 

### 16.1. Drop views

In [311]:
db_drop_view('FEAT_ENG_CUST')

True

In [312]:
db_drop_view('FEAT_ENG_CAT')

True

### 16.2. Drop table

In [313]:
db_drop_table('transactions')

True

In [316]:
db_drop_table('sales')

True

### 16.3. Remove the Context

In [319]:
# Let's remove the feature store. First, disconnect from user and connect with DB Admin. 
# Removing Feature Store drops the database and all associated objects. So, doing it only with DB Admin.
# First disconnect from current user and establish connection with DBAdmin.
remove_context()

True

### 16.3. Delete Feature Store

In [321]:
context=create_context(config_file='admin_config_file.env')

**Note** : This will drop the database if all objects are removed.

In [323]:
fs.delete()

The function removes Feature Store and drops the corresponding repo also. Are you sure you want to proceed? (Y/N):  y


True

In [326]:
# Let's verify the repos. It should be empty.
FeatureStore.list_repos()



repos
test_repo_delete_no
efs_demo


In [327]:
# Finally disconnect the DB Admin from Vantage.
remove_context()

True