<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Using the Enterprise Feature Store Functions
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>


<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>Introduction</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
Teradata Enterprise Feature Store (EFS) Functions handle feature management within the Vantage environment. While inspired by the syntax of Feast, Teradata EFS Functions are tailored specifically to Teradata Vantage, offering efficient and robust data management and feature handling.
</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
 Teradata EFS Functions use Teradata DataFrames for feature management, in contrast to the pandas DataFrames used by Feast. With Teradata DataFrames, we avoid extracting the data to create or use Features from the Enterprise Feature Store (EFS). The EFS Functions are built to help Data Science teams manage features effectively. This notebook walks you through the capabilities of EFS Functions and shows how they integrate with your data models and processes.

</p>





<div style = 'font-size:16px;font-family:Arial;color:#00233C'>
<p style = 'font-size:20px;font-family:Arial;color:#00233C'>
<b>Key Concepts of the Enterprise Feature Store (EFS) SDK
</b>
</p>
<p>The Enterprise Feature Store (EFS) SDK follows a fully object-oriented design that focuses on intuitive interaction with feature stores. Central to this design are several core objects: Feature, Entity, DataSource, and FeatureGroup, organized within a Repository. Together, these objects let you manage and use features efficiently within your data ecosystem, with Teradata Vantage storing the metadata. Here's a closer look at each of these objects and their roles:
</p>
</div>

<center><img src="EFS_key_concepts.png" alt="efs"></center>


<div style = 'font-size:16px;font-family:Arial;color:#00233C'>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'>
<b>Feature
</b>
</p>
<p>
A Feature represents a single, distinct piece of data that can be used in machine learning models. Features are the fundamental building blocks of the EFS, designed to encapsulate specific data types, validation rules, and metadata essential for downstream analysis and modeling.
</p>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'>
<b>Entity
</b>
</p>
<p>
An Entity serves as the anchor for one or more Features, grouping them by a common identifier. This shared identifier ensures that features within an entity relate to the same logical unit, such as a customer or transaction. The Entity concept ensures data consistency and simplifies the management of feature relationships.
</p>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'>
<b>Data Source
</b>
</p>

The DataSource object provides a flexible mapping between the results of a SQL query or DataFrame and Features. It describes how raw data from Teradata Vantage can be transformed into structured features ready for machine learning. This abstraction allows for the separation of data retrieval logic from feature management, promoting modularity and reuse.

<p style = 'font-size:18px;font-family:Arial;color:#00233C'>
<b>Feature Group
</b>
</p>

A FeatureGroup represents a collection of Features that are related by a common Entity and originate from the same DataSource. By grouping features this way, the EFS SDK encourages logical organization of features and simplifies batch operations like updates, retrievals, and analysis.


<p style = 'font-size:18px;font-family:Arial;color:#00233C'>
<b>Repository
</b>
</p>
A Repository is a logical workspace in which users work on their Feature Groups. Features can be promoted from one repository to another, which makes it possible to have personal repositories (for example, a Lab), team repositories for collaboration, and a central production repository.
</div>
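
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
To make the relationships between these objects concrete, the sketch below models them with plain Python dataclasses. It is a toy illustration only: the <code>Toy*</code> classes are hypothetical stand-ins and are not the teradataml classes used later in this notebook, and the attribute names are assumptions chosen to mirror the descriptions above.
</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyFeature:
    name: str          # name under which the feature is registered
    column_name: str   # source column that backs the feature


@dataclass
class ToyEntity:
    name: str
    columns: List[str]  # identifier column(s) shared by the grouped features


@dataclass
class ToyDataSource:
    name: str
    source: str                               # here: a SQL query string
    timestamp_col_name: Optional[str] = None  # column marking record creation time


@dataclass
class ToyFeatureGroup:
    name: str
    features: List[ToyFeature]
    entity: ToyEntity
    data_source: ToyDataSource


@dataclass
class ToyRepository:
    name: str
    feature_groups: List[ToyFeatureGroup] = field(default_factory=list)


# One FeatureGroup: two Features, one Entity, one DataSource, inside one Repository.
group = ToyFeatureGroup(
    name="PatientProfileObjs",
    features=[ToyFeature("PatientAge", "age"), ToyFeature("PatientBMI", "bmi")],
    entity=ToyEntity("PatientEntity", ["patient_id"]),
    data_source=ToyDataSource(
        "PatientProfileSource", "SELECT * FROM patient_profile", "record_timestamp"
    ),
)
lab_repo = ToyRepository("LabRepoOne", [group])
feature_names = [f.name for f in lab_repo.feature_groups[0].features]
print(feature_names)
```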


<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>What You Will Do in This Notebook</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
This notebook is designed to guide you through a series of practical exercises that demonstrate the use of Teradata's Enterprise Feature Store capabilities. By the end of this tutorial, you will have a comprehensive understanding of how to manage and utilize feature stores for machine learning workflows. Here's what you'll learn:
</p>

<div style = 'font-size:16px;font-family:Arial;color:#00233C'>
<ol>
<li>
<b>Set Up a Feature Store Repository and Grant Users Access to It</b>
<ul>
<li>Learn how to set up a new feature store repository, using Feature Groups, which serves as the foundational environment for storing and managing your data features.</li>
<li>The owner of the FeatureStore can grant or revoke read-only, write-only, or read-and-write authorization for other users.</li>
</ul>
</li>

<li>
<b>Create and Register objects with FeatureStore</b>
<ul>
    <li>Discover <span style="color: #FF4500">how to create a Feature</span> from Teradata DataFrame and register the feature with the Teradata Enterprise Feature Store.</li>
    <li>Discover <span style="color: #FF4500">how to create an Entity</span> from Teradata DataFrame and register the Entity with the Teradata Enterprise Feature Store.</li>
    <li>Discover <span style="color: #FF4500">different ways to create DataSource</span> and register DataSource with the Teradata Enterprise Feature Store.</li>
    <li>Discover <span style="color: #FF4500">different ways to create FeatureGroup</span> and register FeatureGroup with the Teradata Enterprise Feature Store.</li>
</ul>
</li>

<li>
<b>Searching inside Teradata Enterprise Feature Store</b>
<ul>
<li>Explore methods to search in Features, Entities, DataSources and FeatureGroups. </li>
</ul>
</li>


<li>
<b>Modifying FeatureStore Objects</b>
<ul>
<li>Explore methods to modify existing features and other objects within your Enterprise Feature Store to adapt to changes in your data or analysis requirements. </li>
</ul>
</li>

<li>
<b>Combining multiple FeatureGroups.</b>
<ul>
<li>Explore a way to combine multiple FeatureGroups to a single FeatureGroup and store the combined FeatureGroup within your Enterprise Feature Store. </li>
</ul>
</li>


<li>
<b>Archive and Delete objects in FeatureStore</b>
<ul>
<li>Explore methods to archive and delete different objects from the FeatureStore. </li>
</ul>
</li>


<li>
<b>Creating Datasets and historic Datasets for ML models</b>
<ul>
<li>Teradata EFS approach to get Datasets or historic Datasets to feed your ML Model.</li>
</ul>
</li>


<li>
<b>Use Enterprise Feature Store with teradataml analytic functions</b>
<ul>
<li>Apply Teradata EFS Functions to create a dataset suitable for training a machine learning model, ensuring it is clean, well-structured, and aligned with your model's requirements.</li>
</ul>
</li>


<li>
<b>Repository Governance</b>
<ul>
<li>Promote features from one repository to another repository.</li>
</ul>
</li>
</ol>
</div>

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>1. Connect to Vantage</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Here, we import the required libraries, set environment variables and environment paths (if required).</p>

In [None]:
%%capture
!pip install --upgrade teradataml

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Enterprise Feature Store is a new feature added in teradataml 20.0.0.3, so we upgrade the installed teradataml version.</p>

<div class="alert alert-block alert-info">
<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>Note: </b><i>The install statement above may be needed if you run the notebooks on a platform other than ClearScape Analytics Experience that does not have the libraries installed. If you run it, be sure to restart the kernel afterwards to bring the installed libraries into memory. The simplest way to restart the kernel is by typing zero zero: <b> 0 0</b></i></p>
</div>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>You will be prompted to provide the password. Enter your password, press the Enter key, then use down arrow to go to next cell. Begin running steps with Shift + Enter keys.</p>

In [None]:
from getpass import getpass
from teradataml import (
    create_context,
    execute_sql,
    DataFrame,
    DataSource,
    Entity,
    FeatureGroup,
    FeatureStore,
    FeatureType,
    FeatureStatus,
    load_example_data,
    remove_context,
    Feature,
    XGBoost
)


In [None]:
%run -i ../../UseCases/startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)

In [None]:
%%capture
execute_sql('''SET query_band='DEMO=PP_EFS_Getting_Started_Python.ipynb;' UPDATE FOR SESSION; ''')

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>2. Getting Data for This Demo</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
In this tutorial, we will use the <span style="background-color: #eee; font-style: italic; "> load_example_data() </span> function provided by teradataml, which loads the example data into Vantage.
</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
This will create two tables in Vantage. 
    <ul>
        <li style = 'font-size:16px;font-family:Arial;color:#00233C'>patient_profile</li>
        <li style = 'font-size:16px;font-family:Arial;color:#00233C'>medical_readings</li></ul></p>


In [None]:
load_example_data('dataframe', 'patient_profile')
load_example_data('dataframe', 'medical_readings')

In [None]:
patient_profile_df = DataFrame('patient_profile')
patient_profile_df

In [None]:
medical_readings_df = DataFrame('medical_readings')
medical_readings_df

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>3. Set Up a Feature Store Repository and Grant Access to Different Users </b>


<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Let's first set up the FeatureStore with the repo name <span style="background-color: #eee; font-style: italic; "> LabRepoOne </span><i>. Please see the prerequisites for setting up a FeatureStore in the teradataml user guide.</i>
</p>

In [None]:
# Before creating Repo, let's check existing FeatureStores.
FeatureStore.list_repos()

In [None]:
# FeatureStore is not setup for repo LabRepoOne. Let's setup.
fs = FeatureStore('LabRepoOne')
fs.setup(perm_size='10e8')

In [None]:
# Let's verify by listing the repos.
FeatureStore.list_repos()

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Now we can see the repo we created 'LabRepoOne' </p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We can authorize other users to use the repo. Because this demo environment has only one user, 'demo_user', we list the commands that give access to other users in an actual project environment instead of running them. <br> Let's look at the ways to authorize access to the FeatureStore for user <span style="background-color: #eee; font-style: italic; "> user1 </span>. </p>
<p style = 'font-size:14px;font-family:Courier'><b>Grant Access</b><br>
  <code>user='user1'</code> </p>
 <p style = 'font-size:14px;font-family:Courier;color:#355E3B'>   # Grant read-only access to user1. user1 can see all objects in the FeatureStore but cannot modify them.</p>
<p style = 'font-size:14px;font-family:Courier'><code>fs.grant.read(user)</code></p>
  <p style = 'font-size:14px;font-family:Courier;color:#355E3B'> # Grant write-only access to user1. user1 can modify all objects in the FeatureStore but cannot see them.</p>
<p style = 'font-size:14px;font-family:Courier'><code>fs.grant.write(user)</code></p>
      <p style = 'font-size:14px;font-family:Courier;color:#355E3B'> # Grant read and write access to user1. user1 gets full access to all objects in the FeatureStore.</p>
<p style = 'font-size:14px;font-family:Courier'><code>fs.grant.read_write(user)</code></p>
    <p style = 'font-size:14px;font-family:Courier'><b> Revoke Access</b></p>
        <p style = 'font-size:14px;font-family:Courier;color:#355E3B'> # Revoke read access from user1 on FeatureStore LabRepoOne.</p>
<p style = 'font-size:14px;font-family:Courier'><code>fs.revoke.read(user)</code></p>
      <p style = 'font-size:14px;font-family:Courier;color:#355E3B'> # Revoke write access from user1 on FeatureStore LabRepoOne.</p>
<p style = 'font-size:14px;font-family:Courier'><code>fs.revoke.write(user)</code></p>
          <p style = 'font-size:14px;font-family:Courier;color:#355E3B'># Revoke read and write access from user1 on FeatureStore LabRepoOne.</p>
<p style = 'font-size:14px;font-family:Courier'><code>fs.revoke.read_write(user)</code></p>
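
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
When access has to be managed for several users, the grant and revoke calls can be wrapped in one small function. The sketch below is a hypothetical helper, not part of teradataml: <code>set_access</code> is an illustrative name, and the code assumes that <code>fs.revoke</code> offers the same <code>read</code>, <code>write</code> and <code>read_write</code> methods as <code>fs.grant</code>.
</p>

```python
def set_access(fs, user, level, revoke=False):
    """Grant (or revoke) 'read', 'write' or 'read_write' access on a FeatureStore.

    `fs` is expected to expose `grant` and `revoke` attributes, as the
    FeatureStore object in this notebook does. Hypothetical helper.
    """
    allowed = ("read", "write", "read_write")
    if level not in allowed:
        raise ValueError(f"level must be one of {allowed}, got {level!r}")
    manager = fs.revoke if revoke else fs.grant
    # Look up the method named after the access level and call it for the user.
    return getattr(manager, level)(user)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
For example, <code>set_access(fs, 'user1', 'read')</code> grants read-only access, and <code>set_access(fs, 'user1', 'read', revoke=True)</code> takes it away again.
</p>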

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>4. Create and Register objects with FeatureStore </b></p>
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>4.1 Create and Register Feature </b>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.1.1 Creating a Feature</b></p>

In [None]:
from teradataml import Feature

In [None]:
# Creating Feature for Column 'age' from Teradata DataFrame 'patient_profile_df'.
f1 = Feature(name='PatientAge', 
             column=patient_profile_df.age, 
             feature_type=FeatureType.CONTINUOUS, 
             description=None, 
             tags=["PatientProfile", "PatientDetails"])

In [None]:
# Look at underlying properties.
print(f"\033[1mName:\033[0m {f1.name}")
print(f"\033[1mColumn Name:\033[0m {f1.column_name}")
print(f"\033[1mData Type:\033[0m {f1.data_type}")
print(f"\033[1mDescription:\033[0m {f1.description}")
print(f"\033[1mTags:\033[0m {f1.tags}")
print(f"\033[1mStatus:\033[0m {f1.status}")

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.1.2 Register Feature with FeatureStore `fs` </b></p>

In [None]:
# Before registering the Feature, let's look at the available Features.
fs.list_features()

In [None]:
# FeatureStore.apply() registers any object (Feature, Entity, DataSource or FeatureGroup).
fs.apply(f1)

In [None]:
# Let's look at available Features again.
fs.list_features()

<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>4.2 Create and Register Entity </b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.2.1 Creating an Entity </b></p>

In [None]:
# Create entity for DataFrame 'patient_profile_df'
entity=Entity(name='PatientEntity', columns=patient_profile_df.patient_id)

In [None]:
# Look at Entity properties.
entity.name, entity.columns, entity.description

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.2.2 Register Entity with FeatureStore `fs` </b></p>

In [None]:
# Before even registering Entity, let's look at existing Entities.
fs.list_entities()

In [None]:
# Register the Entity.
fs.apply(entity)

In [None]:
# Look at existing Entities after registering the Entity.
fs.list_entities()

<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>4.3 Create and Register DataSource </b></p>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>DataSource can be either created from a SQL Query or from Teradata DataFrame.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>DataSource has an argument `timestamp_col_name`, which accepts the name of the column in the DataSource that indicates when the corresponding record was created. This is very helpful for building historic datasets. </li>
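
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
To see why a record timestamp helps with historic datasets, consider the plain-Python sketch below. It picks, for each patient, the latest row recorded at or before a cutoff time. This is a conceptual illustration only: the rows are made-up values, <code>as_of</code> is a hypothetical function, and nothing here claims to be how the Enterprise Feature Store computes historic datasets internally.
</p>

```python
from datetime import datetime

# Toy rows: (patient_id, record_timestamp, bmi). Values are made up for illustration.
rows = [
    (1, datetime(2024, 1, 1), 22.0),
    (1, datetime(2024, 3, 1), 23.5),
    (2, datetime(2024, 2, 1), 30.1),
]


def as_of(rows, cutoff):
    """Return {patient_id: value} using the latest row at or before `cutoff`."""
    latest = {}
    for patient_id, ts, value in rows:
        if ts > cutoff:
            continue  # recorded after the cutoff, so not visible yet
        if patient_id not in latest or ts > latest[patient_id][0]:
            latest[patient_id] = (ts, value)
    return {pid: value for pid, (_, value) in latest.items()}


snapshot = as_of(rows, datetime(2024, 2, 15))
print(snapshot)  # patient 1 still has the January value at this cutoff
```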

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.3.1 Creating a DataSource from Teradata DataFrame</b></p>

In [None]:
# Let's create DataSource from DataFrame `patient_profile_df`.
ds = DataSource(name='PatientProfileSource', source=patient_profile_df, timestamp_col_name='record_timestamp')

In [None]:
# Let's look at properties of DataSource.
ds.name, ds.source, ds.description, ds.timestamp_col_name

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.3.2 Creating a DataSource from SQL Query</b></p>

In [None]:
# Let's create DataSource from a SQL query on table `patient_profile`.
ds = DataSource(name='PatientProfileSource', source="SELECT * FROM PATIENT_PROFILE")

In [None]:
# Let's look at properties of DataSource.
ds.name, ds.source, ds.description

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.3.3 Register DataSource with FeatureStore `fs`</b></p>

In [None]:
# Before registering let's look at existing DataSources.
fs.list_data_sources()

In [None]:
# Register DataSource with repo.
fs.apply(ds)

In [None]:
# Let's look at available DataSources after registration.
fs.list_data_sources()

<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>4.4 Create and Register FeatureGroup </b></p>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>FeatureGroup can be created using Teradata DataFrame.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>FeatureGroup can be created using SQL Query. </li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>FeatureGroup can be created using objects of Feature, Entity, DataSource.  </li>


<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.4.1 Creating a FeatureGroup from Teradata DataFrame
</b></p>

In [None]:
fg = FeatureGroup.from_DataFrame(
    name='PatientProfileDF', 
    entity_columns='patient_id', 
    df=patient_profile_df, 
    timestamp_col_name='record_timestamp'
)

In [None]:
# Let's look at Properties.
fg.features, fg.entity, fg.data_source, fg.description

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.4.2 Creating a FeatureGroup from SQL Query
</b></p>

In [None]:
fg = FeatureGroup.from_query(
    name='PatientProfileQuery', 
    entity_columns='patient_id', 
    query="select * from patient_profile", 
    timestamp_col_name='record_timestamp'
)

In [None]:
# Let's look at Properties.
fg.features, fg.entity, fg.data_source, fg.description

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.4.3 Creating a FeatureGroup using objects of Feature, Entity and DataSource
</b></p>

In [None]:
fg = FeatureGroup(name='PatientProfileObjs', features=[f1], entity=entity, data_source=ds)

In [None]:
# Let's look at Properties.
fg.features, fg.entity, fg.data_source, fg.description

<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>4.4.4 Register FeatureGroup with FeatureStore `fs`
</b></p>

In [None]:
# Let's look at the existing FeatureGroups first.
fs.list_feature_groups()

In [None]:
# Let's look at Available Features also. Notice: Feature is not associated with any group.
fs.list_features()

In [None]:
# Register FeatureGroup with FeatureStore.
fs.apply(fg)

In [None]:
# Let's look at FeatureGroups after registration.
fs.list_feature_groups()

In [None]:
# Let's look at Available Features again. Notice: Feature is now associated with group.
fs.list_features()

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>5. Searching inside Teradata Enterprise Feature Store </b>

<li style = 'font-size:16px;font-family:Arial;color:#00233C'>How to search for Features: <b>FeatureStore.list_features()</b> returns Teradata DataFrame. All the filter options available on Teradata DataFrame can be used for searching. Look at example below.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>How to search for Entities: <b>FeatureStore.list_entities()</b> returns Teradata DataFrame. All the filter options available on Teradata DataFrame can be used for searching.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>How to search for DataSources: <b>FeatureStore.list_data_sources()</b> returns Teradata DataFrame. All the filter options available on Teradata DataFrame can be used for searching.  </li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>How to search for FeatureGroups: <b>FeatureStore.list_feature_groups()</b> returns Teradata DataFrame. All the filter options available on Teradata DataFrame can be used for searching.   </li>


In [None]:
# Let's first create some more Features and register with repo. Then we can use same for searching.
f1=Feature(name='PatientBMI', column=patient_profile_df.bmi)
fs.apply(f1)

In [None]:
# First list the Features.
fs.list_features()

In [None]:
# Filter the Features registered on day 15 of the month (change the day to match when you registered yours).
# Note: You can use all the filter options available on Teradata DataFrame. See the user guide for the available filter options.
features_df = fs.list_features()
features_df[features_df.creation_time.day_of_month()==15]

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>6. Modifying FeatureStore Objects </b>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Teradata EFS exposes the following APIs to get objects from the FeatureStore:</p>
<ul >
    <li style = 'font-size:16px;font-family:Arial;color:#00233C'><b>FeatureStore.get_feature()</b> to get the <b>Feature</b> object from FeatureStore.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'><b>FeatureStore.get_entity()</b> to get the <b>Entity</b> object from FeatureStore.  </li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'><b>FeatureStore.get_data_source()</b> to get the <b>DataSource</b> object from FeatureStore.   </li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'><b>FeatureStore.get_feature_group()</b> to get the <b>FeatureGroup</b> object from FeatureStore. </li></ul>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Use these APIs to get the corresponding object, modify the property you want to change, then register the object with the repository again using <b>FeatureStore.apply()</b>.</p>
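
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
The get, modify, apply cycle can be captured in a few lines. The function below is a minimal sketch with a hypothetical name (<code>update_feature_description</code>); it relies only on <code>get_feature()</code> and <code>apply()</code> as listed above, and assumes the returned object exposes a writable <code>description</code> attribute.
</p>

```python
def update_feature_description(fs, feature_name, description):
    """Fetch a Feature, change its description, and register it again.

    Hypothetical helper built on FeatureStore.get_feature() and
    FeatureStore.apply(); `fs` is the FeatureStore object.
    """
    feature = fs.get_feature(feature_name)  # 1. get the object from the repo
    feature.description = description       # 2. modify the property locally
    fs.apply(feature)                       # 3. register the changed object
    return feature
```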

<hr style="height:1px;border:none;background-color:#00233C;">
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>6.1 Update description for Feature PatientAge </b>

In [None]:
feature=fs.get_feature('PatientAge')
feature

In [None]:
# Update Description and tags.
feature.description="Patient's age for patient profile."
feature.tags = ['PatientProfile']
fs.apply(feature)

In [None]:
# Let's look at features again. Look for description column. 
fs.list_features()

<hr style="height:1px;border:none;background-color:#00233C;">
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>6.2 Update description for Entity PatientEntity</b>

In [None]:
# Before updating description, let's look at Entities.
fs.list_entities()

In [None]:
# Get Entity from FeatureStore.
entity = fs.get_entity('PatientEntity')
entity

In [None]:
# Update Entity description.
entity.description = "Entity for Patient Profile."
fs.apply(entity)

In [None]:
# After updating description, let's look at Entities.
fs.list_entities()

<hr style="height:1px;border:none;background-color:#00233C;">
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>6.3 Update timestamp_col_name for DataSource PatientProfileSource</b>

In [None]:
# Before updating time stamp column, let's look at DataSource.
fs.list_data_sources()

In [None]:
# First get the DataSource.
data_source = fs.get_data_source('PatientProfileSource')
data_source

In [None]:
# Update time stamp column.
data_source.timestamp_col_name = 'record_timestamp'
fs.apply(data_source)

In [None]:
# After updating time stamp column, let's look at DataSource.
fs.list_data_sources()

<hr style="height:1px;border:none;background-color:#00233C;">
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>6.4 Update FeatureGroup</b>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'> Note: Updating FeatureGroup will update the underlying Feature(s), DataSource, Entity.</li>

In [None]:
# Before updating description, let's look at FeatureGroup.
fs.list_feature_groups()

In [None]:
# Before updating description, let's look at DataSources.
fs.list_data_sources()

In [None]:
# Get FeatureGroup.
fg = fs.get_feature_group('PatientProfileObjs')

# Update DataSource description and FeatureGroup description.
fg.data_source.description = "Data Source for Patient Profile."
fg.description = "FeatureGroup for Patient Profile."

# Register FeatureGroup with FeatureStore.
fs.apply(fg)

In [None]:
# After updating description, let's look at FeatureGroup.
fs.list_feature_groups()

In [None]:
# After updating description, let's look at DataSource.
fs.list_data_sources()

<hr style="height:1px;border:none;background-color:#00233C;">
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>6.5 How to add a new Feature to, or change the Entity or DataSource of, an existing FeatureGroup</b>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'> You can always modify a FeatureGroup with the `FeatureGroup.apply()` method.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'> Note: `FeatureGroup.apply()` does not write the change to the `repo`. Call `FeatureStore.apply()` afterwards to update the repo.</li>
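
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
Because the change is made in two steps, it is easy to forget the second one. The sketch below bundles both calls; <code>add_to_group</code> is a hypothetical helper name, and it uses only `FeatureGroup.apply()` and `FeatureStore.apply()` as described in the notes above.
</p>

```python
def add_to_group(fs, feature_group, obj):
    """Attach `obj` to a FeatureGroup, then persist the group in the repo.

    Hypothetical helper; `obj` is a Feature, Entity or DataSource.
    """
    feature_group.apply(obj)  # step 1: changes only the local FeatureGroup object
    fs.apply(feature_group)   # step 2: writes the updated FeatureGroup to the repo
    return feature_group
```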


In [None]:
# Before adding Feature, let's look at available Features.
fs.list_features()

In [None]:
# Let's add a new Feature for FeatureGroup PatientProfileObjs
f2 = fs.get_feature('PatientBMI')
# First register the Feature with FeatureGroup.
fg.apply(f2)

In [None]:
# Then, Register FeatureGroup with FeatureStore.
fs.apply(fg)

In [None]:
# Let's look at Features.
fs.list_features()

<hr style="height:1px;border:none;background-color:#00233C;">
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>6.6 How to remove a Feature from an existing FeatureGroup</b>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'> You can use the `FeatureGroup.remove()` method to remove an object from a FeatureGroup.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'> Note: `FeatureGroup.remove()` does not write the change to the `repo`. Call `FeatureStore.apply()` afterwards to update the repo.</li>


In [None]:
# Let's remove Feature `PatientBMI` from FeatureGroup `PatientProfileObjs`.
fg.remove(f2)

In [None]:
# Update FeatureGroup with FeatureStore.
fs.apply(fg)

In [None]:
# Let's look at Features.
fs.list_features()

In [None]:
# Let's look at FeatureGroups.
fs.list_feature_groups()

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>7. Combining multiple FeatureGroups </b>

<li style = 'font-size:16px;font-family:Arial;color:#00233C'>
You can combine multiple FeatureGroups into a single group using the `+` operator. The result is a new FeatureGroup whose name is the combined name of all the FeatureGroups. For example, if you combine FeatureGroups `group1`, `group2`, `group3`, the new name is `group1_group2_group3`. You can change the name if you want to.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>The new FeatureGroup has the Features from all the individual FeatureGroups.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>When you combine multiple FeatureGroups, the `Entity` and the timestamp column (`timestamp_col_name`) must be the same for all individual FeatureGroups.</li>
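
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
The rules above can be mimicked with a small toy class. <code>ToyGroup</code> below is a hypothetical stand-in, not the teradataml <code>FeatureGroup</code>, and the feature names are illustrative; the sketch only shows the naming convention and the requirement that the Entity and timestamp column match.
</p>

```python
class ToyGroup:
    """Toy stand-in for a FeatureGroup; not the teradataml class."""

    def __init__(self, name, features, entity, timestamp_col_name):
        self.name = name
        self.features = list(features)
        self.entity = entity
        self.timestamp_col_name = timestamp_col_name

    def __add__(self, other):
        # Both groups must share the same Entity and timestamp column.
        same_entity = self.entity == other.entity
        same_timestamp = self.timestamp_col_name == other.timestamp_col_name
        if not (same_entity and same_timestamp):
            raise ValueError("Entity and timestamp column must match")
        # The new group is named after its parts and carries all their features.
        return ToyGroup(
            f"{self.name}_{other.name}",
            self.features + other.features,
            self.entity,
            self.timestamp_col_name,
        )


profile = ToyGroup("PatientProfile", ["age", "bmi"], "patient_id", "record_timestamp")
readings = ToyGroup("MedicalReadings", ["blood_pressure"], "patient_id", "record_timestamp")
combined = profile + readings
print(combined.name)      # PatientProfile_MedicalReadings
print(combined.features)  # ['age', 'bmi', 'blood_pressure']
```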

In [None]:
# Let's create the individual FeatureGroups first.
patient_profile_fg = FeatureGroup.from_DataFrame(
    name='PatientProfile', 
    df=patient_profile_df, 
    entity_columns='patient_id', 
    timestamp_col_name='record_timestamp'
)
medical_readings_fg = FeatureGroup.from_DataFrame(
    name='MedicalReadings', 
    df=medical_readings_df, 
    entity_columns='patient_id', 
    timestamp_col_name='record_timestamp'
)

In [None]:
# Look at Features first for FeatureGroups.
patient_profile_fg.features

In [None]:
medical_readings_fg.features

In [None]:
# Create new FeatureGroup.
combined_fg = patient_profile_fg + medical_readings_fg

In [None]:
# Look at new FeatureGroup name.
combined_fg.name

In [None]:
# Look at combined features.
combined_fg.features

In [None]:
# Push the individual FeatureGroup `PatientProfile`.
fs.apply(patient_profile_fg)

In [None]:
# Push the individual FeatureGroup `MedicalReadings`.
fs.apply(medical_readings_fg)

In [None]:
# Push the combined FeatureGroup.
fs.apply(combined_fg)

In [None]:
# Let's look at FeatureGroups.
fs.list_feature_groups()

In [None]:
# Let's look at DataSources.
fs.list_data_sources()

In [None]:
# Let's look at Entities.
fs.list_entities()

In [None]:
# Let's look at Features after pushing all FeatureGroups.
fs.list_features()

In [None]:
# Filter the features to understand the data more. Note that, Feature `blood_pressure` is mapped to two FeatureGroups.
features_df = fs.list_features()
features_df = features_df[features_df.name == 'blood_pressure']
features_df

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>8. Archive and Delete objects in FeatureStore </b>

<li style = 'font-size:16px;font-family:Arial;color:#00233C'>Archive and Delete are two different operations in FeatureStore.</li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>Archive stages objects instead of removing them completely from FeatureStore. Archived objects will not be part of any further processing.
<ul style = 'font-size:16px;font-family:Arial;color:#00233C'>
<li>Use `FeatureStore.archive_feature()` to archive a Feature. `FeatureStore.list_features(archived=True)` lists archived Features.</li>
<li>Use `FeatureStore.archive_entity()` to archive an Entity. `FeatureStore.list_entities(archived=True)` lists archived Entities.</li>
<li>Use `FeatureStore.archive_data_source()` to archive a DataSource. `FeatureStore.list_data_sources(archived=True)` lists archived DataSources.</li>
<li>Use `FeatureStore.archive_feature_group()` to archive a FeatureGroup. `FeatureStore.list_feature_groups(archived=True)` lists archived FeatureGroups.</li>
    </ul></li>
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>Archiving a FeatureGroup will archive the corresponding Feature, Entity and DataSource.</li>  
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>If a Feature is associated with a FeatureGroup, it cannot be archived. First remove the Feature from the FeatureGroup, then archive it.</li> 
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>If an Entity is associated with a FeatureGroup, it cannot be archived. First remove the Entity from the FeatureGroup, then archive it.</li> 
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>If a DataSource is associated with a FeatureGroup, it cannot be archived. First remove the DataSource from the FeatureGroup, then archive it.</li> 
<li style = 'font-size:16px;font-family:Arial;color:#00233C'>Delete will remove the archived objects.
<ul style = 'font-size:16px;font-family:Arial;color:#00233C'>
    <li>Use `FeatureStore.delete_feature()` to delete a Feature.</li>
    <li>Use `FeatureStore.delete_entity()` to delete an Entity.</li>
    <li>Use `FeatureStore.delete_data_source()` to delete a DataSource.</li>
    <li>Use `FeatureStore.delete_feature_group()` to delete a FeatureGroup.</li></ul></li>
    <li style = 'font-size:16px;font-family:Arial;color:#00233C'>Deleting a FeatureGroup <u><b> will not </b></u> remove the corresponding archived Features, archived Entities or archived DataSources. You should delete these with the corresponding APIs.</li>
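
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
Since Delete removes archived objects, removing a Feature is a two-step sequence: archive it, then delete it. The sketch below wraps that sequence in a hypothetical helper named <code>archive_and_delete_feature</code>; it uses only `FeatureStore.archive_feature()` and `FeatureStore.delete_feature()` and assumes the Feature is no longer associated with any FeatureGroup.
</p>

```python
def archive_and_delete_feature(fs, feature_name):
    """Archive a Feature and then delete it, in the order described above.

    Hypothetical helper; `fs` is the FeatureStore object. The Feature must
    already be detached from every FeatureGroup.
    """
    fs.archive_feature(feature_name)  # step 1: stage the Feature as archived
    fs.delete_feature(feature_name)   # step 2: remove the archived Feature
```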


<hr style="height:1px;border:none;background-color:#00233C;">
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>8.1 Archive a Feature. Delete the archived Feature.</b>


In [None]:
# Let's first look at Features which are not associated with any FeatureGroup.
features_df = fs.list_features()
features_df[features_df.group_name == None]

In [None]:
# Let's archive the Feature `PatientBMI`. Before archiving it, let's look at the Features that are already archived.
fs.list_features(archived=True)

In [None]:
# Archive it.
fs.archive_feature('PatientBMI')

In [None]:
# Let's look at archived Features.
fs.list_features(archived=True)

In [None]:
# Delete the archived Feature.
fs.delete_feature('PatientBMI')

In [None]:
# Let's look at archived Features again.
fs.list_features(archived=True)

<hr style="height:1px;border:none;background-color:#00233C;"> 
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>8.2 Archive a FeatureGroup. Delete the archived objects.</b>

In [None]:
# Before archiving group, let's look at Features for FeatureGroup. 
features_df[features_df.group_name == 'PatientProfile']

In [None]:
# Note: These Features are mapped to other FeatureGroups also.
features_df[(
    (features_df.name == 'skin_thickness') | (features_df.name == 'pregnancies') | (features_df.name == 'age') | (features_df.name == 'bmi')
)]

In [None]:
# Before archiving group, let's look at DataSources.
fs.list_data_sources()

In [None]:
# Let's look at archived DataSources.
fs.list_data_sources(archived=True)

In [None]:
# Before archiving group, let's look at Entities.
fs.list_entities()

In [None]:
# Let's look at archived Entities.
fs.list_entities(archived=True)

In [None]:
# Before archiving group, let's look at FeatureGroups.
# Notice, FeatureGroup `PatientProfile` is associated with DataSource `PatientProfile` and Entity `PatientProfile`.
fs.list_feature_groups()

In [None]:
# Let's look at archived FeatureGroups.
fs.list_feature_groups(archived=True)

In [None]:
# Let's archive FeatureGroup `PatientProfile`.
fs.archive_feature_group('PatientProfile')

In [None]:
# Let's look at FeatureGroups after archive.
fs.list_feature_groups()

In [None]:
# Look at archived FeatureGroup.
fs.list_feature_groups(archived=True)

In [None]:
# After archiving group, let's look at Features for FeatureGroup.
features_df[features_df.group_name == 'PatientProfile']

In [None]:
# Look at Features and observe group_name.
features_df[(
    (features_df.name == 'skin_thickness') | (features_df.name == 'pregnancies') | (features_df.name == 'age') | (features_df.name == 'bmi')
)]

In [None]:
# Let's look at archived Features. No Feature is archived because these Features are mapped to other FeatureGroup also.
fs.list_features(archived=True)

In [None]:
# After archiving group, let's look at DataSources.
# Notice, DataSource `PatientProfile`, which is associated with FeatureGroup `PatientProfile`, is also archived.
fs.list_data_sources()

In [None]:
# Look at archived DataSources.
fs.list_data_sources(archived=True)

In [None]:
# After archiving group, let's look at Entities.
# Notice, Entity `PatientProfile`, which is associated with FeatureGroup `PatientProfile`, is also archived.
fs.list_entities()

In [None]:
# Look at archived Entities.
fs.list_entities(archived=True)

In [None]:
# Delete archived FeatureGroup. 
fs.delete_feature_group('PatientProfile')

In [None]:
# Let's look at archived FeatureGroups after delete.
fs.list_feature_groups(archived=True)

In [None]:
# Delete archived DataSources.
fs.delete_data_source('PatientProfile')

In [None]:
# Let's look at archived DataSources after delete.
fs.list_data_sources(archived=True)

In [None]:
# Delete archived Entities.
fs.delete_entity('PatientProfile')

In [None]:
# Let's look at archived Entities after delete.
fs.list_entities(archived=True)

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>8.3 Creating Datasets and historic Datasets for ML models </b>


<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Because FeatureStore also stores the DataSource, you can retrieve a Teradata DataFrame from FeatureStore. <br> `FeatureStore.get_dataset()` gets a Teradata DataFrame from a FeatureGroup.</p>

In [None]:
# Let's look at available FeatureGroups first.
fs.list_feature_groups()

In [None]:
# Get DataSet for FeatureGroup MedicalReadings.
fs.get_dataset('MedicalReadings')

In [None]:
# Let's get the DataSet for the combined FeatureGroup.
# Interesting points to observe:
#     The DataSet has the combined Features of both FeatureGroups.
#     patient_id and record_timestamp remain as they are.
fs.get_dataset('PatientProfile_MedicalReadings')

<p style = 'font-size:16px;font-family:Arial;color:#00233C'> In some cases, you need a historic DataSet to train an ML model. In such cases, use `FeatureStore.get_dataset()` and filter the resulting DataFrame. Let's look at an example.</p>


In [None]:
df = fs.get_dataset('PatientProfile_MedicalReadings')

In [None]:
# Assume you want to feed only week 15 data to your model. 
week15_df = df[df.record_timestamp.week()==15]
week15_df

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>9. Use Enterprise Feature Store with teradataml analytic functions </b>


<p style = 'font-size:16px;font-family:Arial;color:#00233C'>teradataml analytic functions accept Features as input.
<br> Let's look at Diabetes prediction using teradataml analytic function `XGBoost()`.</p>

In [None]:
# First get the Dataset.
medical_readings_df = fs.get_dataset('MedicalReadings')
medical_readings_df

In [None]:
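# Split the DataSet into two samples: 70% for training and 30% for testing.
# sampleid 1 corresponds to the first fraction (0.7), sampleid 2 to the second (0.3).
sampled_df = medical_readings_df.sample(frac=[0.7, 0.3])
train_df = sampled_df[sampled_df.sampleid==1]
test_df = sampled_df[sampled_df.sampleid==2]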
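<p style = 'font-size:16px;font-family:Arial;color:#00233C'>For readers new to fraction-based sampling, here is a minimal local sketch in plain Python (not teradataml) of a 70/30 split in which each row is tagged with a sample id. The helper `split_by_fraction` is hypothetical, and the sketch assumes sample ids are numbered from 1 in the order the fractions are listed.</p>

```python
import random


def split_by_fraction(rows, fractions, seed=0):
    """Tag each row with a 1-based sample id, sized by the given fractions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    tagged, start = [], 0
    for sample_id, frac in enumerate(fractions, start=1):
        # Rows left over after rounding are not assigned to any sample.
        end = start + round(frac * len(shuffled))
        tagged.extend((row, sample_id) for row in shuffled[start:end])
        start = end
    return tagged


tagged = split_by_fraction(range(10), [0.7, 0.3])
train = [row for row, sample_id in tagged if sample_id == 1]  # first fraction
test = [row for row, sample_id in tagged if sample_id == 2]   # second fraction
print(len(train), len(test))
```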

In [None]:
# Get the FeatureGroup. Note that the Feature `outcome` must be set as the label.
medical_readings_fg = fs.get_feature_group('MedicalReadings')
medical_readings_fg.labels = 'outcome'

In [None]:
from teradataml import XGBoost
model = XGBoost(data=train_df,
                input_columns=medical_readings_fg.features,
                response_column=medical_readings_fg.labels,
                max_depth=3,
                lambda1=1000.0,
                model_type='Classification',
                seed=-1,
                shrinkage_factor=0.1,
                iter_num=2)

In [None]:
# Score the model using test data.
XGBoostPredict_out_1 = model.predict(newdata=test_df,
                                     id_column='patient_id',
                                     model_type='Classification'
                                    )

In [None]:
XGBoostPredict_out_1.result

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>10. Repository Governance </b>


<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
So far, we have been working in a repo called "LabRepoOne". With Teradata EFS Functions, you can manage your Feature Store repos and "promote" them to serve as production repos.
</p> 

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
<b>Note: </b> The Feature Store Functions do not materialize your feature data in production; promotion applies only to the metadata of the feature repo. Make sure that your ETL processes run against the production data sources.
</p> 
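<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The idea of promoting metadata only can be sketched with plain Python dictionaries. This is an illustrative toy, not how teradataml stores a repo: the `GroupMetadata` record, the `promote` helper and the example field values are hypothetical. The point is that the promoted record carries names and a reference to the source table, never the rows themselves.</p>

```python
import copy
from dataclasses import dataclass


@dataclass
class GroupMetadata:
    """Hypothetical metadata record: names and a source reference, no rows."""
    name: str
    features: list
    entity_column: str
    data_source: str  # a table or query reference, not the data itself


def promote(group_name, source_repo, target_repo):
    """Copy one group's metadata record from source_repo into target_repo."""
    target_repo[group_name] = copy.deepcopy(source_repo[group_name])


# Illustrative values only; the real repo content comes from the earlier sections.
lab_repo = {
    "MedicalReadings": GroupMetadata(
        name="MedicalReadings",
        features=["bmi", "outcome"],
        entity_column="patient_id",
        data_source="medical_readings",
    )
}
prod_repo = {}

promote("MedicalReadings", lab_repo, prod_repo)
print(prod_repo["MedicalReadings"].data_source)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>After the copy, the production record still points at a table named `medical_readings`, so that table has to exist and be populated in production by your own ETL.</p>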

In [None]:
# First, create a new repo 'ProdLabRepoOne' to promote Features into.
ProdLabRepoOne = FeatureStore("ProdLabRepoOne")
# Set up the prod repo if it is not already set up.
ProdLabRepoOne.setup()

In [None]:
# Assume you want to promote FeatureGroup 'MedicalReadings' from 'LabRepoOne' to 'ProdLabRepoOne'.
# First get the FeatureGroup from LabRepoOne. Then apply it to 'ProdLabRepoOne'.
ProdLabRepoOne.apply(fs.get_feature_group('MedicalReadings'))

In [None]:
# Let's verify ProdLabRepoOne FeatureGroups.
ProdLabRepoOne.list_feature_groups()

In [None]:
# Let's verify ProdLabRepoOne Features.
ProdLabRepoOne.list_features()

In [None]:
# Let's verify ProdLabRepoOne DataSources.
ProdLabRepoOne.list_data_sources()

In [None]:
# Let's verify ProdLabRepoOne Entities.
ProdLabRepoOne.list_entities()

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>11. Cleanup </b></p>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Work Tables</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We need to clean up our work tables to prevent errors next time.</p>

In [None]:
tables = ['patient_profile','medical_readings']

# Loop through the list of tables and drop each one.
for table in tables:
    try:
        db_drop_table(table_name=table)
    except Exception:
        # Ignore tables that do not exist or were already dropped.
        pass

In [None]:
remove_context()

<footer style="padding-bottom:35px; background:#f9f9f9; border-bottom:3px solid #00233C">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2024. All Rights Reserved
        </div>
    </div>
</footer>