<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Enterprise Feature Store - DataDomain
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:18px;font-family:Arial;'><b>Multi-Domain Feature Store Demo</b></p>
<p style = 'font-size:16px;font-family:Arial;'>This notebook demonstrates how to build and manage a feature store across multiple business domains (such as sales and marketing) using teradataml. Key steps include:</p>
<ul style = 'font-size:14px;font-family:Arial;'>
  <li>Loading, transforming, and aggregating sales and marketing data to engineer features relevant to each domain.</li>
  <li>Creating a centralized feature store repository to manage features, entities, and processes for different data domains.</li>
  <li>Ingesting features into separate data domains for robust governance, traceability, and reusability.</li>
  <li>Building datasets and exploring the feature landscape for scalable, collaborative machine learning and analytics.</li>
</ul>
<p style = 'font-size:16px;font-family:Arial;'>The workflow provides a practical example of operationalizing feature engineering and feature management in a modern enterprise environment with multiple subject areas.</p>

<p style = 'font-size:18px;font-family:Arial;'><b>Disclaimer</b></p>

<p style = 'font-size:12px;font-family:Arial;'>
The sample code (“Sample Code”) provided is not covered by any Teradata agreements. Please be aware that Teradata has no control over the model responses to such sample code and such response may vary. The use of the model by Teradata is strictly for demonstration purposes and does not constitute any form of certification or endorsement. The sample code is provided “AS IS” and any express or implied warranties, including the implied warranties of merchantability and fitness for a particular purpose, are disclaimed. In no event shall Teradata be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) sustained by you or a third party, however caused and on any theory of liability, whether in contract, strict liability, or tort arising in any way out of the use of this sample code, even if advised of the possibility of such damage.</p>

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>1. Connect to Vantage, import Python packages, and explore the dataset</b></p>


In [None]:
!pip install teradataml==20.0.0.7 --quiet

In [None]:
!pip install pandas>=2.8.4 --quiet

<div class="alert alert-block alert-info">
<p style = 'font-size:16px;font-family:Arial;'><b>Note: </b><i>Please run the pip install commands above to install the required library versions, then restart the kernel so the newly installed libraries are loaded. The simplest way to restart the kernel is to type zero zero: <b> 0 0</b></i></p>
</div>

In [None]:
import os
from teradataml import *
from getpass import getpass
from collections import OrderedDict
from teradatasqlalchemy import VARCHAR, INTEGER

display.max_rows = 5

<hr style="height:2px;border:none;">
<b style = 'font-size:18px;font-family:Arial;'> 1.1 Connect to Vantage</b>
<p style = 'font-size:16px;font-family:Arial;'>We will be prompted to provide the password. We will enter the password, press the Enter key, and then use the down arrow to go to the next cell.</p>

In [None]:
%run -i ../../UseCases/startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)

In [None]:
%%capture
execute_sql('''SET query_band='DEMO=EFS-DataDomain.ipynb;' UPDATE FOR SESSION; ''')

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>2. Setup a Feature Store Repository</b></p>
<p style = 'font-size:18px;font-family:Arial;'><b>2.1 Create the FeatureStore</b></p>

In [None]:
fs = FeatureStore(repo="enterprise_marketing_sales")

<p style = 'font-size:18px;font-family:Arial;'><b>2.2 Setup the FeatureStore</b></p>

In [None]:
fs.setup()

<p style = 'font-size:18px;font-family:Arial;'><b>2.3 Checking Availability</b></p>

In [None]:
fs = FeatureStore(repo="enterprise_marketing_sales")

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>3. Get Data for the Demo</b></p>
<p style = 'font-size:18px;font-family:Arial;'><b>3.1 Load the sales_data</b></p>

In [None]:
sales2_dt = OrderedDict(CustomerID=VARCHAR(10), Sales_Q1=INTEGER, Sales_Q2=INTEGER, Region=VARCHAR(20), Loayalty_Score=INTEGER, Channel=VARCHAR(20))
sales = read_csv(filepath=r"data/sales_data.csv", 
                 table_name="sales2_data", 
                 types=sales2_dt)
sales.head(3)

<p style = 'font-size:18px;font-family:Arial;'><b>3.2 Perform Data Transformation</b></p>
<p style = 'font-size:16px;font-family:Arial;'><b>Transformation Details:</b>
In this step, we aggregate the sales data by the 'Region' column. For each region, we calculate:</p>
<ul>
  <li>The mean of 'Loayalty_Score', to understand average customer loyalty per region.</li>
  <li>The count of 'Channel' values, that is, the number of sales records with a recorded channel in each region.</li>
</ul>
<p style = 'font-size:16px;font-family:Arial;'>These aggregated features provide insights into regional sales performance and customer engagement.</p>
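<p style = 'font-size:16px;font-family:Arial;'>As a rough illustration of what this aggregation computes, the sketch below reproduces it in plain Python on a few made-up rows. The sample values are hypothetical (the real data comes from <code>data/sales_data.csv</code>), and the output column names <code>mean_Loayalty_Score</code> and <code>count_Channel</code> follow the names used later in this notebook. It is not how teradataml evaluates the aggregation, which runs in the database.</p>

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real values come from data/sales_data.csv.
rows = [
    {"Region": "North", "Loayalty_Score": 4, "Channel": "Online"},
    {"Region": "North", "Loayalty_Score": 2, "Channel": "Retail"},
    {"Region": "South", "Loayalty_Score": 5, "Channel": "Online"},
]

# Group the rows by Region.
groups = defaultdict(list)
for row in rows:
    groups[row["Region"]].append(row)

# One output row per region: mean of Loayalty_Score and
# count of non-null Channel values.
sales_agg_sketch = {
    region: {
        "mean_Loayalty_Score": mean(r["Loayalty_Score"] for r in members),
        "count_Channel": sum(1 for r in members if r["Channel"] is not None),
    }
    for region, members in groups.items()
}

print(sales_agg_sketch)
```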

In [None]:
sales_agg = sales.groupby("Region").agg({"Loayalty_Score": "mean", "Channel": "count"})
sales_agg

<p style = 'font-size:18px;font-family:Arial;'><b>3.3 Load the marketing_data</b></p>

In [None]:
marketing2_dt = OrderedDict(AccountID=VARCHAR(10), Campaign_1=INTEGER, Campaign_2=INTEGER, Region=VARCHAR(20),
                            Loayalty_Score=INTEGER, Engagement_Channel=VARCHAR(20))
marketing = read_csv(filepath=r"data/marketing_data.csv", 
                     table_name="marketing2_data", 
                     types=marketing2_dt)
marketing.head(3)

<p style = 'font-size:18px;font-family:Arial;'><b>3.4 Perform Data Transformation</b></p>
<p style = 'font-size:16px;font-family:Arial;'><b>Transformation Details:</b>
In this step, we aggregate the marketing data by the 'Region' column. For each region, we calculate:</p>
<ul>
  <li>The mean of 'Loayalty_Score', to understand average customer loyalty per region.</li>
  <li>The count of 'Engagement_Channel' values, that is, the number of marketing records with a recorded engagement channel in each region.</li>
</ul>
<p style = 'font-size:16px;font-family:Arial;'>These aggregated features provide insights into regional marketing activity and customer engagement.</p>

In [None]:
marketing_agg = marketing.groupby("Region").agg({"Loayalty_Score": "mean", "Engagement_Channel": "count"})
marketing_agg

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>4. Store the data transformations</b></p>
<p style = 'font-size:16px;font-family:Arial;'>Here we store each transformation as a view. Even if the underlying data changes, the transformation steps remain the same.</p>
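<p style = 'font-size:16px;font-family:Arial;'>The behaviour of a stored view can be illustrated with a small, self-contained sketch. It uses Python's built-in <code>sqlite3</code> module purely as a stand-in for Vantage; the table <code>sales</code>, the view <code>sales_view</code>, and the sample rows are hypothetical. The point is only that a view keeps the aggregation logic, so querying it again after the base table changes returns updated results.</p>

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for Vantage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (Region TEXT, Loayalty_Score INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 4), ("North", 2), ("South", 5)],
)

# The view stores the aggregation logic, not a snapshot of its result.
conn.execute(
    "CREATE VIEW sales_view AS "
    "SELECT Region, AVG(Loayalty_Score) AS mean_Loayalty_Score "
    "FROM sales GROUP BY Region"
)

query = "SELECT Region, mean_Loayalty_Score FROM sales_view ORDER BY Region"
before = conn.execute(query).fetchall()

# New data arrives in the base table; the view definition is untouched.
conn.execute("INSERT INTO sales VALUES ('South', 1)")
after = conn.execute(query).fetchall()

print(before)
print(after)
conn.close()
```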

In [None]:
sales_df = sales_agg.create_view('sales_data_view')
marketing_df = marketing_agg.create_view('marketing_data_view')

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>5. Create FeatureStore objects for the sales and marketing domains</b></p>
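<p style = 'font-size:16px;font-family:Arial;'>Both domains live in the same repository, and this notebook later ingests a feature named <code>mean_Loayalty_Score</code> into each of them. One way to picture why the two do not collide is a registry keyed by the pair (data domain, feature name). The toy registry below is a conceptual sketch only; <code>registry</code>, <code>register</code>, and <code>features_in</code> are hypothetical names and do not reflect how teradataml stores its metadata.</p>

```python
# Toy registry keyed by (data_domain, feature_name); purely illustrative.
registry = {}


def register(data_domain, feature_name, source_view):
    """Record a feature and its source view under a data domain."""
    registry[(data_domain, feature_name)] = source_view


register("sales", "mean_Loayalty_Score", "sales_data_view")
register("sales", "count_Channel", "sales_data_view")
register("marketing", "mean_Loayalty_Score", "marketing_data_view")
register("marketing", "count_Engagement_Channel", "marketing_data_view")


def features_in(data_domain):
    """Return the sorted feature names registered in one data domain."""
    return sorted(name for domain, name in registry if domain == data_domain)


# The same feature name exists once per domain, each with its own source.
print(features_in("sales"))
print(features_in("marketing"))
```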

In [None]:
fs_sales = FeatureStore("enterprise_marketing_sales", data_domain='sales')
fs_marketing = FeatureStore("enterprise_marketing_sales", data_domain='marketing')

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>6. Perform operations in the sales domain</b></p>
<p style = 'font-size:18px;font-family:Arial;'><b>6.1 Ingest features from sales data</b></p>
<p style = 'font-size:16px;font-family:Arial;'><b>Note:</b> Feature ingestion can also be performed using <code>FeatureStore.get_feature_process()</code>.</p>

In [None]:
fp_sales = FeatureProcess(repo='enterprise_marketing_sales',
                          data_domain='sales',
                          entity='Region',
                          object=sales_df,
                          features=['mean_Loayalty_Score', 'count_Channel'],
                          description='Ingesting Features in sales DD')
fp_sales.run()

<p style = 'font-size:18px;font-family:Arial;'><b>6.2 Build dataset in sales domain</b></p>

In [None]:
dc_sales = fs_sales.get_dataset_catalog()

dc_sales.build_dataset(entity='Region',
                       selected_features={'mean_Loayalty_Score': fp_sales.process_id,
                                          'count_Channel': fp_sales.process_id},
                       view_name="sales_dc_view",
                       description="Building dataset for sales")


<p style = 'font-size:18px;font-family:Arial;'><b>6.3 See the mind_map for FeatureStore in sales domain</b></p>
<p style = 'font-size:16px;font-family:Arial;'>We ingested two features, <code>mean_Loayalty_Score</code> and <code>count_Channel</code>, from a single feature process. The mind map shows how related features and datasets are managed and tracked together within the feature store, with their lineage maintained back to the originating process.</p>
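<p style = 'font-size:16px;font-family:Arial;'>The lineage idea can be sketched with two small mappings: one from each feature to the process that produced it, and one from each dataset to the features it was built from. The process id <code>process-001</code> and the helper <code>trace</code> are hypothetical; real process ids are generated when the feature process runs, and the actual lineage is what <code>mind_map()</code> renders.</p>

```python
# Hypothetical process id; real ids are generated when the process runs.
process_id = "process-001"

# Which process produced each feature in the sales domain.
feature_to_process = {
    "mean_Loayalty_Score": process_id,
    "count_Channel": process_id,
}

# Which features each dataset was built from.
dataset_to_features = {
    "sales_dc_view": ["mean_Loayalty_Score", "count_Channel"],
}


def trace(dataset):
    """Return (dataset, feature, process) edges for one dataset."""
    return [
        (dataset, feature, feature_to_process[feature])
        for feature in dataset_to_features[dataset]
    ]


# Walk from the dataset back to the originating process.
for edge in trace("sales_dc_view"):
    print(" <- ".join(edge))
```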

In [None]:
fs_sales.mind_map()

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>7. Perform operations in the marketing domain</b></p>
<p style = 'font-size:18px;font-family:Arial;'><b>7.1 Ingest features from marketing data</b></p>

In [None]:
fp_mar = fs_marketing.get_feature_process(entity='Region',
                                          features=['mean_Loayalty_Score', 'count_Engagement_Channel'],
                                          object=marketing_df,
                                          description='Ingesting Features in marketing DD')
fp_mar.run()

<p style = 'font-size:18px;font-family:Arial;'><b>7.2 Build dataset in marketing domain</b></p>

In [None]:
dc_mar = fs_marketing.get_dataset_catalog()

dc_mar.build_dataset(entity='Region',
                     selected_features={'mean_Loayalty_Score': fp_mar.process_id,
                                        'count_Engagement_Channel': fp_mar.process_id},
                     view_name='marketing_dc_view',
                     description='Building dataset for marketing')

<p style = 'font-size:18px;font-family:Arial;'><b>7.3 See the mind_map for FeatureStore in marketing domain</b></p>
<p style = 'font-size:16px;font-family:Arial;'>We ingested two features, <code>mean_Loayalty_Score</code> and <code>count_Engagement_Channel</code>, from a single feature process. The mind map shows how related features and datasets are managed and tracked together within the feature store, with their lineage maintained back to the originating process.</p>

In [None]:
fs_marketing.mind_map()

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>8. Explore DataDomain</b></p>
<p style = 'font-size:18px;font-family:Arial;'><b>8.1 Explore <code>sales</code> datadomain</b></p>
<p style = 'font-size:16px;font-family:Arial;'>Create a DataDomain object for the sales domain.</p>

In [None]:
sales_domain = DataDomain(repo='enterprise_marketing_sales',
                          data_domain='sales')
sales_domain

<p style = 'font-size:18px;font-family:Arial;'><b>Explore properties</b></p>
<p style = 'font-size:16px;font-family:Arial;'><b>features:</b> The <code>features</code> property of the DataDomain object lists all features currently available in the data domain.</p>

In [None]:
sales_domain.features

<p style = 'font-size:16px;font-family:Arial;'><b>entities:</b> The <code>entities</code> property of the DataDomain object lists all entities currently available in the data domain.</p>

In [None]:
sales_domain.entities

<p style = 'font-size:16px;font-family:Arial;'><b>processes:</b> The <code>processes</code> property of the DataDomain object lists all processes currently available in the data domain.</p>

In [None]:
sales_domain.processes

<p style = 'font-size:16px;font-family:Arial;'><b>datasets:</b> The <code>datasets</code> property of the DataDomain object lists all datasets currently available in the data domain.</p>

In [None]:
sales_domain.datasets

<p style = 'font-size:18px;font-family:Arial;'><b>8.2 Explore <code>marketing</code> datadomain</b></p>
<p style = 'font-size:16px;font-family:Arial;'>Create a DataDomain object for the marketing domain.</p>

In [None]:
marketing_domain = DataDomain(repo='enterprise_marketing_sales',
                              data_domain='marketing')
marketing_domain

<p style = 'font-size:18px;font-family:Arial;'><b>Explore properties</b></p>
<p style = 'font-size:16px;font-family:Arial;'><b>features:</b> The <code>features</code> property of the DataDomain object lists all features currently available in the data domain.</p>

In [None]:
marketing_domain.features

<p style = 'font-size:16px;font-family:Arial;'><b>entities:</b> The <code>entities</code> property of the DataDomain object lists all entities currently available in the data domain.</p>

In [None]:
marketing_domain.entities

<p style = 'font-size:16px;font-family:Arial;'><b>processes:</b> The <code>processes</code> property of the DataDomain object lists all processes currently available in the data domain.</p>

In [None]:
marketing_domain.processes

<p style = 'font-size:16px;font-family:Arial;'><b>datasets:</b> The <code>datasets</code> property of the DataDomain object lists all datasets currently available in the data domain.</p>

In [None]:
marketing_domain.datasets

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial;'><b>9. Cleanup</b></p>
<p style = 'font-size:18px;font-family:Arial;'> <b>Work Tables and Views </b></p>

In [None]:
db_drop_view('sales_data_view')

In [None]:
db_drop_view('marketing_data_view')

In [None]:
db_drop_table('sales2_data')

In [None]:
db_drop_table('marketing2_data')

<p style = 'font-size:18px;font-family:Arial;'><b>9.1 Delete the Feature Store</b></p>
<p style = 'font-size:16px;font-family:Arial;'><b>Note:</b> This will drop the database if all objects are removed.</p>

In [None]:
fs = FeatureStore(repo="enterprise_marketing_sales")

In [None]:
fs.delete()

In [None]:
remove_context()

<footer style="padding-bottom:35px; background:#f9f9f9; border-bottom:3px solid ">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2025. All Rights Reserved
        </div>
    </div>
</footer>