# Building Realtime Features with Tecton

<a href="https://colab.research.google.com/github/tecton-ai/demo-notebooks/blob/main/Tutorial_Building_Realtime_Features_with_Tecton.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



Many of the most powerful ML features can only be calculated at the exact moment
they're needed. Imagine an e-commerce fraud detection system: when a customer
places an order, you might want to check whether their shipping address matches
their usual location, or whether the purchase amount is unusually high compared
to their typical spending.

These "realtime features" need to be computed on the fly during model inference,
for one or more of these reasons:

- The data is only available at request time (like the current purchase amount)
- The computation involves comparing request data against historical patterns
- Pre-computing all possible combinations would be impractical or impossible


In this tutorial, we'll build realtime features for a fraud detection system
that can:

1. Check if a transaction amount is unusually high
2. Compare the transaction against the user's historical spending patterns
3. Serve these features with millisecond latency in production


You'll learn how to:

- Create realtime features using Python
- Test your features interactively in a notebook
- Combine realtime data with historical user patterns
- Generate training data for your model
- Deploy your features to production


No prior Tecton experience is required, though basic Python knowledge is
assumed. Let's get started by setting up our environment!


## Prerequisites

Before we dive into building features, let's get our environment set up. You'll
need Python >= 3.8 to get started.


### 1. Install Tecton

Run this command to install the Tecton SDK and supporting libraries:


In [None]:
!pip install 'tecton[rift]==1.1.0' gcsfs s3fs -q

### 2. Connect to Tecton

Log in to your Tecton account (replace `explore.tecton.ai` with your
organization's URL if different):


In [None]:
import tecton

tecton.login("explore.tecton.ai")

### 3. Import Required Dependencies

Copy these imports - we'll use them throughout the tutorial:


In [None]:
from tecton import *
from tecton.types import *
from datetime import datetime, timedelta
import pandas as pd

tecton.conf.set("TECTON_OFFLINE_RETRIEVAL_COMPUTE_MODE", "rift")

### 4. Sample Data

For this tutorial, we'll use a sample transaction dataset that includes:

- Historical transaction amounts
- Transaction timestamps
- User IDs
- Fraud labels

You don't need to download anything - we'll access this data directly from an S3
bucket when needed.

✅ With your environment ready, let's build your first realtime feature!


## Part 1: Your First Realtime Feature

Let's start by building a simple but useful feature for fraud detection:
identifying high-value transactions that might need extra scrutiny. We'll create
a feature that checks if a transaction amount exceeds $1,000.


First, we need to tell Tecton what data we expect to receive at request time. We
do this using a `RequestSource`:


In [None]:
# Define the schema for our request data
transaction_request = RequestSource(schema=[Field("amount", Float64)])  # We expect to receive a transaction amount
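
To make the schema concrete, here is a minimal sketch of a client-side check that a payload has the shape the `RequestSource` above declares. `validate_request` and `EXPECTED_FIELDS` are hypothetical names used only for illustration. They are not part of the Tecton SDK, and Tecton's own validation may behave differently.

```python
# Hypothetical mirror of RequestSource(schema=[Field("amount", Float64)])
EXPECTED_FIELDS = {"amount": (int, float)}


def validate_request(payload):
    """Return a list of problems; an empty list means the payload matches."""
    problems = []
    for name, allowed_types in EXPECTED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, allowed_types):
            problems.append(f"{name} should be numeric, got {type(value).__name__}")
    return problems


print(validate_request({"amount": 182.40}))
print(validate_request({"amount": "182.40"}))
print(validate_request({}))
```

A check like this can catch malformed payloads before they reach the feature service, for example an amount sent as a string.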

### Creating the Realtime Feature

Now let's create our first realtime feature. We'll write a Python function that
takes the transaction amount and returns True if it's over $1,000:


In [None]:
@realtime_feature_view(
    sources=[transaction_request],  # Use our RequestSource as input
    mode="python",  # We'll write our transformation in Python
    features=[Attribute("transaction_amount_is_high", Bool)],  # Our output feature
)
def transaction_amount_is_high(request):
    """Check if a transaction amount is over $1,000."""
    return {"transaction_amount_is_high": request["amount"] > 1000}

### Testing the Feature

Let's test our feature with some sample data:


In [None]:
# Test with a small transaction amount
small_transaction = {"request": {"amount": 182.40}}
print("Small transaction result:")
print(transaction_amount_is_high.run_transformation(input_data=small_transaction))

# Test with a large transaction amount
large_transaction = {"request": {"amount": 1500.00}}
print("\nLarge transaction result:")
print(transaction_amount_is_high.run_transformation(input_data=large_transaction))

Great! You've created your first realtime feature. However, a static threshold
of $1,000 might not make sense for all users - someone who regularly makes large
purchases shouldn't trigger the same alerts as someone who typically makes small
transactions.
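
The limitation can be seen in a small plain-Python sketch. The two users and their typical spending are made-up values for illustration. Only the $1,000 cutoff comes from the feature above.

```python
STATIC_THRESHOLD = 1000  # same cutoff as transaction_amount_is_high


def is_high_static(amount):
    """Mirror of the Part 1 logic: flag amounts strictly above the cutoff."""
    return amount > STATIC_THRESHOLD


# Hypothetical users and their typical purchase sizes
typical_spend = {"frequent_big_spender": 2500.0, "occasional_small_spender": 30.0}

# One new purchase per user
purchases = [("frequent_big_spender", 1500.0), ("occasional_small_spender", 900.0)]

results = {user: is_high_static(amount) for user, amount in purchases}
for user, amount in purchases:
    ratio = amount / typical_spend[user]
    print(f"{user}: amount={amount}, {ratio:.1f}x typical, flagged={results[user]}")
```

The big spender is flagged for a purchase below their usual size, while the small spender's purchase at 30 times their usual size passes unnoticed.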

In the next section, we'll make this feature smarter by comparing the
transaction amount to each user's typical spending patterns.

## Part 2: Making Features Smarter with Historical Context

Now let's improve our fraud detection by comparing each transaction against the
user's historical spending patterns. Instead of using a fixed threshold, we'll
check if the transaction amount is unusually high compared to their average
transaction amount.


First, let's create a Batch Feature View that calculates each user's average
transaction amount over the past year:


In [None]:
# Define our data source containing historical transactions
transactions_batch = BatchSource(
    name="transactions_batch",
    batch_config=FileConfig(
        uri="s3://tecton.ai.public/tutorials/transactions.pq",
        file_format="parquet",
        timestamp_field="timestamp",
    ),
)

user = Entity(name="user", join_keys=[Field("user_id", String)])

@batch_feature_view(
    sources=[transactions_batch],
    entities=[user],
    mode="pandas",
    timestamp_field="timestamp",
    aggregation_interval=timedelta(days=1),
    features=[
        Aggregate(
            input_column=Field("amount", Float64),
            function="mean",
            time_window=timedelta(days=365),
            name="yearly_average",
        ),
    ],
)
def user_transaction_averages(transactions):
    """Calculate the yearly average transaction amount per user."""
    return transactions[["user_id", "timestamp", "amount"]]
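
Conceptually, a time-window aggregation with an `aggregation_interval` can be thought of as keeping one partial aggregate per interval and combining the ones that fall inside the window. The sketch below illustrates that idea in plain Python with made-up transactions. It is not Tecton's actual implementation, and the names `tiles` and `yearly_average` here are illustrative only.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical transactions: (user_id, day, amount)
transactions = [
    ("user_1", date(2024, 1, 1), 20.0),
    ("user_1", date(2024, 1, 1), 40.0),
    ("user_1", date(2024, 6, 15), 60.0),
    ("user_1", date(2022, 1, 1), 1000.0),  # more than a year old: outside the window
]

# Step 1: one partial aggregate [sum, count] per user per day
tiles = defaultdict(lambda: [0.0, 0])
for user_id, day, amount in transactions:
    tile = tiles[(user_id, day)]
    tile[0] += amount
    tile[1] += 1


def yearly_average(user_id, as_of, window=timedelta(days=365)):
    """Combine the daily tiles inside the window that ends at as_of."""
    total, count = 0.0, 0
    for (tile_user, day), (tile_sum, tile_count) in tiles.items():
        if tile_user == user_id and as_of - window <= day < as_of:
            total += tile_sum
            count += tile_count
    return total / count if count else None


print(yearly_average("user_1", date(2024, 12, 1)))  # 40.0
print(yearly_average("user_2", date(2024, 12, 1)))  # None (no history)
```

Storing a sum and a count per day, rather than a mean, is what lets the daily pieces be combined into a correct average over any window.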

### Combining Real-time and Historical Data

Now let's create an improved realtime feature that compares the current
transaction amount against the user's yearly average:


In [None]:
@realtime_feature_view(
    sources=[transaction_request, user_transaction_averages],  # Current transaction data + Historical averages
    mode="python",
    features=[Attribute("transaction_amount_is_higher_than_average", Bool)],
)
def transaction_amount_is_higher_than_average(transaction_request, user_transaction_averages):
    """Check if transaction amount exceeds user's yearly average."""
    amount_mean = user_transaction_averages["yearly_average"] or 0
    current_amount = transaction_request["amount"]

    return {"transaction_amount_is_higher_than_average": current_amount > amount_mean}

### Testing with Historical Context

Let's test our improved feature with some realistic scenarios:


In [None]:
# Test scenario: Regular user with transaction history
input_data = {"transaction_request": {"amount": 182.40}, "user_transaction_averages": {"yearly_average": 33.46}}

print("Regular user making larger than usual purchase:")
print(transaction_amount_is_higher_than_average.run_transformation(input_data))

input_data = {"transaction_request": {"amount": 182.40}, "user_transaction_averages": {"yearly_average": 500.00}}

print("\nHigh-value shopper making typical purchase:")
print(transaction_amount_is_higher_than_average.run_transformation(input_data))

Now we have a smarter feature that understands user context!
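
One case the tests above don't cover is a user with no transaction history. The sketch below mirrors the comparison logic as a plain function so the `or 0` fallback can be checked in isolation. It assumes the missing average arrives as `None`, and `is_higher_than_average` is an illustrative stand-in rather than the feature view itself.

```python
def is_higher_than_average(amount, yearly_average):
    """Same comparison as the feature view, with the `or 0` fallback."""
    amount_mean = yearly_average or 0  # None (no history) falls back to 0
    return amount > amount_mean


print(is_higher_than_average(182.40, 33.46))   # True
print(is_higher_than_average(182.40, 500.00))  # False
print(is_higher_than_average(5.00, None))      # True: any positive amount beats 0
```

With this fallback, every positive purchase from a brand-new user is reported as higher than average. Whether that is the right signal for your model is a design choice worth making deliberately.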

### What's Powerful About This?

**Request-Aware Features in Minutes**: You defined a feature that reacts to the
incoming transaction amount -- no precomputation, no infrastructure setup. This
lets you incorporate request-time context into your model immediately.

**Contextual Intelligence from Historical Patterns**: By combining request-time
data with each user's historical average, you created a feature that adapts to
individual behavior instead of relying on static thresholds. This enables more
intelligent, personalized decisions.

**Fast, Flexible Iteration**: You tested both features directly in your
notebook, using just Python and sample inputs. No deployment or materialization
required, making it easy to explore different ideas quickly.

## Part 3: Getting Ready for Production

Now that we've built and tested our realtime features, let's prepare them for
production use. We'll cover how to generate training data, deploy the features,
and serve them in production.


To train a model with our features, we need to generate historical training
data. First, let's create a Feature Service that bundles our features together:


In [None]:
from tecton import FeatureService

fraud_detection_feature_service = FeatureService(
    name="fraud_detection_feature_service",
    features=[
        user_transaction_averages,  # Historical averages
        transaction_amount_is_higher_than_average,  # Realtime comparison
    ],
)

Now let's load some historical transaction data with fraud labels:


In [None]:
# Load historical transactions with fraud labels
training_events = pd.read_parquet("s3://tecton.ai.public/tutorials/transactions.pq", storage_options={"anon": True})[
    ["user_id", "timestamp", "amount", "is_fraud"]
]

training_data = fraud_detection_feature_service.get_features_for_events(training_events).to_pandas()

print("Training data preview:")
display(training_data.head())
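
To build intuition for what a point-in-time lookup produces, the sketch below constructs training rows by hand from a few made-up events. For each event it uses only the transactions that happened strictly before it, so later data never leaks into an earlier row. This is a plain-Python illustration of the idea, not what `get_features_for_events` runs internally, and `average_before` is a hypothetical helper.

```python
from datetime import datetime, timedelta

# Hypothetical labeled history: (user_id, timestamp, amount, is_fraud)
events = [
    ("user_1", datetime(2024, 1, 1), 20.0, False),
    ("user_1", datetime(2024, 2, 1), 40.0, False),
    ("user_1", datetime(2024, 3, 1), 500.0, True),
]


def average_before(user_id, ts, window=timedelta(days=365)):
    """Mean amount of the user's transactions in the window strictly before ts."""
    past = [a for u, t, a, _ in events if u == user_id and ts - window <= t < ts]
    return sum(past) / len(past) if past else None


training_rows = []
for user_id, ts, amount, is_fraud in events:
    avg = average_before(user_id, ts)
    training_rows.append(
        {
            "user_id": user_id,
            "timestamp": ts,
            "yearly_average": avg,
            "transaction_amount_is_higher_than_average": amount > (avg or 0),
            "is_fraud": is_fraud,
        }
    )

for row in training_rows:
    print(row["timestamp"].date(), row["yearly_average"], row["is_fraud"])
```

The 500.00 transaction is compared against an average of 30.0, computed from the two earlier purchases only, which matches what would have been known at serving time.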

### Deploying to Production

To deploy our features, we need to:

1. Copy our feature definitions to a Feature Repository
2. Apply them to a live workspace
3. Generate an API key for serving

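Here's the feature repository code. It reuses the definitions from Parts 1 and
2, followed by the Feature Service:

```python
from tecton import *
from tecton.types import *
from datetime import datetime, timedelta

# Copy the definitions from Parts 1 and 2 here, in this order:
# transaction_request, transactions_batch, user,
# user_transaction_averages, transaction_amount_is_higher_than_average

fraud_detection_feature_service = FeatureService(
    name="fraud_detection_feature_service",
    features=[user_transaction_averages, transaction_amount_is_higher_than_average],
)
```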

Deploy using the Tecton CLI:

```bash
tecton workspace create --live fraud-detection
tecton apply
```

### Serving Realtime Features

First, generate a service account API key from the Tecton UI:

1. Navigate to Settings > Service Accounts
2. Create a new service account
3. Save the API key
4. Grant the service account "Consumer" access to your workspace

Now we can make realtime feature requests:


In [None]:
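# Run this from any environment that can reach your Tecton cluster.
# It is a convenient way to verify the deployment end to end.
import tecton

TECTON_API_KEY = "your-api-key"  # Replace with your API key
WORKSPACE_NAME = "fraud-detection"

# Replace the URL with your organization's Tecton URL
tecton.login(tecton_url="https://example.tecton.ai", tecton_api_key=TECTON_API_KEY)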
ws = tecton.get_workspace(WORKSPACE_NAME)
fraud_detection_service = ws.get_feature_service("fraud_detection_feature_service")

features = fraud_detection_service.get_online_features(
    join_keys={"user_id": "user_123"}, request_data={"amount": 750.00}
)

print("\nRealtime feature response:")
print(features.to_dict())

### Important Production Notes

1. For best performance in production:

   - Call the REST API directly, or use Tecton's Python/Java client libraries
   - Avoid the SDK's `get_online_features()` method, which is intended for
     testing rather than production traffic

2. Monitor your features:
   - Watch feature freshness in the Tecton UI
   - Set up alerts for serving latency
   - Track feature distribution changes
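
As a rough sketch of what calling the REST API directly might look like, the example below builds the HTTP request for the same lookup shown earlier, without sending it. The endpoint path, the `Tecton-key` authorization scheme, and the body field names reflect my understanding of Tecton's HTTP API and should be treated as assumptions to verify against the official documentation. `build_get_features_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_get_features_request(cluster_url, api_key, workspace, feature_service,
                               join_keys, request_data):
    """Build (but do not send) a POST request for one feature vector."""
    body = {
        "params": {
            "workspace_name": workspace,
            "feature_service_name": feature_service,
            "join_key_map": join_keys,            # entity keys, e.g. user_id
            "request_context_map": request_data,  # request-time inputs, e.g. amount
        }
    }
    return urllib.request.Request(
        url=f"{cluster_url}/api/v1/feature-service/get-features",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Tecton-key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_get_features_request(
    "https://example.tecton.ai",
    "your-api-key",
    "fraud-detection",
    "fraud_detection_feature_service",
    {"user_id": "user_123"},
    {"amount": 750.00},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```

Separating request construction from sending keeps the payload easy to unit test and makes it straightforward to swap in whichever HTTP client your service already uses.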

That's it! You've successfully built, tested, and deployed realtime features
with Tecton.

## Wrap-up

Congratulations! You've successfully built production-ready realtime features
for fraud detection. Let's recap.

### What You Built

- A basic realtime feature checking transaction amounts
- A smarter feature that adapts to each user's spending patterns
- A production-ready feature service combining historical and realtime data

### What You Learned

- Using `RequestSource` to define realtime inputs
- Creating `realtime_feature_view`s for on-the-fly computations
- Combining realtime data with historical features
- Generating training data while maintaining consistency
- Deploying features to production

### Next Steps

1. **Experiment with your own data**:

   - Try different aggregation windows for historical patterns
   - Add more features like time-of-day or location checks
   - Combine multiple historical features

2. **Optimize for production**:

   - Set up proper monitoring
   - Configure alerts
   - Test performance at scale

3. **Dive deeper**:
   - Explore more complex transformations
   - Add feature monitoring
   - Implement feature logging

Remember: realtime features in Tecton use the exact same code for training and
serving, eliminating the risk of training-serving skew.

Ready to build more? Check out our other tutorials and documentation for more
advanced features and best practices!