# Feast Feature Store - Fraud Detection

This notebook demonstrates how to use Feast for feature management in the fraud detection project.

In [None]:
import sys
sys.path.append('..')

import pandas as pd
from datetime import datetime, timedelta
from src.features import get_fraud_feature_store

## 1. Initialize Feature Store

First, we need to apply our feature definitions to the Feast registry.

In [None]:
# Apply feature definitions
# Run this in terminal: cd feature_store && feast apply
# Or uncomment below:
# !cd ../feature_store && feast apply

## 2. Load Feature Store

In [None]:
# Initialize the feature store
fs = get_fraud_feature_store(repo_path='../feature_store')

# List available feature views (printing .name assumes Feast's FeatureView objects)
print("Available Feature Views:")
for fv in fs.list_feature_views():
    print(f"  - {fv.name}")

print("\nAvailable Feature Services:")
for service in fs.list_feature_services():
    print(f"  - {service.name}")

## 3. Get Historical Features for Training

Create an entity DataFrame with timestamps and entity keys to retrieve historical features.

In [None]:
# Load your training data
X_train = pd.read_csv('../data/processed/X_train.csv')

# Create entity DataFrame with the entity key and an event timestamp
# Feast looks for an `event_timestamp` column by default
entity_df = pd.DataFrame({
    'trans_num': X_train.index,  # Placeholder IDs; use the real trans_num column if available
    'event_timestamp': pd.to_datetime('2024-01-01'),  # Placeholder; use each transaction's actual timestamp
})

# Note: This is a simplified example
# In production, your data should already have proper timestamps
print("Entity DataFrame:")
print(entity_df.head())
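
For reference, an entity DataFrame built from real transaction rows might look like the sketch below. The column name `trans_date_trans_time`, the helper `build_entity_df`, and the sample values are assumptions for illustration; adapt them to your own data.

```python
import pandas as pd

# Hypothetical raw transactions; column names and values are illustrative only
transactions = pd.DataFrame({
    "trans_num": ["txn_001", "txn_002", "txn_003"],
    "cc_num": ["1234567890123456", "1234567890123456", "6543210987654321"],
    "merchant": ["merchant_xyz", "merchant_abc", "merchant_xyz"],
    "trans_date_trans_time": [
        "2024-01-01 08:15:00",
        "2024-01-02 19:40:00",
        "2024-01-03 12:05:00",
    ],
})

def build_entity_df(df: pd.DataFrame, time_col: str) -> pd.DataFrame:
    """Keep the entity keys and expose each row's own time as event_timestamp."""
    entity = df[["trans_num", "cc_num", "merchant"]].copy()
    # Parse to timezone-aware UTC timestamps so point-in-time joins are unambiguous
    entity["event_timestamp"] = pd.to_datetime(df[time_col], utc=True)
    return entity

example_entity_df = build_entity_df(transactions, "trans_date_trans_time")
print(example_entity_df.dtypes)
```

Using each row's own timestamp, rather than one constant date, is what lets the feature store return the feature values that were valid at the time of that transaction.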

In [None]:
# Get historical features
# Note: This will fail until you properly set up your data sources
# and run `feast apply` in the feature_store directory

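try:
    # Assumes fs is a feast.FeatureStore and "fraud_detection_v1" is a
    # feature service registered by `feast apply`
    feature_service = fs.get_feature_service("fraud_detection_v1")
    historical_features = fs.get_historical_features(
        entity_df=entity_df.head(100),  # Start with small sample
        features=feature_service
    ).to_df()  # get_historical_features returns a RetrievalJob, not a DataFrame
    print("\nHistorical Features:")
    print(historical_features.head())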
except Exception as e:
    print(f"Error getting historical features: {e}")
    print("\nMake sure to:")
    print("1. Run 'cd feature_store && feast apply'")
    print("2. Ensure your data has timestamp columns")
    print("3. Create aggregated feature files (customer_aggregates.parquet, merchant_aggregates.parquet)")

## 4. Materialize Features to Online Store

Before serving features online, materialize them from the offline store to the online store.

In [None]:
# Materialize features for a date range
# materialize() expects datetime objects, not strings
try:
    fs.materialize(
        start_date=datetime(2024, 1, 1),
        end_date=datetime(2024, 12, 31)
    )
    print("Features materialized successfully!")
except Exception as e:
    print(f"Error materializing features: {e}")
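
Rather than hard-coding dates, the materialization window can be computed relative to the current time. The sketch below is one way to do that; the helper name `materialization_window` and the 30-day window are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

def materialization_window(
    days_back: int, now: Optional[datetime] = None
) -> Tuple[datetime, datetime]:
    """Return a (start, end) pair of UTC datetimes covering the last `days_back` days."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days_back)
    return start, end

start_date, end_date = materialization_window(days_back=30)
print(f"Materialize from {start_date.isoformat()} to {end_date.isoformat()}")
# With a loaded store: fs.materialize(start_date=start_date, end_date=end_date)
```

Passing `now` explicitly makes the window reproducible, which helps when re-running a backfill for a fixed period.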

## 5. Get Online Features for Real-time Prediction

In [None]:
# Define entity rows for online serving
entity_rows = [
    {
        "trans_num": "txn_001",
        "cc_num": "1234567890123456",
        "merchant": "merchant_xyz"
    }
]

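# Get online features
# Assumes "fraud_detection_v1" is a registered feature service
try:
    online_features = fs.get_online_features(
        entity_rows=entity_rows,
        features=fs.get_feature_service("fraud_detection_v1")
    )
    print("Online Features:")
    print(online_features.to_dict())  # Convert the OnlineResponse to a plain dict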
except Exception as e:
    print(f"Error getting online features: {e}")
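
An online response converted with `to_dict()` is column-oriented: each name maps to a list with one value per entity row. A prediction API usually wants one record per entity instead. The sketch below shows one way to reshape it; the helper `response_to_rows` and the feature names in the sample payload are made up for illustration.

```python
from typing import Any, Dict, List

def response_to_rows(columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Turn a column-oriented dict (name -> list of values) into one dict per entity row."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]

# Hypothetical payload shaped like a column-oriented response; feature names are made up
sample = {
    "trans_num": ["txn_001", "txn_002"],
    "customer_txn_count_7d": [12, 3],
    "merchant_fraud_rate_30d": [0.02, None],  # None stands in for a value not found online
}

rows = response_to_rows(sample)
print(rows[0])
```

Handling `None` explicitly before scoring (for example with a default value) avoids passing missing features straight into the model.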

## 6. Integration with Training Pipeline

Example of how to integrate Feast with your model training.

In [None]:
# Example training integration
from sklearn.linear_model import LogisticRegression
import mlflow

# 1. Get features from Feast
# training_features = fs.get_historical_features(...).to_df()

# 2. Train model
# model = LogisticRegression()
# model.fit(training_features, labels)

# 3. Log to MLflow
# with mlflow.start_run():
#     mlflow.log_param("feature_store_version", "v1")
#     mlflow.sklearn.log_model(model, "model")

print("See the modeling notebook for full training example with Feast")
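
Before fitting a model, the retrieved frame needs its entity key and timestamp columns removed and its labels aligned to the same rows. The following is a minimal sketch of that step using only pandas; the frame contents, the feature names, and the helper `split_features_and_labels` are hypothetical.

```python
import pandas as pd

# Hypothetical frame shaped like retrieved historical features; feature names are made up
training_df = pd.DataFrame({
    "trans_num": ["txn_001", "txn_002", "txn_003"],
    "event_timestamp": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-03"], utc=True
    ),
    "customer_txn_count_7d": [12, 3, 7],
    "merchant_fraud_rate_30d": [0.02, 0.10, 0.00],
})

# Hypothetical labels keyed by transaction ID
labels = pd.Series([0, 1, 0], index=["txn_001", "txn_002", "txn_003"], name="is_fraud")

def split_features_and_labels(df, labels, key="trans_num", drop=("event_timestamp",)):
    """Align labels to feature rows by key and drop non-feature columns."""
    y = labels.reindex(df[key]).to_numpy()
    X = df.drop(columns=[key, *drop])
    return X, y

X, y = split_features_and_labels(training_df, labels)
print(X.columns.tolist(), y.tolist())
```

Aligning by key rather than by row position guards against the feature rows coming back in a different order than the labels.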

## Next Steps

1. **Apply feature definitions**: Run `cd feature_store && feast apply`
2. **Prepare aggregated features**: Create customer and merchant aggregate files
3. **Update data sources**: Ensure your data has proper timestamps
4. **Materialize features**: Run materialization for your date range
5. **Integrate with API**: Use `get_online_features` in your fraud detection API
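
Step 2 can be sketched with a pandas `groupby`. The column names `amt` and `trans_date_trans_time`, the aggregate names, and the sample rows below are assumptions for illustration, not the project's actual schema.

```python
import pandas as pd

# Hypothetical transactions; `amt` and `trans_date_trans_time` are assumed column names
transactions = pd.DataFrame({
    "cc_num": ["1111", "1111", "2222"],
    "amt": [10.0, 30.0, 5.0],
    "trans_date_trans_time": pd.to_datetime(
        ["2024-01-01 08:00", "2024-01-02 09:00", "2024-01-01 10:00"], utc=True
    ),
})

# One row per customer, stamped with that customer's latest transaction time
customer_aggregates = (
    transactions.groupby("cc_num")
    .agg(
        txn_count=("amt", "size"),
        avg_amt=("amt", "mean"),
        event_timestamp=("trans_date_trans_time", "max"),
    )
    .reset_index()
)
print(customer_aggregates)

# To persist (requires pyarrow or fastparquet):
# customer_aggregates.to_parquet("customer_aggregates.parquet", index=False)
```

This collapses each customer's history into a single row. For point-in-time correct training data, the aggregates would likely need to be computed per time window so that each row only reflects transactions available at its timestamp.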

## Useful Commands

```bash
# Apply feature definitions
cd feature_store && feast apply

# List all feature views
feast feature-views list

# Materialize features
feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")

# Tear down (reset)
feast teardown
```