# Databricks AI Segmentation (POC)
This notebook builds simple customer segments from transaction data using pandas **DataFrames** and exports a CSV for ingestion into **Salesforce Data Cloud**.


In [1]:
import pandas as pd

# Load the sample transactions and parse TxnDate so date arithmetic works later
df = pd.read_csv('../data/sample_transactions.csv')
df['TxnDate'] = pd.to_datetime(df['TxnDate'])
df.head()

In [2]:
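# Simple RFM-like features: recency (recency_days), frequency (txn_count), monetary (total_amount)
as_of = pd.Timestamp('2025-07-10')  # fixed reference date for the recency calculation
agg = df.groupby('CustomerId').agg(total_amount=('Amount', 'sum'),
                                   last_txn=('TxnDate', 'max'),
                                   txn_count=('TxnId', 'count'))
agg['recency_days'] = (as_of - agg['last_txn']).dt.days
# Segment by total spend. pd.cut bins are right-inclusive by default:
# (0, 100] -> Low, (100, 300] -> Medium, (300, 10000] -> High.
# Totals outside (0, 10000] get NaN instead of a label.
agg['segment'] = pd.cut(agg['total_amount'], bins=[0, 100, 300, 10000],
                        labels=['Low', 'Medium', 'High'])
# CustomerId is the index after groupby, so reset it to write it out as a column
agg.reset_index().to_csv('segments.csv', index=False)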
agg.head()
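
To make the bin boundaries in the `pd.cut` call concrete, here is a small sketch that applies the same bins and labels to a handful of made-up totals. The values are illustrative only and are not taken from `sample_transactions.csv`.

```python
import pandas as pd

# Made-up totals chosen to sit on and around the bin edges (not real data)
totals = pd.Series([0, 50, 100, 100.01, 300, 9999, 10000, 12000], name='total_amount')

# Same bins and labels as the segmentation cell; right=True is the pd.cut default
segments = pd.cut(totals, bins=[0, 100, 300, 10000], labels=['Low', 'Medium', 'High'])

result = pd.DataFrame({'total_amount': totals, 'segment': segments})
print(result)
```

With these inputs, 100 falls in Low and 300 in Medium because each bin includes its right edge, while 0 and 12000 come back as NaN because they lie outside (0, 10000]. If real data can contain zero or very large totals, those rows would be exported with an empty segment.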

The exported **segments.csv** can be ingested into **Salesforce Data Cloud** through a CSV data stream to drive **CRM Analytics** and **Agentforce** actions.
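
Before uploading, it may help to run a quick sanity check on the file. The sketch below is one possible check, not a Data Cloud requirement: `check_segments_csv` is a hypothetical helper, and the expected column order is an assumption based on the export cell in this notebook. It uses only the standard library so it can also run outside the notebook.

```python
import csv

# Column order produced by agg.reset_index().to_csv(...) in this notebook (assumed)
EXPECTED_COLUMNS = ['CustomerId', 'total_amount', 'last_txn',
                    'txn_count', 'recency_days', 'segment']
VALID_SEGMENTS = {'Low', 'Medium', 'High'}


def check_segments_csv(path):
    """Return a list of human-readable problems found in the export (empty if none)."""
    problems = []
    with open(path, newline='', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != EXPECTED_COLUMNS:
            problems.append(f'unexpected header: {reader.fieldnames}')
            return problems
        seen = set()
        for line_no, row in enumerate(reader, start=2):  # header is line 1
            cid = row['CustomerId']
            if cid in seen:
                problems.append(f'line {line_no}: duplicate CustomerId {cid}')
            seen.add(cid)
            # An empty segment means the total fell outside the pd.cut bins
            if row['segment'] not in VALID_SEGMENTS:
                problems.append(f'line {line_no}: missing or unknown segment {row["segment"]!r}')
    return problems
```

Calling `check_segments_csv('segments.csv')` after the export cell returns an empty list when the header matches, every `CustomerId` appears once, and every row carries one of the three segment labels.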