## Banking Scenario


Background: First National Bank is implementing an AI-powered system to help relationship managers identify suitable customers for personalized product recommendations. The bank wants to move beyond rule-based targeting ("all customers over 50 with $100K+ income") to a more nuanced approach that can understand natural language queries and customer characteristics holistically.

Use Case: Relationship managers can use natural language to find customers matching specific profiles or needs. For example, they might search for "high-income professionals nearing retirement who might be interested in wealth management services" or "young professionals with good credit scores who might qualify for premium credit cards."


## Implementation Overview

This notebook demonstrates how to build and deploy a machine learning model that uses natural language processing to match banking customer profiles with relationship managers' queries. The system leverages:

1. **Sentence Transformers**: To convert both customer profiles and natural language queries into semantic vector embeddings
2. **MLflow**: For model packaging, versioning, and deployment
3. **Semantic Search**: Using cosine similarity to find the most relevant customer matches

When deployed, relationship managers can simply type queries like "Find affluent customers with children in college who might need education loans" and get a list of the most relevant customers without needing to construct complex database queries.


## Process Description

This Banking Customer Similarity Model provides a semantic search capability over customer profiles. It works in six steps:

1. **Data Preparation**: Customer data (demographics, financial information, product history) is processed and organized.
2. **Embedding Generation**: The SentenceTransformer model converts customer profiles into numerical embeddings (high-dimensional vectors) that capture semantic meaning.
3. **MLflow Deployment**: The model, embeddings, and dataset are packaged and deployed using MLflow for reproducible inference.
4. **Query Processing**: When a relationship manager enters a natural language query, it is converted to an embedding using the same model.
5. **Similarity Matching**: The system calculates cosine similarity between the query embedding and all customer embeddings to find the most similar customers.
6. **Result Presentation**: Top matching customers are presented with their relevant information and similarity scores.
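
The similarity-matching step can be sketched in a few lines of NumPy. The three-dimensional vectors below are made-up stand-ins for real embeddings (the embeddings in this notebook have 384 dimensions), and `top_k_cosine` is a hypothetical helper written for illustration, not part of the notebook's code.

```python
import numpy as np

def top_k_cosine(query_vec, customer_vecs, k=2):
    """Return (indices, scores) of the k rows most similar to query_vec."""
    # Normalize both sides so that a dot product equals cosine similarity
    q = query_vec / np.linalg.norm(query_vec)
    c = customer_vecs / np.linalg.norm(customer_vecs, axis=1, keepdims=True)
    scores = c @ q
    order = np.argsort(scores)[::-1][:k]  # highest similarity first
    return order, scores[order]

# Toy "embeddings" (illustrative values only)
customers = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
query = np.array([1.0, 0.1, 0.0])

indices, scores = top_k_cosine(query, customers, k=2)
print(indices, scores)
```

The first customer points in almost the same direction as the query, so it ranks first; the third customer shares only part of that direction and ranks second.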

## Benefits

Personalized Marketing: Target customers with relevant offers based on their profile similarity to successful past campaigns.

Risk Assessment: Identify customers with similar risk profiles to known good/bad accounts.

Cross-Selling Opportunities: Find customers similar to those who have already purchased specific products.

Natural Language Interface: Allows relationship managers to search customers without needing complex SQL queries or predefined segments.

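Scalability: Because search operates on fixed-length vectors, the approach extends to larger customer bases. This notebook computes exact cosine similarity in memory over 1,000 profiles; searching millions of profiles efficiently would call for a dedicated vector index such as FAISS.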

This system bridges the gap between rich customer data and actionable insights by providing an intuitive way to explore customer segments and identify targeted opportunities for engagement.


## Technical Implementation

The following code implements the complete Banking Customer Similarity system. We'll start by importing necessary libraries and setting up our model environment.



In [1]:
# Imports
import os
import json
import torch
import numpy as np
import pandas as pd
from tabulate import tabulate
import mlflow
import mlflow.pyfunc

from mlflow import MlflowClient
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import Schema, ColSpec, TensorSpec, ParamSchema, ParamSpec
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

# Model settings - using local sentence-transformer model
model_filename = "sentence-transformer"
model_dir = "model"
MODEL_PATH = os.path.join(model_dir, model_filename)


## Model Loading

The sentence transformer model handles the semantic encoding of text: it converts text into high-dimensional vectors in which texts with similar meanings lie closer together. The model is loaded from the local directory at `MODEL_PATH` (`model/sentence-transformer`) inside the model class's `load_context` method.


## Model Architecture

The core of our system is the `BankingSimilarityModel` class which:
1. Loads pre-computed customer embeddings and banking data
2. Handles query encoding and similarity computation
3. Formats results for display
4. Includes MLflow integration for model deployment

This class inherits from `mlflow.pyfunc.PythonModel` to make it deployable through MLflow's model registry.
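
Besides raw cosine similarity, the `predict` method of this class adds a capped numeric boost to candidates whose credit score or income exceeds a threshold found in the query. The arithmetic follows the pattern `clip((value - threshold) / scale, 0, 0.3)`. The sketch below isolates it; `threshold_boost` is a hypothetical helper name, and the sample values are invented for illustration.

```python
import numpy as np

def threshold_boost(values, threshold, scale, cap=0.3):
    """Linear boost for values above `threshold`, clipped to [0, cap]."""
    values = np.asarray(values, dtype=float)
    return np.clip((values - threshold) / scale, 0, cap)

# Credit score boost: threshold 700, one unit of boost per 100 points
credit_boost = threshold_boost([680, 720, 750], threshold=700, scale=100)

# Income boost: threshold 75,000, one unit of boost per 50,000
income_boost = threshold_boost([60000, 80000, 100000], threshold=75000, scale=50000)

print(credit_boost)  # below threshold gets 0, far above is capped at 0.3
print(income_boost)
```

Because both boosts can apply at once, a candidate may gain up to 0.6, so the reported score is better read as a ranking score than as a pure cosine similarity.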


In [2]:
# Banking Similarity Model Class
class BankingSimilarityModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        """
        Load precomputed embeddings, banking data, and sentence-transformer model.
        """
        # Load precomputed embeddings
        embeddings_path = context.artifacts['embeddings_path']
        self.embeddings = np.load(embeddings_path)
        
        # Load banking dataset
        banking_dataset_path = context.artifacts['banking_dataset_path']
        self.banking_df = pd.read_csv(banking_dataset_path)
        
        # Print essential diagnostics
        print(f"Loaded embeddings shape: {self.embeddings.shape}")
        print(f"Loaded banking data shape: {self.banking_df.shape}")
        
        # Load model from artifacts - no fallback needed since we're explicitly including it
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        model_path = context.artifacts['model_dir']
        self.model = SentenceTransformer(model_path, device=self.device)
        print(f"SentenceTransformer model loaded successfully from {model_path}")
    
    def generate_query_embedding(self, query):
        """Generate embedding for the input query text using SentenceTransformer"""
        # encode() runs the model's own pooling and returns the sentence embedding as a 1-D NumPy array
        embedding = self.model.encode(query)
        return embedding.reshape(1, -1)  # Reshape to 2D for cosine_similarity
    
    def predict(self, context, model_input, params=None):
        """Find similar banking customers based on semantic similarity to the query text."""
        try:
            # Extract the query string from model input - handle different input types
            query = model_input["query"]
            if isinstance(query, pd.Series):
                query = query.iloc[0]
            # The value may still be wrapped in a list or NumPy array; unwrap the first element
            if isinstance(query, (list, tuple, np.ndarray)):
                query = query[0]
            
            # Convert query to string if it's not already
            query = str(query)
            print(f"Processing query: '{query}'")
            
            # Extract parameters
            top_n = params.get("top_n", 5) if params else 5
            
            # Check for filtering keywords in query
            filter_customers = None
            original_query = query
            
            # Handle credit score filtering
            import re  # imported unconditionally: the income check further down also uses it
            if "credit score over" in query.lower() or "credit score above" in query.lower():
                match = re.search(r'credit score (?:over|above) (\d+)', query.lower())
                if match:
                    threshold = int(match.group(1))
                    print(f"Applying filter: Credit Score > {threshold}")
                    filter_customers = self.banking_df['credit_score'] > threshold
                    # Case-insensitive so the phrase is rewritten regardless of capitalization
                    query = re.sub(r'credit score (?:over|above) \d+', 'good credit score', query, flags=re.IGNORECASE)
            
            # Generate embedding for the query text
            query_embedding = self.generate_query_embedding(query)
            
            # Apply filtering if needed
            df_to_search = self.banking_df
            embeddings_to_search = self.embeddings
            if filter_customers is not None:
                filter_indices = np.where(filter_customers)[0]
                if len(filter_indices) > 0:
                    df_to_search = self.banking_df.iloc[filter_indices]
                    embeddings_to_search = self.embeddings[filter_indices]
                    print(f"Filtered to {len(filter_indices)} customers matching criteria")
                else:
                    print("No customers match the filtering criteria")
                    return {"predictions": []}
            
            # Perform semantic search first to get initial candidates
            similarities = cosine_similarity(query_embedding, embeddings_to_search)[0]
            top_indices = np.argsort(similarities)[::-1][:min(top_n * 3, len(df_to_search))]  # Get more candidates

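            # Check for credit score threshold in filtering. Use the original query here,
            # because an explicit threshold phrase has already been rewritten in `query`.
            credit_score_threshold = None
            query_lower = original_query.lower()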
            if "credit score over" in query_lower or "credit score above" in query_lower:
                match = re.search(r'credit score (?:over|above) (\d+)', query_lower)
                if match:
                    credit_score_threshold = int(match.group(1))
                    print(f"Found explicit credit score threshold: {credit_score_threshold}")
            # Also set a threshold for keyword-based queries
            elif any(term in query_lower for term in ["high credit", "good credit", "excellent credit"]):
                credit_score_threshold = 700
                print(f"Setting default high credit score threshold: {credit_score_threshold}")

            # Check for income threshold in filtering
            income_threshold = None
            if "income over" in query_lower or "income above" in query_lower:
                match = re.search(r'income (?:over|above) \$?(\d+)', query_lower)
                if match:
                    income_threshold = int(match.group(1))
                    print(f"Found explicit income threshold: ${income_threshold}")
            # Also set a threshold for keyword-based queries
            elif any(term in query_lower for term in ["high income", "wealthy", "affluent", "rich"]):
                income_threshold = 75000
                print(f"Setting default high income threshold: ${income_threshold}")

            # Apply the boost to the top candidates
            top_candidates = df_to_search.iloc[top_indices]
            sim_boost = np.zeros(len(top_indices))

            # Adjust similarity scores based on numerical criteria
            if credit_score_threshold is not None:
                credit_scores = top_candidates['credit_score'].values
                credit_boost = np.clip((credit_scores - credit_score_threshold) / 100, 0, 0.3)
                sim_boost += credit_boost
                print(f"Applying credit score boost up to 0.3 for scores above {credit_score_threshold}")

            if income_threshold is not None:
                incomes = top_candidates['income'].values
                income_boost = np.clip((incomes - income_threshold) / 50000, 0, 0.3)
                sim_boost += income_boost
                print(f"Applying income boost up to 0.3 for income above ${income_threshold}")

            # Apply the boost to the similarities for the top indices
            similarities[top_indices] += sim_boost
            
            # Analyze query for potential sorting criteria
            rerank = False
            sort_field = None
            sort_ascending = False
            
            # Simple keyword detection - much more generalizable than specific conditions
            if any(term in query_lower for term in ["high credit", "good credit", "excellent credit"]):
                sort_field = "credit_score"
                sort_ascending = False
                rerank = True
            elif any(term in query_lower for term in ["low credit", "poor credit", "bad credit"]):
                sort_field = "credit_score"
                sort_ascending = True
                rerank = True
            elif any(term in query_lower for term in ["high income", "wealthy", "affluent", "rich"]):
                sort_field = "income"
                sort_ascending = False
                rerank = True
            elif any(term in query_lower for term in ["low income", "budget"]):
                sort_field = "income"
                sort_ascending = True
                rerank = True
            
            # Apply re-ranking if needed
            if rerank and sort_field:
                # Create a DataFrame with original indices and similarities
                candidates = pd.DataFrame({
                    'original_idx': top_indices,
                    'similarity': similarities[top_indices]
                })
                
                # Add the sort field values
                candidates[sort_field] = df_to_search.iloc[top_indices][sort_field].values
                
                # Sort by the attribute first, then by similarity
                candidates = candidates.sort_values(
                    by=[sort_field, 'similarity'], 
                    ascending=[sort_ascending, False]
                )
                
                # Extract the original indices after sorting
                top_indices = candidates['original_idx'].values[:top_n]
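            else:
                # Re-sort the candidates by boosted similarity, then keep the top matches
                boosted_order = np.argsort(similarities[top_indices])[::-1]
                top_indices = top_indices[boosted_order][:top_n]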
            
            # Format results
            predictions = []
            for idx in top_indices:
                customer = df_to_search.iloc[idx]
                
                # Build a comma-separated summary of the customer's key attributes
                info = f"Customer ID: {customer['customer_id']}, "
                info += f"Age: {customer['age']}, "
                info += f"Income: ${customer['income']:,.2f}, "  # thousands separator, two decimals
                info += f"Credit Score: {customer['credit_score']}, "
                info += f"Segment: {customer['segment']}, "
                info += f"Risk Profile: {customer['risk_profile']}"
                
                # Add to predictions
                result = {
                    'Customer': info,
                    'Similarity': float(similarities[idx])
                }
                predictions.append(result)
            
            return {"predictions": predictions}
            
        except Exception as e:
            print(f"Error during prediction: {e}")
            return {"predictions": []}
    
    @classmethod
    def log_model(cls, model_name, embeddings_path, banking_dataset_path, demo_dir=None):
        """
        Logs the model to MLflow with appropriate artifacts and schema.
        """
        # Check if the files exist
        for path, desc in [
            (embeddings_path, "Embeddings file"),
            (banking_dataset_path, "Banking dataset"),
            (MODEL_PATH, "Model directory")
        ]:
            if not os.path.exists(path):
                raise FileNotFoundError(f"{desc} not found: {path}")
        
        # Define input/output schemas
        input_schema = Schema([ColSpec("string", "query")])
        output_schema = Schema([
            TensorSpec(np.dtype("object"), (-1,), "predictions")
        ])
        params_schema = ParamSchema([
            ParamSpec("top_n", "integer", 5),
            ParamSpec("show_score", "boolean", True)
        ])
        signature = ModelSignature(inputs=input_schema, outputs=output_schema, params=params_schema)
        
        # Define requirements
        requirements = [
            "scikit-learn", "pandas", "numpy", "tabulate", 
            "torch", "transformers", "sentence-transformers"
        ]
        
        # Define artifacts - including model directory
        artifacts = {
            "embeddings_path": embeddings_path,
            "banking_dataset_path": banking_dataset_path,
            "model_dir": MODEL_PATH
        }
        
        # Add demo directory if provided
        if demo_dir and os.path.exists(demo_dir):
            artifacts["demo"] = demo_dir
            
        # Define metadata
        metadata = {}
        if demo_dir and os.path.exists(os.path.join(demo_dir, "index.html")):
            metadata["demo_template"] = "demo/index.html"
        
        # Log the model
        mlflow.pyfunc.log_model(
            model_name,
            python_model=cls(),
            artifacts=artifacts,
            signature=signature,
            pip_requirements=requirements,
            metadata=metadata
        )

## MLflow Model Deployment

The following function handles the MLflow experiment setup, model logging, and registration. 
MLflow is used to:
- Track experiments and model versions
- Package the model with its dependencies and artifacts
- Register the model in the Model Registry for deployment
- Store the UI components for interactive demos

In [3]:
# Log Model to MLflow function
def log_model_to_mlflow():
    """Log the banking similarity model to MLflow."""
    # Set the MLflow experiment name
    experiment_name = "Banking Customer Similarity"
    mlflow.set_experiment(experiment_name=experiment_name)
    
    # Make sure the demo directory exists (it holds the optional index.html UI template)
    demo_dir = "demo"
    index_html_path = os.path.join(demo_dir, "index.html")
    
    if not os.path.exists(index_html_path):
        os.makedirs(demo_dir, exist_ok=True)

    # Start an MLflow run
    with mlflow.start_run(run_name="Banking_Similarity_Run") as run:
        # Log the model
        model_name = "Banking_Customer_Similarity"
        BankingSimilarityModel.log_model(
            model_name=model_name,
            embeddings_path="data/customer_embeddings.npy",
            banking_dataset_path="data/banking_dataset.csv",
            demo_dir=demo_dir if os.path.exists(demo_dir) else None
        )

        # Register the model
        registered_model = mlflow.register_model(
            model_uri=f"runs:/{run.info.run_id}/{model_name}", 
            name=model_name
        )
        
        print(f"Registered model: {model_name}, version: {registered_model.version}")
        return run.info.run_id

## Inference Pipeline

The `find_similar_customers` function demonstrates how to load our deployed model and use it for inference. This represents what would happen in a production environment when the model is called through an API or application interface.


In [4]:
# Find Similar Customers function
def find_similar_customers(query, run_id=None, top_n=5):
    """Find similar banking customers for a given query."""
    # Determine model URI based on run_id
    if run_id:
        model_uri = f"runs:/{run_id}/Banking_Customer_Similarity"
    else:
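        client = MlflowClient()
        # Note: get_latest_versions depends on registry stages, which newer MLflow releases deprecate
        model_metadata = client.get_latest_versions("Banking_Customer_Similarity", stages=["None"])
        if not model_metadata:
            raise ValueError("No registered versions found for 'Banking_Customer_Similarity'")
        latest_model_version = model_metadata[0].version
        model_uri = f"models:/Banking_Customer_Similarity/{latest_model_version}"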
    
    # Load the model
    print(f"Loading model from: {model_uri}")
    model = mlflow.pyfunc.load_model(model_uri)
    
    # Run prediction
    try:
        result = model.predict({"query": [query]}, params={"top_n": top_n})
        
        # Extract predictions with proper error handling
        if result is None or "predictions" not in result:
            return pd.DataFrame(columns=["Customer", "Similarity"])
            
        predictions = result.get("predictions", [])
        return pd.DataFrame(predictions)
    except Exception as e:
        print(f"Error during inference: {e}")
        return pd.DataFrame(columns=["Customer", "Similarity"])


## Interactive Demo

The `run_demo` function shows a complete end-to-end demonstration, from model deployment to customer search. This simulates how a relationship manager would interact with the system in a real banking environment.

The demo covers:
1. Model deployment to MLflow
2. Running a sample query 
3. Displaying formatted results
4. Error handling for failed queries

## Testing Individual Components

The following cells demonstrate how to test individual components of the system for debugging or development purposes.
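
As one example of component-level testing, the threshold parsing that `predict` performs with regular expressions can be exercised on its own. The helper `parse_threshold` below is hypothetical: it mirrors the patterns used in the model class (`credit score over/above N`, `income over/above $N`) but is written here only as a test harness sketch.

```python
import re

def parse_threshold(query, field):
    """Return the integer after '<field> over/above', or None if absent."""
    pattern = rf"{re.escape(field)} (?:over|above) \$?(\d+)"
    match = re.search(pattern, query.lower())
    return int(match.group(1)) if match else None

# Typical phrasings are picked up regardless of capitalization
assert parse_threshold("Customers with credit score above 720", "credit score") == 720
assert parse_threshold("Income over $90000 and no mortgage", "income") == 90000

# Queries without an explicit threshold return None
assert parse_threshold("Young professionals", "income") is None

# Thousands separators stop the match early: a limitation of this pattern
assert parse_threshold("income over $90,000", "income") == 90
```

The last assertion documents a limitation rather than desired behavior: a query written as `$90,000` is read as 90, so a production version would need to strip separators before matching.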


In [5]:
# Run Demo function
def run_demo():
    """Run a complete end-to-end demo."""
    # Log model
    run_id = log_model_to_mlflow()
    
    if not run_id:
        print("Model logging failed.")
        return
    
    # Query
    query = "Find high-income customers with excellent credit scores interested in retirement planning"
    
    try:
        # Get similar customers
        similar_customers = find_similar_customers(query=query, run_id=run_id, top_n=5)
        
        # Display results
        print(f"\nQuery: {query}")
        print("\nTop similar banking customers:")
        print(tabulate(similar_customers, headers='keys', tablefmt='fancy_grid', showindex=False))
    
    except Exception as e:
        print(f"Error during inference: {e}")
        print("Displaying sample of the banking dataset instead:")
        banking_df = pd.read_csv("data/banking_dataset.csv", nrows=5)
        print(tabulate(banking_df, headers='keys', tablefmt='fancy_grid', showindex=False))

run_demo()


Registered model 'Banking_Customer_Similarity' already exists. Creating a new version of this model...
Created version '51' of model 'Banking_Customer_Similarity'.


Registered model: Banking_Customer_Similarity, version: 51
Loading model from: runs:/b196eb3c0a5c42cea62ffaaeb8ea2789/Banking_Customer_Similarity
Loaded embeddings shape: (1000, 384)
Loaded banking data shape: (1000, 30)
SentenceTransformer model loaded successfully from /phoenix/mlflow/470237435216347360/b196eb3c0a5c42cea62ffaaeb8ea2789/artifacts/Banking_Customer_Similarity/artifacts/sentence-transformer
Processing query: '['Find high-income customers with excellent credit scores interested in retirement planning']'
Setting default high credit score threshold: 700
Applying credit score boost up to 0.3 for scores above 700

Query: Find high-income customers with excellent credit scores interested in retirement planning

Top similar banking customers:
╒════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╤══════════════╕
│ Customer                                                                                               


## Next Steps and Extensions

This model could be extended in several ways:

1. **Feature Enhancement**:
   - Add support for filtering (e.g., only show customers from specific regions)
   - Incorporate transaction history for more nuanced matching
   - Include product ownership and interest data

2. **Performance Optimization**:
   - Use vector databases (like FAISS or Pinecone) for faster similarity search at scale
   - Implement batch processing for large customer bases

3. **User Experience**:
   - Develop a more advanced UI with result filtering and sorting
   - Add visualizations of customer segments
   - Implement feedback mechanisms to improve recommendations

4. **Integration**:
   - Connect with CRM systems for seamless workflow integration
   - Set up automated alerts when high-value opportunities are identified
   - Create scheduled reports based on specific queries
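
The batch-processing idea under Performance Optimization can be sketched with plain NumPy before adopting a vector database. This is a minimal sketch under stated assumptions: `top_k_batched` is a hypothetical helper, the embeddings are random placeholders rather than real customer data, and the search is still exact, so it bounds memory use per step but does not reduce total work the way an approximate index would.

```python
import numpy as np

def top_k_batched(query_vec, embeddings, k=5, batch_size=1000):
    """Exact top-k cosine search, scoring the embeddings one batch at a time."""
    q = query_vec / np.linalg.norm(query_vec)
    best_idx = np.empty(0, dtype=int)
    best_scores = np.empty(0)
    for start in range(0, len(embeddings), batch_size):
        batch = embeddings[start:start + batch_size]
        # Cosine similarity of every row in this batch (assumes no zero-length rows)
        scores = (batch @ q) / np.linalg.norm(batch, axis=1)
        # Merge this batch with the running winners, then keep the best k
        idx = np.concatenate([best_idx, np.arange(start, start + len(batch))])
        merged = np.concatenate([best_scores, scores])
        keep = np.argsort(merged)[::-1][:k]
        best_idx, best_scores = idx[keep], merged[keep]
    return best_idx, best_scores

# Random placeholder data with the same width as the notebook's embeddings
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2500, 384))
query = rng.normal(size=384)

idx, scores = top_k_batched(query, embeddings, k=5, batch_size=1000)
print(idx, scores)
```

Only one batch of scores is held in memory at a time, so the same loop could read batches from disk when the full embedding matrix does not fit in RAM.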

By deploying this system, banks can move from static, rule-based customer segmentation to dynamic, semantically rich customer discovery that adapts to relationship managers' specific needs.
