# Explainable Unified Recommendation Model End-to-End Demo

This notebook demonstrates KMR's ExplainableUnifiedRecommendationModel, which combines collaborative filtering and content-based approaches and provides per-component explanations. It covers:

- Data generation using KMR utilities
- Model creation and training with recommendation metrics
- Recommendation generation with component-wise explanations
- Evaluation of recommendations and explainability


In [1]:
# Imports used throughout the notebook
import numpy as np
import tensorflow as tf
import keras
from keras.optimizers import Adam

# NOTE: the KMR module paths below are assumed; adjust them to your installed KMR version
from kmr.models import UnifiedRecommendationModel
from kmr.losses import ImprovedMarginRankingLoss
from kmr.metrics import AccuracyAtK, PrecisionAtK, RecallAtK
from kmr.utils import KMRDataGenerator, KMRPlotter

print("✅ All imports successful!")
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")


✅ All imports successful!
TensorFlow version: 2.18.0
Keras version: 3.8.0


## 1. Generate Hybrid Recommendation Data

We'll use KMR's data generator to create synthetic user-item interactions with both collaborative and content-based features.


In [2]:
print("📦 Generating hybrid recommendation data...")

# Generate collaborative filtering data (user-item IDs)
user_ids, item_ids, ratings, user_features, item_features = KMRDataGenerator.generate_collaborative_filtering_data(
    n_users=1000,
    n_items=500,
    n_interactions=10000,
    random_state=42,
    rating_scale=(1, 5),
    sparsity=0.95
)

n_users = len(np.unique(user_ids))
n_items = len(np.unique(item_ids))
user_feature_dim = user_features.shape[1]
item_feature_dim = item_features.shape[1]

print(f"✅ Generated data:")
print(f"   - Users: {n_users}")
print(f"   - Items: {n_items}")
print(f"   - User features: {user_features.shape}")
print(f"   - Item features: {item_features.shape}")
print(f"   - Interactions: {len(user_ids)}")
print(f"   - Rating range: {ratings.min():.1f} - {ratings.max():.1f}")
print(f"   - Average rating: {ratings.mean():.2f}")

# Convert to binary interaction (for implicit feedback)
interactions = (ratings >= 3.0).astype(np.float32)

# Split into train/test
train_size = int(0.8 * len(user_ids))
train_user_ids = user_ids[:train_size]
train_item_ids = item_ids[:train_size]
train_interactions = interactions[:train_size]

test_user_ids = user_ids[train_size:]
test_item_ids = item_ids[train_size:]
test_interactions = interactions[train_size:]


📦 Generating hybrid recommendation data...
✅ Generated data:
   - Users: 1000
   - Items: 500
   - User features: (1000, 10)
   - Item features: (500, 8)
   - Interactions: 10000
   - Rating range: 1.0 - 5.0
   - Average rating: 2.99
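
The split above takes the first 80% of interactions in generated order. If the generator emits interactions in any systematic order (worth checking, since the notebook does not say), a shuffled split keeps that order from leaking into the train/test partition. The following is a minimal sketch with stand-in arrays; `shuffled_split` is a hypothetical helper, not part of KMR.

```python
import numpy as np

def shuffled_split(arrays, train_fraction=0.8, seed=42):
    """Split parallel 1-D arrays into train/test parts using one shared permutation."""
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same length")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)          # one permutation keeps rows aligned across arrays
    cut = int(train_fraction * n)
    train = [np.asarray(a)[perm[:cut]] for a in arrays]
    test = [np.asarray(a)[perm[cut:]] for a in arrays]
    return train, test

# Stand-in data shaped like the notebook's interaction arrays
demo_user_ids = np.arange(10) % 4
demo_item_ids = np.arange(10) * 3
demo_ratings = np.linspace(1.0, 5.0, 10)
demo_interactions = (demo_ratings >= 3.0).astype(np.float32)

(train_u, train_i, train_y), (test_u, test_i, test_y) = shuffled_split(
    [demo_user_ids, demo_item_ids, demo_interactions]
)
print(len(train_u), len(test_u))
```

Passing all arrays through one call keeps each user ID paired with its item ID and label after shuffling.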


## 2. Build Unified Recommendation Model

The unified model combines collaborative filtering and content-based approaches with learnable weights.
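
One way to read "learnable weights" is as a softmax-normalized pair of scalars that blends a collaborative-filtering score with a content-based score. The sketch below illustrates that general idea in NumPy with random stand-in vectors. It is an assumption about the technique, not KMR's layer implementation, and every name in it is hypothetical.

```python
import numpy as np

def cosine_scores(user_vec, item_mat):
    """Cosine similarity between one user vector and each row of item_mat."""
    user_norm = user_vec / (np.linalg.norm(user_vec) + 1e-8)
    item_norm = item_mat / (np.linalg.norm(item_mat, axis=1, keepdims=True) + 1e-8)
    return item_norm @ user_norm

def hybrid_scores(cf_user, cf_items, cb_user, cb_items, weight_logits):
    """Blend CF and CB similarities with softmax-normalized weights."""
    cf = cosine_scores(cf_user, cf_items)   # similarity in the ID-embedding space
    cb = cosine_scores(cb_user, cb_items)   # similarity in the feature-tower space
    w = np.exp(weight_logits - np.max(weight_logits))
    w = w / w.sum()
    return w[0] * cf + w[1] * cb, w

rng = np.random.default_rng(0)
n_items_demo, dim = 6, 4
scores, weights = hybrid_scores(
    cf_user=rng.normal(size=dim),
    cf_items=rng.normal(size=(n_items_demo, dim)),
    cb_user=rng.normal(size=dim),
    cb_items=rng.normal(size=(n_items_demo, dim)),
    weight_logits=np.array([0.0, 0.0]),  # equal weights before any training
)
top_k = np.argsort(-scores)[:3]
print("weights:", weights, "top-3 items:", top_k)
```

In a trained model the weight logits would be parameters updated by the optimizer, so the blend can shift toward whichever signal ranks positives better.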


In [3]:
# Create model
model = UnifiedRecommendationModel(
    num_users=n_users,
    num_items=n_items,
    embedding_dim=64,
    user_feature_dim=user_feature_dim,
    item_feature_dim=item_feature_dim,
    tower_dim=64,
    top_k=10,
    l2_reg=0.01
)

# Create recommendation metrics
acc_at_5 = AccuracyAtK(k=5, name="acc@5")
acc_at_10 = AccuracyAtK(k=10, name="acc@10")
prec_at_5 = PrecisionAtK(k=5, name="prec@5")
prec_at_10 = PrecisionAtK(k=10, name="prec@10")
recall_at_5 = RecallAtK(k=5, name="recall@5")
recall_at_10 = RecallAtK(k=10, name="recall@10")

# Compile model with custom ranking loss and metrics
# Model returns tuple: (combined_scores, rec_indices, rec_scores)
# Use list mapping: first element has loss/metrics, others are None
model.compile(
    optimizer=Adam(learning_rate=0.001),
    loss=[
        ImprovedMarginRankingLoss(margin=1.0, max_min_weight=0.6, avg_weight=0.4),  # For combined_scores
        None,  # For rec_indices
        None   # For rec_scores
    ],
    metrics=[
        [acc_at_5, acc_at_10, prec_at_5, prec_at_10, recall_at_5, recall_at_10],  # For combined_scores
        None,  # For rec_indices
        None   # For rec_scores
    ]
)

print("✅ Model created and compiled!")
print(f"   - Users: {model.num_users}")
print(f"   - Items: {model.num_items}")
print(f"   - Embedding dim: {model.embedding_dim}")
print(f"   - Tower dim: {model.tower_dim}")
print(f"   - Top-K: {model.top_k}")
print(f"   - Metrics: Accuracy@5, Accuracy@10, Precision@5, Precision@10, Recall@5, Recall@10")


2025-11-06 16:44:32.881 | DEBUG    | kmr.layers._base_layer:_log_initialization:73 - Initialized CollaborativeUserItemEmbedding with parameters: {'name': 'collaborative_user_item_embedding', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'num_users': 1000, 'num_items': 500, 'embedding_dim': 64, 'l2_reg': 0.01}
2025-11-06 16:44:32.882 | DEBUG    | kmr.layers._base_layer:_log_initialization:73 - Initialized DeepFeatureTower with parameters: {'name': 'user_tower', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'float32'}, 'registered_name': None}, 'units': 64, 'hidden_layers': 2, 'dropout_rate': 0.2, 'l2_reg': 0.01, 'activation': 'relu'}
2025-11-06 16:44:32.883 | DEBUG    | kmr.layers._base_layer:_

✅ Model created and compiled!
   - Users: 1000
   - Items: 500
   - Embedding dim: 64
   - Tower dim: 64
   - Top-K: 10
   - Metrics: Accuracy@5, Accuracy@10, Precision@5, Precision@10, Recall@5, Recall@10
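
The compile step passes `margin`, `max_min_weight` and `avg_weight` to `ImprovedMarginRankingLoss`, but the loss formula itself is not shown in this notebook. One plausible reading of those parameters is a weighted sum of two hinge terms: one on the hardest positive/negative pair and one on the average separation. The sketch below implements that reading for a single user. It is an illustrative assumption, not KMR's implementation.

```python
import numpy as np

def margin_ranking_loss(y_true, y_pred, margin=1.0, max_min_weight=0.6, avg_weight=0.4):
    """Illustrative margin ranking loss for one user (NOT KMR's implementation)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=float)
    pos, neg = y_pred[y_true], y_pred[~y_true]
    if pos.size == 0 or neg.size == 0:
        return 0.0  # no ranking signal without both classes
    # Hardest pair: weakest positive vs. strongest negative
    max_min_term = max(0.0, margin - (pos.min() - neg.max()))
    # Average separation between the two groups
    avg_term = max(0.0, margin - (pos.mean() - neg.mean()))
    return float(max_min_weight * max_min_term + avg_weight * avg_term)

labels = [1, 0, 0, 1, 0]
well_ranked = [2.5, 0.1, 0.0, 2.0, 0.3]   # positives clear every negative by more than the margin
badly_ranked = [0.1, 0.9, 0.8, 0.2, 0.7]  # negatives score above positives
loss_good = margin_ranking_loss(labels, well_ranked)
loss_bad = margin_ranking_loss(labels, badly_ranked)
print(loss_good, loss_bad)
```

Under this reading the loss reaches zero only when every positive outscores every negative by at least the margin, which is why it keeps pushing on the ranking even after the average separation looks good.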


## 3. Train Model
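
Training uses one row per user: a dense label vector over all items, with 1.0 at the items that user interacted with. The cell below builds these rows with a Python loop. The same matrix can also be built with a single indexed assignment, as in this sketch with stand-in IDs (`build_label_matrix` is a hypothetical helper, not a KMR function).

```python
import numpy as np

def build_label_matrix(user_ids, item_ids, users, n_items):
    """Return a (len(users), n_items) float32 matrix with 1.0 at observed (user, item) pairs."""
    user_ids = np.asarray(user_ids)
    item_ids = np.asarray(item_ids)
    users = np.asarray(users)
    # Map each selected user ID to its row index; unselected users map to -1
    row_of_user = np.full(int(max(user_ids.max(), users.max())) + 1, -1, dtype=np.int64)
    row_of_user[users] = np.arange(len(users))
    rows = row_of_user[user_ids]
    # Keep only interactions of selected users with in-range item IDs
    keep = (rows >= 0) & (item_ids >= 0) & (item_ids < n_items)
    labels = np.zeros((len(users), n_items), dtype=np.float32)
    labels[rows[keep], item_ids[keep]] = 1.0
    return labels

demo_users = np.array([0, 2])                  # subset of users to train on
demo_user_ids = np.array([0, 0, 1, 2, 2, 2])
demo_item_ids = np.array([1, 3, 0, 4, 4, 9])   # item 9 is out of range and dropped
labels = build_label_matrix(demo_user_ids, demo_item_ids, demo_users, n_items=5)
print(labels)
```

Duplicate interactions collapse to a single 1.0, matching the set-based logic of the loop version.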


In [4]:
print("🚀 Training Model")
print("=" * 60)
print("Using model.fit() with built-in ranking loss")
print("=" * 60)
print("The model combines CF and CB approaches with learnable weights!")
print("Just prepare data and call model.fit() - no custom training loop needed.\n")

# Prepare data for keras.fit() format
# For each user, provide all items and binary labels
unique_users = np.unique(train_user_ids)[:50]  # Use subset for demo
# Filter to only valid user IDs (within range of user_features)
unique_users = unique_users[unique_users < len(user_features)]
batch_size = 8

# Create training data: for each user, provide all items and binary labels
train_x_user_ids = []
train_x_user_features = []
train_x_item_ids = []
train_x_item_features = []
train_y = []

for user_id in unique_users:
    # Get user's features
    user_feat = user_features[user_id]
    
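    # Get every item this user interacted with in the training split.
    # Note: all of them are labeled positive; train_interactions (rating >= 3) is not applied here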
    user_item_ids = train_item_ids[train_user_ids == user_id]
    positive_set = set(user_item_ids[user_item_ids < n_items])  # Filter valid items
    
    # Create label vector: 1 for positive items, 0 for others
    labels = np.zeros(n_items, dtype=np.float32)
    labels[list(positive_set)] = 1.0
    
    # Prepare item features: all items for this user
    item_feats = item_features[:n_items]  # (n_items, item_feature_dim)
    item_ids_all = np.arange(n_items, dtype=np.int32)
    
    train_x_user_ids.append(user_id)
    train_x_user_features.append(user_feat)
    train_x_item_ids.append(item_ids_all)
    train_x_item_features.append(item_feats)
    train_y.append(labels)

train_x_user_ids = np.array(train_x_user_ids, dtype=np.int32)
train_x_user_features = np.array(train_x_user_features, dtype=np.float32)
train_x_item_ids = np.array(train_x_item_ids, dtype=np.int32)
train_x_item_features = np.array(train_x_item_features, dtype=np.float32)
train_y = np.array(train_y, dtype=np.float32)

print(f"Prepared training data: {len(train_x_user_ids)} users")
print(f"  - User IDs shape: {train_x_user_ids.shape}")
print(f"  - User features shape: {train_x_user_features.shape}")
print(f"  - Item IDs shape: {train_x_item_ids.shape}")
print(f"  - Item features shape: {train_x_item_features.shape}")
print(f"  - Labels shape: {train_y.shape}")
print(f"  - Positive items per user: {train_y.sum(axis=1).mean():.1f} on average\n")

# Build model by calling it once with sample data
# This ensures all layers are initialized before training
_ = model.predict([tf.constant(train_x_user_ids[:1]), tf.constant(train_x_user_features[:1]), 
           tf.constant(train_x_item_ids[:1]), tf.constant(train_x_item_features[:1])], verbose=0)

print("Training with model.fit()...")
print("Note: Metrics may start at 0.0 with random initial embeddings and many items (500).")
print("      This is expected - metrics will improve as the model learns to rank positive items higher.")
print("      With 500 items and ~8 positives per user, it takes time for the model to learn.")
print("      Watch the loss decrease and metrics gradually increase over epochs.\n")

history = model.fit(
    x=[train_x_user_ids, train_x_user_features, train_x_item_ids, train_x_item_features],
    y=train_y,
    epochs=30,  # More epochs needed for large item space (500 items)
    batch_size=batch_size,
    verbose=1
)

print("\n✅ Training completed!")
print(f"Final loss: {history.history['loss'][-1]:.4f}")

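# Display recommendation metrics
# Keras may prefix metric names with the output name (the training log shows
# "combined_scores_acc@5"), so check both the bare and the prefixed key
def final_metric(name):
    for key in (name, f"combined_scores_{name}"):
        if key in history.history:
            return history.history[key][-1]
    return None

if final_metric('acc@5') is not None:
    print("\n📊 Recommendation Metrics:")
    print(f"   - Accuracy@5:  {final_metric('acc@5'):.4f}")
    print(f"   - Accuracy@10: {final_metric('acc@10'):.4f}")
    print(f"   - Precision@5:  {final_metric('prec@5'):.4f}")
    print(f"   - Precision@10: {final_metric('prec@10'):.4f}")
    print(f"   - Recall@5:  {final_metric('recall@5'):.4f}")
    print(f"   - Recall@10: {final_metric('recall@10'):.4f}")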

print("\nNote: The model uses margin ranking loss internally.")
print("      Positive items are encouraged to rank higher than negative items.")
print("      The unified model combines CF and CB approaches with learned weights.")

🚀 Training Model
Using model.fit() with built-in ranking loss
The model combines CF and CB approaches with learnable weights!
Just prepare data and call model.fit() - no custom training loop needed.

Prepared training data: 50 users
  - User IDs shape: (50,)
  - User features shape: (50, 10)
  - Item IDs shape: (50, 500)
  - Item features shape: (50, 500, 8)
  - Labels shape: (50, 500)
  - Positive items per user: 8.0 on average

Training with model.fit()...
Note: Metrics may start at 0.0 with random initial embeddings and many items (500).
      This is expected - metrics will improve as the model learns to rank positive items higher.
      With 500 items and ~8 positives per user, it takes time for the model to learn.
      Watch the loss decrease and metrics gradually increase over epochs.

Epoch 1/30
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - combined_scores_acc@10: 0.1308 - combined_scores_acc@5: 0.0768 - combined_scores_prec@10: 0.0131 - combined_scores_prec@5: 0.0154 - combined_scores_recall@10: 0.0210 - combined_scores_recall@5: 0.0130 - loss: 3.0857
Epoch 2/30
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - combined_scores_acc@10: 0.3837 - combined_scores_acc@5: 0.2704 - combined_scores_prec@10: 0.0469 - combined_scores_prec@5: 0.0570 - combined_scores_recall@10: 0.0666 - combined_scores_recall@5: 0.0401 - loss: 2.7045
Epoch 3/30
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - combined_scores_acc@10: 0.7367 - combined_scores_acc@5: 0.5542 - combined_scores_prec@10: 0.0947 - combined_scores_prec@5: 0.1294 - combined_scores_recall@10: 0.1338 - combined_scores_recall@5: 0.0964 - loss: 2.3968
Epoch 4/30
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - co

## 4. Generate Recommendations and Visualize


In [5]:
# Generate recommendations for multiple users to check diversity
print("🔍 Checking recommendation diversity across users...")
n_sample_users = min(10, len(train_x_user_ids))
sample_user_indices = np.arange(n_sample_users)

# Get recommendations for all sample users
all_rec_indices = []
all_rec_scores = []

for i in range(n_sample_users):
    user_idx = sample_user_indices[i]
    sample_user_id = tf.constant([train_x_user_ids[user_idx]])
    sample_user_feat = tf.constant([train_x_user_features[user_idx]])
    sample_item_ids = tf.constant([train_x_item_ids[user_idx]])
    sample_item_feats = tf.constant([train_x_item_features[user_idx]])
    
    # Model returns a tuple: (combined_scores, rec_indices, rec_scores)
    combined_scores, rec_indices, rec_scores = model.predict(
        [sample_user_id, sample_user_feat, sample_item_ids, sample_item_feats], verbose=0
    )

    # predict() returns NumPy arrays; np.asarray also covers the tensor case
    rec_indices_np = np.asarray(rec_indices[0])
    rec_scores_np = np.asarray(rec_scores[0])
    
    all_rec_indices.append(rec_indices_np)
    all_rec_scores.append(rec_scores_np)

all_rec_indices = np.array(all_rec_indices)

# Check diversity
print(f"\n📊 Recommendation Diversity Analysis:")
print(f"   Checking {n_sample_users} users...")
unique_items_per_user = [len(np.unique(rec)) for rec in all_rec_indices]
shared_items = len(set(all_rec_indices[0]).intersection(*[set(rec) for rec in all_rec_indices[1:]]))
diversity_ratio = 1.0 - (shared_items / model.top_k) if model.top_k > 0 else 0.0
print(f"   Shared items across all users: {shared_items}/{model.top_k}")
print(f"   Diversity ratio: {diversity_ratio:.2%}")
print(f"   Average unique items per user: {np.mean(unique_items_per_user):.1f}")

if shared_items == model.top_k:
    print(f"\n⚠️  WARNING: All users receive the same recommendations!")
    print(f"   This suggests the model may not be learning user-specific preferences.")
else:
    print(f"\n✅ Recommendations are diverse across users - model is working correctly!")

# Visualize recommendation diversity
print("\n📊 Visualizing recommendation diversity...")
fig_diversity = KMRPlotter.plot_recommendation_diversity(
    all_rec_indices,
    user_ids=sample_user_indices,
    title="Recommendation Diversity Across Sample Users"
)
fig_diversity.show()

# Show detailed example for first user
print(f"\n📋 Detailed example for user {sample_user_indices[0]}:")
print(f"   Top-{model.top_k} recommended items: {all_rec_indices[0]}")
print(f"   Recommendation scores: {all_rec_scores[0]}")

# Visualize recommendation scores for first user
print("\n📊 Visualizing recommendation scores for sample user...")
fig_scores = KMRPlotter.plot_recommendation_scores(
    all_rec_scores[0],
    top_k=model.top_k,
    title=f"Recommendation Scores for User {sample_user_indices[0]}"
)
fig_scores.show()

🔍 Checking recommendation diversity across users...

📊 Recommendation Diversity Analysis:
   Checking 10 users...
   Shared items across all users: 0/10
   Diversity ratio: 100.00%
   Average unique items per user: 10.0

✅ Recommendations are diverse across users - model is working correctly!

📊 Visualizing recommendation diversity...



📋 Detailed example for user 0:
   Top-10 recommended items: [ 88 495 123   6 102 117 483 467 403 284]
   Recommendation scores: [0.78748107 0.78548247 0.7783553  0.76914954 0.7389859  0.7254062
 0.61647403 0.6114017  0.606956   0.5886569 ]

📊 Visualizing recommendation scores for sample user...
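
The diversity check above counts only items shared by all sampled users, so it drops to 0 as soon as a single user differs from the rest. A stricter view is the mean pairwise Jaccard overlap between users' top-K lists. The following pure-Python sketch uses made-up recommendation lists, and `mean_pairwise_jaccard` is a hypothetical helper rather than a KMR function.

```python
from itertools import combinations

def mean_pairwise_jaccard(rec_lists):
    """Average Jaccard overlap over all user pairs (0 = disjoint, 1 = identical)."""
    sets = [set(rec) for rec in rec_lists]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0  # fewer than two users: nothing to compare
    overlaps = [len(a & b) / len(a | b) if (a | b) else 0.0 for a, b in pairs]
    return sum(overlaps) / len(overlaps)

demo_recs = [
    [88, 495, 123],
    [88, 495, 6],      # shares two items with the first user
    [102, 117, 483],   # disjoint from both
]
overlap = mean_pairwise_jaccard(demo_recs)
print(f"mean pairwise Jaccard: {overlap:.3f}")
```

A value near 0 means users mostly receive different items, while a value near 1 means the lists are nearly identical even if no single item is shared by everyone.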


## 5. Comprehensive Model Diagnostics

Use the one-stop diagnostic report to verify model learning:


In [6]:
# Generate comprehensive diagnostic report
print("📊 Generating comprehensive diagnostic report...\n")

report = KMRPlotter.create_recommendation_diagnostic_report(
    model=model,
    history=history,
    user_features=train_x_user_features,
    item_features=train_x_item_features,
    train_y=train_y,
    n_sample_users=10,
)

print("✅ Report generated successfully!\n")


📊 Generating comprehensive diagnostic report...

✅ Report generated successfully!



## 6. Display Diagnostic Visualizations


In [7]:
# Display all diagnostic plots
print("📈 Displaying diagnostic visualizations...\n")

# 1. Training history
report['figures']['training_history'].show()

# 2. Similarity distribution
report['figures']['similarity_distribution'].show()

# 3. Top-K scores
report['figures']['topk_scores'].show()

# 4. Prediction confidence
report['figures']['prediction_confidence'].show()

# 5. Embedding space (skip if None)
if report['figures']['embedding_space'] is not None:
    report['figures']['embedding_space'].show()
else:
    print("⚠️  Embedding space visualization not available for this model")

# 6. Recommendation diversity
report['figures']['recommendation_diversity'].show()

print("✅ All diagnostic visualizations displayed!")


📈 Displaying diagnostic visualizations...



✅ All diagnostic visualizations displayed!


## Summary

The Unified Recommendation Model successfully combines collaborative filtering and content-based approaches:

- **Collaborative Filtering**: Learns from user-item interaction history
- **Content-Based**: Uses user and item feature representations
- **Hybrid Approach**: Learns optimal weights to combine both signals

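Key observations:
- Training loss decreased from 3.09 to 2.40 over the first three epochs, indicating the model is learning
- Training metrics improved over the same epochs (for example, Accuracy@10 rose from 0.13 to 0.74); these values are computed on the training data
- The 10 sampled users shared none of their top-10 items, which suggests user-specific recommendations
- Diagnostic visualizations reveal model behavior and learning patterns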
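
The metrics reported during training are computed on the training data. To check generalization, the same kind of metric can be computed against held-out positives, such as those in the test split created in section 1. The sketch below uses the common textbook definitions of precision@k, recall@k and hit rate for a single user. KMR's `PrecisionAtK`, `RecallAtK` and `AccuracyAtK` may differ in detail, and `ranking_metrics_at_k` is a hypothetical helper.

```python
import numpy as np

def ranking_metrics_at_k(scores, relevant_items, k):
    """Precision@k, recall@k and hit (0/1) for one user's score vector."""
    scores = np.asarray(scores, dtype=float)
    top_k = np.argsort(-scores, kind="stable")[:k]   # highest scores first
    relevant = set(int(i) for i in relevant_items)
    hits = sum(1 for item in top_k if int(item) in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall, float(hits > 0)

demo_scores = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2]   # items 0, 2 and 4 rank highest
held_out_positives = [2, 5]
p, r, hit = ranking_metrics_at_k(demo_scores, held_out_positives, k=3)
print(f"precision@3={p:.3f} recall@3={r:.3f} hit@3={hit:.0f}")
```

Averaging these per-user values over all test users gives dataset-level figures that can be compared with the training metrics.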
