# MCP Cart Analysis & Cart Data Server Integration

This notebook demonstrates the integration between a Cart Data Server (running on port 3001) and a Model Context Protocol (MCP) server with a cart analysis tool. We'll:

1. Set up a simple data server to provide customer cart data
2. Analyze the cart data using various techniques
3. Visualize the results

Let's start by importing the necessary libraries.

In [None]:
# Import required libraries
import json
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import requests
import threading
import time
from flask import Flask, jsonify, request
from IPython.display import display, HTML

# Set plot style
plt.style.use('ggplot')
sns.set(style="whitegrid")
%matplotlib inline

## 1. Setting Up Data Server on Port 3001

First, we'll create a simple Flask server that runs on port 3001 and serves our customer cart data. We'll run this server in a separate thread so that we can continue to use the notebook while the server is running.

In [None]:
# Create a Flask application
app = Flask(__name__)

# Define the customer data
customers = [
  {
    "id": "cust01",
    "name": "Alice",
    "cart": [
      {"id": "prod1", "name": "Laptop", "price": 1200, "quantity": 1, "category": "electronics"},
      {"id": "prod2", "name": "Mouse", "price": 25, "quantity": 2, "category": "electronics"}
    ]
  },
  {
    "id": "cust02",
    "name": "Bob",
    "cart": [
      {"id": "prod3", "name": "T-Shirt", "price": 15, "quantity": 3, "category": "fashion"},
      {"id": "prod4", "name": "Jeans", "price": 45, "quantity": 1, "category": "fashion"}
    ]
  },
  {
    "id": "cust03",
    "name": "Charlie",
    "cart": [
      {"id": "prod5", "name": "Coffee Maker", "price": 85, "quantity": 1, "category": "home"},
      {"id": "prod6", "name": "Coffee Beans", "price": 12, "quantity": 5, "category": "grocery"}
    ]
  },
  {
    "id": "cust04",
    "name": "Diana",
    "cart": [
      {"id": "prod7", "name": "Headphones", "price": 150, "quantity": 1, "category": "electronics"},
      {"id": "prod8", "name": "Power Bank", "price": 40, "quantity": 1, "category": "electronics"}
    ]
  },
  {
    "id": "cust05",
    "name": "Ethan",
    "cart": [
      {"id": "prod9", "name": "Sneakers", "price": 70, "quantity": 2, "category": "fashion"},
      {"id": "prod10", "name": "Socks", "price": 5, "quantity": 6, "category": "fashion"}
    ]
  },
  {
    "id": "cust06",
    "name": "Fiona",
    "cart": [
      {"id": "prod11", "name": "Microwave", "price": 200, "quantity": 1, "category": "home"},
      {"id": "prod12", "name": "Dinner Plates", "price": 30, "quantity": 1, "category": "home"}
    ]
  },
  {
    "id": "cust07",
    "name": "George",
    "cart": [
      {"id": "prod13", "name": "Smartphone", "price": 900, "quantity": 1, "category": "electronics"},
      {"id": "prod14", "name": "Phone Case", "price": 20, "quantity": 2, "category": "electronics"}
    ]
  },
  {
    "id": "cust08",
    "name": "Hannah",
    "cart": [
      {"id": "prod15", "name": "Notebook", "price": 3, "quantity": 10, "category": "stationery"},
      {"id": "prod16", "name": "Pen", "price": 1.5, "quantity": 20, "category": "stationery"}
    ]
  },
  {
    "id": "cust09",
    "name": "Ian",
    "cart": [
      {"id": "prod17", "name": "Backpack", "price": 60, "quantity": 1, "category": "fashion"},
      {"id": "prod18", "name": "Water Bottle", "price": 12, "quantity": 2, "category": "accessories"}
    ]
  },
  {
    "id": "cust10",
    "name": "Julia",
    "cart": [
      {"id": "prod19", "name": "Blender", "price": 100, "quantity": 1, "category": "home"},
      {"id": "prod20", "name": "Smoothie Cup", "price": 8, "quantity": 3, "category": "home"}
    ]
  }
]

# Add more customers
more_customers = [
  {
    "id": "cust11",
    "name": "Kevin",
    "cart": [
      {"id": "prod21", "name": "Gaming Console", "price": 450, "quantity": 1, "category": "electronics"},
      {"id": "prod22", "name": "Extra Controller", "price": 60, "quantity": 1, "category": "electronics"}
    ]
  },
  {
    "id": "cust12",
    "name": "Laura",
    "cart": [
      {"id": "prod23", "name": "Dress", "price": 80, "quantity": 1, "category": "fashion"},
      {"id": "prod24", "name": "Scarf", "price": 25, "quantity": 1, "category": "fashion"}
    ]
  },
  {
    "id": "cust13",
    "name": "Mark",
    "cart": [
      {"id": "prod25", "name": "Drill Machine", "price": 120, "quantity": 1, "category": "tools"},
      {"id": "prod26", "name": "Safety Glasses", "price": 15, "quantity": 1, "category": "tools"}
    ]
  },
  {
    "id": "cust14",
    "name": "Nina",
    "cart": [
      {"id": "prod27", "name": "Skincare Cream", "price": 35, "quantity": 2, "category": "beauty"},
      {"id": "prod28", "name": "Lipstick", "price": 20, "quantity": 1, "category": "beauty"}
    ]
  },
  {
    "id": "cust15",
    "name": "Oscar",
    "cart": [
      {"id": "prod29", "name": "Office Chair", "price": 180, "quantity": 1, "category": "furniture"},
      {"id": "prod30", "name": "Desk Lamp", "price": 40, "quantity": 1, "category": "furniture"}
    ]
  },
  {
    "id": "cust16",
    "name": "Paula",
    "cart": [
      {"id": "prod31", "name": "Cookware Set", "price": 150, "quantity": 1, "category": "home"},
      {"id": "prod32", "name": "Cutting Board", "price": 20, "quantity": 1, "category": "home"}
    ]
  },
  {
    "id": "cust17",
    "name": "Quentin",
    "cart": [
      {"id": "prod33", "name": "Running Shoes", "price": 90, "quantity": 1, "category": "fashion"},
      {"id": "prod34", "name": "Sports Watch", "price": 130, "quantity": 1, "category": "fashion"}
    ]
  },
  {
    "id": "cust18",
    "name": "Rachel",
    "cart": [
      {"id": "prod35", "name": "Air Fryer", "price": 160, "quantity": 1, "category": "home"},
      {"id": "prod36", "name": "Cooking Oil Spray", "price": 10, "quantity": 2, "category": "grocery"}
    ]
  },
  {
    "id": "cust19",
    "name": "Sam",
    "cart": [
      {"id": "prod37", "name": "Yoga Mat", "price": 45, "quantity": 1, "category": "fitness"},
      {"id": "prod38", "name": "Dumbbells", "price": 70, "quantity": 2, "category": "fitness"}
    ]
  },
  {
    "id": "cust20",
    "name": "Tina",
    "cart": [
      {"id": "prod39", "name": "Baby Stroller", "price": 250, "quantity": 1, "category": "baby"},
      {"id": "prod40", "name": "Baby Bottle", "price": 15, "quantity": 4, "category": "baby"}
    ]
  }
]

# Combine all customers
customers.extend(more_customers)

# Define API routes
@app.route('/api/customers', methods=['GET'])
def get_customers():
    return jsonify(customers)

@app.route('/api/customers/<customer_id>', methods=['GET'])
def get_customer(customer_id):
    customer = next((c for c in customers if c['id'] == customer_id), None)
    if customer:
        return jsonify(customer)
    return jsonify({"error": "Customer not found"}), 404

@app.route('/api/customers/<customer_id>/cart', methods=['GET'])
def get_customer_cart(customer_id):
    customer = next((c for c in customers if c['id'] == customer_id), None)
    if customer:
        return jsonify(customer['cart'])
    return jsonify({"error": "Customer not found"}), 404

@app.route('/health', methods=['GET'])
def health_check():
    return jsonify({"status": "ok"})

# Function to run the Flask app in a separate thread
def run_flask_app():
    app.run(host='localhost', port=3001, debug=False, use_reloader=False)

# Create and start the server thread
server_thread = threading.Thread(target=run_flask_app)
server_thread.daemon = True  # Thread will exit when the main program exits
server_thread.start()

# Wait a moment for the server to start
time.sleep(2)
print("Cart Data Server is running on http://localhost:3001")
print("API endpoints available:")
print("- http://localhost:3001/api/customers")
print("- http://localhost:3001/api/customers/<customer_id>")
print("- http://localhost:3001/api/customers/<customer_id>/cart")
print("- http://localhost:3001/health")
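
The fixed `time.sleep(2)` above assumes the server is ready within two seconds. A more robust pattern is to poll the health endpoint until it answers. The following is a minimal sketch using only the standard library; the helper name `wait_for_server` and its `probe` parameter are hypothetical and not part of the notebook.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, timeout=10.0, interval=0.2, probe=None):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse.

    `probe` is a hypothetical hook (a callable taking the URL and returning
    True/False) so the polling loop can be exercised without a real server.
    """
    def default_probe(target):
        try:
            with urllib.request.urlopen(target, timeout=2.0) as response:
                return response.status == 200
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or an HTTP error: not ready yet
            return False

    check = probe or default_probe
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check(url):
            return True
        time.sleep(interval)
    return False
```

In this notebook it could be called as `wait_for_server('http://localhost:3001/health')` right after `server_thread.start()`, printing the endpoint list only when it returns `True`.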

## 2. Testing the Cart Data Server

Let's verify that our server is working by making a few requests to it.

In [None]:
# Test the health endpoint
health_response = requests.get('http://localhost:3001/health')
print("Health check response:", health_response.json())

# Test getting all customers
customers_response = requests.get('http://localhost:3001/api/customers')
print(f"Retrieved {len(customers_response.json())} customers")

# Test getting a specific customer
customer_response = requests.get('http://localhost:3001/api/customers/cust01')
print("\nCustomer details:")
print(json.dumps(customer_response.json(), indent=2))

# Test getting a specific customer's cart
cart_response = requests.get('http://localhost:3001/api/customers/cust01/cart')
print("\nCustomer cart:")
print(json.dumps(cart_response.json(), indent=2))

## 3. Fetching and Processing All Cart Data

Now that we have confirmed our server is working, let's fetch all the customer data and transform it into DataFrames for analysis.

In [None]:
# Fetch all customer data
all_customers = requests.get('http://localhost:3001/api/customers').json()

# Create a DataFrame for customer information
customers_df = pd.DataFrame([
    {
        'customer_id': customer['id'],
        'customer_name': customer['name'],
        'cart_items': len(customer['cart']),
        'total_quantity': sum(item['quantity'] for item in customer['cart']),
        'cart_value': sum(item['price'] * item['quantity'] for item in customer['cart'])
    }
    for customer in all_customers
])

# Create a DataFrame for all cart items
cart_items = []
for customer in all_customers:
    for item in customer['cart']:
        cart_items.append({
            'customer_id': customer['id'],
            'customer_name': customer['name'],
            'product_id': item['id'],
            'product_name': item['name'],
            'price': item['price'],
            'quantity': item['quantity'],
            'category': item['category'],
            'subtotal': item['price'] * item['quantity']
        })

cart_items_df = pd.DataFrame(cart_items)

# Display the DataFrames
print("Customer Summary:")
display(customers_df.head())

print("\nCart Items:")
display(cart_items_df.head())
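
The nested loop above can also be expressed with `pd.json_normalize`, which flattens a list of records with a nested list into one row per nested item. This is a sketch of that alternative on two customers copied from the sample data; the `product_` and `customer_` prefixes are choices made for this example, so the column names differ from those in `cart_items_df`.

```python
import pandas as pd

# Two customers copied from the notebook's sample data
sample_customers = [
    {"id": "cust01", "name": "Alice", "cart": [
        {"id": "prod1", "name": "Laptop", "price": 1200, "quantity": 1, "category": "electronics"},
        {"id": "prod2", "name": "Mouse", "price": 25, "quantity": 2, "category": "electronics"},
    ]},
    {"id": "cust02", "name": "Bob", "cart": [
        {"id": "prod3", "name": "T-Shirt", "price": 15, "quantity": 3, "category": "fashion"},
    ]},
]

# One row per cart item; the prefixes keep the customer and product
# "id"/"name" fields from colliding
items = pd.json_normalize(
    sample_customers,
    record_path="cart",
    meta=["id", "name"],
    record_prefix="product_",
    meta_prefix="customer_",
)
items["subtotal"] = items["product_price"] * items["product_quantity"]

# Per-customer cart value derived from the flat table
cart_values = items.groupby("customer_id")["subtotal"].sum()
print(cart_values.to_dict())
```

Deriving the per-customer summary from the flat item table, rather than computing it in a separate pass, keeps the two DataFrames consistent by construction.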

## 4. Basic Cart Statistics

Let's calculate some basic statistics about our customer carts.

In [None]:
# Calculate basic statistics
total_revenue = customers_df['cart_value'].sum()
average_cart_value = customers_df['cart_value'].mean()
median_cart_value = customers_df['cart_value'].median()
max_cart_value = customers_df['cart_value'].max()
min_cart_value = customers_df['cart_value'].min()
total_items_sold = cart_items_df['quantity'].sum()
average_items_per_cart = customers_df['total_quantity'].mean()
unique_categories = cart_items_df['category'].nunique()
unique_products = cart_items_df['product_id'].nunique()

# Display statistics
stats = pd.DataFrame({
    'Metric': [
        'Total Revenue', 
        'Average Cart Value', 
        'Median Cart Value',
        'Maximum Cart Value',
        'Minimum Cart Value',
        'Total Items Sold',
        'Average Items per Cart',
        'Unique Categories',
        'Unique Products'
    ],
    'Value': [
        f"${total_revenue:.2f}",
        f"${average_cart_value:.2f}",
        f"${median_cart_value:.2f}",
        f"${max_cart_value:.2f}",
        f"${min_cart_value:.2f}",
        total_items_sold,
        f"{average_items_per_cart:.1f}",
        unique_categories,
        unique_products
    ]
})

display(stats)

# Create a histogram of cart values
plt.figure(figsize=(10, 6))
sns.histplot(customers_df['cart_value'], bins=10, kde=True)
plt.title('Distribution of Cart Values')
plt.xlabel('Cart Value ($)')
plt.ylabel('Frequency')
plt.grid(True, alpha=0.3)
plt.show()

# Create a scatter plot of items vs cart value
plt.figure(figsize=(10, 6))
sns.scatterplot(data=customers_df, x='total_quantity', y='cart_value', s=100, alpha=0.7)
plt.title('Relationship Between Number of Items and Cart Value')
plt.xlabel('Total Quantity of Items')
plt.ylabel('Cart Value ($)')
plt.grid(True, alpha=0.3)
plt.show()
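
Reporting both the mean and the median matters here because a single expensive cart pulls the mean upward. As a small worked example, the carts of the first four customers in the sample data can be checked by hand with the standard library:

```python
from statistics import mean, median

# Cart value = sum of price * quantity, using the first four sample customers
cart_values = {
    "Alice": 1200 * 1 + 25 * 2,    # Laptop + 2 Mice
    "Bob": 15 * 3 + 45 * 1,        # 3 T-Shirts + Jeans
    "Charlie": 85 * 1 + 12 * 5,    # Coffee Maker + 5 Coffee Beans
    "Diana": 150 * 1 + 40 * 1,     # Headphones + Power Bank
}

average_value = mean(cart_values.values())
median_value = median(cart_values.values())

print(cart_values)
print(f"mean = {average_value:.2f}, median = {median_value:.2f}")
```

Alice's laptop cart lifts the mean well above the median for these four carts, which is why the histogram of cart values is worth reading alongside the averages.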

## 5. Category Analysis

Now let's analyze the product categories to see which ones are most popular and generate the most revenue.

In [None]:
# Analyze categories
category_stats = cart_items_df.groupby('category').agg({
    'product_id': 'count',
    'quantity': 'sum',
    'subtotal': 'sum'
}).reset_index()

category_stats.columns = ['Category', 'Product Count', 'Quantity Sold', 'Revenue']
# 'Average Price' is revenue per unit sold (a quantity-weighted average), not the mean list price
category_stats['Average Price'] = category_stats['Revenue'] / category_stats['Quantity Sold']
category_stats = category_stats.sort_values('Revenue', ascending=False)

# Display category statistics
display(category_stats)

# Create a bar chart of revenue by category
plt.figure(figsize=(12, 6))
sns.barplot(data=category_stats, x='Category', y='Revenue', palette='viridis')
plt.title('Revenue by Product Category')
plt.xlabel('Category')
plt.ylabel('Revenue ($)')
plt.xticks(rotation=45)
plt.grid(True, alpha=0.3)
plt.show()

# Create a pie chart of revenue distribution
plt.figure(figsize=(10, 10))
plt.pie(
    category_stats['Revenue'],
    labels=category_stats['Category'],
    autopct='%1.1f%%',
    startangle=90,
    shadow=True,
    explode=[0.05] * len(category_stats),
    colors=sns.color_palette('viridis', len(category_stats))
)
plt.title('Revenue Distribution by Category')
plt.axis('equal')
plt.show()

# Create a grouped bar chart for product count and quantity sold
fig, ax1 = plt.subplots(figsize=(12, 6))

x = np.arange(len(category_stats))
width = 0.35

ax1.bar(x - width/2, category_stats['Product Count'], width, label='Product Count')
ax1.set_ylabel('Product Count')
ax1.set_title('Product Count vs Quantity Sold by Category')
ax1.set_xticks(x)
ax1.set_xticklabels(category_stats['Category'], rotation=45)

ax2 = ax1.twinx()
ax2.bar(x + width/2, category_stats['Quantity Sold'], width, color='orange', label='Quantity Sold')
ax2.set_ylabel('Quantity Sold')

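# Combine legend entries from both axes; plt.legend() alone would list only ax2's bars
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
ax2.legend(handles1 + handles2, labels1 + labels2, loc='upper left')
fig.tight_layout()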
plt.show()
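
The `Average Price` column in the category table is computed as `Revenue / Quantity Sold`, so it is a quantity-weighted price per unit rather than the plain mean of product prices. A short worked example with the two grocery items from the sample data shows the difference:

```python
# The two "grocery" items from the sample data: (name, price, quantity)
grocery_items = [
    ("Coffee Beans", 12, 5),
    ("Cooking Oil Spray", 10, 2),
]

revenue = sum(price * quantity for _, price, quantity in grocery_items)
quantity_sold = sum(quantity for _, _, quantity in grocery_items)

# What the notebook's 'Average Price' column computes
weighted_average_price = revenue / quantity_sold

# The unweighted mean of the listed prices, for comparison
simple_average_price = sum(price for _, price, _ in grocery_items) / len(grocery_items)

print(f"weighted: {weighted_average_price:.2f}, simple: {simple_average_price:.2f}")
```

The weighted figure leans toward the price of whichever product sold more units, which is usually what a revenue analysis wants.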

## 6. Price Range Analysis

Let's analyze customer purchasing behavior across different price ranges.

In [None]:
# Define price ranges
def assign_price_range(price):
    if price < 20:
        return 'Low (< $20)'
    elif price < 50:
        return 'Medium ($20-$49)'
    elif price < 100:
        return 'High ($50-$99)'
    else:
        return 'Premium ($100+)'

# Add price range column
cart_items_df['price_range'] = cart_items_df['price'].apply(assign_price_range)

# Analyze products by price range
price_range_stats = cart_items_df.groupby('price_range').agg({
    'product_id': 'count',
    'quantity': 'sum',
    'subtotal': 'sum'
}).reset_index()

price_range_stats.columns = ['Price Range', 'Product Count', 'Quantity Sold', 'Revenue']
price_range_stats['Average Quantity per Product'] = price_range_stats['Quantity Sold'] / price_range_stats['Product Count']

# Define correct sorting order
price_range_order = ['Low (< $20)', 'Medium ($20-$49)', 'High ($50-$99)', 'Premium ($100+)']
price_range_stats['Price Range'] = pd.Categorical(
    price_range_stats['Price Range'],
    categories=price_range_order,
    ordered=True
)
price_range_stats = price_range_stats.sort_values('Price Range')

# Display price range statistics
display(price_range_stats)

# Create a bar chart of revenue by price range
plt.figure(figsize=(10, 6))
sns.barplot(data=price_range_stats, x='Price Range', y='Revenue', palette='rocket')
plt.title('Revenue by Price Range')
plt.xlabel('Price Range')
plt.ylabel('Revenue ($)')
plt.grid(True, alpha=0.3)
plt.show()

# Create a bar chart of quantity sold by price range
plt.figure(figsize=(10, 6))
sns.barplot(data=price_range_stats, x='Price Range', y='Quantity Sold', palette='rocket')
plt.title('Quantity Sold by Price Range')
plt.xlabel('Price Range')
plt.ylabel('Quantity Sold')
plt.grid(True, alpha=0.3)
plt.show()

# Create a more detailed analysis with categories and price ranges
category_price_stats = cart_items_df.groupby(['category', 'price_range']).agg({
    'product_id': 'count',
    'quantity': 'sum',
    'subtotal': 'sum'
}).reset_index()

category_price_stats.columns = ['Category', 'Price Range', 'Product Count', 'Quantity Sold', 'Revenue']
category_price_stats['Price Range'] = pd.Categorical(
    category_price_stats['Price Range'],
    categories=price_range_order,
    ordered=True
)
category_price_stats = category_price_stats.sort_values(['Category', 'Price Range'])

# Display category and price range statistics
display(category_price_stats.head(10))

# Create a heatmap of revenue by category and price range
pivot_data = category_price_stats.pivot_table(
    values='Revenue',
    index='Category',
    columns='Price Range',
    fill_value=0
)

plt.figure(figsize=(12, 8))
sns.heatmap(pivot_data, annot=True, fmt='.0f', cmap='YlGnBu', linewidths=.5)
plt.title('Revenue by Category and Price Range')
plt.ylabel('Category')
plt.xlabel('Price Range')
plt.tight_layout()
plt.show()
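
As an alternative to the `assign_price_range` function, `pd.cut` can assign the same ranges in one vectorized call and returns an ordered categorical directly, so no separate `pd.Categorical` step is needed. This is a sketch on a handful of made-up prices chosen to sit on the range boundaries:

```python
import pandas as pd

price_range_order = ['Low (< $20)', 'Medium ($20-$49)', 'High ($50-$99)', 'Premium ($100+)']

# Illustrative prices, including values exactly on the boundaries
prices = pd.Series([1.5, 19.99, 20, 49.99, 50, 100, 1200])

# right=False makes each bin include its left edge:
# [0, 20), [20, 50), [50, 100), [100, inf)
price_range = pd.cut(
    prices,
    bins=[0, 20, 50, 100, float('inf')],
    labels=price_range_order,
    right=False,
)

print(pd.DataFrame({'price': prices, 'price_range': price_range}))
```

With `right=False` the boundaries match the `<` comparisons in `assign_price_range`: a price of exactly 20 lands in the medium range and exactly 100 in the premium range.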

## 11. Advanced Visualization and Cross-Analysis

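Let's create more advanced visualizations that relate spending, favorite product category, and customer demographics. The cart data contains no demographic fields, so the ages and genders below are randomly simulated for demonstration; the resulting insights and recommendations illustrate the workflow rather than real customer behavior.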

In [None]:
# Create a customer demographics dataframe with additional information
customer_demographics = customer_spending.copy()

# Simulate customer ages and genders for demonstration purposes
np.random.seed(42)  # For reproducibility
customer_demographics['age'] = np.random.randint(18, 75, size=len(customer_demographics))
customer_demographics['gender'] = np.random.choice(['Male', 'Female', 'Other'], size=len(customer_demographics), p=[0.48, 0.48, 0.04])

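# Create age groups; right=False makes the bins left-inclusive ([18, 25), [25, 35), ...)
# so they match the labels and age 18 is not left unassigned
bins = [18, 25, 35, 45, 55, 65, 100]
labels = ['18-24', '25-34', '35-44', '45-54', '55-64', '65+']
customer_demographics['age_group'] = pd.cut(customer_demographics['age'], bins=bins, labels=labels, right=False)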

# Merge with category preferences
category_preferences = []

for customer_id in customer_demographics['id']:
    # Get customer's products
    products = all_products_df[all_products_df['customer_id'] == customer_id]
    
    if len(products) > 0:
        # Get most purchased category
        category_counts = products.groupby('category').size()
        if not category_counts.empty:
            favorite_category = category_counts.idxmax()
        else:
            favorite_category = 'none'
    else:
        favorite_category = 'none'
    
    category_preferences.append(favorite_category)

customer_demographics['favorite_category'] = category_preferences

# Display demographics summary
print("Customer Demographics Summary:")
print(f"Total Customers: {len(customer_demographics)}")
print(f"Gender Distribution: {customer_demographics['gender'].value_counts().to_dict()}")
print(f"Age Group Distribution: {customer_demographics['age_group'].value_counts().sort_index().to_dict()}")
print(f"Favorite Category Distribution: {customer_demographics['favorite_category'].value_counts().to_dict()}")

# Create visualizations for demographics analysis

# 2x2 grid: average spending and category preferences by age group and gender
plt.figure(figsize=(12, 10))

# Age group spending
plt.subplot(2, 2, 1)
age_spending = customer_demographics.groupby('age_group')['total_spending'].mean().sort_index()
sns.barplot(x=age_spending.index, y=age_spending.values, palette='viridis')
plt.title('Average Spending by Age Group')
plt.xlabel('Age Group')
plt.ylabel('Average Spending ($)')
plt.xticks(rotation=45)

# Gender spending
plt.subplot(2, 2, 2)
gender_spending = customer_demographics.groupby('gender')['total_spending'].mean()
sns.barplot(x=gender_spending.index, y=gender_spending.values, palette='magma')
plt.title('Average Spending by Gender')
plt.xlabel('Gender')
plt.ylabel('Average Spending ($)')

# Category preferences by age group
plt.subplot(2, 2, 3)
cat_age = pd.crosstab(customer_demographics['age_group'], customer_demographics['favorite_category'])
cat_age_pct = cat_age.div(cat_age.sum(axis=1), axis=0) * 100
# Draw into the current subplot; without ax=, DataFrame.plot opens a new figure
cat_age_pct.plot(kind='bar', stacked=True, colormap='tab10', ax=plt.gca())
plt.title('Category Preferences by Age Group')
plt.xlabel('Age Group')
plt.ylabel('Percentage')
plt.xticks(rotation=45)
plt.legend(title='Category', bbox_to_anchor=(1.05, 1), loc='upper left')

# Category preferences by gender
plt.subplot(2, 2, 4)
cat_gender = pd.crosstab(customer_demographics['gender'], customer_demographics['favorite_category'])
cat_gender_pct = cat_gender.div(cat_gender.sum(axis=1), axis=0) * 100
# Draw into the current subplot; without ax=, DataFrame.plot opens a new figure
cat_gender_pct.plot(kind='bar', stacked=True, colormap='tab10', ax=plt.gca())
plt.title('Category Preferences by Gender')
plt.xlabel('Gender')
plt.ylabel('Percentage')
plt.xticks(rotation=45)
plt.legend(title='Category', bbox_to_anchor=(1.05, 1), loc='upper left')

plt.tight_layout()
plt.show()

# Create correlation heatmap between age, spending, and customer segment
correlation_data = pd.DataFrame({
    'age': customer_demographics['age'],
    'total_spending': customer_demographics['total_spending'],
    'segment_value': customer_demographics['segment'].map({'Low': 1, 'Medium': 2, 'High': 3, 'VIP': 4})
})

plt.figure(figsize=(10, 8))
correlation_matrix = correlation_data.corr()
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', linewidths=0.5)
plt.title('Correlation Between Age, Spending, and Customer Segment')
plt.tight_layout()
plt.show()

# Create a scatter plot with age vs spending, colored by segment
plt.figure(figsize=(12, 8))
sns.scatterplot(
    data=customer_demographics,
    x='age',
    y='total_spending',
    hue='segment',
    palette='viridis',
    size='total_spending',
    sizes=(20, 200),
    alpha=0.7
)
plt.title('Customer Age vs. Spending by Segment')
plt.xlabel('Age')
plt.ylabel('Total Spending ($)')
plt.legend(title='Customer Segment')
plt.grid(True, alpha=0.3)
plt.show()

# Display summary of findings
print("\nKey Insights from Demographics Analysis:")
print("1. Age Correlation with Spending:", round(correlation_matrix.loc['age', 'total_spending'], 2))
print("2. Most Popular Category Overall:", customer_demographics['favorite_category'].value_counts().idxmax())
print("3. Age Group with Highest Average Spending:", age_spending.idxmax(), f"(${round(age_spending.max(), 2)})")
print("4. Gender with Highest Average Spending:", gender_spending.idxmax(), f"(${round(gender_spending.max(), 2)})")

# Create a final recommendation summary based on demographics
print("\nRecommendations Based on Demographics Analysis:")
high_spending_age = age_spending.idxmax()
high_spending_gender = gender_spending.idxmax()
popular_category = customer_demographics['favorite_category'].value_counts().idxmax()

print(f"1. Target marketing campaigns toward {high_spending_age} age group as they spend the most on average.")
print(f"2. Consider special promotions for {high_spending_gender} shoppers to leverage their higher spending patterns.")
print(f"3. Expand {popular_category} product offerings as it's the most popular category across demographics.")
print("4. Use age and gender preferences to create personalized product recommendations.")
print("5. Develop loyalty programs for VIP segment customers to maintain their high spending levels.")
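
The "Age Correlation with Spending" figure is a Pearson coefficient, which is the default method of `DataFrame.corr()`. To make clear what that number measures, here is a minimal standard-library sketch of the formula; the function name `pearson` and the sample numbers are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Made-up ages and spending values, not taken from the notebook's data
ages = [20, 30, 40, 50]
spending_rising = [100, 200, 300, 400]    # rises in step with age
spending_falling = [400, 300, 200, 100]   # falls in step with age

print(pearson(ages, spending_rising))
print(pearson(ages, spending_falling))
```

Because the ages in this notebook are drawn at random for only 20 customers, the coefficient it reports reflects chance rather than a real relationship.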

## 12. Time Series Analysis of Cart Data

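Let's analyze how cart data changes over time to identify trends, seasonal patterns, and potential growth opportunities. Because the cart data has no timestamps, this section generates a simulated 12-month order history with a built-in trend, monthly and day-of-week effects, and event spikes, so the patterns found below are the ones the simulation puts in.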

In [None]:
# Generate historical cart data for time series analysis
# Using a simulated dataset since our original data doesn't have timestamps

# Set seed for reproducibility
np.random.seed(42)

# Generate dates for the past 12 months
end_date = pd.Timestamp.now()
start_date = end_date - pd.DateOffset(months=12)
date_range = pd.date_range(start=start_date, end=end_date, freq='D')

# Create a time series dataframe
ts_data = pd.DataFrame({
    'date': date_range,
    'orders': np.random.randint(10, 50, size=len(date_range)),  # Random number of orders per day
})

# Add some seasonality and trend
ts_data['month'] = ts_data['date'].dt.month
ts_data['day_of_week'] = ts_data['date'].dt.dayofweek  # Monday=0, Sunday=6
ts_data['is_weekend'] = ts_data['day_of_week'] >= 5  # Weekend indicator

# Add seasonal effects (higher in certain months like November-December for holidays)
month_effects = {
    1: -5,    # January (post-holiday dip)
    2: -10,   # February (lowest)
    3: 0,     # March
    4: 5,     # April
    5: 10,    # May
    6: 5,     # June
    7: 0,     # July
    8: 15,    # August (back to school)
    9: 5,     # September
    10: 10,   # October
    11: 30,   # November (early holiday shopping)
    12: 50,   # December (holiday shopping peak)
}

# Add day-of-week effects (weekends higher than weekdays)
dow_effects = {
    0: 0,     # Monday
    1: 5,     # Tuesday
    2: 10,    # Wednesday
    3: 15,    # Thursday
    4: 25,    # Friday
    5: 40,    # Saturday
    6: 30,    # Sunday
}

# Apply seasonal effects
for month, effect in month_effects.items():
    ts_data.loc[ts_data['month'] == month, 'orders'] += effect

for dow, effect in dow_effects.items():
    ts_data.loc[ts_data['day_of_week'] == dow, 'orders'] += effect

# Add a general upward trend (business growth)
days = (ts_data['date'] - ts_data['date'].min()).dt.days
ts_data['trend'] = days * 0.1  # Slight upward trend
ts_data['orders'] = ts_data['orders'] + ts_data['trend']

# Add spikes for a few fixed special events (dates are approximate)
special_events = pd.DataFrame({
    'date': [
        pd.Timestamp(year=end_date.year-1, month=11, day=25),  # Black Friday
        pd.Timestamp(year=end_date.year-1, month=12, day=15),  # Holiday shopping peak
        pd.Timestamp(year=end_date.year, month=1, day=1),      # New Year
        pd.Timestamp(year=end_date.year, month=2, day=14),     # Valentine's Day
        pd.Timestamp(year=end_date.year, month=5, day=15),     # Spring sale
        pd.Timestamp(year=end_date.year, month=7, day=4),      # Independence Day
    ],
    'effect': [100, 80, 30, 50, 40, 60]
})

# Apply special event effects (events outside the simulated date range match no rows)
for _, event in special_events.iterrows():
    ts_data.loc[ts_data['date'].dt.normalize() == event['date'], 'orders'] += event['effect']

# Ensure we have realistic values (no negative values)
ts_data['orders'] = ts_data['orders'].clip(lower=5).astype(int)

# Calculate revenue assuming average order value varies by month
average_order_values = {
    1: 45,    # January
    2: 40,    # February
    3: 50,    # March
    4: 55,    # April
    5: 60,    # May
    6: 65,    # June
    7: 70,    # July
    8: 80,    # August (back to school)
    9: 65,    # September
    10: 70,   # October
    11: 85,   # November (early holiday shopping)
    12: 95,   # December (holiday shopping peak)
}

# Apply average order values
ts_data['avg_order_value'] = ts_data['month'].map(average_order_values)
ts_data['revenue'] = ts_data['orders'] * ts_data['avg_order_value']

# Also generate category mix over time
categories = ['electronics', 'clothing', 'home', 'beauty', 'books', 'food']
for cat in categories:
    # Create a base percentage for each category
    base_pct = np.random.uniform(0.05, 0.3)
    
    # Add some monthly variation
    month_variation = np.random.normal(0, 0.05, 12)
    
    # Create monthly percentages
    month_pcts = {m: max(0.01, base_pct + month_variation[m-1]) for m in range(1, 13)}
    
    # Apply to dataframe
    ts_data[f'{cat}_pct'] = ts_data['month'].map(month_pcts)

# Normalize percentages so they sum to 1 for each day
pct_columns = [f'{cat}_pct' for cat in categories]
row_sums = ts_data[pct_columns].sum(axis=1)
for col in pct_columns:
    ts_data[col] = ts_data[col] / row_sums

# Calculate category revenue
for cat in categories:
    ts_data[f'{cat}_revenue'] = ts_data['revenue'] * ts_data[f'{cat}_pct']

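# Monthly aggregation
# Note: the range runs from today minus 12 months to today, so the first and last
# calendar months are usually partial and their totals are understated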
monthly_data = ts_data.groupby(ts_data['date'].dt.strftime('%Y-%m')).agg({
    'orders': 'sum',
    'revenue': 'sum',
    **{f'{cat}_revenue': 'sum' for cat in categories}
}).reset_index()
monthly_data['date'] = pd.to_datetime(monthly_data['date'] + '-01')
monthly_data = monthly_data.sort_values('date')

# Weekly aggregation
weekly_data = ts_data.groupby(pd.Grouper(key='date', freq='W-MON')).agg({
    'orders': 'sum',
    'revenue': 'sum',
    'avg_order_value': 'mean',
    **{f'{cat}_revenue': 'sum' for cat in categories}
}).reset_index()
weekly_data = weekly_data.sort_values('date')

# Display overview of the time series data
print("Time Series Data Overview:")
print(f"Date Range: {ts_data['date'].min().strftime('%Y-%m-%d')} to {ts_data['date'].max().strftime('%Y-%m-%d')}")
print(f"Total Days: {len(ts_data)}")
print(f"Total Orders: {ts_data['orders'].sum():,}")
print(f"Total Revenue: ${ts_data['revenue'].sum():,.2f}")
print("\nTop 5 Highest Revenue Days:")
display(ts_data.sort_values('revenue', ascending=False).head(5)[['date', 'orders', 'revenue', 'avg_order_value']])

# Time series visualizations
plt.figure(figsize=(18, 12))

# Daily orders with 7-day moving average
plt.subplot(3, 1, 1)
plt.plot(ts_data['date'], ts_data['orders'], color='skyblue', alpha=0.5, label='Daily Orders')
plt.plot(ts_data['date'], ts_data['orders'].rolling(7).mean(), color='blue', linewidth=2, label='7-Day Moving Avg')
plt.title('Daily Orders with 7-Day Moving Average')
plt.xlabel('Date')
plt.ylabel('Number of Orders')
plt.legend()
plt.grid(True, alpha=0.3)

# Monthly revenue
plt.subplot(3, 1, 2)
plt.bar(monthly_data['date'], monthly_data['revenue'], color='green', alpha=0.7)
plt.title('Monthly Revenue')
plt.xlabel('Month')
plt.ylabel('Revenue ($)')
plt.grid(True, alpha=0.3)

# Revenue by category over time (stacked area chart)
plt.subplot(3, 1, 3)
category_columns = [f'{cat}_revenue' for cat in categories]
monthly_data_stacked = monthly_data.copy()
plt.stackplot(monthly_data_stacked['date'], 
              [monthly_data_stacked[col] for col in category_columns],
              labels=categories, alpha=0.7)
plt.title('Monthly Revenue by Category')
plt.xlabel('Month')
plt.ylabel('Revenue ($)')
plt.legend(loc='upper left')
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Day of week analysis
dow_analysis = ts_data.groupby('day_of_week').agg({
    'orders': 'mean',
    'revenue': 'mean',
    'avg_order_value': 'mean'
}).reset_index()

dow_analysis['day_name'] = dow_analysis['day_of_week'].map({
    0: 'Monday', 1: 'Tuesday', 2: 'Wednesday', 3: 'Thursday', 
    4: 'Friday', 5: 'Saturday', 6: 'Sunday'
})

plt.figure(figsize=(14, 6))
plt.subplot(1, 2, 1)
sns.barplot(x='day_name', y='orders', data=dow_analysis, palette='viridis')
plt.title('Average Daily Orders by Day of Week')
plt.xlabel('Day of Week')
plt.ylabel('Average Orders')
plt.xticks(rotation=45)

plt.subplot(1, 2, 2)
sns.barplot(x='day_name', y='revenue', data=dow_analysis, palette='viridis')
plt.title('Average Daily Revenue by Day of Week')
plt.xlabel('Day of Week')
plt.ylabel('Average Revenue ($)')
plt.xticks(rotation=45)

plt.tight_layout()
plt.show()

# Monthly category analysis
plt.figure(figsize=(14, 8))
for i, cat in enumerate(categories):
    plt.subplot(2, 3, i+1)
    plt.plot(monthly_data['date'], monthly_data[f'{cat}_revenue'], marker='o')
    plt.title(f'{cat.capitalize()} Revenue Trend')
    plt.xlabel('Month')
    plt.ylabel('Revenue ($)')
    plt.grid(True, alpha=0.3)
    plt.xticks(rotation=45)

plt.tight_layout()
plt.show()

# Display insights
print("\nKey Time Series Insights:")
print(f"1. Highest Revenue Month: {monthly_data.loc[monthly_data['revenue'].idxmax(), 'date'].strftime('%B %Y')}")
print(f"2. Highest Revenue Day of Week: {dow_analysis.loc[dow_analysis['revenue'].idxmax(), 'day_name']}")
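# Coefficient of variation (std / mean) of monthly revenue: lower means steadier
revenue_cv = {cat: monthly_data[f'{cat}_revenue'].std() / monthly_data[f'{cat}_revenue'].mean() for cat in categories}
# Relative change in revenue from the first to the last month in the data
revenue_growth = {cat: monthly_data[f'{cat}_revenue'].iloc[-1] / monthly_data[f'{cat}_revenue'].iloc[0] - 1 for cat in categories}
print(f"3. Most Consistent Category: {min(revenue_cv, key=revenue_cv.get).capitalize()}")
print(f"4. Fastest Growing Category: {max(revenue_growth, key=revenue_growth.get).capitalize()}")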

# Generate forecasts (simple moving average)
forecast_window = 3  # months
forecasted_months = []
forecasted_revenue = []

# Mean of the last 3 observed months. The same value is used for every
# forecasted month, so this is a flat (naive) forecast
last_3_months_avg = monthly_data['revenue'].iloc[-3:].mean()

last_date = monthly_data['date'].iloc[-1]
for i in range(1, forecast_window + 1):
    next_month = last_date + pd.DateOffset(months=i)
    forecasted_months.append(next_month)
    forecasted_revenue.append(last_3_months_avg)

forecast_df = pd.DataFrame({
    'date': forecasted_months,
    'forecasted_revenue': forecasted_revenue
})

# Plot actual vs forecasted revenue
plt.figure(figsize=(14, 6))
plt.plot(monthly_data['date'], monthly_data['revenue'], marker='o', color='blue', label='Actual Revenue')
plt.plot(forecast_df['date'], forecast_df['forecasted_revenue'], marker='o', color='red', linestyle='--', label='Forecasted Revenue')
plt.title('Revenue Forecast for Next 3 Months')
plt.xlabel('Month')
plt.ylabel('Revenue ($)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

print("\nRevenue Forecast for Next 3 Months:")
for i, row in forecast_df.iterrows():
    print(f"{row['date'].strftime('%B %Y')}: ${row['forecasted_revenue']:,.2f}")

# Time series recommendations
print("\nRecommendations Based on Time Series Analysis:")
best_day = dow_analysis.loc[dow_analysis['revenue'].idxmax(), 'day_name']
print(f"1. Focus marketing campaigns on {best_day}s as they generate the highest average revenue.")
print(f"2. Prepare inventory for increased demand in {monthly_data.loc[monthly_data['revenue'].idxmax(), 'date'].strftime('%B')}.")
print("3. Develop targeted promotions for lower-revenue months to smooth out seasonal variations.")
print("4. Analyze special event spikes to identify successful promotion strategies.")
print("5. Adjust staffing levels based on day-of-week order patterns to ensure optimal customer service.")
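
The forecast above averages the last three observed months once and reuses that value for every future month, so the forecast line is flat. A minimal sketch of a rolling variant, in which each forecast is appended to the history before the next one is computed, could look like the following. It uses plain Python lists and invented revenue figures so that it runs independently of the notebook state; `rolling_mean_forecast` is a hypothetical helper, not part of the notebook.

```python
def rolling_mean_forecast(history, window=3, steps=3):
    """Forecast `steps` values. Each forecast is the mean of the latest
    `window` values, and earlier forecasts count as history for later ones."""
    if window < 1 or len(history) < window:
        raise ValueError("need at least `window` observations")
    values = list(history)  # copy so the caller's data is not modified
    forecasts = []
    for _ in range(steps):
        next_value = sum(values[-window:]) / window
        forecasts.append(next_value)
        values.append(next_value)  # feed the forecast back into the window
    return forecasts


# Illustrative monthly revenue (invented numbers, not the notebook's data)
monthly_revenue = [100.0, 110.0, 120.0, 130.0]
forecast = rolling_mean_forecast(monthly_revenue, window=3, steps=3)
print(forecast)
```

Because each step averages over values that already include earlier forecasts, the series converges toward a level instead of staying constant. Neither variant captures trend or seasonality, so both are best read as baselines.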

## 13. Predictive Analytics and AI Recommendations

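In this section, we'll use machine learning to build a simple recommendation system and a spending prediction model from our cart data: K-means for customer segmentation, cosine-similarity collaborative filtering for product recommendations, and a random forest regressor for predicting total spending. This demonstrates how the MCP Cart Analysis tool could use AI to generate personalized recommendations.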
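
The recommendation step compares customers by the cosine similarity of their per-category purchase quantities. As a small standalone illustration, the measure can be sketched in plain Python as follows. The vectors for Alice and Bob follow from the sample carts at the top of the notebook; the third customer is hypothetical, and `cosine_similarity` is a local helper written for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors.
    Returns 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Quantities per category, in the order [electronics, fashion, home]
alice = [3, 0, 0]          # 1 laptop + 2 mice
bob = [0, 4, 0]            # 3 T-shirts + 1 pair of jeans
hypothetical = [1, 0, 0]   # assumed customer with a single electronics item

sim_same_category = cosine_similarity(alice, hypothetical)
sim_no_overlap = cosine_similarity(alice, bob)
print(sim_same_category, sim_no_overlap)
```

Cosine similarity depends only on the direction of the vectors, so a customer who bought one electronics item is treated as just as similar to Alice as one who bought three. That is usually acceptable for category preferences, but it ignores purchase volume.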

In [None]:
# Import necessary libraries for machine learning
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, classification_report
from sklearn.cluster import KMeans
from scipy.sparse import hstack, csr_matrix
from scipy.stats import percentileofscore  # used by analyze_customer() below

print("Building AI recommendation and prediction models...")

# Create a more detailed customer purchase dataset
# This will simulate purchase history for customers

# First, flatten the cart data into one row per (customer, cart item)
product_rows = []
for _, customer_data in customers_df.iterrows():
    if 'cart' in customer_data and isinstance(customer_data['cart'], list):
        for item in customer_data['cart']:
            product_rows.append({
                'customer_id': customer_data['id'],
                'product_id': item.get('id', ''),
                'product_name': item.get('name', ''),
                'category': item.get('category', 'uncategorized'),
                'price': item.get('price', 0),
                'quantity': item.get('quantity', 1)
            })

# Build the DataFrame once instead of concatenating inside the loop
all_products = pd.DataFrame(product_rows)

# Merge with customer demographics
customer_purchase_data = all_products.merge(
    customer_demographics[['id', 'age', 'gender', 'age_group', 'total_spending']],
    left_on='customer_id',
    right_on='id'
)

# Create a product purchase matrix (customer-product matrix)
purchase_matrix = customer_purchase_data.pivot_table(
    index='customer_id',
    columns='category',
    values='quantity',
    aggfunc='sum',
    fill_value=0
)

# Ensure all categories are represented
for category in ['electronics', 'clothing', 'home', 'food', 'beauty', 'books']:
    if category not in purchase_matrix.columns:
        purchase_matrix[category] = 0

# Generate additional features for our predictive model
customer_features = customer_demographics.copy()
# Add purchased quantity per category. Joining the whole matrix once on its
# index avoids the duplicate 'customer_id' columns that one merge per
# category would create
category_quantities = purchase_matrix.add_suffix('_quantity')
customer_features = customer_features.join(category_quantities, on='id')
quantity_columns = list(category_quantities.columns)
customer_features[quantity_columns] = customer_features[quantity_columns].fillna(0)

# Feature: total number of categories purchased
customer_features['category_diversity'] = customer_features[[f'{cat}_quantity' for cat in purchase_matrix.columns]].gt(0).sum(axis=1)

# Add average price point preference (weighted by quantity)
customer_avg_price = customer_purchase_data.groupby('customer_id').apply(
    lambda x: (x['price'] * x['quantity']).sum() / x['quantity'].sum() if x['quantity'].sum() > 0 else 0
).reset_index().rename(columns={0: 'avg_price_preference'})

customer_features = customer_features.merge(
    customer_avg_price,
    left_on='id',
    right_on='customer_id',
    how='left'
)
customer_features['avg_price_preference'] = customer_features['avg_price_preference'].fillna(0)

# Display feature set
print("\nCustomer Features for Predictive Models:")
display(customer_features.head())

# 1. Customer Segmentation using K-means
# Prepare data for clustering
cluster_features = customer_features[[
    'total_spending', 'avg_price_preference', 'category_diversity',
    'age'
]].copy()

# Standardize features
scaler = StandardScaler()
cluster_data_scaled = scaler.fit_transform(cluster_features)

# Determine optimal number of clusters using elbow method
inertias = []
k_range = range(2, 8)
for k in k_range:
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)  # explicit n_init: the default differs between scikit-learn versions
    kmeans.fit(cluster_data_scaled)
    inertias.append(kmeans.inertia_)

# Plot elbow curve
plt.figure(figsize=(10, 6))
plt.plot(k_range, inertias, 'o-', color='blue')
plt.title('Elbow Method for Optimal k')
plt.xlabel('Number of clusters')
plt.ylabel('Inertia')
plt.grid(True, alpha=0.3)
plt.show()

# Apply K-means with chosen number of clusters
k = 4  # Read off the elbow plot above; adjust if the curve bends at a different k
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
customer_features['cluster'] = kmeans.fit_predict(cluster_data_scaled)

# Analyze clusters
cluster_analysis = customer_features.groupby('cluster').agg({
    'total_spending': 'mean',
    'avg_price_preference': 'mean',
    'category_diversity': 'mean',
    'age': 'mean',
    **{f'{cat}_quantity': 'mean' for cat in purchase_matrix.columns}
}).reset_index()

# Display cluster profiles
print("\nCustomer Segment Profiles:")
display(cluster_analysis)

# Visualize clusters
plt.figure(figsize=(12, 8))
for i, feature_pair in enumerate([
    ('total_spending', 'avg_price_preference'),
    ('category_diversity', 'age')
]):
    plt.subplot(1, 2, i+1)
    scatter = plt.scatter(
        customer_features[feature_pair[0]], 
        customer_features[feature_pair[1]], 
        c=customer_features['cluster'], 
        cmap='viridis',
        s=50,
        alpha=0.7
    )
    plt.colorbar(scatter, label='Cluster')
    plt.xlabel(feature_pair[0])
    plt.ylabel(feature_pair[1])
    plt.title(f'Clusters by {feature_pair[0]} and {feature_pair[1]}')
    plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

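# Assign descriptive labels to clusters.
# NOTE: K-means numbers its clusters arbitrarily, so compare this mapping with
# the profiles in cluster_analysis above and adjust it if they do not match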
cluster_labels = {
    0: "Budget Shoppers",      # Low spending, low price point
    1: "Selective Shoppers",   # Medium spending, medium diversity
    2: "Premium Shoppers",     # High spending, high price point
    3: "Diverse Shoppers"      # Medium spending, high diversity
}

# Map cluster numbers to labels
customer_features['segment_label'] = customer_features['cluster'].map(cluster_labels)

# Print cluster distribution
print("\nCustomer Segment Distribution:")
segment_dist = customer_features['segment_label'].value_counts()
for segment, count in segment_dist.items():
    print(f"  {segment}: {count} customers ({count/len(customer_features)*100:.1f}%)")

# 2. Product Recommendation Model
# Create a simple collaborative filtering system

# Function to find similar customers
def find_similar_customers(customer_id, purchase_matrix, n=3):
    """Find the most similar customers based on purchase patterns"""
    if customer_id not in purchase_matrix.index:
        return []
    
    # Get the target customer's purchase pattern
    target_pattern = purchase_matrix.loc[customer_id].values
    
    # Calculate similarity with all other customers
    similarities = {}
    for cust_id in purchase_matrix.index:
        if cust_id == customer_id:
            continue
        
        other_pattern = purchase_matrix.loc[cust_id].values
        
        # Use cosine similarity
        dot_product = np.dot(target_pattern, other_pattern)
        norm_target = np.linalg.norm(target_pattern)
        norm_other = np.linalg.norm(other_pattern)
        
        # Avoid division by zero
        if norm_target == 0 or norm_other == 0:
            similarity = 0
        else:
            similarity = dot_product / (norm_target * norm_other)
        
        similarities[cust_id] = similarity
    
    # Sort by similarity and return top n
    similar_customers = sorted(similarities.items(), key=lambda x: x[1], reverse=True)[:n]
    return similar_customers

# Function to recommend products
def recommend_products(customer_id, purchase_matrix, all_products, n=3):
    """Recommend products based on similar customers"""
    similar_customers = find_similar_customers(customer_id, purchase_matrix)
    
    if not similar_customers:
        return []
    
    # Get products purchased by the target customer
    if customer_id in purchase_matrix.index:
        target_purchases = set(all_products[all_products['customer_id'] == customer_id]['product_id'])
    else:
        target_purchases = set()
    
    # Collect products from similar customers
    recommended_products = {}
    
    for similar_id, similarity in similar_customers:
        # Get products this similar customer purchased
        similar_purchases = all_products[all_products['customer_id'] == similar_id]
        
        # Add to recommendations if not already purchased by target
        for _, product in similar_purchases.iterrows():
            if product['product_id'] not in target_purchases:
                if product['product_id'] not in recommended_products:
                    recommended_products[product['product_id']] = {
                        'product_id': product['product_id'],
                        'product_name': product['product_name'],
                        'category': product['category'],
                        'price': product['price'],
                        'score': similarity
                    }
                else:
                    # If already recommended, increase the score
                    recommended_products[product['product_id']]['score'] += similarity
    
    # Sort by score and return top n
    sorted_recommendations = sorted(recommended_products.values(), key=lambda x: x['score'], reverse=True)[:n]
    return sorted_recommendations

# Test recommendations for a sample customer
sample_customer_id = customer_features['id'].iloc[2]  # Arbitrary customer
print(f"\nProduct Recommendations for Customer {sample_customer_id}:")

recommendations = recommend_products(sample_customer_id, purchase_matrix, all_products)
if recommendations:
    customer_name = customer_demographics[customer_demographics['id'] == sample_customer_id]['name'].iloc[0]
    print(f"Recommendations for {customer_name}:")
    for i, rec in enumerate(recommendations, 1):
        print(f"  {i}. {rec['product_name']} (${rec['price']:.2f}) - Category: {rec['category']}")
else:
    print("No recommendations available for this customer.")

# 3. Predict Customer Spending
# Build a regression model to predict total spending

# Prepare features and target
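# Drop identifiers, labels, and the target. errors='ignore' skips columns that
# are not present, and any leftover 'customer_id*' merge keys are removed too
non_feature_columns = ['id', 'name', 'favorite_category', 'segment', 'cluster', 'segment_label', 'total_spending']
non_feature_columns += [col for col in customer_features.columns if col.startswith('customer_id')]
X = customer_features.drop(columns=non_feature_columns, errors='ignore')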
y = customer_features['total_spending']

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create preprocessing pipeline
categorical_features = ['gender', 'age_group']
numerical_features = [col for col in X.columns if col not in categorical_features]

preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numerical_features),
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_features)
    ])

# Build and train model
model = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('regressor', RandomForestRegressor(n_estimators=100, random_state=42))
])

model.fit(X_train, y_train)

# Evaluate model
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)

print("\nSpending Prediction Model Performance:")
print(f"  Root Mean Squared Error: ${rmse:.2f}")
print(f"  Mean Absolute Error: ${np.mean(np.abs(y_test - y_pred)):.2f}")

# Feature importance
if hasattr(model['regressor'], 'feature_importances_'):
    # Get feature names after preprocessing
    preprocessed_features = []
    for name, transformer, features in preprocessor.transformers_:
        if name == 'num':
            preprocessed_features.extend(features)
        elif name == 'cat':
            preprocessed_features.extend([f"{feature}_{category}" for feature in features 
                                        for category in transformer.categories_[preprocessor.transformers_[1][2].index(feature)]])
    
    # Get feature importance
    importances = model['regressor'].feature_importances_
    
    # If the lengths match, we can create a proper mapping
    if len(preprocessed_features) == len(importances):
        feature_importance = pd.DataFrame({
            'Feature': preprocessed_features,
            'Importance': importances
        }).sort_values('Importance', ascending=False)
        
        # Plot top 10 features
        plt.figure(figsize=(12, 6))
        sns.barplot(x='Importance', y='Feature', data=feature_importance.head(10))
        plt.title('Top 10 Features for Predicting Customer Spending')
        plt.xlabel('Importance')
        plt.ylabel('Feature')
        plt.tight_layout()
        plt.show()
        
        print("\nTop 5 Factors Influencing Customer Spending:")
        for i, (_, row) in enumerate(feature_importance.head(5).iterrows(), 1):
            print(f"  {i}. {row['Feature']}: {row['Importance']:.4f}")

# 4. Combine everything for comprehensive customer insights
print("\nComprehensive Customer Insights Tool")
print("=" * 40)

def analyze_customer(customer_id):
    """Provide comprehensive insights for a specific customer"""
    # Get customer data
    if customer_id not in customer_features['id'].values:
        return "Customer not found"
    
    customer_data = customer_features[customer_features['id'] == customer_id].iloc[0]
    
    # Basic info
    customer_name = customer_data['name']
    cluster = customer_data['segment_label']
    spending = customer_data['total_spending']
    
    # Get recommendations
    product_recs = recommend_products(customer_id, purchase_matrix, all_products)
    
    # Generate insights
    insights = {
        "customer_name": customer_name,
        "segment": cluster,
        "spending_level": spending,
        "product_recommendations": product_recs,
        # kind='strict' counts only customers with strictly lower values,
        # which matches the "Higher than X%" wording in the report below
        "spending_percentile": percentileofscore(customer_features['total_spending'], spending, kind='strict'),
        "favorite_categories": [col.replace('_quantity', '') for col in customer_features.columns if col.endswith('_quantity') 
                              and customer_data[col] > 0],
        "category_diversity_percentile": percentileofscore(customer_features['category_diversity'], customer_data['category_diversity'], kind='strict')
    }
    
    return insights

# Test the analysis with a sample customer
sample_customer = customer_features['id'].iloc[5]  # Different sample customer
insights = analyze_customer(sample_customer)

print(f"Customer Analysis for: {insights['customer_name']}")
print(f"Segment: {insights['segment']}")
print(f"Spending: ${insights['spending_level']:.2f} (Higher than {insights['spending_percentile']:.1f}% of customers)")
print(f"Category Diversity: Higher than {insights['category_diversity_percentile']:.1f}% of customers")
print(f"Favorite Categories: {', '.join(insights['favorite_categories'])}")

print("\nRecommended Products:")
for i, rec in enumerate(insights['product_recommendations'], 1):
    print(f"  {i}. {rec['product_name']} (${rec['price']:.2f}) - Category: {rec['category']}")

print("\nMarketing Recommendations:")
if insights['segment'] == "Budget Shoppers":
    print("  - Send discount coupons for everyday essentials")
    print("  - Highlight value-for-money promotions")
elif insights['segment'] == "Premium Shoppers":
    print("  - Send exclusive offers for premium products")
    print("  - Highlight new arrivals and limited editions")
elif insights['segment'] == "Diverse Shoppers":
    print("  - Send curated multi-category bundles")
    print("  - Highlight complementary products")
else:
    print("  - Send personalized offers based on past category purchases")
    print("  - Highlight quality and uniqueness of products")

print("\nThis is what the MCP Cart Analysis tool could provide when integrated with AI capabilities!")
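
K-means assigns cluster indices arbitrarily, so the fixed mapping from indices 0 to 3 to segment names used above is only correct if the cluster profiles happen to line up with it. One way to make the labels follow the data is to derive them from the cluster profiles. The sketch below assumes exactly four clusters and assumes that spending rank and category diversity are enough to tell the segments apart. The helper `label_clusters` and the profile numbers are illustrative and not taken from the notebook's results.

```python
def label_clusters(profiles):
    """Map cluster index -> segment name using per-cluster means.

    profiles: dict of cluster index -> dict with the keys
    'total_spending' and 'category_diversity'.
    """
    if len(profiles) != 4:
        raise ValueError("this sketch assumes exactly four clusters")
    # Order clusters from lowest to highest mean spending
    by_spending = sorted(profiles, key=lambda c: profiles[c]["total_spending"])
    lowest, *middle, highest = by_spending
    # Of the two mid-spending clusters, the more diverse one is "Diverse"
    middle.sort(key=lambda c: profiles[c]["category_diversity"])
    return {
        lowest: "Budget Shoppers",
        middle[0]: "Selective Shoppers",
        middle[1]: "Diverse Shoppers",
        highest: "Premium Shoppers",
    }


# Invented cluster means, deliberately not ordered like the fixed mapping
example_profiles = {
    0: {"total_spending": 900.0, "category_diversity": 1.0},
    1: {"total_spending": 60.0, "category_diversity": 1.0},
    2: {"total_spending": 300.0, "category_diversity": 3.0},
    3: {"total_spending": 250.0, "category_diversity": 2.0},
}
labels = label_clusters(example_profiles)
print(labels)
```

In the notebook, the `profiles` argument could be built from the existing table with `cluster_analysis.set_index('cluster').to_dict('index')`, and the result passed to `customer_features['cluster'].map(...)` in place of the hard-coded dictionary.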

## 14. Extended Conclusion

In this notebook, we built a cart data analysis workflow that demonstrates how to:

1. **Set up a dedicated cart data server** that provides structured access to shopping cart information
2. **Integrate with an MCP server** so that AI models can access cart data analysis
3. **Apply a range of analytical approaches**, from basic statistics to time series analysis
4. **Build predictive models and recommendation systems** that can enhance the MCP tool capabilities

This work illustrates how separating data provision (cart data server), analysis logic (MCP tools), and AI interaction (models) creates a powerful and flexible architecture. The MCP approach allows AI models to leverage specialized tools without needing to understand the underlying data structures or analysis algorithms.
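
To make this separation concrete, here is a minimal sketch in which the analysis function never touches the data source directly. It receives a `fetch_customer` callable, which in this notebook's setup would wrap a request to the cart data server and is replaced here by an in-memory stub. The names `analyze_cart`, `fetch_customer`, and `stub_fetch` are hypothetical and do not reflect the actual MCP tool's interface; the cart contents are Alice's from the sample data.

```python
from typing import Callable, Optional


def analyze_cart(customer_id: str,
                 fetch_customer: Callable[[str], Optional[dict]]) -> dict:
    """Summarize one customer's cart. Data access is delegated to `fetch_customer`."""
    customer = fetch_customer(customer_id)
    if customer is None:
        return {"customer_id": customer_id, "found": False}
    cart = customer.get("cart", [])
    return {
        "customer_id": customer_id,
        "found": True,
        "total": sum(item["price"] * item["quantity"] for item in cart),
        "item_count": sum(item["quantity"] for item in cart),
        "categories": sorted({item.get("category", "uncategorized") for item in cart}),
    }


# In-memory stand-in for the cart data server (sample data from this notebook)
_SAMPLE_CUSTOMERS = {
    "cust01": {
        "id": "cust01",
        "name": "Alice",
        "cart": [
            {"id": "prod1", "name": "Laptop", "price": 1200, "quantity": 1, "category": "electronics"},
            {"id": "prod2", "name": "Mouse", "price": 25, "quantity": 2, "category": "electronics"},
        ],
    }
}


def stub_fetch(customer_id: str) -> Optional[dict]:
    """Return the stored customer record, or None if the id is unknown."""
    return _SAMPLE_CUSTOMERS.get(customer_id)


summary = analyze_cart("cust01", stub_fetch)
print(summary)
```

Swapping `stub_fetch` for a function that queries the data server leaves `analyze_cart` unchanged, which is the property the layered design is meant to provide.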

Future enhancements could include:
- Real-time analytics capabilities
- More sophisticated recommendation algorithms
- Integration with actual e-commerce platforms
- Enhanced visualization dashboards for business users
- Natural language query capabilities for business analysts

This notebook serves as both a demonstration of the integration architecture and a starting point for building production-ready cart analysis tools for AI systems.