[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jstilb/meaningful_metrics/blob/main/notebooks/quickstart.ipynb)

# Meaningful Metrics — Quick Start Notebook

[![PyPI](https://img.shields.io/pypi/v/meaningful-metrics)](https://pypi.org/project/meaningful-metrics/)

This notebook introduces the **Meaningful Metrics** framework — a Python library for evaluating AI products against human wellbeing rather than engagement.

**In this notebook you'll learn to:**
1. Install and import the library
2. Define user goals and domain priorities
3. Calculate core metrics: Quality Time Score, Goal Alignment, Distraction Ratio
4. Generate a full metrics report with recommendations
5. Apply the framework to evaluate an AI product

---

**Framework philosophy:** Most digital products optimize for engagement — session duration, clicks, DAUs. Meaningful Metrics replaces these proxies with direct measures of whether users' time advances their *declared* goals.

In [None]:
# Install the package (only needed once per runtime)
!pip install meaningful-metrics -q

In [None]:
from meaningful_metrics import (
    calculate_quality_time_score,
    calculate_goal_alignment,
    calculate_distraction_ratio,
    calculate_actionability_score,
    generate_metrics_report,
    calculate_domain_contributions,
)
from meaningful_metrics.schemas import (
    Goal,
    DomainPriority,
    TimeEntry,
    ActionLog,
    ActionWeights,
)

print("meaningful_metrics imported successfully")

## Part 1: Core Concepts

The framework has three input types:

| Type | Purpose | Example |
|------|---------|------|
| `Goal` | What the user wants to achieve | "Learn Python", linked to domains ["coding", "tutorials"] |
| `DomainPriority` | How valuable each domain is (0–1) | coding=1.0, social_media=0.2 |
| `TimeEntry` | Time actually spent in each domain | coding=3.0h, social_media=1.5h |

It has one output type, `MetricsReport`, which aggregates all of the metrics.
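
To make the shape of the three input types concrete, here is a minimal sketch using plain dataclasses. The `Sketch*` classes are hypothetical stand-ins that mirror only the fields used in this notebook; they are not the library's schema definitions, which may validate or default fields differently.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical stand-ins, NOT the classes in meaningful_metrics.schemas.
@dataclass
class SketchGoal:
    id: str
    name: str
    domains: List[str]  # domains whose time counts toward this goal
    target_hours_per_week: Optional[float] = None  # omitted for some goals


@dataclass
class SketchDomainPriority:
    domain: str
    priority: float  # 0-1, chosen by the user
    max_daily_hours: Optional[float] = None  # optional daily cap


@dataclass
class SketchTimeEntry:
    domain: str
    hours: float  # time actually spent


sketch_goal = SketchGoal(
    id="learn-python", name="Learn Python", domains=["coding", "tutorials"]
)
sketch_priority = SketchDomainPriority(domain="coding", priority=1.0)
sketch_entry = SketchTimeEntry(domain="coding", hours=3.0)
print(sketch_goal, sketch_priority, sketch_entry, sep="\n")
```

The names carry a `Sketch` prefix so that running this cell does not shadow the real `Goal`, `DomainPriority`, and `TimeEntry` imports.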

In [None]:
# Step 1: Define what the user wants to achieve
goals = [
    Goal(
        id="learn-python",
        name="Learn Python",
        domains=["coding", "tutorials", "documentation"],
        target_hours_per_week=8.0,
    ),
    Goal(
        id="stay-healthy",
        name="Stay Healthy",
        domains=["exercise", "meal_prep"],
    ),
]

print("Goals defined:")
for g in goals:
    print(f"  {g.name} → domains: {g.domains}")

In [None]:
# Step 2: Set domain priorities (user-controlled, not algorithmic)
# max_daily_hours applies diminishing returns — time beyond cap counts less
priorities = [
    DomainPriority(domain="coding", priority=1.0, max_daily_hours=4.0),
    DomainPriority(domain="tutorials", priority=0.9, max_daily_hours=2.0),
    DomainPriority(domain="documentation", priority=0.7, max_daily_hours=1.0),
    DomainPriority(domain="exercise", priority=0.8, max_daily_hours=1.5),
    DomainPriority(domain="meal_prep", priority=0.5, max_daily_hours=1.0),
    DomainPriority(domain="social_media", priority=0.1),
    DomainPriority(domain="news", priority=0.3),
]

print("Domain priorities:")
for p in sorted(priorities, key=lambda x: x.priority, reverse=True):
    cap_str = f" (cap: {p.max_daily_hours}h)" if p.max_daily_hours else ""
    print(f"  {p.domain:<20} priority={p.priority:.1f}{cap_str}")

In [None]:
# Step 3: Log today's time
entries = [
    TimeEntry(domain="coding", hours=3.0),
    TimeEntry(domain="tutorials", hours=1.5),
    TimeEntry(domain="social_media", hours=2.0),
    TimeEntry(domain="exercise", hours=0.75),
    TimeEntry(domain="news", hours=0.5),
    TimeEntry(domain="meal_prep", hours=0.5),
]

total_hours = sum(e.hours for e in entries)
print(f"Total hours tracked: {total_hours:.1f}h")
print("\nTime log:")
for e in sorted(entries, key=lambda x: x.hours, reverse=True):
    bar = "█" * int(e.hours * 4)
    print(f"  {e.domain:<20} {bar:<20} {e.hours:.1f}h")

## Part 2: Individual Metrics

You can call each metric function directly for targeted calculations.

In [None]:
# Quality Time Score: priority-weighted time with diminishing returns
qts = calculate_quality_time_score(entries, priorities)
print(f"Quality Time Score: {qts:.2f}")
print()

# Domain-level breakdown
domain_metrics = calculate_domain_contributions(entries, priorities)
print("QTS breakdown by domain:")
for dm in sorted(domain_metrics, key=lambda x: x.contribution, reverse=True):
    capped = " (capped)" if dm.effective_time < dm.time_spent else ""
    print(
        f"  {dm.domain:<20} "
        f"{dm.time_spent:.1f}h × {dm.priority:.1f} priority"
        f"{capped} = {dm.contribution:.2f} QTS"
    )
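
For intuition about what "priority-weighted time with diminishing returns" could mean, the sketch below recomputes a score by hand. It assumes the simplest possible rule: hours are clipped at `max_daily_hours` and then multiplied by the domain priority. This is an assumption for illustration; the library's actual diminishing-returns rule may be softer than a hard cap, and `sketch_quality_time_score` is a hypothetical helper, not part of the package.

```python
# Assumption: score = sum(priority * min(hours, cap)). The real library may
# discount time beyond the cap rather than dropping it entirely.
sketch_priorities = {  # domain -> (priority, max_daily_hours or None)
    "coding": (1.0, 4.0),
    "tutorials": (0.9, 2.0),
    "documentation": (0.7, 1.0),
    "exercise": (0.8, 1.5),
    "meal_prep": (0.5, 1.0),
    "social_media": (0.1, None),
    "news": (0.3, None),
}
sketch_hours = {
    "coding": 3.0,
    "tutorials": 1.5,
    "social_media": 2.0,
    "exercise": 0.75,
    "news": 0.5,
    "meal_prep": 0.5,
}


def sketch_quality_time_score(hours, priorities):
    """Sum priority-weighted hours, clipping each domain at its cap."""
    total = 0.0
    for domain, spent in hours.items():
        # Assumption: a domain with no declared priority contributes nothing.
        priority, cap = priorities.get(domain, (0.0, None))
        effective = spent if cap is None else min(spent, cap)
        total += priority * effective
    return total


sketch_qts = sketch_quality_time_score(sketch_hours, sketch_priorities)
print(f"Sketch QTS: {sketch_qts:.2f}")
```

No domain in today's log exceeds its cap, so under this assumption the score is just the priority-weighted sum of hours. If the value printed by `calculate_quality_time_score` differs, the library applies a different rule than the one assumed here.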

In [None]:
# Goal Alignment: what % of time advances stated goals?
alignment = calculate_goal_alignment(entries, goals)
distraction = calculate_distraction_ratio(entries, goals)

print(f"Goal Alignment:   {alignment:.1f}%")
print(f"Distraction Ratio: {distraction:.1f}%")

# Visualize
bar_len = 40
aligned_bars = int((alignment / 100) * bar_len)
distracted_bars = bar_len - aligned_bars
print(f"\n[{'█' * aligned_bars}{'░' * distracted_bars}]")
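# str.rjust/ljust tolerate zero or negative widths, so the label line also
# works when alignment is 0% (a negative width in a format spec would raise)
aligned_label = "Goal-aligned".rjust(aligned_bars - 1)
distracted_label = "Distracted".ljust(distracted_bars)
print(f" {aligned_label} | {distracted_label}")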
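
The two percentages can be approximated by hand. The sketch below assumes that goal alignment is the share of tracked hours spent in any domain linked to at least one goal, and that the distraction ratio is its complement. Both definitions are assumptions made for illustration, and `sketch_goal_alignment` is a hypothetical helper rather than a library function.

```python
# Assumption: alignment = hours in goal-linked domains / total hours * 100,
# and distraction = 100 - alignment. The library may define them differently.
sketch_goal_domains = {
    "coding", "tutorials", "documentation",  # from "Learn Python"
    "exercise", "meal_prep",                 # from "Stay Healthy"
}
sketch_log = {
    "coding": 3.0,
    "tutorials": 1.5,
    "social_media": 2.0,
    "exercise": 0.75,
    "news": 0.5,
    "meal_prep": 0.5,
}


def sketch_goal_alignment(log, goal_domains):
    """Percent of tracked hours that fall in goal-linked domains."""
    total = sum(log.values())
    if total == 0:
        return 0.0  # nothing tracked, so nothing is aligned
    aligned = sum(h for d, h in log.items() if d in goal_domains)
    return 100.0 * aligned / total


sketch_alignment = sketch_goal_alignment(sketch_log, sketch_goal_domains)
sketch_distraction = 100.0 - sketch_alignment
print(f"Sketch alignment:   {sketch_alignment:.1f}%")
print(f"Sketch distraction: {sketch_distraction:.1f}%")
```

With today's log, 5.75 of the 8.25 tracked hours fall in goal-linked domains, so the sketch reports roughly 69.7% aligned and 30.3% distracted.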

In [None]:
# Actionability Score: does consumed content translate into action?
# Imagine tracking your AI assistant interactions today:
# - Consumed 25 responses
# - Bookmarked 5 (saved to notes)
# - Shared 2 (sent to colleague)
# - Applied 8 (directly used in a deliverable)

actionability = calculate_actionability_score(
    consumed=25,
    bookmarked=5,
    shared=2,
    applied=8,
)

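# Custom weights (optional): defined here for illustration only; this cell
# does not pass them to calculate_actionability_score
custom_weights = ActionWeights(bookmarked=0.3, shared=0.5, applied=1.0)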

print(f"Actionability Score: {actionability:.3f}")
print("(Default weights: bookmarked=0.3, shared=0.5, applied=1.0)")
print()

# Interpretation
if actionability > 0.5:
    label = "HIGH — you're acting on what you consume"
elif actionability > 0.2:
    label = "MODERATE — some output, room to improve"
else:
    label = "LOW — mostly passive consumption"
print(f"Rating: {label}")
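
One plausible reading of the Actionability Score is a weighted count of actions divided by the number of items consumed. The sketch below uses the default weights printed above and assumes that formula; the library's exact computation is not shown in this notebook and may normalize or cap the result differently. `sketch_actionability` is a hypothetical helper.

```python
def sketch_actionability(consumed, bookmarked, shared, applied,
                         weights=(0.3, 0.5, 1.0)):
    """Weighted actions per consumed item (assumed formula, not the library's)."""
    if consumed <= 0:
        return 0.0  # avoid dividing by zero when nothing was consumed
    w_bookmarked, w_shared, w_applied = weights
    weighted_actions = (
        w_bookmarked * bookmarked + w_shared * shared + w_applied * applied
    )
    return weighted_actions / consumed


sketch_score = sketch_actionability(consumed=25, bookmarked=5, shared=2, applied=8)
print(f"Sketch actionability: {sketch_score:.3f}")
```

For the counts above this gives (0.3·5 + 0.5·2 + 1.0·8) / 25 = 0.42, which would land in the MODERATE band of the rating thresholds used in the previous cell.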

## Part 3: Full Metrics Report

`generate_metrics_report` combines all metrics into a structured report with recommendations.

In [None]:
action_log = ActionLog(consumed=25, bookmarked=5, shared=2, applied=8)

report = generate_metrics_report(
    time_entries=entries,
    priorities=priorities,
    goals=goals,
    actions=action_log,
    period="daily",
)

print("=" * 50)
print("DAILY METRICS REPORT")
print("=" * 50)
print(f"Quality Time Score:   {report.quality_time_score:.2f}")
print(f"Raw Time Tracked:     {report.raw_time_hours:.1f}h")
print(f"Goal Alignment:       {report.goal_alignment_percent:.1f}%")
print(f"Distraction Ratio:    {report.distraction_percent:.1f}%")
print(f"Actionability Score:  {report.actionability_score:.3f}")
print()
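print("Recommendations:")
# Build the icon lookup once, outside the loop
icons = {"high": "[!!!]", "medium": "[!! ]", "low": "[ i ]"}
for rec in report.recommendations:
    # Fall back to a neutral marker if a priority label is not in the lookup
    print(f"  {icons.get(rec.priority, '[ ? ]')} {rec.message}")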

## Part 4: AI Product Evaluation

Meaningful Metrics is most useful for evaluating AI products. The example below is a small version of the ChatGPT case study: it models how a user's time with an AI writing assistant maps to their declared goal.

In [None]:
# Evaluating an AI writing assistant
# User's declared goal: improve their writing craft

writing_goals = [
    Goal(
        id="improve-craft",
        name="Improve Writing Craft",
        domains=["drafting_with_feedback", "style_learning", "revision_practice"],
    ),
]

writing_priorities = [
    DomainPriority(domain="drafting_with_feedback", priority=1.0, max_daily_hours=1.5),
    DomainPriority(domain="revision_practice", priority=0.9, max_daily_hours=1.0),
    DomainPriority(domain="style_learning", priority=0.8, max_daily_hours=0.5),
    DomainPriority(domain="ai_ghostwriting", priority=0.1),  # AI writes for user
    DomainPriority(domain="off_task", priority=0.05),
]

# Scenario A: The user is using AI as a tutor (good pattern)
tutor_entries = [
    TimeEntry(domain="drafting_with_feedback", hours=1.0),
    TimeEntry(domain="revision_practice", hours=0.5),
    TimeEntry(domain="style_learning", hours=0.3),
    TimeEntry(domain="off_task", hours=0.2),
]

# Scenario B: The user is using AI to write for them (engagement without growth)
ghostwriter_entries = [
    TimeEntry(domain="ai_ghostwriting", hours=1.5),
    TimeEntry(domain="off_task", hours=0.5),
    TimeEntry(domain="drafting_with_feedback", hours=0.0),
    TimeEntry(domain="revision_practice", hours=0.0),
]

report_a = generate_metrics_report(tutor_entries, writing_priorities, writing_goals)
report_b = generate_metrics_report(ghostwriter_entries, writing_priorities, writing_goals)

print("SCENARIO A: AI as Tutor")
print(f"  Goal Alignment: {report_a.goal_alignment_percent:.1f}%")
print(f"  Quality Time Score: {report_a.quality_time_score:.2f}")
print()
print("SCENARIO B: AI as Ghostwriter")
print(f"  Goal Alignment: {report_b.goal_alignment_percent:.1f}%")
print(f"  Quality Time Score: {report_b.quality_time_score:.2f}")
print()
print("KEY INSIGHT: Both scenarios generate similar engagement metrics")
print("(time with the product, messages sent, tokens consumed).")
print("Meaningful Metrics reveals that only Scenario A advances the user's goal.")
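
The contrast between the two scenarios can be checked by hand. The sketch below compares raw hours (a stand-in for an engagement metric) with the share of hours in goal-linked domains, again assuming alignment is simply goal-linked hours divided by total hours. `sketch_summary` is a hypothetical helper, and the library's reports may compute these values differently.

```python
# Hours per domain, copied from the two scenarios above
# (zero-hour entries are omitted because they do not affect the totals).
sketch_tutor = {
    "drafting_with_feedback": 1.0,
    "revision_practice": 0.5,
    "style_learning": 0.3,
    "off_task": 0.2,
}
sketch_ghostwriter = {"ai_ghostwriting": 1.5, "off_task": 0.5}
sketch_writing_goal_domains = {
    "drafting_with_feedback", "style_learning", "revision_practice",
}


def sketch_summary(log, goal_domains):
    """Return (raw hours, percent of hours in goal-linked domains)."""
    total = sum(log.values())
    aligned = sum(h for d, h in log.items() if d in goal_domains)
    percent = 100.0 * aligned / total if total else 0.0
    return total, percent


for label, log in [("Tutor", sketch_tutor), ("Ghostwriter", sketch_ghostwriter)]:
    raw_hours, aligned_percent = sketch_summary(log, sketch_writing_goal_domains)
    print(f"{label:<12} raw time={raw_hours:.1f}h  aligned={aligned_percent:.0f}%")
```

Both scenarios total 2.0 hours with the product, yet under this assumption the tutor pattern is 90% goal-aligned and the ghostwriter pattern is 0%, which is the gap that time-based engagement metrics cannot show.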

## Next Steps

- Read the full [ChatGPT Case Study](https://github.com/jstilb/meaningful_metrics/blob/main/results/case-studies/chatgpt-goal-alignment.md)
- Browse the [API Reference](https://jstilb.github.io/meaningful_metrics/api/)
- Contribute your own metrics or case studies via [GitHub](https://github.com/jstilb/meaningful_metrics)

---

*Meaningful Metrics is open source under the MIT License.*