Resolution Lab

An AI motivation coach that experiments on itself to get better at motivating YOU.

Built for Comet Hackathon · Powered by Opik · Next.js · FastAPI


The Problem

Generic motivation advice doesn't work. "Just set reminders" works for some people but gets ignored by others. "Track your streaks" motivates gamers but stresses perfectionists. What motivates your friend might actually demotivate you.

There is no universal motivation formula — but there might be a personal one.

The Solution

Resolution Lab is an AI-powered behavioral science platform that runs real experiments on each user to discover their unique motivation formula.

Instead of guessing what works, the system:

  1. Tests 8 evidence-based motivation strategies using AI-generated personalized messages
  2. Measures real outcomes — did the user actually follow through?
  3. Learns and adapts using a multi-armed bandit algorithm (epsilon-greedy)
  4. Continuously improves its own prompts using Opik's auto-optimization pipeline
  5. Evaluates conversation quality with LLM-as-Judge metrics tracked in Opik threads

Every interaction generates data. Every data point makes the next interaction better.


How It Works

The 8 Motivation Strategies

Each strategy is rooted in behavioral psychology:

| # | Strategy | Approach | Example |
|---|----------|----------|---------|
| 1 | Gentle Reminder | Warm, friendly nudges | "Hey! Just checking in on your goal today..." |
| 2 | Direct Accountability | Yes/no commitment framing | "Did you do it today? Be honest with yourself." |
| 3 | Streak Gamification | Progress chains and rewards | "Day 12 streak! Don't break the chain!" |
| 4 | Social Comparison | Peer benchmarking | "73% of users with similar goals completed today" |
| 5 | Loss Aversion | What you stand to lose | "Skip today and you lose your 5-day momentum..." |
| 6 | Reward Preview | Future benefit visualization | "Imagine how strong you'll feel in 30 days!" |
| 7 | Identity Reinforcement | Becoming-based framing | "You're becoming someone who prioritizes health" |
| 8 | Micro-Commitment | Lower the barrier | "Can you commit to just 2 minutes?" |
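
The backend encodes these strategies in backend/models/schemas.py using Pydantic. A minimal sketch of the idea, assuming a plain string enum (member values are inferred from the strategy names and the check-in examples later in this README, not copied from the source):

```python
from enum import Enum

class MotivationStrategy(str, Enum):
    """The 8 evidence-based strategies the bandit chooses between."""
    GENTLE_REMINDER = "gentle_reminder"
    DIRECT_ACCOUNTABILITY = "direct_accountability"
    STREAK_GAMIFICATION = "streak_gamification"
    SOCIAL_COMPARISON = "social_comparison"
    LOSS_AVERSION = "loss_aversion"
    REWARD_PREVIEW = "reward_preview"
    IDENTITY_REINFORCEMENT = "identity_reinforcement"
    MICRO_COMMITMENT = "micro_commitment"
```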

The Multi-Armed Bandit

The system uses an epsilon-greedy multi-armed bandit to balance exploration vs. exploitation:

  • 20% of the time: Explore — try a random strategy to gather data
  • 80% of the time: Exploit — use the strategy with the highest observed success rate
  • Each strategy needs 3+ data points before the system trusts its effectiveness score
  • Per-user state: your bandit learns your patterns, not global averages

When the system identifies a clear winner, users can lock in their personal formula (90% best strategy / 10% continued exploration).
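
A minimal sketch of that selection rule (illustrative only; the real logic lives in backend/services/experiment_engine.py and persists per-user state in Supabase):

```python
import random

EPSILON = 0.2      # explore 20% of the time (10% once a formula is locked in)
MIN_SAMPLES = 3    # trust a strategy's success rate only after 3+ outcomes

def select_strategy(stats: dict[str, tuple[int, int]]) -> str:
    """Pick a strategy from per-user stats: {strategy: (completions, attempts)}."""
    trusted = {s: w / n for s, (w, n) in stats.items() if n >= MIN_SAMPLES}
    if not trusted or random.random() < EPSILON:
        return random.choice(list(stats))    # explore: gather more data
    return max(trusted, key=trusted.get)     # exploit: best observed success rate

def record_outcome(stats: dict[str, tuple[int, int]],
                   strategy: str, completed: bool) -> None:
    """Update the per-user state after a check-in."""
    wins, n = stats.get(strategy, (0, 0))
    stats[strategy] = (wins + int(completed), n + 1)
```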

The AI Coach Agent — 6-Step Cognitive Loop

The full agent mode runs an autonomous reasoning loop, where every step is traced in Opik with parent-child relationships:

OBSERVE → THINK → PLAN → ACT → EVALUATE → LEARN
| Step | What Happens | Opik Trace |
|------|--------------|------------|
| Observe | Gather user history, streak data, emotional state, experiment results | `agent_observe` |
| Think | Chain-of-thought reasoning about motivation patterns | `agent_think` |
| Plan | Multi-armed bandit selects strategy; agent plans personalization | `agent_plan` |
| Act | Generate personalized motivation message via LLM | `agent_act` |
| Evaluate | Custom evaluators score message quality (strategy alignment, effectiveness, personalization, tone) | `agent_evaluate` |
| Learn | Record outcome, update bandit state, trigger optimization if threshold reached | `agent_learn` |

There's also a Quick Stream mode using the Vercel AI SDK (useCompletion) for instant, streamed motivation messages — same learning pipeline, faster delivery.

*AI Coach Agent Interface: 6-step cognitive loop with visible reasoning at each step*


Opik Integration — Deep & Production-Grade

This project demonstrates comprehensive Opik integration across tracing, evaluation, threads, feedback scores, and automatic prompt optimization.

1. Full Tracing Pipeline

Every LLM call is automatically traced via `litellm.callbacks = ["opik"]`. On top of that, 15+ functions are decorated with `@opik.track()` for fine-grained observability:

Agent Traces (Nested Parent-Child)
└── agent_full_loop (parent trace)
    ├── agent_observe
    ├── agent_think
    ├── agent_plan
    ├── agent_act
    ├── agent_evaluate    ← runs custom evaluators
    └── agent_learn       ← updates bandit + triggers optimization
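
A condensed sketch of how this nesting is produced. Function names mirror the trace names above; the bodies are placeholders and the model string is an assumption:

```python
import litellm
from opik import track

litellm.callbacks = ["opik"]  # every LiteLLM call is traced automatically

@track(name="agent_observe")
def agent_observe(user_id: str) -> dict:
    """Gather user history, streak data, and experiment results."""
    return {"history": [], "streak": 0}  # placeholder

@track(name="agent_act")
def agent_act(prompt: str) -> str:
    """Generate the motivation message; the LLM call is captured by the callback."""
    response = litellm.completion(
        model="gemini/gemini-2.5-flash",  # model name is an assumption
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

@track(name="agent_full_loop")
def agent_full_loop(user_id: str) -> str:
    """Parent trace; each decorated call below appears as a nested child span."""
    context = agent_observe(user_id)
    # ... think / plan / evaluate / learn steps elided ...
    return agent_act(f"Motivate a user on a {context['streak']}-day streak")
```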

2. Thread Evaluation — Conversation-Level Quality

Every goal becomes an Opik thread. All check-ins for a goal are grouped under the same thread_id, enabling conversation-level analysis:

Goal: "Drink 8 glasses of water daily"     thread_id: goal_abc123
│
├── Check-in 1: Strategy=gentle_reminder     Outcome: completed
├── Check-in 2: Strategy=streak_gamification Outcome: completed
├── Check-in 3: Strategy=loss_aversion       Outcome: missed
├── Check-in 4: Strategy=gentle_reminder     Outcome: completed
├── Check-in 5: Strategy=identity            Outcome: completed
│
└── AUTO-EVALUATION TRIGGERED (every 5 check-ins)
    ├── ConversationalCoherenceMetric → 0.85
    └── UserFrustrationMetric         → 0.12
    → Feedback scores attached to thread in Opik

After every 5 check-ins, the system automatically:

  1. Closes the Opik thread (marks inactive)
  2. Runs evaluate_threads() with LLM-as-Judge metrics
  3. Attaches feedback scores visible in Opik's Thread view
  4. Reopens the thread for continued tracking
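
Step 2 in code, following the shape of Opik's documented evaluate_threads API; the project name, filter value, and transforms are assumptions about how this repo wires it up:

```python
from opik.evaluation import evaluate_threads
from opik.evaluation.metrics import (
    ConversationalCoherenceMetric,
    UserFrustrationMetric,
)

evaluate_threads(
    project_name="resolution-lab",
    filter_string='id = "goal_abc123"',  # the goal's thread; must be inactive
    eval_project_name="resolution-lab",
    metrics=[ConversationalCoherenceMetric(), UserFrustrationMetric()],
    trace_input_transform=lambda t: t["input"],
    trace_output_transform=lambda t: t["output"],
)
```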

*Opik Thread Evaluation: auto-evaluation triggered every 5 check-ins with coherence and frustration metrics*

*Thread View in Opik: goal-based conversation threads with all check-ins grouped together*

3. Custom Opik Evaluators

Every AI-generated message is scored by custom evaluators:

| Evaluator | What It Measures |
|-----------|------------------|
| Strategy Alignment | Does the message match the intended strategy's keywords and tone? |
| Motivation Effectiveness | Is this message likely to drive the user to action? |
| Personalization | Does it feel tailored to this specific user, or generic? |
| Tone Consistency | Does the emotional tone match what the strategy demands? |
| Insight Quality | Are generated insights actionable and data-grounded? |
| Celebration Image | Quality assessment of AI-generated celebration images |

Hybrid scoring: Custom evaluators (40%) + LLM-as-Judge (60%) = Overall letter grade (A–F)
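
The blend itself is simple arithmetic; a toy sketch (the letter-grade cutoffs are illustrative assumptions, not taken from the repo):

```python
def overall_grade(custom_score: float, judge_score: float) -> tuple[float, str]:
    """Blend custom evaluators (40%) with LLM-as-Judge (60%); scores in [0, 1]."""
    overall = 0.4 * custom_score + 0.6 * judge_score
    cutoffs = [(0.9, "A"), (0.8, "B"), (0.7, "C"), (0.6, "D")]
    letter = next((grade for floor, grade in cutoffs if overall >= floor), "F")
    return round(overall, 2), letter

print(overall_grade(0.75, 0.90))  # (0.84, 'B')
```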

*Evaluation Feedback Scores: custom evaluators scoring every AI-generated message with detailed metrics*

4. Auto Prompt Optimization (Opik Agent Optimizer)

The system automatically improves its own prompts using opik-optimizer:

User checks in → Intervention counter increments
                  │
                  ├── Count < 3  → Continue collecting data
                  │
                  └── Count >= 3 → BACKGROUND OPTIMIZATION TRIGGERED
                                   │
                                   ├── Algorithm: MetaPromptOptimizer
                                   │   (LLM critiques current prompt,
                                   │    iteratively refines for better
                                   │    completion rates)
                                   │
                                   ├── Runs in ProcessPoolExecutor
                                   │   (separate process, non-blocking)
                                   │
                                   └── Results logged to Opik
                                       • Original score → Optimized score
                                       • New prompt saved and used going forward
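
A sketch of the non-blocking trigger described above. The ProcessPoolExecutor part is standard library; the opik_optimizer calls follow the library's documented shape, and outcomes_dataset, completion_rate_metric, and save_prompt are hypothetical stand-ins for this repo's own wiring:

```python
from concurrent.futures import ProcessPoolExecutor

executor = ProcessPoolExecutor(max_workers=1)
OPTIMIZE_THRESHOLD = 3
counters: dict[str, int] = {}

def optimize_strategy_prompt(strategy: str, current_prompt: str) -> None:
    """Runs in a separate process so check-in requests are never blocked."""
    from opik_optimizer import ChatPrompt, MetaPromptOptimizer

    optimizer = MetaPromptOptimizer(model="gemini/gemini-2.5-flash")
    prompt = ChatPrompt(messages=[{"role": "system", "content": current_prompt}])
    # outcomes_dataset / completion_rate_metric are hypothetical: an Opik dataset
    # of past check-ins and a metric that scores completion rate.
    result = optimizer.optimize_prompt(
        prompt=prompt, dataset=outcomes_dataset, metric=completion_rate_metric
    )
    save_prompt(strategy, result)  # hypothetical: persist the improved prompt

def on_checkin(strategy: str, current_prompt: str) -> None:
    """Increment the per-strategy counter; kick off optimization at the threshold."""
    counters[strategy] = counters.get(strategy, 0) + 1
    if counters[strategy] >= OPTIMIZE_THRESHOLD:
        executor.submit(optimize_strategy_prompt, strategy, current_prompt)
        counters[strategy] = 0
```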

Three optimization algorithms available:

| Algorithm | Method |
|-----------|--------|
| `MetaPromptOptimizer` | LLM self-critique and iterative refinement |
| `FewShotBayesianOptimizer` | Bayesian search for optimal few-shot examples |
| `EvolutionaryOptimizer` | Genetic mutation/crossover for novel prompt discovery |

*Auto Prompt Optimization: background optimization running with the opik-optimizer SDK*

*Optimization Results: prompt improvement scores showing before/after effectiveness*

5. Feedback Scores & Experiments

  • Intervention-level: strategy alignment, effectiveness, personalization, tone (per message)
  • Thread-level: conversational coherence, user frustration (per goal)
  • Engagement-level: reminder interactions, response time, voice usage
  • Optimization-level: prompt improvement percentage, before/after scores
  • A/B Experiments: prompt_experiment_select, prompt_experiment_record, prompt_experiment_report
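
Scores like these can be attached from inside a traced function via Opik's opik_context helper; a minimal sketch with placeholder values (score names mirror the list above):

```python
from opik import opik_context, track

@track(name="agent_evaluate")
def agent_evaluate(message: str) -> None:
    # Values here are placeholders for what the custom evaluators compute.
    opik_context.update_current_trace(
        feedback_scores=[
            {"name": "strategy_alignment", "value": 0.92},
            {"name": "motivation_effectiveness", "value": 0.85},
            {"name": "personalization", "value": 0.81},
            {"name": "tone_consistency", "value": 0.88},
        ]
    )
```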

Architecture

┌────────────────────────────────────────────────────────────────────────┐
│                          RESOLUTION LAB                                │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  ┌──────────────┐     ┌──────────────┐     ┌────────────────────────┐ │
│  │   Next.js    │────▶│   FastAPI    │────▶│    AI Coach Agent      │ │
│  │   Frontend   │◀────│   Backend    │◀────│  (6-Step Cognitive     │ │
│  │  (Vercel)    │     │  (Railway)   │     │   Loop + Streaming)    │ │
│  └──────────────┘     └──────────────┘     └────────────────────────┘ │
│         │                    │                        │                │
│         │                    ▼                        ▼                │
│         │            ┌──────────────┐        ┌──────────────┐         │
│         │            │   Supabase   │        │   Gemini     │         │
│         │            │  (Postgres   │        │   (LiteLLM)  │         │
│         │            │   + Auth)    │        └──────────────┘         │
│         │            └──────────────┘                │                │
│         │                                            ▼                │
│         │                              ┌──────────────────────┐       │
│         └─────────────────────────────▶│        OPIK          │       │
│                  (View Traces)         │  Traces · Threads ·  │       │
│                                        │  Evaluators · Scores │       │
│                                        │  Auto-Optimization   │       │
│                                        └──────────────────────┘       │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

Tech Stack

| Layer | Technology |
|-------|------------|
| Frontend | Next.js 14, React 18, TypeScript, TailwindCSS, Vercel AI SDK |
| Backend | FastAPI, Python, LiteLLM, asyncio |
| LLM | Google Gemini (via LiteLLM with Opik callback) |
| Database | Supabase (PostgreSQL + Row-Level Security + Auth) |
| Auth | Supabase Auth with Google OAuth |
| Observability | Opik — tracing, threads, evaluators, feedback scores |
| Optimization | opik-optimizer (MetaPrompt, FewShot Bayesian, Evolutionary) |
| Image Generation | Gemini 2.5 Flash (celebration images on check-in) |
| Hosting | Vercel (frontend) + Railway (backend) |

Features

Core Loop

  • Goal creation — set personal goals with descriptions and target dates
  • Strategy testing — multi-armed bandit selects and tests motivation strategies per check-in
  • Check-in flow — simple yes/no + optional feedback after each motivation message
  • Formula discovery — after enough data, the system reveals your personal motivation formula
  • Per-goal formulas — different goals can have different winning strategies

AI Coach

  • Full Agent mode — 6-step cognitive loop with visible reasoning at each step
  • Quick Stream mode — instant streamed messages via Vercel AI SDK
  • Voice playback — text-to-speech using Web Speech API with voice selection
  • Micro-commitment fallback — if user says "not yet", offers a smaller commitment

*Quick Stream Mode: instant streamed motivation via Vercel AI SDK*

Engagement

  • Streak calendar — 35-day visual check-in history
  • Streak highlights — goals with 3+ day streaks get visual badges
  • Celebration images — AI-generated personalized images on check-in (12 goal categories)
  • In-app reminders — smart notification banners for goals needing attention
  • Time-based greetings — personalized by time of day

Observability (Opik)

  • Automatic LLM tracing — every Gemini call captured via LiteLLM callback
  • Nested agent traces — parent-child relationships across 6 cognitive steps
  • Thread evaluation — auto-triggered every 5 check-ins with coherence + frustration metrics
  • Custom evaluators — 6 evaluators scoring every AI output with letter grades
  • Auto prompt optimization — background optimization after 3 interventions per strategy
  • Feedback scores — engagement, coherence, frustration, optimization improvement

Prerequisites

  • Python 3 and pip (backend)
  • Node.js and npm (frontend)
  • A Supabase project
  • A Comet account with an Opik API key
  • A Google Gemini API key (from aistudio.google.com/apikey)

Quick Start

1. Set Up Supabase

  1. Create a new project at supabase.com
  2. Go to SQL Editor and run the contents of backend/supabase_schema.sql — this creates all tables, RLS policies, and triggers
  3. Go to Authentication > Providers and enable Google OAuth (you'll need a Google Cloud OAuth client ID)
  4. Copy your project URL, anon key, and service key from Settings > API

2. Backend

cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Create your .env file:

cp .env.example .env

Fill in the values:

# Supabase (from Settings > API in your Supabase dashboard)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_KEY=your-service-key

# Opik (from comet.com/account-settings/apiKeys)
OPIK_API_KEY=your-opik-api-key
OPIK_WORKSPACE=your-workspace-name
OPIK_PROJECT_NAME=resolution-lab

# Google Gemini (from aistudio.google.com/apikey)
GOOGLE_API_KEY=your-google-api-key

Start the server:

uvicorn main:app --reload

3. Frontend

cd frontend
npm install

Create your .env.local file:

cp .env.local.example .env.local

Fill in the values:

# Backend API URL
NEXT_PUBLIC_API_URL=http://localhost:8000

# Supabase (same project as backend)
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key

# Google Gemini key for streaming mode (from aistudio.google.com/apikey)
GOOGLE_STREAMING_API_KEY=your-google-api-key

Start the dev server:

npm run dev

4. Access

| Service | URL |
|---------|-----|
| Frontend | http://localhost:3000 |
| API Docs (Swagger) | http://localhost:8000/docs |
| Opik Dashboard | https://comet.com/opik |

Project Structure

resolution-lab/
├── backend/
│   ├── main.py                              # FastAPI app, Opik + LiteLLM config
│   ├── config.py                            # Environment settings
│   ├── models/
│   │   ├── schemas.py                       # 8 strategies, goals, interventions (Pydantic)
│   │   └── database.py                      # Database models
│   ├── routers/
│   │   ├── interventions.py                 # Message generation, check-in, auto-optimization trigger
│   │   ├── agent.py                         # AI Coach agent + optimization endpoints
│   │   ├── goals.py                         # Goal CRUD
│   │   ├── insights.py                      # Formula discovery + strategy analytics
│   │   ├── auth.py                          # Authentication
│   │   └── opik_stats.py                    # Opik metrics API
│   └── services/
│       ├── coach_agent.py                   # 6-step cognitive loop
│       ├── experiment_engine.py             # Epsilon-greedy multi-armed bandit
│       ├── intervention_generator.py        # LLM message generation + evaluation
│       ├── evaluators.py                    # Custom Opik evaluators (6 metrics)
│       ├── thread_evaluator.py              # Opik thread lifecycle + auto-evaluation
│       ├── auto_optimizer.py                # Background prompt optimization (ProcessPoolExecutor)
│       ├── prompt_optimizer.py              # opik-optimizer integration (3 algorithms)
│       ├── celebration_image_generator.py   # Gemini image generation
│       ├── analysis_engine.py               # Sentiment analysis + recommendations
│       ├── reminder_service.py              # In-app reminder scheduling
│       ├── user_context_builder.py          # Personalization context
│       └── database.py                      # Supabase operations
├── frontend/
│   └── src/
│       ├── app/
│       │   ├── page.tsx                     # Landing page
│       │   ├── dashboard/page.tsx           # Dashboard with goals + streaks
│       │   ├── agent/page.tsx               # AI Coach (full agent + streaming)
│       │   ├── goals/page.tsx               # Goal management
│       │   ├── insights/page.tsx            # Formula discovery + analytics
│       │   └── experiment/page.tsx          # Experiment simulation
│       ├── components/
│       │   ├── Header.tsx                   # Navigation
│       │   ├── GoalCard.tsx                 # Goal cards with formula UI
│       │   ├── CheckInModal.tsx             # Check-in with celebration images
│       │   ├── StreakCalendar.tsx           # 35-day visual calendar
│       │   └── ReminderBanner.tsx           # Smart notification banners
│       ├── contexts/AuthContext.tsx         # Global auth state
│       ├── hooks/useTextToSpeech.ts         # Web Speech API
│       └── lib/api.ts                       # API client
└── README.md

API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/agent/run` | Run full 6-step AI Coach agent |
| POST | `/api/interventions/generate` | Generate motivation message (bandit selects strategy) |
| POST | `/api/interventions/{id}/record-outcome` | Submit check-in result |
| POST | `/api/interventions/{id}/celebration` | Generate celebration image |
| GET | `/api/goals` | List user goals |
| POST | `/api/goals` | Create goal |
| GET | `/api/insights` | Get personal motivation formula + strategy stats |
| GET | `/api/agent/optimization/auto-status` | View auto-optimization status |
| POST | `/api/agent/optimization/reset-counts` | Reset optimization counters |
| GET | `/api/opik/stats` | Opik experiment statistics |

Full interactive docs at /docs (Swagger UI).
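
As a quick smoke test against the local backend, something like the following should work (the request body and auth header are assumptions; check the Swagger schema for the real shapes):

```python
import requests

API = "http://localhost:8000"

# Hypothetical payload; consult /docs for the actual schema.
resp = requests.post(
    f"{API}/api/interventions/generate",
    json={"goal_id": "goal_abc123"},
    headers={"Authorization": "Bearer <supabase-jwt>"},
)
resp.raise_for_status()
print(resp.json())  # generated message + the strategy the bandit selected
```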


What Makes This Different

| Aspect | Resolution Lab | Typical AI Coach |
|--------|----------------|------------------|
| Strategy selection | Multi-armed bandit with real outcome data | Fixed prompts or random |
| Learning | Per-user effectiveness tracking across 8 strategies | No personalization loop |
| Self-improvement | Auto prompt optimization via opik-optimizer | Static prompts |
| Observability | Full Opik pipeline — traces, threads, evaluators, feedback | Basic logging |
| Evaluation | Hybrid: custom evaluators + LLM-as-Judge + thread-level metrics | None |
| Transparency | User sees their experiment data and formula | Black box |
| Agent architecture | 6-step cognitive loop with visible reasoning | Single LLM call |

The core insight: the system doesn't just coach you — it runs a scientific experiment on which coaching approach works best for you, and gets better at it over time.


Built for the Comet "Commit to Change" AI Agents Hackathon
