# Experiment Setup and Validation Notebook

This notebook walks you through setting up a jsPsych experiment with WAVE backend integration. It will:

1. **Setup Phase**: Guide you through installing dependencies and testing your experiment locally
2. **Schema Definition**: Help you define the experiment data schema that matches your experiment's output
3. **Backend Integration**: Create and test experiment types in the WAVE backend
4. **Validation**: Run a complete test cycle to ensure data logging works correctly
5. **Production Setup**: Create your final experiment for real data collection

**Prerequisites**: This notebook assumes you have already set up most of your experiment code and the WAVE client integration in `src/js/integrations/wave-client.js`. Before you start, copy `tools/.env.example` to `tools/.env` and fill in the API keys.

**Important**: Only share EXPERIMENTEE-level API keys with participants. Keep RESEARCHER-level keys secure and private.
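
Before running the cells below, it can help to confirm that `tools/.env` defines every variable this notebook reads. The following is a minimal, stdlib-only sketch: `missing_env_keys` is a hypothetical helper (not part of the WAVE client), and its parser handles only simple `KEY=value` lines.

```python
from pathlib import Path

# Variable names read later in this notebook
REQUIRED_KEYS = ("RESEARCHER_API_KEY", "EXPERIMENTEE_API_KEY", "WAVE_BACKEND_URL")


def missing_env_keys(env_text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in .env-style text.

    Hypothetical helper: handles plain KEY=value lines only.
    """
    found = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        found[key.strip()] = value.strip().strip("'\"")
    return [key for key in required if not found.get(key)]


# Path relative to the project root; adjust to your working directory
env_path = Path("tools/.env")
if env_path.exists():
    print("Missing keys:", missing_env_keys(env_path.read_text()) or "none")
else:
    print(f"{env_path} not found - copy tools/.env.example first")
```

The check only reports key names, so it is safe to run in a shared session without exposing key values.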

## Setup Instructions

### 1. Install uv Package Manager

First, install [uv](https://docs.astral.sh/uv/), a fast Python package and project manager:

```bash
# On macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

### 2. Install Project Dependencies

In the root directory of this project, install the Python dependencies:

```bash
uv sync
```

This will install all dependencies defined in `pyproject.toml`, which specifies the packages needed for experiment setup, data validation, and WAVE backend integration.

### 3. Understanding the uv Virtual Environment

uv automatically creates and manages a virtual environment in the `.venv` folder. This keeps your project dependencies isolated from your system Python installation.

**Important**: If you encounter any package conflicts or want a completely clean slate, simply delete the `.venv` folder and `uv.lock` file and run `uv sync` again to recreate it from scratch.
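
To confirm that a Python session is actually using the project's `.venv`, one option is to inspect `sys.prefix`, which points at the environment directory when a virtual environment is active. A small sketch, where `in_project_venv` is a hypothetical helper and the folder name `.venv` is taken from the text above:

```python
import sys
from pathlib import Path


def in_project_venv(prefix, venv_dir_name=".venv"):
    """Return True if the interpreter prefix path ends in the given venv folder name."""
    return Path(prefix).name == venv_dir_name


print("Interpreter prefix:", sys.prefix)
print("Running inside .venv:", in_project_venv(sys.prefix))
```

If the second line prints `False`, the notebook kernel is probably pointing at a different interpreter than the one uv created.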

### 4. Running This Notebook

You have multiple options for running this notebook:

**Option A: VS Code Jupyter Extension**
1. Install the [Jupyter VSCode extension](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter)
2. Open this notebook file in VS Code
3. When prompted to select a kernel, choose the Python interpreter from the `.venv` folder:
   - Click "Select Kernel" in the top-right of the notebook
   - Choose "Python Environments..."
   - Select the interpreter at `.venv/bin/python` (or `.venv/Scripts/python.exe` on Windows)
4. VS Code will use this virtual environment for all notebook cells


**Option B: Jupyter Lab/Notebook**
```bash
uv run --with jupyter jupyter lab
```
This installs the project dependencies if needed and launches JupyterLab in your browser with access to the project's virtual environment. In the Jupyter file browser, open `tools/setup_experiment.ipynb` and run the notebook from there.


**Option C: PyCharm IDE**
1. Open PyCharm and select "Open" to open this project folder
2. PyCharm should automatically detect the `pyproject.toml` file and prompt you to configure the Python interpreter
3. If not prompted automatically:
   - Go to File → Settings (or PyCharm → Preferences on macOS)
   - Navigate to Project → Python Interpreter
   - Click the gear icon → Add
   - Select "Existing environment"
   - Browse to `.venv/bin/python` (or `.venv/Scripts/python.exe` on Windows)
   - Click OK to apply
4. Open the `tools/setup_experiment.ipynb` file



### 5. Verify Installation

In [None]:
from wave_client import (
    WaveClient,
)  # <-- If this import doesn't work, then something went wrong with the Wave Client install!!!

In [None]:
import webbrowser
import socketserver
import threading
import time
from pathlib import Path
import pandas as pd
import sys
import http.server

In [None]:
import warnings

warnings.filterwarnings("ignore", message="To exit: use 'exit', 'quit', or Ctrl-D.")

## Phase 1: Local Experiment Testing

### Test Your Experiment Locally

First, let's open your experiment in a web browser to test it locally (without WAVE logging).
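
The server cell binds to a fixed port, which fails with `OSError: Address already in use` if a server from an earlier run is still holding it. A minimal sketch for checking the port first; `port_is_free` is a hypothetical helper, not part of this template:

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1)
        # connect_ex returns 0 only when a listener accepts the connection
        return sock.connect_ex((host, port)) != 0


print("Port 8080 free:", port_is_free(8080))
```

If the port is taken, either restart the kernel that owns the old server or pick a different `PORT` value.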

In [None]:
# Configuration: Modify this path if your experiment root is located elsewhere
EXPERIMENT_ROOT = Path("../").resolve()
PORT = 8080


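def start_server():
    """Start HTTP server in experiment directory"""
    import os

    # Serve files relative to the experiment root
    os.chdir(EXPERIMENT_ROOT)

    Handler = http.server.SimpleHTTPRequestHandler
    # Allow the port to be rebound quickly after a previous server exits
    socketserver.TCPServer.allow_reuse_address = True
    try:
        httpd = socketserver.TCPServer(("", PORT), Handler)
    except OSError as e:
        # Usually "Address already in use": a server from an earlier run still holds the port
        print(f"⚠️  Could not start server on port {PORT}: {e}")
        return
    httpd.serve_forever()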


# Start server in background thread
print(f"Starting HTTP server at {EXPERIMENT_ROOT}")
server_thread = threading.Thread(target=start_server, daemon=True)
server_thread.start()

# Give server time to start
time.sleep(2)

# Open experiment in browser
experiment_url = f"http://localhost:{PORT}/"
print(f"Opening experiment at: {experiment_url}")

try:
    webbrowser.open(experiment_url)
    print("✅ Experiment opened in browser")
    print(f"✅ HTTP server running on localhost:{PORT}")
except Exception as e:
    print(f"❌ Failed to open browser: {e}")
    print(f"Please manually open: {experiment_url}")

print(f"\n💡 Note: Server serves files from {EXPERIMENT_ROOT}")
print(f"💡 To stop server: restart the notebook kernel or run 'Kernel > Restart' in Jupyter")

Exception in thread Thread-4 (start_server):
Traceback (most recent call last):
  File "/Users/doug/.pyenv/versions/3.12.11/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "/Users/doug/Documents/code/wave/experiment-template/.venv/lib/python3.12/site-packages/ipykernel/ipkernel.py", line 772, in run_closure
    _threading_Thread_run(self)
  File "/Users/doug/.pyenv/versions/3.12.11/lib/python3.12/threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "/var/folders/rk/qvbx4gjd6vld_v99zl8b1vpm0000gn/T/ipykernel_18326/43285537.py", line 11, in start_server
  File "/Users/doug/.pyenv/versions/3.12.11/lib/python3.12/socketserver.py", line 457, in __init__
    self.server_bind()
  File "/Users/doug/.pyenv/versions/3.12.11/lib/python3.12/socketserver.py", line 478, in server_bind
    self.socket.bind(self.server_address)
OSError: [Errno 48] Address already in use


Starting HTTP server at /Users/doug/Documents/code/wave/experiment-template
Opening experiment at: http://localhost:8080/
✅ Experiment opened in browser
✅ HTTP server running on localhost:8080

💡 Note: Server serves files from /Users/doug/Documents/code/wave/experiment-template
💡 To stop server: restart the notebook kernel or run 'Kernel > Restart' in Jupyter


### Next Steps

1. Complete the entire experiment
2. Check browser console for any errors
3. At the end, verify data appears via `jsPsych.data.displayData()` or console output
4. Return to this notebook when done testing
5. Server will keep running until you restart this notebook kernel
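
If restarting the kernel just to stop the server is inconvenient, one alternative is to keep a reference to the server object and call `shutdown()` on it. This is a sketch under assumptions, not the template's code: `start_stoppable_server` is a hypothetical helper, and it serves a directory through the handler's `directory` argument instead of changing the working directory.

```python
import functools
import http.server
import threading


def start_stoppable_server(directory, port=0):
    """Serve `directory` in a background thread and return the server object.

    port=0 lets the OS pick a free port; read it from server.server_address.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_stoppable_server(".")
print("Serving on port", server.server_address[1])

# Later: stop the server and release the port without restarting the kernel
server.shutdown()
server.server_close()
```

Binding to `127.0.0.1` keeps the test server reachable only from the local machine, which is usually what you want while the URL carries an API key.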

In [None]:
# Local experiment validation check
print("🔍 Local Experiment Validation")
print("Please complete the following validation steps:")
print("1. Did you successfully complete the entire experiment?")
print("2. Check the browser console log OR run jsPsych.data.displayData() in console")
print("3. Verify that experiment data was GENERATED (not logged to WAVE yet)")
print("If so, input 'y' to continue with the walkthrough")

continue_setup = (
    input("\nDid the experiment run successfully and generate data? (y/n): ").strip().lower()
)

if continue_setup not in ["y", "yes"]:
    print("❌ Please fix any issues with your local experiment before continuing.")
    print("💡 Common issues:")
    print("   - Missing stimulus files")
    print("   - JavaScript errors in console")
    print("   - Experiment not completing properly")
    print("   - No data being generated")
    sys.exit("Stopping setup - fix experiment issues first")

print("✅ Great! Your experiment generates data locally.")
print("🔧 Now we'll set up WAVE backend integration...")

# Server cleanup note
print("\n💡 Note: You can keep the local server running for now.")
print("   We'll use it again later with WAVE integration parameters.")

🔍 Local Experiment Validation
Please complete the following validation steps:
1. Did you successfully complete the entire experiment?
2. Check the browser console log OR run jsPsych.data.displayData() in console
3. Verify that experiment data was GENERATED (not logged to WAVE yet)
If so, input 'y' to continue with the walkthrough
✅ Great! Your experiment generates data locally.
🔧 Now we'll set up WAVE backend integration...

💡 Note: You can keep the local server running for now.
   We'll use it again later with WAVE integration parameters.


## Phase 2: Understanding Data Logging Structure

### Experiment Data Schema

The data that gets LOGGED to the WAVE backend is defined by the `processTrialData(data)` function in `src/js/integrations/wave-client.js` (lines 95-126).

**Key Points:**
- The function processes jsPsych trial data and extracts specific fields for logging
- It is currently configured to log only trials whose `trial_category` contains `'expt'`
- The extracted `waveData` object (lines 106-120) defines exactly what gets sent to WAVE

**Current Schema Fields** (you may need to modify these to match your experiment):
```javascript
const waveData = {
    trial_number: data.trial_index,
    trial_type: data.trial_type, 
    trial_category: data.trial_category,
    stimulus: data.stimulus,
    response: data.response,
    response_time: data.rt / 1000, // Convert to seconds
    accuracy: data.thisAcc === 1,
    correct_response: data.correct_response,
    stimulus_duration: data.trial_duration,
    time_elapsed: data.time_elapsed,
    participant_id: data.participant_id,
    timestamp: data.timestamp,
    user_agent: data.user_agent
};
```

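⚠️ **IMPORTANT**: The experiment schema in the WAVE backend must exactly match the fields defined in `processTrialData`. If there's a mismatch, data logging will fail. The one exception is reserved columns such as `participant_id` (see Schema Validation Rules): these are added automatically and must not be declared in your schema.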
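
Since a mismatch only surfaces when logging fails, it can be worth comparing the two field lists before touching the backend. The sketch below is a plain set comparison: `compare_fields` is a hypothetical helper, the JavaScript field names are copied by hand from the `waveData` object shown above, and the reserved names are those listed under Schema Validation Rules.

```python
RESERVED = {"id", "experiment_uuid", "participant_id", "created_at", "updated_at"}


def compare_fields(js_fields, schema_fields, reserved=RESERVED):
    """Return (missing_from_schema, missing_from_js), ignoring reserved columns."""
    js = set(js_fields) - reserved
    schema = set(schema_fields) - reserved
    return sorted(js - schema), sorted(schema - js)


# Keys of the waveData object shown above
js_fields = [
    "trial_number", "trial_type", "trial_category", "stimulus", "response",
    "response_time", "accuracy", "correct_response", "stimulus_duration",
    "time_elapsed", "participant_id", "timestamp", "user_agent",
]
# Illustrative schema that deliberately omits one field (user_agent)
schema_fields = [f for f in js_fields if f not in ("participant_id", "user_agent")]

missing_from_schema, missing_from_js = compare_fields(js_fields, schema_fields)
print("In waveData but not in schema:", missing_from_schema)
print("In schema but not in waveData:", missing_from_js)
```

Both lists should be empty before you create the experiment type; here the first one flags `user_agent`.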

In [None]:
# Define experiment schema and check for existing types
import os
from dotenv import load_dotenv
from wave_client import WaveClient

# Load environment variables (absolute path, so this works regardless of the current working directory)
load_dotenv(EXPERIMENT_ROOT / "tools" / ".env")
RESEARCHER_API_KEY = os.getenv("RESEARCHER_API_KEY")
WAVE_BACKEND_URL = os.getenv("WAVE_BACKEND_URL")

if not RESEARCHER_API_KEY:
    print("❌ RESEARCHER_API_KEY not found in tools/.env file")
    print("💡 Please create tools/.env file with RESEARCHER_API_KEY before continuing")
    sys.exit("Missing API key")

if not WAVE_BACKEND_URL:
    print("❌ WAVE_BACKEND_URL not found in tools/.env file")
    print("💡 Please create tools/.env file with WAVE_BACKEND_URL before continuing")
    sys.exit("Missing Wave Backend URL")

print(f"🔍 Connecting to WAVE backend...")
print(f"📡 Backend URL: {WAVE_BACKEND_URL}")
print(f"🔑 Using API key ending in: ...{RESEARCHER_API_KEY[-4:]}")


async def get_existing_experiment_types():
    async with WaveClient(api_key=RESEARCHER_API_KEY, base_url=WAVE_BACKEND_URL) as client:
        try:
            # Test connection and get experiment types
            experiment_types = await client.experiment_types.list(skip=0, limit=1000)
            print(f"\n✅ Connected successfully!")
            print(f"📊 Found {len(experiment_types)} existing experiment types:")

            if experiment_types:
                for exp_type in experiment_types:
                    print(
                        f"  - {exp_type['name']} (table: {exp_type['table_name']}) [ID: {exp_type['id']}]"
                    )
            else:
                print("  (No experiment types found)")

            return experiment_types

        except Exception as e:
            print(f"❌ Error connecting to WAVE backend: {e}")
            sys.exit("Failed to connect to WAVE backend - check API key and URL")


# Get existing experiment types
existing_experiment_types = await get_existing_experiment_types()
print("✅ Ready to define experiment schema")

🔍 Connecting to WAVE backend...
📡 Backend URL: https://wave-backend-production-8781.up.railway.app
🔑 Using API key ending in: ...wKYt
[]

✅ Connected successfully!
📊 Found 0 existing experiment types:
  (No experiment types found)
✅ Ready to define experiment schema


In [None]:
# Define experiment schema and check for naming conflicts

print("📋 Experiment Schema Definition")
print("The schema below matches the processTrialData function in wave-client.js")
print("Review it carefully before proceeding!\n")

# Experiment type details - MODIFY THESE AS NEEDED
experiment_type_name = "jspsych_color_circles_demo"
table_name = "jspsych_circles_data"  # lowercase, underscores only

# Schema definition matching processTrialData in wave-client.js
experiment_schema = {
    "trial_number": "INTEGER",
    "trial_type": "STRING",
    "trial_category": "STRING",
    "stimulus": "STRING",
    "response": "STRING",
    "response_time": "FLOAT",  # Converted from milliseconds to seconds in processTrialData
    "accuracy": "BOOLEAN",
    "correct_response": "STRING",
    "stimulus_duration": "INTEGER",
    "time_elapsed": "INTEGER",
    "timestamp": "STRING",
    "user_agent": "STRING",
}

print(f"Proposed Experiment Type: {experiment_type_name}")
print(f"Proposed Table Name: {table_name}")
print("\n📊 Schema Fields:")
for field, field_type in experiment_schema.items():
    print(f"  {field}: {field_type}")

# Check for naming conflicts with existing experiment types
existing_names = [exp_type["name"] for exp_type in existing_experiment_types]
existing_table_names = [exp_type["table_name"] for exp_type in existing_experiment_types]

skip_experiment_type_creation = False
existing_experiment_type_id = None

if experiment_type_name in existing_names:
    print(f"\n⚠️  Experiment type '{experiment_type_name}' already exists!")
    skip_experiment_type_creation = True
    # Find the existing ID for later use
    for exp_type in existing_experiment_types:
        if exp_type["name"] == experiment_type_name:
            existing_experiment_type_id = exp_type["id"]
            break
    print(f"✅ Will use existing experiment type (ID: {existing_experiment_type_id})")

elif table_name in existing_table_names:
    print(f"\n❌ Table name '{table_name}' already exists!")
    print("💡 Please modify the 'table_name' variable above and re-run this cell")
    sys.exit("Table name conflict - choose a different table name")

else:
    print(f"\n✅ Experiment type '{experiment_type_name}' is available")
    print(f"✅ Table name '{table_name}' is available")
    skip_experiment_type_creation = False
    print("📝 Will create new experiment type")

print(f"\n💡 Total fields: {len(experiment_schema)}")
print("Please confirm the new experiment schema")

# Confirm schema is correct
schema_confirmed = (
    input("\nDoes this schema match your processTrialData function in wave-client.js? (y/n): ")
    .strip()
    .lower()
)

if schema_confirmed not in ["y", "yes"]:
    print(
        "❌ Please modify the experiment_schema dictionary above to match your processTrialData function."
    )
    print("💡 Look at src/js/integrations/wave-client.js lines 106-120")
    sys.exit("Schema needs modification - update the schema and re-run this cell")

print("✅ Schema confirmed!")
print(f"🎯 Skip experiment type creation: {skip_experiment_type_creation}")
print("✅ Ready for validation and backend setup")

### Schema Validation Rules

**Reserved Column Names** (cannot be used in your schema):
- `id` - Auto-generated primary key
- `experiment_uuid` - Links data to experiment 
- `participant_id` - Participant identifier (added automatically)
- `created_at` - Timestamp when data was created
- `updated_at` - Timestamp when data was last modified

**Supported Data Types**:
- `INTEGER` - Whole numbers (e.g., trial numbers, counts)
- `FLOAT` - Decimal numbers (e.g., reaction times, scores)
- `STRING` - Text up to 255 characters (e.g., responses, stimulus names)
- `TEXT` - Longer text content (unlimited length)
- `BOOLEAN` - True/false values (e.g., accuracy, conditions)
- `DATETIME` - Date and time stamps
- `JSON` - Complex structured data

**Important Notes**:
- STRING fields are capped at 255 characters - use TEXT for longer content
- Column names are case-sensitive and should match exactly
- Field names cannot conflict with reserved names above
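
These rules are simple enough to check locally before calling the backend. The following is a rough sketch, not the WAVE client's own validation (the `ExperimentTypeCreate` model used in the next cell may enforce additional rules); `precheck_schema` is a hypothetical helper built only from the lists in this section.

```python
RESERVED_NAMES = {"id", "experiment_uuid", "participant_id", "created_at", "updated_at"}
SUPPORTED_TYPES = {"INTEGER", "FLOAT", "STRING", "TEXT", "BOOLEAN", "DATETIME", "JSON"}


def precheck_schema(schema):
    """Return a list of human-readable problems found in a {column: type} dict."""
    problems = []
    for column, column_type in schema.items():
        if column in RESERVED_NAMES:
            problems.append(f"'{column}' is a reserved column name")
        if column_type not in SUPPORTED_TYPES:
            problems.append(f"'{column}' has unsupported type '{column_type}'")
    return problems


# Illustrative schema with one reserved name and one unsupported type
example = {"trial_number": "INTEGER", "id": "INTEGER", "notes": "VARCHAR"}
for problem in precheck_schema(example):
    print("-", problem)
```

An empty result means the schema passes these two checks only; the backend remains the final authority.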

In [None]:
# Validate experiment schema using pydantic model
from wave_client.models.base import ExperimentTypeCreate

print("🔍 Validating Experiment Schema")

try:
    # Create the pydantic model with our schema
    experiment_type_data = ExperimentTypeCreate(
        name=experiment_type_name,
        table_name=table_name,
        schema_definition=experiment_schema,
        description=f"JSPsych experiment data schema for {experiment_type_name}",
    )

    print("✅ Schema validation passed!")
    print(f"✅ Experiment Type: {experiment_type_data.name}")
    print(f"✅ Table Name: {experiment_type_data.table_name}")
    print(f"✅ Fields: {len(experiment_type_data.schema_definition)}")
    print("✅ No conflicts with reserved names")
    print("✅ All data types are supported")

except ValueError as e:
    print(f"❌ Schema validation failed: {e}")
    print("💡 Common issues:")
    print(
        "   - Field names conflict with reserved names (id, experiment_uuid, participant_id, created_at, updated_at)"
    )
    print("   - Unsupported data type used")
    print("   - Invalid schema definition format")
    sys.exit("Fix schema issues and re-run this cell")

except Exception as e:
    print(f"❌ Unexpected error during validation: {e}")
    sys.exit("Check your schema definition and try again")

print("\n🎯 Schema is ready for WAVE backend!")

In [None]:
# Create or use existing experiment type in WAVE backend

if skip_experiment_type_creation:
    exp_type_id = existing_experiment_type_id
    print(f"🔄 Using existing experiment type: {experiment_type_name}")
    print(f"📊 Experiment Type ID: {exp_type_id}")
    print("⏭️  Skipping experiment type creation")

else:
    print("🚀 Creating new experiment type in WAVE backend...")

    async def create_experiment_type():
        async with WaveClient(api_key=RESEARCHER_API_KEY, base_url=WAVE_BACKEND_URL) as client:
            try:
                exp_type = await client.experiment_types.create(
                    name=experiment_type_data.name,
                    table_name=experiment_type_data.table_name,
                    description=experiment_type_data.description,
                    schema_definition=experiment_type_data.schema_definition,
                )
                return exp_type
            except Exception as e:
                print(f"❌ Error creating experiment type: {e}")
                raise

    # Create the experiment type
    try:
        created_experiment_type = await create_experiment_type()
        exp_type_id = created_experiment_type["id"]

        print(f"✅ Created experiment type: {created_experiment_type['name']}")
        print(f"📊 Experiment Type ID: {exp_type_id}")
        print(f"🗄️  Database Table: {created_experiment_type['table_name']}")
        print(f"📝 Description: {created_experiment_type['description']}")
        print(f"📊 Schema Fields: {len(created_experiment_type['schema_definition'])}")
        print(f"🕒 Created: {created_experiment_type['created_at']}")

    except Exception as e:
        print(f"❌ Failed to create experiment type: {e}")
        print("💡 Common issues:")
        print("   - API key lacks RESEARCHER permissions")
        print("   - Network connectivity problems")
        print("   - Conflicting experiment type name or table name")
        print("   - Invalid schema definition")
        sys.exit("Fix the issue and re-run this cell")

print(f"\n🎯 Experiment Type ID for next steps: {exp_type_id}")
print("✅ Ready to create test experiment!")

In [None]:
# Generate unique test experiment name and static experimentee ID
from datetime import datetime
import uuid

# Generate unique test experiment name with timestamp
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
test_experiment_name = f"TEST_{experiment_type_name}_{timestamp}"

# Generate a random experimentee ID that stays fixed for the rest of this test run
test_experimentee_id = f"test_participant_{uuid.uuid4().hex[:8]}"

print("🆔 Test Identifiers Generated")
print(f"📝 Test Experiment Name: {test_experiment_name}")
print(f"👤 Test Experimentee ID: {test_experimentee_id}")

print(f"\n💡 The test experiment will be created with:")
print(f"   - Name: {test_experiment_name}")
print(f"   - Type ID: {exp_type_id}")
print(f"   - Participant: {test_experimentee_id}")

print(f"\n✅ Identifiers ready for test experiment creation")

In [None]:
# Define experiment tags and check for existence

print("🏷️  Experiment Tags Setup")
print("Tags help categorize and organize your experiments in the WAVE system")

# Define tags for the test experiment - MODIFY THESE AS NEEDED
experiment_tags = [
    {"name": "test", "description": "Test experiments for validation purposes"},
]

print(f"\n📋 Proposed experiment tags:")
for tag in experiment_tags:
    print(f"  - {tag['name']}: {tag['description']}")

# Get existing tags from WAVE backend
print(f"\n🔍 Checking existing tags in WAVE backend...")


async def get_existing_tags():
    async with WaveClient(api_key=RESEARCHER_API_KEY, base_url=WAVE_BACKEND_URL) as client:
        try:
            existing_tags = await client.tags.list(skip=0, limit=1000)
            return existing_tags
        except Exception as e:
            print(f"❌ Error fetching existing tags: {e}")
            raise


existing_tags = await get_existing_tags()
existing_tag_names = [tag["name"] for tag in existing_tags]

print(f"✅ Found {len(existing_tags)} existing tags:")
if existing_tags:
    for tag in existing_tags:
        print(f"  - {tag['name']}: {tag['description']} [ID: {tag['id']}]")
else:
    print("  (No existing tags found)")

# Check which tags need to be created
tags_to_create = []
existing_tag_names_set = set(existing_tag_names)

for tag in experiment_tags:
    if tag["name"] not in existing_tag_names_set:
        tags_to_create.append(tag)

if tags_to_create:
    print(f"\n📝 Need to create {len(tags_to_create)} new tags:")
    for tag in tags_to_create:
        print(f"  - {tag['name']}")
else:
    print(f"\n✅ All required tags already exist!")

# Confirm tags are correct
tags_confirmed = input("\nAre these tags appropriate for your experiment? (y/n): ").strip().lower()

if tags_confirmed not in ["y", "yes"]:
    print("❌ Please modify the experiment_tags list above and re-run this cell")
    sys.exit("Tags need modification - update the tags list and re-run this cell")

print("✅ Tags confirmed!")
tag_names_for_experiment = [tag["name"] for tag in experiment_tags]
print(f"🎯 Tags for experiment: {', '.join(tag_names_for_experiment)}")

In [None]:
# Create missing tags in WAVE backend

if tags_to_create:
    print(f"🚀 Creating {len(tags_to_create)} missing tags...")

    async def create_missing_tags():
        async with WaveClient(api_key=RESEARCHER_API_KEY, base_url=WAVE_BACKEND_URL) as client:
            created_tags = []
            for tag in tags_to_create:
                try:
                    created_tag = await client.tags.create(
                        name=tag["name"], description=tag["description"]
                    )
                    created_tags.append(created_tag)
                    print(f"  ✅ Created tag: {created_tag['name']} [ID: {created_tag['id']}]")
                except Exception as e:
                    print(f"  ❌ Failed to create tag '{tag['name']}': {e}")
                    raise
            return created_tags

    try:
        new_tags = await create_missing_tags()
        print(f"\n✅ Successfully created {len(new_tags)} new tags!")

    except Exception as e:
        print(f"❌ Failed to create tags: {e}")
        print("💡 Common issues:")
        print("   - Tag name already exists (case-sensitive)")
        print("   - API key lacks RESEARCHER permissions")
        print("   - Network connectivity problems")
        print("   - Tag name exceeds 100 character limit")
        sys.exit("Fix the issue and re-run this cell")

else:
    print("✅ No new tags needed - all tags already exist!")

print(f"\n🎯 Ready to create experiment with tags: {', '.join(tag_names_for_experiment)}")
print("✅ All tags are available in WAVE backend!")

In [None]:
# Create test experiment in WAVE backend

print("🚀 Creating test experiment in WAVE backend...")


async def create_experiment(description, experiment_type_id, tags, additional_data=None):
    """Create an experiment with the given parameters"""
    async with WaveClient(api_key=RESEARCHER_API_KEY, base_url=WAVE_BACKEND_URL) as client:
        try:
            if additional_data is None:
                additional_data = {"created_by": "python_notebook"}

            experiment = await client.experiments.create(
                experiment_type_id=experiment_type_id,
                description=description,
                tags=tags,
                additional_data=additional_data,
            )
            return experiment

        except Exception as e:
            print(f"❌ Error creating experiment: {e}")
            raise


# Create the test experiment
test_description = (
    f"Test experiment for {experiment_type_name} - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
)
test_additional_data = {"created_by": "python_notebook"}

try:
    test_experiment = await create_experiment(
        description=test_description,
        experiment_type_id=exp_type_id,
        tags=tag_names_for_experiment,
        additional_data=test_additional_data,
    )
    experiment_id = str(test_experiment["uuid"])

    print(f"✅ Created test experiment successfully!")
    print(f"📝 Experiment Name: {test_experiment['description']}")
    print(f"🆔 Experiment ID: {experiment_id}")
    print(f"🏷️  Tags: {', '.join(test_experiment['tags'])}")
    print(f"📊 Experiment Type ID: {test_experiment['experiment_type_id']}")
    print(f"🕒 Created: {test_experiment['created_at']}")

    print(f"\n💾 Additional Data:")
    for key, value in test_experiment["additional_data"].items():
        print(f"   {key}: {value}")

except Exception as e:
    print(f"❌ Failed to create test experiment: {e}")
    print("💡 Common issues:")
    print("   - Experiment type ID doesn't exist")
    print("   - One or more tags don't exist")
    print("   - API key lacks RESEARCHER permissions")
    print("   - Network connectivity problems")
    print("   - Invalid experiment data format")
    sys.exit("Fix the issue and re-run this cell")

print(f"\n🎯 Test Experiment ID for next steps: {experiment_id}")
print(f"👤 Test Participant ID: {test_experimentee_id}")
print("✅ Ready to launch experiment with WAVE integration!")

In [None]:
# Launch experiment with WAVE integration parameters

from urllib.parse import quote

# Get the EXPERIMENTEE API key for the experiment URL
EXPERIMENTEE_API_KEY = os.getenv("EXPERIMENTEE_API_KEY")

if not EXPERIMENTEE_API_KEY:
    print("❌ EXPERIMENTEE_API_KEY not found in tools/.env file")
    print("💡 Please add EXPERIMENTEE_API_KEY to your tools/.env file")
    print("⚠️  IMPORTANT: Use an EXPERIMENTEE-level key, NOT the RESEARCHER key!")
    sys.exit("Missing EXPERIMENTEE API key")

print("🚀 Launching Experiment with WAVE Integration")
print("Now we'll open your experiment with the actual WAVE backend parameters")

# Create full URL with all required parameters
base_url = f"http://localhost:{PORT}/"
full_url = (
    f"{base_url}"
    f"?key={quote(EXPERIMENTEE_API_KEY)}"
    f"&experiment_id={quote(experiment_id)}"
    f"&participant_id={quote(test_experimentee_id)}"
)

# Create censored URL for display (hide the API key, matching its URL-encoded form as it appears in full_url)
censored_url = full_url.replace(quote(EXPERIMENTEE_API_KEY), "[EXPERIMENTEE_API_KEY_HIDDEN]")

print(f"\n🔗 Experiment URL (with hidden API key):")
print(f"   {censored_url}")
print(f"\n🔑 API Key Security:")
print(f"   Using EXPERIMENTEE key ending in: ...{EXPERIMENTEE_API_KEY[-4:]}")
print(f"   ⚠️  This key should have LIMITED permissions for data logging only")

# Open experiment in browser
print(f"\n🌐 Opening experiment in browser...")
try:
    webbrowser.open(full_url)
    print("✅ Browser opened with WAVE-integrated experiment")
    print("🔄 Server running in background - experiment should log data to WAVE now!")

except Exception as e:
    print(f"❌ Could not open browser automatically: {e}")
    print(f"\n📋 Manual steps:")
    print(f"   1. Open your browser")
    print(f"   2. Navigate to: {censored_url}")
    print(f"   3. Replace [EXPERIMENTEE_API_KEY_HIDDEN] with your actual EXPERIMENTEE API key")

print(f"\n💡 What to do next:")
print(f"   1. Complete the experiment in the browser")
print(f"   2. Verify data logging works (check browser console for any errors)")
print(f"   3. Return to this notebook to check if data was logged to WAVE")
print(f"   4. The experiment will now log data to experiment ID: {experiment_id}")

In [None]:
# Data validation and cleanup

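# Only proceed if we have data. data_df is created by the
# "Confirm experiment completion and retrieve data" cell, so run that cell first.
if "data_df" not in globals() or len(data_df) == 0:
    print("❌ No data found to validate. Complete the experiment and run the data retrieval cell first.")
    sys.exit("Cannot proceed without experiment data")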

print("🔍 Data Validation")
print("Please review the experiment data above and confirm it matches your expectations.")
print("\n💡 Check that:")
print("   - All expected columns are present")
print("   - Data types look correct (numbers, strings, booleans)")
print("   - Values are within expected ranges")
print("   - No missing or null values where they shouldn't be")
print("   - Participant ID matches what you used")

# Ask user to validate the logged data
data_validation = input("\nIs the logged data as expected? (y/n): ").strip().lower()

print(f"\n🧹 Cleaning up test data...")
print("⚠️  This will delete the test experiment and all associated data.")


# Clean up test data regardless of validation result
async def cleanup_test_data():
    async with WaveClient(api_key=RESEARCHER_API_KEY, base_url=WAVE_BACKEND_URL) as client:
        try:
            # Delete all experiment data first
            if len(data_df) > 0:
                print(f"🗑️  Deleting {len(data_df)} test data points...")
                deleted_count = 0

                for _, row in data_df.iterrows():
                    try:
                        await client.experiment_data.delete_row(experiment_id, int(row["id"]))
                        deleted_count += 1
                    except Exception as e:
                        print(f"  ⚠️  Failed to delete row {row['id']}: {e}")

                print(f"✅ Deleted {deleted_count} data rows")
            else:
                print("ℹ️  No experiment data found to delete")

            # Delete the test experiment
            try:
                delete_response = await client.experiments.delete(experiment_id)
                print(f"✅ Deleted test experiment: {experiment_id}")
            except Exception as e:
                print(f"⚠️  Failed to delete test experiment: {e}")

            # Delete experiment type ONLY if we created it (not if we skipped creation)
            if not skip_experiment_type_creation:
                try:
                    delete_type_response = await client.experiment_types.delete(exp_type_id)
                    print(f"✅ Deleted experiment type: {exp_type_id}")
                except Exception as e:
                    print(f"⚠️  Failed to delete experiment type: {e}")
            else:
                print(f"ℹ️  Kept existing experiment type: {exp_type_id}")

            print(f"\n✅ Cleanup complete!")

        except Exception as e:
            print(f"❌ Error during cleanup: {e}")
            print("💡 You may need to manually delete the test data from WAVE backend")
            raise


# Perform cleanup and remember whether it raised, so the summary below stays accurate
cleanup_succeeded = True
try:
    await cleanup_test_data()
except Exception as e:
    cleanup_succeeded = False
    print(f"❌ Cleanup failed: {e}")

# Handle validation result
if data_validation not in ["y", "yes"]:
    print("\n❌ Data validation failed!")
    print("💡 Common issues to check:")
    print("   - Schema mismatch between processTrialData and experiment type")
    print("   - Missing or incorrect field names")
    print("   - Data type mismatches (string vs number, etc.)")
    print("   - JavaScript errors preventing proper data extraction")
    print("   - WAVE client configuration issues")
    print("\n🔧 Next steps:")
    print("   1. Fix the identified issues in your experiment code")
    print("   2. Update the schema definition if needed")
    print("   3. Re-run this notebook from the schema definition step")
    sys.exit("Fix data issues before proceeding to production setup")

print("\n🎉 Excellent! Your WAVE integration is working correctly!")
if cleanup_succeeded:
    print("✅ Test data has been cleaned up (check above for any per-item warnings)")
else:
    print("⚠️  Test data cleanup failed; delete it manually from the WAVE backend")
print("✅ Schema validation passed")
print("✅ Data logging is functioning properly")
print("\n🚀 Ready to create your production experiment!")

In [None]:
# Confirm experiment completion and retrieve data

# Confirmation input
completion_confirmed = (
    input("\nDid you complete the experiment with WAVE integration? (y/n): ").strip().lower()
)

if completion_confirmed not in ["y", "yes"]:
    print("❌ Please complete the experiment in your browser before continuing.")
    print("💡 Make sure to:")
    print("   - Complete all experiment trials")
    print("   - Check browser console for any JavaScript errors")
    print("   - Verify no WAVE client error messages appeared")
    sys.exit("Complete the experiment first, then re-run this cell")

print("🔍 Retrieving experiment data from WAVE backend...")
print(f"📊 Looking for data in experiment ID: {experiment_id}")


async def get_experiment_data():
    async with WaveClient(api_key=RESEARCHER_API_KEY, base_url=WAVE_BACKEND_URL) as client:
        try:
            # Get all data for this experiment
            data_df = await client.experiment_data.get_all_data(experiment_id=experiment_id)
            return data_df

        except Exception as e:
            print(f"❌ Error retrieving experiment data: {e}")
            raise


try:
    data_df = await get_experiment_data()

    if len(data_df) > 0:
        print(f"🎉 Success! Found {len(data_df)} data points!")
        print(f"📈 Participants: {data_df['participant_id'].nunique()}")

        # Display the data
        print(f"\n📋 Experiment Data:")

        with pd.option_context("display.max_rows", None, "display.max_columns", None):
            display(data_df)

    else:
        print("⏳ No data found yet.")
        print("💡 Possible issues:")
        print("   - Experiment didn't complete successfully")
        print("   - JavaScript errors prevented data logging")
        print("   - WAVE client configuration problems")
        print("   - Network connectivity issues during data logging")
        print("\n🔄 You can re-run this cell after completing the experiment again")

except Exception as e:
    print(f"❌ Failed to retrieve data: {e}")
    print("💡 Common issues:")
    print("   - Experiment ID doesn't exist")
    print("   - API key lacks RESEARCHER permissions")
    print("   - Network connectivity problems")
    sys.exit("Fix the issue and re-run this cell")

if len(data_df) > 0:
    print("\n✅ Data logging is working correctly!")
    print("🎯 Ready for final validation step")
else:
    print("\n❌ No data was logged - troubleshoot the experiment before continuing")

In [None]:
# Create production experiment

print("🎯 Production Experiment Creation")
print("Now let's create your real experiment for actual data collection!")

# Get experiment name from user
print(f"\n📝 Current experiment type: {experiment_type_name}")
print("You can use the same type for multiple experiments with different descriptions.")

experiment_name = input("\nEnter a name/description for your production experiment: ").strip()

if not experiment_name:
    print("❌ Experiment name cannot be empty")
    sys.exit("Please provide an experiment name")

print(f"\n🏷️  Available tags: {', '.join(tag_names_for_experiment)}")
use_same_tags = input("Use the same tags for production experiment? (y/n): ").strip().lower()

if use_same_tags in ['y', 'yes']:
    production_tags = tag_names_for_experiment
else:
    print("💡 Enter tag names separated by commas (or press Enter for no tags)")
    tag_input = input("Production experiment tags: ").strip()
    # Drop empty entries so blank input, "a,,b", or a trailing comma never yields an empty tag
    production_tags = [tag.strip() for tag in tag_input.split(",") if tag.strip()]

print(f"\n🚀 Creating production experiment...")

# Metadata stored with the production experiment (passed to the consolidated create_experiment function)
production_additional_data = {
    "created_by": "python_notebook",
    "production_experiment": True,
    "notebook_version": "1.0",
    "experiment_type_name": experiment_type_name,
    "created_for_production": True,
}

try:
    production_experiment = await create_experiment(
        description=experiment_name,
        experiment_type_id=exp_type_id,
        tags=production_tags,
        additional_data=production_additional_data
    )
    production_experiment_id = str(production_experiment["uuid"])
    
    print(f"🎉 Production experiment created successfully!")
    print(f"📝 Name: {production_experiment['description']}")
    print(f"🆔 Experiment ID: {production_experiment_id}")
    print(f"🏷️  Tags: {', '.join(production_experiment['tags'])}")
    print(f"📊 Experiment Type: {experiment_type_name} (ID: {production_experiment['experiment_type_id']})")
    print(f"🕒 Created: {production_experiment['created_at']}")

except Exception as e:
    print(f"❌ Failed to create production experiment: {e}")
    sys.exit("Fix the issue and re-run this cell")

print(f"\n🎯 Your production experiment is ready!")
print(f"🔗 Experiment ID for participants: {production_experiment_id}")

## 🎊 WAVE Experiment Production Setup Complete!

**Congratulations!** Your production experiment is ready for participant data collection!

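### 🔗 Your Production Experiment

The production experiment ID is printed by the cell above and stored in `production_experiment_id`.
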
### ⚠️ Security Requirements
- **Only distribute EXPERIMENTEE-level API keys** to participants (data logging only)
- **Never share your RESEARCHER API key** (full admin access)
- **Set appropriate expiration dates** on participant API keys
- **Monitor via WAVE dashboard** for unauthorized access
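
One way to make the first two rules harder to break is a small guard that runs before any key is handed out. This is a minimal sketch: `key_for_participants` is a hypothetical helper, and `EXPERIMENTEE_API_KEY` is an assumed variable name that may differ from what your `tools/.env` uses.

```python
def key_for_participants(env):
    """Return the EXPERIMENTEE key, refusing to hand out the RESEARCHER key.

    `env` is any mapping of variable names to values, such as os.environ.
    EXPERIMENTEE_API_KEY is an assumed variable name.
    """
    experimentee_key = env.get("EXPERIMENTEE_API_KEY", "")
    researcher_key = env.get("RESEARCHER_API_KEY", "")
    if not experimentee_key:
        raise ValueError("EXPERIMENTEE_API_KEY is not set")
    if experimentee_key == researcher_key:
        raise ValueError("Refusing to distribute the RESEARCHER key")
    return experimentee_key


# Placeholder values for illustration only; real keys belong in tools/.env.
example_env = {
    "RESEARCHER_API_KEY": "researcher-placeholder",
    "EXPERIMENTEE_API_KEY": "experimentee-placeholder",
}

participant_key = key_for_participants(example_env)
print(participant_key)
```

The guard only compares the two configured values; it cannot tell what permission level the backend actually assigned to a key, so it complements the checks above and does not replace them.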

### 🚀 Deployment Process
1. **Test thoroughly** with the production ID above
2. **Create feature branch** for your experiment code
3. **Commit and push** your changes
4. **Open PR** from feature branch to main
5. **After approval**, merge to main branch
6. **Open PR** from main → release and merge to deploy to Vercel
7. **Distribute experiment URL** with EXPERIMENTEE-level API keys to participants

### 📊 Data Collection & Monitoring
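- Use the **WAVE client** with your RESEARCHER-level key to monitor incoming participant data, for example by calling `client.experiment_data.get_all_data(experiment_id=production_experiment_id)`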
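
As a rough illustration of what monitoring can look like, the sketch below tallies logged rows per participant. It assumes the data has a `participant_id` column, as the retrieval cell earlier in this notebook does, and that rows arrive as a list of dicts, for example from `data_df.to_dict("records")`. `rows_per_participant` is a hypothetical helper, not a WAVE client function.

```python
from collections import Counter


def rows_per_participant(rows):
    """Count logged rows for each participant_id.

    `rows` is a list of dicts, e.g. data_df.to_dict("records") on the
    DataFrame returned by client.experiment_data.get_all_data(...).
    """
    return Counter(row["participant_id"] for row in rows)


# Made-up rows for illustration only.
example_rows = [
    {"id": 1, "participant_id": "p1"},
    {"id": 2, "participant_id": "p1"},
    {"id": 3, "participant_id": "p2"},
]

counts = rows_per_participant(example_rows)
for participant, n in sorted(counts.items()):
    print(f"{participant}: {n} row(s)")
```

A participant with far fewer rows than the number of trials in your experiment is a hint that a session was abandoned or that logging failed partway through.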

### 🔧 Troubleshooting
- **API Key Issues**: Verify EXPERIMENTEE vs RESEARCHER permissions
- **Connection Problems**: Ensure `WAVE_BACKEND_URL` matches the backend URL in your `params.js` file
- **Data Issues**: Re-run relevant notebook cells to regenerate components
- **Schema Mismatches**: Verify that the output of your `processTrialData` function matches the experiment type schema
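
A schema mismatch usually shows up as a missing field, an unexpected field, or a value of the wrong type. The sketch below checks a single row for all three. The `expected_schema` mapping (field name to Python type) and `find_schema_mismatches` are hypothetical illustrations; they do not reflect how the WAVE backend stores or validates schemas.

```python
def find_schema_mismatches(row, expected_schema):
    """Return human-readable problems found in one data row.

    `expected_schema` maps field name -> Python type (hypothetical format).
    """
    problems = []
    # Fields the schema expects: must be present and of the right type.
    for field, expected_type in expected_schema.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], expected_type):
            actual = type(row[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    # Fields the row carries that the schema does not know about.
    for field in row:
        if field not in expected_schema:
            problems.append(f"unexpected field: {field}")
    return problems


# Made-up schema and row for illustration only.
expected_schema = {"reaction_time": float, "correct": bool, "stimulus": str}
row = {"reaction_time": "512", "correct": True, "trial": 3}

problems = find_schema_mismatches(row, expected_schema)
for problem in problems:
    print(problem)
```

Here the string `"512"` stands in for a common case: a number that was serialized as text on the JavaScript side before being logged.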

## 📚 Support Resources
- Experiment template documentation in `/docs`
- WAVE client documentation
- Research team for technical support

**Once deployed, your experiment is live and ready for participants!** 🧪✨