<center>
    <p style="text-align:center">
        <img alt="phoenix logo" src="https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg" width="200"/>
        <br>
        <a href="https://arize.com/docs/phoenix/">Docs</a>
        |
        <a href="https://github.com/Arize-ai/phoenix">GitHub</a>
        |
        <a href="https://arize-ai.slack.com/join/shared_invite/zt-2w57bhem8-hq24MB6u7yE_ZF_ilOYSBw#/shared-invite/email">Community</a>
    </p>
</center>

# <center>Run Experiments with Splits</center>

This guide shows you how to use splits in Phoenix to isolate, evaluate, and compare subsets of your dataset. We'll go through the following steps:

* Assign examples to named splits (e.g., train, validation, hard_examples) in the Phoenix UI  

* Fetch a dataset limited to a specific split via the `get_dataset(..., splits=["…"])` API  

* Run an experiment on that filtered dataset using `run_experiment(...)`  

* Inspect the results in Phoenix to evaluate performance on just that split  

* Compare across splits (or full-dataset runs) to identify targeted improvements 

In [None]:
%pip install pandas openai arize-phoenix

In [None]:
import json
import os
from datetime import datetime, timezone
from getpass import getpass
from typing import Any

from openai import AsyncOpenAI

from phoenix.client import AsyncClient

if not (openai_api_key := os.getenv("OPENAI_API_KEY")):
    openai_api_key = getpass("🔑 Enter your OpenAI API key: ")

os.environ["OPENAI_API_KEY"] = openai_api_key

openai_client = AsyncOpenAI()

phoenix_client = AsyncClient()

now = datetime.now(timezone.utc).isoformat()

In [None]:
examples: list[dict[str, Any]] = [
    {
        "input": {
            "question": "File uploads over 50MB fail with an unknown error. Smaller files are fine."
        },
        "output": {
            "summary": "Uploads >50MB fail; small files succeed.",
            "category": "File Upload",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "email"},
    },
    {
        "input": {
            "question": "Payment gateway keeps timing out and customers can't checkout since yesterday."
        },
        "output": {
            "summary": "Payment timeouts blocking checkout.",
            "category": "Payment",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {
            "question": "Password reset link just reloads the same page; doesn't send a new email."
        },
        "output": {
            "summary": "Reset link reloads; no email sent.",
            "category": "Authentication",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "chat"},
    },
    {
        "input": {
            "question": "Analytics dashboard takes more than 2 minutes to load during peak hours."
        },
        "output": {
            "summary": "Analytics dashboard slow at peak hours.",
            "category": "Performance",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "email"},
    },
    {
        "input": {
            "question": "After updating the app, camera permission is granted but access is denied."
        },
        "output": {
            "summary": "Camera denied post-update despite permissions.",
            "category": "Permissions",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "mobile"},
    },
    {
        "input": {"question": "2FA codes arrive late (5–10 minutes), making login impossible."},
        "output": {
            "summary": "Delayed 2FA codes prevent timely login.",
            "category": "Authentication",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "sms"},
    },
    {
        "input": {
            "question": "Invoices for September show duplicate charges for the same subscription."
        },
        "output": {
            "summary": "Duplicate charges on September invoices.",
            "category": "Billing",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "billing"},
    },
    {
        "input": {"question": "Search returns irrelevant results when filtering by tag:beta."},
        "output": {
            "summary": "Search filter 'tag:beta' returns irrelevant items.",
            "category": "Search",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Our webhook endpoint receives duplicate events for a single order."},
        "output": {
            "summary": "Duplicate webhook deliveries per order.",
            "category": "Webhooks",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "api"},
    },
    {
        "input": {
            "question": "The /v2/orders API intermittently returns 500 errors with no message."
        },
        "output": {
            "summary": "Intermittent 500s on /v2/orders with empty body.",
            "category": "API",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "api"},
    },
    {
        "input": {"question": "CSV import fails if a row contains emojis. No error is shown."},
        "output": {
            "summary": "CSV import fails on emoji characters; silent error.",
            "category": "Import/Export",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {
            "question": "Slack integration stopped posting deployment notifications to #releases."
        },
        "output": {
            "summary": "Slack integration not posting to #releases.",
            "category": "Integration",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "slack"},
    },
    {
        "input": {"question": "Our EU users see dates in US format despite locale set to de-DE."},
        "output": {
            "summary": "Locale ignored; EU users get US date formats.",
            "category": "Localization",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "de", "channel": "web"},
    },
    {
        "input": {"question": "Push notifications are not delivered on Android 14 devices."},
        "output": {
            "summary": "Android 14 devices not receiving push notifications.",
            "category": "Notifications",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "mobile"},
    },
    {
        "input": {"question": "Dark mode makes some text unreadable in the settings page."},
        "output": {
            "summary": "Unreadable text in dark mode on settings.",
            "category": "UI/UX",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Single sign-on with Okta succeeds but user roles are missing."},
        "output": {
            "summary": "Okta SSO logs in but roles not applied.",
            "category": "SSO",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "sso"},
    },
    {
        "input": {"question": "Our daily data export file is empty for the last two days."},
        "output": {
            "summary": "Daily exports generated but empty content.",
            "category": "Import/Export",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "email"},
    },
    {
        "input": {"question": "Report scheduling at 9am UTC triggers at random times."},
        "output": {
            "summary": "Scheduled reports firing at incorrect times.",
            "category": "Reporting",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {
            "question": "Credit card updates fail with 'Invalid postal code' for Canadian users."
        },
        "output": {
            "summary": "Postal code validation blocks CA card updates.",
            "category": "Billing",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en-CA", "channel": "billing"},
    },
    {
        "input": {
            "question": "The iOS app crashes when opening the 'Team' tab with more than 100 members."
        },
        "output": {
            "summary": "iOS crash on Team tab for large orgs.",
            "category": "Mobile",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "mobile"},
    },
    {
        "input": {"question": "Exported PDFs render charts without labels."},
        "output": {
            "summary": "PDF exports missing chart labels.",
            "category": "Export",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "We see frequent 429 errors after migrating to the new SDK."},
        "output": {
            "summary": "Rate limiting (429) after SDK migration.",
            "category": "API",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "api"},
    },
    {
        "input": {"question": "New users don't receive the onboarding email sequence."},
        "output": {
            "summary": "Onboarding emails not sent to new users.",
            "category": "Email",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "email"},
    },
    {
        "input": {"question": "S3 backups failed over the weekend; no snapshots available."},
        "output": {
            "summary": "Backups failed; missing weekend snapshots.",
            "category": "Backup/Recovery",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "ops"},
    },
    {
        "input": {"question": "Customers in India can't complete UPI payments."},
        "output": {
            "summary": "UPI payments failing for IN customers.",
            "category": "Payment",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en-IN", "channel": "web"},
    },
    {
        "input": {"question": "The 'Download as CSV' button does nothing in Safari."},
        "output": {
            "summary": "CSV download non-functional in Safari.",
            "category": "Browser Compatibility",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Our Stripe payouts are delayed by three days compared to usual."},
        "output": {
            "summary": "Stripe payouts delayed by three days.",
            "category": "Billing",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "billing"},
    },
    {
        "input": {"question": "Search indexing lags; new items not searchable for hours."},
        "output": {
            "summary": "Indexing delay makes new items unsearchable.",
            "category": "Search",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Email verification links expire immediately when clicked."},
        "output": {
            "summary": "Verification links expire instantly on click.",
            "category": "Authentication",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "email"},
    },
    {
        "input": {
            "question": "Our Jira integration creates duplicate tickets for the same incident."
        },
        "output": {
            "summary": "Jira integration creating duplicate issues.",
            "category": "Integration",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "jira"},
    },
    {
        "input": {"question": "The dashboard times out behind our corporate VPN only."},
        "output": {
            "summary": "Dashboard timeouts when accessed via VPN.",
            "category": "Networking",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "it"},
    },
    {
        "input": {
            "question": "Users can't drag-and-drop files into the uploader on Windows touch devices."
        },
        "output": {
            "summary": "Drag-and-drop fails on Windows touch devices.",
            "category": "UI/UX",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "We get 'signature mismatch' errors on webhook validation randomly."},
        "output": {
            "summary": "Intermittent webhook signature mismatches.",
            "category": "Security",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "api"},
    },
    {
        "input": {"question": "Charts render blank when ad blockers are enabled."},
        "output": {
            "summary": "Charts blocked by ad blockers rendering blank.",
            "category": "Browser Compatibility",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Webhook retry policy stops after two attempts instead of five."},
        "output": {
            "summary": "Webhook retries capped at 2 instead of 5.",
            "category": "Webhooks",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "api"},
    },
    {
        "input": {"question": "SSO logout doesn't end the session; users remain logged in."},
        "output": {
            "summary": "SSO logout fails to destroy session.",
            "category": "SSO",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "sso"},
    },
    {
        "input": {"question": "Map tiles fail to load in China region."},
        "output": {
            "summary": "Map tiles blocked/failing in CN region.",
            "category": "Maps/Geolocation",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "zh", "channel": "web"},
    },
    {
        "input": {"question": "Real-time sync shows conflicts even when editing different fields."},
        "output": {
            "summary": "False-positive sync conflicts across fields.",
            "category": "Sync",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "desktop"},
    },
    {
        "input": {"question": "Video uploads succeed but transcoding never finishes."},
        "output": {
            "summary": "Transcoding stuck after successful upload.",
            "category": "Media/Encoding",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "OCR misses characters in scanned PDFs with light backgrounds."},
        "output": {
            "summary": "OCR accuracy poor on light-background scans.",
            "category": "OCR",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "ocr"},
    },
    {
        "input": {"question": "Calendar sync duplicates events when time zones change."},
        "output": {
            "summary": "Calendar duplicates around timezone changes.",
            "category": "Calendar",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "calendar"},
    },
    {
        "input": {"question": "The 'Forgot workspace' flow leaves users stuck without options."},
        "output": {
            "summary": "Forgot-workspace flow dead-ends users.",
            "category": "Onboarding",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Abandoned carts emails are sent even after successful purchase."},
        "output": {
            "summary": "Abandoned-cart emails sent post-purchase.",
            "category": "Email",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "email"},
    },
    {
        "input": {"question": "Some API responses cache for too long; stale data in clients."},
        "output": {
            "summary": "Overaggressive caching causes stale API data.",
            "category": "Caching/CDN",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "api"},
    },
    {
        "input": {"question": "Our PayPal payments show 'pending' forever and never capture."},
        "output": {
            "summary": "PayPal stays pending; capture not occurring.",
            "category": "Payment",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "billing"},
    },
    {
        "input": {"question": "Keyboard navigation skips form fields on accessibility mode."},
        "output": {
            "summary": "Keyboard nav skips fields in a11y mode.",
            "category": "Accessibility",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Refund API rejects requests if reason string includes emojis."},
        "output": {
            "summary": "Refund API rejects emoji in reason field.",
            "category": "API",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "api"},
    },
    {
        "input": {"question": "Feature flags don't propagate to edge locations for hours."},
        "output": {
            "summary": "Feature flag propagation delayed to edge.",
            "category": "Feature Flags",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "ops"},
    },
    {
        "input": {"question": "The 'Apply coupon' button accepts expired codes without warning."},
        "output": {
            "summary": "Expired coupons accepted; no warning.",
            "category": "Checkout",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Inventory counts go negative after simultaneous purchases."},
        "output": {
            "summary": "Race condition causes negative inventory.",
            "category": "Inventory",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "ops"},
    },
    {
        "input": {"question": "Shipping calculator overcharges for multi-item orders to Alaska."},
        "output": {
            "summary": "Shipping overcharge for AK multi-item orders.",
            "category": "Shipping",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "checkout"},
    },
    {
        "input": {"question": "Salesforce sync drops leads created from chat widget."},
        "output": {
            "summary": "Leads from chat not syncing to Salesforce.",
            "category": "Integration",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "crm"},
    },
    {
        "input": {
            "question": "GitHub SSO works, but org membership doesn't grant repo access in app."
        },
        "output": {
            "summary": "GitHub SSO lacks org-based permissions.",
            "category": "SSO",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "sso"},
    },
    {
        "input": {"question": "Zapier trigger fires twice for a single form submission."},
        "output": {
            "summary": "Zapier triggers firing twice per submission.",
            "category": "Integration",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "zapier"},
    },
    {
        "input": {"question": "Apple Pay button doesn't appear in Safari on iPhone 15."},
        "output": {
            "summary": "Apple Pay not visible on iPhone Safari.",
            "category": "Payment",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "mobile"},
    },
    {
        "input": {"question": "Google Pay completes but order shows as unpaid in admin."},
        "output": {
            "summary": "Google Pay success but admin marks unpaid.",
            "category": "Payment",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "checkout"},
    },
    {
        "input": {"question": "AB test allocations drift; groups not at 50/50 after weeks."},
        "output": {
            "summary": "A/B allocations skewed; not 50/50.",
            "category": "Experimentation",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
    {
        "input": {"question": "Fraud detection flags legitimate repeat customers too often."},
        "output": {
            "summary": "High false positives in fraud detection.",
            "category": "Fraud/Compliance",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "risk"},
    },
    {
        "input": {"question": "Logs in the admin console stop updating after midnight UTC."},
        "output": {
            "summary": "Admin logs stop updating after 00:00 UTC.",
            "category": "Logging/Monitoring",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "ops"},
    },
    {
        "input": {"question": "Deployment rollbacks fail with 'missing artifact' error."},
        "output": {
            "summary": "Rollback fails due to missing artifact.",
            "category": "Deployment",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "ops"},
    },
    {
        "input": {"question": "Subscription cancellations don't prorate refunds correctly."},
        "output": {
            "summary": "Proration incorrect on subscription cancel.",
            "category": "Subscription",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "billing"},
    },
    {
        "input": {"question": "Voice calls drop after 30 seconds when using cellular data."},
        "output": {
            "summary": "VoIP calls drop at 30s on cellular.",
            "category": "Voice/Calls",
            "urgency": "Medium",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "mobile"},
    },
    {
        "input": {"question": "SMS OTPs not delivered to users on T-Mobile."},
        "output": {
            "summary": "T-Mobile customers not receiving SMS OTPs.",
            "category": "Notifications",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "sms"},
    },
    {
        "input": {"question": "CDN edge returns outdated JavaScript after release."},
        "output": {
            "summary": "CDN serving stale JS post-release.",
            "category": "Caching/CDN",
            "urgency": "High",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "ops"},
    },
    {
        "input": {"question": "Users can submit the same form multiple times by double-clicking."},
        "output": {
            "summary": "No duplicate submission guard on form.",
            "category": "UI/UX",
            "urgency": "Low",
        },
        "metadata": {"source": "support_ticket", "language": "en", "channel": "web"},
    },
]

In [None]:
# Upload the examples as a new Phoenix dataset (AsyncClient was already imported above).
dataset = await phoenix_client.datasets.create_dataset(
    name="triage-dataset",
    examples=examples,
)

### Create your Splits in the UI

Navigate to the Datasets section of Phoenix to see your newly created dataset, `triage-dataset`. Once all of your examples are visible, you can start creating your splits.

<center>
    <p style="text-align:center">
        <img alt="example dataset in Phoenix" src="https://arize.com/docs/phoenix/~gitbook/image?url=https%3A%2F%2Fstorage.googleapis.com%2Farize-phoenix-assets%2Fassets%2Fimages%2Fphoenix-docs-images%2Fexample_dataset.png&width=768&dpr=2&quality=100&sign=3136e7d4&sv=2"/>
    </p>
</center>
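
Before assigning splits in the UI, it can help to decide locally which examples should go where. The sketch below is only an illustration: the helper `propose_splits` and its rule (high-urgency tickets go to `hard_examples`, the rest alternate between `train` and `validation`) are assumptions made for this example, not something Phoenix prescribes. The three sample tickets are copied from the `examples` list above so the cell runs on its own.

```python
from typing import Any

# Three tickets copied from the `examples` list; in the notebook you could pass `examples` itself.
sample_examples: list[dict[str, Any]] = [
    {
        "input": {"question": "Payment gateway keeps timing out and customers can't checkout since yesterday."},
        "output": {"summary": "Payment timeouts blocking checkout.", "category": "Payment", "urgency": "High"},
    },
    {
        "input": {"question": "Password reset link just reloads the same page; doesn't send a new email."},
        "output": {"summary": "Reset link reloads; no email sent.", "category": "Authentication", "urgency": "Low"},
    },
    {
        "input": {"question": "Analytics dashboard takes more than 2 minutes to load during peak hours."},
        "output": {"summary": "Analytics dashboard slow at peak hours.", "category": "Performance", "urgency": "Medium"},
    },
]


def propose_splits(rows: list[dict[str, Any]]) -> dict[str, list[int]]:
    """Hypothetical helper: map each planned split name to the indices of its examples."""
    plan: dict[str, list[int]] = {"hard_examples": [], "train": [], "validation": []}
    for index, row in enumerate(rows):
        if row["output"]["urgency"] == "High":
            # Assumed rule for this sketch: treat high-urgency tickets as the hard cases.
            plan["hard_examples"].append(index)
        elif len(plan["train"]) <= len(plan["validation"]):
            plan["train"].append(index)
        else:
            plan["validation"].append(index)
    return plan


plan = propose_splits(sample_examples)
print(plan)
```

The resulting index lists are just a checklist for the manual assignment step in the UI; nothing here writes split membership to Phoenix.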

In [None]:
TRIAGE_PROMPT = """
You are a customer support summarization and triage assistant.

Your task:
1. Read the input message from a user.
2. Summarize it in one short, clear sentence describing the issue.
3. Classify it into ONE of these categories:
   - Account & Access
   - Billing & Payments
   - Performance & Reliability
   - Integrations & APIs
   - App Functionality & UI
   - Other
4. Assign an urgency level:
   - high → business-critical or blocking issue
   - medium → major inconvenience or degraded experience
   - low → minor inconvenience, cosmetic, or question
"""

In [None]:
async def triage_issue(input: Any) -> dict[str, Any]:
    response = await openai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": TRIAGE_PROMPT},
            {"role": "user", "content": str(input)},
        ],
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "triage",
                    "description": (
                        "Triage the input message into a summary, category, and urgency level."
                    ),
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "summary": {
                                "type": "string",
                                "description": "A short, clear sentence describing the issue.",
                            },
                            "category": {
                                "type": "string",
                                "description": "The category of the issue.",
                                "enum": [
                                    "account_and_access",
                                    "billing_and_payments",
                                    "performance_and_reliability",
                                    "integrations_and_apis",
                                    "app_functionality_and_ui",
                                    "other",
                                ],
                            },
                            "urgency": {
                                "type": "string",
                                "description": "The urgency level of the issue.",
                                "enum": ["High", "Medium", "Low"],
                            },
                        },
                        "required": ["summary", "category", "urgency"],
                    },
                },
            }
        ],
    )
    tool_calls = response.choices[0].message.tool_calls
    if not tool_calls:
        raise ValueError("No tool call found in response")
    arguments = tool_calls[0].function.arguments

    parsed = json.loads(arguments)
    return parsed  # type: ignore
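
To score the dictionaries that `triage_issue` returns against the reference outputs in the dataset, a small comparison function is usually enough. The sketch below compares only the `urgency` field, because the dataset's category labels (for example "Payment" or "File Upload") do not match the category enum in the tool schema above, so an exact category match would always fail. The function name `urgency_matches` is hypothetical. Passing such a function to `run_experiment` through an `evaluators` argument, with `output` and `expected` bound by parameter name, is an assumption here; check the Phoenix experiments documentation for your version before relying on it.

```python
from typing import Any


def urgency_matches(output: dict[str, Any], expected: dict[str, Any]) -> bool:
    """Return True when the predicted urgency equals the reference urgency, ignoring case."""
    predicted = str(output.get("urgency", "")).strip().lower()
    reference = str(expected.get("urgency", "")).strip().lower()
    # An example without a reference urgency never counts as a match.
    return bool(reference) and predicted == reference


# Local check with hand-written values; no model call or Phoenix server is involved.
print(urgency_matches({"urgency": "high"}, {"urgency": "High"}))
print(urgency_matches({"urgency": "Low"}, {"urgency": "High"}))

# Assumed usage (not shown in this notebook): evaluators=[urgency_matches] in run_experiment(...)
```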

To fetch only certain splits of your dataset, pass the `splits` parameter to `get_dataset()`, e.g. `splits=["hard_examples"]`. Use the name of a split you created in the UI.

In [None]:
dataset = await phoenix_client.datasets.get_dataset(
    dataset="triage-dataset", splits=["hard_examples"]
)

In [None]:
experiment = await phoenix_client.experiments.run_experiment(
    dataset=dataset,
    task=triage_issue,
    experiment_name="few-shot-experiment",
)
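
To compare across splits, the same fetch-then-run pattern can be repeated once per split. The sketch below wraps the two calls used above, `get_dataset(dataset=..., splits=[...])` and `run_experiment(dataset=..., task=..., experiment_name=...)`, in a loop. The helper `run_on_splits`, the `triage-<split>` naming scheme, and the stand-in client are assumptions added so the example runs without a Phoenix server; they are not part of the Phoenix API.

```python
import asyncio
from types import SimpleNamespace
from typing import Any


async def run_on_splits(
    client: Any, dataset_name: str, split_names: list[str], task: Any
) -> dict[str, Any]:
    """Hypothetical helper: run one experiment per split and collect the results by split name."""
    results: dict[str, Any] = {}
    for split in split_names:
        subset = await client.datasets.get_dataset(dataset=dataset_name, splits=[split])
        results[split] = await client.experiments.run_experiment(
            dataset=subset,
            task=task,
            experiment_name=f"triage-{split}",  # assumed naming scheme
        )
    return results


# Stand-in client so the sketch runs offline; in the notebook, pass `phoenix_client` instead.
async def _fake_get_dataset(*, dataset: str, splits: list[str]) -> dict[str, Any]:
    return {"name": dataset, "splits": splits}


async def _fake_run_experiment(*, dataset: Any, task: Any, experiment_name: str) -> dict[str, Any]:
    return {"experiment_name": experiment_name, "splits": dataset["splits"]}


fake_client = SimpleNamespace(
    datasets=SimpleNamespace(get_dataset=_fake_get_dataset),
    experiments=SimpleNamespace(run_experiment=_fake_run_experiment),
)


async def echo_task(input: Any) -> Any:
    """Placeholder task; in the notebook, use `triage_issue`."""
    return input


results = asyncio.run(
    run_on_splits(fake_client, "triage-dataset", ["train", "hard_examples"], echo_task)
)
print(results)
```

Inside a notebook, where an event loop is already running, replace the `asyncio.run(...)` call with `await run_on_splits(phoenix_client, "triage-dataset", [...], triage_issue)`, using split names you actually created in the UI.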