Add prompt_style A/B experiment to daily-news workflow#31192

Merged
pelikhan merged 3 commits into main from copilot/ab-advisor-experiment-prompt-style
May 9, 2026

Conversation

Contributor

Copilot AI commented May 9, 2026

✨ Enhancement

The daily-news prompt is currently highly prescriptive. This PR introduces an A/B campaign to measure whether a concise prompt can reduce token usage while preserving output quality and chart/discussion reliability. The workflow now supports detailed and concise prompt variants, with experiment metadata and runtime branching.

  • Experiment configuration (frontmatter)

    • Added rich experiments.prompt_style object to .github/workflows/daily-news.md:
      • variants: detailed, concise
      • primary metric: effective_token_count
      • secondary metrics: output_length_chars, run_duration_ms, chart_generated
      • guardrails for discussion/chart success rates
      • min_samples, weight, start_date, analysis_type, tags
      • wired to issue tracking/notification fields for this campaign
    • Regenerated .github/workflows/daily-news.lock.yml via workflow compile.
  • Prompt-body variant gating

    • Wrapped the verbose trend-chart phase block in variant conditionals.
    • Wrapped the verbose data-sources/reporting block in variant conditionals.
    • detailed path keeps existing long-form instructions; concise path uses a short directive focused on:
      • pre-downloaded data in /tmp/gh-aw/daily-news-data/
      • exactly 2 trend charts
      • discussion output structure and tone requirements
  • Lockfile cleanup

    • Removed duplicated actions/github-script manifest comment entry in the generated lock file.
{{#if experiments.prompt_style == "concise"}}
## 📊 Trend Charts Requirement
Generate exactly **2 trend charts** ... from `/tmp/gh-aw/daily-news-data/` ...
{{else}}
## 📊 Trend Charts Requirement
**IMPORTANT**: Generate exactly 2 trend charts ...
### Chart Generation Process
...
{{/if}}
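
To make the gating above concrete, here is a minimal Python sketch of how a conditional of this shape could resolve to a single branch per variant. It is not the gh-aw template engine: `render_variant` and its regular expression are hypothetical, and they handle only a non-nested `{{#if experiments.NAME == "VALUE"}} ... {{else}} ... {{/if}}` block.

```python
import re

# Hypothetical helper, not part of gh-aw. It resolves non-nested blocks of the form
# {{#if experiments.NAME == "VALUE"}} A {{else}} B {{/if}}.
_IF_BLOCK = re.compile(
    r'\{\{#if experiments\.(\w+) == "([^"]*)"\}\}\n?'
    r'(.*?)'
    r'\{\{else\}\}\n?'
    r'(.*?)'
    r'\{\{/if\}\}\n?',
    re.DOTALL,
)


def render_variant(template: str, experiments: dict[str, str]) -> str:
    """Keep the branch that matches the selected variant for each block."""

    def pick(match: re.Match) -> str:
        name, expected, if_body, else_body = match.groups()
        return if_body if experiments.get(name) == expected else else_body

    return _IF_BLOCK.sub(pick, template)


# Shortened stand-in for the prompt body; the real instructions are longer.
template = (
    '{{#if experiments.prompt_style == "concise"}}\n'
    "Generate exactly 2 trend charts.\n"
    "{{else}}\n"
    "IMPORTANT: Generate exactly 2 trend charts.\n"
    "Chart Generation Process\n"
    "{{/if}}\n"
)

concise = render_variant(template, {"prompt_style": "concise"})
detailed = render_variant(template, {"prompt_style": "detailed"})
```

With this sketch, `concise` holds only the short directive and `detailed` holds the long-form branch, so each run sees exactly one set of chart instructions.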

Copilot AI and others added 2 commits May 9, 2026 11:46
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Experiment campaign for daily-news: A/B test prompt style" to "Add prompt_style A/B experiment to daily-news workflow" May 9, 2026
Copilot AI requested a review from pelikhan May 9, 2026 11:51
@pelikhan pelikhan marked this pull request as ready for review May 9, 2026 11:52
Copilot AI review requested due to automatic review settings May 9, 2026 11:52
@pelikhan pelikhan merged commit 1ca07c9 into main May 9, 2026
@pelikhan pelikhan deleted the copilot/ab-advisor-experiment-prompt-style branch May 9, 2026 11:52
Contributor

Copilot AI left a comment

Pull request overview

Adds a prompt_style A/B experiment to the daily-news agentic workflow to compare the existing verbose prompt against a new concise variant, and regenerates the compiled lock workflow to wire up experiment selection/state persistence.

Changes:

  • Introduces an experiments.prompt_style configuration block (variants + metrics + guardrails) to the daily-news workflow frontmatter.
  • Adds handlebars conditionals to branch between detailed and concise prompt instructions for charts and the report body.
  • Regenerates .github/workflows/daily-news.lock.yml, including experiment pick/persist plumbing and other compile-time output changes.

| File | Description |
| --- | --- |
| .github/workflows/daily-news.md | Adds experiment metadata + prompt-body branching for detailed vs concise. |
| .github/workflows/daily-news.lock.yml | Recompiled workflow lockfile to include experiment runtime/state jobs and updated generated workflow content. |
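
The "experiment pick/persist" step mentioned above can be pictured with a small Python sketch. The source does not show how the compiled workflow selects a variant or stores state, so everything here is an assumption for illustration: `pick_variant`, the JSON state file, and the least-used-first rule are hypothetical and do not describe the generated lock file.

```python
import json
import random
from pathlib import Path


def pick_variant(variants, state_path, rng=random):
    """Pick the least-used variant (ties broken at random) and persist the counts.

    Hypothetical balancing rule; the real workflow may weight variants differently.
    """
    path = Path(state_path)
    stored = json.loads(path.read_text()) if path.exists() else {}
    # Normalise state so every configured variant has a counter.
    counts = {v: int(stored.get(v, 0)) for v in variants}
    fewest = min(counts.values())
    choice = rng.choice([v for v in variants if counts[v] == fewest])
    counts[choice] += 1
    path.write_text(json.dumps(counts))
    return choice
```

Calling `pick_variant(["detailed", "concise"], "state.json")` on successive runs keeps the two arms within one sample of each other, which is one simple way to reach a per-variant sample target evenly.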

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 4

variants: [detailed, concise]
description: "Tests whether a concise directive produces equivalent discussion quality to the current verbose 5-phase prompt"
hypothesis: "H0: no change in output quality. H1: concise prompt reduces token usage by ≥20% with no significant drop in output completeness score"
metric: effective_token_count
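
As a rough illustration of the hypothesis above, the following Python sketch computes the relative reduction in mean `effective_token_count` and compares it with the 20% target. The function names and the sample numbers are made up for the example, and `min_samples` is passed in because its configured value is not shown here. This is only a point-estimate check; it does not test statistical significance or the completeness guardrail.

```python
from statistics import mean


def token_reduction(detailed_tokens, concise_tokens):
    """Relative reduction in mean effective_token_count, concise vs detailed."""
    baseline = mean(detailed_tokens)
    return (baseline - mean(concise_tokens)) / baseline


def meets_target(detailed_tokens, concise_tokens, min_samples, target=0.20):
    """True only if both arms have enough samples and the reduction reaches the target."""
    if min(len(detailed_tokens), len(concise_tokens)) < min_samples:
        return False
    return token_reduction(detailed_tokens, concise_tokens) >= target


# Illustrative numbers only, not measurements from the experiment.
detailed_runs = [1000, 1200]
concise_runs = [800, 800]
reduction = token_reduction(detailed_runs, concise_runs)
```

For these made-up inputs the means are 1100 and 800 tokens, so the reduction is 300/1100, roughly 27%, which clears the 20% threshold once the sample minimum is met.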
Comment on lines +39 to +40
notify:
  issue: 31190
# Fix permissions on firewall logs/audit dirs so they can be uploaded as artifacts
# AWF runs with sudo, creating files owned by root
sudo chmod -R a+rX /tmp/gh-aw/sandbox/firewall 2>/dev/null || true
sudo chmod -R a+r /tmp/gh-aw/sandbox/firewall 2>/dev/null || true
Comment on lines 66 to 70
name: "Daily News"
"on":
  schedule:
  - cron: "34 8 * * 1-5"
  - cron: "45 8 * * 1-5"
    # Friendly format: daily around 9:00 on weekdays (scattered)
Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ab-advisor] Experiment campaign for daily-news: A/B test prompt_style

3 participants