37 active experiments across github/gh-aw. 10 have reached min_samples (READY); 27 are still collecting. 1 shows assignment imbalance (χ² p < 0.05). Outcome metrics (success rate, duration) are unavailable for this run, which analyzed git-branch state only.
⚡ Quick Stats
| Metric | Value |
| --- | --- |
| Active experiments | 37 |
| 🟢 READY for analysis | 10 |
| 🟡 EXTEND (collecting) | 27 |
| Significant (p < 0.05) | 0 (outcome data N/A) |
| ⚠️ Assignment imbalance | 1 |
| Recommendations | 🟡 EXTEND: 37 |
🟢 READY Experiments (10)
All variants ≥ min_samples. Awaiting outcome data for significance testing.
| Experiment | Workflow | Variants | χ² p-value | Balance |
| --- | --- | --- | --- | --- |
| caveman | smoke-copilot | no=103, yes=103 | 1.000 | ✅ |
| subagent_model | smoke-copilot | large=93, small=93 | 1.000 | ✅ |
| sub_agent_strategy | smoke-gemini | single_agent=65, sub_agents=97 | 0.012 | ⚠️ |
| sub_agent_strategy | smoke-antigravity | single_agent=62, sub_agents=71 | 0.435 | ✅ |
| sub_agent_decomposition | smoke-pi | single_agent=34, parallel_sub_agents=42 | 0.359 | ✅ |
| caveman | smoke-copilot-aoai-a | no=41, yes=40 | 0.912 | ✅ |
| subagent_model | smoke-copilot-aoai-a | large=40, small=41 | 0.912 | ✅ |
| caveman | smoke-copilot-aoai-e | no=27, yes=26 | 0.891 | ✅ |
| subagent_model | smoke-copilot-aoai-e | large=27, small=26 | 0.891 | ✅ |
| prompt_style | daily-community-attr | concise=25, verbose=25 | 1.000 | ✅ |
⚠️ smoke-gemini.sub_agent_strategy: Assignment imbalance detected (sub_agents=97 vs single_agent=65, χ²=6.32, p=0.012). Investigate the variant weighting logic; imbalanced assignment may bias outcome analysis.
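The balance check above can be reproduced with a chi-square goodness-of-fit test. The sketch below is illustrative only: it assumes two variants with an intended 50/50 split (the report does not state the configured weights), and the function names `chi2_balance` and `is_imbalanced` are hypothetical, not part of gh-aw. With one degree of freedom the p-value has a closed form, so only the standard library is needed.

```python
import math


def chi2_balance(counts):
    """Chi-square goodness-of-fit against an equal split (assumed 50/50).

    `counts` maps variant name -> number of assignments. Only the
    two-variant case (1 degree of freedom) is handled here.
    """
    if len(counts) != 2:
        raise ValueError("this sketch only handles exactly two variants")
    total = sum(counts.values())
    expected = total / 2  # assumption: equal intended weights
    stat = sum((n - expected) ** 2 / expected for n in counts.values())
    # For df=1, the chi-square survival function is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def is_imbalanced(counts, alpha=0.05):
    """Flag an experiment when the split differs from 50/50 at level alpha."""
    _, p_value = chi2_balance(counts)
    return p_value < alpha


# Counts taken from the smoke-gemini row of the READY table.
stat, p = chi2_balance({"single_agent": 65, "sub_agents": 97})
print(f"chi2={stat:.2f}, p={p:.3f}")
```

Applied to the other READY rows, the same function gives p-values in line with the table, for example about 0.435 for single_agent=62 vs sub_agents=71.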
Recommendation for all READY experiments: EXTEND — Collect outcome data (success/failure, duration) to run significance tests on the primary metric.
🟡 EXTEND Experiments (27)
View all 27 collecting experiments
| Experiment | Workflow | Total n | Slowest variant progress | Issue |
| --- | --- | --- | --- | --- |
| tone_variant | aw-failure-investigato | 93 | ████░░░░ 26/50 (52%) | #36105 |
| prompt_style | daily-astrostylelite-m | 52 | ███████░ 25/30 (83%) | none |
| output_format | daily-issues-report | 49 | ██████░░ 23/30 (77%) | #30573 |
| prompt_style | daily-news | 44 | █████░░░ 20/30 (67%) | #31190 |
| reasoning_depth | daily-security-red-tea | 43 | █████░░░ 20/30 (67%) | #31673 |
| output_format | daily-compiler-quality | 39 | ██████░░ 16/20 (80%) | #32390 |
| output_format | daily-code-metrics | 38 | ███████░ 17/20 (85%) | #1 |
| semgrep_output_format | daily-semgrep-scan | 38 | ███░░░░░ 11/30 (37%) | #32795 |
| output_format | deep-report | 36 | ██████░░ 11/15 (73%) | none |
| sub_agent_strategy | agent-persona-explorer | 35 | ███████░ 12/14 (86%) | none |
| prompt_compression | agent-performance-anal | 34 | ███████░ 13/14 (93%) | #33280 |
| tool_verbosity | gpclean | 31 | ███░░░░░ 13/30 (43%) | none |
| reasoning_depth | daily-fact | 29 | ███░░░░░ 11/30 (37%) | #31324 |
| prompt_style | ci-coach | 28 | █████░░░ 12/20 (60%) | #32335 |
| tone_style | typist | 21 | ███████░ 9/10 (90%) | #34032 |
| model_size | daily-doc-healer | 17 | ███░░░░░ 7/20 (35%) | none |
| model_size | daily-caveman-optimize | 17 | ██░░░░░░ 6/20 (30%) | none |
| model_size | daily-doc-updater | 17 | ███░░░░░ 8/20 (40%) | none |
| sub_agent_strategy | daily-agentrx-trace-op | 17 | ██░░░░░░ 5/20 (25%) | none |
| output_format | copilot-agent-analysis | 16 | ██░░░░░░ 7/30 (23%) | none |
| prompt_style | dependabot-go-checker | 11 | █░░░░░░░ 3/30 (10%) | none |
| log_fetch_strategy | daily-safe-output-opti | 9 | ██░░░░░░ 4/20 (20%) | none |
| sub_agent_strategy | architecture-guardian | 7 | █░░░░░░░ 2/30 (7%) | #39062 |
| detail_level | daily-architecture-dia | 4 | ███░░░░░ 4/10 (40%) | #31926 |
| prompt_style | issue-arborist | 4 | █░░░░░░░ 2/30 (7%) | #30015 |
| caveman_mode | dataflow-pr-discussion | 2 | █░░░░░░░ 1/10 (10%) | #37102 |
| prefetch_strategy | weekly-blog-post-write | 2 | ██░░░░░░ 2/10 (20%) | #38590 |
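The READY/EXTEND split and the progress column above can be sketched as follows. This is a hypothetical reconstruction rather than the report generator's actual code: it assumes the denominator in each progress cell is that experiment's min_samples, that an experiment is READY once its smallest variant reaches min_samples, and that the bar is 8 cells wide with the filled count rounded to the nearest cell. The names `status` and `progress_bar` are illustrative.

```python
def status(variant_counts, min_samples):
    """Return 'READY' once every variant has at least min_samples runs."""
    slowest = min(variant_counts.values())
    return "READY" if slowest >= min_samples else "EXTEND"


def progress_bar(n, min_samples, width=8):
    """Render progress of the slowest variant as a fixed-width text bar."""
    # Assumption: filled cells are rounded to the nearest cell, capped at width.
    filled = min(width, round(width * n / min_samples))
    percent = round(100 * n / min_samples)
    return f"{'█' * filled}{'░' * (width - filled)} {n}/{min_samples} ({percent}%)"


# Slowest-variant values taken from the tone_variant row of the table.
print(progress_bar(26, 50))

# Hypothetical per-variant counts, for illustration only.
print(status({"a": 26, "b": 67}, min_samples=50))
```

Under these assumptions the rendering matches the cells shown in the table, for example `████░░░░ 26/50 (52%)` for 26 of 50 samples.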
📋 Summary
| Metric | Value |
| --- | --- |
| Total active experiments | 37 |
| Ready (n ≥ min_samples) | 10 |
| Still collecting | 27 |
| Experiments with tracking issues | 16 |
| Assignment balance violations | 1 |
Analysis: git state branches · Significance: p < 0.05 · Outcome data: N/A (no GitHub API)
Run: §28086804240
Warning
Firewall blocked 2 domains
The following domains were blocked by the firewall during workflow execution:
proxy.golang.org
releaseassets.githubusercontent.com
To allow these domains, add them to the network.allowed list in your workflow frontmatter. See Network Configuration for more information.
🧪 Daily Experiment Report — 2026-06-24