[experiments] Daily Experiment Report — 2026-06-21 #40603
Closed
This discussion has been marked as outdated by daily-experiment-report. A newer discussion is available at Discussion #40763.
🧪 Daily Experiment Report — 2026-06-21
38 workflows · 41 A/B experiments · 11 READY (🟢 min_samples reached) · 30 EXTEND (🟡 collecting) · 4 imbalanced (p < 0.05).
⚡ Quick Stats
🟢 READY FOR ANALYSIS (11 experiments)
smoke-copilot: caveman × subagent_model (factorial)
Both experiments perfectly balanced (p=1.000). 192 runs (caveman) / 172 runs (subagent_model).
Variants: caveman = no (ctrl), yes · subagent_model = large (ctrl), small
Recommendation: 🟢 READY. Outcome metrics needed for PROMOTE/ABANDON.
smoke-gemini: sub_agent_strategy
Variants: sub_agents (ctrl), single_agent
chi2=8.22, df=1, p=0.0041 (IMBALANCED): expected 50/50, got 62/38. Check weight: config.
Recommendation: ⚠️ INVESTIGATE BALANCE before outcome analysis.
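For readers who want to sanity-check a balance figure like the one above, here is a minimal sketch of a chi-square goodness-of-fit test for a two-variant split. It is not the report's own implementation, which is not shown here. The function name `balance_test` and the run counts 92 and 57 are assumptions made for illustration: the report gives only the 62/38 percentage split, and these counts were picked because they are consistent with it.

```python
import math


def balance_test(counts, weights=(0.5, 0.5)):
    """Chi-square goodness-of-fit test for a two-variant assignment split.

    counts  -- observed runs per variant, e.g. (control, treatment)
    weights -- expected share of runs per variant (must sum to 1)
    Returns (chi2, p_value) with one degree of freedom.
    """
    if len(counts) != 2 or len(weights) != 2:
        raise ValueError("this sketch only handles two variants (df = 1)")
    total = sum(counts)
    expected = [total * w for w in weights]
    chi2 = sum((obs - exp) ** 2 / exp for obs, exp in zip(counts, expected))
    # For df = 1 the chi-square survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (not taken from the report): 92 vs 57 runs is about 62/38.
chi2, p_value = balance_test((92, 57))
print(f"chi2={chi2:.2f}, df=1, p={p_value:.4f}")
print("IMBALANCED" if p_value < 0.05 else "balanced")
```

With these assumed counts the sketch gives a statistic close to the reported chi2=8.22 and a p-value below the 0.05 threshold. If the intended split is not 50/50, pass the configured weights instead of the default.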
smoke-antigravity: sub_agent_strategy
Variants: single_agent (ctrl), sub_agents
Balance: p=0.584 ✅ · Recommendation: 🟢 READY
Additional READY experiments
prompt_style, prompt_style, sub_agent_decomposition, caveman, subagent_model, caveman, subagent_model
🟡 EXTEND (30 experiments)
View all 30 EXTEND experiments
tone_variant, prompt_compression, sub_agent_strategy, prompt_style, output_format, output_format, reasoning_depth, output_format, prompt_style, reasoning_depth, semgrep_output_format, output_format, tool_verbosity, tone_style
agent-persona-explorer/sub_agent_strategy is also imbalanced (p=0.034).
Warning
Firewall blocked 2 domains
The following domains were blocked by the firewall during workflow execution:
proxy.golang.org
releaseassets.githubusercontent.com
See Network Configuration for more information.