Add daily workflow for monitoring compiler error message quality #13806

Merged: pelikhan merged 9 commits into main from copilot/improve-syntax-error-rendering on Feb 4, 2026

Copilot AI commented Feb 4, 2026

Adds automated daily testing of compiler error message quality to drive developer experience (DX) improvements.

Implementation

New workflow: .github/workflows/daily-syntax-error-quality.md

Systematically evaluates error messages by:

  • Introducing 3 syntax errors across diverse workflows (YAML syntax, invalid config, semantic conflicts)
  • Scoring each error across 5 dimensions (100-point scale): clarity, actionability, context, examples, consistency
  • Creating GitHub issues with specific improvement recommendations when average score < 70 or any test < 55
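
The issue-creation rule in the last bullet can be sketched as a small shell function. This is only an illustration of the thresholds, not the workflow's actual logic (the agent applies the rule from prose instructions); the name `should_open_issue` is hypothetical.

```shell
# Hypothetical helper: decide whether a run should open an issue.
# Arguments: one total score (0-100) per test case.
# Exit status 0 means "open an issue", 1 means "no issue needed".
should_open_issue() {
  sum=0
  count=0
  for score in "$@"; do
    # Any single test below 55 triggers an issue on its own.
    if [ "$score" -lt 55 ]; then
      return 0
    fi
    sum=$((sum + score))
    count=$((count + 1))
  done
  # No scores means there is nothing to report.
  [ "$count" -gt 0 ] || return 1
  # Compare the sum against 70 * count so integer division cannot
  # round an average such as 69.7 up or down.
  [ "$sum" -lt $((70 * count)) ]
}

# Three tests averaging 68: below the 70-point average threshold.
if should_open_issue 82 64 58; then
  echo "open issue"
else
  echo "no issue needed"
fi
```

With the scores above the average is 68, so the sketch prints `open issue` even though no individual test falls below 55.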

Scoring Framework

| Dimension | Weight | Evaluates |
| --- | --- | --- |
| Clarity | 25 pts | Message understandability |
| Actionability | 25 pts | Fix guidance provided |
| Context | 20 pts | Error location shown |
| Examples | 15 pts | Correct usage demonstrated |
| Consistency | 15 pts | Format adherence |
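
As a rough sketch of how the weights combine, the function below sums five dimension scores after checking each against its maximum from the table. The name `total_score` and the argument order are assumptions made for illustration; the workflow itself leaves scoring to the agent.

```shell
# Hypothetical helper: add up the five dimension scores.
# Arguments: clarity actionability context examples consistency
# Each score is checked against its maximum (25, 25, 20, 15, 15).
total_score() {
  if [ "$#" -ne 5 ]; then
    echo "usage: total_score clarity actionability context examples consistency" >&2
    return 2
  fi
  total=0
  for max in 25 25 20 15 15; do
    value=$1
    shift
    if [ "$value" -lt 0 ] || [ "$value" -gt "$max" ]; then
      echo "score $value is outside 0..$max" >&2
      return 1
    fi
    total=$((total + value))
  done
  echo "$total"
}

# Scores from the example evaluation in this description.
total_score 22 18 16 12 14   # prints 82
```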

Quality thresholds:

  • Excellent: 85-100
  • Good: 70-84
  • Acceptable: 55-69 (improvement needed)
  • Poor: 40-54 (confusing)
  • Critical: 0-39 (harmful)
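
The bands above map directly onto a lookup. A minimal sketch, with `rate_score` as a hypothetical name:

```shell
# Hypothetical helper: map a 0-100 total to its quality band.
rate_score() {
  score=$1
  if [ "$score" -lt 0 ] || [ "$score" -gt 100 ]; then
    echo "score must be between 0 and 100" >&2
    return 1
  fi
  if   [ "$score" -ge 85 ]; then echo "Excellent"
  elif [ "$score" -ge 70 ]; then echo "Good"
  elif [ "$score" -ge 55 ]; then echo "Acceptable"
  elif [ "$score" -ge 40 ]; then echo "Poor"
  else                           echo "Critical"
  fi
}

rate_score 82   # prints Good
```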

Example Error Evaluation

# Test case: Invalid engine name
./gh-aw compile test-workflow.md

# Output:
# test-workflow.md:3:8: error: at '/engine': value must be one of 'claude', 'codex', 'copilot', 'custom'
# Shows: line:column, valid options, source context
# Score: Clarity 22/25, Actionability 18/25, Context 16/20, Examples 12/15, Consistency 14/15
# Total: 82/100 (Good)
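
To check the "Context" dimension mechanically, an evaluator could split a diagnostic like the one above into its parts. The sketch below assumes the `file:line:column: severity: message` shape seen in this single example; `parse_diagnostic` is a hypothetical name, and file names containing a colon are not handled.

```shell
# Hypothetical helper: split "<file>:<line>:<column>: <severity>: <message>".
# Assumes the shape of the one example in this PR; paths that contain
# a colon (for example Windows drive letters) are not supported.
parse_diagnostic() {
  line_text=$1
  case $line_text in
    *:*:*:\ *:\ *) ;;
    *) echo "unrecognized diagnostic: $line_text" >&2; return 1 ;;
  esac
  file=${line_text%%:*}      # text before the first colon
  rest=${line_text#*:}
  line_no=${rest%%:*}        # text between first and second colon
  rest=${rest#*:}
  column=${rest%%:*}         # text between second and third colon
  rest=${rest#*: }
  severity=${rest%%:*}       # e.g. "error"
  message=${rest#*: }        # everything after "<severity>: "
  # Line and column must be plain non-negative integers.
  case $line_no in ''|*[!0-9]*) echo "bad line number: $line_no" >&2; return 1 ;; esac
  case $column in ''|*[!0-9]*) echo "bad column: $column" >&2; return 1 ;; esac
  printf 'file=%s line=%s column=%s severity=%s\n' "$file" "$line_no" "$column" "$severity"
  printf 'message=%s\n' "$message"
}

parse_diagnostic "test-workflow.md:3:8: error: at '/engine': value must be one of 'claude', 'codex', 'copilot', 'custom'"
```

A diagnostic that fails to parse is itself a signal: it suggests the message lacks the location information the Context dimension looks for.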

Configuration

  • Schedule: Daily (scattered)
  • Safe outputs: Max 1 issue per run, auto-closes older issues, labels: dx, error-messages, automated-analysis
  • Timeout: 20 minutes
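
For readers unfamiliar with "scattered" schedules, the idea is that each workflow gets a stable but arbitrary time of day so that daily jobs do not all fire at once. The sketch below illustrates one way to derive such a time from a workflow name using `cksum`. It is an assumption for illustration only and is not the algorithm the gh-aw compiler uses, which this PR does not describe; `scattered_daily_cron` is a hypothetical name.

```shell
# Hypothetical illustration: derive a stable daily cron expression
# from a workflow name. NOT the gh-aw compiler's actual algorithm.
scattered_daily_cron() {
  # cksum prints "<crc> <byte count>" for stdin; keep only the CRC.
  crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  minute=$((crc % 60))
  hour=$(( (crc / 60) % 24 ))
  echo "$minute $hour * * *"
}

# The same name always yields the same schedule.
scattered_daily_cron "daily-syntax-error-quality"
```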

Workflow ensures error messages answer: What broke? Where? How to fix? Show example?

Original prompt

Create a daily agentic workflow that makes sure the rendering of syntax errors in workflows looks good.

report 3 times:

  • take an existing agentic workflow and generate a syntactic error
  • run compiler and collect output
  • evaluate the quality of the output such that it helps an agent fixing the error

generate a suggestion as an agentic prompt to improve the output in a create-issue



Copilot AI and others added 7 commits February 4, 2026 23:00
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
… syntax

Copilot AI changed the title from "[WIP] Enhance rendering of syntax errors in workflows" to "Add daily workflow for monitoring compiler error message quality" Feb 4, 2026
Copilot AI requested a review from pelikhan February 4, 2026 23:13
pelikhan marked this pull request as ready for review February 4, 2026 23:37
Copilot AI review requested due to automatic review settings February 4, 2026 23:37
pelikhan merged commit 7d59438 into main Feb 4, 2026
55 checks passed
pelikhan deleted the copilot/improve-syntax-error-rendering branch February 4, 2026 23:37

Copilot AI left a comment

Pull request overview

This PR introduces automated monitoring of compiler error message quality through a new daily workflow. The workflow systematically tests error messages by introducing syntax errors in diverse workflows, evaluating them across five dimensions (clarity, actionability, context, examples, consistency), and creating GitHub issues with specific improvement recommendations when quality falls below defined thresholds.

Changes:

  • Adds new daily workflow for testing compiler error message quality with comprehensive evaluation framework
  • Implements 100-point scoring system across 5 dimensions to objectively assess error message effectiveness
  • Configures safe outputs with automatic issue creation (max 1 issue/run, 3-day expiration, auto-closes older issues)

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| .github/workflows/daily-syntax-error-quality.md | Main workflow definition with detailed instructions for error testing, evaluation criteria, and issue reporting template |
| .github/workflows/daily-syntax-error-quality.lock.yml | Compiled YAML workflow with full job configuration, MCP server setup, and safe outputs handling |
| docs/src/content/docs/agent-factory-status.mdx | Adds new workflow entry to the agent factory status table with badge |

name: Daily Syntax Error Quality Check
description: Tests compiler error message quality by introducing syntax errors in workflows, evaluating error clarity, and suggesting improvements
on:
  schedule: daily

Copilot AI Feb 4, 2026

The schedule value 'daily' is ambiguous as a configuration value. While it may be valid syntax for this system, it would be clearer to use a standard cron expression or explicitly state that this is a custom keyword that the compiler translates to a randomized daily schedule. The generated lock file shows this becomes '48 1 * * *', but this transformation is not obvious from the source.

Suggested change
-   schedule: daily
+   schedule:
+     - cron: '48 1 * * *'

1. **Attempt to compile** the modified workflow:
```bash
cd /tmp/syntax-error-tests
./gh-aw compile test-1.md 2>&1 | tee test-1-output.txt
```

Copilot AI Feb 4, 2026

The compiler executable './gh-aw' is invoked from the '/tmp/syntax-error-tests' directory, but the workflow never copies the binary there or otherwise ensures it is available in that directory. The workflow should either reference the compiler with an absolute path (e.g., '${GITHUB_WORKSPACE}/gh-aw') or include instructions to copy it to the test directory.

Suggested change
- ./gh-aw compile test-1.md 2>&1 | tee test-1-output.txt
+ ${GITHUB_WORKSPACE}/gh-aw compile test-1.md 2>&1 | tee test-1-output.txt
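
A slightly more defensive variant of this suggestion would resolve the binary once and fail early if it is missing. The sketch below is an assumption about how that could look, not part of the PR; `resolve_compiler` is a hypothetical name, and the fallback to `$PWD` when `GITHUB_WORKSPACE` is unset is a choice made here for local runs.

```shell
# Hypothetical helper: print the absolute path to the compiler binary,
# or fail with a message if it is missing or not executable.
# Argument: optional root directory (defaults to $GITHUB_WORKSPACE,
# then to the current directory when run outside GitHub Actions).
resolve_compiler() {
  root=${1:-${GITHUB_WORKSPACE:-$PWD}}
  bin="$root/gh-aw"
  if [ ! -x "$bin" ]; then
    echo "compiler not found or not executable: $bin" >&2
    return 1
  fi
  echo "$bin"
}

# Usage (commented out because it needs a built gh-aw binary):
# compiler=$(resolve_compiler) || exit 1
# cd /tmp/syntax-error-tests
# "$compiler" compile test-1.md 2>&1 | tee test-1-output.txt
```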


```bash
# Find candidate workflows
find .github/workflows -name '*.md' -type f ! -name 'daily-*.md' ! -name '*-test.md' | head -10
```

Copilot AI Feb 4, 2026

The command limits results to 10 workflows with 'head -10', but the instructions state to select 3 workflows. This could be confusing. Consider either changing to 'head -5' for a smaller sample set, or adding a comment explaining why 10 results are shown when only 3 will be selected.

Suggested change
- find .github/workflows -name '*.md' -type f ! -name 'daily-*.md' ! -name '*-test.md' | head -10
+ find .github/workflows -name '*.md' -type f ! -name 'daily-*.md' ! -name '*-test.md' | head -5
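
Another way to remove the mismatch between the listing size and the number of workflows tested is to make the count an explicit parameter. A minimal sketch, with `list_candidates` as a hypothetical name and sorting added here only to make the selection reproducible:

```shell
# Hypothetical helper: list up to N candidate workflows.
# Arguments: directory (default .github/workflows), count (default 3).
# Uses the same filters as the command under review; results are
# sorted so the same repository state yields the same selection.
list_candidates() {
  dir=${1:-.github/workflows}
  count=${2:-3}
  find "$dir" -name '*.md' -type f ! -name 'daily-*.md' ! -name '*-test.md' \
    | sort \
    | head -n "$count"
}

# Usage, from the repository root:
# list_candidates .github/workflows 3
```

Where GNU coreutils is available, replacing `sort | head -n "$count"` with `shuf -n "$count"` would give a random sample instead, which may suit the goal of testing diverse workflows better.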
