chore: add daily Drain3 weight training workflow; delete agentdrain demo binary #24344
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/c5c57258-f0a1-4ade-8afc-9c4464b162cc Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* feat: add pkg/agentdrain - Drain3-style log template mining package
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/850383e4-6ce1-4a3d-aa07-dae32343caa6
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* feat: integrate drain3 analysis into audit report and logs
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/850383e4-6ce1-4a3d-aa07-dae32343caa6
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* fix: remove incorrect build tags from non-test source files
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/850383e4-6ce1-4a3d-aa07-dae32343caa6
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* feat: add --train flag to logs command for drain3 weight pretraining
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/621cd144-30cc-44cd-9e7c-37361cee1b70
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* fix: address code review - bytes.TrimSpace, log pretty-print errors, return train error
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/621cd144-30cc-44cd-9e7c-37361cee1b70
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* feat: integrate drain3 analysis into audit report subcommand
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1361e355-3eb5-4c65-9f64-ee483320bd65
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* fix: improve drain3 cross-run test assertions and doc comment
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1361e355-3eb5-4c65-9f64-ee483320bd65
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* fix: remove Drain3 name from user-facing report output
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/89ebe149-2934-400a-a97e-a8f73ee6bbe4
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
* chore: add daily Drain3 weight training workflow; delete agentdrain demo binary (#24344)
  * Initial plan
  * feat: add daily drain3 weight training workflow and delete demo binary
    Agent-Logs-Url: https://github.com/github/gh-aw/sessions/c5c57258-f0a1-4ade-8afc-9c4464b162cc
    Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
  ---------
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: Peli de Halleux <pelikhan@users.noreply.github.com>
Pull request overview
Removes a throwaway Drain3 demo binary and adds a scheduled GitHub Actions workflow to retrain and propose updates to the embedded Drain3 default weights.
Changes:
- Deleted the `cmd/agentdrain-demo/main.go` demo program.
- Added a daily (`cron`) + manual (`workflow_dispatch`) workflow to run `./gh-aw logs --train`, update `pkg/agentdrain/data/default_weights.json`, and open a PR when it changes.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `cmd/agentdrain-demo/main.go` | Removes non-production demo code from the repo. |
| `.github/workflows/train-drain3-weights.yml` | Automates periodic Drain3 weights training and proposes updates via PR. |
Comments suppressed due to low confidence (1)
.github/workflows/train-drain3-weights.yml:73
- The branch name is only date-based (`ci/train-drain3-weights-YYYYMMDD`). Re-runs on the same day (manual dispatch, retries, or overlapping runs) will fail to `git push` because the branch already exists.
Include a uniqueness suffix such as the time (`%H%M%S`) or `${{ github.run_id }}`. For reference, `.github/workflows/format-and-commit.yml:70` uses a timestamped branch name for this reason.
BRANCH_NAME="ci/train-drain3-weights-$(date +%Y%m%d)"
git checkout -b "$BRANCH_NAME"
git add pkg/agentdrain/data/default_weights.json
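One way to apply the suggested uniqueness suffix is sketched below. It is only an illustration, not the workflow's actual code: the helper name `unique_branch_name` is hypothetical, and it assumes the script runs inside GitHub Actions, where `GITHUB_RUN_ID` and `GITHUB_RUN_ATTEMPT` are set by the runner. Because a retried run keeps the same run ID, the sketch also appends the attempt number; outside Actions it falls back to the time of day.

```shell
set -eu

# Hypothetical helper: build a branch name that stays unique across
# same-day re-runs. GITHUB_RUN_ID and GITHUB_RUN_ATTEMPT come from the
# GitHub Actions runner; the fallbacks (UTC time of day, attempt 1) are
# assumptions for running the script anywhere else.
unique_branch_name() {
  ubn_prefix="$1"
  ubn_suffix="${GITHUB_RUN_ID:-$(date -u +%H%M%S)}"
  ubn_attempt="${GITHUB_RUN_ATTEMPT:-1}"
  printf '%s-%s-%s-%s\n' "$ubn_prefix" "$(date -u +%Y%m%d)" "$ubn_suffix" "$ubn_attempt"
}

BRANCH_NAME="$(unique_branch_name "ci/train-drain3-weights")"
echo "$BRANCH_NAME"
```

The resulting `BRANCH_NAME` can then be passed to `git checkout -b` exactly as in the snippet above.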
    name: Download logs and train drain3 weights
    runs-on: ubuntu-latest
    timeout-minutes: 30
    permissions:
The job token permissions are missing actions: read, which ./gh-aw logs --train needs to list/download workflow run data. With permissions: {} at the workflow level, unspecified scopes default to none, so the logs step is likely to fail with 403.
Add actions: read to jobs.train.permissions (keep contents: write / pull-requests: write as-is). See .github/aw/debug-agentic-workflow.md:68-71 for the documented requirement.
    permissions:
      actions: read
name: Train Log Pattern Weights

on:
  schedule:
    - cron: "0 4 * * *" # Daily at 04:00 UTC
  workflow_dispatch:

permissions: {}

jobs:
  train:
    name: Download logs and train drain3 weights
    runs-on: ubuntu-latest
    timeout-minutes: 30
Consider adding a concurrency group for this scheduled workflow to prevent overlapping runs (e.g., if a run is slow or a manual dispatch happens near the cron time). Overlaps can lead to duplicate branches/PRs and wasted runner time.
This repo commonly uses concurrency: { group: "gh-aw-${{ github.workflow }}" } for scheduled workflows (e.g., .github/workflows/daily-integrity-analysis.lock.yml:48-50).
This issue also appears on line 70 of the same file.
      - name: Build gh-aw
        run: make build
make build runs sync-action-pins/sync-action-scripts as prerequisites (Makefile:22,669-685), which can modify tracked files unrelated to Drain3 weights during this automation run. To avoid incidental working-tree changes and keep the workflow focused, consider building only the CLI binary via go build ... ./cmd/gh-aw (or a dedicated make build-cli target without sync steps).
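As a complement to building only the CLI, the workflow could check that nothing besides the weights file changed in the working tree. The sketch below is an assumption-laden illustration, not part of the PR: the helper name `unexpected_changes` is hypothetical, and it expects plain `git status --porcelain` (v1) output on stdin, where each line is a two-character status, a space, and the path. Renames and quoted paths are not handled.

```shell
set -eu

# Hypothetical helper: read `git status --porcelain` output on stdin and
# print every changed path except the one allowed path given as $1.
unexpected_changes() {
  uc_allowed="$1"
  # Strip the 3-character "XY " status prefix, then drop the allowed path.
  # grep exits 1 when nothing is left, which is the good case here.
  cut -c4- | grep -Fxv -- "$uc_allowed" || true
}
```

In a workflow step this might be used as `git status --porcelain | unexpected_changes pkg/agentdrain/data/default_weights.json`, failing or warning when the output is non-empty.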
Addresses review feedback on #24328: remove the throwaway demo program and automate refreshing the embedded Drain3 log-pattern weights.
Changes
- Deleted `cmd/agentdrain-demo/main.go`: one-off exploration code, not needed in tree
- Added `.github/workflows/train-drain3-weights.yml`: non-agentic daily workflow that:
  - Builds `gh-aw` from source
  - Runs `gh aw logs --train --count 50` to download recent run logs and produce `drain3_weights.json`
  - Updates `pkg/agentdrain/data/default_weights.json` (the embedded defaults path)
  - Opens a PR against `main` when the weights change; skips PR creation if output is identical
  - Can also be triggered manually via `workflow_dispatch`
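The "skip PR creation if output is identical" behavior can be sketched as a byte-for-byte comparison before copying. This is a minimal illustration under stated assumptions, not the workflow's actual step: the helper name `update_default_weights` is hypothetical, and the real workflow may detect changes differently (for example with `git diff --quiet`).

```shell
set -eu

# Hypothetical helper: update_default_weights NEW DEST
# Copies NEW over DEST only when the contents differ (or DEST is missing),
# and prints "changed" or "unchanged" so a later step can decide whether
# to open a PR.
update_default_weights() {
  udw_new="$1"
  udw_dest="$2"
  if [ -f "$udw_dest" ] && cmp -s "$udw_new" "$udw_dest"; then
    echo "unchanged"
  else
    mkdir -p "$(dirname "$udw_dest")"
    cp "$udw_new" "$udw_dest"
    echo "changed"
  fi
}
```

A step could call `update_default_weights drain3_weights.json pkg/agentdrain/data/default_weights.json` and continue to branch and PR creation only when the result is `changed`.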