Improve e2e and eval workflows. by polina-c · Pull Request #1443 · google/A2UI

polina-c · 2026-05-15T02:53:04Z

Update the eval workflow to also run on a schedule.
Factor out issue creation into a script and invoke it from both workflows.
Update the script to check whether an open issue with the same label and title already exists.

gemini-code-assist

Code Review

This pull request introduces a new Bash script, scripts/create_issue.sh, designed to automate the creation of GitHub issues when CI workflows fail. The script utilizes the GitHub CLI to gather context about the failure and the associated pull request. Reviewers suggested several improvements to enhance the script's robustness and flexibility, including adopting set -o pipefail, using modern Bash conditional syntax [[ ... ]], and replacing hardcoded branch names with environment variables. Additionally, a recommendation was made to include logic that prevents the creation of duplicate issues for repeated failures.
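
The conventions named in this summary can be sketched as follows. This is a minimal illustration, not the contents of scripts/create_issue.sh: the variable name `BASE_BRANCH` and the helper `describe_ref` are hypothetical and exist only to show `set -o pipefail`, `[[ ... ]]`, and an environment variable replacing a hardcoded branch name.

```shell
#!/usr/bin/env bash
# Fail on errors, on unset variables, and on failures anywhere in a pipeline.
set -euo pipefail

# BASE_BRANCH is a hypothetical name; fall back to "main" when it is unset.
BASE_BRANCH="${BASE_BRANCH:-main}"

# Hypothetical helper: describe where a failure happened.
describe_ref() {
  local ref="$1"
  # [[ ... ]] avoids the word-splitting pitfalls of the older [ ... ] form.
  if [[ "$ref" == "$BASE_BRANCH" ]]; then
    echo "failure on base branch $BASE_BRANCH"
  else
    echo "failure on branch $ref"
  fi
}
```

Because the branch name comes from the environment, the same script could be reused by a workflow that targets a different default branch without editing the script.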

gemini-code-assist · 2026-05-15T02:54:31Z

+gh issue create \
+  --title "$TITLE" \
+  --body "$BODY" \
+  --label "$LABEL_NAME"


This script creates a new issue every time a workflow fails. For scheduled workflows that might fail repeatedly, this can lead to many duplicate issues. Consider adding logic to check if an open issue with the same label already exists using gh issue list --label "$LABEL_NAME" --state open before creating a new one.

This is a reasonable enhancement. What do you think? I'd check for more than the label, though: I'd check for one with the label and an identical title.

The script already checks for the label. Added a check for the title. Thank you!

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
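
The label-plus-title check discussed in this thread could look roughly like the sketch below. It is an assumption about the approach, not the merged script: the function names `issue_exists` and `create_issue_once` are hypothetical, and only `gh issue list` and `gh issue create` with their documented flags are relied on.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds if an open issue with this label has exactly
# this title. Titles are captured first so that grep exiting early cannot
# break the pipeline under pipefail.
issue_exists() {
  local title="$1" label="$2" titles
  # --limit 100 is an arbitrary choice; gh lists 30 issues by default.
  titles="$(gh issue list --label "$label" --state open --limit 100 \
    --json title --jq '.[].title')"
  # -F: fixed string, -x: whole-line match, so a prefix does not count.
  grep -Fxq -- "$title" <<<"$titles"
}

# Hypothetical wrapper: create the issue only when no open duplicate exists.
create_issue_once() {
  local title="$1" body="$2" label="$3"
  if issue_exists "$title" "$label"; then
    echo "Open issue already exists: $title"
    return 0
  fi
  gh issue create --title "$title" --body "$body" --label "$label"
}
```

A workflow step would then call `create_issue_once "$TITLE" "$BODY" "$LABEL_NAME"`. Matching on the exact title means a repeated failure of the same workflow is filed once, while a failure with a different title still gets its own issue.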

gspencergoog · 2026-05-15T19:09:54Z

+  # - catch regressions introduced by changes in environment
+  # - have more data for observing dynamics of performance degradation
+  schedule:
+    - cron: "0 * * * *" # hourly


I don't think this is a good use of our AI budget. The next time we submit something that triggers evals, it will catch any environment changes anyhow, so this just feels like a waste.

It also seems way too frequent. If we did this, I'd run it at most once a day, since otherwise a single change in environment could result in 48 issues being filed over a weekend.

Switched to daily. Thanks.
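
The "48 issues over a weekend" figure follows from simple arithmetic, sketched here. The helper `runs_in_window` is hypothetical, and the 48-hour window is an assumption standing in for a weekend.

```shell
#!/usr/bin/env bash

# Hypothetical helper: how many scheduled runs fit in a window,
# given the window length and the interval between runs (both in hours).
runs_in_window() {
  local window_hours="$1" interval_hours="$2"
  echo $(( window_hours / interval_hours ))
}

runs_in_window 48 1    # hourly schedule over 48 hours: prints 48
runs_in_window 48 24   # daily schedule over 48 hours: prints 2
```

If every run failed and filed an issue, an hourly schedule would produce up to 48 issues in that window, while a daily one would produce 2.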

gspencergoog · 2026-05-15T19:12:01Z

    # Do not run on forked branches,
    # because the test does not have access to secrets in forks.
-    if: github.repository == 'google/a2ui'
+    if: github.repository == 'flutter/genui'


Whoa, so I guess it wasn't running?

nope

we are in A2UI

reverted
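
For a script that needs the same guard as the workflow-level `if:`, a rough shell equivalent is sketched below. It assumes the default `GITHUB_REPOSITORY` environment variable (in `owner/repo` form) is available to the step; the helper names `to_lower` and `is_repo` are hypothetical. The comparison is made case-insensitive on purpose, mirroring how string comparison in GitHub Actions expressions ignores case, which would explain why both `google/a2ui` and `google/A2UI` appear in the conditions above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: lowercase a string (tr keeps this working on old bash).
to_lower() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeeds when the current repository matches the
# expected "owner/repo" value, ignoring case.
is_repo() {
  local expected="$1"
  [[ "$(to_lower "${GITHUB_REPOSITORY:-}")" == "$(to_lower "$expected")" ]]
}
```

A script could then skip work on forks with `is_repo "google/A2UI" || exit 0`.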

gspencergoog · 2026-05-15T19:12:29Z

      actions: read

-    if: github.repository == 'google/A2UI'
+    if: github.repository == 'flutter/genui'


Where do our docs end up? Are they published anywhere?

oh, I missed it

reverted

thank you

gspencergoog · 2026-05-15T19:14:48Z

+gh issue create \
+  --title "$TITLE" \
+  --body "$BODY" \
+  --label "$LABEL_NAME"


This is a reasonable enhancement. What do you think? I'd check for more than the label, though: I'd check for one with the label and an identical title.

polina-c added 2 commits May 14, 2026 19:46

-

8615401

Merge branch 'main' of github.com:google/A2UI into merge-ci

48abf54

github-project-automation Bot added this to A2UI May 15, 2026

github-project-automation Bot moved this to Todo in A2UI May 15, 2026

gemini-code-assist Bot reviewed May 15, 2026

polina-c and others added 9 commits May 15, 2026 10:11

Update scripts/create_issue.sh

5787c97

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update scripts/create_issue.sh

3f79073

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update scripts/create_issue.sh

a63e8fa

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update scripts/create_issue.sh

2e09ac5

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update create_issue.sh

36bb67c

Merge branch 'merge-ci' of https://github.com/google/A2UI into merge-ci

d35a750

Update run_evals.yml

9de5093

-

013c702

Merge branch 'merge-ci' of github.com:google/A2UI into merge-ci

170a7a8

gspencergoog reviewed May 15, 2026

polina-c added 4 commits May 15, 2026 12:24

also check for title

11f8da7

-

12b348c

-

a21822b

Merge branch 'main' of github.com:google/A2UI into merge-ci

a1fb782

polina-c requested a review from gspencergoog May 15, 2026 19:53

gspencergoog approved these changes May 15, 2026

Comment thread .github/workflows/run_evals.yml Outdated

Update run_evals.yml

349dd0e

polina-c merged commit c4d6ad1 into main May 15, 2026
19 checks passed

polina-c deleted the merge-ci branch May 15, 2026 21:23

github-project-automation Bot moved this from Todo to Done in A2UI May 15, 2026
