
Re-enable integration test trigger and route cross-org dispatch through emu-access #5034

Merged
mihaimitrea-db merged 2 commits into main from mihai.mitrea/reenable-integration-trigger
Apr 23, 2026

mihaimitrea-db commented Apr 20, 2026

Summary

  • Reverts the intent of #4899 ("Temporarily disable integration test trigger", a temporary stub) and restores automatic integration test triggering. A plain revert of #4899, as initially intended, was not possible because runners are now split into those with cross-org access and those with same-org access only.
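  • Both upstream blockers are now resolved:
    • The eng-dev-ecosystem mark-as-* jobs moved onto ghec-access runners in databricks-eng/eng-dev-ecosystem#1252.
    • This change moves the cross-org gh workflow run dispatch onto emu-access runners, following the pattern from databricks/databricks-sdk-go#1638, so the call is no longer rejected with a 403 by the databricks-eng org IP allowlist.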
  • The job is split in two:
    • integration-trigger (deco runners) handles same-org Integration Tests check writes for the PR-skip and merge-group-auto-approve paths, using the DECO_TEST_APPROVAL app token. Testmask-based gating and the pre-#4899 summaries (Skipped (changes do not require integration tests) / Auto-approved for merge queue (tests already passed on PR)) are restored.
    • trigger-tests (emu-access runners) mints the DECO_WORKFLOW_TRIGGER token and issues the cross-org gh workflow run cli-isolated-pr.yml / cli-isolated-nightly.yml dispatches.
  • integration-trigger-dependabot is unchanged.
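
The split described above can be modeled as a small routing function. This is a minimal sketch, not the workflow itself: the real logic lives in `push.yml`, and the `Action` type, the `route` function, and the `needsTests` flag (a stand-in for the testmask result) are hypothetical names introduced only for illustration. It also assumes that pushes to `main` always dispatch the nightly workflow, which the description implies but does not state outright.

```go
package main

import "fmt"

// Action describes what happens for one event. Hypothetical type used only
// for this sketch; the real behavior is defined in the workflow YAML.
type Action struct {
	Job      string // job that acts: "integration-trigger" or "trigger-tests"
	Workflow string // cross-org workflow to dispatch, empty if none
	Summary  string // summary of the same-org check write, empty if none
}

// route maps a GitHub event name and a testmask result to an action.
// needsTests is a stand-in for the testmask outcome (assumption).
func route(event string, needsTests bool) Action {
	switch {
	case event == "merge_group":
		// Merge queue: tests already passed on the PR, so write the check directly.
		return Action{Job: "integration-trigger", Summary: "Auto-approved for merge queue (tests already passed on PR)"}
	case event == "pull_request" && !needsTests:
		// PR whose changes do not require integration tests.
		return Action{Job: "integration-trigger", Summary: "Skipped (changes do not require integration tests)"}
	case event == "pull_request":
		// PR that needs tests: cross-org dispatch from emu-access runners.
		return Action{Job: "trigger-tests", Workflow: "cli-isolated-pr.yml"}
	case event == "push":
		// Push to main (assumed to always dispatch the nightly workflow).
		return Action{Job: "trigger-tests", Workflow: "cli-isolated-nightly.yml"}
	}
	return Action{} // unknown event: nothing to do
}

func main() {
	cases := []struct {
		event      string
		needsTests bool
	}{
		{"pull_request", false},
		{"pull_request", true},
		{"merge_group", true},
		{"push", true},
	}
	for _, c := range cases {
		a := route(c.event, c.needsTests)
		fmt.Printf("%s needsTests=%t -> job=%q workflow=%q summary=%q\n",
			c.event, c.needsTests, a.Job, a.Workflow, a.Summary)
	}
}
```

The point of the split is visible in the `Job` field: only the two dispatch paths need the cross-org runner, while both check-writing paths stay on the same-org runner.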

NO_CHANGELOG=true

Test plan

  • On this PR, confirm integration-trigger runs on databricks-deco-testing-runner-group and succeeds.
  • Confirm trigger-tests runs on databricks-release-runner-group-emu-access; Generate GitHub App Token step succeeds (no 403); Trigger integration tests (pull request) dispatches cli-isolated-pr.yml on databricks-eng/eng-dev-ecosystem with pull_request_number and commit_sha inputs.
  • On databricks-eng/eng-dev-ecosystem, confirm the dispatched cli-isolated-pr run appears (event: workflow_dispatch) and its checkout job uploads the update-check-action and gh-report-action artifacts.
  • Confirm mark-as-pending runs on linux-ubuntu-latest-ghec-access and updates the Integration Tests check on the PR commit to in_progress.
  • When the integration-tests-prod matrix finishes, confirm mark-as-success / mark-as-failure updates the check to success / failure. (Known separate issue: integration-tests-prod on main has been failing due to a Go 1.25.9 toolchain fetch against proxy.golang.org; that is out of scope here.)
  • Merge-queue path: after ready-for-merge, confirm integration-trigger writes the Auto-approved for merge queue (tests already passed on PR) check.
  • Push-to-main path: after merge, confirm a workflow_dispatch run of cli-isolated-nightly.yml appears on eng-dev-ecosystem keyed to the merge commit SHA.
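
For reference when checking the pull request dispatch step, the sketch below builds the argument list for the `gh workflow run` call it is expected to make. The workflow file, target repository, and the `pull_request_number` and `commit_sha` input names come from the test plan; the `dispatchArgs` helper and the placeholder SHA are hypothetical, and the actual step in `push.yml` may pass additional flags.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// dispatchArgs returns the arguments for a `gh workflow run` invocation that
// dispatches cli-isolated-pr.yml on the cross-org repository.
// Hypothetical helper for illustration; it does not execute anything.
func dispatchArgs(prNumber int, commitSHA string) []string {
	return []string{
		"workflow", "run", "cli-isolated-pr.yml",
		"--repo", "databricks-eng/eng-dev-ecosystem",
		// -f passes a workflow_dispatch input as a raw string field.
		"-f", "pull_request_number=" + strconv.Itoa(prNumber),
		"-f", "commit_sha=" + commitSHA,
	}
}

func main() {
	// "abc1234" is a placeholder SHA, not a real commit.
	args := dispatchArgs(5034, "abc1234")
	fmt.Println("gh " + strings.Join(args, " "))
}
```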

This pull request and its description were written by Isaac.

…gh emu-access

Reverts the intent of #4899 (temporary stub) now that both blockers are resolved:
the eng-dev-ecosystem `mark-as-*` jobs moved onto ghec-access runners in
databricks-eng/eng-dev-ecosystem#1252, and this change moves the cross-org
`gh workflow run` dispatch onto `linux-ubuntu-latest-emu-access` following the
pattern from databricks/databricks-sdk-go#1638 so the call is no longer 403'd
by the databricks-eng org IP allowlist.

Restores the pre-#4899 behavior in `push.yml`:
- PRs dispatch `cli-isolated-pr.yml`; pushes to `main` dispatch `cli-isolated-nightly.yml`.
- The required `Integration Tests` check is updated by the eng-dev-ecosystem
  `mark-as-*` jobs (in_progress → success/failure) instead of being fake-stamped.
- Testmask-based skip/auto-approve paths are restored with their original summaries.

Split into two jobs:
- `integration-trigger` (deco runners) writes same-org Skip/Auto-approve checks
  via the DECO_TEST_APPROVAL token.
- `trigger-tests` (emu-access runners) mints the DECO_WORKFLOW_TRIGGER token and
  does the cross-org `gh workflow run` dispatch.

NO_CHANGELOG=true

Co-authored-by: Isaac
@github-actions

Waiting for approval

Based on git history, these people are best suited to review:

  • @pietern -- recent work in .github/workflows/

Eligible reviewers: @andrewnester, @anton-107, @denik, @renaudhartert-db, @shreyas-goenka, @simonfaltum

Suggestions based on git history. See OWNERS for ownership rules.

…-integration-trigger

# Conflicts:
#	.github/workflows/push.yml
@mihaimitrea-db

After an offline discussion with Hector, we decided to merge this even though the tests are failing, since bypass rules can be used.

mihaimitrea-db merged commit f0d90e6 into main Apr 23, 2026
22 of 23 checks passed
mihaimitrea-db deleted the mihai.mitrea/reenable-integration-trigger branch April 23, 2026 12:36
simonfaltum added a commit that referenced this pull request Apr 23, 2026
## Why

Integration test nightlies (`cli-isolated-pr.yml`) have been red on
every main run since 2026-04-02, when `#4899` temporarily disabled the
trigger. The trigger was re-enabled in `#5034` and all accumulated
failures surfaced at once. Nothing in any in-flight feature PR is to
blame; this PR just clears the backlog so nightly signal goes green
again.

Two independent regressions:

1. The host-metadata cache (`#5011`) regenerated goldens for tests that
run locally, but could not touch `Cloud=true, Local=false` suites.
`acceptance/selftest/record_cloud/{pipeline-crud,workspace-file-io}`
still expected the pre-cache `/.well-known/databricks-config` calls.
2. Lakeview server behavior now varies by cloud on workspace import. AWS
staging includes `serialized_dashboard` in the updated fields; GCP
production no longer clears `warehouse_id`. The exact-match assertions
in `TestDashboardAssumptions_WorkspaceImport` fail differently on each
cloud.

## Changes

**Before:** `record_cloud` goldens include redundant
`/.well-known/databricks-config` GETs; dashboard test hard-codes exact
updated/deleted fields.

**Now:** goldens regenerated against e2-dogfood (only diff is removal of
the cached requests); dashboard assertions use `assert.Subset` so they
tolerate cross-cloud drift but still fail on anything outside the
known-allowed set.

- `acceptance/selftest/record_cloud/pipeline-crud/output.txt`,
`acceptance/selftest/record_cloud/workspace-file-io/output.txt`: rerun
with `-update` under `CLOUD_ENV=aws` against e2-dogfood. Both terraform
and direct variants produce identical output.
- `integration/assumptions/dashboard_assumptions_test.go`: `etag` and
`update_time` must appear in updated fields; `serialized_dashboard` is
allowed; `warehouse_id` is the only allowed deletion. Comment points to
the observed cross-cloud split so the next reader knows why.

Follows the pattern of the previous Lakeview-behavior-change fix in
`#4640`.
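
The tolerant assertion described above can be sketched with the standard library alone. The actual test uses `assert.Subset`; here `isSubset` and `checkImport` are hypothetical stand-ins, and the sample inputs in `main` are illustrative values based on the cross-cloud split described earlier, not recorded server responses.

```go
package main

import "fmt"

// isSubset reports whether every element of items appears in allowed.
func isSubset(items, allowed []string) bool {
	set := make(map[string]bool, len(allowed))
	for _, a := range allowed {
		set[a] = true
	}
	for _, it := range items {
		if !set[it] {
			return false
		}
	}
	return true
}

// checkImport applies the rules from the change description:
// etag and update_time must be updated, serialized_dashboard may be,
// and warehouse_id is the only field that may be deleted.
func checkImport(updated, deleted []string) error {
	required := []string{"etag", "update_time"}
	allowedUpdated := []string{"etag", "update_time", "serialized_dashboard"}
	allowedDeleted := []string{"warehouse_id"}

	if !isSubset(required, updated) {
		return fmt.Errorf("updated fields %v are missing one of %v", updated, required)
	}
	if !isSubset(updated, allowedUpdated) {
		return fmt.Errorf("updated fields %v fall outside %v", updated, allowedUpdated)
	}
	if !isSubset(deleted, allowedDeleted) {
		return fmt.Errorf("deleted fields %v fall outside %v", deleted, allowedDeleted)
	}
	return nil
}

func main() {
	// Illustrative shape where serialized_dashboard is updated and warehouse_id cleared.
	fmt.Println(checkImport([]string{"etag", "update_time", "serialized_dashboard"}, []string{"warehouse_id"}))
	// Illustrative shape where warehouse_id is no longer cleared.
	fmt.Println(checkImport([]string{"etag", "update_time"}, nil))
	// Anything outside the known-allowed set still fails.
	fmt.Println(checkImport([]string{"etag", "update_time", "display_name"}, nil))
}
```

Compared with an exact-match assertion, this accepts both observed behaviors while still rejecting an unexpected field in either list.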

## Test plan

- [x] `make checks` clean
- [x] `make lint` clean (0 issues)
- [x] `go test ./acceptance -run
'TestAccept/selftest/record_cloud/{workspace-file-io,pipeline-crud}'`
passes against e2-dogfood (both terraform and direct variants)
- [x] `go test ./integration/assumptions -run
TestDashboardAssumptions_WorkspaceImport` passes against e2-dogfood
- [ ] cli-isolated-pr.yml integration run on this branch comes back
green