Migrate from Azure Pipelines to GitHub Actions #1

Merged
BenjaminMichaelis merged 22 commits into main from agents/azure-pipelines-to-gha-plan on May 17, 2026

@BenjaminMichaelis
Owner

@BenjaminMichaelis BenjaminMichaelis commented May 15, 2026

Summary

Replaces all 5 Azure Pipelines YAML files with GitHub Actions workflows.

Non-CI fixes included

  • JupyterKernel: creates the ToTask-derived task before awaiting SendAsync to avoid missing hot-observable completion.
  • test-retry-runner.ps1: adds timestamped retry logging so CI retry attempts are easier to correlate in logs.

New files

  • .github/workflows/ci.yml — push/PR CI on Windows (windows-2022) and Linux (ubuntu-22.04)
  • .github/workflows/publish.yml — tag-triggered publish (NuGet, NPM, VS Code Marketplace); workflow_dispatch simulate mode
  • .github/actions/build-and-test/action.yml — composite action (build + test, reused by both CI and publish)
  • .github/dependabot.yml — weekly github-actions dependency updates

What's dropped (no replacement needed)

  • MicroBuild code signing (Microsoft-internal)
  • 1ES SDL compliance tooling (BinSkim, PoliCheck, APIScan)
  • TSA bug filing, SBOM, OneLocBuild, symbol server publishing
  • NPM publish to Azure Artifacts → replaced by npmjs.org

Publish authentication

  • VS Code Marketplace: OIDC / managed identity (azure/login@v3 + DefaultAzureCredential)
  • NuGet: NUGET_API_KEY secret
  • NPM: NPM_TOKEN secret

Required repo setup (before enabling publish)

  1. Create production GitHub Environment → add Required Reviewers
  2. Add secrets: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_SUBSCRIPTION_ID, NUGET_API_KEY, NPM_TOKEN
  3. OIDC federated credential on Azure app registration: repo:BenjaminMichaelis/interactive:environment:production

AZP files not yet deleted

Shadow-CI phase: run GHA alongside AZP until validated, then delete:

  • azure-pipelines.yml
  • azure-pipelines-official.yml
  • azure-publish-stable-polyglot-notebooks.yml
  • azure-publish-insiders-polyglot-notebooks.yml
  • apiscan-compliance.yml / es-metadata.yml

Review history

4 rounds of GPT-5.5 × Opus 4.6 cross-review; all critical findings applied.

- Add .github/workflows/ci.yml: push/PR CI on windows-2022 and ubuntu-22.04
- Add .github/workflows/publish.yml: tag-triggered publish (NuGet, NPM, VSCode extension)
  with environment: production gate and workflow_dispatch simulate mode
- Add .github/actions/build-and-test/action.yml: composite action for build+test
- Add .github/dependabot.yml: weekly github-actions dependency updates
- Patch eng/publish/PublishPolyglotNotebooksHelper.psm1:
  - FindChildItem null guard before Get-FileHash
  - vsce verify-signature captures stderr (2>&1 | Out-String) to prevent NullRef
  - NuGet push uses --skip-duplicate
  - azure/login guarded with if: !inputs.simulate

Drops all Microsoft-internal infra (MicroBuild signing, 1ES SDL, TSA, OneLocBuild,
symbol publishing). Publish uses OIDC/managed identity for VS Code Marketplace;
PATs for NuGet and NPM. AZP files are NOT yet deleted (shadow-CI phase).

Copilot AI left a comment

Pull request overview

Migrates the repository from Azure Pipelines to GitHub Actions: introduces CI and Publish workflows plus a shared composite build/test action, sets up weekly Dependabot updates for GitHub Actions, and adjusts the existing publish PowerShell helpers to be tolerant of test-signed builds and idempotent NuGet pushes. Microsoft-internal/1ES tooling (signing, SDL, TSA, OneLocBuild, etc.) is intentionally dropped. The legacy AZP YAMLs are kept temporarily for shadow-CI validation.

Changes:

  • Add .github/workflows/ci.yml, .github/workflows/publish.yml, composite build-and-test/action.yml, and dependabot.yml.
  • Switch VS Code Marketplace publish auth to OIDC/azure/login; switch NPM to npmjs.org; NuGet via NUGET_API_KEY.
  • Soften vsce verify-signature to a warning and add --skip-duplicate to dotnet nuget push in PublishPolyglotNotebooksHelper.psm1.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| .github/workflows/ci.yml | New push/PR CI on Windows + Linux delegating to the composite action. |
| .github/workflows/publish.yml | Tag/dispatch-triggered publish pipeline for NuGet, NPM, and VS Code Marketplace with OIDC auth. |
| .github/actions/build-and-test/action.yml | Shared composite build/test action used by CI and Publish, with platform-specific steps and caching. |
| .github/dependabot.yml | Weekly grouped Dependabot updates for GitHub Actions. |
| eng/publish/PublishPolyglotNotebooksHelper.psm1 | Adds missing-file guard, downgrades signature verification to a warning, and uses --skip-duplicate for NuGet push. |

Comment thread eng/publish/PublishPolyglotNotebooksHelper.psm1
Comment thread .github/workflows/publish.yml
Comment thread .github/workflows/publish.yml Outdated
Comment thread .github/actions/build-and-test/action.yml Outdated
Comment thread .github/workflows/publish.yml
Comment thread .github/workflows/publish.yml
Comment thread .github/actions/build-and-test/action.yml
Comment thread .github/actions/build-and-test/action.yml Outdated
Comment thread .github/workflows/publish.yml
DRY: hoist NPM_CONFIG_REGISTRY and NODE_VERSION to workflow-level env; add
npm-version input to composite action; add paths-ignore comment.

Best practices: add concurrency groups (ci cancel-in-progress, publish no-cancel);
add workflow_dispatch to ci.yml; move checks:write to per-job in ci.yml; add
timeout-minutes to publish-nuget/npm/vscode; pin ubuntu-latest to ubuntu-22.04;
pass NUGET_API_KEY via env var instead of CLI arg.

Versions: Node.js 22.18.0 -> 22.22.3, npm 11.6.0 -> 11.14.1.

Dependabot: add npm (/src) and nuget (/) ecosystem scans.
npm 11.x strict npm ci validation requires all optional dependencies to have
a resolved package entry in the lock file, even macOS-only optional packages
like fsevents. The polyglot-notebooks-ui-components lock file was generated
without a resolved fsevents entry (only the range specifier), causing npm ci
to fail with 'Missing: fsevents@2.3.3 from lock file'.

Adds the node_modules/fsevents entry (packages section) and the legacy
fsevents entry (dependencies section) matching the version used in the
other package lock files in this repo.
HIGH - signature verification now fails closed on production publishes
  (PublishPolyglotNotebooksHelper.psm1): gate soft-fail on simulate
  parameter; non-simulate builds exit 1 if signature check fails.

HIGH - NuGet push loop no longer swallows errors (publish.yml):
  replace the find ... | while read subshell pattern with a for loop and
  set -eo pipefail so dotnet nuget push failures propagate correctly.
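
The for-loop pattern this finding describes could be sketched roughly as follows. This is not the contents of publish.yml: the helper name `push_packages` and the idea of passing the push command in as arguments are assumptions made for illustration. The point it shows is that the loop stops at the first failed push, so a later successful push cannot mask an earlier failure.

```shell
# Sketch only: push_packages and its arguments are hypothetical names.
# Usage: push_packages DIR CMD [ARGS...]
# Runs CMD once per .nupkg in DIR and stops at the first failure.
push_packages() {
  dir=$1
  shift
  count=0
  for pkg in "$dir"/*.nupkg; do
    # An unmatched glob stays literal in POSIX sh; skip it.
    [ -e "$pkg" ] || continue
    if ! "$@" "$pkg"; then
      echo "push failed for $pkg" >&2
      return 1
    fi
    count=$((count + 1))
  done
  if [ "$count" -eq 0 ]; then
    echo "no packages found in $dir" >&2
    return 1
  fi
  echo "pushed $count package(s)"
}
```

In a workflow step the command passed in would be a wrapper around `dotnet nuget push`; because the loop runs in the current shell rather than in a pipeline subshell, its exit status reaches the step directly.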

HIGH - npm publish glob replaced with explicit for-loop (publish.yml):
  validate exactly one tarball exists before publishing; fail clearly if
  zero or multiple tarballs are found.
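
A minimal sketch of the "exactly one tarball" check, assuming the tarballs are `.tgz` files in a single directory. The helper name `find_single_tarball` is hypothetical and the real step in publish.yml may be structured differently.

```shell
# Sketch only: find_single_tarball is a hypothetical helper name.
# Usage: find_single_tarball DIR
# Prints the path of the only .tgz file in DIR; fails if there are zero
# or more than one, so an ambiguous match never reaches `npm publish`.
find_single_tarball() {
  dir=$1
  count=0
  match=
  for f in "$dir"/*.tgz; do
    [ -e "$f" ] || continue   # unmatched glob stays literal; skip it
    count=$((count + 1))
    match=$f
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one tarball in $dir, found $count" >&2
    return 1
  fi
  printf '%s\n' "$match"
}
```

A step could then capture the result with `tarball=$(find_single_tarball "$dir")` and pass it to `npm publish "$tarball"`, failing before any publish attempt when the count is wrong.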

MEDIUM - NuGet cache path corrected (action.yml):
  Arcade sets NUGET_PACKAGES to {repo}/.packages when -ci is passed
  (useGlobalNuGetCache=false). Changed cache path from ~/.nuget/packages
  to github.workspace/.packages so the cache actually hits.

LOW - cp error swallowing narrowed (action.yml):
  check eng/resources exists before copy; only suppress empty-glob with
  nullglob rather than silently ignoring all errors.

LOW - comments added:
  inputs.simulate empty-string behavior documented in publish.yml.
  pwsh dependency on ubuntu-22.04 documented in action.yml.
@BenjaminMichaelis BenjaminMichaelis marked this pull request as ready for review May 15, 2026 21:54
- make simulate behavior explicit via prepare job output in publish workflow

- validate extracted package version matches tag version on tag-triggered runs

- harden Linux resource copy and add explicit pwsh availability precheck

- align NuGet cache path and NUGET_PACKAGES usage in build-and-test action
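
The tag/version validation mentioned above can be sketched as a small comparison. The tag format is an assumption: this PR does not show it, so the sketch accepts either `v1.2.3` or `1.2.3`. The helper name `version_matches_tag` is hypothetical.

```shell
# Sketch only: version_matches_tag is a hypothetical helper name.
# Usage: version_matches_tag TAG PACKAGE_VERSION
# ASSUMPTION: tags look like "v1.2.3" or "1.2.3"; the real tag format
# used by publish.yml is not shown in this PR.
version_matches_tag() {
  tag_version=${1#v}          # drop one leading "v" if present
  if [ "$tag_version" != "$2" ]; then
    echo "tag version $tag_version does not match package version $2" >&2
    return 1
  fi
}
```

On a tag-triggered run the first argument could come from the `GITHUB_REF_NAME` environment variable that the runner sets, with the second being the version extracted from the built package.
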
Skip blame-hang collection for Microsoft.DotNet.Interactive.Jupyter.Tests on non-Windows to avoid false-positive Test Run Aborted failures while preserving test execution.
- add timestamped progress logging to test-retry-runner

- add workflow_dispatch diagnostics workflow for Jupyter Linux test hangs
- run isolated Linux Jupyter diagnostics on pull requests and manual dispatch

- add timestamped test progress logs in test-retry-runner
- build prerequisites before running isolated Jupyter diagnostics

- run diagnostics step with continue-on-error to collect artifacts without breaking PR
- run targeted Jupyter blame-hang diagnostics when Linux test step fails

- upload diagnostics via existing artifacts/TestResults path
Subscribe to Jupyter response streams before sending requests so fast in-memory playback cannot emit replies before the awaiting code is subscribed. This prevents recorded Jupyter tests from missing responses and hanging after test execution completes.
- Remove linux-hang-diagnostics.yml temporary workflow
- Remove temporary diagnostics step from build-and-test action
- Restore blame-hang collection for Jupyter tests (root cause now fixed)
- Keep timestamped logging utility functions for debugging

CI run 25950724045 confirmed the race fix resolves the hang issue.
Both build-linux and build-windows jobs passed without timeouts.

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 9 comments.

Files not reviewed (1)
  • src/polyglot-notebooks-ui-components/package-lock.json: Language not supported
Comments suppressed due to low confidence (1)

.github/actions/build-and-test/action.yml:170

  • The diagnostics test step uses --no-build and runs only when the preceding regular test run failed. If the failure was a build/compile error (vs. a test hang), the build outputs needed by --no-build will not be present and this step will fail with a confusing "could not find assembly" error rather than helping diagnose the hang. Consider guarding this with a check that the test assembly exists, or dropping --no-build so it falls back to building on demand.
    - name: Report test results (Linux)
      if: ${{ inputs.platform == 'linux' && !cancelled() && (github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository) }}
      uses: dorny/test-reporter@v3
      with:
        name: Linux Tests
        path: artifacts/TestResults/${{ inputs.build-config }}/**/*.trx
        reporter: dotnet-trx
        fail-on-error: false

Comment thread .github/dependabot.yml Outdated
Comment thread .github/workflows/publish.yml Outdated
Comment thread .github/workflows/publish.yml
Comment thread .github/workflows/publish.yml Outdated
Comment thread .github/workflows/publish.yml
Comment thread .github/workflows/publish.yml
Comment thread src/polyglot-notebooks-ui-components/package-lock.json
Comment thread .github/actions/build-and-test/action.yml
Comment thread .github/actions/build-and-test/action.yml
- dependabot.yml: Replace /src with per-package directories using
  'directories:' (plural) so Dependabot actually finds npm manifests

- publish.yml: Anchor symbols-exclusion regex to '\.symbols\.nupkg$'
  instead of the looser 'symbols' substring match

- publish.yml: Add global-json-file to setup-dotnet in publish-nuget job
  to pin the SDK version used for 'dotnet nuget push'

- publish.yml: Replace fixed Start-Sleep 180 with a polling loop that
  queries the nuget.org v3 flat-container API every 10s up to 15min

- publish.yml: Add clarifying comment explaining the intentional double-push
  pattern (publish-nuget owns primary OIDC push; PS script re-push is safe)

- publish.yml: Add comments explaining why working-directory is load-bearing
  for PackVSCodeExtension.ps1 and PackNpmPackage.ps1

- build-and-test/action.yml: Set npm cache to workspace-relative path
  .npm-cache (via 'npm config set cache') so it works on both Linux and
  Windows runners (Windows default cache is %AppData%\npm-cache, not ~/.npm)
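
To illustrate why the anchored pattern `\.symbols\.nupkg$` is safer than a bare `symbols` substring match, here is a small sketch. The helper name `is_symbols_package` and the package file names in the comments are made up for the example.

```shell
# Sketch only: is_symbols_package is a hypothetical helper name.
# Usage: is_symbols_package FILENAME
# Succeeds only when FILENAME ends in ".symbols.nupkg".
is_symbols_package() {
  printf '%s\n' "$1" | grep -Eq '\.symbols\.nupkg$'
}

# Foo.1.0.0.symbols.nupkg        -> matches (a symbols package)
# Foo.symbols.Tools.1.0.0.nupkg  -> does not match, although a plain
#                                   "symbols" substring test would
#                                   wrongly exclude it
```
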

Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 8 comments.

Files not reviewed (1)
  • src/polyglot-notebooks-ui-components/package-lock.json: Language not supported
Comments suppressed due to low confidence (2)

.github/workflows/publish.yml:312

  • The polling loop always sleeps 10 seconds before its first probe, which means publish-vscode is delayed by ≥10s even when nuget.org has already indexed the package. More importantly, after a failed Invoke-RestMethod (e.g., transient 5xx), the catch branch logs the error but does not break — that's fine on its own, but the do { ... } while ([DateTimeOffset]::UtcNow -lt $timeout) only re-evaluates the time check after the body completes, so if many consecutive failures occur the loop will continue until timeout without ever succeeding. Consider probing once before sleeping (so the happy path returns immediately) and treating prolonged HTTP failures as a hard error rather than spinning silently.
      - name: Wait for nuget.org propagation
        if: needs.prepare.outputs.simulate != 'true'
        shell: pwsh
        run: |
          $version = '${{ needs.build.outputs.version }}'
          $packageId = 'Microsoft.dotnet-interactive'
          $url = "https://api.nuget.org/v3-flatcontainer/$($packageId.ToLower())/index.json"
          $timeout = [DateTimeOffset]::UtcNow.AddMinutes(15)
          do {
            Start-Sleep -Seconds 10
            try {
              $index = (Invoke-RestMethod -Uri $url).versions
              if ($index -contains $version) {
                Write-Host "Version $version is indexed on nuget.org"
                break
              }
              Write-Host "Waiting for version $version to appear on nuget.org..."
            } catch {
              Write-Host "Failed to query nuget.org: $_"
            }
          } while ([DateTimeOffset]::UtcNow -lt $timeout)
          if ([DateTimeOffset]::UtcNow -ge $timeout) {
            Write-Error "Timed out waiting for version $version to appear on nuget.org"
            exit 1
          }
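
The two suggestions in this comment (probe before the first sleep, and treat prolonged query failures as a hard error) could be combined roughly as in the sketch below. It is written in shell rather than the PowerShell of the workflow step, and it is not the fix that was applied in the PR. The helper name `wait_until`, the probe exit-code convention, and the `MAX_ERRORS` parameter are all assumptions made for illustration.

```shell
# Sketch only: wait_until is a hypothetical helper.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS MAX_ERRORS PROBE [ARGS...]
# PROBE exit codes (an assumed convention): 0 = found, 1 = not yet,
# anything else = query error.
# Returns 0 when found, 1 on timeout, 2 after MAX_ERRORS consecutive errors.
wait_until() {
  timeout=$1
  interval=$2
  max_errors=$3
  shift 3
  elapsed=0   # counts sleep time only, which is enough for a sketch
  errors=0
  while :; do
    # Probe first so the happy path returns without sleeping.
    probe_rc=0
    "$@" || probe_rc=$?
    case $probe_rc in
      0) return 0 ;;
      1) errors=0 ;;                  # expected while indexing
      *) errors=$((errors + 1)) ;;    # query failed
    esac
    if [ "$errors" -ge "$max_errors" ]; then
      echo "giving up after $errors consecutive query errors" >&2
      return 2
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

Under this convention the probe would be a small script that queries the flat-container index and maps "version not listed yet" to exit code 1, leaving other exit codes for network or server failures.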

.github/workflows/publish.yml:307

  • In try { $index = (Invoke-RestMethod -Uri $url).versions; ... } catch { ... }, when the request returns a 404 (which the v3 flat-container endpoint returns for a not-yet-indexed package id), Invoke-RestMethod throws and the catch swallows the error as "Failed to query nuget.org". That's fine for the very first push of a new package id but means transient network errors and "version not yet indexed" are indistinguishable in the logs. Consider differentiating 404s (expected during indexing) from other failures so operators can diagnose stuck publishes.
            try {
              $index = (Invoke-RestMethod -Uri $url).versions
              if ($index -contains $version) {
                Write-Host "Version $version is indexed on nuget.org"
                break
              }
              Write-Host "Waiting for version $version to appear on nuget.org..."
            } catch {
              Write-Host "Failed to query nuget.org: $_"
            }

Comment thread eng/CIBuild.cmd
Comment thread eng/cibuild.sh
Comment thread .github/workflows/publish.yml Outdated
Comment thread .github/workflows/publish.yml
Comment thread .github/actions/build-and-test/action.yml
Comment thread .github/actions/build-and-test/action.yml
Comment thread .github/dependabot.yml
Comment thread src/polyglot-notebooks-ui-components/package-lock.json
When CIBuild.cmd/cibuild.sh run on GitHub Actions they set DisableArcade=1,
which prevents the Arcade SDK from being imported in Directory.Build.targets.
Without the Arcade import, 'dotnet build' writes test binaries to the default
'bin/{Config}/{TFM}/' directory instead of 'artifacts/bin/{project}/…'.

However 'dotnet test --no-build' (run in a separate GHA step without
DisableArcade=1) evaluates the project with Arcade and expects to find the DLL
at the Arcade-convention path 'artifacts/bin/{project}/{Config}/{TFM}/'.  This
mismatch produced the 'test source file not found' failure.

Fix: in Directory.Build.targets, when DisableArcade=1, set ArtifactsDir,
ArtifactsBinDir, and OutputPath to match the Arcade layout.  This makes the
build output land in the same place that the test-runner evaluation expects.
Double-dashes are invalid in XML comments. Replace '--no-build' and the
em-dash ellipsis with plain ASCII equivalents.
The build scripts (CIBuild.cmd / cibuild.sh) set DisableArcade=1, but
env vars set inside a step are not inherited by subsequent GHA steps
unless written to GITHUB_ENV. The test-retry-runner.ps1 step therefore
evaluated project files without DisableArcade=1, so Arcade was imported
and OutDir was computed as artifacts/bin/{project}/... which is where the
DLLs were not found (they were built to the default bin/ path).

Fix: after setting DisableArcade=1, also append it to GITHUB_ENV so
that the Run-tests step inherits the variable and dotnet test --no-build
resolves the output path consistently with how the build ran.

Also revert the Directory.Build.targets PropertyGroup added in earlier
commits (which incorrectly caused dotnet pack to look for DLLs at
artifacts/bin/... while the build was still writing to bin/).
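
For the shell side (cibuild.sh), appending a variable to `GITHUB_ENV` so that later steps inherit it can be sketched as below. The helper name `export_to_later_steps` is hypothetical; the actual script change is not shown in this PR description.

```shell
# Sketch only: export_to_later_steps is a hypothetical helper name.
# Usage: export_to_later_steps NAME VALUE
# Appends NAME=VALUE to the file named by GITHUB_ENV, which the Actions
# runner reads after the step finishes to populate later steps.
# Handles single-line values only.
export_to_later_steps() {
  if [ -z "${GITHUB_ENV:-}" ]; then
    echo "GITHUB_ENV is not set; not running under GitHub Actions?" >&2
    return 1
  fi
  printf '%s=%s\n' "$1" "$2" >> "$GITHUB_ENV"
}
```

Calling `export_to_later_steps DisableArcade 1` affects only subsequent steps in the same job; the current step still needs its own `export DisableArcade=1` if it reads the variable itself.
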
The CMD echo redirect in CIBuild.cmd was broken: 'echo DisableArcade=1>>'
is parsed by CMD as echoing 'DisableArcade=' with fd1 redirected to the
file, so GITHUB_ENV received an empty value. Similarly, relying on
GITHUB_ENV from a sibling step was fragile.

The clean fix is to explicitly set DisableArcade in the 'env:' block of
both test steps in action.yml. This ensures dotnet test evaluates projects
without Arcade (matching how they were built), so it finds DLLs at the
default bin/ path instead of artifacts/bin/.

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 7 comments.

Files not reviewed (1)
  • src/polyglot-notebooks-ui-components/package-lock.json: Language not supported

Comment thread eng/CIBuild.cmd Outdated
Comment thread eng/CIBuild.cmd Outdated
Comment thread .github/workflows/publish.yml Outdated
Comment thread Directory.Build.targets Outdated
Comment thread src/Microsoft.DotNet.Interactive.Jupyter/JupyterKernel.cs
Comment thread src/polyglot-notebooks-ui-components/package-lock.json
Comment thread .github/workflows/publish.yml
- Forward unknown CIBuild.cmd arguments consistently with cibuild.sh and parse arbitrary SignType values
- Add no-build/no-restore flags to pack in CI scripts
- Harden publish NuGet package loops and nuget.org polling behavior
- Document VSIX artifact layout expected by publish helpers
- Pin npm registry setup and rewrite npm lockfile resolved URLs to npmjs.org
- Remove unrelated Directory.Build.targets whitespace
Keep CI pack in no-build/no-restore mode, but disable project reference builds so SDK 10 solution pack does not invoke Build with NoBuild=true and fail with NETSDK1085.

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 16 changed files in this pull request and generated 6 comments.

Files not reviewed (1)
  • src/polyglot-notebooks/package-lock.json: Language not supported

Comment thread .github/workflows/publish.yml
Comment thread .github/workflows/ci.yml
Comment thread .github/actions/build-and-test/action.yml
Comment thread .github/workflows/publish.yml Outdated
Comment thread eng/CIBuild.cmd Outdated
Comment thread .github/workflows/publish.yml
- Pin GitHub-published actions to existing v4 major tags
- Resolve publish simulate mode explicitly by event type
- Extract publish version only from the bare Microsoft.dotnet-interactive package
- Quote forwarded Windows CI args and document expected token shape
- Set checkout to v6
- Set upload-artifact to v7 and download-artifact to v8
- Set setup-dotnet to v5, setup-node to v6, and cache to v5
@BenjaminMichaelis BenjaminMichaelis merged commit e551ae7 into main May 17, 2026
2 checks passed