fix(ci): pin baseline x86-64 compiler target to prevent SIGILL on cached venvs#1785
Conversation
Greptile SummaryThis PR fixes intermittent
|
| Filename | Overview |
|---|---|
| .github/workflows/_test.yml | Adds CFLAGS/CXXFLAGS=-march=x86-64 step before poetry install (Linux-only, correctly gated) and includes runner arch in cache key — the core fix for cross-CPU SIGILL crashes. |
| .github/workflows/lint.yml | Sets CFLAGS/CXXFLAGS at workflow-level env (safe since lint always runs on ubuntu-latest) and adds arch to cache key — consistent with the fix in _test.yml. |
| .github/workflows/full-tests.yml | Replaced scoped triggers (branch-filtered push, PR review requests, workflow_dispatch) with a bare push: that fires the full Windows+macOS matrix on every push to every branch — excessive CI spend if merged. |
| .github/workflows/pr-tests.yml | Changed from pull_request to push trigger (developer noted this is a temporary testing commit); removes fork-PR CI coverage. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Push / PR Event] --> B{Workflow trigger}
B --> C[pr-tests.yml\npush only]
B --> D[full-tests.yml\npush only]
B --> E[lint.yml\npull_request + push to main/develop]
C --> F[Ubuntu matrix\n4 Python versions]
D --> G[Windows + macOS matrix\n4 Python versions each]
F --> H{Cache hit?}
G --> H
H -->|Yes| I[Ensure cache healthy\npoetry run pip --version]
H -->|No| J[Set CFLAGS/CXXFLAGS\n-march=x86-64\nLinux only]
I -->|Healthy| K[poetry install]
I -->|Unhealthy| L[rm -rf .venv]
L --> J
J --> K
K --> M[Native extensions compiled\nwith baseline x86-64 ISA]
M --> N[Cache venv\nkey: OS + arch + py-version + lock-hash]
N --> O[Run tests]
style J fill:#90EE90
style M fill:#90EE90
style D fill:#FFB6C1
Comments Outside Diff (1)
-
.github/workflows/full-tests.yml, line 4-5 (link)Bare
push:trigger runs full matrix on every branch pushThe original trigger had branch filters (
main,develop,v*tags), PR review-request triggers, andworkflow_dispatch. The new barepush:fires the full Windows + macOS matrix (8 jobs: 2 OS × 4 Python versions) on every push to every feature branch, hotfix branch, or draft branch — with no way to manually re-trigger via the Actions UI either (workflow_dispatchwas removed). Combined withpr-tests.ymlalso using a barepush:, both workflows now fire simultaneously on every branch push. If this is a temporary testing commit, it should not land indevelopas-is.Prompt To Fix With AI
This is a comment left during a code review. Path: .github/workflows/full-tests.yml Line: 4-5 Comment: **Bare `push:` trigger runs full matrix on every branch push** The original trigger had branch filters (`main`, `develop`, `v*` tags), PR review-request triggers, and `workflow_dispatch`. The new bare `push:` fires the full Windows + macOS matrix (8 jobs: 2 OS × 4 Python versions) on every push to every feature branch, hotfix branch, or draft branch — with no way to manually re-trigger via the Actions UI either (`workflow_dispatch` was removed). Combined with `pr-tests.yml` also using a bare `push:`, both workflows now fire simultaneously on every branch push. If this is a temporary testing commit, it should not land in `develop` as-is. How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: .github/workflows/full-tests.yml
Line: 4-5
Comment:
**Bare `push:` trigger runs full matrix on every branch push**
The original trigger had branch filters (`main`, `develop`, `v*` tags), PR review-request triggers, and `workflow_dispatch`. The new bare `push:` fires the full Windows + macOS matrix (8 jobs: 2 OS × 4 Python versions) on every push to every feature branch, hotfix branch, or draft branch — with no way to manually re-trigger via the Actions UI either (`workflow_dispatch` was removed). Combined with `pr-tests.yml` also using a bare `push:`, both workflows now fire simultaneously on every branch push. If this is a temporary testing commit, it should not land in `develop` as-is.
```suggestion
on:
pull_request:
types: [review_requested, ready_for_review]
paths-ignore:
- "**/*.md"
- ".github/**"
push:
branches:
- main
- develop
tags:
- "v*"
workflow_dispatch:
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (4): Last reviewed commit: "fix(ci): pin baseline x86-64 compiler ta..." | Re-trigger Greptile
📝 WalkthroughWalkthroughThe GitHub Actions workflow cache health check was modified to verify the cached virtual environment by attempting to import the Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes 🚥 Pre-merge checks | ✅ 5 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/_test.yml (1)
83-83: Guardtimeoutby runner OS in this reusable workflow.At Line 83,
timeoutis Linux/GNU-specific; if callers use macOS/Windows images, this can invalidate.venvon every cache hit even when healthy.Proposed fix
- run: timeout 10s poetry run python -c "import annoy" || rm -rf .venv + run: | + CHECK='from annoy import AnnoyIndex; i=AnnoyIndex(8,"angular"); i.add_item(0,[0.0]*8); i.build(1)' + if [[ "$RUNNER_OS" == "Linux" ]]; then + timeout 10s poetry run python -c "$CHECK" || rm -rf .venv + else + poetry run python -c "$CHECK" || rm -rf .venv + fi#!/bin/bash # Inspect callers of this reusable workflow and what images they pass. # Expected: if non-Linux images are used, timeout guard is required. rg -n -C3 --iglob '*.yml' --iglob '*.yaml' '_test\.yml|uses:.*_test\.yml|image:' .github/workflows🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/_test.yml at line 83, The run step that executes `timeout 10s poetry run python -c "import annoy" || rm -rf .venv` uses the Linux-specific `timeout` binary; change the step so it checks the runner OS (runner.os) and only uses `timeout` on Linux, falling back to running `poetry run python -c "import annoy"` without `timeout` on non-Linux runners, and preserve the existing `|| rm -rf .venv` behavior if the import fails; update the workflow run step (the `run:` that contains `timeout`) to use a conditional shell branch based on `runner.os` so macOS/Windows callers do not invoke the Linux `timeout` command.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.github/workflows/_test.yml:
- Line 83: The run step that executes `timeout 10s poetry run python -c "import
annoy" || rm -rf .venv` uses the Linux-specific `timeout` binary; change the
step so it checks the runner OS (runner.os) and only uses `timeout` on Linux,
falling back to running `poetry run python -c "import annoy"` without `timeout`
on non-Linux runners, and preserve the existing `|| rm -rf .venv` behavior if
the import fails; update the workflow run step (the `run:` that contains
`timeout`) to use a conditional shell branch based on `runner.os` so
macOS/Windows callers do not invoke the Linux `timeout` command.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: c48424a5-bd84-4df6-a6e7-8ce6cb0c28f2
📒 Files selected for processing (1)
.github/workflows/_test.yml
28a27a0 to
fb3339d
Compare
b29ae7a to
1cf7365
Compare
There was a problem hiding this comment.
Pull request overview
Pins Linux compiler target to baseline x86-64 in CI to prevent cached virtualenvs from containing native extensions compiled with newer SIMD instructions that can SIGILL on older GitHub-hosted runners.
Changes:
- Set
CFLAGS/CXXFLAGSto-march=x86-64on Linux runners before installing dependencies. - Modify PR test workflow trigger configuration.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| .github/workflows/_test.yml | Exports baseline x86-64 compiler flags for Linux builds before poetry install. |
| .github/workflows/pr-tests.yml | Changes workflow trigger from pull_request to push and adjusts ignored paths. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
a62b2dd to
1a5cc57
Compare
69c563f to
fe12b3e
Compare
|
caches were deleted and recreated on this PR. So we should be OK. We must watch for similar issue in case it reappear |
Description
CI tests intermittently crash with
Fatal Python error: Illegal instruction(SIGILL, exit code 132) on Ubuntu runners.GitHub Actions
ubuntu-latestrunners have heterogeneous CPU microarchitectures some support AVX-512, others only AVX2. When native C extensions (e.g.annoy,numpy) are compiled from source on a runner with newer SIMD instructions and the resulting.venvis cached, restoring that cache on a runner with an older CPU causes native code to hit unsupported instructions.The cache key includes OS, arch (
x86_64), and Python version — but not the CPU model. Sinceuname -mreturnsx86_64regardless of instruction set support, different CPUs produce identical cache keys.Fix
Set
CFLAGSandCXXFLAGSto-march=x86-64on Linux runners beforepoetry install. This forces all native extensions to compile for the baseline x86-64 instruction set, making cached.sofiles safe across any x86-64 runner.Scoped to Linux only
closes #1659
Summary by CodeRabbit