Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 29 additions & 0 deletions .agents/building-and-testing.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,3 +15,32 @@ Let's say the user wants to build a particular backend for a given platform. For
- Unless the user specifies that they want you to run the command, then just print it because not all agent frontends handle long running jobs well and the output may overflow your context
- The user may say they want to build AMD or ROCM instead of hipblas, or Intel instead of SYCL or NVIDIA insted of l4t or cublas. Ask for confirmation if there is ambiguity.
- Sometimes the user may need extra parameters to be added to `docker build` (e.g. `--platform` for cross-platform builds or `--progress` to view the full logs), in which case you can generate the `docker build` command directly.

## Test coverage gate
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would even enforce here for agents to install make install-hooks so automatically enroll themselves when pointed at this repository

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I agree, I'd even consider making the agent submit a hash of the test results, so it can't reason its way out of using the hooks without getting caught.


The core Go suites (`./pkg`, `./core`, plus the in-process integration suite `./tests/e2e`) are covered by a **strict, monotonic coverage ratchet**:

- `make test-coverage` — runs the suites with `covermode=atomic` instrumentation and writes a merged profile to `coverage/coverage.out`. Uses the same prerequisites as `make test`.
- **`--coverpkg` (`COVERAGE_COVERPKG = core/...,pkg/...`):** coverage is attributed to the core+pkg packages, not just the package under test. This is what lets the in-process `tests/e2e` suite (which drives the real HTTP server over loopback via `application.New`) credit the `core/http/endpoints/...` handlers it exercises — folding it in roughly doubled endpoint coverage (e.g. `endpoints/openai` 13.6% → 52%). The denominator is therefore *all* of `core`+`pkg` (minus generated proto, dropped via `COVERAGE_EXCLUDE_RE`), so the number isn't comparable to a plain per-package figure.
- **Integration suites (`COVERAGE_E2E_ROOTS = ./tests/e2e`)** run non-recursively (excludes `tests/e2e/distributed`, which needs containers) with `--label-filter=!real-models` (those need a downloaded model) against the mock backend built by `prepare-test`. `tests/integration` is deliberately excluded — it needs `make backends/local-store`, which the coverage CI job doesn't build.
- **Flake note:** folding integration tests into a *strict* gate means a hard e2e failure (or a spec that silently stops running) can fail the coverage gate, not just the test. `--flake-attempts` absorbs transient retryable failures; covermode=atomic keeps line coverage deterministic otherwise.
- **Why one ginkgo run per root (`scripts/run-coverage.sh`):** passing several recursive roots to a *single* ginkgo invocation (e.g. `ginkgo -r ./pkg ./core`) only merges **one** root's coverprofile into `--output-dir`/`--coverprofile` — the others are silently dropped. Verified with ginkgo 2.29.0: `-r ./pkg ./core` yields only `./pkg` coverage, while `-r ./core` alone yields all 34 core packages. So the script runs each root separately and concatenates the (disjoint) profiles. Don't "simplify" it back to a single multi-root invocation — that's how `core/` (including all of `core/http`, ~7.4k statements) silently vanished from the number before.
- **Build tags (`COVERAGE_TAGS`, passed via `GINKGO_TAGS`):** defaults to `debug auth`. The `auth` tag is required to compile the real (sqlite-backed) auth implementation and its ~150 `//go:build auth` tests — without it those files aren't built, the tests don't run, and the gate scores auth against a stub (~3.7% instead of ~38%). If you add new tag-gated tests, extend `COVERAGE_TAGS` or they won't count (and likely won't run in CI at all).
- `make test-coverage-check` — runs `test-coverage`, then `scripts/coverage-check.sh` fails the build if total coverage is **below** the committed baseline in `coverage-baseline.txt`. The Linux job in `.github/workflows/test.yml` runs this instead of `make test`.
- `make test-coverage-baseline` — regenerates and overwrites `coverage-baseline.txt` from the current run.
- `make install-hooks` — sets `core.hooksPath` to the versioned `.githooks/`, whose `pre-commit` runs checks scoped to what's staged: Go changes → `make lint` + `make test-coverage-check`; `core/http/react-ui/` changes → `make test-ui-coverage-check` (Playwright e2e + UI coverage gate). A commit touching neither is skipped; bypass with `git commit --no-verify`. The hook resolves golangci-lint's new-from base to `upstream/master` → `origin/master` → `master`, so it works from a fork clone where `origin/master` is stale (passed to `make lint` via `LINT_NEW_FROM`).

### React UI coverage

The React UI (`core/http/react-ui/`) has **no component/unit tests** — its only tests are the Playwright e2e specs in `e2e/`, which run against the real app served by `tests/e2e-ui/ui-test-server` (the dist is `//go:embed`ed, so the server is rebuilt per coverage run). Those specs do genuinely exercise the UI (clicks, `fill`, `setInputFiles`, `getByRole`/`getByText`, visibility/value assertions).

- `make test-ui-coverage` — builds an istanbul-instrumented bundle (`COVERAGE=true`, via `vite-plugin-istanbul` with `forceBuildInstrument: true` — the plugin skips production builds otherwise), re-embeds it into `ui-test-server` (the dist is `//go:embed`ed), runs the Playwright specs, and writes an `nyc` report to `core/http/react-ui/coverage/`. The specs import `{ test, expect }` from `e2e/coverage-fixtures.js` (re-exports Playwright's, plus harvests `window.__coverage__` into `.nyc_output/` after each test). Instrumentation is off unless `COVERAGE=true`, so dev/prod builds and plain `make test-ui-e2e` are unaffected (the fixture no-ops when `window.__coverage__` is absent).
- **Browser:** the flake dev shell ships `chromium` and exports `PLAYWRIGHT_CHROMIUM_PATH`; `playwright.config.js` uses it via `launchOptions.executablePath`, and the Makefile skips `playwright install` when it's set. This avoids Playwright's downloaded browser, which can't resolve system libs (`libglib-2.0`, …) on NixOS. In CI (no `PLAYWRIGHT_CHROMIUM_PATH`) the Makefile falls back to `playwright install --with-deps chromium`.
- The app is a React SPA, so coverage accumulates across in-app navigation within a test; a full `page.goto`/reload resets it.
- `.nycrc.json` uses `all: true`, so **every `src/**` file is in the report**, including 0%-coverage ones — that's how you spot features with no test at all (sort the HTML report or `coverage-summary.json` by line% ascending).
- **UI coverage gate:** `make test-ui-coverage-check` runs the suite then `scripts/ui-coverage-check.sh`, failing if total line coverage drops more than `UI_COVERAGE_TOLERANCE` (default **1.0pp**) below `core/http/react-ui/coverage-baseline.txt`. `make test-ui-coverage-baseline` regenerates the baseline. **Why a tolerance (unlike the strict Go gate):** UI e2e line coverage is *non-deterministic* — async/debounced paths (e.g. the VRAM estimate's 500ms debounce) make identical specs vary ~0.5pp run-to-run, so a zero-tolerance gate would flake. Keep the tolerance just above the observed jitter. Run in CI (`tests-ui-e2e.yml`) and pre-commit on `core/http/react-ui/` changes.

Rules:
- The gate is **strict — there is no tolerance**. Any decrease fails, regardless of how many lines a PR adds or deletes. `covermode=atomic` makes line coverage deterministic, so there's no run-to-run jitter to excuse.
- When a change legitimately **raises** coverage, run `make test-coverage-baseline` and **commit** the updated `coverage-baseline.txt` so the ratchet moves up. Never lower the baseline by hand.
- If you can't get coverage back to baseline, the fix is to **add tests**, not to edit the baseline.
60 changes: 60 additions & 0 deletions .githooks/pre-commit
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
#!/usr/bin/env sh
#
# LocalAI pre-commit hook. Install it (once per clone) with:
#
# make install-hooks
#
# Runs only the checks relevant to what's staged:
# - Go files -> make lint + make test-coverage-check
# - core/http/react-ui -> make test-ui-coverage-check (Playwright e2e + gate)
# A commit touching neither is skipped entirely (docs/YAML/etc. can't change
# lint findings, Go coverage, or the UI).
#
# To bypass for a single commit (e.g. a WIP checkpoint): git commit --no-verify
set -eu

repo_root="$(git rev-parse --show-toplevel)"
cd "$repo_root"

staged="$(git diff --cached --name-only --diff-filter=ACMRD)"

go_changed=0
ui_changed=0
if echo "$staged" | grep -qE '\.go$'; then go_changed=1; fi
if echo "$staged" | grep -qE '^core/http/react-ui/'; then ui_changed=1; fi

if [ "$go_changed" -eq 0 ] && [ "$ui_changed" -eq 0 ]; then
echo "pre-commit: no Go or React UI changes staged — skipping."
exit 0
fi

if [ "$go_changed" -eq 1 ]; then
# Resolve the ref golangci-lint's new-from-merge-base should compare
# against. .golangci.yml pins origin/master, which is correct in CI
# (origin == the canonical repo) but wrong from a fork clone, where
# origin/master lags behind and lint would report the whole upstream
# backlog. Prefer upstream/master, then origin/master, then master.
lint_base=""
for ref in upstream/master origin/master master; do
if git rev-parse --verify --quiet "${ref}^{commit}" >/dev/null 2>&1; then
lint_base="$ref"
break
fi
done

echo "pre-commit ▶ golangci-lint (make lint${lint_base:+, new-from $lint_base})"
make lint LINT_NEW_FROM="$lint_base"

echo "pre-commit ▶ coverage gate (make test-coverage-check) — builds and runs the"
echo " pkg/core suites plus tests/e2e; can take a few minutes."
make test-coverage-check
fi

if [ "$ui_changed" -eq 1 ]; then
echo "pre-commit ▶ React UI e2e + coverage gate (make test-ui-coverage-check) —"
echo " rebuilds the UI + ui-test-server, runs the Playwright specs, and"
echo " fails if line coverage regressed; can take a couple of minutes."
make test-ui-coverage-check
fi

echo "pre-commit ✓ all relevant checks passed"
17 changes: 15 additions & 2 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -53,9 +53,22 @@ jobs:
node-version: '22'
- name: Build React UI
run: make react-ui
- name: Test
# Runs the core suite with coverage and fails if total coverage dropped
# below the committed baseline (coverage-baseline.txt). The gate is
# strict — any decrease fails. Raise the baseline with
# `make test-coverage-baseline` and commit it when coverage rises.
- name: Test (with coverage gate)
run: |
PATH="$PATH:/root/go/bin" make --jobs 5 --output-sync=target test
PATH="$PATH:/root/go/bin" make --jobs 5 --output-sync=target test-coverage-check
- name: Upload coverage report
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: coverage-linux
path: |
coverage/coverage.out
coverage/coverage.html
if-no-files-found: ignore
- name: Setup tmate session if tests fail
if: ${{ failure() }}
uses: mxschmitt/action-tmate@v3.23
Expand Down
28 changes: 18 additions & 10 deletions .github/workflows/tests-ui-e2e.yml
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,10 @@ jobs:
uses: actions/setup-node@v6
with:
node-version: '22'
- name: Setup Bun
uses: oven-sh/setup-bun@v2
with:
bun-version: '1.3.11'
- name: Proto Dependencies
run: |
curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \
Expand All @@ -48,23 +52,27 @@ jobs:
run: |
sudo apt-get update
sudo apt-get install -y build-essential libopus-dev
- name: Build UI test server
run: PATH="$PATH:$HOME/go/bin" make build-ui-test-server
- name: Install Playwright
working-directory: core/http/react-ui
run: |
npm install
npx playwright install --with-deps chromium
- name: Run Playwright tests
working-directory: core/http/react-ui
run: npx playwright test
# Builds an instrumented UI bundle, runs the Playwright specs, and fails
# if line coverage regressed beyond the jitter tolerance (the gate is
# in `make test-ui-coverage-check`). PLAYWRIGHT_CHROMIUM_PATH is unset
# here, so scripts/ensure-playwright-browser.sh installs Chromium via apt.
- name: Run UI e2e + coverage gate
run: PATH="$PATH:$HOME/go/bin" make test-ui-coverage-check
- name: Upload Playwright report
if: ${{ failure() }}
uses: actions/upload-artifact@v7
with:
name: playwright-report
path: core/http/react-ui/playwright-report/
retention-days: 7
- name: Upload UI coverage report
if: ${{ always() }}
uses: actions/upload-artifact@v7
with:
name: ui-coverage
path: core/http/react-ui/coverage/
if-no-files-found: ignore
retention-days: 7
- name: Setup tmate session if tests fail
if: ${{ failure() }}
uses: mxschmitt/action-tmate@v3.23
Expand Down
7 changes: 7 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -66,10 +66,17 @@ docs/static/gallery.html
# per-developer customization files for the development container
.devcontainer/customization/*

# Coverage profiles (the committed baseline is coverage-baseline.txt)
/coverage/

# React UI build artifacts (keep placeholder dist/index.html)
core/http/react-ui/node_modules/
core/http/react-ui/dist

# React UI coverage (vite-plugin-istanbul + nyc, via `make test-ui-coverage`)
core/http/react-ui/.nyc_output/
core/http/react-ui/coverage/

# Extracted backend binaries for container-based testing
local-backends/

Expand Down
1 change: 1 addition & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -198,6 +198,7 @@ For AI-assisted development, see [`AGENTS.md`](AGENTS.md) (or the equivalent [`C

- Prefer modern Go idioms — for example, use `any` instead of `interface{}`.
- Use [`golangci-lint`](https://golangci-lint.run) to catch common issues before submitting a PR.
- Run `make install-hooks` once per clone to enable the pre-commit hook: Go changes run `make lint` + the coverage gate (`make test-coverage-check`); `core/http/react-ui/` changes run the Playwright e2e suite (`make test-ui`). Bypass a single commit with `git commit --no-verify`.
- Use [`github.com/mudler/xlog`](https://github.com/mudler/xlog) for logging (same API as `slog`). Do not use `fmt.Println` or the standard `log` package for operational logging.
- Use tab indentation for Go files (as defined in `.editorconfig`).

Expand Down
Loading
Loading