Releases: anousss007/laravel-vigilance

v0.5.6 — multi-node supervisor/worker state fix

16 Jun 13:19

Multi-node fix from a distributed-deployment attack pass, plus an adversarial audit of the RUM symbolicator.

Fixed

  • Multi-node fleets under-reported workers; supervisors clobbered each other. Supervisor/worker heartbeat rows were keyed by supervisor name only (vigilance_supervisors.name was even the primary key). Running the same supervisor on multiple servers — normal horizontal scaling — meant each node's heartbeat overwrote the others' row and each node's worker-set write deleted the other nodes' worker rows, so the dashboard showed one flapping node and a worker count well below the real fleet. State is now keyed by (name, host): every node keeps its own rows, the dashboard shows each node (with hostname) and true fleet totals, and prune/forget act per-node. Node identity is configurable via supervision.host / VIGILANCE_SUPERVISOR_HOST (default: machine hostname — set it for containers).

Schema note: vigilance_supervisors gains an id PK + unique(name, host); vigilance_workers is now unique(supervisor, host, pid). Existing installs should run php artisan migrate:fresh (supervisor/worker rows are ephemeral heartbeats, so nothing of value is lost).
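
To make the keying change concrete, here is a minimal sketch in Python with an in-memory SQLite table. It is not the package's code: the table layout, the `heartbeat` helper and the host names are hypothetical, and only the old key (name) and the new key (name, host) come from the notes above.

```python
import sqlite3

def make_db(key_columns):
    """Create an in-memory supervisors table with the given uniqueness key."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE supervisors (name TEXT, host TEXT, beat INTEGER, "
        f"UNIQUE ({key_columns}))"
    )
    return db

def heartbeat(db, key_columns, name, host, beat):
    """Upsert one heartbeat row, resolving conflicts on the uniqueness key."""
    db.execute(
        "INSERT INTO supervisors (name, host, beat) VALUES (?, ?, ?) "
        f"ON CONFLICT ({key_columns}) "
        "DO UPDATE SET host = excluded.host, beat = excluded.beat",
        (name, host, beat),
    )

def visible_nodes(key_columns):
    """Run the same supervisor on three hosts and list the rows that survive."""
    db = make_db(key_columns)
    for beat, host in enumerate(["web-1", "web-2", "web-3"]):
        heartbeat(db, key_columns, "default", host, beat)
    rows = db.execute("SELECT host FROM supervisors ORDER BY host")
    return [row[0] for row in rows]

old_keying = visible_nodes("name")        # each node overwrites the shared row
new_keying = visible_nodes("name, host")  # each node keeps its own row
print(old_keying, new_keying)
```

With the name-only key a single row survives, holding whichever node wrote last; with (name, host) all three nodes stay visible.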

Validated (no code change)

  • RUM symbolicator attacked with 200 KB pathological stacks (no ReDoS — 0.1 ms), malformed source maps (bad JSON / bad VLQ) and high token counts: stays fast, degrades to unsymbolicated without crashing. Stacks capped (8 KB) and ≤5 errors/request before symbolication, atop the endpoint rate limit.
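
The pre-symbolication caps can be pictured roughly as follows. This is a hedged Python sketch, not the package's implementation: `clamp_errors` and the payload shape are hypothetical, and only the two limits (8 KB per stack, at most 5 errors per request) come from the note above.

```python
MAX_ERRORS = 5               # errors kept per request
MAX_STACK_BYTES = 8 * 1024   # 8 KB cap per stack trace

def clamp_errors(errors):
    """Keep at most MAX_ERRORS errors and cut each stack to MAX_STACK_BYTES."""
    clamped = []
    for error in errors[:MAX_ERRORS]:
        raw = error.get("stack", "").encode("utf-8")[:MAX_STACK_BYTES]
        # Drop a multi-byte character that the cut may have split in half.
        stack = raw.decode("utf-8", errors="ignore")
        clamped.append({**error, "stack": stack})
    return clamped

# A hostile payload: 7 errors, each carrying a 200 KB stack.
payload = [{"message": f"boom {i}", "stack": "x" * 200_000} for i in range(7)]
safe = clamp_errors(payload)
print(len(safe), max(len(e["stack"].encode("utf-8")) for e in safe))
```

Clamping before any parsing bounds the symbolicator's work by the caps instead of by whatever the client chose to send.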

Full notes: CHANGELOG.md.

v0.5.5 — concurrency fix + adversarial validation

16 Jun 12:34

A second, harder adversarial pass — DDoS/flood amplification, a cardinality bomb, a failing-job storm, a job-dispatching endpoint under flood, multi-driver supervision and the public RUM endpoint — which surfaced and fixed one real concurrency bug.

Fixed

  • Failure-group occurrence counts undercounted under concurrent failures. The per-group occurrences counter was a read-modify-write, so simultaneous workers recording the same failure signature (a failing-job storm) clobbered each other's increments (~10% loss measured across 2000 failures on 3 workers). Now uses a race-safe createOrFirst() (on the unique signature index → no duplicate groups) plus an atomic SQL increment, so the count is exact under any concurrency.
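
The lost-update pattern and its fix can be reproduced deterministically. The sketch below uses Python and in-memory SQLite rather than the package's Eloquent code: `INSERT OR IGNORE` stands in for createOrFirst() on the unique signature index, and the table and helper names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE failure_groups ("
    "signature TEXT NOT NULL UNIQUE, occurrences INTEGER NOT NULL DEFAULT 0)"
)

def ensure_group(signature):
    """Create the group if missing; the unique index prevents duplicates."""
    db.execute(
        "INSERT OR IGNORE INTO failure_groups (signature) VALUES (?)",
        (signature,),
    )

def read_count(signature):
    row = db.execute(
        "SELECT occurrences FROM failure_groups WHERE signature = ?",
        (signature,),
    ).fetchone()
    return row[0]

def write_count(signature, value):
    db.execute(
        "UPDATE failure_groups SET occurrences = ? WHERE signature = ?",
        (value, signature),
    )

def atomic_increment(signature):
    """Let the database do the addition so no increment can be lost."""
    db.execute(
        "UPDATE failure_groups SET occurrences = occurrences + 1 "
        "WHERE signature = ?",
        (signature,),
    )

# Read-modify-write: two workers both read 0, then both write 1.
ensure_group("racy")
ensure_group("racy")
seen_by_a = read_count("racy")
seen_by_b = read_count("racy")
write_count("racy", seen_by_a + 1)
write_count("racy", seen_by_b + 1)
racy_total = read_count("racy")

# Atomic increment: the same two failures are both counted.
ensure_group("atomic")
ensure_group("atomic")
atomic_increment("atomic")
atomic_increment("atomic")
atomic_total = read_count("atomic")

group_rows = db.execute("SELECT COUNT(*) FROM failure_groups").fetchone()[0]
print(racy_total, atomic_total, group_rows)
```

The interleaving is written out by hand here so the loss is reproducible; under real concurrency it occurs whenever two workers read before either writes.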

Validated (no code change)

  • No DDoS amplification — throughput/error-rate identical with Vigilance on or off under sustained flood; enabling it never introduced an error.
  • Bounded cardinality — thousands of distinct URLs collapse to the route pattern in APM (1 key); random 404 floods write nothing (only matched routes are recorded). The aggregate tables can't be exploded.
  • Job-dispatch storm — an endpoint enqueuing 10 jobs/request under flood captured every job exactly, no failed requests; VIGILANCE_SAMPLE_RATE throttles enqueue-write load.
  • Multiple queue drivers at once — one vigilance:supervise drained database + Redis + beanstalkd concurrently, correct per-driver attribution.
  • Public RUM endpoint — rate-limited (rum.throttle, default 120/min) and capped to ≤12 metrics + ≤5 errors per request with length-bounded fields; safe to expose.
  • Extreme concurrency — graceful degradation + immediate recovery, no Vigilance-induced errors, ~one storage connection per worker.
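
The bounded-cardinality property follows from keying aggregates by route pattern instead of raw URL. The Python sketch below illustrates that idea only: the route list, `compile_route` and `aggregate_key` are hypothetical, and the placeholder syntax is assumed to be Laravel-style `{param}`.

```python
import re
from collections import Counter

# Hypothetical route table using Laravel-style placeholders.
ROUTES = ["/users/{id}", "/orders/{order}/items/{item}"]

def compile_route(pattern):
    """Turn '/users/{id}' into a regex where each placeholder is one segment."""
    literal_parts = re.split(r"\{[^/}]+\}", pattern)
    regex = "[^/]+".join(re.escape(part) for part in literal_parts)
    return re.compile(f"^{regex}$")

COMPILED = [(pattern, compile_route(pattern)) for pattern in ROUTES]

def aggregate_key(path):
    """Return the matching route pattern, or None for unmatched paths."""
    for pattern, regex in COMPILED:
        if regex.match(path):
            return pattern
    return None

counts = Counter()
for i in range(5000):
    # Thousands of distinct URLs that all belong to one route.
    key = aggregate_key(f"/users/{i}")
    if key is not None:
        counts[key] += 1
    # A random 404 flood: no route matches, so nothing is recorded.
    key = aggregate_key(f"/random-{i}")
    if key is not None:
        counts[key] += 1

print(dict(counts))
```

However many distinct URLs arrive, the number of aggregate keys cannot exceed the number of defined routes.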

Full notes: CHANGELOG.md.

v0.5.4 — production hardening across runtimes, queues & databases

16 Jun 11:53

A relentless production-readiness pass on real Linux infrastructure — every common web server and app runtime, all four supervisable queue drivers, server-class databases, storage-outage chaos and high concurrency — which surfaced and fixed four real issues.

Fixed

  • Long-running daemons no longer dangle as stuck "running" command runs. octane:start, reverb:start, pulse:work/pulse:check are excluded from command capture via an unconditional Defaults::daemonCommands() baseline (protects installs whose published config predates the list).
  • Redis queue names normalized (queues:default → default) so per-queue grouping is consistent across drivers and matches the supervisor / queue-depth probe / config.
  • Batched jobs link to their batch: batch_id is now recorded on batched runs.
  • Orphaned workers are reaped on supervisor boot: when the master is hard-killed (SIGKILL/OOM, or restarted under a non-cgroup manager like supervisord), its queue:work children no longer pile up. This completes the cross-platform reaping that the #vigilance name marker was always meant to provide.

Validated (no code change)

  • Web servers: Nginx, Apache (mod_proxy_fcgi) and Caddy, all with PHP-FPM.
  • Octane on every server: FrankenPHP, Swoole 6.2, OpenSwoole 26.2, RoadRunner 2025.1 — 800 req @ concurrency 16, 0 failed, constant per-request span count (no cross-request leakage).
  • All four supervisable queue drivers: database, Redis, beanstalkd (1.13 + pheanstalk v8), each drained by the auto-scaling supervisor.
  • supervisord + OPcache with config:cache/route:cache/event:cache.
  • Never breaks the app when its storage is down: storage taken down mid-traffic → 100% of requests still served, queue still drained, capture resumed cleanly on recovery.
  • Job lifecycles: retries, timeouts (captured as failures), batches, chains.
  • Concurrency: 1200 req @ 24 against MySQL — no lost writes, aggregate counts exact, no deadlocks.
  • Dashboard at scale: every page 200 / sub-300 ms on 60k runs / 100k entries / 22k traces.
  • Fresh install: vigilance:install, migrate, vigilance:doctor (green), dashboard — all clean.
  • Full suite green on PostgreSQL 18.4 and MySQL 8.4 (CI uses PG 16 + MariaDB 11.4).

Full notes: see CHANGELOG.md.

v0.5.3 — cross-database hardening (MySQL install fix)

15 Jun 23:37

Cross-database hardening. The test suite previously ran only on SQLite in CI; this release adds PostgreSQL 16 and MariaDB 11.4 to CI (the suite is now connection-configurable via VIGILANCE_TEST_DB), which surfaced and fixed real bugs the other engines hit.

Fixed

  • 🔴 MySQL / MariaDB install was completely broken. The vigilance_aggregates unique index auto-named to 65 chars — over MySQL's 64-char identifier limit (error 1059) — so migrations failed outright. PostgreSQL silently truncates and SQLite has no limit, which is why SQLite-only CI never caught it. Index names are now explicit.
  • PostgreSQL: float → bigint. wait_ms / duration_ms wrote Carbon-3 float milliseconds into bigint columns (PG rejects; MySQL/SQLite truncate). Now cast to int.
  • Cross-driver LIKE filters. Silenced-jobs and name/message searches used LIKE with class names whose backslashes are escape chars on PG/MySQL (not SQLite), so they silently failed there — new Like helper with an explicit ESCAPE clause.
  • Queue-depth probe. A missing jobs table threw, and on PostgreSQL a thrown query inside a transaction aborts it — QueueDepth now checks the table exists first.
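
The LIKE fix can be illustrated with a small sketch. It is written in Python against SQLite and is not the package's Like helper: `escape_like`, the `jobs` table and the choice of `!` as escape character are assumptions for the example. The point is that user input is escaped and the query names its escape character explicitly, so the pattern no longer depends on each engine's default.

```python
import sqlite3

ESCAPE_CHAR = "!"  # assumed for this sketch; any character works if declared

def escape_like(term, escape=ESCAPE_CHAR):
    """Escape the escape character first, then the LIKE wildcards."""
    for ch in (escape, "%", "_"):
        term = term.replace(ch, escape + ch)
    return term

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (name TEXT)")
db.executemany(
    "INSERT INTO jobs (name) VALUES (?)",
    [("App\\Jobs\\Send_Mail",), ("App\\Jobs\\SendXMail",), ("App\\Jobs\\Other",)],
)

def naive_search(term):
    """Unescaped: '_' in the term acts as a single-character wildcard."""
    rows = db.execute(
        "SELECT name FROM jobs WHERE name LIKE ? ORDER BY name", (f"%{term}%",)
    )
    return [row[0] for row in rows]

def safe_search(term):
    """Escaped term plus an explicit ESCAPE clause: the term matches literally."""
    rows = db.execute(
        "SELECT name FROM jobs WHERE name LIKE ? ESCAPE '!' ORDER BY name",
        (f"%{escape_like(term)}%",),
    )
    return [row[0] for row in rows]

print(naive_search("Send_Mail"))
print(safe_search("Send_Mail"))
print(safe_search("App\\Jobs\\Send"))
```

SQLite has no default LIKE escape character, so the backslashes pass through unchanged here; on engines where backslash is the default escape, declaring a different escape character is what keeps class-name backslashes literal.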

Verified

  • Full suite green on SQLite, PostgreSQL 16, MariaDB 11.4 (234 tests each).
  • Worker supervisor drained real queues on the database and redis drivers; failed jobs routed to failed_jobs; all runs captured.
  • Octane state-reset hook flushes the in-flight trace on RequestReceived (no cross-request leakage).

v0.5.2 — Boost guidelines: once-per-incident alerting

15 Jun 17:34

Docs-only patch. The Laravel Boost AI guidelines (core.blade.php) now document the once-per-incident alerting behaviour from v0.5.1 — so coding agents know a sustained condition alerts once and the rest lives on the dashboard. Matches the Boost skill, README, observability guide and config. No code change.

v0.5.1 — notify once per incident (no alert spam)

15 Jun 16:55

Fixed

  • Alert spam on sustained conditions. A persistent problem (e.g. a breaching SLO) used to re-email you every throttle window — bad DX. With incident tracking on (the default), Vigilance now notifies once when the incident opens, and again only if its severity escalates or it resolves and later recurs. So a SLO that stays breached pages you once, not every 15 minutes.
  • Set alerts.renotify_minutes (VIGILANCE_ALERT_RENOTIFY_MINUTES) > 0 to get a reminder every N minutes while an incident stays open (default 0 = notify once). With incidents disabled, behaviour is unchanged (one notification per throttle_minutes).

This applies to every rule (SLO burn, queue backlog, error rate, anomalies, deploy regressions, …).
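
The notification rules described above can be sketched as a small state machine. This Python version is an illustration under assumptions, not the package's code: the `Notifier` class, the severity ranking and the minute-based clock are hypothetical, and it sends no notice on resolution because the notes do not say whether one is sent.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"warning": 1, "critical": 2}  # assumed ordering

@dataclass
class Incident:
    severity: str
    last_notified_at: float  # minutes on an arbitrary clock
    is_open: bool = True

class Notifier:
    def __init__(self, renotify_minutes=0):
        self.renotify_minutes = renotify_minutes
        self.incidents = {}

    def evaluate(self, rule, breaching, severity, now):
        """Return True when this evaluation should send a notification."""
        incident = self.incidents.get(rule)
        if not breaching:
            if incident is not None:
                incident.is_open = False  # resolved; a later breach reopens
            return False
        if incident is None or not incident.is_open:
            # A new incident, or a recurrence after resolution.
            self.incidents[rule] = Incident(severity, now)
            return True
        if SEVERITY_RANK[severity] > SEVERITY_RANK[incident.severity]:
            incident.severity = severity
            incident.last_notified_at = now
            return True
        due = now - incident.last_notified_at >= self.renotify_minutes
        if self.renotify_minutes > 0 and due:
            incident.last_notified_at = now
            return True
        return False

# An SLO that stays breached for an hour, evaluated every 15 minutes.
once = Notifier(renotify_minutes=0)
sent = [once.evaluate("slo", True, "warning", t) for t in (0, 15, 30, 45)]
print(sent)
```

With the default of 0 the sustained breach notifies on the first evaluation only; setting `renotify_minutes` to 30 in the same scenario would add a reminder at minute 30.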

v0.5.0 — release health, smarter alerting, RUM source maps

15 Jun 16:46

A proactive-monitoring release — Vigilance now gates deploys, finds the problems no fixed threshold would catch, and makes browser errors readable. Full guide: docs/observability.md.

Added

  • Release health & deploy-regression guard (/vigilance/releases). Each deploy marker gets a before/after verdict (error rate · latency · throughput); a regressed deploy fires a critical deploy_regression alert — point a webhook at it to auto-roll-back. Issues are tagged with first_release / regressed_release. Set the release via VIGILANCE_RELEASE (or app.version); record with php artisan vigilance:deploy --release=….
  • New-issue & regression alerting: new_issue fires the first time an error signature appears; issue_regression fires when a resolved issue comes back (with a "regressed" badge). Evaluated at snapshot time, never on the request thread.
  • Dynamic-baseline anomaly detection (anomaly) — z-scores each watched metric (request latency, 5xx rate, exceptions by default) against its rolling baseline; guarded against false positives.
  • RUM source-map symbolication — a pure-PHP Source Map v3 decoder + php artisan vigilance:sourcemaps <build-dir> --release=…. Minified browser stacks are symbolicated at ingest, so the Issues inbox shows original source locations. Toggle rum.symbolicate.
  • Global ignore_paths — one config list (wildcards like /admin/* or #regex#) excludes a path from APM, tracing, RUM and web-request error capture at once.
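
For readers unfamiliar with z-score baselines, the anomaly rule can be sketched as follows. This is a generic Python illustration, not the package's detector: the threshold of 3.0, the minimum sample count and the zero-variance guard are assumptions standing in for the unspecified false-positive guards.

```python
from statistics import mean, pstdev

def z_score(value, baseline):
    """How many standard deviations the value sits from the baseline mean."""
    sigma = pstdev(baseline)
    if sigma == 0:
        return 0.0  # flat baseline: no meaningful deviation scale
    return (value - mean(baseline)) / sigma

def is_anomalous(value, baseline, threshold=3.0, min_samples=10):
    """Flag the value only when the baseline is long enough to trust."""
    if len(baseline) < min_samples:
        return False  # assumed guard: too little history to judge
    return abs(z_score(value, baseline)) >= threshold

# Rolling baseline of request latency in milliseconds (made-up numbers).
latency_baseline = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]

print(is_anomalous(110, latency_baseline))  # far outside the usual spread
print(is_anomalous(102, latency_baseline))  # within normal variation
print(is_anomalous(500, [100, 100]))        # not enough history
```

Because the threshold scales with each metric's own variance, the same rule works for a noisy metric and a stable one without hand-tuned limits.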

Notes

  • 230 tests (+21) · axe 0 violations (desktop + mobile) on all new/changed pages · CI on PHP 8.2–8.4 · Laravel 12/13 · Livewire 3/4.
  • Additive migrations (regressed_at, first_release/regressed_release on failure groups; vigilance_sourcemaps table) — php artisan migrate after upgrading.

v0.4.1 — Boost integration for the observability suite

15 Jun 12:10

A maintenance release that completes the v0.4.0 rollout.

Changed

  • Laravel Boost integration updated for the full observability suite — the AI guidelines (resources/boost/guidelines/core.blade.php) and the vigilance-development skill now cover Issues error tracking, per-route performance, RUM / Web Vitals, SLOs, custom business metrics, the trace-correlated log explorer, and the expanded alerting channels (Discord / Teams / webhooks) with incident tracking — including Vigilance::increment() / gauge() and @vigilanceRum snippets. Coding agents (Claude Code, Cursor, Copilot, …) now generate correct code against the new features.

Added

  • RELEASING.md — a pre-release checklist covering code, version strings, changelog, user-facing docs, the Boost integration, accessibility, and the tag/release steps, so every release moves all surfaces forward together.

v0.4.0 — observability suite

15 Jun 11:59

A front-to-back observability release — seven new dashboard areas, each built to the same production-first posture as the rest of Vigilance (captured cheaply, flushed after the response, sampled and bounded). Full guide: docs/observability.md.

Added

  • Unified Issues error tracker (/vigilance/issues) — every exception across web, queue, command, Vigilance::report() and browser errors, fingerprinted into a grouped inbox with stacktrace, context, occurrence sparkline and an assign / prioritise / ack / mute / resolve workflow.
  • Per-route performance (/vigilance/routes) — throughput, error rate, Apdex and exact p50/p95/p99 latency per route.
  • Real User Monitoring (/vigilance/vitals) — Core Web Vitals (LCP/INP/CLS/FCP/TTFB) + JS errors from real visitors via the @vigilanceRum beacon. Off by default (VIGILANCE_RUM).
  • SLOs & error budgets (/vigilance/slos) — availability / latency objectives vs. an error budget, with a short-window burn-rate alert.
  • Alerting depth & incidents (/vigilance/incidents) — Discord, Microsoft Teams and generic webhooks on top of mail / Slack; fired alerts persisted as incidents (auto-resolved) with occurrence counts and MTTR.
  • Custom business metrics (/vigilance/custom-metrics) — Vigilance::increment() / gauge() → auto-discovered counter & gauge cards with sparklines.
  • Trace-correlated log explorer (/vigilance/logs) — searchable application logs correlated to the trace that emitted them. Off by default (VIGILANCE_LOGS).
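
Two of the figures named above, Apdex and error-budget burn rate, have standard definitions that are easy to state in code. The Python sketch below uses those textbook formulas; the function names, the threshold and the sample numbers are hypothetical, and the package's exact calculation may differ.

```python
def apdex(latencies_ms, threshold_ms):
    """Apdex = (satisfied + tolerating / 2) / total.

    Satisfied: at or under the threshold T. Tolerating: over T, up to 4T.
    """
    if not latencies_ms:
        return None
    satisfied = sum(1 for ms in latencies_ms if ms <= threshold_ms)
    tolerating = sum(
        1 for ms in latencies_ms if threshold_ms < ms <= 4 * threshold_ms
    )
    return (satisfied + tolerating / 2) / len(latencies_ms)

def burn_rate(failed, total, target):
    """Observed error rate divided by the rate the SLO target allows.

    1.0 means the budget is being spent exactly on schedule; higher is faster.
    """
    allowed_error_rate = 1 - target
    return (failed / total) / allowed_error_rate

score = apdex([100, 200, 300, 600, 1500, 2500], threshold_ms=500)
rate = burn_rate(failed=5, total=1000, target=0.999)
print(round(score, 3), round(rate, 2))
```

In the example, three requests are satisfied, two are tolerating and one is frustrated, which gives an Apdex of about 0.667; a 0.5% error rate against a 99.9% availability target burns the budget at roughly five times the sustainable pace.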

Changed

  • Tracing now also records redis, mail and notification spans.
  • New docs/observability.md; README + docs site updated. Every new page verified with axe-core — zero violations, desktop and mobile. CI on actions/checkout@v6.

Schema note: new columns/tables were folded into the base migration. If you ran a pre-0.4 dev build, run php artisan migrate:fresh to pick them up.

v0.3.0

15 Jun 06:58

Minor release.

Added

  • Laravel Boost integration. Vigilance now ships first-class Laravel Boost support: AI guidelines (resources/boost/guidelines/core.blade.php) and a vigilance-development agent skill (resources/boost/skills/vigilance-development/SKILL.md). In any project running Boost, boost:install / boost:update automatically loads them, so coding agents (Claude Code, Cursor, Copilot, …) know Vigilance's conventions out of the box — dashboard authorization (viewVigilance), the Dispatchable / ShouldNotBeMonitored / ShouldNotBeDispatchedManually markers, the driver-agnostic worker supervisor, .env alert routing, APM and tracing — and generate correct code against the package. Verified against the installed Boost's package-discovery.

Full changelog: https://github.com/anousss007/laravel-vigilance/blob/main/CHANGELOG.md