v0.9.68 — Session reliability: drain stuck-active cap leak + /sessions 500 resilience + Daytona billing reaper
Fixes the production incident where sessions stuck in an active status with no live box ate the per-account concurrent-session cap — wedging Slack ("queued behind other project work") and 429'ing new sessions. Adds a DB-only reconcileStuckActiveSessions maintenance pass (immune to Daytona throttling) with a real-DB e2e, plus the already-merged /sessions list resilience (#3567) and the provider-agnostic Daytona billing reaper + orphan-box reaper. Also: honest, reason-aware Slack status messages (out-of-credits / at-cap / not-found).
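For readers unfamiliar with the failure mode, here is a minimal sketch of what a DB-only reconciliation pass could look like. It is not the actual reconcileStuckActiveSessions implementation: the row shape, the field names (`sandboxId`, `lastSeenAt`), the staleness threshold, and both helper functions are assumptions made for illustration. The point it shows is that stuck sessions are chosen from data already in the database, so the pass does not depend on provider API calls that may be throttled.

```typescript
// Hypothetical row shape; the real schema is not shown in these notes.
interface SessionRow {
  id: string;
  accountId: string;
  status: "active" | "stopped";
  sandboxId: string | null; // assumed: null when no box is attached
  lastSeenAt: number; // assumed: epoch ms of the last heartbeat stored in the DB
}

// Select sessions that claim to be active but show no evidence of a live box,
// using only DB fields (no provider calls).
function findStuckActive(rows: SessionRow[], now: number, staleMs: number): SessionRow[] {
  return rows.filter(
    (r) => r.status === "active" && (r.sandboxId === null || now - r.lastSeenAt > staleMs),
  );
}

// Count the sessions that would still occupy an account's concurrent-session cap
// once the stuck ones are drained.
function activeCountAfterDrain(
  rows: SessionRow[],
  accountId: string,
  now: number,
  staleMs: number,
): number {
  const stuck = new Set(findStuckActive(rows, now, staleMs).map((r) => r.id));
  return rows.filter(
    (r) => r.accountId === accountId && r.status === "active" && !stuck.has(r.id),
  ).length;
}
```

In this sketch, a session with no attached box or with a heartbeat older than `staleMs` stops counting toward the cap, which is the effect the fix describes for queued Slack work and 429'd new sessions.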
What's Changed
- ci(deploy-prod): apply DB migrations before the API rolls by @markokraemer in #3555
- feat(connectors): default to one shared profile + prominent Connect CTA by @markokraemer in #3558
- ci(dev): make GitOps deploy bump explicit by @markokraemer in #3562
- chore(dev-eks): deploy dev-c29e43d4 [skip ci] by @github-actions[bot] in #3563
- feat(executor): Slack as a native channel connector (KORTIX-206 Phase A) by @markokraemer in #3564
- chore(dev-eks): deploy dev-908fc102 [skip ci] by @github-actions[bot] in #3565
- feat(web): Slack channel connector first-class in the Executor tab (KORTIX-206 Phase B) by @markokraemer in #3566
- fix(sessions): one bad sandbox no longer 500s the session list (+ stop Daytona rate-limit amplification) by @markokraemer in #3567
- chore(dev-eks): deploy dev-d025307a [skip ci] by @github-actions[bot] in #3569
- platform: Botkube Slack bot (alerts + read-only kubectl) by @lillyboga in #3570
- fix(sandbox): provider-authoritative orphan-box reaper (auto-stop leaked running boxes) by @markokraemer in #3571
- feat(triggers): trigger owner — run automations as a chosen member by @markokraemer in #3568
- chore(dev-eks): deploy dev-093db4ea [skip ci] by @github-actions[bot] in #3572
- fix(botkube): plugin RBAC context by @lillyboga in #3573
- fix(executor): install-driven channel materialization + missing Channels i18n key (KORTIX-206) by @markokraemer in #3574
- chore(dev-eks): deploy dev-dd66b2c1 [skip ci] by @github-actions[bot] in #3575
- test(executor-sdk): in-depth unit suite + live e2e + harden connectors() (KORTIX-206) by @markokraemer in #3576
- feat(slack): shim vendor calls → Executor SDK (KORTIX-206 Phase C1) by @markokraemer in #3577
- review: fix Grafana dashboard field config schema by @agent-kortix in #3554
- review: auto-fix executor release and auth gaps by @agent-kortix in #3547
- feat(slack): token out of the sandbox — file proxies + strip SLACK_BOT_TOKEN (KORTIX-206 Phase C2) by @markokraemer in #3578
- refactor(sandbox): rename agent-cli → slack-cli (it's just the Slack shim now) (KORTIX-206) by @markokraemer in #3579
- chore(dev-eks): deploy dev-642c7149 [skip ci] by @github-actions[bot] in #3580
- docs(infra): where logs live (Better Stack) + ECS-is-warm-standby by @markokraemer in #3582
- fix(slack)+docs: file upload IS supported + purge stale token/provider docs (KORTIX-206) by @markokraemer in #3584
- chore(dev-eks): deploy dev-68fa486c [skip ci] by @github-actions[bot] in #3585
- fix(triggers): make per-project pause kill-switch reachable + add UI toggle by @markokraemer in #3583
- chore(dev-eks): deploy dev-cf587150 [skip ci] by @github-actions[bot] in #3586
- QA: consolidated test framework + LLM gateway tests (validate on branch before main) by @lillyboga in #3495
- chore(dev-eks): deploy dev-61c3f783 [skip ci] by @github-actions[bot] in #3589
- QA portal publishing + CI fixes (follow-up to #3495) by @lillyboga in #3590
- feat(db): replace supabase/migrations + drizzle-push with node-pg-migrate by @lillyboga in #3588
- fix(qa-portal): serve full reports/ tree so per-PR links resolve by @lillyboga in #3591
- fix(api): bundle migrations into the image (image build broken on main) by @lillyboga in #3592
- chore(dev-eks): deploy dev-3a32db2d [skip ci] by @github-actions[bot] in #3593
- test(report): include co-located bun suites (~830) + ke2e flows in Allure by @lillyboga in #3594
- feat(qa-portal): publish release/nightly + landing page by @lillyboga in #3595
- feat(db): block out-of-sequence / duplicate / malformed migrations in CI by @lillyboga in #3596
- chore(dev-eks): deploy dev-c1570b79 [skip ci] by @github-actions[bot] in #3597
- fix(qa-main): install pnpm for the migration job by @lillyboga in #3598
- test(e2e): accept OAuth/SSO on the sign-in reachability check by @lillyboga in #3599
- fix(qa-main): install bun for the migration job by @lillyboga in #3600
- ci(qa-main): make visual regression advisory by @lillyboga in #3601
- fix(ke2e): time-bound Supabase auth so setup can't hang silently by @lillyboga in #3604
- fix(qa-release): repair duplicate env + make visual advisory by @lillyboga in #3603
- fix(ke2e): fund the OWNER at provision time (clears ~64 release failures) by @lillyboga in #3605
- perf(ke2e): parallelize 79 isolated flows (serial 98 -> 19) by @lillyboga in #3606
- fix(web): tuck 'pause all triggers' kill-switch into Settings (out of the Triggers tab) by @markokraemer in #3607
- ci(qa-release): make ke2e API flow suite opt-in by @lillyboga in #3608
- fix(infra): repair the make infra lane (tflint/helm-validate/kubeconform) by @lillyboga in #3609
- fix(sessions): drain cap-eating stuck-active sessions + honest Slack status by @markokraemer in #3610
- ci(qa-release): slim the blocking gate (move exhaustive scans to nightly) by @lillyboga in #3611
- ci(qa-release): fast static security (drop the app-image build) by @lillyboga in #3612
- test(reaper): real-DB e2e for stuck-session reconciliation by @markokraemer in #3613
- chore(dev-eks): deploy dev-336463b3 [skip ci] by @github-actions[bot] in #3615
- ci(qa-release): static security advisory (gate on migration+infra) by @lillyboga in #3614
Full Changelog: v0.9.67...v0.9.68