v0.9.68 — Session reliability: drain stuck-active cap leak + /sessions 500 resilience + Daytona billing reaper

@github-actions released this 21 Jun 17:30
· 74 commits to main since this release
58b8b1e

Fixes the production incident in which sessions stuck in an active status with no live box counted against the per-account concurrent-session cap. This wedged Slack ("queued behind other project work") and rejected new sessions with HTTP 429. Adds a DB-only reconcileStuckActiveSessions maintenance pass (immune to Daytona throttling) with a real-DB e2e test, and includes the already-merged /sessions list resilience fix (#3567) along with the provider-agnostic Daytona billing reaper and orphan-box reaper. Slack status messages are now honest and reason-aware (out-of-credits / at-cap / not-found).
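To illustrate the idea behind a DB-only reconciliation pass, here is a minimal in-memory sketch. It is not the code from this release: the row shape, the lastSeenAt field, the "failed" target status, and the staleness threshold are all assumptions made for the example. The only point it demonstrates is that the decision is made from stored state alone, with no call to the sandbox provider, so provider throttling cannot block it.

```typescript
// Hypothetical row shape; the real schema is not shown in these notes.
type SessionStatus = "active" | "stopped" | "failed";

interface SessionRow {
  id: string;
  accountId: string;
  status: SessionStatus;
  lastSeenAt: number; // epoch ms of the last recorded activity (assumed field)
}

// Counts the sessions that occupy a slot in the per-account concurrency cap.
function activeCount(rows: SessionRow[], accountId: string): number {
  return rows.filter((r) => r.accountId === accountId && r.status === "active").length;
}

// DB-only pass: inspects stored fields only and never calls the provider.
// Mutating the array in place stands in for an UPDATE statement.
function reconcileStuckActiveSessions(
  rows: SessionRow[],
  nowMs: number,
  staleAfterMs: number,
): { drained: string[] } {
  const drained: string[] = [];
  for (const row of rows) {
    if (row.status !== "active") continue;
    if (nowMs - row.lastSeenAt > staleAfterMs) {
      row.status = "failed"; // assumed terminal status for a drained session
      drained.push(row.id);
    }
  }
  return { drained };
}

// Usage: s2 has been silent for an hour, so it is drained; s1 stays active.
const now = 10_000_000;
const rows: SessionRow[] = [
  { id: "s1", accountId: "a1", status: "active", lastSeenAt: now - 5_000 },
  { id: "s2", accountId: "a1", status: "active", lastSeenAt: now - 3_600_000 },
  { id: "s3", accountId: "a1", status: "stopped", lastSeenAt: now - 7_200_000 },
];
const result = reconcileStuckActiveSessions(rows, now, 600_000);
console.log(result.drained, activeCount(rows, "a1"));
```

After the pass, the drained session no longer counts toward the cap, which is what frees the account to start new sessions.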

What's Changed

  • ci(deploy-prod): apply DB migrations before the API rolls by @markokraemer in #3555
  • feat(connectors): default to one shared profile + prominent Connect CTA by @markokraemer in #3558
  • ci(dev): make GitOps deploy bump explicit by @markokraemer in #3562
  • chore(dev-eks): deploy dev-c29e43d4 [skip ci] by @github-actions[bot] in #3563
  • feat(executor): Slack as a native channel connector (KORTIX-206 Phase A) by @markokraemer in #3564
  • chore(dev-eks): deploy dev-908fc102 [skip ci] by @github-actions[bot] in #3565
  • feat(web): Slack channel connector first-class in the Executor tab (KORTIX-206 Phase B) by @markokraemer in #3566
  • fix(sessions): one bad sandbox no longer 500s the session list (+ stop Daytona rate-limit amplification) by @markokraemer in #3567
  • chore(dev-eks): deploy dev-d025307a [skip ci] by @github-actions[bot] in #3569
  • platform: Botkube Slack bot (alerts + read-only kubectl) by @lillyboga in #3570
  • fix(sandbox): provider-authoritative orphan-box reaper (auto-stop leaked running boxes) by @markokraemer in #3571
  • feat(triggers): trigger owner — run automations as a chosen member by @markokraemer in #3568
  • chore(dev-eks): deploy dev-093db4ea [skip ci] by @github-actions[bot] in #3572
  • fix(botkube): plugin RBAC context by @lillyboga in #3573
  • fix(executor): install-driven channel materialization + missing Channels i18n key (KORTIX-206) by @markokraemer in #3574
  • chore(dev-eks): deploy dev-dd66b2c1 [skip ci] by @github-actions[bot] in #3575
  • test(executor-sdk): in-depth unit suite + live e2e + harden connectors() (KORTIX-206) by @markokraemer in #3576
  • feat(slack): shim vendor calls → Executor SDK (KORTIX-206 Phase C1) by @markokraemer in #3577
  • review: fix Grafana dashboard field config schema by @agent-kortix in #3554
  • review: auto-fix executor release and auth gaps by @agent-kortix in #3547
  • feat(slack): token out of the sandbox — file proxies + strip SLACK_BOT_TOKEN (KORTIX-206 Phase C2) by @markokraemer in #3578
  • refactor(sandbox): rename agent-cli → slack-cli (it's just the Slack shim now) (KORTIX-206) by @markokraemer in #3579
  • chore(dev-eks): deploy dev-642c7149 [skip ci] by @github-actions[bot] in #3580
  • docs(infra): where logs live (Better Stack) + ECS-is-warm-standby by @markokraemer in #3582
  • fix(slack)+docs: file upload IS supported + purge stale token/provider docs (KORTIX-206) by @markokraemer in #3584
  • chore(dev-eks): deploy dev-68fa486c [skip ci] by @github-actions[bot] in #3585
  • fix(triggers): make per-project pause kill-switch reachable + add UI toggle by @markokraemer in #3583
  • chore(dev-eks): deploy dev-cf587150 [skip ci] by @github-actions[bot] in #3586
  • QA: consolidated test framework + LLM gateway tests (validate on branch before main) by @lillyboga in #3495
  • chore(dev-eks): deploy dev-61c3f783 [skip ci] by @github-actions[bot] in #3589
  • QA portal publishing + CI fixes (follow-up to #3495) by @lillyboga in #3590
  • feat(db): replace supabase/migrations + drizzle-push with node-pg-migrate by @lillyboga in #3588
  • fix(qa-portal): serve full reports/ tree so per-PR links resolve by @lillyboga in #3591
  • fix(api): bundle migrations into the image (image build broken on main) by @lillyboga in #3592
  • chore(dev-eks): deploy dev-3a32db2d [skip ci] by @github-actions[bot] in #3593
  • test(report): include co-located bun suites (~830) + ke2e flows in Allure by @lillyboga in #3594
  • feat(qa-portal): publish release/nightly + landing page by @lillyboga in #3595
  • feat(db): block out-of-sequence / duplicate / malformed migrations in CI by @lillyboga in #3596
  • chore(dev-eks): deploy dev-c1570b79 [skip ci] by @github-actions[bot] in #3597
  • fix(qa-main): install pnpm for the migration job by @lillyboga in #3598
  • test(e2e): accept OAuth/SSO on the sign-in reachability check by @lillyboga in #3599
  • fix(qa-main): install bun for the migration job by @lillyboga in #3600
  • ci(qa-main): make visual regression advisory by @lillyboga in #3601
  • fix(ke2e): time-bound Supabase auth so setup can't hang silently by @lillyboga in #3604
  • fix(qa-release): repair duplicate env + make visual advisory by @lillyboga in #3603
  • fix(ke2e): fund the OWNER at provision time (clears ~64 release failures) by @lillyboga in #3605
  • perf(ke2e): parallelize 79 isolated flows (serial 98 -> 19) by @lillyboga in #3606
  • fix(web): tuck 'pause all triggers' kill-switch into Settings (out of the Triggers tab) by @markokraemer in #3607
  • ci(qa-release): make ke2e API flow suite opt-in by @lillyboga in #3608
  • fix(infra): repair the make infra lane (tflint/helm-validate/kubeconform) by @lillyboga in #3609
  • fix(sessions): drain cap-eating stuck-active sessions + honest Slack status by @markokraemer in #3610
  • ci(qa-release): slim the blocking gate (move exhaustive scans to nightly) by @lillyboga in #3611
  • ci(qa-release): fast static security (drop the app-image build) by @lillyboga in #3612
  • test(reaper): real-DB e2e for stuck-session reconciliation by @markokraemer in #3613
  • chore(dev-eks): deploy dev-336463b3 [skip ci] by @github-actions[bot] in #3615
  • ci(qa-release): static security advisory (gate on migration+infra) by @lillyboga in #3614
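
The /sessions list fix in #3567 above ("one bad sandbox no longer 500s the session list") can be pictured with a generic per-item isolation pattern. The sketch below is not the PR's implementation: listSessions, SandboxLookup, and the "unknown" fallback state are hypothetical names chosen for the example. It only shows how a failing lookup for one session can degrade that row instead of failing the whole response.

```typescript
interface SessionSummary {
  id: string;
  sandboxState: string; // "unknown" when the lookup failed (assumed fallback)
}

// Hypothetical provider lookup; may reject for a single bad sandbox.
type SandboxLookup = (sessionId: string) => Promise<string>;

// Each lookup is wrapped in its own try/catch, so one rejection
// produces a degraded row rather than rejecting the whole list.
async function listSessions(
  ids: string[],
  lookup: SandboxLookup,
): Promise<SessionSummary[]> {
  return Promise.all(
    ids.map(async (id): Promise<SessionSummary> => {
      try {
        return { id, sandboxState: await lookup(id) };
      } catch {
        return { id, sandboxState: "unknown" };
      }
    }),
  );
}

// Usage: the lookup for "bad" throws, yet the list still resolves.
const demoLookup: SandboxLookup = async (id) => {
  if (id === "bad") throw new Error("provider error");
  return "running";
};
listSessions(["ok", "bad"], demoLookup).then((list) => console.log(list));
```

This sketch does not address the rate-limit amplification mentioned in the same PR title.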

Full Changelog: v0.9.67...v0.9.68