Skip to content

fix: unset PKG_EXECPATH and stale broker token vars in startup-cleanup#152

Merged
benvinegar merged 1 commit intomainfrom
fix/bridge-env-leak
Feb 24, 2026
Merged

fix: unset PKG_EXECPATH and stale broker token vars in startup-cleanup#152
benvinegar merged 1 commit intomainfrom
fix/bridge-env-leak

Conversation

@baudbot-agent
Copy link
Copy Markdown
Collaborator

@baudbot-agent baudbot-agent commented Feb 23, 2026

Problem

PR #148 introduced bridge-restart-policy.sh sourcing and made startup-cleanup.sh the canonical mid-session bridge restart path. Two classes of inherited env vars cause bridge startup failures when the script is called from a running control-agent session:

Bug 1: PKG_EXECPATH poisons varlock probes

The pi agent process inherits PKG_EXECPATH from its initial varlock launch. This causes varlock's SEA binary to misinterpret argv — treating subcommands like run as Node module paths:

Error: Cannot find module '/home/baudbot_agent/run'

The varlock broker-key probes on lines 115-122 fail silently (2>/dev/null), so BRIDGE_SCRIPT stays empty → "No Slack transport configured" → bridge never starts.

Bug 2: Stale env vars bypass varlock injection

varlock run does not override env vars already set in the parent process. This affects every varlock-managed variable, not just broker tokens. If any value is rotated after session start (broker token refresh, API key rotation, config change), the supervisor passes the stale inherited values instead of reading fresh ones from ~/.config/.env.

The original fix hardcoded unset for just SLACK_BROKER_ACCESS_TOKEN and SLACK_BROKER_ACCESS_TOKEN_EXPIRES_AT, but that's a whack-a-mole approach — any new rotatable credential would need another hardcoded unset line.

Fix

  1. unset PKG_EXECPATH at the top of the script (line 14, before varlock probes)
  2. Dynamic unset of ALL varlock-managed keys in the supervisor subshell — uses varlock load --format env --compact to enumerate every key varlock manages, then unsets them all before varlock run injects fresh values:
if command -v varlock >/dev/null 2>&1; then
  while IFS='=' read -r key _; do
    [ -n "$key" ] && unset "$key"
  done < <(varlock load --path "$HOME/.config/" --format env --compact 2>/dev/null)
fi

This guarantees every bridge restart gets fresh values from ~/.config/.env, regardless of which keys changed.

Testing

  • All 15 existing tests pass (bin/test.sh)
  • Verified manually: bridge starts correctly when called from a session with PKG_EXECPATH set and stale SLACK_BROKER_ACCESS_TOKEN_EXPIRES_AT

Regression from #148.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Feb 23, 2026

Greptile Summary

Fixes two environment variable inheritance bugs that prevented the Slack bridge from starting during mid-session restarts:

  • Unsets PKG_EXECPATH at script start (before varlock probes) to prevent varlock SEA binary from misinterpreting subcommands as Node module paths
  • Unsets stale broker token variables (SLACK_BROKER_ACCESS_TOKEN, SLACK_BROKER_ACCESS_TOKEN_EXPIRES_AT) in the supervisor subshell so varlock injects current values from ~/.config/.env instead of passing through expired tokens from the parent process

Both issues were introduced in #148 when startup-cleanup.sh became the canonical mid-session bridge restart path. The fixes are minimal, well-documented with inline comments, and preserve the existing PKG_EXECPATH unset at line 147 (which only protected the supervisor subshell, not the earlier varlock probes).

Confidence Score: 5/5

  • Safe to merge - surgical fix for documented regression with clear test coverage
  • The changes are minimal (6 lines added), well-documented with inline comments explaining the root cause, and address legitimate bugs introduced in runtime: add adaptive Slack bridge restart policy #148. The fix is defensive (uses || true for unset operations), all 15 existing tests pass, and manual testing confirmed the fix works.
  • No files require special attention

Important Files Changed

Filename Overview
pi/skills/control-agent/startup-cleanup.sh Fixes environment variable inheritance issues that prevent bridge startup after mid-session restarts

Last reviewed commit: e58f2a6

Comment thread pi/skills/control-agent/startup-cleanup.sh Outdated
…-cleanup

When startup-cleanup.sh runs mid-session (called by the control agent),
inherited env vars cause bridge startup failures:

1. PKG_EXECPATH — leaked from the parent varlock-launched process, causes
   varlock's SEA binary to misinterpret subcommands as Node module paths.
   The varlock broker-key probes (lines 115-122) silently fail, resulting
   in 'No Slack transport configured' and the bridge never starting.

2. varlock run does not override env vars already present in the parent
   process. If any managed value (broker tokens, API keys, config) was
   rotated after session start, the supervisor passes the stale values
   instead of reading fresh ones from ~/.config/.env.

Fix:
- unset PKG_EXECPATH at the script top (before varlock probes)
- In the supervisor subshell, dynamically unset ALL varlock-managed keys
  via 'varlock load --format env' before calling 'varlock run', so every
  restart gets fresh values regardless of which keys changed.

Regression from #148.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants