Skip to content

Move TOTP subprocess calls off the event loop#127

Merged
dcellison merged 1 commit intomainfrom
fix/totp-async-subprocess
Mar 23, 2026
Merged

Move TOTP subprocess calls off the event loop#127
dcellison merged 1 commit intomainfrom
fix/totp-async-subprocess

Conversation

@dcellison
Copy link
Copy Markdown
Owner

Summary

  • Wrap all six totp.py call sites in bot.py with asyncio.to_thread() so synchronous subprocess.run() calls run in the thread pool instead of blocking the event loop
  • A single verify_code() invocation can block for up to 15 seconds (three sudo calls with 5-second timeouts each), freezing message processing for all users
  • totp.py itself is unchanged - it stays synchronous, which is correct for its CLI usage (python -m kai totp setup/status/reset)

Call sites wrapped

# Function Location
1 is_totp_configured() _check_totp
2 is_totp_configured() handle_message
3 get_lockout_remaining() handle_message (pre-verify)
4 verify_code() handle_message
5 get_lockout_remaining() handle_message (post-verify failure)
6 get_failure_count() handle_message

Test plan

  • All 1038 existing tests pass unchanged (mocks work with to_thread - it calls mock() in the thread pool, returning mock.return_value)
  • make check clean (ruff lint + format)
  • grep -c "to_thread" src/kai/bot.py returns 6
  • No direct calls remain: grep "verify_code\|get_lockout_remaining\|get_failure_count\|is_totp_configured" src/kai/bot.py | grep -v "to_thread\|import\|def \|#" returns empty

Fixes #123

Wrap all six totp.py call sites in bot.py with asyncio.to_thread() so
the synchronous subprocess.run() calls (sudo cat/tee to root-owned
files) run in the thread pool instead of blocking the event loop.

A single verify_code() invocation can block for up to 15 seconds
(three sudo calls with 5-second timeouts). With to_thread, the event
loop remains responsive to other users during verification.

totp.py itself is unchanged - it stays synchronous for CLI usage.

Fixes #123
@dcellison
Copy link
Copy Markdown
Owner Author

Review by Kai

PR Review: Move TOTP subprocess calls off the event loop

Overall: The core change is correct and cleanly implemented. All six call sites use proper asyncio.to_thread(fn, arg1, ...) syntax (not fn() calls passed in), and the control flow is unchanged. No bugs introduced.


Warning

No exception handling on threaded subprocess calls (bot.py, all 6 wrapped call sites)

If any subprocess fails (TOTP binary missing, permission error, timeout), the exception propagates unhandled through the async handler. The critical path is _check_totp:

if not await asyncio.to_thread(is_totp_configured):
    return True  # allow access

If is_totp_configured raises, the function neither returns True nor False — the exception escapes. Whether this fails open or closed depends entirely on the bot's top-level error handler, which isn't shown. Pre-existing issue, but this PR touches all six sites and is a natural place to address it.

Minimum fix: wrap each to_thread call in a try/except and fail closed (deny access, log the error).


Suggestion

TOCTOU window slightly widened (bot.py lines ~1579 and ~1627)

is_totp_configured() is called independently in _check_totp and again in handle_message. Thread pool scheduling adds a small delay to each, marginally widening the window where TOTP config could change between checks. Pre-existing architectural issue — not introduced here, not worth blocking the PR on.


Clean

  • Correct asyncio.to_thread usage throughout (function reference + args, not a call)
  • verify_code args passed correctly: to_thread(verify_code, code, lockout_attempts, lockout_minutes)
  • No logic changes to authentication flow
  • totp.py left synchronous (correct — CLI usage)
  • PR description's grep verification claims are consistent with the diff

@dcellison dcellison merged commit 8034116 into main Mar 23, 2026
1 check passed
@dcellison dcellison deleted the fix/totp-async-subprocess branch March 23, 2026 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TOTP verification blocks asyncio event loop with synchronous subprocess calls

2 participants