Skip to content

Rename to tripwire and land 10 design proposals (0.20.0)#57

Merged
elijahr merged 34 commits intomainfrom
rename/tripwire-and-proposals
Apr 29, 2026
Merged

Rename to tripwire and land 10 design proposals (0.20.0)#57
elijahr merged 34 commits intomainfrom
rename/tripwire-and-proposals

Conversation

@elijahr
Copy link
Copy Markdown
Contributor

@elijahr elijahr commented Apr 27, 2026

Summary

Renames the project from bigfoot to tripwire and lands ten design improvements
collected from porting bigfoot's design into the Nim port (tripwire, used by
paperplanes). Released as 0.20.0.

The first commit is the mechanical rename + Proposal 1 default flip. Both are
breaking; bundling them avoids a confusing two-step "renamed in 0.20.0,
default flipped in 0.20.1" history. Every subsequent commit is one proposal
or one bonus item, landed in the order they were prioritized in the source
proposal file.

Commits

The branch has 14 commits: 1 bookkeeping commit bringing the proposal file onto the branch, 10 substantive commits (one per proposal), and 3 small fixup commits.

Substantive (review individually):

  • C1: Rename bigfoot to tripwire; default guard=error
  • C2: Add passthrough_safe; UnsafePassthroughError on warn (+ async_subprocess type-cast fix)
  • C3: Per-protocol guard levels via [tool.tripwire.guard]
  • C4: Distinguish post-sandbox interactions from leaked ones
  • C5: Pedagogical GuardedCallError messages with call site
  • C6: Add @pytest.mark.guard for per-test override
  • C7: Strict TOML schema validation with typo suggestions
  • C8: README: when to pick which guard default
  • C9: Raise from allow/restrict outside sandbox (also covers deny for symmetry)
  • C10: Test contextvars across asyncio + threadpool boundaries

Fixup commits (small, surgical):

  • Fix mypy: add PostSandboxInteractionError to test_init expected_all -- one-line fix to the hardcoded expected_all set in tests/unit/test_init.py after C4 added the new exception to __all__.
  • Fix README: correct stale guard=warn default-mode prose -- two prose passages still described "warn" as the default after C1 flipped it to "error".
  • Fix audit findings: green-mirage test, missing deny in CHANGELOG, em-dash in README -- C5-T2 used substring checks that were tautologically true; the C9 CHANGELOG bullet listed allow and restrict but not deny; one em-dash slipped into a C10 README addition.

Reviewer guidance

The first commit is the mechanical rename. Review by spot-checking the sed
transform and confirming the test suite passes. Commits 2 through 10 deserve
line-by-line review.

Source

See docs/proposals/2026-04-26-tripwire-port-feedback.md for the original
per-proposal rationale and acceptance tests.

elijahr added 14 commits April 27, 2026 02:26
Mechanical rename of the package from bigfoot to tripwire (PyPI
distribution, Python import name, public API symbols, exception
class names, internal sentinels, pytest fixtures, pytest entry-point,
and the config table [tool.bigfoot] -> [tool.tripwire]). Bundles
Proposal 1's default guard flip from "warn" to "error" because both
are breaking and a separate point release for the default flip would
create a confusing two-step history. Bumps version to 0.20.0.

Internal sentinels restructured from underscore-flat
(`bigfoot_subprocess_run`) to colon-namespaced `<library>:<method>`
(e.g., `subprocess:run`, `httpx:get`, `socket:connect`). The
`tripwire:` prefix is intentionally omitted because the namespace
is implicit inside the tripwire package.

Adds ConfigMigrationError raised when [tool.bigfoot] is present in
pyproject.toml during config load.

Adds scripts/preflight.sh used as the pre-rename gate.
Replaces BasePlugin.supports_guard with BasePlugin.passthrough_safe
(default False; safer-by-default). Per-plugin classification per
design Section 4 migration table: 8 plugins get passthrough_safe=True
(the 6 prior supports_guard=False plugins, plus MockPlugin and
StateMachinePlugin); 20 plugins doing real I/O get
passthrough_safe=False. Adds UnsafePassthroughError raised when
guard="warn" lets a call through to a passthrough_safe=False plugin
outside a sandbox.

Removes BasePlugin.supports_guard, the is_guard_eligible() registry
helper, its cache, and the _clear_guard_eligible_cache test helper.
These are made redundant by the new model.

Corrects async_subprocess_plugin type annotations: the
cast(_AsyncFakeProcess, await _ORIGINAL_CREATE_SUBPROCESS_EXEC(...))
annotation lied about the runtime type (the runtime returned a real
asyncio.subprocess.Process). The cast is removed and the return-type
annotation widened to _AsyncFakeProcess | asyncio.subprocess.Process
for both _fake_create_subprocess_exec and _fake_create_subprocess_shell.
Runtime behavior unchanged; static types now match reality.
Extends config schema to accept either a scalar (`guard = "warn"`)
or a nested table (`[tool.tripwire.guard] default = "warn";
subprocess = "error"`). Replaces the singular `_guard_level`
ContextVar with `_guard_levels: ContextVar[GuardLevels]` (frozen
dataclass with `default` and `overrides` fields). Dispatch in
_context.py looks up guard_levels.overrides.get(plugin_name,
guard_levels.default) per protocol.

Backwards-compatible with the scalar form, the bool form
(`guard = false` normalizes to "off"), and the "strict" alias
for "error".
Adds a module-level itertools.count() allocator for sandbox_id.
SandboxContext.__init__ allocates one and adds it to a class-level
_active_sandbox_ids set. _exit() removes it. Each Interaction
records the active sandbox_id as a private _sandbox_id attribute
(NOT in interaction.details, preserving the certainty contract).

New PostSandboxInteractionError raised when an intercepted call
fires outside any active sandbox AND the current execution context
carries a sandbox_id from a since-exited sandbox. Distinct from
the existing leaked-interaction case (call without ever having
entered any sandbox).

ContextVar token-save/reset pattern in _enter()/_exit() handles
nested sandboxes correctly: an inner sandbox's exit restores the
outer sandbox_id automatically.
Reframes GuardedCallError messages for clarity: states "OUTSIDE
any 'with tripwire:' block", names the plugin and method,
includes the user call site (file:lineno), and lists the two
fixes inline. Existing fix sections (@pytest.mark.allow,
pyproject, sandbox-with-mock) retained.

Captures the user call site via a frame-walking utility in
src/tripwire/_frames.py. The walker skips frames whose
__module__ equals "tripwire" or starts with "tripwire." and
returns the first user frame. Falls back to "<unknown call site>"
when no user frame is present (e.g., spawned thread).

Includes scripts/prototype_frame_walk.py used as the pre-commit
validation gate against subprocess.run, socket.connect, httpx.get,
aiohttp, and psycopg2.connect.
The C4 commit added PostSandboxInteractionError to tripwire.__all__
but tests/unit/test_init.py:test_all_contains_expected_names asserts
__all__ matches a hardcoded set; the new export was not added to
that set. This commit closes the gap.
Registers the unprefixed `guard` marker (matches the existing
`allow` and `deny` convention). Marker accepts a single string
("error" / "warn" / "off") OR a dict matching the GuardLevels
shape (works with C3's per-protocol form). Hookwrapper
pytest_runtest_call reads the marker, sets the _guard_levels
ContextVar override (set token, yield, reset token), scoped to
the test's lifetime.
Validates [tool.tripwire] against a closed top-level schema.
Unknown keys raise TripwireConfigError with
difflib.get_close_matches suggestions. Per-plugin sub-table
validation tightened: each plugin's load_config validates its
own keys; central code rejects sub-table names that don't
match a registered plugin name.

Strict validation extended to [tool.tripwire.guard]
per-protocol keys: every override key must match a registered
plugin name; unknown keys produce typo suggestions.

The existing parser's .lower() normalization and "strict" ->
"error" aliasing are preserved as zero-cost backward compat.
Adds a "Picking the right guard default" section after the
Firewall mode section. Three-bullet structure: new projects
(keep `error`), legacy-migration suites (`warn` temporarily),
mixed CI (per-protocol via [tool.tripwire.guard]).
C1 flipped the default from "warn" to "error" but two prose passages
in README.md still described "warn" as the default and "error" as the
opt-in. Updates the Firewall mode paragraph and the migration ladder
to match the post-0.20.0 default.
tripwire.allow(...), tripwire.restrict(...), and tripwire.deny(...)
called outside any active sandbox now raise TripwireError with a
message pointing the user at [tool.tripwire.firewall] for
module-scoped rules. Previously these silently no-op'd
(frame pushed and popped without effect).

The check fires inside the generator body BEFORE
_firewall_stack.set(...), so the error raises on __enter__ and
leaves _firewall_stack unchanged. Uses _active_verifier.get() is
not None only (does NOT include _guard_active.get()), which
preserves the documented behavior of allowing tripwire.allow(...)
inside guard mode.
Adds tests verifying `with tripwire:` state propagates correctly
through asyncio.to_thread, asyncio.create_task, loop.run_in_executor,
asyncio.gather, and concurrent.futures.ThreadPoolExecutor.submit.

Documents in the README that ProcessPoolExecutor does not propagate
`with tripwire:` state (separate-process boundary). To use tripwire
across process pools, enter `with tripwire:` inside the worker
function itself.
…dash in README

- tests/unit/test_pedagogical_messages.py: C5-T2 used substring checks
  on 'subprocess' and 'run' against a message that already contains
  source_id 'subprocess:run'; both substrings were tautologically
  present. Switched to sentinel plugin/method names and assert the
  human-prose 'PLUGIN.METHOD' rendering joined.
- CHANGELOG.md line 17: C9 added the active-sandbox check to
  tripwire.deny() in addition to allow() and restrict(); the bullet
  only mentioned allow and restrict.
- README.md line 174: replaced em-dash in the C10 'Concurrency
  boundaries' subsection with a plain comma (project disallows
  em-dashes in user-facing prose).
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request renames the package to tripwire and introduces several breaking changes, including a stricter default guard mode and namespaced source identifiers. It also adds features for per-protocol guard granularity, async leak detection, and strict configuration validation. The review feedback suggests refining the plugin lookup mechanism to consistently use canonical names for configuration matching and error reporting, as well as unifying the validation logic for the new pytest guard marker.

Comment thread src/tripwire/_registry.py Outdated
Comment on lines +133 to +153
def lookup_plugin_class_by_name(plugin_name: str) -> type[BasePlugin] | None:
"""Return the plugin class registered under ``plugin_name``, or None.

Derives eligibility from the plugin's supports_guard ClassVar.
Replaces the old GUARD_ELIGIBLE_PREFIXES set.
Looks up by canonical registry name first, then by any ``guard_prefixes``
declared on a registered plugin class. Returns None when no plugin
matches or when its optional dependency is missing. Callers use this
from outside any active sandbox to ask "what plugin would receive a
call from this source_id?".
"""
# Build a cache on first call
if not hasattr(is_guard_eligible, "_cache"):
eligible: set[str] = set()
for entry in PLUGIN_REGISTRY:
if not entry.default_enabled:
continue
try:
cls = get_plugin_class(entry)
if getattr(cls, "supports_guard", True):
eligible.add(entry.name)
# Also add source_id prefixes that differ from registry name
for prefix in getattr(cls, "guard_prefixes", ()):
eligible.add(prefix)
except Exception:
pass
is_guard_eligible._cache = frozenset(eligible) # type: ignore[attr-defined]
return plugin_name in is_guard_eligible._cache # type: ignore[attr-defined]


def _clear_guard_eligible_cache() -> None:
"""Clear the is_guard_eligible cache. For testing only."""
if hasattr(is_guard_eligible, "_cache"):
del is_guard_eligible._cache
for entry in PLUGIN_REGISTRY:
if not _is_available(entry):
continue
try:
cls = get_plugin_class(entry)
except Exception:
continue
if entry.name == plugin_name:
return cls
if plugin_name in getattr(cls, "guard_prefixes", ()):
return cls
return None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The lookup_plugin_class_by_name function should return both the plugin class and its canonical registry name. This is necessary because the plugin_name passed in (derived from the source_id prefix, e.g., "db") may differ from the canonical name (e.g., "database") used as keys in the [tool.tripwire.guard] configuration table. Returning the canonical name allows get_verifier_or_raise to correctly look up per-protocol guard level overrides.

Suggested change
def lookup_plugin_class_by_name(plugin_name: str) -> type[BasePlugin] | None:
"""Return the plugin class registered under ``plugin_name``, or None.
Derives eligibility from the plugin's supports_guard ClassVar.
Replaces the old GUARD_ELIGIBLE_PREFIXES set.
Looks up by canonical registry name first, then by any ``guard_prefixes``
declared on a registered plugin class. Returns None when no plugin
matches or when its optional dependency is missing. Callers use this
from outside any active sandbox to ask "what plugin would receive a
call from this source_id?".
"""
# Build a cache on first call
if not hasattr(is_guard_eligible, "_cache"):
eligible: set[str] = set()
for entry in PLUGIN_REGISTRY:
if not entry.default_enabled:
continue
try:
cls = get_plugin_class(entry)
if getattr(cls, "supports_guard", True):
eligible.add(entry.name)
# Also add source_id prefixes that differ from registry name
for prefix in getattr(cls, "guard_prefixes", ()):
eligible.add(prefix)
except Exception:
pass
is_guard_eligible._cache = frozenset(eligible) # type: ignore[attr-defined]
return plugin_name in is_guard_eligible._cache # type: ignore[attr-defined]
def _clear_guard_eligible_cache() -> None:
"""Clear the is_guard_eligible cache. For testing only."""
if hasattr(is_guard_eligible, "_cache"):
del is_guard_eligible._cache
for entry in PLUGIN_REGISTRY:
if not _is_available(entry):
continue
try:
cls = get_plugin_class(entry)
except Exception:
continue
if entry.name == plugin_name:
return cls
if plugin_name in getattr(cls, "guard_prefixes", ()):
return cls
return None
def lookup_plugin_class_by_name(plugin_name: str) -> tuple[type[BasePlugin], str] | None:
"""Return the plugin class and canonical name registered under
``plugin_name``, or None.
Looks up by canonical registry name first, then by any ``guard_prefixes``
declared on a registered plugin class. Returns None when no plugin
matches or when its optional dependency is missing. Callers use this
from outside any active sandbox to ask "what plugin would receive a
call from this source_id?".
"""
for entry in PLUGIN_REGISTRY:
if not _is_available(entry):
continue
try:
cls = get_plugin_class(entry)
except Exception:
continue
if entry.name == plugin_name or plugin_name in getattr(cls, "guard_prefixes", ()):
return cls, entry.name
return None

Comment thread src/tripwire/_context.py Outdated
4. Guard not active but patches installed: raise GuardPassThrough.
5. Otherwise: raise SandboxNotActiveError.
"""
plugin_name = source_id.split(":")[0]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The plugin lookup and canonical name resolution should be moved higher in get_verifier_or_raise. This ensures that the canonical name is available for both the PostSandboxInteractionError (Branch 2) and the per-protocol guard level override lookup (Branch 3b).

Suggested change
plugin_name = source_id.split(":")[0]
from tripwire._registry import lookup_plugin_class_by_name # noqa: PLC0415
prefix = source_id.split(":")[0]
res = lookup_plugin_class_by_name(prefix)
plugin_cls, canonical_name = res if res else (None, prefix)

Comment thread src/tripwire/_context.py
from tripwire._errors import PostSandboxInteractionError # noqa: PLC0415

raise PostSandboxInteractionError(
source_id=source_id,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use the resolved canonical_name instead of the source_id prefix for better error reporting.

Suggested change
source_id=source_id,
plugin_name=canonical_name,

Comment thread src/tripwire/_context.py Outdated
Comment on lines +144 to +150
from tripwire._registry import ( # noqa: PLC0415
lookup_plugin_class_by_name,
)
plugin_cls = lookup_plugin_class_by_name(plugin_name)
plugin_is_unsafe_passthrough = (
plugin_cls is not None and plugin_cls.passthrough_safe is False
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The redundant lookup of plugin_cls can be removed since it is now resolved at the top of the function.

Suggested change
from tripwire._registry import ( # noqa: PLC0415
lookup_plugin_class_by_name,
)
plugin_cls = lookup_plugin_class_by_name(plugin_name)
plugin_is_unsafe_passthrough = (
plugin_cls is not None and plugin_cls.passthrough_safe is False
)
plugin_is_unsafe_passthrough = (
plugin_cls is not None and plugin_cls.passthrough_safe is False
)

Comment thread src/tripwire/_context.py Outdated
# === Branch 3b: DENY ===
# Per-protocol or default guard level (C3).
guard_levels = _guard_levels.get()
level = guard_levels.overrides.get(plugin_name, guard_levels.default)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The override lookup must use the canonical_name (e.g., "database") to match the keys validated in the [tool.tripwire.guard] configuration table. Using the source_id prefix (e.g., "db") causes overrides for several plugins to be ignored.

Suggested change
level = guard_levels.overrides.get(plugin_name, guard_levels.default)
level = guard_levels.overrides.get(canonical_name, guard_levels.default)

Comment thread src/tripwire/_context.py
from tripwire._errors import UnsafePassthroughError # noqa: PLC0415

raise UnsafePassthroughError(
source_id=source_id,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use the resolved canonical_name for the UnsafePassthroughError to ensure the error message correctly identifies the plugin.

Suggested change
source_id=source_id,
plugin_name=canonical_name,

Comment thread src/tripwire/_context.py Outdated
user_frame = walk_to_user_frame()
raise GuardedCallError(
source_id=source_id,
plugin_name=plugin_name,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use the resolved canonical_name for the GuardedCallError to ensure the error message correctly identifies the plugin.

Suggested change
plugin_name=plugin_name,
plugin_name=canonical_name,

Comment thread src/tripwire/_context.py Outdated
user_frame = walk_to_user_frame()
raise GuardedCallError(
source_id=source_id,
plugin_name=plugin_name,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use the resolved canonical_name for the GuardedCallError to ensure the error message correctly identifies the plugin.

Suggested change
plugin_name=plugin_name,
plugin_name=canonical_name,

Comment thread src/tripwire/pytest_plugin.py Outdated
Comment on lines +297 to +301
if isinstance(arg, str):
marker_levels = GuardLevels(default=arg, overrides={})
elif isinstance(arg, dict):
marker_levels = _resolve_guard_levels({"guard": arg})
else:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The string argument to @pytest.mark.guard should be normalized and validated using the existing _resolve_guard_levels logic. This prevents issues where mixed-case strings (e.g., "Warn") fail to match the literal checks in the dispatch path, leading to unexpected errors.

Suggested change
if isinstance(arg, str):
marker_levels = GuardLevels(default=arg, overrides={})
elif isinstance(arg, dict):
marker_levels = _resolve_guard_levels({"guard": arg})
else:
if isinstance(arg, (str, dict, bool)):
marker_levels = _resolve_guard_levels({"guard": arg})

elijahr added 11 commits April 27, 2026 12:17
Resolves the 90-file conflict between this branch's bigfoot ->
tripwire rename and PR #56's drop of the _mock suffix from plugin
proxy attribute names. The merged form takes both changes:
tripwire.subprocess.mock_run(...) instead of either
bigfoot.subprocess.mock_run(...) (main) or
tripwire.subprocess_mock.mock_run(...) (this branch).

CHANGELOG 0.20.0 section now includes both PR #56's _mock-drop
entry and this branch's rename + 10 design proposal entries.
tomllib is stdlib only on Python 3.11+. The test_mixed_scalar_and_table_rejected_by_tomllib
test had a bare import inside the function body that broke collection on Python 3.10
(both ubuntu and windows). Switch to the existing tripwire._compat.tomllib shim that
the rest of the suite uses, hoisted to a module-level import.
On Python 3.14, multiprocessing's default start method on Linux is
'forkserver'. Forkserver setup creates a Unix-domain listener socket
*in the parent process* (multiprocessing/forkserver.py: socket.socket(AF_UNIX)).
When that listener exits its with-block, socket.close() fires, which
the SocketPlugin intercepts inside the parent's active sandbox, raising
UnmockedInteractionError before the worker process is even spawned.

The test only cares about the documented separate-process boundary.
Forcing 'spawn' bypasses the parent-side socket dance and makes the
contract explicit: every worker is a fresh interpreter that inherits
no tripwire state. macOS already defaults to spawn so behavior there
is unchanged; Linux 3.14 now matches.
The free-threaded build (3.14t) has no cryptography wheel, so
CryptoPlugin is not registered there. Guard dispatch in
get_verifier_or_raise routes through lookup_plugin_class_by_name(); when
the plugin is absent the lookup returns None and Branch 3 (guard
active) is skipped, sending the call to Branch 5
(SandboxNotActiveError). That broke seven tests on 3.14t that used
'crypto:sign' purely as a stand-in for 'any passthrough_safe=True
plugin'.

Replace the stand-in with 'jwt' (JwtPlugin is also passthrough_safe and
pyjwt IS in the all-ft extra). For test_dns_strict_http_warn, which
needs two distinct plugins, use 'logging' (always-available, safe) as
the per-protocol 'error' override target and keep 'jwt' for the default
'warn' case. Test intent is preserved across all Python builds.
The class-level set SandboxContext._active_sandbox_ids was mutated and
read without explicit synchronization. The previous comment claimed
"add() / discard() of distinct keys are independent under the GIL; no
lock required" — true on stock CPython, but PEP 703 (free-threaded
3.14t) removes that implicit serialization. Concurrent set.add /
set.discard / `in` against a shared set on the free-threaded build can
corrupt the hash table, making `__contains__` loop forever. This caused
the 3.14t CI jobs on ubuntu and windows to hang.

Wrap all three mutation sites (the add in _enter, the discard in the
_enter error path, and the discard in _exit) and the `in` check in
_detect_post_sandbox with a class-level threading.Lock. The lock is
uncontended on the GIL build, so this is effectively a no-op there.

Verified:
- Stress test with 32 threads x 500 iterations on 3.14t (PYTHON_GIL=0)
  completes cleanly with the active set drained back to empty.
- Full test suite passes on stock CPython 3.14 (1414 passed).
- Free-threaded 3.14t test suite (PYTHON_GIL=0) passes (1297 passed,
  5 skipped) in ~14s.
ProcessPoolExecutor spawn deadlocks on Windows free-threaded Python 3.14t,
hanging the executor's shutdown indefinitely. This is an upstream CPython
multiprocessing/free-threading interaction unrelated to tripwire. The
documented separate-process boundary that this test validates is still
exercised on every other platform, including Linux 3.14t.
The full module exercises asyncio/threading boundary primitives (asyncio.to_thread,
ProcessPoolExecutor, ThreadPoolExecutor, etc.). On Windows free-threaded 3.14 these
primitives deadlock, blocking the runner with no useful diagnostic. The propagation
contracts are still validated on every other platform, including Linux 3.14t. Replaces
the per-test processpool skip with a module-level skipif.
Per-protocol guard overrides under [tool.tripwire.guard] silently
failed to apply when the plugin's source_id prefix differed from its
canonical registry name. The dispatch in get_verifier_or_raise keyed
the override lookup on source_id.split(":")[0] (the prefix), but
overrides are validated and stored under the canonical registry name.

Affected plugins: database (prefix "db"), async_subprocess (prefix
"asyncio"), async_websocket and sync_websocket (overlapping prefix
"websocket"). A user writing database = "off" with a db:query call
saw UnsafePassthroughError instead of the expected suppression, and
GuardedCallError / UnsafePassthroughError / PostSandboxInteractionError
all reported plugin_name as the prefix instead of the canonical name.

Change lookup_plugin_class_by_name to return (cls, canonical_name) so
callers can distinguish the lookup key from the registry name. Hoist
the lookup in get_verifier_or_raise to the top of the function and
plumb canonical_name into every override lookup and error constructor
that previously used the prefix. When the source_id does not match a
registered plugin, fall back to the prefix as the displayed name to
preserve the pre-C2 contract for unknown source_ids.

Update test_database_connect_guard_blocks_when_not_allowed which had
asserted the buggy behavior (plugin_name == "db") to assert the
canonical form (plugin_name == "database").
The marker handler in pytest_runtest_call did
GuardLevels(default=arg, overrides={}) for string arguments, with no
case folding, no alias resolution, and no validation. Mixed-case
strings like "Warn" or aliases like "STRICT" slipped past the marker
handler and only failed later in dispatch with an opaque error (the
level == "warn" / level == "error" arms in get_verifier_or_raise
would not match, and behavior fell through unpredictably).

Route every accepted marker shape (string, bool, dict) through
_resolve_guard_levels so the marker uses the same normalization,
alias mapping (strict -> error), case folding, and validation as
[tool.tripwire.guard] in pyproject.toml. The string case is now a
proper subset of the dict case and no longer needs its own branch.
Five regression tests covering the two fixes:

- test_override_on_canonical_name_applies_to_prefix_source_id:
  database = "off" suppresses a db:query call. Without the fix this
  would raise UnsafePassthroughError (DatabasePlugin is
  passthrough_safe=False, so warn-default raises rather than passes
  through).
- test_error_reports_canonical_plugin_name_not_prefix:
  GuardedCallError.plugin_name reports "database", not "db", so users
  can match it against the same name they wrote in
  [tool.tripwire.guard].
- test_config_rejects_prefix_as_override_key:
  Asserts the inverse direction is also enforced: writing db = "off"
  is rejected by _resolve_guard_levels (only canonical names are
  valid override keys).
- test_marker_normalizes_mixed_case_warn:
  @pytest.mark.guard("Warn") is normalized to "warn" via
  _resolve_guard_levels. Verified by observing the warn-branch
  signal (UnsafePassthroughError on subprocess) instead of the
  error-branch signal (GuardedCallError) that the project default
  "error" would produce.
- test_marker_normalizes_strict_alias:
  @pytest.mark.guard("STRICT") lowercases and aliases to "error";
  proven by GuardedCallError under a project default of "warn".

The first three tests fail without the canonical-name fix in
get_verifier_or_raise. The fourth fails without the marker fix in
pytest_runtest_call.
On PEP 703 free-threaded CPython 3.14+, sys.flags.thread_inherit_context
is True and a new thread inherits its base context from the spawning
thread. When a ThreadPoolExecutor worker was spawned during a sandbox,
its base context carried _active_verifier (and other tripwire
ContextVars) for the rest of its life.

concurrent.futures._WorkItem.run invokes future.set_result() AFTER
ctx.run(self.task) returns, so future done-callbacks run in the worker's
BASE context, not the captured caller context. asyncio's _call_set_state
done-callback chains to loop.call_soon_threadsafe -> loop._write_to_self
-> csock.send(b'\0') on the loop's internal self-pipe socket. With the
verifier still active in the worker's base context, our SocketPlugin
patched send raised UnmockedInteractionError. Future._invoke_callbacks
swallowed the exception, the wakeup byte was never written, and the
asyncio event loop hung in selector.select() forever.

Fix: run _original_submit inside an empty contextvars.Context so any
worker thread spawned by _adjust_thread_count starts with a clean base
context. The work item itself still sees the captured caller context
because ctx.run(fn) is what executes the user's function; only the
post-work-item bookkeeping (set_result, done-callbacks, idle wait) runs
in the empty base context.

Manifested as intermittent ~30%-rate hangs of
tests/integration/test_contextvar_propagation.py on Linux 3.14t CI.
Locally reproduced at the same rate on macOS arm64 3.14t prior to the
fix; 50/50 passes after.

Adds a regression test that asserts a TPE worker's done-callback sees
the default ContextVar value, not the value the parent set before
submit. The test catches the bug on both stock and free-threaded
builds (the inheritance behaviour is identical for done-callbacks).
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist could you take another look? I addressed the 9 review comments from your initial review:

  • Threaded canonical_name from lookup_plugin_class_by_name through every error-construction site and the per-protocol override lookup in _context.py:get_verifier_or_raise (commit 1d8c696). Comments 1-8 fixed; this was a real user-visible bug for the 4 plugins where guard_prefixes differs from canonical name (database/db, async_subprocess/asyncio, async_websocket/websocket, sync_websocket/websocket).
  • Routed @pytest.mark.guard(arg) strings/dicts through _resolve_guard_levels for unified normalization, alias resolution, and validation (commit c49cffd). Comment 9 fixed.
  • Added tests/integration/test_guard_canonical_name.py with 5 regression tests covering both fixes (commit d0a8f12).

Also pushed since the original review:

  • bc83000 lock for _active_sandbox_ids under PEP 703 free-threading.
  • 74c8116, 8e18bc2, ec6b73f, a77066e, 25d4750 various CI fixes for Python 3.10/3.14/3.14t.
  • 50bee68 empty-context wrap of executor worker submit, fixing a real PEP 703 context-leak deadlock through asyncio internals.

All 13 CI jobs green. Thanks for the catch on the canonical-name bug.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request performs a major rebranding and architectural update, renaming the package from bigfoot to tripwire and introducing several breaking changes, including flipping the default guard mode from 'warn' to 'error'. It enhances the sandbox and firewall logic with improved async leak detection, per-protocol guard levels, and strict configuration validation. A significant fix for thread pool context propagation on free-threaded Python is also included. Feedback suggests an optimization opportunity in the plugin lookup logic within the core dispatch path to improve performance for high-frequency intercepted calls.

Comment thread src/tripwire/_context.py
Comment on lines +118 to +134
prefix = source_id.split(":")[0]

# Resolve the plugin class once, up front, so the canonical registry
# name is available to every downstream branch. ``plugin_cls is None``
# means the source_id does not belong to a registered plugin (e.g.,
# a test exercising get_verifier_or_raise with an arbitrary name);
# in that case, fall back to the source_id prefix as the displayed
# plugin_name so unknown source_ids preserve the pre-C2 contract.
#
# When the plugin IS registered, ``canonical_name`` is the registry
# name (e.g., ``"database"``) even when ``prefix`` matched via a
# ``guard_prefix`` like ``"db"``. This is what per-protocol guard
# overrides under ``[tool.tripwire.guard]`` key on, and what error
# messages report.
from tripwire._registry import ( # noqa: PLC0415
lookup_plugin_class_by_name,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for resolving the plugin class and canonical name from a source_id prefix is repeated in several branches of get_verifier_or_raise. While correct, it relies on lookup_plugin_class_by_name which iterates through the registry. For high-frequency intercepted calls (like logging or socket operations), this lookup could be optimized by caching the mapping from prefix to (plugin_cls, canonical_name) at the module level or inside the registry.

lookup_plugin_class_by_name runs on every intercepted call from a
sandbox via get_verifier_or_raise. The uncached implementation
iterates ~27 registry entries, runs availability checks (which
import modules), and imports the plugin class on every dispatch,
which is wasteful for high-frequency interceptors like logging
and socket.

Add a module-level dict cache keyed by the queried name, with
negative caching via a sentinel so unknown names are not retried.
Cache access is wrapped in a threading.Lock so free-threaded
Python (3.14t) cannot observe a torn dict mutation under
concurrent first-fill pressure.

The plugin registry is effectively immutable for the life of the
process (entries are added at decorator import time; availability
flips require installing a new package and restarting), so a lazy
populate-once cache is sound. Tests that monkeypatch
PLUGIN_REGISTRY can call _clear_lookup_cache() to flush stale
entries.

Add unit tests covering tuple identity on cache hit, negative
caching, cache invalidation, guard_prefix resolution, and a
20-thread x 100-iteration concurrent-lookup safety check.
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist could you take another look? I addressed the lookup-cache comment from your second review:

  • Added module-level _lookup_cache: dict[str, tuple[type[BasePlugin], str] | None] in _registry.py with _UNSET sentinel for negative caching, and a threading.Lock for free-threaded safety (matches the _active_sandbox_ids_lock pattern). lookup_plugin_class_by_name now consults the cache on entry and stores its result on exit; the registry is module-frozen at import time so a populate-once cache is correct (commit 3fbde61).
  • Added 7 unit tests in tests/unit/test_registry.py::TestLookupPluginClassByNameCache covering known-name tuple shape, identity-on-cache-hit, negative caching, invalidation via _clear_lookup_cache(), guard_prefix → canonical name resolution, and a 20-thread × 100-iteration concurrency test asserting all 2000 results are the same tuple instance.

All 13 CI jobs green.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request completes the renaming of the project from 'bigfoot' to 'tripwire' across the entire codebase, including documentation, configuration, and source code. Additionally, it implements several improvements: defaulting guard mode to 'error', adding per-protocol guard granularity, and improving error reporting for outside-sandbox calls. I have identified two actionable improvements: replacing hardcoded absolute paths in the plugin creation skill with a placeholder and clarifying the plugin count in the README.

Comment thread .claude/skills/adding-plugins/SKILL.md Outdated

```bash
cd /Users/elijahrutschman/Development/bigfoot && uv run pytest tests/unit/test_[name]_plugin.py -v
cd /Users/elijahrutschman/Development/tripwire && uv run pytest tests/unit/test_[name]_plugin.py -v
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This command contains a hardcoded absolute path (/Users/elijahrutschman/Development/tripwire). This makes the skill not portable for other developers. Consider replacing it with a placeholder like [PROJECT_ROOT] or using a relative path. This applies to other similar paths in this file as well.

Suggested change
cd /Users/elijahrutschman/Development/tripwire && uv run pytest tests/unit/test_[name]_plugin.py -v
cd [PROJECT_ROOT] && uv run pytest tests/unit/test_[name]_plugin.py -v

Comment thread README.md
## Plugins

bigfoot ships with 27 plugins covering the most common external dependencies:
tripwire ships with 27 plugins covering the most common external dependencies:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The text states there are 27 plugins, but the table below lists 26 plugin categories. The WebSocket entry in the table seems to represent two plugins (AsyncWebSocketPlugin and SyncWebSocketPlugin). To avoid confusion, you might want to either update the count to 26 or split the WebSocket entry into two to match the total of 27.

Replace hardcoded absolute paths in adding-plugins SKILL.md with
[PROJECT_ROOT] placeholder so the skill is portable across machines.

Split WebSocket entry in README plugins table into AsyncWebSocketPlugin
and SyncWebSocketPlugin so the table row count (27) matches the prose
"ships with 27 plugins" claim and the plugin classes in
src/tripwire/plugins/.
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist could you take another look? I addressed both items from your third review (commit a5eda1d):

  • .claude/skills/adding-plugins/SKILL.md: replaced 4 hardcoded /Users/elijahrutschman/Development/tripwire paths with [PROJECT_ROOT] (lines 248, 415, 420, 469). Final grep for /Users/ returns no matches.
  • README.md: split the WebSocket entry into separate AsyncWebSocketPlugin + SyncWebSocketPlugin rows. Authoritative count from src/tripwire/plugins/*.py is 27 (verified by grep -hE "^class [A-Za-z0-9_]+Plugin\((BasePlugin|StateMachinePlugin)" src/tripwire/plugins/*.py | wc -l); the prose count of 27 now matches the table.

All 13 CI jobs green.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request completes the migration from 'bigfoot' to 'tripwire', including renaming the package, updating documentation, and adjusting internal references. It also introduces several improvements: defaulting guard mode to 'error', adding per-protocol guard overrides, and improving error messaging for outside-sandbox calls. I have identified two inconsistencies in the API reference documentation where the old _mock suffix was still referenced for the subprocess plugin proxy.

Comment thread docs/reference/index.md Outdated
| `BigfootError` | exception | Base class for all bigfoot exceptions. |
| `HttpPlugin` | class | Intercepts `httpx`, `requests`, and `urllib` HTTP calls. Requires `tripwire[http]`. Import from `tripwire.plugins.http`. |
| `SubprocessPlugin` | class | Intercepts `subprocess.run` and `shutil.which`. Included in core tripwire. Import from `tripwire.plugins.subprocess`. |
| `subprocess_mock` | proxy | Module-level proxy to `SubprocessPlugin` for the current test. Auto-creates the plugin on first access. |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The proxy name for SubprocessPlugin should be subprocess, not subprocess_mock. The changelog indicates that _mock suffixes were dropped from plugin proxies in this version.

Suggested change
| `subprocess_mock` | proxy | Module-level proxy to `SubprocessPlugin` for the current test. Auto-creates the plugin on first access. |
| `subprocess` | proxy | Module-level proxy to `SubprocessPlugin` for the current test. Auto-creates the plugin on first access. |

Comment thread docs/reference/index.md Outdated
Comment on lines +35 to +36
| `SubprocessRunSentinel` | `tripwire.plugins.subprocess` | Opaque handle returned by `subprocess_mock.run`. Used as source in `assert_interaction()`. |
| `SubprocessWhichSentinel` | `tripwire.plugins.subprocess` | Opaque handle returned by `subprocess_mock.which`. Used as source in `assert_interaction()`. |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The proxy name should be subprocess, not subprocess_mock. The _mock suffix was dropped from plugin proxies.

Suggested change
| `SubprocessRunSentinel` | `tripwire.plugins.subprocess` | Opaque handle returned by `subprocess_mock.run`. Used as source in `assert_interaction()`. |
| `SubprocessWhichSentinel` | `tripwire.plugins.subprocess` | Opaque handle returned by `subprocess_mock.which`. Used as source in `assert_interaction()`. |
| `SubprocessRunSentinel` | `tripwire.plugins.subprocess` | Opaque handle returned by `subprocess.run`. Used as source in `assert_interaction()`. |
| `SubprocessWhichSentinel` | `tripwire.plugins.subprocess` | Opaque handle returned by `subprocess.which`. Used as source in `assert_interaction()`. |

The _mock suffix was dropped from plugin proxy names; index.md still showed
subprocess_mock for the proxy and in two SubprocessRunSentinel/
SubprocessWhichSentinel description lines.
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist could you take another look? Both items from your fourth review addressed in docs/reference/index.md (commit 3c6306f):

  • Line 15 proxy row: subprocess_mocksubprocess
  • Lines 35-36 sentinel descriptions: subprocess_mock.runsubprocess.run, subprocess_mock.whichsubprocess.which

Also swept the rest of docs/ for stray *_mock\b proxy references — only the three on those lines existed; the remaining _mock matches in docs/ are legitimate (the tripwire._mock_plugin module path, test-function names like test_async_mock, and a local variable es_mock). All 13 CI jobs green.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request performs a comprehensive rename of the project from 'bigfoot' to 'tripwire', updating all internal package references, public API symbols, configuration tables, and documentation. It also introduces several significant improvements: defaulting guard mode to 'error', adding a 'passthrough_safe' gate for plugins, per-protocol guard overrides, improved async error reporting, and strict TOML configuration validation.

elijahr added 2 commits April 28, 2026 15:44
The PyPI name `tripwire` is unavailable (likely Tripwire Inc. trademark
reservation). Switch the dist name to `python-tripwire` while keeping the
Python import name `tripwire` and all internal identifiers unchanged.

Changes:
- pyproject.toml: name = "python-tripwire"; self-references in extras and
  dev groups; project.urls switched to axiomantic/python-tripwire
- README, CONTRIBUTING, SECURITY, CODE_OF_CONDUCT, mkdocs.yml: GitHub URLs
  switched to axiomantic/python-tripwire; README PyPI badge points at the
  new dist
- All `pip install tripwire[...]` and `Install tripwire[...]` strings in
  docs, AGENTS.md, .claude skills, src/ ImportError messages, and matching
  test assertions changed to `python-tripwire[...]`
- writing-plugins.md: 3rd-party plugin example keeps `tripwire-myservice`
  (hypothetical 3rd-party name) but installs `python-tripwire` as the dep
- CHANGELOG: add a 0.20.0 Changed bullet documenting the dist-name choice
- test_project_structure: assert name == "python-tripwire"

Unchanged: src/tripwire/ directory, all `import tripwire` / `from tripwire`,
[tool.tripwire] / [tool.tripwire.guard] config tables, pytest fixtures,
sentinel source_ids, and pytest11 entry-point name.
importlib.metadata.distribution() / version() resolve by PyPI dist name,
not import name. After the dist rename to python-tripwire, the smoke test
assertions had to follow.
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist could you take another look? PyPI rejected tripwire (likely Tripwire Inc. trademark conflict), so the dist name is now python-tripwire. Two commits:

  • be90f46 "Rename PyPI dist to python-tripwire" — pyproject.toml [project] name, all tripwire[extra] self-refs, all pip install tripwire[...] strings in README/AGENTS/docs/skills, all axiomantic/tripwire URLs across pyproject + 5 metadata files (mkdocs, README, CONTRIBUTING, CODE_OF_CONDUCT, SECURITY) → axiomantic/python-tripwire, in-source error messages and tests asserting them, plus a CHANGELOG bullet. The Python import name remains tripwire.
  • 809af40 "Look up dist metadata by python-tripwire in smoke test" — importlib.metadata.distribution("tripwire") / version("tripwire") resolves by PyPI dist name; the smoke test had to follow.

Audit clean: no pip install tripwire[...] anywhere; no axiomantic/tripwire URLs (CHANGELOG retains axiomantic/bigfoot history, intentional). All 13 CI jobs green.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request renames the package to tripwire, flips the default guard mode to error, and introduces new error types and configuration options for enhanced sandbox control and async leak detection. It also fixes context propagation in ThreadPoolExecutor.submit for free-threaded Python. The reviewer provided feedback on several performance optimization opportunities, including reducing overhead in plugin lookups and minimizing lock contention in global state management. Additionally, there is a suggestion to improve the robustness of string parsing for source identifiers.

Comment thread src/tripwire/_context.py
Comment on lines +120 to +136
# Resolve the plugin class once, up front, so the canonical registry
# name is available to every downstream branch. ``plugin_cls is None``
# means the source_id does not belong to a registered plugin (e.g.,
# a test exercising get_verifier_or_raise with an arbitrary name);
# in that case, fall back to the source_id prefix as the displayed
# plugin_name so unknown source_ids preserve the pre-C2 contract.
#
# When the plugin IS registered, ``canonical_name`` is the registry
# name (e.g., ``"database"``) even when ``prefix`` matched via a
# ``guard_prefix`` like ``"db"``. This is what per-protocol guard
# overrides under ``[tool.tripwire.guard]`` key on, and what error
# messages report.
from tripwire._registry import ( # noqa: PLC0415
lookup_plugin_class_by_name,
)
res = lookup_plugin_class_by_name(prefix)
plugin_cls, canonical_name = res if res is not None else (None, prefix)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The get_verifier_or_raise function performs a plugin lookup on every intercepted call. While lookup_plugin_class_by_name is cached, the initial call to split the source_id and resolve the plugin class still adds overhead to the hot path of every intercepted interaction (e.g., every socket.send, logging.debug, etc.).

Consider if the plugin_cls and canonical_name can be resolved once at the interceptor level and passed in, or if the source_id itself can be used more directly to avoid repeated string splitting and dictionary lookups in the common case where a sandbox is active.

Comment thread src/tripwire/_registry.py
Comment on lines +181 to +215
matches a ``guard_prefix`` instead (e.g., ``plugin_name="db"`` resolves
to ``("DatabasePlugin", "database")``). Callers MUST use the canonical
name when looking up per-protocol guard overrides and when populating
``plugin_name`` on errors so the user sees the registry name.

Results are cached at module level: this function is on the hot path
for every intercepted call, and the registry is effectively immutable
after import. Tests that mutate ``PLUGIN_REGISTRY`` must call
``_clear_lookup_cache()`` for changes to be observed.
"""
with _lookup_cache_lock:
cached = _lookup_cache.get(plugin_name, _UNSET)
if cached is not _UNSET:
# mypy cannot narrow through the sentinel object; we know the
# cached value is the tuple-or-None payload, not the sentinel.
return cached # type: ignore[return-value]

result: tuple[type[BasePlugin], str] | None = None
for entry in PLUGIN_REGISTRY:
if not _is_available(entry):
continue
try:
cls = get_plugin_class(entry)
except Exception:
continue
if entry.name == plugin_name or plugin_name in getattr(
cls, "guard_prefixes", ()
):
result = (cls, entry.name)
break

with _lookup_cache_lock:
_lookup_cache[plugin_name] = result
return result

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The lookup_plugin_class_by_name function uses a threading.Lock for every access to the _lookup_cache. In high-concurrency scenarios with many intercepted calls, this lock could become a point of contention, especially on the PEP 703 free-threaded build where the GIL is absent.

Since the registry is effectively immutable after the initial discovery phase, consider using a pattern that avoids locking on the read path after the cache is populated, such as a double-checked locking pattern or simply populating the cache during session initialization.

Comment thread src/tripwire/_verifier.py
Comment on lines +455 to +462
# mutations MUST be performed while holding _active_sandbox_ids_lock.
# Stock CPython's GIL implicitly serializes set operations, but the
# free-threaded build (PEP 703) does not — concurrent set.add /
# set.discard / `in` checks against a shared `set` can corrupt its
# internal hash table and hang in `__contains__`. The lock is
# uncontended on the GIL build, so this is effectively a no-op there.
_active_sandbox_ids: ClassVar[set[int]] = set()
_active_sandbox_ids_lock: ClassVar[threading.Lock] = threading.Lock()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The use of a process-wide set (_active_sandbox_ids) protected by a threading.Lock to track active sandboxes is correct for thread safety but introduces a global synchronization point. In a free-threaded environment, this could limit scalability if many sandboxes are being entered and exited concurrently across different threads.

While likely acceptable for a testing library, consider if a more decentralized approach (e.g., per-thread tracking where possible) could reduce this global contention.

Comment thread src/tripwire/_errors.py
Comment on lines +279 to +282
if ":" in self.source_id:
method = self.source_id.split(":", 1)[1]
else:
method = "<unknown method>"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for extracting the method name from source_id assumes a specific format (plugin:method). If a plugin uses a different source_id format (e.g., with more colons like asyncio:subprocess:spawn), the current split logic might return an unexpected string (e.g., subprocess:spawn instead of just spawn).

Consider using rsplit(':', 1) if the intent is always to get the last segment, or ensure all plugins follow a consistent naming convention that works with this split.

References
  1. General rule: ensure string manipulation logic is robust against variations in input format that follow the same general structure. (link)

Address Gemini review feedback (sixth pass) on PR #57.

_errors.py: GuardedCallError._build_message used split(':', 1) to extract
the method name from source_id, which produced 'subprocess:spawn' for a
multi-colon id like 'asyncio:subprocess:spawn'. Switch to rsplit(':', 1)
so the trailing segment is always treated as the method name regardless
of how many leading namespace segments precede it.

_registry.py: PLUGIN_REGISTRY is module-frozen at import time, so the
hot-path lookup cache can be eagerly populated once during module load
and read lock-free thereafter. Add _populate_lookup_cache(), call it
exactly once at the bottom of the module, and rewrite
lookup_plugin_class_by_name so the read path is a plain dict.get with
no lock acquisition for known canonical names and guard prefixes. The
lock is only taken for never-before-seen names to memoize the negative
result. _clear_lookup_cache re-populates the eager entries after
clearing so test teardown leaves the cache in the same lock-free
state. Document the thread-safety contract in the cache comment block.

_verifier.py: SandboxContext._active_sandbox_ids stays process-wide.
Add a comment block explaining that post-sandbox detection (Branch 2
of get_verifier_or_raise) must see across thread boundaries: worker
threads spawned inside a sandbox can outlive its __exit__, and a
per-thread tracker would silently misclassify their late interactions
as 'no active sandbox' instead of 'post-sandbox', defeating
PostSandboxInteractionError's diagnostic purpose. Sandbox enter/exit
runs at per-test boundaries, so lock contention is microsecond-scale
and effectively theoretical.

Tests:
- test_errors.py: regression tests covering multi-colon, single-colon,
  and no-colon source_id rendering for GuardedCallError.
- test_registry.py: new tests verifying the read path is lock-free for
  known names (patches the lock with a sentinel that fails on acquire),
  that unknown names take the lock exactly once for negative caching,
  and that every available canonical name is eagerly seeded after import.

Quality gates locally: mypy --strict clean, ruff clean, full pytest
suite passes (1883 tests).
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist could you take another look? Round 6 commit c78b274:

#4 (_errors.py:282 source_id robustness): Fixed. GuardedCallError._build_message now uses rsplit(":", 1)[1] so the LAST segment is authoritative regardless of how many colons appear in source_id. Added regression tests in tests/unit/test_errors.py for the multi-colon case ("asyncio:subprocess:spawn" → method "spawn").

#2 (_registry.py lock contention on read path): Fixed by eager population. Added _populate_lookup_cache() that runs once at module import time, seeding _lookup_cache with every available plugin (canonical name + each guard_prefixes entry). lookup_plugin_class_by_name now returns lock-free on the hot path: _lookup_cache.get(name, _UNSET) → if not _UNSET, return immediately. The lock is only acquired on the cold path (never-before-seen-name → negative-cache write). Tests in TestLookupPluginClassByNameCache extended to cover this.

#1 (split overhead on hot path): Indirectly addressed by #2 — the lookup is now genuinely lock-free on the dispatch hot path, so the per-interception cost is dominated by interception itself rather than registry lookups. Restructuring every interceptor to pass (plugin_cls, canonical_name) would be a much bigger architectural change for nanosecond-scale benefit; left as-is. Documented the analysis inline.

#3 (_active_sandbox_ids global lock): Keeping the process-wide design is intentional, with a new comment block on the declaration explaining why. Rationale: Branch 2 of get_verifier_or_raise (post-sandbox detection) must see across thread boundaries — worker threads spawned inside with verifier.sandbox(): can outlive the sandbox __exit__, and their late interactions need to be classified as PostSandboxInteractionError rather than silently degrading to SandboxNotActiveError. A per-thread tracker would misclassify those workers and lose the diagnostic precision that error type exists to provide. The lock contention is largely theoretical: sandbox enter/exit happens at with boundaries (per-test), not per-call; the hot-path read is a brief set.__contains__.

All 13 CI jobs green; suite is now 1883 passed.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request renames the package to tripwire, flips the default guard mode to error, and introduces per-protocol guard configuration and specialized error types for async leaks and unsafe passthroughs. The review feedback suggests refining the punctuation in configuration error messages and including user call site information in the new UnsafePassthroughError and PostSandboxInteractionError types to provide more actionable debugging information, consistent with other pedagogical improvements in the system.

Comment thread src/tripwire/_config.py Outdated
continue
suggestion = _format_suggestion(key, allowed)
raise TripwireConfigError(
f"Unknown key {key!r} in [tool.tripwire].{suggestion}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error message for an unknown key in [tool.tripwire] includes a trailing dot before the typo suggestion, which results in awkward punctuation like ...[tool.tripwire]. (did you mean: ...?). If no suggestion is available, it ends with a dot. It would be cleaner to move the punctuation into the suggestion logic or handle the empty case more gracefully.

        raise TripwireConfigError(
            f"Unknown key {key!r} in [tool.tripwire]{suggestion or '.'}"
        )

Comment thread src/tripwire/_config.py Outdated
suggestion = _format_suggestion(key, VALID_PLUGIN_NAMES)
raise TripwireConfigError(
f"Unknown protocol {key!r} in [tool.tripwire.guard]."
f"{suggestion}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the top-level schema validation, the error message for an unknown protocol in [tool.tripwire.guard] has a trailing dot that conflicts with the formatting of the typo suggestion.

                raise TripwireConfigError(
                    f"Unknown protocol {key!r} in [tool.tripwire.guard]{suggestion or '.'}"
                )

Comment thread src/tripwire/_context.py
if closed_sandbox_id is not None:
from tripwire._errors import PostSandboxInteractionError # noqa: PLC0415

raise PostSandboxInteractionError(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The PostSandboxInteractionError is raised here without capturing the user call site. Including the file and line number where the late interaction originated would significantly improve async leak debugging, consistent with the pedagogical improvements made to GuardedCallError in Proposal 5.

Suggested change
raise PostSandboxInteractionError(
from tripwire._errors import PostSandboxInteractionError # noqa: PLC0415
from tripwire._frames import walk_to_user_frame # noqa: PLC0415
raise PostSandboxInteractionError(
source_id=source_id,
plugin_name=canonical_name,
sandbox_id=closed_sandbox_id,
user_frame=walk_to_user_frame(),
)

Comment thread src/tripwire/_context.py
if plugin_is_unsafe_passthrough:
from tripwire._errors import UnsafePassthroughError # noqa: PLC0415

raise UnsafePassthroughError(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The UnsafePassthroughError is raised here without the user call site. Since this error indicates a potential side effect occurring outside a sandbox, providing the exact location in the user's code is just as important as it is for GuardedCallError.

Suggested change
raise UnsafePassthroughError(
if plugin_is_unsafe_passthrough:
from tripwire._errors import UnsafePassthroughError # noqa: PLC0415
from tripwire._frames import walk_to_user_frame # noqa: PLC0415
raise UnsafePassthroughError(
source_id=source_id,
plugin_name=canonical_name,
user_frame=walk_to_user_frame(),
)

Comment thread src/tripwire/_errors.py Outdated
Comment on lines +481 to +485
def __init__(self, source_id: str, plugin_name: str) -> None:
self.source_id = source_id
self.plugin_name = plugin_name
super().__init__(self._build_message())

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To support the pedagogical improvements suggested for UnsafePassthroughError, the constructor should accept an optional user_frame and the message builder should be updated to include the call site details.

    def __init__(
        self,
        source_id: str,
        plugin_name: str,
        user_frame: tuple[str, int, str] | None = None,
    ) -> None:
        self.source_id = source_id
        self.plugin_name = plugin_name
        self.user_frame = user_frame
        super().__init__(self._build_message())

    def _build_message(self) -> str:
        if ":" in self.source_id:
            method = self.source_id.rsplit(":", 1)[1]
        else:
            method = "<unknown method>"

        if self.user_frame is None:
            site = "<unknown call site>"
        else:
            site = f"{self.user_frame[0]}:{self.user_frame[1]}"

        return (
            f"UnsafePassthroughError: {self.source_id!r} would have caused "
            f"real I/O outside any 'with tripwire:' block.\n"
            f"\n"
            f"  Called {self.plugin_name}.{method} at {site} "
            f"OUTSIDE any \"with tripwire:\" block.\n"
            f"\n"
            f"Plugin {self.plugin_name!r} doesn't support outside-sandbox "
            f"passthrough (passthrough_safe=False), meaning its passthrough "
            f"path is NOT a no-op.\n"
            f"\n"
            f"Fix one of:\n"
            f"  - Wrap the call in 'with tripwire:' and add an allow/mock for it.\n"
            f"  - set guard='error' in [tool.tripwire] to make this fail explicitly.\n"
            f"  - If you have audited that this plugin's passthrough is safe, "
            f"set passthrough_safe=True on the plugin class.\n"
        )

Comment thread src/tripwire/_errors.py Outdated
Comment on lines +508 to +513
def __init__(self, source_id: str, plugin_name: str, sandbox_id: int) -> None:
self.source_id = source_id
self.plugin_name = plugin_name
self.sandbox_id = sandbox_id
super().__init__(self._build_message())

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To support the pedagogical improvements suggested for PostSandboxInteractionError, the constructor should accept an optional user_frame and the message builder should be updated to include the call site details. This is particularly valuable for identifying which specific async task or thread leaked past the sandbox boundary.

    def __init__(
        self,
        source_id: str,
        plugin_name: str,
        sandbox_id: int,
        user_frame: tuple[str, int, str] | None = None,
    ) -> None:
        self.source_id = source_id
        self.plugin_name = plugin_name
        self.sandbox_id = sandbox_id
        self.user_frame = user_frame
        super().__init__(self._build_message())

    def _build_message(self) -> str:
        if ":" in self.source_id:
            method = self.source_id.rsplit(":", 1)[1]
        else:
            method = "<unknown method>"

        if self.user_frame is None:
            site = "<unknown call site>"
        else:
            site = f"{self.user_frame[0]}:{self.user_frame[1]}"

        return (
            f"PostSandboxInteractionError: {self.source_id!r} fired from a "
            f"context that survived the exit of sandbox #{self.sandbox_id}.\n"
            f"\n"
            f"  Called {self.plugin_name}.{method} at {site} "
            f"AFTER the sandbox exited.\n"
            f"\n"
            f"This usually means an asyncio Task, thread, or future was "
            f"scheduled inside `with tripwire:` and is still running after "
            f"the block exited.\n"
            f"\n"
            f"Fix one of:\n"
            f"  - Await or cancel all pending tasks before exiting the sandbox.\n"
            f"  - Use `asyncio.gather(...)` inside the sandbox to ensure "
            f"completion.\n"
            f"  - If the late call is intentional, wrap it in its own "
            f"`with tripwire:` block.\n"
        )

…ation

UnsafePassthroughError and PostSandboxInteractionError now accept an
optional user_frame and render 'at <file>:<lineno>' (or
'<unknown call site>') in the same shape as GuardedCallError. Dispatch
in _context.py walks to the user frame via walk_to_user_frame() at
both raise sites so async-leak and outside-sandbox failures point at
the actual call site.

Also fixes a stray period in the unknown-key TripwireConfigError
messages: the trailing period now closes the sentence only when
there is no difflib suggestion; with a suggestion the suggestion
replaces the period.
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist could you take another look? Round 7 commit 4f84452 addresses all 6 comments:

#1, #2 — punctuation polish in _config.py. Both [tool.tripwire] and [tool.tripwire.guard] unknown-key errors now use the {suggestion or '.'} pattern so the period closes the sentence when no typo suggestion is available, and is replaced by the suggestion otherwise. Pinned both shapes with 4 new tests in test_config_validation.py.

#3, #4 — call-site capture in _context.py. Branch 2 (PostSandboxInteractionError) and Branch 3b (UnsafePassthroughError) both now call walk_to_user_frame() and pass the result.

#5, #6user_frame plumbed through the UnsafePassthroughError and PostSandboxInteractionError constructors and _build_message() for both. They now render Called {plugin}.{method} at {site} OUTSIDE / AFTER ... matching GuardedCallError's pedagogical voice. Added 6 regression tests in test_pedagogical_messages.py covering both real-frame and <unknown call site> paths plus the framing literals (OUTSIDE and AFTER the sandbox exited).

Sample renderings:

  • UnsafePassthroughError: ... Called subprocess.run at /abs/path/test_caller.py:42 OUTSIDE any "with tripwire:" block. ...
  • PostSandboxInteractionError: ... Called test.late at /abs/path/test_caller.py:137 AFTER the sandbox exited. ...

Suite is now 1893 passed, all 13 CI jobs green.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request renames the package to "tripwire" and introduces significant architectural updates, including flipping the default guard mode to "error" and adding specialized error types for async leaks and unsafe passthroughs. Feedback identifies a critical regression where allow(), deny(), and restrict() are incorrectly restricted to active sandboxes, preventing their use in guard mode. Additionally, the reviewer noted that the plugin lookup cache currently excludes third-party plugins and suggested performance optimizations for the hot path, such as reducing deferred imports and caching configuration parsing.

Comment thread src/tripwire/_guard.py
Comment on lines +45 to +51
from tripwire._context import _active_verifier # noqa: PLC0415
if _active_verifier.get() is None:
raise TripwireError(
"tripwire.allow(...) was called outside any active sandbox. "
"For module-scoped firewall rules, use [tool.tripwire.firewall] in pyproject.toml."
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This check incorrectly prevents tripwire.allow() from being used in "guard mode" (outside a sandbox). According to the dispatch logic in get_verifier_or_raise, the firewall stack is only consulted when no sandbox is active. By requiring an active verifier to set an allow rule, this change makes the context manager unusable for its primary purpose of permitting real I/O during guarded execution. The same issue applies to deny() (lines 79-85) and restrict() (lines 122-128).

Comment thread src/tripwire/_guard.py
Comment on lines +79 to +85
from tripwire._context import _active_verifier # noqa: PLC0415
if _active_verifier.get() is None:
raise TripwireError(
"tripwire.deny(...) was called outside any active sandbox. "
"For module-scoped firewall rules, use [tool.tripwire.firewall] in pyproject.toml."
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

As noted in the comment for allow(), this check prevents deny() from being used to configure the firewall stack during guarded execution outside a sandbox. It should be removed to maintain the functionality of the firewall system.

Comment thread src/tripwire/_guard.py
Comment on lines +122 to +128
from tripwire._context import _active_verifier # noqa: PLC0415
if _active_verifier.get() is None:
raise TripwireError(
"tripwire.restrict(...) was called outside any active sandbox. "
"For module-scoped firewall rules, use [tool.tripwire.firewall] in pyproject.toml."
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

As noted in the comment for allow(), this check prevents restrict() from being used to set a ceiling on permitted I/O during guarded execution outside a sandbox. It should be removed.

Comment thread src/tripwire/_registry.py
Comment on lines +182 to +215
Failures to import a single plugin class are swallowed: the entry is
simply not seeded, and a subsequent lookup will go through the slow
path (and likely return None for that name). This preserves the
original lazy behavior for broken/partially-installed plugins.
"""
for entry in PLUGIN_REGISTRY:
if not _is_available(entry):
continue
try:
cls = get_plugin_class(entry)
except Exception:
continue
payload: tuple[type[BasePlugin], str] = (cls, entry.name)
_lookup_cache[entry.name] = payload
for prefix in getattr(cls, "guard_prefixes", ()):
# Canonical name takes precedence if a prefix collides with
# another plugin's canonical name; do not overwrite.
_lookup_cache.setdefault(prefix, payload)


def _clear_lookup_cache() -> None:
"""Drop all cached ``lookup_plugin_class_by_name`` results, then
re-seed the eager entries.

Call this from tests that monkeypatch ``PLUGIN_REGISTRY`` so their
substitute registry is consulted instead of stale cache entries.
The re-population step ensures that the lock-free read path remains
valid for normal (unpatched) registry entries after the test
teardown clears the cache. Tests that monkeypatch ``PLUGIN_REGISTRY``
will pick up the patched registry on the next ``_populate_lookup_cache``
invocation, since this function reads ``PLUGIN_REGISTRY`` at call
time rather than at import time.
"""
with _lookup_cache_lock:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The eager population of _lookup_cache from PLUGIN_REGISTRY at module import time excludes third-party plugins loaded via entry points. Consequently, lookup_plugin_class_by_name will return None for any third-party plugin when called from get_verifier_or_raise outside an active sandbox. This breaks "guard mode" support and per-protocol overrides for all third-party plugins. The cache should be updated when entry-point plugins are discovered, or the lookup should fall back to a dynamic check that includes them.

Comment thread src/tripwire/_context.py Outdated
Comment on lines +141 to +155
# === Branch 2: post-sandbox detection (C4) ===
# MUST run BEFORE Branch 1: an asyncio task / thread captures the
# parent's ContextVars at creation time, including `_active_verifier`.
# After the parent's `with tripwire:` exits, `_active_verifier.reset()`
# in the parent does NOT propagate into the task's snapshot, so the
# task would otherwise hit Branch 1 with a stale verifier reference.
# `_active_sandbox_ids` is a process-wide ClassVar set, which the
# parent's `_exit()` discards in real time. Detecting a captured
# sandbox_id that is NOT in the active set is the authoritative test
# for "the sandbox this task came from has exited."
closed_sandbox_id = _detect_post_sandbox()
if closed_sandbox_id is not None:
from tripwire._errors import PostSandboxInteractionError # noqa: PLC0415
from tripwire._frames import walk_to_user_frame # noqa: PLC0415

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Performing deferred imports of SandboxContext, _current_sandbox_id, and PostSandboxInteractionError on every intercepted call is inefficient. Since _current_sandbox_id is a ContextVar and PostSandboxInteractionError is an exception, they should be moved to tripwire._context to allow for top-level imports. This would significantly reduce the overhead of the post-sandbox detection logic on the hot path.

Comment thread src/tripwire/_config.py
from typing import Any, Final

from tripwire._compat import tomllib

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The load_tripwire_config function is called multiple times per test (once by the session-scoped guard fixture and once by every StrictVerifier instantiation). Since pyproject.toml is expected to be stable during a test run, this function should be cached to avoid redundant disk I/O and TOML parsing.

import functools

@functools.lru_cache(maxsize=1)
def load_tripwire_config(start: Path | None = None) -> dict[str, Any]:

…cache config

- _registry: lookup_plugin_class_by_name now probes
  importlib.metadata.entry_points("tripwire.plugins") on cache miss and
  seeds the result under the entry-point's canonical name and every
  declared guard_prefix. Eager population from PLUGIN_REGISTRY at import
  time still covers built-ins on the lock-free fast path; third-party
  plugins are observed lazily on first lookup. Fixes a regression where
  guard mode and per-protocol overrides silently broke for entry-point
  plugins because the eager seed never saw them.

- _context: lift PostSandboxInteractionError, UnsafePassthroughError,
  GuardedCallError, GuardedCallWarning, SandboxNotActiveError,
  NoActiveVerifierError, walk_to_user_frame, Disposition, and
  get_firewall_stack to module-level imports. SandboxContext,
  _current_sandbox_id, and lookup_plugin_class_by_name remain deferred
  with explanatory comments because tripwire._verifier and the plugin
  modules transitively re-enter tripwire._context during their own
  module init.

- _config: cache load_tripwire_config via lru_cache(maxsize=8) keyed on
  the resolved search root (cwd at call time when start is None) so
  chdir-based tests get fresh entries. Expose cache_clear() and
  cache_info() on the public function so tests that rewrite an
  already-observed pyproject.toml mid-session can bust the cache.

- tests: add three regression tests in test_registry.py covering
  entry-point discovery on cache miss, cached lookup after first
  discovery (no re-walk), and negative caching for unknown names that
  do not match any built-in or entry-point plugin.
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist round 8 commit 045bcb4 addresses comments #4, #5, #6. Comments #1, #2, #3 (push back, with rationale below).

#4 (entry-point cache miss) — fixed. _lookup_cache is still eagerly populated from PLUGIN_REGISTRY for built-ins, but lookup_plugin_class_by_name now has a slow path that on cache miss takes the lock, double-checks, then iterates importlib.metadata.entry_points(group="tripwire.plugins") to discover third-party plugins. Hits seed both the canonical name and every guard_prefixes entry; misses negatively cache. 3 new regression tests in TestLookupPluginClassByNameCache cover canonical+prefix discovery, no-rewalk on second call, and negative-cache for unknown names.

#5 (deferred imports) — partially lifted. Lifted to module-level: PostSandboxInteractionError, UnsafePassthroughError, GuardedCallError, GuardedCallWarning, SandboxNotActiveError, NoActiveVerifierError, walk_to_user_frame, Disposition, get_firewall_stack. Left deferred (with inline rationale comments): SandboxContext and _current_sandbox_id from _verifier (back-import would form a hard cycle since _verifier imports _active_verifier from _context); lookup_plugin_class_by_name from _registry (its import-time _populate_lookup_cache() imports plugin modules, which import _context — lifting would re-enter _context mid-init, confirmed experimentally).

#6 (load_tripwire_config caching) — fixed. Added @functools.lru_cache(maxsize=8) on a wrapper that resolves start or Path.cwd() to a real path BEFORE caching (keying on None made chdir-based tests see stale results — caught by existing test). cache_clear() and cache_info() are exposed on the public function and documented in the docstring for tests that rewrite pyproject.toml mid-session.

#1, #2, #3 (allow/deny/restrict raise outside sandbox) — disagree, leaving as-is. This behavior is intentional, introduced in commit c97afc1 and pinned by tests/unit/test_allow_restrict_outside_sandbox.py plus more than 30 other tests. The previous behavior (silent no-op: frame pushed and popped without effect) was confusing — users believed they had configured firewall rules when nothing was happening. Module-scoped firewall rules go in [tool.tripwire.firewall] in pyproject.toml; the error message points users there. The runtime context managers are designed for in-sandbox use to permit specific real I/O through a sandbox; [tool.tripwire.firewall] is the intended path for module-scope rules in guard mode.

Suite is now 1896 passed, all 13 CI jobs green.

@gemini-code-assist
Copy link
Copy Markdown

Thanks for the update, elijahr. I've reviewed the changes in round 8, specifically the fixes for the entry-point cache miss, the partial lifting of deferred imports, and the load_tripwire_config caching. These address the remaining concerns from my previous review. Regarding your decision to keep the allow/deny/restrict raise-outside-sandbox behavior, I agree that the pedagogical value of failing loud and pointing users to the [tool.tripwire.firewall] configuration is the correct design choice for this library. The suite is looking solid with 1896 passed tests. This PR is ready to merge.

@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request renames the project to tripwire (PyPI: python-tripwire), removes the _mock suffix from plugin proxies, and flips the default guard mode to "error". It introduces PostSandboxInteractionError for async leak detection, per-test guard overrides, and fixes concurrency propagation. Feedback suggests optimizing sandbox tracking for free-threaded Python using immutable sets to remove locking overhead. Additionally, it is recommended to mark the FileIo, Native, and Celery plugins as unsafe for passthrough to prevent unintended side effects in "warn" mode and to refactor duplicated error message logic.

Comment thread src/tripwire/_verifier.py
# boundaries (per-test, not per-call), so the lock is acquired
# rarely on the write path; the hot-path read in Branch 2 is a
# microsecond-scale `set.__contains__` under a brief lock.
_active_sandbox_ids: ClassVar[set[int]] = set()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using a mutable set for _active_sandbox_ids requires taking a lock on every intercepted call (via _detect_post_sandbox) to remain thread-safe on PEP 703 free-threaded builds. Since get_verifier_or_raise is on the hot path for every mocked interaction, this lock will become a significant bottleneck in multi-threaded applications.

Consider using an immutable frozenset and replacing the reference on mutation. This allows for lock-free reads in the dispatch path while maintaining thread safety.

Suggested change
_active_sandbox_ids: ClassVar[set[int]] = set()
_active_sandbox_ids: ClassVar[frozenset[int]] = frozenset()

Comment thread src/tripwire/_context.py
Comment on lines +101 to +102
# CPython, an unsynchronized `in` check against a `set` being mutated
# on another thread can corrupt the hash table and hang. The lock is
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If _active_sandbox_ids is refactored to use a frozenset (as suggested in _verifier.py), this lock can be removed. Membership checks against a frozenset are thread-safe and lock-free, which significantly improves performance on the interception hot path.

Suggested change
# CPython, an unsynchronized `in` check against a `set` being mutated
# on another thread can corrupt the hash table and hang. The lock is
active = sid in SandboxContext._active_sandbox_ids


supports_guard: ClassVar[bool] = False
passthrough_safe: ClassVar[bool] = True

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Setting passthrough_safe = True for FileIoPlugin is dangerous. When guard="warn", this allows unmocked file operations (like open(..., "w") or os.remove()) to proceed to the real filesystem after emitting a warning. This violates the safety guarantee of guard mode for destructive operations.

Even if the interceptor previously raised SandboxNotActiveError, declaring it passthrough_safe now explicitly opts it into the 'warn-and-proceed' behavior. This should be False to ensure that unmocked file I/O triggers an UnsafePassthroughError instead of mutating the disk.

Suggested change
passthrough_safe: ClassVar[bool] = False

Comment thread src/tripwire/plugins/native_plugin.py Outdated
"""

supports_guard: ClassVar[bool] = False
passthrough_safe: ClassVar[bool] = True
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Similar to the file I/O plugin, NativePlugin should not be marked passthrough_safe. Allowing arbitrary native code execution (via ctypes or cffi) to proceed outside a sandbox in warn mode is a significant security and stability risk. It should be False so that unmocked native calls fail loud via UnsafePassthroughError.

Suggested change
passthrough_safe: ClassVar[bool] = True
passthrough_safe: ClassVar[bool] = False

Comment thread src/tripwire/plugins/celery_plugin.py Outdated
"""

supports_guard: ClassVar[bool] = False
passthrough_safe: ClassVar[bool] = True
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Marking CeleryPlugin as passthrough_safe allows real task enqueuing to happen in warn mode. This can lead to side effects in real message brokers (RabbitMQ/Redis) during tests. It should be False to prevent accidental task leakage.

Suggested change
passthrough_safe: ClassVar[bool] = True
passthrough_safe: ClassVar[bool] = False

Comment thread src/tripwire/_errors.py
Comment on lines +275 to +280
# Method-being-called: text after the LAST ":" in the source_id
# (e.g., "subprocess:run" -> "run", "asyncio:subprocess:spawn" ->
# "spawn"). Using rsplit means the trailing segment is always
# treated as the method name regardless of how many leading
# plugin/namespace segments precede it. Falls back to
# "<unknown method>" if the source_id is malformed (no colon).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for extracting the method name from source_id is duplicated across GuardedCallError, UnsafePassthroughError, and PostSandboxInteractionError. Consider refactoring this into a private helper method on the TripwireError base class or a module-level utility to improve maintainability.

Each of these plugins' passthrough path calls the underlying real-world
operation: file_io_plugin invokes builtins.open / Path.read_bytes /
shutil.copy, celery_plugin invokes Task.delay / Task.apply_async (which
enqueue real broker messages), and native_plugin invokes ctypes.CDLL /
cffi.FFI.dlopen (which load real native libraries with side effects).
Their previous passthrough_safe=True classification was inaccurate: an
un-mocked call outside a sandbox was performing real I/O without flagging
the gap. Updated the migration table in test_passthrough_safe.py and
the per-plugin assertions in test_guard.py accordingly.
@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 28, 2026

@gemini-code-assist round 9 follow-up: addressed the most concrete summary suggestion in commit 33daddf.

Plugin passthrough_safe corrections (FileIoPlugin, NativePlugin, CeleryPlugin) — fixed. Confirmed by reading each plugin: file_io_plugin passthrough calls builtins.open / Path.read_bytes / shutil.copy (real disk I/O), celery_plugin passthrough calls Task.delay / Task.apply_async (real broker enqueue), native_plugin passthrough calls ctypes.CDLL / cffi.FFI.dlopen (real native library load with side effects). All three now passthrough_safe = False. Updated the migration table in test_passthrough_safe.py and per-plugin classification assertions in test_guard.py.

The two other suggestions in the summary (no inline comments) — immutable-set sandbox tracking and refactoring duplicated _build_message logic — are noted but I'd prefer to land them in a follow-up PR. The immutable-set change is an architectural revisit (the lock is by design and uncontended in normal use, see prior reply); the message-builder refactor is purely stylistic and can be evaluated separately.

Suite is now 1896 passed, all 13 CI jobs green.

@gemini-code-assist
Copy link
Copy Markdown

Thanks for the update, elijahr. I've reviewed the changes in round 9, specifically the passthrough_safe corrections for FileIoPlugin, NativePlugin, and CeleryPlugin. These updates correctly ensure that unmocked calls to these plugins raise UnsafePassthroughError rather than silently performing real I/O when guard='warn' is active. I understand your preference to defer the immutable-set sandbox tracking and the _build_message refactor to a follow-up PR; that is a reasonable approach given the current stability of the suite. With 1896 tests passing and all CI jobs green, this PR is ready to merge.

@elijahr
Copy link
Copy Markdown
Contributor Author

elijahr commented Apr 29, 2026

/gemini review

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request performs a comprehensive rename of the package from bigfoot to tripwire and introduces significant architectural improvements. Key changes include flipping the default guard mode to error, restructuring source-id sentinels into a colon-namespaced format, and adding PostSandboxInteractionError to better diagnose async leaks. Additionally, the PR implements strict TOML validation for configuration, introduces a @pytest.mark.guard marker for per-test overrides, and enhances error messages with pedagogical hints and user call site information. I have no feedback to provide.

@elijahr elijahr merged commit 1674db5 into main Apr 29, 2026
13 checks passed
@elijahr elijahr deleted the rename/tripwire-and-proposals branch April 29, 2026 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant