Release v0.5.1 by shoom1 · Pull Request #73 · shoom1/agentic-cli

shoom1 · 2026-04-19T03:30:39Z

Summary

Release v0.5.1 — the capability-based permission engine release.

Added

Capability-based permission engine (PR #72, feat(permissions): capability-based permission engine with ADK + LangGraph adapters) replacing the ConfirmationPlugin HITL. Tools declare capabilities (filesystem.write, http.read, shell.exec, …) via @register_tool(capabilities=...). Rules load from builtin defaults, user/project settings.json, and an in-memory session layer; unmatched calls prompt Allow once / session / always (save to project) / Deny.
Path/URL/Shell/StringGlob matchers; EXEMPT sentinel.
ADK PermissionPlugin and LangGraph wrap_tool_for_permission.
permissions / permissions_enabled settings.
Real BM25 backends (bm25s, rank_bm25).

Changed

@register_tool now requires capabilities=.
filesystem.* grants auto-extend to the parent directory; memory.* / kb.* allowed by default.
_ensure_managers_initialized runs in a worker thread.
Memory tools deduped; HITL confirmation extracted into a backend-neutral module.

Fixed

Matcher canonicalize/matches preserves the * wildcard.
KnowledgeBaseManager concurrency contract tightened.

Removed

PermissionLevel (SAFE / CAUTION / DANGEROUS), ConfirmationPlugin, _wrap_for_confirmation, hitl_tools, hitl_enabled.

Test plan

conda run -n agenticcli python -m pytest tests/ -q — 1459 passed, 2 skipped, 26 xfailed
After merge: tag v0.5.1 on main and push the tag

`create_bm25_index()` previously advertised a 3-tier fallback (bm25s → rank_bm25 → mock) but the backend module didn't exist, so every deployment silently got the mock TF-IDF stub and hybrid retrieval's keyword arm was degraded. Adds `_bm25_backends.py` with `BM25sIndex` (NumPy-accelerated, preferred) and `RankBM25Index` (pure Python, uses BM25Plus to avoid Okapi's zero/negative IDF on common terms). Both implement the same add/remove/search/save/load interface as `MockBM25Index` so the factory can swap them transparently. Adds parametrized contract tests over both backends plus a test that the factory now returns the real backend when bm25s is installed.
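The note about Okapi's zero or negative IDF can be checked with a little arithmetic. The sketch below compares the textbook Okapi IDF with a BM25+ style IDF of log((N + 1) / n). The function names are illustrative only, and real libraries may add floors or smoothing on top, so this is not the code in `_bm25_backends.py`.

```python
import math


def okapi_idf(n_docs: int, doc_freq: int) -> float:
    """Textbook Okapi BM25 IDF; reaches zero or below for common terms."""
    return math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25plus_idf(n_docs: int, doc_freq: int) -> float:
    """BM25+ style IDF; stays positive whenever doc_freq <= n_docs."""
    return math.log((n_docs + 1) / doc_freq)


# A term that appears in 8 of 10 documents.
common_okapi = okapi_idf(10, 8)
common_plus = bm25plus_idf(10, 8)

# A term that appears in exactly half of the documents.
half_okapi = okapi_idf(10, 5)
```

With the Okapi form, a term present in more than half of the corpus contributes a negative weight, which is the situation the BM25Plus choice avoids.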

LangGraph's graph_builder previously imported `is_dangerous` and `request_tool_confirmation` from `workflow.adk.plugins`, making the LangGraph backend carry a dependency on the ADK backend. Both functions are framework-neutral (they only touch the tool registry, service registry, and event model), so moving them to `workflow.confirmation` restores the intended "pluggable orchestrators behind one protocol" shape — a third orchestrator would no longer inherit an ADK dep via shared HITL logic. `ConfirmationPlugin` (the ADK Plugin wrapper) stays in `workflow.adk.plugins` and now imports the helpers from the neutral module. Added `tests/workflow/test_backend_isolation.py` to lock in the constraint: no file under `workflow/langgraph/` may import from `workflow/adk/*`, and vice versa.
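One way such an isolation constraint could be checked is by parsing each file and inspecting its import statements. The helper below is a hypothetical sketch, not the contents of `tests/workflow/test_backend_isolation.py`: it assumes absolute imports and uses the `workflow.adk` prefix as written above, which may differ from the real package path.

```python
from __future__ import annotations

import ast


def forbidden_imports(source: str, forbidden_prefix: str) -> list[str]:
    """Return imported module names in `source` under `forbidden_prefix`.

    Only absolute imports are inspected; relative imports are ignored here.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        hits.extend(
            name
            for name in names
            if name == forbidden_prefix or name.startswith(forbidden_prefix + ".")
        )
    return hits


bad_source = "from workflow.adk.plugins import is_dangerous\nimport os\n"
good_source = "from workflow.confirmation import is_dangerous\n"
```

A test could then walk every file under `workflow/langgraph/` and assert that the returned list is empty, and do the same in the other direction.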

… points

The @register_tool wrappers in memory_tools.py and the closures returned by factories.make_memory_tools() previously contained duplicated function bodies. This had already caused silent drift: search_memory returned different keys from each entry point (the factory version included `importance`, the registry version did not), and update_memory handled tags=None differently (factory treated it as "don't touch", registry as "clear"), so the same tool name exhibited different behavior depending on how the workflow wired it.

Following the pattern already used for KB and ArXiv tools, extract four _with_store helpers in memory_tools.py that own the business logic. Both the @register_tool wrappers (service-registry bound) and the make_memory_tools() closures (explicit store bound) now delegate to the helpers, so they cannot drift.

Unified contract:
- search_memory always includes `importance` in result items and accepts include_archived.
- update_memory uses the _SENTINEL default in both entry points: omit tags to leave them unchanged, pass None to clear, pass a list to replace.

Added TestMemoryToolParity (5 tests) that feed identical inputs to both variants and assert identical output, locking the no-drift invariant in place.
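The three-way tags contract (omit, None, list) relies on a sentinel default. A minimal sketch of that pattern follows; the record shape and the function signature are assumptions made for illustration and are not taken from memory_tools.py.

```python
_SENTINEL = object()  # distinguishes "argument omitted" from an explicit None


def update_memory(record: dict, content=None, tags=_SENTINEL) -> dict:
    """Return an updated copy of a hypothetical memory record."""
    updated = dict(record)
    if content is not None:
        updated["content"] = content
    if tags is _SENTINEL:
        pass  # omitted: leave the existing tags unchanged
    elif tags is None:
        updated["tags"] = []  # explicit None: clear the tags
    else:
        updated["tags"] = list(tags)  # a list: replace the tags
    return updated


record = {"content": "note", "tags": ["a"]}
```

Because the sentinel is compared by identity, callers can never produce it by accident, which is what keeps "don't touch" and "clear" distinct.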

Three narrow fixes plus documentation of what was previously implicit:

1. `_backfill_running` check-and-set is now inside `with self._lock:`, closing a cross-thread TOCTOU window where two threads could both pass the `if self._backfill_running:` guard before either set the flag. Within a single asyncio event loop this was already safe (no `await` between check and set), but two threads (each with its own event loop, or a thread pool call plus a direct await) could race.
2. `backfill_sidecars` now snapshots `self._documents` and performs its per-iteration existence check under `_lock`, so a concurrent sync `delete_document` call from another thread cannot interleave with backfill iteration.
3. `tools/knowledge_tools.py` used to reach into `source_kb._sidecar_locks.setdefault(...)` to create per-doc async locks. Replaced with a new public accessor `KnowledgeBaseManager.get_or_create_sidecar_lock(doc_id)` that performs the insert under `_lock` (so two threads seeding the dict for the same new doc cannot produce distinct `asyncio.Lock` instances). The lazy `kb_read` fallback and `backfill_sidecars` now both go through this accessor.

Also documented the contract in the class docstring: sync mutation API is thread-safe; async sidecar API is single-event-loop. Sharing a manager across multiple event loops is not supported, because per-doc `asyncio.Lock` instances are bound to the loop they were created on.

Added TestGetOrCreateSidecarLock with 4 tests covering accessor identity, usability, and that the lazy kb_read path actually routes through the accessor.
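The check-and-set and accessor fixes share one idea: do the read and the write under the same `threading.Lock`. The class below is a hypothetical stand-in for the locking parts of KnowledgeBaseManager; `try_start_backfill` is an invented name, and only `get_or_create_sidecar_lock` mirrors an accessor named in this commit.

```python
import asyncio
import threading


class SidecarLocks:
    """Illustrative sketch only; not the real KnowledgeBaseManager."""

    def __init__(self):
        self._lock = threading.Lock()
        self._backfill_running = False
        self._sidecar_locks = {}

    def try_start_backfill(self) -> bool:
        # Check and set happen atomically, so two threads cannot both win.
        with self._lock:
            if self._backfill_running:
                return False
            self._backfill_running = True
            return True

    def get_or_create_sidecar_lock(self, doc_id: str) -> asyncio.Lock:
        # The insert is guarded, so one doc_id always maps to one lock object.
        with self._lock:
            lock = self._sidecar_locks.get(doc_id)
            if lock is None:
                lock = asyncio.Lock()
                self._sidecar_locks[doc_id] = lock
            return lock


manager = SidecarLocks()
first_start = manager.try_start_backfill()
second_start = manager.try_start_backfill()
```

The returned `asyncio.Lock` objects should still be awaited from a single event loop, in line with the contract documented above.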

Implements load_rules() in store.py that parses the permissions.allow/deny arrays from a settings.json file, canonicalizes targets via get_matcher(), and returns typed Rule objects. Uses a local import of get_matcher to avoid a circular dependency with matchers.py. Four new tests cover: missing file, missing section, allow+deny parsing with substitution, and malformed JSON.
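A rough sketch of what such a loader could look like is below. Everything about the file layout is an assumption: the entry shape (`capability` / `target` keys), the fallback to an empty list for a missing file, missing section, or malformed JSON, and the simplified `Rule` type are all hypothetical. Variable substitution and matcher canonicalization are left out.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Rule:
    effect: str  # "allow" or "deny"
    capability: str
    target: str


def load_rules(settings_path: Path) -> list:
    """Parse permissions.allow / permissions.deny into Rule objects."""
    try:
        data = json.loads(settings_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return []  # assumed behavior for a missing or malformed file
    section = data.get("permissions") if isinstance(data, dict) else None
    if not isinstance(section, dict):
        return []  # assumed behavior for a missing section
    rules = []
    for effect in ("allow", "deny"):
        for entry in section.get(effect, []):
            rules.append(Rule(effect, entry["capability"], entry.get("target", "*")))
    return rules


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.json"
    path.write_text(
        json.dumps(
            {
                "permissions": {
                    "allow": [{"capability": "http.read", "target": "https://example.com/*"}],
                    "deny": [{"capability": "shell.exec"}],
                }
            }
        ),
        encoding="utf-8",
    )
    loaded = load_rules(path)
    missing = load_rules(Path(tmp) / "absent.json")
```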

…ort-circuit

Implements Task 14: PermissionEngine.__init__, rule loading from builtin/user/project sources, and the permissions_enabled=False early-return path. Uses Rule constructor copies (not object.__setattr__) to avoid mutating the module-level BUILTIN_RULES frozen dataclasses.
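The point about copying instead of calling object.__setattr__ can be illustrated with `dataclasses.replace`, which builds a new instance through the constructor. The `Rule` fields and the `with_source` helper here are hypothetical and only show the pattern.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rule:
    capability: str
    target: str
    source: str = "builtin"


# Module-level defaults that must never be mutated in place.
BUILTIN_RULES = (Rule("memory.*", "*"),)


def with_source(rules, source: str) -> list:
    """Return copies of `rules` tagged with `source`; originals stay untouched."""
    return [replace(rule, source=source) for rule in rules]


session_rules = with_source(BUILTIN_RULES, "session")
```

Bypassing the frozen flag with object.__setattr__ would change the shared module-level objects for every engine instance, which is the hazard the copy avoids.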

…arg (staged)

Optional kwarg with default=EXEMPT; existing permission_level kwarg untouched for staged migration. Tools will be updated to declare capabilities in Task 19; permission_level + PermissionLevel enum get removed in the final cleanup.

…+ PERMISSION_ENGINE key

- Add PERMISSION_ENGINE = "permission_engine" constant to service_registry.py (alphabetical order)
- Construct PermissionEngine unconditionally in _ensure_managers_initialized (before WORKFLOW)
- Update permission_plugin.py and permission_wrap.py to import and use the real constant
- Update existing permission integration tests to use the imported PERMISSION_ENGINE constant
- Add tests/workflow/test_base_manager_permissions.py verifying engine is published to services dict

…tionPlugin, _wrap_for_confirmation, hitl_tools, hitl_enabled)

- Drop permission_level= from every tool; capabilities= is now required
- Remove PermissionLevel enum + permission_level field from ToolDefinition
- Delete workflow/confirmation.py and tools/hitl_tools.py
- Strip ConfirmationPlugin from workflow/adk/plugins.py (LLMLoggingPlugin preserved)
- Remove LangGraphBuilder._wrap_for_confirmation + its confirmation imports
- Remove hitl_enabled setting
- Update tests that referenced the retired symbols
- Update README + CLAUDE.md snippets to the new API

…ches

Targetless capabilities (Capability with target_arg=None, used by web_search, save_memory, etc.) resolve to target='*'. When a user picked 'Allow always', URLMatcher.canonicalize was mangling '*' into 'https://*' on JSON reload, and URLMatcher.matches rejected the bare '*' on schema-comparison, so the just-granted rule didn't match subsequent calls and the prompt reappeared.

Fix: every matcher short-circuits the '*' sentinel in both canonicalize (pass-through) and matches (wildcard pattern matches anything). Narrow patterns still correctly deny targetless ('*') calls; only pattern=='*' is treated as the universal wildcard.

Adds TestTargetlessAllowAlwaysRegression covering:
- Allow-always on web_search (http.read, targetless) → next two calls are allowed without re-prompting
- Cross-tool sharing: search_arxiv with same capability is also covered
- After project JSON reload, the persisted wildcard rule still matches
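The short-circuit described in this fix can be sketched as two small functions. This is an illustration under assumptions: the real URLMatcher is a class whose canonicalization rules are not shown in this PR, so the "prepend https://" step and the use of `fnmatchcase` are stand-ins.

```python
from fnmatch import fnmatchcase

WILDCARD = "*"


def canonicalize_url(target: str) -> str:
    """Add a default scheme, but pass the '*' sentinel through untouched."""
    if target == WILDCARD:
        return target
    if "://" not in target:
        return "https://" + target
    return target


def url_matches(pattern: str, target: str) -> bool:
    """Match a target against a rule pattern with '*' handled first."""
    if pattern == WILDCARD:
        return True  # the universal wildcard covers everything, including '*'
    if target == WILDCARD:
        return False  # a narrow pattern must not cover a targetless call
    return fnmatchcase(canonicalize_url(target), canonicalize_url(pattern))
```

Without the first branch in `canonicalize_url`, a persisted '*' rule would come back as 'https://*' after reload and stop matching targetless calls, which is the regression described above.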

save_plan / get_plan / save_tasks / get_tasks in tools/{adk,langgraph}/state_tools.py are injected into agent tool lists by BaseWorkflowManager._get_state_tools() and were never decorated with @register_tool. The adapter therefore saw defn is None and denied every call with 'tool has no capability declaration'. These tools only touch internal workflow state (plan string, task list) — no file, network, or shell side effects — so EXEMPT is the right classification.

…ransfer_to_agent as EXEMPT

- BUILTIN_RULES now includes 'memory.* → *' and 'kb.* → *' ALLOW entries. Memory + KB are agent-internal stores the user already opted into by giving the tool. Prompting for every save_memory / kb_ingest is noisy.
- ADK auto-injects transfer_to_agent into coordinator agents with sub_agents; it's an internal routing primitive, not an external side effect. Register it with EXEMPT at permission_plugin.py import time so the plugin lets it through.
- Tests for both new behaviours.

EmbeddingService construction loads a sentence-transformers model, which takes 1-3 seconds of pure sync CPU/IO. It ran on the main event loop inside initialize_services(), blocking prompt_toolkit's key handler during background init — the REPL appeared frozen for seconds at startup and keystrokes only rendered after init completed. Run _ensure_managers_initialized() via asyncio.to_thread so the event loop stays free. Prompt responsiveness is restored; the PermissionEngine construction (which also does a small amount of sync I/O) rides along.
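The effect of moving blocking setup off the event loop can be seen in a small, self-contained sketch. `slow_init` is a hypothetical stand-in for the synchronous initialization, and the polling loop stands in for the REPL's key handling; `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def slow_init() -> str:
    """Stand-in for synchronous setup such as loading a model."""
    time.sleep(0.2)
    return "ready"


async def main():
    ticks = 0
    # Run the blocking call in a worker thread so the loop stays responsive.
    init = asyncio.create_task(asyncio.to_thread(slow_init))
    while not init.done():
        ticks += 1  # the loop keeps servicing other work in the meantime
        await asyncio.sleep(0.01)
    return await init, ticks


status, ticks = asyncio.run(main())
```

If `slow_init()` were called directly inside `main`, the loop could not advance the counter until initialization had finished.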

When the user picks 'Allow for session' or 'Allow always' on a filesystem.* capability, store a rule covering the parent directory (with /** glob) instead of the exact file. One grant then covers every file the agent writes or reads in that folder, which matches typical workflows — if an agent is writing one file, it's almost always going to write more. Non-filesystem namespaces (http.*, shell.*, etc.) keep exact-target grants. The wildcard sentinel '*' (used by targetless capabilities like web_search) also passes through unchanged. The ask prompt now prints '(Session/Always grants apply to the parent directory.)' when a filesystem capability is involved so the user knows what scope they're agreeing to. Tests cover the broadening, scope preservation for other namespaces, and nested-subdirectory matching.

The UI was displaying the exact filename (e.g. 'filesystem.write → /foo/bar.txt'), but Session/Always grants store /foo/**. Show the broadened target in the prompt so the displayed scope matches the scope that will actually be granted.

- Moved the widening helper to a module-level function (engine.broaden_target_for_grant) so both engine._ask_and_apply and prompt.build_request use the same logic.
- prompt.build_request now displays broaden_target_for_grant(cap) for each capability line. A '(Grant scope widened to the parent directory.)' hint replaces the previous 'applies to parent directory' wording and only appears when any capability was actually broadened.
- Handle the root-parent edge case: a file directly under / collapses to '/**' instead of '//**'.

Tests cover:
- Filesystem capabilities display the /parent/** scope
- Exact filenames do NOT appear in the prompt
- Non-filesystem (http.*) capabilities keep exact targets
- Root-parent paths render as '/**'
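The widening rule, including the root-parent edge case, can be sketched as below. The real broaden_target_for_grant takes a capability object; this version takes a capability name and a target string, and it assumes absolute POSIX paths, so it is an approximation rather than the engine's code.

```python
import posixpath


def broaden_target_for_grant(capability: str, target: str) -> str:
    """Widen filesystem.* grants to the parent directory; leave others alone."""
    if target == "*" or not capability.startswith("filesystem."):
        return target  # wildcard sentinel and other namespaces stay exact
    # Assumes an absolute path. Stripping the trailing slash makes a file
    # directly under / collapse to '/**' rather than '//**'.
    parent = posixpath.dirname(target).rstrip("/")
    return parent + "/**"
```

For example, a grant requested for /foo/bar.txt is stored and displayed as /foo/**, while an http.* target is returned unchanged.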

shoom1 added 30 commits April 17, 2026 15:53

docs(claude): note that docs/ is gitignored and must not be committed

de3e9a9

feat(permissions): scaffold permissions package

132ccaa

feat(permissions): Capability, ResolvedCapability, EXEMPT sentinel

6ef9dac

Guard sibling imports in __init__.py with try/except so each task's tests can run in isolation before all modules are implemented.

feat(permissions): Rule, Effect, RuleSource, AskScope, CheckResult

ae2b5e9

feat(permissions): PermissionContext + variable substitution

163b3eb

feat(permissions): Matcher protocol + StringGlobMatcher

569b751

feat(permissions): PathMatcher with ** glob support

78114e5

Add PathMatcher class and _glob_to_regex helper to matchers.py.
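A glob-to-regex translation with `**` support might look roughly like the following. This is a guess at the general technique, not the actual _glob_to_regex: it assumes `**` crosses directory separators while `*` and `?` stay within one path segment.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a path glob into a compiled regex (illustrative sketch)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")  # '**' may span any number of segments
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # '*' stays inside one segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # '?' is one non-separator character
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def path_matches(pattern: str, path: str) -> bool:
    """True when the whole path matches the glob."""
    return glob_to_regex(pattern).fullmatch(path) is not None
```

Under these assumptions, /foo/** covers /foo/a/b.txt, whereas /foo/*.txt covers /foo/a.txt but not /foo/a/b.txt.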

feat(permissions): URLMatcher for http.* capabilities

ffccb77

feat(permissions): ShellMatcher for shell.* capabilities

8d2ad3d

feat(permissions): matcher registry + capability-name glob

67ae761

feat(permissions): BUILTIN_RULES default policy

a4b2a3a

feat(permissions): append_project_rule with atomic JSON merge

cb83c61

feat(permissions): build_request + parse_response for ask dialog

6c2886d

feat(permissions): engine rule-based allow/deny path

d0ae502

Implement check(), _resolve(), _evaluate(), _fmt_rule_reason(), and _ask_and_apply() stub in PermissionEngine. DENY wins across both capabilities and sources; targetless capabilities match '*'.
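The "DENY wins" rule can be sketched with plain tuples. The rule and request shapes, the `evaluate` signature, and the use of `fnmatchcase` are assumptions for illustration; the real engine works with Rule objects and per-capability matchers.

```python
from fnmatch import fnmatchcase


def evaluate(rules, requests) -> str:
    """Return 'deny', 'allow', or 'ask' for a tool call.

    rules: iterable of (effect, capability_glob, target_glob), all sources merged.
    requests: iterable of (capability, target); targetless calls use '*'.
    """
    rules = list(rules)
    requests = list(requests)

    def matched(effect: str, capability: str, target: str) -> bool:
        return any(
            rule_effect == effect
            and fnmatchcase(capability, cap_glob)
            and fnmatchcase(target, target_glob)
            for rule_effect, cap_glob, target_glob in rules
        )

    # Any matching deny on any capability wins, whichever source it came from.
    if any(matched("deny", cap, tgt) for cap, tgt in requests):
        return "deny"
    # Every capability must be covered by an allow rule, otherwise prompt.
    if all(matched("allow", cap, tgt) for cap, tgt in requests):
        return "allow"
    return "ask"


rules = [
    ("allow", "filesystem.*", "/proj/*"),
    ("deny", "filesystem.write", "/proj/secret*"),
]
```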

feat(permissions): engine ask flow (once/session/always/deny)

d243545

test(permissions): serialise concurrent ask prompts

96a9af4

refactor(permissions): drop try/except scaffolding from __init__.py now that all modules exist

9a6f1bf

feat(tools): declare capabilities on every registered tool

2628cf7

Maps each tool's side effects to (capability, target) tuples. Keeps permission_level= intact for staged migration; old field is removed in the final cleanup task.

feat(permissions): ADK PermissionPlugin

a720c04

feat(permissions): LangGraph wrap_tool_for_permission

de905c0

feat(permissions): add permissions + permissions_enabled settings (staged)

2de1afc

Adds PermissionRuleConfig, PermissionsConfig models and two new fields (permissions, permissions_enabled) to WorkflowSettingsMixin. hitl_enabled is preserved for migration compatibility.

feat(permissions): ADK manager uses PermissionPlugin

2bf7950

shoom1 added 12 commits April 18, 2026 11:01

feat(permissions): LangGraph builder uses wrap_tool_for_permission

353edc4

docs(permissions): drop transient 'once Task N' phrasing from adapter docstrings

7c84fc6

docs(permissions): update CLAUDE.md + README.md to describe the new permission engine

af1c551

Merge pull request #72 from shoom1/feature/permissions-system

a07aca7

feat(permissions): capability-based permission engine with ADK + LangGraph adapters

Release v0.5.1

fffdee1

Bump version to 0.5.1 and document the capability-based permission engine that replaces the ConfirmationPlugin-based HITL. Refresh README to drop the stale PermissionLevel table in favor of a Capabilities subsection.

shoom1 merged commit 2189ccc into main Apr 19, 2026

shoom1 merged 42 commits into main from develop
