Skip to content

feat: plugin bundling, catalog, installation and versioning#31

Open
tedhabeck wants to merge 88 commits into
contextforge-org:mainfrom
tedhabeck:plugin-repo-1
Open

feat: plugin bundling, catalog, installation and versioning#31
tedhabeck wants to merge 88 commits into
contextforge-org:mainfrom
tedhabeck:plugin-repo-1

Conversation

@tedhabeck
Copy link
Copy Markdown
Contributor

@tedhabeck tedhabeck commented Apr 21, 2026

Summary

Closes: #9

Summary of Changes in Branch plugin-repo-1

This branch introduces a comprehensive plugin installation and management system for the CPEX framework. Here are the key changes:

📦 Major Features Added

1. Plugin Installation System

  • New CLI commands for plugin management: list, info, install, search, uninstall, versions
  • Support for multiple installation sources:
    • monorepo: Git monorepo installations (default)
    • pypi: PyPI package installations
    • test-pypi: Test PyPI installations
    • git: Direct git repository installations
    • local: Local directory installations

2. Plugin Catalog (cpex/tools/catalog.py)

  • 1,705 lines of new code for plugin discovery and management
  • Automatic catalog updates from GitHub monorepos
  • Plugin manifest persistence and caching
  • Integration with PyGithub API for repository scanning
  • Support for downloading and installing plugins from various sources

3. Plugin Registry (cpex/tools/plugin_registry.py)

  • Tracks installed plugins with metadata (version, installation type, path, timestamp, user)
  • Persists plugin information in data/installed-plugins.json
  • Manages plugin lifecycle (install/uninstall)

4. Isolated Virtual Environment Support

  • Enhanced isolated_venv plugin type with proper dependency isolation
  • Optimized worker process reuse (eliminates ~1.2ms fork overhead per invocation)
  • Automatic venv creation and caching
  • Support for requirements.txt-based dependency management
  • Security improvements: directory traversal prevention, TOCTOU race condition fixes

🔧 Core Improvements

5. Framework Enhancements

  • New models: PluginPackageInfo, PluginVersionInfo, PluginVersionRegistry, InstalledPluginInfo
  • Enhanced PluginManifest with installation metadata
  • Config loader now supports saving configurations (ConfigSaver class)
  • Improved error handling and validation

6. Security Hardening

  • Directory traversal prevention in module paths
  • Plugin directory allowlist validation
  • TOCTOU race condition fixes
  • Secure requirements file path validation

📊 Statistics

  • 7,501 lines added across 18 files
  • 68 lines removed
  • 3,092 new test lines for catalog functionality
  • 1,384 additional test lines for CLI
  • Test coverage increased to >90%

🛠️ Developer Experience

  • Comprehensive README documentation with sequence diagrams
  • Example commands for all installation types
  • Rich CLI output with emojis and formatting
  • Automatic pip upgrades before installations
  • Support for version constraints (e.g., >=0.1.0)

🔄 Key Workflows Enabled

  1. Catalog Update: Scan GitHub repos for plugins and cache manifests
  2. Plugin Installation: Download, install, and register plugins from multiple sources
  3. Plugin Uninstall: Clean removal with registry updates
  4. Plugin Discovery: Search and list available/installed plugins
  5. Version Management: Query available versions and constraints

This branch transforms CPEX into a full-featured plugin ecosystem with robust installation, isolation, and management capabilities.

Checks

  • make lint passes
  • make test passes
  • CHANGELOG updated (if user-facing)

Notes (optional)

@araujof araujof changed the title [FEATURE]: Plugin bundling, catalog, installation and versioning feat: plugin bundling, catalog, installation and versioning Apr 22, 2026
@araujof araujof added enhancement New feature or request cli framework labels Apr 22, 2026
@tedhabeck tedhabeck marked this pull request as ready for review May 1, 2026 15:24
@araujof araujof self-assigned this May 1, 2026
Copy link
Copy Markdown
Contributor

@araujof araujof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work and thanks for this PR, @tedhabeck! This is a big PR, and there are a few issues that should be addressed before merge. Others, we may decide to ship as a separate PR. Please see below.

P0 — Critical

# File Issue
1 cpex/tools/catalog.py:840 tarfile.extractall() without filter='data' allows path traversal from malicious archives — crafted .tar.gz packages from PyPI/git/monorepo can write files outside the extraction directory

P1 — High

# File Issue
2 cpex/tools/catalog.py:1105 Code injection via f-string interpolationplugin_package_name is interpolated directly into a Python script string executed via subprocess. Crafted name breaks out of string literal.
3 cpex/tools/catalog.py:266 logger.error uses %d format for str(e) — causes TypeError crash whenever the except branch executes, masking the real error
4 cpex/tools/catalog.py:446 _process_version_item writes None to file — when download_file fails it returns None; relpath.write_text(None) raises TypeError
5 cpex/tools/catalog.py:128 Version comparison uses string ordering instead of semantic versioning'9.0.0' < '10.0.0' is False lexicographically; latest pointer set incorrectly
6 cpex/tools/cli.py:482 update_plugins_config_yaml called twice per local/git install — _install_from_local and _install_from_git call it directly AND through _finalize_installation, creating duplicate config entries
7 cpex/tools/catalog.py:545 update_catalog_with_pyproject returns inverted boolean — returns True on failure, False on success. CLI caller interprets failure as "completed."
8 cpex/tools/catalog.py:851 Temp directory leaked on failure in _download_monorepo_folder_to_temp — mkdtemp with no cleanup in except/finally paths
9 cpex/tools/catalog.py:1066 asyncio.run() crashes inside an already-running event loop — called synchronously in _initialize_isolated_venv and install_from_local; fails in Jupyter, async tests, or any async CLI context
10 cpex/tools/cli.py:394 inquirer.prompt blocks non-interactive callers — install (monorepo) and uninstall use interactive prompts with no --yes bypass; agents/CI hang indefinitely
11 cpex/tools/cli.py:601 No structured output — search/list/versions emit only human-readable prose with no --format json option; agents cannot parse results
12 cpex/framework/models.py:1421 PyPiRepo.version_constraint missing = None default — Pydantic treats it as required; constructing PyPiRepo without version_constraint raises ValidationError
13 cpex/tools/catalog.py:152 find_package_path duplicated verbatim from cpex/framework/utils.py — divergence risk; plugin_registry.py already imports from utils
14 cpex/tools/catalog.py:675 All subprocess.run pip calls have no timeout — stalled pip download blocks the thread indefinitely (14+ call sites)
15 cpex/tools/catalog.py:236 httpx.get() has no timeout — download_contents hangs forever if GitHub is slow

P2 — Moderate

# File Issue
16 cpex/tools/cli.py:327 remove_from_plugins_config_yaml over-removes — filter uses OR logic when it should use AND; uninstalling one isolated_venv plugin removes ALL plugins of that kind
17 cpex/tools/catalog.py:624 search() case-insensitive match brokenmanifest.name.lower().count(plugin_name) doesn't lowercase plugin_name
18 cpex/framework/models.py:2419 Non-atomic registry write — write_text truncates then writes; crash mid-operation permanently corrupts installed-plugins.json
19 cpex/tools/plugin_registry.py:35 No error handling for corrupted JSON in registry — parse failure renders all CLI commands unusable
20 cpex/tools/cli.py:659 All failure paths exit 0 — agents cannot detect failed installs, missing plugins, or errors without parsing output text
21 cpex/tools/cli.py:247 list function shadows built-in and routes output through logger.info (invisible on stdout)
22 cpex/framework/models.py:2390 register_plugin appends without dedup — reinstalling a plugin creates phantom duplicate entries in the registry
23 cpex/tools/catalog.py:59 Auth.Token(None) called when token is unset — may raise TypeError depending on PyGithub version; silently creates unauthenticated client hitting 60 req/hr rate limit
24 cpex/framework/models.py:2415 Registry file path triplicated across models.py, plugin_registry.py, and cli.py using inconsistent env-var name (PLUGIN_REGISTRY_FILE used as folder)
25 cpex/tools/catalog.py:52 PluginCatalog.init re-reads env vars bypassing pydantic-settings — CatalogSettings is built then ignored; github_token always reads None as fallback

Pre-existing Issues

# File Issue
A cpex/framework/isolated/worker.py:166 sys.stdin.readline(limit=...) uses unsupported keyword arg — should be positional size parameter
B cpex/framework/isolated/venv_comm.py:98 Worker process env stripped to only PLUGINS_CONFIG_FILE — intentional isolation but breaks native extensions

Additional Findings

  • Subprocess pattern: Team converged on subprocess.run(..., check=True, capture_output=True, text=True) with list-form args. venv_comm.py still uses older check_call.
  • YAML safety: yaml.safe_load is used consistently everywhere — confirmed correct.
  • Path traversal guard: _find_requirements_in_extracted_package has layered protection (normpath + resolve + relative_to). A TODO at line 1059 flags a gap where requirements_file from manifest.default_config bypasses this validation.
  • Module import blocklist: cpex.framework.utils.import_module blocks dangerous stdlib modules. New loading paths must route through it.
  • PyPI name validation gate: PluginPackageInfo validators sanitize before any subprocess call. The git install path skips this for the package_name portion.

Agent-Native Gaps

0/6 new plugin management capabilities are agent-accessible via the MCP tool layer. The existing MCP server registers only get_plugin_configs, get_plugin_config, and invoke_hook. None of the new operations (search, info, list, install, uninstall, versions) are exposed. Additionally:

  • 2/6 CLI operations are completely blocked for non-interactive use due to mandatory inquirer prompts with no bypass
  • The underlying PluginCatalog and PluginRegistry classes have clean method signatures suitable for direct MCP tool wrapping

Residual Risks

  • Concurrent installations corrupt shared registry/versions.json (no file locking)
  • Supply chain risk: pip install from untrusted sources executes arbitrary code during installation
  • GitHub API rate limiting (30 req/min for search) can leave catalog partially updated with no rollback
  • No integrity verification (signatures, checksums) of downloaded packages beyond pip's built-in checks
  • The _find_and_load_versions_json f-string injection vector is reachable from any install path where plugin_package_name comes from untrusted manifest YAML

Testing Gaps

  • No tests for tarfile path traversal in _extract_package_archive
  • No tests for download_file exception path (would reveal %d format bug)
  • No tests for remove_from_plugins_config_yaml with multiple plugins of same kind
  • No test for the asyncio.run() conflict from an existing event loop
  • No test for corrupted installed-plugins.json recovery
  • No test for concurrent registry access
  • No test for the f-string injection vector in _find_and_load_versions_json
  • No test exercises plugin install or plugin uninstall with stdin closed (non-TTY)
  • No tests assert exit codes for failure paths
  • upgrade_pip() in VenvProcessCommunicator has no dedicated test
  • PluginRegistry.update() with installation_type='git' is never tested
  • No direct unit tests for update_plugin_version_registry() — the found/not-found branching and latest-pointer tracking logic is entirely untested

Verdict

Fix order:

  1. Security (P0): Add filter='data' to tarfile.extractall and validate zip member paths before extraction
  2. Security (P1): Eliminate f-string code injection by passing package_name as sys.argv to subprocess, not interpolated into script string
  3. Correctness (P1): Fix logger format bug, None-write crash, string version comparison, duplicate config writes, inverted boolean return
  4. Reliability (P1): Add timeouts to all subprocess.run/httpx calls, fix temp directory cleanup, address asyncio.run in event loop
  5. CLI (P1): Add --yes flag to bypass interactive prompts; add --format json for structured output
  6. Correctness (P2): Fix remove_from_plugins_config_yaml AND/OR logic, search case sensitivity, registry dedup
  7. Reliability (P2): Atomic registry writes, corrupted JSON recovery, guard Auth.Token(None)

The security and correctness issues (P0-P1) must be fixed before merge. The CLI agent-readiness issues (P1 #10-11) should be addressed in this PR or a fast-follow to make the CLI is unusable for automation.

@araujof araujof added this to CPEX May 4, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in CPEX May 4, 2026
@araujof araujof moved this from Backlog to In review in CPEX May 4, 2026
@araujof araujof added this to the 0.1.1 milestone May 4, 2026
@tedhabeck tedhabeck requested a review from araujof May 6, 2026 19:45
Copy link
Copy Markdown
Contributor

@araujof araujof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; Thanks for addressing the review, @tedhabeck !

Nice addition on the package integrity verification (SHA256), and the isolated worker now routes through the module import blocklist.

All P0 and P1 findings from my review have been addressed: tarfile/zip path traversal is fixed (filter="data" + member validation), f-string injection eliminated (sys.argv), timeouts added to all subprocess/httpx calls, asyncio.run handled for running loops, --yes and --format json flags added for agent/CI use, and semantic versioning is now correct via packaging.version.Version.

All P2 items are resolved as well: atomic registry writes, dedup on register, corrupted JSON recovery, proper exit codes, Auth.Token(None) guard, and consolidated registry path.

tedhabeck added 14 commits May 8, 2026 13:51
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
…ent.py to run async

Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Forking a new Python process (~1.2ms per fork_exec)
Initializing the Python interpreter
Loading modules and dependencies
Setting up the subprocess communication pipes

Signed-off-by: habeck <habeck@us.ibm.com>
…ically, update cli to support creating an isolated plugin.

Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
…ate (Consistent with how the PluginManager works).

Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
tedhabeck added 27 commits May 8, 2026 14:46
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
…"if rc".

Signed-off-by: habeck <habeck@us.ibm.com>
… the unconditional append with a filter-then-append. Any existing entry with the same

   name is removed before the new one is added, so a reinstall upgrades the entry rather than creating a duplicate. One save() call, same atomicity as
  before.

Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
…than importlib.import_module directly.

Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
… with the appropriate default from catalog settings.

Signed-off-by: habeck <habeck@us.ibm.com>
tedhabeck added 2 commits May 8, 2026 14:53
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cli enhancement New feature or request framework

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

[FEATURE]: Plugin bundling, catalog, installation and versioning

2 participants