feat: plugin bundling, catalog, installation and versioning#31
feat: plugin bundling, catalog, installation and versioning#31tedhabeck wants to merge 88 commits into
Conversation
There was a problem hiding this comment.
Nice work and thanks for this PR, @tedhabeck! This is a big PR, and there are a few issues that should be addressed before merge. Others, we may decide to ship as a separate PR. Please see below.
P0 — Critical
| # | File | Issue |
|---|---|---|
| 1 | cpex/tools/catalog.py:840 |
tarfile.extractall() without filter='data' allows path traversal from malicious archives — crafted .tar.gz packages from PyPI/git/monorepo can write files outside the extraction directory |
P1 — High
| # | File | Issue |
|---|---|---|
| 2 | cpex/tools/catalog.py:1105 |
Code injection via f-string interpolation — plugin_package_name is interpolated directly into a Python script string executed via subprocess. Crafted name breaks out of string literal. |
| 3 | cpex/tools/catalog.py:266 |
logger.error uses %d format for str(e) — causes TypeError crash whenever the except branch executes, masking the real error |
| 4 | cpex/tools/catalog.py:446 |
_process_version_item writes None to file — when download_file fails it returns None; relpath.write_text(None) raises TypeError |
| 5 | cpex/tools/catalog.py:128 |
Version comparison uses string ordering instead of semantic versioning — '9.0.0' < '10.0.0' is False lexicographically; latest pointer set incorrectly |
| 6 | cpex/tools/cli.py:482 |
update_plugins_config_yaml called twice per local/git install — _install_from_local and _install_from_git call it directly AND through _finalize_installation, creating duplicate config entries |
| 7 | cpex/tools/catalog.py:545 |
update_catalog_with_pyproject returns inverted boolean — returns True on failure, False on success. CLI caller interprets failure as "completed." |
| 8 | cpex/tools/catalog.py:851 |
Temp directory leaked on failure in _download_monorepo_folder_to_temp — mkdtemp with no cleanup in except/finally paths |
| 9 | cpex/tools/catalog.py:1066 |
asyncio.run() crashes inside an already-running event loop — called synchronously in _initialize_isolated_venv and install_from_local; fails in Jupyter, async tests, or any async CLI context |
| 10 | cpex/tools/cli.py:394 |
inquirer.prompt blocks non-interactive callers — install (monorepo) and uninstall use interactive prompts with no --yes bypass; agents/CI hang indefinitely |
| 11 | cpex/tools/cli.py:601 |
No structured output — search/list/versions emit only human-readable prose with no --format json option; agents cannot parse results |
| 12 | cpex/framework/models.py:1421 |
PyPiRepo.version_constraint missing = None default — Pydantic treats it as required; constructing PyPiRepo without version_constraint raises ValidationError |
| 13 | cpex/tools/catalog.py:152 |
find_package_path duplicated verbatim from cpex/framework/utils.py — divergence risk; plugin_registry.py already imports from utils |
| 14 | cpex/tools/catalog.py:675 |
All subprocess.run pip calls have no timeout — stalled pip download blocks the thread indefinitely (14+ call sites) |
| 15 | cpex/tools/catalog.py:236 |
httpx.get() has no timeout — download_contents hangs forever if GitHub is slow |
P2 — Moderate
| # | File | Issue |
|---|---|---|
| 16 | cpex/tools/cli.py:327 |
remove_from_plugins_config_yaml over-removes — filter uses OR logic when it should use AND; uninstalling one isolated_venv plugin removes ALL plugins of that kind |
| 17 | cpex/tools/catalog.py:624 |
search() case-insensitive match broken — manifest.name.lower().count(plugin_name) doesn't lowercase plugin_name |
| 18 | cpex/framework/models.py:2419 |
Non-atomic registry write — write_text truncates then writes; crash mid-operation permanently corrupts installed-plugins.json |
| 19 | cpex/tools/plugin_registry.py:35 |
No error handling for corrupted JSON in registry — parse failure renders all CLI commands unusable |
| 20 | cpex/tools/cli.py:659 |
All failure paths exit 0 — agents cannot detect failed installs, missing plugins, or errors without parsing output text |
| 21 | cpex/tools/cli.py:247 |
list function shadows built-in and routes output through logger.info (invisible on stdout) |
| 22 | cpex/framework/models.py:2390 |
register_plugin appends without dedup — reinstalling a plugin creates phantom duplicate entries in the registry |
| 23 | cpex/tools/catalog.py:59 |
Auth.Token(None) called when token is unset — may raise TypeError depending on PyGithub version; silently creates unauthenticated client hitting 60 req/hr rate limit |
| 24 | cpex/framework/models.py:2415 |
Registry file path triplicated across models.py, plugin_registry.py, and cli.py using inconsistent env-var name (PLUGIN_REGISTRY_FILE used as folder) |
| 25 | cpex/tools/catalog.py:52 |
PluginCatalog.init re-reads env vars bypassing pydantic-settings — CatalogSettings is built then ignored; github_token always reads None as fallback |
Pre-existing Issues
| # | File | Issue |
|---|---|---|
| A | cpex/framework/isolated/worker.py:166 |
sys.stdin.readline(limit=...) uses unsupported keyword arg — should be positional size parameter |
| B | cpex/framework/isolated/venv_comm.py:98 |
Worker process env stripped to only PLUGINS_CONFIG_FILE — intentional isolation but breaks native extensions |
Additional Findings
- Subprocess pattern: Team converged on
subprocess.run(..., check=True, capture_output=True, text=True)with list-form args.venv_comm.pystill uses oldercheck_call. - YAML safety:
yaml.safe_loadis used consistently everywhere — confirmed correct. - Path traversal guard:
_find_requirements_in_extracted_packagehas layered protection (normpath + resolve + relative_to). A TODO at line 1059 flags a gap whererequirements_filefrommanifest.default_configbypasses this validation. - Module import blocklist:
cpex.framework.utils.import_moduleblocks dangerous stdlib modules. New loading paths must route through it. - PyPI name validation gate:
PluginPackageInfovalidators sanitize before any subprocess call. The git install path skips this for the package_name portion.
Agent-Native Gaps
0/6 new plugin management capabilities are agent-accessible via the MCP tool layer. The existing MCP server registers only get_plugin_configs, get_plugin_config, and invoke_hook. None of the new operations (search, info, list, install, uninstall, versions) are exposed. Additionally:
- 2/6 CLI operations are completely blocked for non-interactive use due to mandatory
inquirerprompts with no bypass - The underlying
PluginCatalogandPluginRegistryclasses have clean method signatures suitable for direct MCP tool wrapping
Residual Risks
- Concurrent installations corrupt shared registry/versions.json (no file locking)
- Supply chain risk: pip install from untrusted sources executes arbitrary code during installation
- GitHub API rate limiting (30 req/min for search) can leave catalog partially updated with no rollback
- No integrity verification (signatures, checksums) of downloaded packages beyond pip's built-in checks
- The
_find_and_load_versions_jsonf-string injection vector is reachable from any install path where plugin_package_name comes from untrusted manifest YAML
Testing Gaps
- No tests for tarfile path traversal in
_extract_package_archive - No tests for download_file exception path (would reveal %d format bug)
- No tests for
remove_from_plugins_config_yamlwith multiple plugins of same kind - No test for the
asyncio.run()conflict from an existing event loop - No test for corrupted
installed-plugins.jsonrecovery - No test for concurrent registry access
- No test for the f-string injection vector in
_find_and_load_versions_json - No test exercises
plugin installorplugin uninstallwith stdin closed (non-TTY) - No tests assert exit codes for failure paths
upgrade_pip()in VenvProcessCommunicator has no dedicated testPluginRegistry.update()withinstallation_type='git'is never tested- No direct unit tests for
update_plugin_version_registry()— the found/not-found branching and latest-pointer tracking logic is entirely untested
Verdict
Fix order:
- Security (P0): Add
filter='data'to tarfile.extractall and validate zip member paths before extraction - Security (P1): Eliminate f-string code injection by passing package_name as sys.argv to subprocess, not interpolated into script string
- Correctness (P1): Fix logger format bug, None-write crash, string version comparison, duplicate config writes, inverted boolean return
- Reliability (P1): Add timeouts to all subprocess.run/httpx calls, fix temp directory cleanup, address asyncio.run in event loop
- CLI (P1): Add
--yesflag to bypass interactive prompts; add--format jsonfor structured output - Correctness (P2): Fix remove_from_plugins_config_yaml AND/OR logic, search case sensitivity, registry dedup
- Reliability (P2): Atomic registry writes, corrupted JSON recovery, guard Auth.Token(None)
The security and correctness issues (P0-P1) must be fixed before merge. The CLI agent-readiness issues (P1 #10-11) should be addressed in this PR or a fast-follow to make the CLI is unusable for automation.
araujof
left a comment
There was a problem hiding this comment.
LGTM; Thanks for addressing the review, @tedhabeck !
Nice addition on the package integrity verification (SHA256), and the isolated worker now routes through the module import blocklist.
All P0 and P1 findings from my review have been addressed: tarfile/zip path traversal is fixed (filter="data" + member validation), f-string injection eliminated (sys.argv), timeouts added to all subprocess/httpx calls, asyncio.run handled for running loops, --yes and --format json flags added for agent/CI use, and semantic versioning is now correct via packaging.version.Version.
All P2 items are resolved as well: atomic registry writes, dedup on register, corrupted JSON recovery, proper exit codes, Auth.Token(None) guard, and consolidated registry path.
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
…ent.py to run async Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Forking a new Python process (~1.2ms per fork_exec) Initializing the Python interpreter Loading modules and dependencies Setting up the subprocess communication pipes Signed-off-by: habeck <habeck@us.ibm.com>
…ically, update cli to support creating an isolated plugin. Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
…ate (Consistent with how the PluginManager works). Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
…"if rc". Signed-off-by: habeck <habeck@us.ibm.com>
… the unconditional append with a filter-then-append. Any existing entry with the same name is removed before the new one is added, so a reinstall upgrades the entry rather than creating a duplicate. One save() call, same atomicity as before. Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
…than importlib.import_module directly. Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
… with the appropriate default from catalog settings. Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Signed-off-by: habeck <habeck@us.ibm.com>
Summary
Closes: #9
Summary of Changes in Branch
plugin-repo-1This branch introduces a comprehensive plugin installation and management system for the CPEX framework. Here are the key changes:
📦 Major Features Added
1. Plugin Installation System
list,info,install,search,uninstall,versions2. Plugin Catalog (
cpex/tools/catalog.py)3. Plugin Registry (
cpex/tools/plugin_registry.py)data/installed-plugins.json4. Isolated Virtual Environment Support
isolated_venvplugin type with proper dependency isolation🔧 Core Improvements
5. Framework Enhancements
PluginPackageInfo,PluginVersionInfo,PluginVersionRegistry,InstalledPluginInfoPluginManifestwith installation metadataConfigSaverclass)6. Security Hardening
📊 Statistics
🛠️ Developer Experience
>=0.1.0)🔄 Key Workflows Enabled
This branch transforms CPEX into a full-featured plugin ecosystem with robust installation, isolation, and management capabilities.
Checks
make lintpassesmake testpassesNotes (optional)