[FIX] Use importlib.util.find_spec for pluggable worker discovery #1923

chandrasekharan-zipstack merged 2 commits into main from
Conversation
…1918)

* [FIX] Use importlib.util.find_spec for pluggable worker discovery

  `_verify_pluggable_worker_exists()` previously checked for the literal file `pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin has been compiled to a .so (Nuitka, Cython, or any C extension) — the module is perfectly importable, but the pre-check rejects it because only the .py extension is considered.

  Replace the filesystem check with `importlib.util.find_spec()`, which is Python's standard way to ask "is this module resolvable by the import system?". It honors every registered finder — source .py, compiled .so, bytecode .pyc, namespace packages, zipimports — so the function now matches what its docstring claims: verifying that the module can be loaded, not that a specific file extension is present.

  Behavior is preserved for existing deployments:
  - Images with no `pluggable_worker/<name>/` subpackage → find_spec raises ModuleNotFoundError (an ImportError subclass) → returns False.
  - Images with source .py → find_spec resolves the .py → returns True.
  - Images with compiled .so → find_spec resolves the .so → returns True.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Handle ValueError from find_spec in pluggable worker verification

  Greptile-flagged edge case: `importlib.util.find_spec()` can raise ValueError (not just ImportError) when sys.modules has a partially initialised module entry with `__spec__ = None` from a prior failed import. Broaden the except clause to catch both.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Resolve api-deployment worker directory from enum import path

  worker.py:452 did `worker_type.value.replace("-", "_")` to derive the on-disk directory name. All WorkerType enum values already use underscores, so the replace was a no-op; for API_DEPLOYMENT, whose directory is "api-deployment" (hyphen), it resolved to "api_deployment" and the os.path.exists() check failed. Boot then logged a spurious "❌ Worker directory not found: /app/api_deployment" at ERROR level. The task registration path (builder + celery autodiscover via to_import_path) is unaffected, so this was purely log noise — but noise at ERROR level that masks real failures in log scans.

  Fix: derive the directory from the authoritative `to_import_path()`, which already handles the hyphen case (api_deployment -> api-deployment).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
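The verification fix in the first commit can be sketched as a minimal standalone function — the real implementation lives in workers/shared/infrastructure/config/builder.py and its exact name, logging, and surrounding logic may differ:

```python
import importlib.util
import logging

logger = logging.getLogger(__name__)


def verify_pluggable_worker_exists(name: str) -> bool:
    """Return True if pluggable_worker.<name>.worker is resolvable.

    find_spec() honors every registered finder -- source .py, compiled
    .so, bytecode .pyc, namespace packages, zipimports -- unlike the old
    os.path check that only accepted a literal worker.py on disk.
    """
    module_path = f"pluggable_worker.{name}.worker"
    try:
        spec = importlib.util.find_spec(module_path)
    except (ImportError, ValueError):
        # ImportError: the parent package is missing (ModuleNotFoundError
        # is a subclass). ValueError: sys.modules holds a partially
        # initialised entry with __spec__ = None from a prior failed import.
        logger.exception("Cannot resolve %s", module_path)
        return False
    return spec is not None
```

Note that `find_spec()` only resolves the module; it does not execute it, so a plugin with import-time errors still passes this gate and fails later at actual import.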
Walkthrough

Two files were modified to refactor worker module discovery and import-path handling. The first changes verification from filesystem existence checks to Python import-resolution mechanisms with expanded exception handling. The second aligns task import-path derivation with the authoritative import-path mappings.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 5 passed
| Filename | Overview |
|---|---|
| workers/shared/infrastructure/config/builder.py | Replaces filesystem .py check with importlib.util.find_spec() for compiled-plugin discovery, broadens exception handler to include ValueError, and tightens up the module-path derivation. |
| workers/worker.py | Fixes the spurious api_deployment directory lookup by deriving worker_directory from to_import_path().rsplit(".", 1)[0], which correctly resolves to api-deployment via the authoritative directory-mapping in the enum. |
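The worker.py change described in the table above can be illustrated with a hypothetical minimal enum — the real WorkerType has more members and its `to_import_path()` may be implemented differently; only the derivation pattern is the point here:

```python
from enum import Enum


class WorkerType(Enum):
    # Hypothetical minimal enum; the real WorkerType has more members.
    GENERAL = "general"
    API_DEPLOYMENT = "api_deployment"

    def to_import_path(self) -> str:
        # The enum is the single source of truth for on-disk names;
        # API_DEPLOYMENT lives in a hyphenated "api-deployment" directory.
        directory = {"api_deployment": "api-deployment"}.get(self.value, self.value)
        return f"{directory}.tasks"


# Derive the directory from the authoritative import path rather than
# value.replace("-", "_"), which was a no-op on the underscore enum
# values and produced the nonexistent "api_deployment" name.
worker_directory = WorkerType.API_DEPLOYMENT.to_import_path().rsplit(".", 1)[0]
```

With this derivation, `worker_directory` resolves to "api-deployment", matching the on-disk layout, and the spurious ERROR-level log disappears.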
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Worker boot] --> B{is_pluggable?}
B -- Yes --> C["worker_directory = pluggable_worker/{value}"]
B -- No --> D["worker_directory = to_import_path().rsplit('.', 1)[0]\ne.g. api-deployment.tasks → api-deployment"]
C --> E[worker_path = base_dir / worker_directory]
D --> E
E --> F{os.path.exists worker_path?}
F -- No --> G["❌ Worker directory not found (logged)"]
F -- Yes --> H{tasks.py exists?}
H -- No --> I["⚠️ No tasks.py found"]
H -- Yes --> J[spec_from_file_location / exec_module]
subgraph builder.py _verify_pluggable_worker_exists
K[find_spec pluggable_worker.value.worker] --> L{spec is None?}
L -- Yes --> M["logger.error → return False"]
L -- No --> N[import_module]
N -- ImportError / ValueError --> O["logger.exception → return False"]
N -- OSError / AttributeError --> P["logger.exception → return False"]
N -- Success --> Q["return True"]
end
```
Reviews (2): Last reviewed commit: "Merge branch 'main' into fix/pluggable-w..." | Re-trigger Greptile
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
workers/shared/infrastructure/config/builder.py (1)
321-353: ⚠️ Potential issue | 🟠 Major

`load_pluggable_worker()` must use `importlib.util.find_spec()` instead of a filesystem check to support compiled plugins consistently.

The `worker_path.exists()` check at line 236 returns early before attempting to load the config module, preventing compiled plugins (e.g., Nuitka/Cython `.so` files) from accessing their `QUEUE_CONFIG`, `TASK_ROUTING`, and `LOGGING_CONFIG` — even though `_verify_pluggable_worker_exists()` now correctly validates them via `importlib.util.find_spec()`. Replace the filesystem check with `importlib.util.find_spec(f"pluggable_worker.{worker_type.value}.config")` to align discovery logic across both gates.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workers/shared/infrastructure/config/builder.py` around lines 321-353: The code in load_pluggable_worker still uses a filesystem check (worker_path.exists()) which skips loading compiled plugins; update load_pluggable_worker to use importlib.util.find_spec(f"pluggable_worker.{worker_type.value}.config") instead of checking worker_path.exists(), mirroring _verify_pluggable_worker_exists; if find_spec returns None, log an error and return early, otherwise proceed to importlib.import_module for the "pluggable_worker.{worker_type.value}.config" module to access QUEUE_CONFIG, TASK_ROUTING, and LOGGING_CONFIG.

workers/worker.py (1)
457-476: ⚠️ Potential issue | 🟡 Minor

Task loading won't work for compiled pluggable workers without on-disk .py files.

The filesystem checks at lines 458 and 464 (`os.path.exists(worker_path)` and `os.path.exists(tasks_file)`) will fail if a pluggable worker is compiled to `.so` without a disk-based `tasks.py`. Even if those checks somehow passed, `spec_from_file_location()` at line 469 lacks an explicit loader (e.g., `ExtensionFileLoader`) and cannot load `.so` files.

For compiled pluggable worker images, either ensure `tasks.py` is shipped alongside the compiled extension, or switch the pluggable branch to use `importlib.import_module(worker_type.to_import_path())`, which handles both `.py` and compiled module loading.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workers/worker.py` around lines 457-476: The current task-loading logic uses os.path.exists(worker_path) and os.path.exists(tasks_file) plus importlib.util.spec_from_file_location("tasks", tasks_file) which will fail for compiled extension modules (.so) because there may be no tasks.py and spec_from_file_location without an ExtensionFileLoader won't load .so files; update the branch that registers pluggable workers to fall back to dynamic import using importlib.import_module with the worker import path (use worker_type.to_import_path() or equivalent) when a disk-based tasks.py is absent or when the worker is a compiled extension, and remove the strict file-existence gating so that workers can be loaded via importlib.import_module (or explicitly use ExtensionFileLoader when loading a .so file) instead of relying solely on spec_from_file_location for tasks.py.
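The reviewer's suggested fallback for workers/worker.py might look like the sketch below — this is not the merged code, just one way to satisfy the review comment; `load_tasks_module`, its parameters, and the preference for the on-disk file are assumptions:

```python
import importlib
import importlib.util
import os


def load_tasks_module(worker_path: str, import_path: str):
    """Load a worker's tasks module.

    Prefer the on-disk tasks.py (the historical behaviour), but fall
    back to the import system so compiled .so plugins with no source
    file still load. import_path is assumed to be a dotted module path
    such as "pluggable_worker.<name>.tasks".
    """
    tasks_file = os.path.join(worker_path, "tasks.py")
    if os.path.exists(tasks_file):
        spec = importlib.util.spec_from_file_location("tasks", tasks_file)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module
    # No source on disk: import_module resolves .py, .so, or bytecode
    # through the registered finders (ExtensionFileLoader handles .so).
    return importlib.import_module(import_path)
```

The key design point is that `importlib.import_module()` goes through the full finder machinery, so the caller no longer needs to know which file extension the module was shipped as.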
📒 Files selected for processing (2)

- workers/shared/infrastructure/config/builder.py
- workers/worker.py
Test Results: Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
What
- Ports the hotfix (`v0.163.2-hotfix`, merged as e6bb412) onto `main`.
- Replaces the `.py` check in `_verify_pluggable_worker_exists()` with `importlib.util.find_spec()`.
- Broadens the exception handling to also catch `ValueError`.
- Resolves the `api-deployment` worker directory from `to_import_path()` instead of a naive `.replace("-", "_")`.

Why
- The filesystem check failed when `worker.py` had been compiled away to a `.so` (Nuitka / Cython / any C extension), even though the module was perfectly importable — so on-prem images with compiled plugins silently failed discovery.
- `find_spec()` can raise `ValueError` when `sys.modules[path]` has `__spec__ = None` from a prior failed import; the previous `except ImportError` let it propagate and crash boot.
- `worker_type.value.replace("-", "_")` was a no-op for every enum value except `API_DEPLOYMENT` (`api-deployment`), where it produced `/app/api_deployment`, which doesn't exist. Boot logged a spurious `❌ Worker directory not found: /app/api_deployment` at ERROR level, masking real failures in log scans.

How
- `_verify_pluggable_worker_exists()` now calls `importlib.util.find_spec(f"pluggable_worker.{name}.worker")` and returns `spec is not None`. This honors every registered finder — source `.py`, compiled `.so`, bytecode, namespace packages, zipimports.
- The `except` clause is broadened to `(ImportError, ValueError)` with a comment explaining the `ValueError` edge case.
- The worker directory is derived from `WorkerType.to_import_path()`, which already handles the hyphen case.

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
- Images with no `pluggable_worker/<name>/` subpackage → `find_spec` raises `ModuleNotFoundError` (an `ImportError` subclass) → returns `False` (same as before).
- Images with source `.py` → `find_spec` resolves the `.py` → returns `True` (same as before).
- Images with compiled `.so` → `find_spec` resolves the `.so` → returns `True` (previously returned `False`, which is the bug being fixed).
- The task registration path (`builder` + celery `autodiscover` via `to_import_path`) was already correct and is unchanged.

Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
- The original hotfix (`v0.163.2-hotfix`).
- The compiled `.so` plugins this fix makes discoverable.

Dependencies Versions
Notes on Testing
- Tested against compiled `pluggable_worker/*` plugins (was the reason for the hotfix).
- Confirmed boot no longer logs `❌ Worker directory not found: /app/api_deployment`.
Checklist
I have read and understood the Contribution Guidelines.