feat: deploy from archive #18
Important
Review skipped
Review was skipped due to path filters
⛔ Files ignored due to path filters (1)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including You can disable this status message by setting the
📝 Walkthrough
Replaces archive-based deployment with a cache-and-pin model: removes ArchiveableEnvBase; adds PinnableEnvBase and CacheableEnvBase; EnvBase constructor and prefixes renamed to cache/pin/deployment; managed_deploy becomes async; adds cache/pin APIs and runtime assertions with WorkflowError handling.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant Env as DeployableEnvBase (concrete)
Caller->>Env: await managed_deploy()
alt env supports caching (CacheableEnvBase)
Env->>Env: await get_cache_assets()
Env->>Env: await cache_assets() %% populate cache at cache_path
end
Env->>Env: await deploy() %% uses deployment_prefix / cache_path
Env-->>Caller: return success
rect rgba(240,240,255,0.6)
note right of Env: Removal path
Caller->>Env: managed_remove()
Env->>Env: await remove_cache() --x--> OS filesystem (shutil.rmtree)
end
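The deploy path in the diagram above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the plugin interface itself: `CacheableSketch`, `DeployableSketch`, and `DemoEnv` are hypothetical stand-ins, and the `get_cache_assets()` step from the diagram is omitted for brevity.

```python
import asyncio
from abc import ABC, abstractmethod


class CacheableSketch(ABC):
    """Hypothetical stand-in for CacheableEnvBase; not the real class."""

    @abstractmethod
    async def cache_assets(self) -> None: ...


class DeployableSketch(ABC):
    """Hypothetical stand-in for DeployableEnvBase; not the real class."""

    @abstractmethod
    async def deploy(self) -> None: ...

    async def managed_deploy(self) -> None:
        # Populate the cache first, but only if the env opts into caching.
        if isinstance(self, CacheableSketch):
            await self.cache_assets()
        await self.deploy()


class DemoEnv(CacheableSketch, DeployableSketch):
    """Toy environment that records the order of the calls it receives."""

    def __init__(self) -> None:
        self.events: list[str] = []

    async def cache_assets(self) -> None:
        self.events.append("cache")

    async def deploy(self) -> None:
        self.events.append("deploy")


env = DemoEnv()
asyncio.run(env.managed_deploy())
```

After the run, `env.events` holds the two steps in the order the diagram shows: caching first, then deployment.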
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Pre-merge checks (1 passed, 2 warnings)
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
snakemake_interface_software_deployment_plugins/__init__.py
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit Configuration File
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.
Files:
snakemake_interface_software_deployment_plugins/__init__.py
Actionable comments posted: 0
♻️ Duplicate comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
294-302: Docstring: explicitly note this is a coroutine that must be awaited
Clarify the async contract in the docstring to reduce misuse.
Apply this diff:
 async def deploy_from_archive(self) -> None:
     """Deploy the environment from self.archive_path to self.deployment_path.
     When issuing shell commands, the environment should use
     self.run_cmd(cmd: str) in order to ensure that it runs within eventual
     parent environments (e.g. a container or an env module).
+
+    This is a coroutine; callers must await it.
     """
     ...
🧹 Nitpick comments (2)
snakemake_interface_software_deployment_plugins/__init__.py (2)
257-261: Guard against missing deployment_prefix when hashing non-portable deployments
If `_deployment_prefix` is None, `str(None)` gets hashed, which can conceal a configuration issue and produce unstable semantics. Assert it’s set when the path is not portable.
Also, nit in the docstring above: "senstivive" → "sensitive".
Apply this diff:
 def record_deployment_hash(self, hash_object) -> None:
@@
-    assert isinstance(self, EnvBase)
+    assert isinstance(self, EnvBase)
     self.record_hash(hash_object)
     if not self.is_deployment_path_portable():
+        assert (
+            self._deployment_prefix is not None
+        ), "deployment_prefix must be set when deployment path is not portable"
         hash_object.update(str(self._deployment_prefix).encode())
Optional typo fix for the docstring just above:
-    deployment is senstivive to the path (e.g. in case of conda, which patches
+    deployment is sensitive to the path (e.g. in case of conda, which patches
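To make the path sensitivity concrete, here is a minimal sketch of how a deployment hash might fold in the prefix only for non-portable environments. The free function `deployment_hash` and its parameters are hypothetical and only mirror the logic quoted in the diff above; the real code lives in methods on the environment class.

```python
import hashlib
from pathlib import Path


def deployment_hash(spec: str, prefix: Path, portable: bool) -> str:
    """Hash the spec, plus the deployment prefix when the env is path-sensitive."""
    hash_object = hashlib.md5()
    hash_object.update(spec.encode())
    if not portable:
        # Non-portable envs (e.g. conda, per the docstring) depend on the path.
        hash_object.update(str(prefix).encode())
    return hash_object.hexdigest()


portable_a = deployment_hash("numpy=2", Path("/opt/a"), portable=True)
portable_b = deployment_hash("numpy=2", Path("/opt/b"), portable=True)
pinned_a = deployment_hash("numpy=2", Path("/opt/a"), portable=False)
pinned_b = deployment_hash("numpy=2", Path("/opt/b"), portable=False)
```

Under these assumptions, the portable hashes agree across prefixes while the non-portable ones differ, which is why an unset prefix silently hashed as `"None"` would be hard to notice.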
306-307: Prefer explicit exceptions over asserts for public API invariants
Asserts can be stripped with Python -O, turning this into a potential AttributeError later. Raise a descriptive exception instead.
Apply this diff:
 @property
 def archive_path(self) -> Path:
-    assert isinstance(self, EnvBase) and self._archive_prefix is not None
-    return self._archive_prefix / self.hash()
+    if not isinstance(self, EnvBase) or self._archive_prefix is None:
+        raise RuntimeError(
+            "archive_path accessed but no archive_prefix is configured."
+        )
+    return self._archive_prefix / self.hash()
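A small, self-contained sketch of the explicit-check variant follows. `EnvSketch` is a hypothetical class that only borrows the attribute names from the comment above; it shows that the guard keeps working regardless of interpreter flags, since it does not rely on `assert`.

```python
from pathlib import Path
from typing import Optional


class EnvSketch:
    """Hypothetical minimal class; only mirrors the names used in the review."""

    def __init__(self, archive_prefix: Optional[Path], digest: str) -> None:
        self._archive_prefix = archive_prefix
        self._digest = digest

    def hash(self) -> str:
        return self._digest

    @property
    def archive_path(self) -> Path:
        # Explicit check: still enforced under `python -O`, unlike an assert.
        if self._archive_prefix is None:
            raise RuntimeError(
                "archive_path accessed but no archive_prefix is configured."
            )
        return self._archive_prefix / self.hash()


configured = EnvSketch(Path("archives"), "abc123").archive_path

try:
    EnvSketch(None, "abc123").archive_path
    error_message = None
except RuntimeError as e:
    error_message = str(e)
```

The unconfigured case fails with a message that names the missing setting instead of an unrelated error further down the call chain.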
📜 Review details
📒 Files selected for processing (1)
snakemake_interface_software_deployment_plugins/__init__.py
(4 hunks)
🧰 Additional context used
🪛 GitHub Actions: CI
snakemake_interface_software_deployment_plugins/__init__.py
[error] 1-1: Command 'poetry run ruff format --check .' failed. 1 file would be reformatted: snakemake_interface_software_deployment_plugins/__init__.py. Run 'ruff format' to fix.
🔇 Additional comments (6)
snakemake_interface_software_deployment_plugins/__init__.py (6)
11-11: Import for archive removal - LGTM
`shutil` import is appropriate for the new archive removal functionality.
274-276: Assertion before computing deployment_hash - LGTM
Asserting `EnvBase` membership here is reasonable given `_managed_generic_hash("deployment_hash")` depends on `record_deployment_hash`.
279-281: deployment_path invariant - LGTM
Asserting a non-None `deployment_prefix` before building the path is appropriate.
309-313: Archive removal - LGTM
`shutil.rmtree(..., ignore_errors=True)` is a pragmatic choice here to avoid noisy failures when nothing is archived.
1-1: CI failing ruff format - run the formatter to unblock the pipeline
CI reports: “ruff format --check . failed. 1 file would be reformatted: snakemake_interface_software_deployment_plugins/__init__.py”.
Please run `ruff format` (or `poetry run ruff format`) locally and commit the result.
267-272: Add explicit `_archive_prefix` guard and update `managed_deploy` to async
Please apply the diff below to avoid `AssertionError` when no archive is configured and to gate archive-based deploys on the existence of the archive path:
 async def managed_deploy(self) -> None:
-    if isinstance(self, ArchiveableEnvBase) and self.archive_path.exists():
-        await self.deploy_from_archive()
-    else:
-        await self.deploy()
+    # Only attempt archive-based deployment if an archive prefix is configured.
+    if (
+        isinstance(self, ArchiveableEnvBase)
+        and isinstance(self, EnvBase)
+        and self._archive_prefix is not None
+        and (self._archive_prefix / self.hash()).exists()
+    ):
+        await self.deploy_from_archive()
+    else:
+        await self.deploy()
Key action items:
- Breaking change: `managed_deploy` is now `async`. Audit all call sites to ensure they use `await managed_deploy()` (or schedule via `asyncio.create_task`). A quick scan showed no direct invocations; please verify manually.
- Confirm every subclass of `ArchiveableEnvBase` implements the new abstract `deploy_from_archive()` method.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limits.
🔭 Outside diff range comments (2)
snakemake_interface_software_deployment_plugins/__init__.py (2)
217-226
: Hash caching bug: computed digest isn’t stored back; causes repeated recomputation
_managed_generic_hash()
readsself._managed_{kind}_store
but never writes it after computing the digest. This defeats the intended memoization of bothhash()
anddeployment_hash()
, and can significantly degrade performance (e.g.,__hash__
repeatedly recomputes).Apply:
def _managed_generic_hash(self, kind: str) -> str: store = getattr(self, f"_managed_{kind}_store") if store is None: record_hash = f"record_{kind}" hash_object = hashlib.md5() if self.within is not None: getattr(self.within, record_hash)(hash_object) getattr(self, record_hash)(hash_object) - store = hash_object.hexdigest() + store = hash_object.hexdigest() + setattr(self, f"_managed_{kind}_store", store) return store
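The effect of the missing write-back can be checked with a stripped-down sketch. `HashedSketch` is a hypothetical class that keeps only the memoization logic from the diff above (the `within` handling is left out) and counts how often the recording method runs.

```python
import hashlib


class HashedSketch:
    """Hypothetical class reproducing only the memoization pattern."""

    def __init__(self, payload: str) -> None:
        self.payload = payload
        self._managed_hash_store = None
        self.record_calls = 0

    def record_hash(self, hash_object) -> None:
        self.record_calls += 1
        hash_object.update(self.payload.encode())

    def _managed_generic_hash(self, kind: str) -> str:
        store = getattr(self, f"_managed_{kind}_store")
        if store is None:
            hash_object = hashlib.md5()
            getattr(self, f"record_{kind}")(hash_object)
            store = hash_object.hexdigest()
            # The fix: write the digest back so later calls hit the cache.
            setattr(self, f"_managed_{kind}_store", store)
        return store


env = HashedSketch("conda: numpy")
first = env._managed_generic_hash("hash")
second = env._managed_generic_hash("hash")
```

With the `setattr` line in place, `record_hash` runs once for both calls; removing it makes the counter grow with every call.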
137-149: Sync tests with `cache_prefix` rename
We ran a scan and found lingering `archive_prefix` references in the tests. Update these to use the new cache API:
• File: snakemake_interface_software_deployment_plugins/tests.py
– Lines ~128–129:
```diff
-if issubclass(env_cls, ArchiveableEnvBase):
-    args["archive_prefix"] = tmp_path / "archives"
+if issubclass(env_cls, ArchiveableEnvBase):
+    args["cache_prefix"] = tmp_path / "cache"
```
– Update any path literal (`"archives"`) to your chosen cache directory (e.g. `"cache"`).
– Rename the test (and any assertions) from “archive” to “cache” if its name or expectations reference archive semantics.
No other `archive_*` or `deploy_from_archive` symbols were found in production code (only in this test file), so addressing these two lines completes the API rename.
🧹 Nitpick comments (4)
snakemake_interface_software_deployment_plugins/__init__.py (4)
262-268: Typo in docstring ("senstivive")
Small polish to the public API docs.
Apply:
-    deployment is senstivive to the path (e.g. in case of conda, which patches
+    deployment is sensitive to the path (e.g. in case of conda, which patches
269-270: Asserts used for runtime invariants can be optimized away; prefer explicit exceptions for user-facing API
`assert` statements are fine for internal invariants, but with `python -O` they’re removed. Here, losing these checks could result in confusing errors (e.g., `TypeError` when `_deployment_prefix` is None). Consider explicit checks raising `WorkflowError` or `ValueError` for clearer diagnostics.
Example for `deployment_path`:
-    assert isinstance(self, EnvBase) and self._deployment_prefix is not None
-    return self._deployment_prefix / self.deployment_hash()
+    if not isinstance(self, EnvBase):
+        raise WorkflowError("deployment_path accessed on non-EnvBase instance")
+    if self._deployment_prefix is None:
+        raise WorkflowError("deployment_prefix is not set; cannot compute deployment_path")
+    return self._deployment_prefix / self.deployment_hash()
Apply similarly in `deployment_hash`, `managed_deploy`, and `cache_path`.
Also applies to: 300-301, 310-311, 315-316, 320-321
1-321: CI reports low coverage for this module; add focused tests for new error-handling and cache API
Target high-signal behaviors:
- managed_deploy wraps exceptions from deploy into WorkflowError (both success and failure cases).
- managed_remove wraps exceptions from remove into WorkflowError.
- remove_cache removes directory; error path wraps into WorkflowError.
- deployment_path raises a clear error (or fails assertion) when deployment_prefix is None.
- record_deployment_hash changes when deployment path changes for non-portable envs.
- EnvSpecBase now requires str; add a skeletal subclass in tests to verify error messages include spec string.
I can draft pytest tests for these cases. Want me to open a follow-up PR with a test suite to raise coverage over the CI threshold?
246-251: No cache() calls found; caching remains optional
A search across the Python codebase didn’t turn up any `await cache()` invocations. The plugins’ `cache()` hook isn’t called by any orchestrator, so it’s effectively a no-op and treated as optional.
Optional refactor: if you later want symmetry with `managed_deploy`/`managed_remove` and unified error handling, you can add a `managed_cache()` wrapper, for example:
+    async def managed_cache(self) -> None:
+        assert isinstance(self, EnvBase)
+        try:
+            await self.cache()
+        except Exception as e:
+            raise WorkflowError(f"Caching assets for {self.spec} failed: {e}")
📜 Review details
📒 Files selected for processing (1)
snakemake_interface_software_deployment_plugins/__init__.py
(7 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
snakemake_interface_software_deployment_plugins/settings.py (1)
SoftwareDeploymentSettingsBase
(8-17)
🪛 GitHub Actions: CI
snakemake_interface_software_deployment_plugins/__init__.py
[error] 1-321: Low test coverage: 42% (108 missing) in __init__.py. Overall coverage threshold not met (fail-under=63%).
🔇 Additional comments (8)
snakemake_interface_software_deployment_plugins/__init__.py (8)
11-11: LGTM: Imports for cache removal and error wrapping are appropriate
`shutil` and `WorkflowError` are correctly introduced and used below.
Also applies to: 18-18
279-288: LGTM: managed_remove wraps plugin errors in WorkflowError with context
Good defensive wrapper for removal.
289-299: LGTM: remove_cache safely deletes cache with error wrapping
Checks existence and wraps errors with `WorkflowError`. Looks good.
309-312: LGTM: deployment_hash now guarded; consistent with new error handling approach
With the earlier note about asserts vs explicit exceptions.
315-316: LGTM: deployment_path depends on non-None deployment_prefix
See earlier note on replacing asserts with explicit exceptions for clearer UX.
319-321: LGTM: cache_path computes from _cache_prefix and environment hash
This aligns with the cache-based API.
300-307: No direct calls to managed_deploy() found; please verify external usage
I ran a repository-wide grep for both `managed_deploy(` and `await …managed_deploy(` and didn’t locate any invocation sites. It looks like this async API isn’t yet consumed internally, but you should:
- Audit all plugins, tests, and upstream integrations that may call `managed_deploy`
- Ensure any invocation is prefixed with `await` to prevent unawaited coroutine issues
120-124: Abstract `__str__` on EnvSpecBase is a breaking API change; no local subclasses found but downstream plugins will break
– I ran a recursive search for any `class …(EnvSpecBase)` in this repo and found none.
– However, external plugins that subclass `EnvSpecBase` will now be forced to implement `__str__` or else fail at instantiation.
– You should coordinate a major/minor version bump, update any plugin-skeletons and docs to require or supply `__str__`, and communicate this change to plugin authors.
– If you want to soften this break, you can replace the abstract stub with a default implementation:
-    @abstractmethod
-    def __str__(self) -> str:
-        """Return a string representation of the environment spec."""
-        ...
+    def __str__(self) -> str:
+        """Return a string representation of the environment spec."""
+        ident = ", ".join(f"{attr}={getattr(self, attr)!r}" for attr in self.identity_attributes())
+        return f"{self.__class__.__name__}({ident})"
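The instantiation failure described above is easy to reproduce in isolation. The classes below are hypothetical (`SpecBaseSketch` stands in for `EnvSpecBase`); the sketch only demonstrates standard `abc` behavior for a subclass that does not override an abstract `__str__`.

```python
from abc import ABC, abstractmethod


class SpecBaseSketch(ABC):
    """Hypothetical stand-in for EnvSpecBase with an abstract __str__."""

    @abstractmethod
    def __str__(self) -> str: ...


class OldPluginSpec(SpecBaseSketch):
    """A downstream spec written before __str__ became abstract."""


class UpdatedPluginSpec(SpecBaseSketch):
    def __init__(self, name: str) -> None:
        self.name = name

    def __str__(self) -> str:
        return f"UpdatedPluginSpec(name={self.name!r})"


try:
    OldPluginSpec()
    instantiation_error = None
except TypeError as e:
    # ABCs refuse to instantiate classes with unimplemented abstract methods.
    instantiation_error = str(e)

rendered = str(UpdatedPluginSpec("samtools"))
```

The old-style subclass fails at construction time rather than at the first `str()` call, which is the reason the comment treats the change as breaking for plugin authors.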
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
219-229: Hash caching bug: computed value isn’t stored back.
`_managed_generic_hash()` recomputes every time because the store attribute isn’t updated.
 def _managed_generic_hash(self, kind: str) -> str:
     store = getattr(self, f"_managed_{kind}_store")
     if store is None:
         record_hash = f"record_{kind}"
         hash_object = hashlib.md5()
         if self.within is not None:
             getattr(self.within, record_hash)(hash_object)
         getattr(self, record_hash)(hash_object)
         store = hash_object.hexdigest()
+        setattr(self, f"_managed_{kind}_store", store)
     return store
♻️ Duplicate comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
346-353: Async managed_deploy with WorkflowError wrapper: looks good.
This addresses prior feedback to make `managed_deploy` await the deployment routine.
🧹 Nitpick comments (4)
snakemake_interface_software_deployment_plugins/__init__.py (4)
281-296: Optional: fast-path cache removal by deleting the per-env directory.
Once `cache_path` is per-env, you can remove the whole dir instead of iterating assets.
 async def remove_cache(self) -> None:
     """Remove the cached environment assets."""
     assert isinstance(self, EnvBase)
+    # Fast path: drop the entire per-env cache directory if present.
+    if self.cache_path.exists():
+        shutil.rmtree(self.cache_path, ignore_errors=True)
+        return
     for asset in await self.get_cache_assets():
         asset_path = self.cache_path / asset
         if asset_path.exists():
             try:
                 if asset_path.is_dir():
                     shutil.rmtree(asset_path)
                 else:
                     asset_path.unlink()
             except Exception as e:
                 raise WorkflowError(
                     f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
                 )
300-307: Clarify portability docstring (path sensitivity).
Current text conflates “path matters” with “portable”. Suggest clearer wording.
-    def is_deployment_path_portable(self) -> bool:
-        """Return whether the deployment path matters for the environment, i.e.
-        whether the environment is portable. If this returns False, the deployment
+    def is_deployment_path_portable(self) -> bool:
+        """Return whether the environment is portable (path-insensitive).
+        If this returns False, the deployment
         path is considered for the deployment hash. For example, conda environments
         are not portable because they hardcode the path in binaries, while containers
         are portable.
         """
319-325: Typo in docstring.
“senstivive” → “sensitive”.
-    deployment is senstivive to the path (e.g. in case of conda, which patches
+    deployment is sensitive to the path (e.g. in case of conda, which patches
336-345: Preserve original exception context.
Chain the original exception for better diagnostics.
 def managed_remove(self) -> None:
     """Remove the deployed environment, handling exceptions."""
     assert isinstance(self, EnvBase)
     try:
         self.remove()
     except Exception as e:
-        raise WorkflowError(
-            f"Removal of {self.spec} failed: {e}"
-        )
+        raise WorkflowError(f"Removal of {self.spec} failed: {e}") from e
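What `from e` buys can be seen in a few lines. The `WorkflowError` class below is a local stand-in defined only for this sketch (the real one is imported by the module under review), and `remove` is a hypothetical failing operation.

```python
class WorkflowError(Exception):
    """Local stand-in for the real WorkflowError, for illustration only."""


def remove() -> None:
    # Hypothetical plugin operation that fails.
    raise OSError("disk busy")


def managed_remove() -> None:
    try:
        remove()
    except Exception as e:
        # `from e` records the original exception as __cause__.
        raise WorkflowError(f"Removal failed: {e}") from e


try:
    managed_remove()
except WorkflowError as err:
    caught = err
```

With explicit chaining, `caught.__cause__` is the original `OSError`, and a printed traceback labels it as the direct cause instead of as an error that happened "during handling" of another one.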
📜 Review details
📒 Files selected for processing (1)
snakemake_interface_software_deployment_plugins/__init__.py
(7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
snakemake_interface_software_deployment_plugins/settings.py (1)
SoftwareDeploymentSettingsBase
(8-17)
🪛 GitHub Actions: CI
snakemake_interface_software_deployment_plugins/__init__.py
[error] 244-245: Command failed: poetry run coverage run -m pytest tests/tests.py. Pytest collection error: AttributeError: '__isabstractmethod__' of 'classmethod' objects is not writable during PinnableEnvBase class definition (snakemake_interface_software_deployment_plugins/__init__.py:244-245).
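The collection error above is typical of stacking `@abstractmethod` on top of `@classmethod`. A minimal sketch of the working order follows; `PinnableSketch` and `CondaLikeEnv` are hypothetical, and treating `pinfile_extension` as the affected classmethod is an assumption, since the log does not name the method.

```python
from abc import ABC, abstractmethod


class PinnableSketch(ABC):
    """Hypothetical stand-in for PinnableEnvBase."""

    # @classmethod must be the outermost decorator; the reverse order tries
    # to set __isabstractmethod__ on a classmethod object and fails.
    @classmethod
    @abstractmethod
    def pinfile_extension(cls) -> str: ...


class CondaLikeEnv(PinnableSketch):
    @classmethod
    def pinfile_extension(cls) -> str:
        return ".txt"


try:
    PinnableSketch()
    base_is_abstract = False
except TypeError:
    # The base class stays abstract and cannot be instantiated.
    base_is_abstract = True

extension = CondaLikeEnv.pinfile_extension()
```

In this order the class definition succeeds, the base remains abstract, and concrete subclasses can call the classmethod without an instance.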
🔇 Additional comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
120-124: Requiring `__str__` on EnvSpecBase: good API improvement.
This enforces consistent human-readable specs for error messages and logging.
Actionable comments posted: 2
♻️ Duplicate comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
136-151: Constructor signature change is breaking; ensure all call sites updated.
The new cache_prefix/deployment_prefix/pinfile_prefix args will break existing instantiations and downstream plugins unless updated.
🧹 Nitpick comments (5)
snakemake_interface_software_deployment_plugins/__init__.py (5)
258-265: Nit: avoid double call to pinfile_extension().
Reuse the local ext variable to prevent duplicate lookups.
 def pinfile(self) -> Path:
     assert isinstance(self, EnvBase)
     ext = self.pinfile_extension()
     if not ext.startswith("."):
         raise ValueError("pinfile_extension must start with a dot.")
-    return (self._pinfile_prefix / self.hash()).with_suffix(
-        self.pinfile_extension()
-    )
+    return (self._pinfile_prefix / self.hash()).with_suffix(ext)
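For readers unfamiliar with `Path.with_suffix`, here is a short sketch of the path construction in the snippet above. The free function `pinfile_path` and its arguments are hypothetical; it simply mirrors the validated `prefix / hash` plus extension logic.

```python
from pathlib import Path


def pinfile_path(prefix: Path, digest: str, ext: str) -> Path:
    """Build <prefix>/<digest><ext>, validating the extension first."""
    if not ext.startswith("."):
        raise ValueError("pinfile_extension must start with a dot.")
    # A hex digest contains no dot, so with_suffix appends the extension.
    return (prefix / digest).with_suffix(ext)


pinfile = pinfile_path(Path("pins"), "0123abcd", ".lock")

try:
    pinfile_path(Path("pins"), "0123abcd", "lock")
    bad_extension_rejected = False
except ValueError:
    bad_extension_rejected = True
```

Reusing the validated `ext` value, as the nit suggests, also guarantees that the string that was checked is the one that ends up in the path.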
321-333: Typo in docstring.
“senstivive” → “sensitive”.
-    deployment is senstivive to the path (e.g. in case of conda, which patches
+    deployment is sensitive to the path (e.g. in case of conda, which patches
338-345: Preserve traceback chaining.
Raise from the original exception for better diagnostics.
-    except Exception as e:
-        raise WorkflowError(f"Removal of {self.spec} failed: {e}")
+    except Exception as e:
+        raise WorkflowError(f"Removal of {self.spec} failed: {e}") from e
346-352: Same: preserve traceback chaining on deployment errors.
-    except Exception as e:
-        raise WorkflowError(f"Deployment of {self.spec} failed: {e}")
+    except Exception as e:
+        raise WorkflowError(f"Deployment of {self.spec} failed: {e}") from e
219-228: Cache the computed hash to avoid recomputation.
`_managed_{kind}_store` isn’t written back; the hash is recomputed on every call. Save it into the corresponding `_managed_{kind}_store`.
 def _managed_generic_hash(self, kind: str) -> str:
     store = getattr(self, f"_managed_{kind}_store")
     if store is None:
         record_hash = f"record_{kind}"
         hash_object = hashlib.md5()
         if self.within is not None:
             getattr(self.within, record_hash)(hash_object)
         getattr(self, record_hash)(hash_object)
-        store = hash_object.hexdigest()
+        store = hash_object.hexdigest()
+        setattr(self, f"_managed_{kind}_store", store)
     return store
📜 Review details
📒 Files selected for processing (1)
snakemake_interface_software_deployment_plugins/__init__.py
(7 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-11T11:26:53.355Z
Learnt from: johanneskoester
PR: snakemake/snakemake-interface-software-deployment-plugins#18
File: snakemake_interface_software_deployment_plugins/__init__.py:278-282
Timestamp: 2025-09-11T11:26:53.355Z
Learning: In snakemake_interface_software_deployment_plugins, the cache_path property in CacheableEnvBase intentionally returns only self._cache_prefix without appending the environment hash, as the cache is meant to be shared across environments for efficiency.
Applied to files:
snakemake_interface_software_deployment_plugins/__init__.py
🧬 Code graph analysis (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
snakemake_interface_software_deployment_plugins/settings.py (1)
SoftwareDeploymentSettingsBase
(8-17)
🔇 Additional comments (3)
snakemake_interface_software_deployment_plugins/__init__.py (3)
120-124: Good: explicit string contract for EnvSpec.
Adding an abstract `__str__` is helpful for clear error/output messages.
245-247: Decorator order on abstract classmethod is correct.
`@classmethod` above `@abstractmethod` avoids the ABC metaclass issue.
278-282: Intentional shared cache path acknowledged.
Returning only the prefix is by design to share caches across envs; keeping it as-is matches prior decision.
class CacheableEnvBase(ABC):
    async def get_cache_assets(self) -> Iterable[str]: ...

    @abstractmethod
    def cache_assets(self) -> None:
        """Determine environment assets and store any associated information or data to
        self.cache_path.
        """
        ...
🛠️ Refactor suggestion
Make get_cache_assets abstract (or provide a safe default).
As written, remove_cache awaits get_cache_assets(); the current stub `...` will return None and cause a TypeError. Either make it abstract or return an empty iterable.
 class CacheableEnvBase(ABC):
-    async def get_cache_assets(self) -> Iterable[str]: ...
+    @abstractmethod
+    async def get_cache_assets(self) -> Iterable[str]:
+        """Return relative asset paths (files/dirs) under cache_path."""
+        ...
🤖 Prompt for AI Agents
In snakemake_interface_software_deployment_plugins/__init__.py around lines 268
to 276, get_cache_assets is currently a non-abstract stub that returns None
(ellipsis) which causes remove_cache to fail when awaiting it; change it so
callers always get an iterable by either (a) marking get_cache_assets with
@abstractmethod so subclasses must implement it, or (b) provide a safe default
implementation that returns an empty iterable (e.g., "async def
get_cache_assets(self) -> Iterable[str]: return ()") — choose one approach and
update the signature and docstring accordingly so remove_cache can safely
iterate the result.
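The failure mode and option (b) can both be reproduced with toy classes. `StubEnv`, `SafeDefaultEnv`, and `list_assets` are hypothetical names for this sketch; the point is only that an async method whose body is `...` returns `None`, which cannot be iterated.

```python
import asyncio
from typing import Iterable


class StubEnv:
    # The body is just an ellipsis expression, so the coroutine returns None.
    async def get_cache_assets(self) -> Iterable[str]: ...


class SafeDefaultEnv:
    async def get_cache_assets(self) -> Iterable[str]:
        # Safe default: an empty iterable means "nothing cached".
        return ()


async def list_assets(env) -> list:
    # Same access pattern as the loop in remove_cache.
    return [asset for asset in await env.get_cache_assets()]


try:
    asyncio.run(list_assets(StubEnv()))
    stub_error = None
except TypeError as e:
    stub_error = str(e)

safe_assets = asyncio.run(list_assets(SafeDefaultEnv()))
```

Option (a), marking the method abstract, moves the failure to instantiation time instead, which may be preferable if every cacheable environment is expected to report its assets.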
async def remove_cache(self) -> None:
    """Remove the cached environment assets."""
    assert isinstance(self, EnvBase)
    for asset in await self.get_cache_assets():
        asset_path = self.cache_path / asset
        if asset_path.exists():
            try:
                if asset_path.is_dir():
                    shutil.rmtree(asset_path)
                else:
                    asset_path.unlink()
            except Exception as e:
                raise WorkflowError(
                    f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
                )
🛠️ Refactor suggestion
Guard against path traversal when deleting cache assets.
Validate resolved asset paths stay within cache_path to avoid deleting arbitrary files if a plugin returns a malicious/buggy asset like "../../x".
 async def remove_cache(self) -> None:
     """Remove the cached environment assets."""
     assert isinstance(self, EnvBase)
-    for asset in await self.get_cache_assets():
-        asset_path = self.cache_path / asset
+    base = self.cache_path.resolve()
+    for asset in await self.get_cache_assets():
+        asset_path = (base / asset).resolve()
+        if not asset_path.is_relative_to(base):
+            raise WorkflowError(
+                f"Refusing to remove cache asset outside cache_prefix: {asset_path}"
+            )
         if asset_path.exists():
             try:
                 if asset_path.is_dir():
                     shutil.rmtree(asset_path)
                 else:
                     asset_path.unlink()
             except Exception as e:
-                raise WorkflowError(
-                    f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
-                )
+                raise WorkflowError(
+                    f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
+                ) from e
🤖 Prompt for AI Agents
In snakemake_interface_software_deployment_plugins/__init__.py around lines
283-298, the removal loop currently deletes assets by joining raw asset names
with cache_path which allows path-traversal like "../../foo"; before deleting,
resolve the asset path and the cache root (use Path.resolve(strict=False)),
verify the resolved asset path is inside the resolved cache root (use
Path.relative_to or os.path.commonpath) and if not, either skip the asset or
raise a WorkflowError; only perform is_dir()/rmtree or unlink on the validated
resolved path to prevent deleting files outside the cache directory.
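The containment check described above can be tried out against a temporary directory. The helper `resolve_inside` is a hypothetical name for this sketch, and it raises `ValueError` where the review suggests `WorkflowError`; `Path.is_relative_to` requires Python 3.9 or newer.

```python
import tempfile
from pathlib import Path


def resolve_inside(base: Path, asset: str) -> Path:
    """Resolve `asset` under `base`, rejecting anything that escapes it."""
    base = base.resolve()
    candidate = (base / asset).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"Refusing to touch path outside {base}: {candidate}")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    cache_root = Path(tmp) / "cache"
    cache_root.mkdir()

    # A well-behaved relative asset stays under the cache root.
    inside = resolve_inside(cache_root, "pkgs/numpy.tar.bz2")
    inside_ok = inside.is_relative_to(cache_root.resolve())

    # A traversal attempt resolves outside the root and is rejected.
    try:
        resolve_inside(cache_root, "../../x")
        traversal_blocked = False
    except ValueError:
        traversal_blocked = True
```

Resolving both paths before comparing them is what defeats `..` segments and symlinks; a plain string prefix check on the unresolved paths would not.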
🤖 I have created a release *beep* *boop*
---
## [0.7.0](v0.6.1...v0.7.0) (2025-09-11)
### Features
* deploy from archive ([#18](#18)) ([f6d62bd](f6d62bd))
* pass tempdir to EnvBase ([#16](#16)) ([28d8351](28d8351))
---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary by CodeRabbit
New Features
Compatibility / API