feat: deploy from archive #18

johanneskoester · 2025-08-18T14:33:17Z

Summary by CodeRabbit

New Features
- Deployment switched from archive-based to cache-based storage with explicit pinning and a public cache path for managing cached environment assets.
- Environments expose a stable string representation for specs.
- Added safe cache removal with improved error handling.
Compatibility / API
- Deployment and cache lifecycle is now async-aware; plugins should adopt the new async workflow and cache/pin APIs.
- Stronger runtime checks for deployment/caching operations.

coderabbitai · 2025-08-18T14:33:25Z

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)

pyproject.toml is excluded by !pyproject.toml

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Replaces archive-based deployment with a cache-and-pin model: removes ArchiveableEnvBase; adds PinnableEnvBase and CacheableEnvBase; EnvBase constructor and prefixes renamed to cache/pin/deployment; managed_deploy becomes async; adds cache/pin APIs and runtime assertions with WorkflowError handling.

Changes

Cohort / File(s)	Summary
Core deployment & caching `snakemake_interface_software_deployment_plugins/__init__.py`	- Removed `ArchiveableEnvBase`. - EnvBase: constructor signature changed (replaced `archive_prefix` with `cache_prefix`, added `deployment_prefix` and `pinfile_prefix`), internal `_archive_prefix` → `_cache_prefix`, added `_pinfile_prefix`. - Added `PinnableEnvBase` (pinfile_extension, abstract pin, `pinfile` property) and `CacheableEnvBase` (async `get_cache_assets`, abstract `cache_assets`, `cache_path` property, async `remove_cache`). - EnvSpecBase: added abstract `__str__`. - `DeployableEnvBase.managed_deploy` → async and wraps `deploy()` with `WorkflowError`; added `managed_remove`. - `record_deployment_hash`, `deployment_hash`, `deployment_path` now assert `isinstance(self, EnvBase)` and validate `_deployment_prefix`. - Replaced `archive_path` with `cache_path`. - Imports updated (added `shutil`, `WorkflowError`) and portable deployment check `is_deployment_path_portable` added.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Env as DeployableEnvBase (concrete)

  Caller->>Env: await managed_deploy()
  alt env supports caching (CacheableEnvBase)
    Env->>Env: await get_cache_assets()
    Env->>Env: await cache_assets()    %% populate cache at cache_path
  end
  Env->>Env: await deploy()            %% uses deployment_prefix / cache_path
  Env-->>Caller: return success

  rect rgba(240,240,255,0.6)
    note right of Env: Removal path
    Caller->>Env: managed_remove()
    Env->>Env: await remove_cache()    --x--> OS filesystem (shutil.rmtree)
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

feat: make deploy async #11: Overlaps on converting deployment/archive operations to async and touches archive/cache handling; likely directly related.

Pre-merge checks (1 passed, 2 warnings)

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Title Check	⚠️ Warning	The title "feat: deploy from archive" is misleading because the changeset removes archive-based paths and the ArchiveableEnvBase class and instead introduces a cache- and pin-based deployment model (CacheableEnvBase, PinnableEnvBase, cache_path, pinfile, etc.), so the title does not describe the primary change and could confuse reviewers.	Rename the PR to reflect the actual change, for example: "feat: migrate deployment to cache-based model with pinning (remove archive support)"; update the title before merge so it succinctly summarizes the main intent.
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/deploy-from-archive

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bb2067a and 5916c08.

📒 Files selected for processing (1)

snakemake_interface_software_deployment_plugins/__init__.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit Configuration File

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

snakemake_interface_software_deployment_plugins/__init__.py

snakemake_interface_software_deployment_plugins/__init__.py

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

snakemake_interface_software_deployment_plugins/__init__.py (1)
294-302: Docstring: explicitly note this is a coroutine that must be awaited

Clarify the async contract in the docstring to reduce misuse.

Apply this diff:
     async def deploy_from_archive(self) -> None:
         """Deploy the environment from self.archive_path to self.deployment_path.
 
         When issuing shell commands, the environment should use
         self.run_cmd(cmd: str) in order to ensure that it runs within eventual
         parent environments (e.g. a container or an env module).
+
+        This is a coroutine; callers must await it.
         """
         ...

🧹 Nitpick comments (2)

snakemake_interface_software_deployment_plugins/__init__.py (2)
257-261: Guard against missing deployment_prefix when hashing non-portable deployments

If _deployment_prefix is None, str(None) gets hashed, which can conceal a configuration issue and produce unstable semantics. Assert it’s set when the path is not portable.

Also, nit in the docstring above: "senstivive" → "sensitive".

Apply this diff:
 def record_deployment_hash(self, hash_object) -> None:
@@
-        assert isinstance(self, EnvBase)
+        assert isinstance(self, EnvBase)
         self.record_hash(hash_object)
         if not self.is_deployment_path_portable():
+            assert (
+                self._deployment_prefix is not None
+            ), "deployment_prefix must be set when deployment path is not portable"
             hash_object.update(str(self._deployment_prefix).encode())
Optional typo fix for the docstring just above:
-        deployment is senstivive to the path (e.g. in case of conda, which patches
+        deployment is sensitive to the path (e.g. in case of conda, which patches
306-307: Prefer explicit exceptions over asserts for public API invariants

Asserts can be stripped with Python -O, turning this into a potential AttributeError later. Raise a descriptive exception instead.

Apply this diff:
     @property
     def archive_path(self) -> Path:
-        assert isinstance(self, EnvBase) and self._archive_prefix is not None
-        return self._archive_prefix / self.hash()
+        if not isinstance(self, EnvBase) or self._archive_prefix is None:
+            raise RuntimeError(
+                "archive_path accessed but no archive_prefix is configured."
+            )
+        return self._archive_prefix / self.hash()

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5916c08 and 617189c.

📒 Files selected for processing (1)

snakemake_interface_software_deployment_plugins/__init__.py (4 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit Configuration File

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

snakemake_interface_software_deployment_plugins/__init__.py

🪛 GitHub Actions: CI

snakemake_interface_software_deployment_plugins/__init__.py

[error] 1-1: Command 'poetry run ruff format --check .' failed. 1 file would be reformatted: snakemake_interface_software_deployment_plugins/init.py. Run 'ruff format' to fix.

🔇 Additional comments (6)

snakemake_interface_software_deployment_plugins/__init__.py (6)

11-11: Import for archive removal — LGTM

shutil import is appropriate for the new archive removal functionality.

274-276: Assertion before computing deployment_hash — LGTM

Asserting EnvBase membership here is reasonable given _managed_generic_hash("deployment_hash") depends on record_deployment_hash.

279-281: deployment_path invariant — LGTM

Asserting a non-None deployment_prefix before building the path is appropriate.

309-313: Archive removal — LGTM

shutil.rmtree(..., ignore_errors=True) is a pragmatic choice here to avoid noisy failures when nothing is archived.

1-1: CI failing ruff format — run the formatter to unblock the pipeline

CI reports: “ruff format --check . failed. 1 file would be reformatted: snakemake_interface_software_deployment_plugins/init.py”.

Please run ruff format (or poetry run ruff format) locally and commit the result.

267-272: Add explicit _archive_prefix guard and update managed_deploy to async

Please apply the diff below to avoid AssertionError when no archive is configured and to gate archive‐based deploys on the existence of the archive path:
 async def managed_deploy(self) -> None:
-        if isinstance(self, ArchiveableEnvBase) and self.archive_path.exists():
-            await self.deploy_from_archive()
-        else:
-            await self.deploy()
+        # Only attempt archive-based deployment if an archive prefix is configured.
+        if (
+            isinstance(self, ArchiveableEnvBase)
+            and isinstance(self, EnvBase)
+            and self._archive_prefix is not None
+            and (self._archive_prefix / self.hash()).exists()
+        ):
+            await self.deploy_from_archive()
+        else:
+            await self.deploy()
Key action items:

Breaking change: managed_deploy is now async. Audit all call sites to ensure they use await managed_deploy() (or schedule via asyncio.create_task). A quick scan showed no direct invocations—please verify manually.

Confirm every subclass of ArchiveableEnvBase implements the new abstract deploy_from_archive() method.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limits.

🔭 Outside diff range comments (2)

snakemake_interface_software_deployment_plugins/__init__.py (2)
217-226: Hash caching bug: computed digest isn’t stored back; causes repeated recomputation

_managed_generic_hash() reads self._managed_{kind}_store but never writes it after computing the digest. This defeats the intended memoization of both hash() and deployment_hash(), and can significantly degrade performance (e.g., __hash__ repeatedly recomputes).

Apply:
     def _managed_generic_hash(self, kind: str) -> str:
         store = getattr(self, f"_managed_{kind}_store")
         if store is None:
             record_hash = f"record_{kind}"
             hash_object = hashlib.md5()
             if self.within is not None:
                 getattr(self.within, record_hash)(hash_object)
             getattr(self, record_hash)(hash_object)
-            store = hash_object.hexdigest()
+            store = hash_object.hexdigest()
+            setattr(self, f"_managed_{kind}_store", store)
         return store
137-149: Sync tests with cache_prefix rename

We ran a scan and found lingering archive_prefix references in the tests. Update these to use the new cache API:

• File: snakemake_interface_software_deployment_plugins/tests.py
– Lines ~128–129:
```diff
      if issubclass(env_cls, ArchiveableEnvBase):
          args["archive_prefix"] = tmp_path / "archives"
      if issubclass(env_cls, ArchiveableEnvBase):
          args["cache_prefix"] = tmp_path / "cache"
– Update any path literal (`"archives"`) to your chosen cache directory (e.g. `"cache"`).  
– Rename the test (and any assertions) from “archive” to “cache” if its name or expectations reference archive semantics.
No other archive_* or deploy_from_archive symbols were found in production code—only in this test file—so addressing these two lines completes the API rename.

🧹 Nitpick comments (4)

snakemake_interface_software_deployment_plugins/__init__.py (4)
262-268: Typo in docstring ("senstivive")

Small polish to the public API docs.

Apply:
-        deployment is senstivive to the path (e.g. in case of conda, which patches
+        deployment is sensitive to the path (e.g. in case of conda, which patches
269-270: Asserts used for runtime invariants can be optimized away; prefer explicit exceptions for user-facing API

assert statements are fine for internal invariants, but with python -O they’re removed. Here, losing these checks could result in confusing errors (e.g., TypeError when _deployment_prefix is None). Consider explicit checks raising WorkflowError or ValueError for clearer diagnostics.

Example for deployment_path:
-        assert isinstance(self, EnvBase) and self._deployment_prefix is not None
-        return self._deployment_prefix / self.deployment_hash()
+        if not isinstance(self, EnvBase):
+            raise WorkflowError("deployment_path accessed on non-EnvBase instance")
+        if self._deployment_prefix is None:
+            raise WorkflowError("deployment_prefix is not set; cannot compute deployment_path")
+        return self._deployment_prefix / self.deployment_hash()
Apply similarly in deployment_hash, managed_deploy, and cache_path.

Also applies to: 300-301, 310-311, 315-316, 320-321

1-321: CI reports low coverage for this module—add focused tests for new error-handling and cache API

Target high-signal behaviors:

managed_deploy wraps exceptions from deploy into WorkflowError (both success and failure cases).

managed_remove wraps exceptions from remove into WorkflowError.

remove_cache removes directory; error path wraps into WorkflowError.

deployment_path raises a clear error (or fails assertion) when deployment_prefix is None.

record_deployment_hash changes when deployment path changes for non-portable envs.

EnvSpecBase now requires str; add a skeletal subclass in tests to verify error messages include spec string.

I can draft pytest tests for these cases. Want me to open a follow-up PR with a test suite to raise coverage over the CI threshold?

246-251: No cache() calls found; caching remains optional
A search across the Python codebase didn’t turn up any await cache() invocations—plugins’ cache() hook isn’t called by any orchestrator, so it’s effectively a no-op and treated as optional.

Optional refactor: if you later want symmetry with managed_deploy/managed_remove and unified error handling, you can add a managed_cache() wrapper, for example:
+    async def managed_cache(self) -> None:
+        assert isinstance(self, EnvBase)
+        try:
+            await self.cache()
+        except Exception as e:
+            raise WorkflowError(f"Caching assets for {self.spec} failed: {e}")

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 617189c and 62be1da.

📒 Files selected for processing (1)

snakemake_interface_software_deployment_plugins/__init__.py (7 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit Configuration File

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

snakemake_interface_software_deployment_plugins/__init__.py

🧬 Code Graph Analysis (1)

snakemake_interface_software_deployment_plugins/__init__.py (1)

snakemake_interface_software_deployment_plugins/settings.py (1)

SoftwareDeploymentSettingsBase (8-17)

🪛 GitHub Actions: CI

snakemake_interface_software_deployment_plugins/__init__.py

[error] 1-321: Low test coverage: 42% (108 missing) in init.py. Overall coverage threshold not met (fail-under=63%).

🔇 Additional comments (8)

snakemake_interface_software_deployment_plugins/__init__.py (8)

11-11: LGTM: Imports for cache removal and error wrapping are appropriate

shutil and WorkflowError are correctly introduced and used below.

Also applies to: 18-18

279-288: LGTM: managed_remove wraps plugin errors in WorkflowError with context

Good defensive wrapper for removal.

289-299: LGTM: remove_cache safely deletes cache with error wrapping

Checks existence and wraps errors with WorkflowError. Looks good.

309-312: LGTM: deployment_hash now guarded; consistent with new error handling approach

With the earlier note about asserts vs explicit exceptions.

315-316: LGTM: deployment_path depends on non-None deployment_prefix

See earlier note on replacing asserts with explicit exceptions for clearer UX.

319-321: LGTM: cache_path computes from _cache_prefix and environment hash

This aligns with the cache-based API.

300-307: No direct calls to managed_deploy() found—please verify external usage
I ran a repository-wide grep for both managed_deploy( and await …managed_deploy( and didn’t locate any invocation sites. It looks like this async API isn’t yet consumed internally, but you should:

Audit all plugins, tests, and upstream integrations that may call managed_deploy

Ensure any invocation is prefixed with await to prevent unawaited coroutine issues

120-124: Abstract str on EnvSpecBase is a breaking API change—no local subclasses found but downstream plugins will break

– I ran a recursive search for any class …(EnvSpecBase) in this repo and found none.
– However, external plugins that subclass EnvSpecBase will now be forced to implement __str__ or else fail at instantiation.
– You should coordinate a major/minor version bump, update any plugin-skeletons and docs to require or supply __str__, and communicate this change to plugin authors.
– If you want to soften this break, you can replace the abstract stub with a default implementation:
-    @abstractmethod
-    def __str__(self) -> str:
-        """Return a string representation of the environment spec."""
-        ...
+    def __str__(self) -> str:
+        """Return a string representation of the environment spec."""
+        ident = ", ".join(f"{attr}={getattr(self, attr)!r}" for attr in self.identity_attributes())
+        return f"{self.__class__.__name__}({ident})"

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

snakemake_interface_software_deployment_plugins/__init__.py (1)

219-229: Hash caching bug: computed value isn’t stored back.

_managed_generic_hash() recomputes every time because the store attribute isn’t updated.

     def _managed_generic_hash(self, kind: str) -> str:
         store = getattr(self, f"_managed_{kind}_store")
         if store is None:
             record_hash = f"record_{kind}"
             hash_object = hashlib.md5()
             if self.within is not None:
                 getattr(self.within, record_hash)(hash_object)
             getattr(self, record_hash)(hash_object)
             store = hash_object.hexdigest()
+            setattr(self, f"_managed_{kind}_store", store)
         return store

♻️ Duplicate comments (1)

snakemake_interface_software_deployment_plugins/__init__.py (1)

346-353: Async managed_deploy with WorkflowError wrapper — looks good.

This addresses prior feedback to make managed_deploy await the deployment routine.

🧹 Nitpick comments (4)

snakemake_interface_software_deployment_plugins/__init__.py (4)

281-296: Optional: fast-path cache removal by deleting the per-env directory.

Once cache_path is per-env, you can remove the whole dir instead of iterating assets.

     async def remove_cache(self) -> None:
         """Remove the cached environment assets."""
         assert isinstance(self, EnvBase)
+        # Fast path: drop the entire per-env cache directory if present.
+        if self.cache_path.exists():
+            shutil.rmtree(self.cache_path, ignore_errors=True)
+            return
         for asset in await self.get_cache_assets():
             asset_path = self.cache_path / asset
             if asset_path.exists():
                 try:
                     if asset_path.is_dir():
                         shutil.rmtree(asset_path)
                     else:
                         asset_path.unlink()
                 except Exception as e:
                     raise WorkflowError(
                         f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
                     )

300-307: Clarify portability docstring (path sensitivity).

Current text conflates “path matters” with “portable”. Suggest clearer wording.

-    def is_deployment_path_portable(self) -> bool: 
-        """Return whether the deployment path matters for the environment, i.e.
-        whether the environment is portable. If this returns False, the deployment
+    def is_deployment_path_portable(self) -> bool:
+        """Return whether the environment is portable (path-insensitive).
+        If this returns False, the deployment
         path is considered for the deployment hash. For example, conda environments are not
         portable because they hardcode the path in binaries, while containers are
         portable.
         """

319-325: Typo in docstring.

“senstivive” → “sensitive”.

-        deployment is senstivive to the path (e.g. in case of conda, which patches
+        deployment is sensitive to the path (e.g. in case of conda, which patches

336-345: Preserve original exception context.

Chain the original exception for better diagnostics.

     def managed_remove(self) -> None:
         """Remove the deployed environment, handling exceptions."""
         assert isinstance(self, EnvBase)
         try:
             self.remove()
         except Exception as e:
-            raise WorkflowError(
-                f"Removal of {self.spec} failed: {e}"
-            )
+            raise WorkflowError(f"Removal of {self.spec} failed: {e}") from e

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62be1da and 82d7144.

📒 Files selected for processing (1)

snakemake_interface_software_deployment_plugins/__init__.py (7 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

snakemake_interface_software_deployment_plugins/__init__.py

🧬 Code graph analysis (1)

snakemake_interface_software_deployment_plugins/__init__.py (1)

snakemake_interface_software_deployment_plugins/settings.py (1)

SoftwareDeploymentSettingsBase (8-17)

🪛 GitHub Actions: CI

snakemake_interface_software_deployment_plugins/__init__.py

[error] 244-245: Command failed: poetry run coverage run -m pytest tests/tests.py. Pytest collection error: AttributeError: 'isabstractmethod' of 'classmethod' objects is not writable during PinnableEnvBase class definition (snakemake_interface_software_deployment_plugins/init.py:244-245).

🔇 Additional comments (1)

snakemake_interface_software_deployment_plugins/__init__.py (1)

120-124: Requiring str on EnvSpecBase — good API improvement.

This enforces consistent human-readable specs for error messages and logging.

snakemake_interface_software_deployment_plugins/__init__.py

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

snakemake_interface_software_deployment_plugins/__init__.py (1)

136-151: Constructor signature change is breaking; ensure all call sites updated.

The new cache_prefix/deployment_prefix/pinfile_prefix args will break existing instantiations and downstream plugins unless updated.

🧹 Nitpick comments (5)

snakemake_interface_software_deployment_plugins/__init__.py (5)

258-265: Nit: avoid double call to pinfile_extension().

Reuse the local ext variable to prevent duplicate lookups.

     def pinfile(self) -> Path:
         assert isinstance(self, EnvBase)
         ext = self.pinfile_extension()
         if not ext.startswith("."):
             raise ValueError("pinfile_extension must start with a dot.")
-        return (self._pinfile_prefix / self.hash()).with_suffix(
-            self.pinfile_extension()
-        )
+        return (self._pinfile_prefix / self.hash()).with_suffix(ext)

321-333: Typo in docstring.

“senstivive” → “sensitive”.

-        deployment is senstivive to the path (e.g. in case of conda, which patches
+        deployment is sensitive to the path (e.g. in case of conda, which patches

338-345: Preserve traceback chaining.

Raise from the original exception for better diagnostics.

-        except Exception as e:
-            raise WorkflowError(f"Removal of {self.spec} failed: {e}")
+        except Exception as e:
+            raise WorkflowError(f"Removal of {self.spec} failed: {e}") from e

346-352: Same: preserve traceback chaining on deployment errors.

-        except Exception as e:
-            raise WorkflowError(f"Deployment of {self.spec} failed: {e}")
+        except Exception as e:
+            raise WorkflowError(f"Deployment of {self.spec} failed: {e}") from e

219-228: Cache the computed hash to avoid recomputation.

_store isn’t written back; hash is recomputed every call. Save it into the corresponding managed*_store.

     def _managed_generic_hash(self, kind: str) -> str:
         store = getattr(self, f"_managed_{kind}_store")
         if store is None:
             record_hash = f"record_{kind}"
             hash_object = hashlib.md5()
             if self.within is not None:
                 getattr(self.within, record_hash)(hash_object)
             getattr(self, record_hash)(hash_object)
-            store = hash_object.hexdigest()
+            store = hash_object.hexdigest()
+            setattr(self, f"_managed_{kind}_store", store)
         return store

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 82d7144 and b58c822.

📒 Files selected for processing (1)

snakemake_interface_software_deployment_plugins/__init__.py (7 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

snakemake_interface_software_deployment_plugins/__init__.py

🧠 Learnings (1)

📚 Learning: 2025-09-11T11:26:53.355Z

Learnt from: johanneskoester
PR: snakemake/snakemake-interface-software-deployment-plugins#18
File: snakemake_interface_software_deployment_plugins/__init__.py:278-282
Timestamp: 2025-09-11T11:26:53.355Z
Learning: In snakemake_interface_software_deployment_plugins, the cache_path property in CacheableEnvBase intentionally returns only self._cache_prefix without appending the environment hash, as the cache is meant to be shared across environments for efficiency.

Applied to files:

snakemake_interface_software_deployment_plugins/__init__.py

🧬 Code graph analysis (1)

snakemake_interface_software_deployment_plugins/__init__.py (1)

snakemake_interface_software_deployment_plugins/settings.py (1)

SoftwareDeploymentSettingsBase (8-17)

🔇 Additional comments (3)

snakemake_interface_software_deployment_plugins/__init__.py (3)

120-124: Good: explicit string contract for EnvSpec.

Adding an abstract str is helpful for clear error/output messages.

245-247: Decorator order on abstract classmethod is correct.

@classmethod above @AbstractMethod avoids the ABC metaclass issue.

278-282: Intentional shared cache path acknowledged.

Returning only the prefix is by design to share caches across envs; keeping it as-is matches prior decision.

coderabbitai · 2025-09-11T11:34:17Z

snakemake_interface_software_deployment_plugins/__init__.py

+class CacheableEnvBase(ABC):
+    async def get_cache_assets(self) -> Iterable[str]: ...
+
+    @abstractmethod
+    def cache_assets(self) -> None:
+        """Determine environment assets and store any associated information or data to
+        self.cache_path.
+        """
+        ...


🛠️ Refactor suggestion

Make get_cache_assets abstract (or provide a safe default).

As written, remove_cache awaits get_cache_assets(); the current stub ... will return None and cause a TypeError. Either make it abstract or return an empty iterable.

class CacheableEnvBase(ABC): - async def get_cache_assets(self) -> Iterable[str]: ... + @abstractmethod + async def get_cache_assets(self) -> Iterable[str]: + """Return relative asset paths (files/dirs) under cache_path.""" + ...

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class CacheableEnvBase(ABC):

async def get_cache_assets(self) -> Iterable[str]: ...

@abstractmethod

def cache_assets(self) -> None:

"""Determine environment assets and store any associated information or data to

self.cache_path.

"""

...

class CacheableEnvBase(ABC):

@abstractmethod

async def get_cache_assets(self) -> Iterable[str]:

"""Return relative asset paths (files/dirs) under cache_path."""

...

@abstractmethod

def cache_assets(self) -> None:

"""Determine environment assets and store any associated information or data to

self.cache_path.

"""

...

🤖 Prompt for AI Agents

In snakemake_interface_software_deployment_plugins/__init__.py around lines 268 to 276, get_cache_assets is currently a non-abstract stub that returns None (ellipsis) which causes remove_cache to fail when awaiting it; change it so callers always get an iterable by either (a) marking get_cache_assets with @abstractmethod so subclasses must implement it, or (b) provide a safe default implementation that returns an empty iterable (e.g., "async def get_cache_assets(self) -> Iterable[str]: return ()") — choose one approach and update the signature and docstring accordingly so remove_cache can safely iterate the result.

coderabbitai · 2025-09-11T11:34:17Z

snakemake_interface_software_deployment_plugins/__init__.py

+    async def remove_cache(self) -> None:
+        """Remove the cached environment assets."""
+        assert isinstance(self, EnvBase)
+        for asset in await self.get_cache_assets():
+            asset_path = self.cache_path / asset
+            if asset_path.exists():
+                try:
+                    if asset_path.is_dir():
+                        shutil.rmtree(asset_path)
+                    else:
+                        asset_path.unlink()
+                except Exception as e:
+                    raise WorkflowError(
+                        f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
+                    )
+


🛠️ Refactor suggestion

Guard against path traversal when deleting cache assets.

Validate resolved asset paths stay within cache_path to avoid deleting arbitrary files if a plugin returns a malicious/buggy asset like "../../x".

async def remove_cache(self) -> None: """Remove the cached environment assets.""" assert isinstance(self, EnvBase) - for asset in await self.get_cache_assets(): - asset_path = self.cache_path / asset + base = self.cache_path.resolve() + for asset in await self.get_cache_assets(): + asset_path = (base / asset).resolve() + if not asset_path.is_relative_to(base): + raise WorkflowError( + f"Refusing to remove cache asset outside cache_prefix: {asset_path}" + ) if asset_path.exists(): try: if asset_path.is_dir(): shutil.rmtree(asset_path) else: asset_path.unlink() except Exception as e: - raise WorkflowError( - f"Removal of cache asset {asset_path} for {self.spec} failed: {e}" - ) + raise WorkflowError( + f"Removal of cache asset {asset_path} for {self.spec} failed: {e}" + ) from e

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def remove_cache(self) -> None:

"""Remove the cached environment assets."""

assert isinstance(self, EnvBase)

for asset in await self.get_cache_assets():

asset_path = self.cache_path / asset

if asset_path.exists():

try:

if asset_path.is_dir():

shutil.rmtree(asset_path)

else:

asset_path.unlink()

except Exception as e:

raise WorkflowError(

f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"

)

async def remove_cache(self) -> None:

"""Remove the cached environment assets."""

assert isinstance(self, EnvBase)

base = self.cache_path.resolve()

for asset in await self.get_cache_assets():

asset_path = (base / asset).resolve()

if not asset_path.is_relative_to(base):

raise WorkflowError(

f"Refusing to remove cache asset outside cache_prefix: {asset_path}"

)

if asset_path.exists():

try:

if asset_path.is_dir():

shutil.rmtree(asset_path)

else:

asset_path.unlink()

except Exception as e:

raise WorkflowError(

f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"

) from e

🤖 Prompt for AI Agents

In snakemake_interface_software_deployment_plugins/__init__.py around lines 283-298, the removal loop currently deletes assets by joining raw asset names with cache_path which allows path-traversal like "../../foo"; before deleting, resolve the asset path and the cache root (use Path.resolve(strict=False)), verify the resolved asset path is inside the resolved cache root (use Path.relative_to or os.path.commonpath) and if not, either skip the asset or raise a WorkflowError; only perform is_dir()/rmtree or unlink on the validated resolved path to prevent deleting files outside the cache directory.

🤖 I have created a release *beep* *boop* --- ## [0.7.0](v0.6.1...v0.7.0) (2025-09-11) ### Features * deploy from archive ([#18](#18)) ([f6d62bd](f6d62bd)) * pass tempdir to EnvBase ([#16](#16)) ([28d8351](28d8351)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

feat: deploy from archive

7931ed8

fmt

5916c08

coderabbitai bot reviewed Aug 18, 2025

View reviewed changes

snakemake_interface_software_deployment_plugins/__init__.py Outdated Show resolved Hide resolved

fix

617189c

coderabbitai bot reviewed Aug 18, 2025

View reviewed changes

migrate archive into cache semantics

62be1da

coderabbitai bot reviewed Aug 19, 2025

View reviewed changes

refactoring

82d7144

coderabbitai bot reviewed Sep 11, 2025

View reviewed changes

johanneskoester added 4 commits September 11, 2025 13:23

fix

bed9820

fmt

ba2a72d

fix

b58c822

lower min cov for now

e0a5847

coderabbitai bot reviewed Sep 11, 2025

View reviewed changes

johanneskoester merged commit f6d62bd into main Sep 11, 2025
4 checks passed

johanneskoester deleted the feat/deploy-from-archive branch September 11, 2025 12:04

github-actions bot mentioned this pull request Sep 11, 2025

chore(main): release 0.7.0 #17

Merged

This was referenced Sep 12, 2025

fix: adapt tests to API changes #19

Merged

fix: streamline inheritance annotation #23

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: deploy from archive #18

feat: deploy from archive #18

Uh oh!

johanneskoester commented Aug 18, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Aug 18, 2025 •

edited

Loading

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Sep 11, 2025

Uh oh!

coderabbitai bot Sep 11, 2025

Uh oh!

Uh oh!

Uh oh!

feat: deploy from archive #18

feat: deploy from archive #18

Uh oh!

Conversation

johanneskoester commented Aug 18, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Pre-merge checks (1 passed, 2 warnings)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 11, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 11, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

johanneskoester commented Aug 18, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 18, 2025 •

edited

Loading