Skip to content

Conversation

johanneskoester
Copy link
Contributor

@johanneskoester johanneskoester commented Aug 18, 2025

Summary by CodeRabbit

  • New Features

    • Deployment switched from archive-based to cache-based storage with explicit pinning and a public cache path for managing cached environment assets.
    • Environments expose a stable string representation for specs.
    • Added safe cache removal with improved error handling.
  • Compatibility / API

    • Deployment and cache lifecycle is now async-aware; plugins should adopt the new async workflow and cache/pin APIs.
    • Stronger runtime checks for deployment/caching operations.

Copy link
Contributor

coderabbitai bot commented Aug 18, 2025

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)
  • pyproject.toml is excluded by !pyproject.toml

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Replaces archive-based deployment with a cache-and-pin model: removes ArchiveableEnvBase; adds PinnableEnvBase and CacheableEnvBase; EnvBase constructor and prefixes renamed to cache/pin/deployment; managed_deploy becomes async; adds cache/pin APIs and runtime assertions with WorkflowError handling.

Changes

Cohort / File(s) Summary
Core deployment & caching
snakemake_interface_software_deployment_plugins/__init__.py
- Removed ArchiveableEnvBase. - EnvBase: constructor signature changed (replaced archive_prefix with cache_prefix, added deployment_prefix and pinfile_prefix), internal _archive_prefix_cache_prefix, added _pinfile_prefix. - Added PinnableEnvBase (pinfile_extension, abstract pin, pinfile property) and CacheableEnvBase (async get_cache_assets, abstract cache_assets, cache_path property, async remove_cache). - EnvSpecBase: added abstract __str__. - DeployableEnvBase.managed_deploy → async and wraps deploy() with WorkflowError; added managed_remove. - record_deployment_hash, deployment_hash, deployment_path now assert isinstance(self, EnvBase) and validate _deployment_prefix. - Replaced archive_path with cache_path. - Imports updated (added shutil, WorkflowError) and portable deployment check is_deployment_path_portable added.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant Env as DeployableEnvBase (concrete)

  Caller->>Env: await managed_deploy()
  alt env supports caching (CacheableEnvBase)
    Env->>Env: await get_cache_assets()
    Env->>Env: await cache_assets()    %% populate cache at cache_path
  end
  Env->>Env: await deploy()            %% uses deployment_prefix / cache_path
  Env-->>Caller: return success

  rect rgba(240,240,255,0.6)
    note right of Env: Removal path
    Caller->>Env: managed_remove()
    Env->>Env: await remove_cache()    --x--> OS filesystem (shutil.rmtree)
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • feat: make deploy async #11: Overlaps on converting deployment/archive operations to async and touches archive/cache handling; likely directly related.

Pre-merge checks (1 passed, 2 warnings)

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The title "feat: deploy from archive" is misleading because the changeset removes archive-based paths and the ArchiveableEnvBase class and instead introduces a cache- and pin-based deployment model (CacheableEnvBase, PinnableEnvBase, cache_path, pinfile, etc.), so the title does not describe the primary change and could confuse reviewers. Rename the PR to reflect the actual change, for example: "feat: migrate deployment to cache-based model with pinning (remove archive support)"; update the title before merge so it succinctly summarizes the main intent.
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/deploy-from-archive

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bb2067a and 5916c08.

📒 Files selected for processing (1)
  • snakemake_interface_software_deployment_plugins/__init__.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit Configuration File

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakemake_interface_software_deployment_plugins/__init__.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)

294-302: Docstring: explicitly note this is a coroutine that must be awaited

Clarify the async contract in the docstring to reduce misuse.

Apply this diff:

     async def deploy_from_archive(self) -> None:
         """Deploy the environment from self.archive_path to self.deployment_path.
 
         When issuing shell commands, the environment should use
         self.run_cmd(cmd: str) in order to ensure that it runs within eventual
         parent environments (e.g. a container or an env module).
+
+        This is a coroutine; callers must await it.
         """
         ...
🧹 Nitpick comments (2)
snakemake_interface_software_deployment_plugins/__init__.py (2)

257-261: Guard against missing deployment_prefix when hashing non-portable deployments

If _deployment_prefix is None, str(None) gets hashed, which can conceal a configuration issue and produce unstable semantics. Assert it’s set when the path is not portable.

Also, nit in the docstring above: "senstivive" → "sensitive".

Apply this diff:

 def record_deployment_hash(self, hash_object) -> None:
@@
-        assert isinstance(self, EnvBase)
+        assert isinstance(self, EnvBase)
         self.record_hash(hash_object)
         if not self.is_deployment_path_portable():
+            assert (
+                self._deployment_prefix is not None
+            ), "deployment_prefix must be set when deployment path is not portable"
             hash_object.update(str(self._deployment_prefix).encode())

Optional typo fix for the docstring just above:

-        deployment is senstivive to the path (e.g. in case of conda, which patches
+        deployment is sensitive to the path (e.g. in case of conda, which patches

306-307: Prefer explicit exceptions over asserts for public API invariants

Asserts can be stripped with Python -O, turning this into a potential AttributeError later. Raise a descriptive exception instead.

Apply this diff:

     @property
     def archive_path(self) -> Path:
-        assert isinstance(self, EnvBase) and self._archive_prefix is not None
-        return self._archive_prefix / self.hash()
+        if not isinstance(self, EnvBase) or self._archive_prefix is None:
+            raise RuntimeError(
+                "archive_path accessed but no archive_prefix is configured."
+            )
+        return self._archive_prefix / self.hash()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5916c08 and 617189c.

📒 Files selected for processing (1)
  • snakemake_interface_software_deployment_plugins/__init__.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit Configuration File

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakemake_interface_software_deployment_plugins/__init__.py
🪛 GitHub Actions: CI
snakemake_interface_software_deployment_plugins/__init__.py

[error] 1-1: Command 'poetry run ruff format --check .' failed. 1 file would be reformatted: snakemake_interface_software_deployment_plugins/init.py. Run 'ruff format' to fix.

🔇 Additional comments (6)
snakemake_interface_software_deployment_plugins/__init__.py (6)

11-11: Import for archive removal — LGTM

shutil import is appropriate for the new archive removal functionality.


274-276: Assertion before computing deployment_hash — LGTM

Asserting EnvBase membership here is reasonable given _managed_generic_hash("deployment_hash") depends on record_deployment_hash.


279-281: deployment_path invariant — LGTM

Asserting a non-None deployment_prefix before building the path is appropriate.


309-313: Archive removal — LGTM

shutil.rmtree(..., ignore_errors=True) is a pragmatic choice here to avoid noisy failures when nothing is archived.


1-1: CI failing ruff format — run the formatter to unblock the pipeline

CI reports: “ruff format --check . failed. 1 file would be reformatted: snakemake_interface_software_deployment_plugins/init.py”.

Please run ruff format (or poetry run ruff format) locally and commit the result.


267-272: Add explicit _archive_prefix guard and update managed_deploy to async

Please apply the diff below to avoid AssertionError when no archive is configured and to gate archive‐based deploys on the existence of the archive path:

 async def managed_deploy(self) -> None:
-        if isinstance(self, ArchiveableEnvBase) and self.archive_path.exists():
-            await self.deploy_from_archive()
-        else:
-            await self.deploy()
+        # Only attempt archive-based deployment if an archive prefix is configured.
+        if (
+            isinstance(self, ArchiveableEnvBase)
+            and isinstance(self, EnvBase)
+            and self._archive_prefix is not None
+            and (self._archive_prefix / self.hash()).exists()
+        ):
+            await self.deploy_from_archive()
+        else:
+            await self.deploy()

Key action items:

  • Breaking change: managed_deploy is now async. Audit all call sites to ensure they use await managed_deploy() (or schedule via asyncio.create_task). A quick scan showed no direct invocations—please verify manually.
  • Confirm every subclass of ArchiveableEnvBase implements the new abstract deploy_from_archive() method.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limits.

🔭 Outside diff range comments (2)
snakemake_interface_software_deployment_plugins/__init__.py (2)

217-226: Hash caching bug: computed digest isn’t stored back; causes repeated recomputation

_managed_generic_hash() reads self._managed_{kind}_store but never writes it after computing the digest. This defeats the intended memoization of both hash() and deployment_hash(), and can significantly degrade performance (e.g., __hash__ repeatedly recomputes).

Apply:

     def _managed_generic_hash(self, kind: str) -> str:
         store = getattr(self, f"_managed_{kind}_store")
         if store is None:
             record_hash = f"record_{kind}"
             hash_object = hashlib.md5()
             if self.within is not None:
                 getattr(self.within, record_hash)(hash_object)
             getattr(self, record_hash)(hash_object)
-            store = hash_object.hexdigest()
+            store = hash_object.hexdigest()
+            setattr(self, f"_managed_{kind}_store", store)
         return store

137-149: Sync tests with cache_prefix rename

We ran a scan and found lingering archive_prefix references in the tests. Update these to use the new cache API:

• File: snakemake_interface_software_deployment_plugins/tests.py
– Lines ~128–129:
```diff

  •       if issubclass(env_cls, ArchiveableEnvBase):
    
  •           args["archive_prefix"] = tmp_path / "archives"
    
  •       if issubclass(env_cls, ArchiveableEnvBase):
    
  •           args["cache_prefix"] = tmp_path / "cache"
    
    – Update any path literal (`"archives"`) to your chosen cache directory (e.g. `"cache"`).  
    – Rename the test (and any assertions) from “archive” to “cache” if its name or expectations reference archive semantics.
    
    

No other archive_* or deploy_from_archive symbols were found in production code—only in this test file—so addressing these two lines completes the API rename.

🧹 Nitpick comments (4)
snakemake_interface_software_deployment_plugins/__init__.py (4)

262-268: Typo in docstring ("senstivive")

Small polish to the public API docs.

Apply:

-        deployment is senstivive to the path (e.g. in case of conda, which patches
+        deployment is sensitive to the path (e.g. in case of conda, which patches

269-270: Asserts used for runtime invariants can be optimized away; prefer explicit exceptions for user-facing API

assert statements are fine for internal invariants, but with python -O they’re removed. Here, losing these checks could result in confusing errors (e.g., TypeError when _deployment_prefix is None). Consider explicit checks raising WorkflowError or ValueError for clearer diagnostics.

Example for deployment_path:

-        assert isinstance(self, EnvBase) and self._deployment_prefix is not None
-        return self._deployment_prefix / self.deployment_hash()
+        if not isinstance(self, EnvBase):
+            raise WorkflowError("deployment_path accessed on non-EnvBase instance")
+        if self._deployment_prefix is None:
+            raise WorkflowError("deployment_prefix is not set; cannot compute deployment_path")
+        return self._deployment_prefix / self.deployment_hash()

Apply similarly in deployment_hash, managed_deploy, and cache_path.

Also applies to: 300-301, 310-311, 315-316, 320-321


1-321: CI reports low coverage for this module—add focused tests for new error-handling and cache API

Target high-signal behaviors:

  • managed_deploy wraps exceptions from deploy into WorkflowError (both success and failure cases).
  • managed_remove wraps exceptions from remove into WorkflowError.
  • remove_cache removes directory; error path wraps into WorkflowError.
  • deployment_path raises a clear error (or fails assertion) when deployment_prefix is None.
  • record_deployment_hash changes when deployment path changes for non-portable envs.
  • EnvSpecBase now requires str; add a skeletal subclass in tests to verify error messages include spec string.

I can draft pytest tests for these cases. Want me to open a follow-up PR with a test suite to raise coverage over the CI threshold?


246-251: No cache() calls found; caching remains optional
A search across the Python codebase didn’t turn up any await cache() invocations—plugins’ cache() hook isn’t called by any orchestrator, so it’s effectively a no-op and treated as optional.

Optional refactor: if you later want symmetry with managed_deploy/managed_remove and unified error handling, you can add a managed_cache() wrapper, for example:

+    async def managed_cache(self) -> None:
+        assert isinstance(self, EnvBase)
+        try:
+            await self.cache()
+        except Exception as e:
+            raise WorkflowError(f"Caching assets for {self.spec} failed: {e}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 617189c and 62be1da.

📒 Files selected for processing (1)
  • snakemake_interface_software_deployment_plugins/__init__.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit Configuration File

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakemake_interface_software_deployment_plugins/__init__.py
🧬 Code Graph Analysis (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
snakemake_interface_software_deployment_plugins/settings.py (1)
  • SoftwareDeploymentSettingsBase (8-17)
🪛 GitHub Actions: CI
snakemake_interface_software_deployment_plugins/__init__.py

[error] 1-321: Low test coverage: 42% (108 missing) in init.py. Overall coverage threshold not met (fail-under=63%).

🔇 Additional comments (8)
snakemake_interface_software_deployment_plugins/__init__.py (8)

11-11: LGTM: Imports for cache removal and error wrapping are appropriate

shutil and WorkflowError are correctly introduced and used below.

Also applies to: 18-18


279-288: LGTM: managed_remove wraps plugin errors in WorkflowError with context

Good defensive wrapper for removal.


289-299: LGTM: remove_cache safely deletes cache with error wrapping

Checks existence and wraps errors with WorkflowError. Looks good.


309-312: LGTM: deployment_hash now guarded; consistent with new error handling approach

With the earlier note about asserts vs explicit exceptions.


315-316: LGTM: deployment_path depends on non-None deployment_prefix

See earlier note on replacing asserts with explicit exceptions for clearer UX.


319-321: LGTM: cache_path computes from _cache_prefix and environment hash

This aligns with the cache-based API.


300-307: No direct calls to managed_deploy() found—please verify external usage
I ran a repository-wide grep for both managed_deploy( and await …managed_deploy( and didn’t locate any invocation sites. It looks like this async API isn’t yet consumed internally, but you should:

  • Audit all plugins, tests, and upstream integrations that may call managed_deploy
  • Ensure any invocation is prefixed with await to prevent unawaited coroutine issues

120-124: Abstract str on EnvSpecBase is a breaking API change—no local subclasses found but downstream plugins will break

– I ran a recursive search for any class …(EnvSpecBase) in this repo and found none.
– However, external plugins that subclass EnvSpecBase will now be forced to implement __str__ or else fail at instantiation.
– You should coordinate a major/minor version bump, update any plugin-skeletons and docs to require or supply __str__, and communicate this change to plugin authors.
– If you want to soften this break, you can replace the abstract stub with a default implementation:

-    @abstractmethod
-    def __str__(self) -> str:
-        """Return a string representation of the environment spec."""
-        ...
+    def __str__(self) -> str:
+        """Return a string representation of the environment spec."""
+        ident = ", ".join(f"{attr}={getattr(self, attr)!r}" for attr in self.identity_attributes())
+        return f"{self.__class__.__name__}({ident})"

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)

219-229: Hash caching bug: computed value isn’t stored back.

_managed_generic_hash() recomputes every time because the store attribute isn’t updated.

     def _managed_generic_hash(self, kind: str) -> str:
         store = getattr(self, f"_managed_{kind}_store")
         if store is None:
             record_hash = f"record_{kind}"
             hash_object = hashlib.md5()
             if self.within is not None:
                 getattr(self.within, record_hash)(hash_object)
             getattr(self, record_hash)(hash_object)
             store = hash_object.hexdigest()
+            setattr(self, f"_managed_{kind}_store", store)
         return store
♻️ Duplicate comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)

346-353: Async managed_deploy with WorkflowError wrapper — looks good.

This addresses prior feedback to make managed_deploy await the deployment routine.

🧹 Nitpick comments (4)
snakemake_interface_software_deployment_plugins/__init__.py (4)

281-296: Optional: fast-path cache removal by deleting the per-env directory.

Once cache_path is per-env, you can remove the whole dir instead of iterating assets.

     async def remove_cache(self) -> None:
         """Remove the cached environment assets."""
         assert isinstance(self, EnvBase)
+        # Fast path: drop the entire per-env cache directory if present.
+        if self.cache_path.exists():
+            shutil.rmtree(self.cache_path, ignore_errors=True)
+            return
         for asset in await self.get_cache_assets():
             asset_path = self.cache_path / asset
             if asset_path.exists():
                 try:
                     if asset_path.is_dir():
                         shutil.rmtree(asset_path)
                     else:
                         asset_path.unlink()
                 except Exception as e:
                     raise WorkflowError(
                         f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
                     )

300-307: Clarify portability docstring (path sensitivity).

Current text conflates “path matters” with “portable”. Suggest clearer wording.

-    def is_deployment_path_portable(self) -> bool: 
-        """Return whether the deployment path matters for the environment, i.e.
-        whether the environment is portable. If this returns False, the deployment
+    def is_deployment_path_portable(self) -> bool:
+        """Return whether the environment is portable (path-insensitive).
+        If this returns False, the deployment
         path is considered for the deployment hash. For example, conda environments are not
         portable because they hardcode the path in binaries, while containers are
         portable.
         """

319-325: Typo in docstring.

“senstivive” → “sensitive”.

-        deployment is senstivive to the path (e.g. in case of conda, which patches
+        deployment is sensitive to the path (e.g. in case of conda, which patches

336-345: Preserve original exception context.

Chain the original exception for better diagnostics.

     def managed_remove(self) -> None:
         """Remove the deployed environment, handling exceptions."""
         assert isinstance(self, EnvBase)
         try:
             self.remove()
         except Exception as e:
-            raise WorkflowError(
-                f"Removal of {self.spec} failed: {e}"
-            )
+            raise WorkflowError(f"Removal of {self.spec} failed: {e}") from e
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62be1da and 82d7144.

📒 Files selected for processing (1)
  • snakemake_interface_software_deployment_plugins/__init__.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakemake_interface_software_deployment_plugins/__init__.py
🧬 Code graph analysis (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
snakemake_interface_software_deployment_plugins/settings.py (1)
  • SoftwareDeploymentSettingsBase (8-17)
🪛 GitHub Actions: CI
snakemake_interface_software_deployment_plugins/__init__.py

[error] 244-245: Command failed: poetry run coverage run -m pytest tests/tests.py. Pytest collection error: AttributeError: 'isabstractmethod' of 'classmethod' objects is not writable during PinnableEnvBase class definition (snakemake_interface_software_deployment_plugins/init.py:244-245).

🔇 Additional comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)

120-124: Requiring str on EnvSpecBase — good API improvement.

This enforces consistent human-readable specs for error messages and logging.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)

136-151: Constructor signature change is breaking; ensure all call sites updated.

The new cache_prefix/deployment_prefix/pinfile_prefix args will break existing instantiations and downstream plugins unless updated.

🧹 Nitpick comments (5)
snakemake_interface_software_deployment_plugins/__init__.py (5)

258-265: Nit: avoid double call to pinfile_extension().

Reuse the local ext variable to prevent duplicate lookups.

     def pinfile(self) -> Path:
         assert isinstance(self, EnvBase)
         ext = self.pinfile_extension()
         if not ext.startswith("."):
             raise ValueError("pinfile_extension must start with a dot.")
-        return (self._pinfile_prefix / self.hash()).with_suffix(
-            self.pinfile_extension()
-        )
+        return (self._pinfile_prefix / self.hash()).with_suffix(ext)

321-333: Typo in docstring.

“senstivive” → “sensitive”.

-        deployment is senstivive to the path (e.g. in case of conda, which patches
+        deployment is sensitive to the path (e.g. in case of conda, which patches

338-345: Preserve traceback chaining.

Raise from the original exception for better diagnostics.

-        except Exception as e:
-            raise WorkflowError(f"Removal of {self.spec} failed: {e}")
+        except Exception as e:
+            raise WorkflowError(f"Removal of {self.spec} failed: {e}") from e

346-352: Same: preserve traceback chaining on deployment errors.

-        except Exception as e:
-            raise WorkflowError(f"Deployment of {self.spec} failed: {e}")
+        except Exception as e:
+            raise WorkflowError(f"Deployment of {self.spec} failed: {e}") from e

219-228: Cache the computed hash to avoid recomputation.

_store isn’t written back; hash is recomputed every call. Save it into the corresponding managed*_store.

     def _managed_generic_hash(self, kind: str) -> str:
         store = getattr(self, f"_managed_{kind}_store")
         if store is None:
             record_hash = f"record_{kind}"
             hash_object = hashlib.md5()
             if self.within is not None:
                 getattr(self.within, record_hash)(hash_object)
             getattr(self, record_hash)(hash_object)
-            store = hash_object.hexdigest()
+            store = hash_object.hexdigest()
+            setattr(self, f"_managed_{kind}_store", store)
         return store
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 82d7144 and b58c822.

📒 Files selected for processing (1)
  • snakemake_interface_software_deployment_plugins/__init__.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

Files:

  • snakemake_interface_software_deployment_plugins/__init__.py
🧠 Learnings (1)
📚 Learning: 2025-09-11T11:26:53.355Z
Learnt from: johanneskoester
PR: snakemake/snakemake-interface-software-deployment-plugins#18
File: snakemake_interface_software_deployment_plugins/__init__.py:278-282
Timestamp: 2025-09-11T11:26:53.355Z
Learning: In snakemake_interface_software_deployment_plugins, the cache_path property in CacheableEnvBase intentionally returns only self._cache_prefix without appending the environment hash, as the cache is meant to be shared across environments for efficiency.

Applied to files:

  • snakemake_interface_software_deployment_plugins/__init__.py
🧬 Code graph analysis (1)
snakemake_interface_software_deployment_plugins/__init__.py (1)
snakemake_interface_software_deployment_plugins/settings.py (1)
  • SoftwareDeploymentSettingsBase (8-17)
🔇 Additional comments (3)
snakemake_interface_software_deployment_plugins/__init__.py (3)

120-124: Good: explicit string contract for EnvSpec.

Adding an abstract str is helpful for clear error/output messages.


245-247: Decorator order on abstract classmethod is correct.

@classmethod above @AbstractMethod avoids the ABC metaclass issue.


278-282: Intentional shared cache path acknowledged.

Returning only the prefix is by design to share caches across envs; keeping it as-is matches prior decision.

Comment on lines +268 to +276
class CacheableEnvBase(ABC):
async def get_cache_assets(self) -> Iterable[str]: ...

@abstractmethod
def cache_assets(self) -> None:
"""Determine environment assets and store any associated information or data to
self.cache_path.
"""
...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Make get_cache_assets abstract (or provide a safe default).

As written, remove_cache awaits get_cache_assets(); the current stub ... will return None and cause a TypeError. Either make it abstract or return an empty iterable.

 class CacheableEnvBase(ABC):
-    async def get_cache_assets(self) -> Iterable[str]: ...
+    @abstractmethod
+    async def get_cache_assets(self) -> Iterable[str]:
+        """Return relative asset paths (files/dirs) under cache_path."""
+        ...
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
class CacheableEnvBase(ABC):
async def get_cache_assets(self) -> Iterable[str]: ...
@abstractmethod
def cache_assets(self) -> None:
"""Determine environment assets and store any associated information or data to
self.cache_path.
"""
...
class CacheableEnvBase(ABC):
@abstractmethod
async def get_cache_assets(self) -> Iterable[str]:
"""Return relative asset paths (files/dirs) under cache_path."""
...
@abstractmethod
def cache_assets(self) -> None:
"""Determine environment assets and store any associated information or data to
self.cache_path.
"""
...
🤖 Prompt for AI Agents
In snakemake_interface_software_deployment_plugins/__init__.py around lines 268
to 276, get_cache_assets is currently a non-abstract stub that returns None
(ellipsis) which causes remove_cache to fail when awaiting it; change it so
callers always get an iterable by either (a) marking get_cache_assets with
@abstractmethod so subclasses must implement it, or (b) provide a safe default
implementation that returns an empty iterable (e.g., "async def
get_cache_assets(self) -> Iterable[str]: return ()") — choose one approach and
update the signature and docstring accordingly so remove_cache can safely
iterate the result.

Comment on lines +283 to +298
async def remove_cache(self) -> None:
"""Remove the cached environment assets."""
assert isinstance(self, EnvBase)
for asset in await self.get_cache_assets():
asset_path = self.cache_path / asset
if asset_path.exists():
try:
if asset_path.is_dir():
shutil.rmtree(asset_path)
else:
asset_path.unlink()
except Exception as e:
raise WorkflowError(
f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Guard against path traversal when deleting cache assets.

Validate resolved asset paths stay within cache_path to avoid deleting arbitrary files if a plugin returns a malicious/buggy asset like "../../x".

     async def remove_cache(self) -> None:
         """Remove the cached environment assets."""
         assert isinstance(self, EnvBase)
-        for asset in await self.get_cache_assets():
-            asset_path = self.cache_path / asset
+        base = self.cache_path.resolve()
+        for asset in await self.get_cache_assets():
+            asset_path = (base / asset).resolve()
+            if not asset_path.is_relative_to(base):
+                raise WorkflowError(
+                    f"Refusing to remove cache asset outside cache_prefix: {asset_path}"
+                )
             if asset_path.exists():
                 try:
                     if asset_path.is_dir():
                         shutil.rmtree(asset_path)
                     else:
                         asset_path.unlink()
                 except Exception as e:
-                    raise WorkflowError(
-                        f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
-                    )
+                    raise WorkflowError(
+                        f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
+                    ) from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async def remove_cache(self) -> None:
"""Remove the cached environment assets."""
assert isinstance(self, EnvBase)
for asset in await self.get_cache_assets():
asset_path = self.cache_path / asset
if asset_path.exists():
try:
if asset_path.is_dir():
shutil.rmtree(asset_path)
else:
asset_path.unlink()
except Exception as e:
raise WorkflowError(
f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
)
async def remove_cache(self) -> None:
"""Remove the cached environment assets."""
assert isinstance(self, EnvBase)
base = self.cache_path.resolve()
for asset in await self.get_cache_assets():
asset_path = (base / asset).resolve()
if not asset_path.is_relative_to(base):
raise WorkflowError(
f"Refusing to remove cache asset outside cache_prefix: {asset_path}"
)
if asset_path.exists():
try:
if asset_path.is_dir():
shutil.rmtree(asset_path)
else:
asset_path.unlink()
except Exception as e:
raise WorkflowError(
f"Removal of cache asset {asset_path} for {self.spec} failed: {e}"
) from e
🤖 Prompt for AI Agents
In snakemake_interface_software_deployment_plugins/__init__.py around lines
283-298, the removal loop currently deletes assets by joining raw asset names
with cache_path which allows path-traversal like "../../foo"; before deleting,
resolve the asset path and the cache root (use Path.resolve(strict=False)),
verify the resolved asset path is inside the resolved cache root (use
Path.relative_to or os.path.commonpath) and if not, either skip the asset or
raise a WorkflowError; only perform is_dir()/rmtree or unlink on the validated
resolved path to prevent deleting files outside the cache directory.

@johanneskoester johanneskoester merged commit f6d62bd into main Sep 11, 2025
4 checks passed
@johanneskoester johanneskoester deleted the feat/deploy-from-archive branch September 11, 2025 12:04
johanneskoester pushed a commit that referenced this pull request Sep 11, 2025
🤖 I have created a release *beep* *boop*
---


##
[0.7.0](v0.6.1...v0.7.0)
(2025-09-11)


### Features

* deploy from archive
([#18](#18))
([f6d62bd](f6d62bd))
* pass tempdir to EnvBase
([#16](#16))
([28d8351](28d8351))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant