Conversation
Helpful bit of meta-information.
Invocations of these two workhorses of the codebase are often useful to see during debugging and development. Debug logging is still very primitive in this codebase. Related-to: <#201>
I did this research as part of resolving a bug¹, so might as well document it where we'll need it. ¹ <#274>
|
If CI passes for this PR I will probably merge it then, so that master's CI passes again. |
|
Ok, two kinds of CI failures. Type failures, which I should have checked locally. Oops. CI-only failures, due to the CI system:
|
…k-setup Helpful to explicitly document this and test for it. (Though note it's about to bump up in subsequent commits.) We'll also be able to optionally use newer Singularity features by guarding them with version checks.
Our recent upgrade of Snakemake in our container image¹ resulted in an
fatal error when used with the Singularity runtime, e.g.:
OSError: [Errno 30] Read-only file system: '/home/runner'
We first observed this in periodic CI for Nextstrain CLI, but were able
to reproduce it outside CI. The error comes from Snakemake inside the
container (newly) trying to `mkdir -p` paths under HOME and represents
two separate-but-intertwined bugs: the first being the read-only
container filesystem preventing creation under HOME, and the second that
the value of HOME from outside the container (e.g. /home/runner in CI)
is leaking into the container.
The read-only filesystem previously wasn't an issue (and wasn't noticed)
because writes were limited to paths bind-mounted read-write from
outside the container (e.g. /nextstrain/build). Writing to the
container filesystem has always been allowed in the Docker runtime, so
we come to parity with it using --writable-tmpfs, which requires
Singularity 3.0.0.
The HOME leak arose thru a intersection of behaviours. Our container
image sets HOME=/nextstrain², but this is a *default* value that can be
overridden at container launch time. With our Docker runtime, it's
never overridden. Singularity, however, automatically forwards the
user's HOME value into the container, even though we use --no-home to
disable mounting the user's home at the same path. The forwarded value
thus override our image's default. Providing an explicitly empty HOME
value via the --home option allows the image default to apply, as
expected.
Resolves: <#274>
¹ <nextstrain/docker-base#136>
² <https://github.com/nextstrain/docker-base/blob/1b0d1998/Dockerfile#L390>
…y ≥3.5.3 Hopefully this "arms race" is quiet for a while now…
1dc3575 to
5054c34
Compare
|
Pyright failures are not specific to this branch, but fixed locally by 5054c34. The mypy failures: bisect to 019c8a8 and are currently inscrutable. |
|
Ugh, mypy. This diff resolves its problems: diff --git a/nextstrain/cli/runner/singularity.py b/nextstrain/cli/runner/singularity.py
index 28c3664..81634f3 100644
--- a/nextstrain/cli/runner/singularity.py
+++ b/nextstrain/cli/runner/singularity.py
@@ -66,7 +66,7 @@ SINGULARITY_CONFIG_ENV = {
"SINGULARITYENV_PROMPT_COMMAND": "unset PROMPT_COMMAND; #",
}
-SINGULARITY_EXEC_ARGS = lambda: [
+def SINGULARITY_EXEC_ARGS(): return [
# Increase isolation.
#
# In the future, we may find we want to use additional related flags toThat's clearly an issue with mypy itself, but one we'll work around I guess. |
|
Mypy issues fixed (locally at least) by 91c5f1c. |
e638cdd to
5bdef31
Compare
|
5bdef31 got CI past Singularity's need for DBus when using cgroups v2 (for new
I'm going to drop cfbdda7 for now from this PR to unblock it and then circle back at some point in a separate PR. |
…ine… …when possible, which is only on Singularity 3.10.0 or newer. Many uses will pass thru this too-early evaluation unscathed, so we don't require it as 3.10.0 is too new a minimum version. Without it, however, some uses, like our reporting of component versions with a `bash -c` invocation, will get maimed resulting in bugs that seem very strange at first.
Without these type stubs, new failures related to these libraries are
raised by Pyright starting with 1.1.309:
…/nextstrain/cli/remote/s3.py
…/nextstrain/cli/remote/s3.py:213:39 - error: Cannot access member "Bucket" for type "_"
Member "Bucket" is unknown (reportGeneralTypeIssues)
…/nextstrain/cli/runner/aws_batch/s3.py
…/nextstrain/cli/runner/aws_batch/s3.py:208:26 - error: "client" is not a known member of "None" (reportOptionalMemberAccess)
…/nextstrain/cli/runner/aws_batch/s3.py:212:24 - error: Cannot access member "Bucket" for type "_"
Member "Bucket" is unknown (reportGeneralTypeIssues)
3 errors, 0 warnings, 0 informations
With these type stubs, those failures are resolved, but different
failures are raised:
…/nextstrain/cli/remote/s3.py
…/nextstrain/cli/remote/s3.py:251:23 - error: Could not access item in TypedDict
"Error" is not a required key in "_ClientErrorResponseTypeDef", so access may result in runtime exception (reportTypedDictNotRequiredAccess)
…/nextstrain/cli/remote/s3.py:251:23 - error: Could not access item in TypedDict
"Code" is not a required key in "_ClientErrorResponseError", so access may result in runtime exception (reportTypedDictNotRequiredAccess)
2 errors, 0 warnings, 0 informations
All failures are fixed in this commit.
Resolves: <#284>
With the lambda syntax, mypy raises these errors for currently-to-me
inscrutable reasons:
nextstrain/cli/runner/singularity.py:24: error: Name "docker" already defined (by an import) [no-redef]
nextstrain/cli/runner/singularity.py:136: error: Cannot determine type of "singularity_version_at_least" [has-type]
Found 2 errors in 1 file (checked 54 source files)
The errors make no sense: the "docker" "redefinition" is the sole
definition, the import, and it is not redefined anywhere in our code
(but maybe it is in mypy's mangling of the code?), and the type of
"singularity_version_at_least" seems clear to me (but maybe mypy
can't/isn't forward looking here?).
Why this workaround works is unknown to me, but I guessed it would
perturb the code path in mypy enough and, lo, it does and it's
unoffensive enough to use.
5bdef31 to
31d708f
Compare
|
Moved those commits to 31d708f...trs/singularity/resource-limits. |
|
Checks pass now, so going to merge to fix CI on master. I'll leave the support for resource limits to another time. |
a3b727b to
3215a5f
Compare
See commit messages for details.
Resolves #274.
Testing