Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

containers-add produces platform-specific configuration #224

Closed
mih opened this issue Oct 9, 2023 · 4 comments · Fixed by #243
Closed

containers-add produces platform-specific configuration #224

mih opened this issue Oct 9, 2023 · 4 comments · Fixed by #243

Comments

@mih
Copy link
Member

mih commented Oct 9, 2023

[datalad "containers.debian"]
	image = .datalad\\environments\\debian\\image
	cmdexec = python.exe -m datalad_container.adapters.docker run {img} {cmd}

The python executable is not portable and the path separators also not.

mih added a commit to mih/datalad-container that referenced this issue Oct 9, 2023
This placeholder is expanded on *execution* of a container, rather than
on configuration/addition. This helps use the same Python installation
that is also executing the datalad-container code.

Previously, the docker-support code would expand and then hardcode
`sys.executable` on configuring a container. This led to non-portable
configuration (e.g., hardcoded `python.exe` on windows), and would fail
to pick up the correct python installation in any case where the
`python` entrypoint would not point to the correct one.

Closes datalad#226
Closes datalad#224
mih added a commit to mih/datalad-container that referenced this issue Oct 9, 2023
This placeholder is expanded on *execution* of a container, rather than
on configuration/addition. This helps use the same Python installation
that is also executing the datalad-container code.

Previously, the docker-support code would expand and then hardcode
`sys.executable` on configuring a container. This led to non-portable
configuration (e.g., hardcoded `python.exe` on windows), and would fail
to pick up the correct python installation in any case where the
`python` entrypoint would not point to the correct one.

Closes datalad#226
Closes datalad#224
@mih
Copy link
Member Author

mih commented Oct 9, 2023

Fixing the path separator in a yes-we-remain-compatible-with-1893 will be complex. Theoretically, directory names with backslashes are legal. So we cannot simply force convert them.

So to make things smooth, we would need to employ heuristics/guesswork. This means adding code to do that and maintain it forever. The target audience is windows users that use containers-run on windows and never give the resulting datasets to other platforms (for those things have never worked).

For those, and in some cases, the path could be checked and auto-converted against the annex placeholder. However, that is only possible, when the image is hosted in the same dataset -- further complicating the logic and still not being comprehensive.

That being said, I am comfortable with an entry in the changelog that says "Windows: manual fixing needed".

@mih
Copy link
Member Author

mih commented Oct 9, 2023

This issue is further complicated by the fact that all this code is pre-pathlib.

@yarikoptic
Copy link
Member

since not said explicitly I think - I think we should store only POSIX paths there, so fix code and add mapping to OS specific path (pathlib or not). Adding heuristic to fix paths either starting with r".datalad\\" or containing r"\\.datalad\\" in the path would also cover most of the use cases.

@mih mih closed this as completed in 70093cd Oct 9, 2023
@mih mih reopened this Oct 9, 2023
@mih
Copy link
Member Author

mih commented Oct 9, 2023

I arrived at the same conclusion and have a partial patch for that. Rollout is difficult, will file separate issue

mih added a commit to mih/datalad-container that referenced this issue Oct 10, 2023
This establishes a fixed format for the configuration. Previously,
the path was stored in platform conventions without any indication
of the nature of that platform. This turned platform detection into
guesswork.

Some of that guesswork is now implemented in a path normalization
helper (`_normalize_image_path()`). it is executed on-read, in order
to keep older dataset configuration functional for the cases

- we can tell the platform conventions, because the image is placed
  under `.datalad\\environments\\` (the default)
- we can locate a matching element on the file system

Notably, this guesswork does not cover the case of an image that
was configured on windows, and is hosted in a currently uninstalled
subdataset.

Closes datalad#224
mih added a commit to mih/datalad-container that referenced this issue Oct 10, 2023
This establishes a fixed format for the configuration. Previously,
the path was stored in platform conventions without any indication
of the nature of that platform. This turned platform detection into
guesswork.

Some of that guesswork is now implemented in a path normalization
helper (`_normalize_image_path()`). it is executed on-read, in order
to keep older dataset configuration functional for the cases

- we can tell the platform conventions, because the image is placed
  under `.datalad\\environments\\` (the default)
- we can locate a matching element on the file system

Notably, this guesswork does not cover the case of an image that
was configured on windows, and is hosted in a currently uninstalled
subdataset.

Closes datalad#224
mih added a commit to mih/datalad-container that referenced this issue Oct 10, 2023
This establishes a fixed format for the configuration. Previously,
the path was stored in platform conventions without any indication
of the nature of that platform. This turned platform detection into
guesswork.

Some of that guesswork is now implemented in a path normalization
helper (`_normalize_image_path()`). it is executed on-read, in order
to keep older dataset configuration functional for the cases

- we can tell the platform conventions, because the image is placed
  under `.datalad\\environments\\` (the default)
- we can locate a matching element on the file system

Notably, this guesswork does not cover the case of an image that
was configured on windows, and is hosted in a currently uninstalled
subdataset.

Closes datalad#224
@mih mih closed this as completed in #243 Oct 12, 2023
mih added a commit to mih/datalad that referenced this issue Oct 13, 2023
…ults

Previously, `run()` would not recognize configuration defaults for
placeholder substitution. This means that any placeholders globally
declared in `datalad.interface.common_cfg`, or via `register_config()`
in DataLad extensions would not be effective.

This changeset makes run's `format_command()` helper include such
defaults explicitly, and thereby enable the global declaration of
substitution defaults.

Moreoever a `{python}` placeholder is now defined via this mechanism,
and points to the value of `sys.executable` by default. This particular
placeholder was found to be valueable for improving the portability of
run-recording across (specific) Python versions, or across different
(virtual) environments. See
datalad/datalad-container#224 for an example
use case.

A corresponding test is included.

The ability to keep run-records paramterizable, in particular, for
interpreters can also aid longevity (think platform-specific choices
to call a system Python executable `python3` for some years, and then
switch back.

Related datalad/datalad-container#250
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants