Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop substitution {python} at containers-run runtime #250

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mih
Copy link
Member

@mih mih commented Oct 12, 2023

This causes worse portability issues than the previous approach of storing python(.exe).

Instead pass the placeholder on to datalad run.

This would now error with (as evident by the CI failure)

command has an unrecognized placeholder: 'python'

and requires a further intervention. Options:

  • have datalad-core define datalad.run.substitutions.python=sys.executable
  • wait for Replace custom configuration definition with standard approach #244 and use register_config() to have datalad-container define it
  • have datalad-container define it at containers-run runtime by patching the configuration for the execution time (would have rerun fail still)
  • add the configuration item to the committed dataset config

Closes #249

This causes worse portability issues than the previous approach
of storing `python(.exe)`.

Instead pass the placeholder on to `datalad run`.

This would now error with

```
command has an unrecognized placeholder: 'python'
```

and requires a further intervention. Three options:

- have datalad-core define
  `datalad.run.substitutions.python=sys.executable`
- wait for datalad#244
  and use `register_config()` to have datalad-container define it
- have datalad-container define it at `containers-run` runtime by
  patching the configuration for the execution time (would have `rerun`
  fail still)
- add the configuration item to the committed dataset config

Closes datalad#249
Mostly to avoid the misconceptions that led to datalad#249
@codeclimate
Copy link

codeclimate bot commented Oct 12, 2023

Code Climate has analyzed commit 4ab0a6b and detected 0 issues on this pull request.

View more on Code Climate.

@yarikoptic
Copy link
Member

#244 was merged, should we proceed that way?

@mih
Copy link
Member Author

mih commented Oct 13, 2023

Sadly, this does not seem to be working. It looks like run is not considering configuration defaults, but only explicit configuration:

>>> import datalad
>>> import datalad_container
>>> datalad.cfg.get('datalad.run.substitutions.python') is None
True
>>> datalad.cfg.obtain('datalad.run.substitutions.python')
'/home/mih/env/datalad-dev/bin/python'

It does not matter whether the default is set via register_config() in the extension, or directly in datalad-core.

If run() would use configuration() this would work out-of-the-box:

>>> [c for c in configuration('dump', result_renderer='disabled') if c['name'].startswith('datalad.run.substitutions.')]
[{'action': 'dump_configuration',
  'refds': '/tmp/conti',
  'name': 'datalad.run.substitutions.python',
  'value': '/home/mih/env/datalad-dev/bin/python',
  'purpose': 'Substitution for {python} placeholder',
  'description': 'path to a Python interpreter executable',
  'value_type': constraint:str,
  'path': '/tmp/conti',
  'status': 'ok'}]

@mih
Copy link
Member Author

mih commented Oct 13, 2023

Here is a patch for run() that would make this work.

diff --git a/datalad/core/local/run.py b/datalad/core/local/run.py
index a2ca86772..b588c4a34 100644
--- a/datalad/core/local/run.py
+++ b/datalad/core/local/run.py
@@ -38,6 +38,7 @@ from datalad.interface.base import (
     build_doc,
     eval_results,
 )
+from datalad.interface.common_cfg import definitions as cfg_defs
 from datalad.interface.common_opts import (
     jobs_opt,
     save_message_opt,
@@ -623,7 +624,12 @@ def format_command(dset, command, **kwds):
     command = normalize_command(command)
     sfmt = SequenceFormatter()
 
-    for k, v in dset.config.items("datalad.run.substitutions"):
+    for k in set(cfg_defs.keys()).union(dset.config.keys()):
+        v = dset.config.get(
+            k,
+            # pull a default from the config definitions
+            # if we have no value, but a key
+            cfg_defs.get(k, {}).get('default', None))
         sub_key = k.replace("datalad.run.substitutions.", "")
         if sub_key not in kwds:
             kwds[sub_key] = v

I think this would be a worthwhile addition.

mih added a commit to mih/datalad that referenced this pull request Oct 13, 2023
…ults

Previously, `run()` would not recognize configuration defaults for
placeholder substitution. This means that any placeholders globally
declared in `datalad.interface.common_cfg`, or via `register_config()`
in DataLad extensions would not be effective.

This changeset makes run's `format_command()` helper include such
defaults explicitly, and thereby enable the global declaration of
substitution defaults.

Moreoever a `{python}` placeholder is now defined via this mechanism,
and points to the value of `sys.executable` by default. This particular
placeholder was found to be valueable for improving the portability of
run-recording across (specific) Python versions, or across different
(virtual) environments. See
datalad/datalad-container#224 for an example
use case.

A corresponding test is included.

The ability to keep run-records paramterizable, in particular, for
interpreters can also aid longevity (think platform-specific choices
to call a system Python executable `python3` for some years, and then
switch back.

Related datalad/datalad-container#250
@mih
Copy link
Member Author

mih commented Oct 13, 2023

I posted a corresponding PR at datalad/datalad#7509

It would make the tests pass in this (unmodified) PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

{python} placeholder implementation non-portable
2 participants