Skip to content

Edited contributing guidelines, multiconstructor for yaml loader#102

Merged
jucordero merged 7 commits into
FixOurFood:mainfrom
jucordero:conffile
May 20, 2026
Merged

Edited contributing guidelines, multiconstructor for yaml loader#102
jucordero merged 7 commits into
FixOurFood:mainfrom
jucordero:conffile

Conversation

@jucordero
Copy link
Copy Markdown
Collaborator

Description

This pull request adds new functionality to the configuration file loader by allowing arbitrary functions to be called to define objects passed as parameters to nodes.

It also fixes a typo in the contributing guidelines where the afp remote is defined but not used correctly to fetch changes from the main branch.

Checklist

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates contributor documentation and extends the pipeline YAML config loader to support YAML-tag-based construction of parameter objects by importing and invoking Python callables during config parsing.

Changes:

  • Updated config-file docs wording to explicitly reference YAML configuration files.
  • Fixed the git fetch / remote name usage in CONTRIBUTING.md to match the documented afp remote.
  • Added a PyYAML multi-constructor in Pipeline.read() to resolve !<suffix> tags into imported callables and optionally call them with sequence/mapping arguments.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
docs/config_file.rst Clarifies that CLI config files are YAML.
CONTRIBUTING.md Corrects feature-branch creation commands to use the afp remote consistently.
agrifoodpy/pipeline/pipeline.py Adds a YAML multi-constructor to dynamically load and invoke functions when parsing pipeline configs.
Comments suppressed due to low confidence (2)

agrifoodpy/pipeline/pipeline.py:66

  • Using yaml.add_multi_constructor("!", ...) will intercept all YAML tags starting with ! (including standard/third-party tags like !!str), and _load_function will fail for tags that aren't dotted paths. It would be safer to use a dedicated tag prefix (e.g., !call:) or validate suffix before attempting to import/call.
        # Register the multi-constructor for all tags starting with '!'
        yaml.add_multi_constructor("!", dynamic_call_constructor,
                                   Loader=yaml.FullLoader)

agrifoodpy/pipeline/pipeline.py:66

  • This change enables importing and invoking arbitrary callables from YAML config at load time. If config files can come from untrusted sources, this is effectively code execution. Please either (a) restrict allowed modules/functions (whitelist), (b) gate this behind an explicit opt-in flag, and/or (c) clearly document that configs must be trusted.
        def dynamic_call_constructor(loader, suffix, node):
            """Multi-constructor for arbitrary functions"""
            func = cls._load_function(suffix)
            
            if isinstance(node, yaml.ScalarNode):
                return func
            elif isinstance(node, yaml.SequenceNode):
                args = loader.construct_sequence(node)
                return func(*args)
            elif isinstance(node, yaml.MappingNode):
                kwargs = loader.construct_mapping(node)
                return func(**kwargs)
            
        # Register the multi-constructor for all tags starting with '!'
        yaml.add_multi_constructor("!", dynamic_call_constructor,
                                   Loader=yaml.FullLoader)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread agrifoodpy/pipeline/pipeline.py Outdated
Comment thread agrifoodpy/pipeline/pipeline.py Outdated
Comment thread agrifoodpy/pipeline/pipeline.py Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.

func = cls._load_function(func_path)

if isinstance(node, yaml.ScalarNode):
return func
Comment on lines +79 to +89
func = cls._load_function(func_path)

if isinstance(node, yaml.ScalarNode):
return func
if isinstance(node, yaml.SequenceNode):
args = loader.construct_sequence(node, deep=True)
return func(*args)
if isinstance(node, yaml.MappingNode):
kwargs = loader.construct_mapping(node, deep=True)
return func(**kwargs)

Comment on lines +99 to +108
yaml.add_multi_constructor(
"!numpy.",
dynamic_call_constructor("numpy"),
Loader=yaml.FullLoader,
)
yaml.add_multi_constructor(
"!xarray.",
dynamic_call_constructor("xarray"),
Loader=yaml.FullLoader,
)
Comment on lines +35 to +47
@staticmethod
def _is_supported_yaml_function(path):
"""Return True for dotted numpy/xarray function paths."""
if not isinstance(path, str) or "." not in path:
return False

module_path, _ = path.rsplit(".", 1)
return (
module_path == "numpy"
or module_path.startswith("numpy.")
or module_path == "xarray"
or module_path.startswith("xarray.")
)
Comment on lines +368 to +417
def test_read_yaml_numpy_array_():

script_dir = os.path.dirname(__file__)
config_path = os.path.join(script_dir, "data/test_config_numpy_array.yaml")

pipeline = Pipeline.read(str(config_path))
pipeline.run()

assert np.array_equal(pipeline.params[0]['value'], np.array([1, 2, 3]))
assert np.array_equal(pipeline.datablock["test_numpy_array"], np.array([1, 2, 3]))

def test_read_yaml_numpy_array_kwargs():
script_dir = os.path.dirname(__file__)
config_path = os.path.join(script_dir, "data/test_config_numpy_array_kwargs.yaml")

pipeline = Pipeline.read(str(config_path))
pipeline.run()

assert np.array_equal(pipeline.params[0]['value'], np.array([1, 2, 3]))
assert np.array_equal(pipeline.datablock["test_numpy_array"], np.array([1, 2, 3]))

def test_read_yaml_xarray_dataarray():
script_dir = os.path.dirname(__file__)
config_path = os.path.join(script_dir, "data/test_config_xarray_dataarray.yaml")

pipeline = Pipeline.read(str(config_path))
pipeline.run()

expected_array = xr.DataArray([1, 2, 3], coords={"Year": [2020, 2021, 2022]}, dims=["Year"])
xr.testing.assert_equal(pipeline.params[0]['value'], expected_array)
xr.testing.assert_equal(pipeline.datablock["test_value"], expected_array)

def test_read_yaml_xarray_dataarray_kwargs():
script_dir = os.path.dirname(__file__)
config_path = os.path.join(script_dir, "data/test_config_xarray_dataarray_kwargs.yaml")

pipeline = Pipeline.read(str(config_path))
pipeline.run()

expected_array = xr.DataArray([1, 2, 3], coords={"Year": [2020, 2021, 2022]}, dims=["Year"])
xr.testing.assert_equal(pipeline.params[0]['value'], expected_array)
xr.testing.assert_equal(pipeline.datablock["test_value"], expected_array)

def test_read_yaml_unsupported_function():
from yaml.constructor import ConstructorError
script_dir = os.path.dirname(__file__)
config_path = os.path.join(script_dir, "data/test_config_unsupported_function.yaml")

with pytest.raises(ConstructorError):
pipeline = Pipeline.read(str(config_path)) No newline at end of file
Comment on lines +411 to +417
def test_read_yaml_unsupported_function():
from yaml.constructor import ConstructorError
script_dir = os.path.dirname(__file__)
config_path = os.path.join(script_dir, "data/test_config_unsupported_function.yaml")

with pytest.raises(ConstructorError):
pipeline = Pipeline.read(str(config_path)) No newline at end of file
@jucordero jucordero merged commit aaa89bc into FixOurFood:main May 20, 2026
4 checks passed
@jucordero jucordero deleted the conffile branch May 20, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants