Skip to content

"Optional" Artifact should not throw an error when a loader is specified #1253

@sam2x

Description

@sam2x

Pre-bug-report checklist

1. This bug can be reproduced using pure Argo YAML

  • No

2. I have searched for existing issues

  • Yes

3. This bug occurs in Hera when...

  • running on Argo with the Hera runner

Bug report

Hello,

Summary

In the current Runner code, when the parameter "Optional" is set to "True" for an Artifact, it will always fail if an Artifact loader is specified.
Current implemention is as follow:

def get_annotated_artifact_value(artifact_annotation: Artifact) -> Union[Path, Any]:
    """Get the value of the given Artifact annotation.

    If the artifact is an output, return the path it will write to.
    If the artifact is an input, return the loaded value (json, path or string) using its ArtifactLoader.

    As Artifacts are always Annotated in function parameters, we don't need to consider
    the `kwargs` or the function parameter name.
    """
    if artifact_annotation.output:
        if artifact_annotation.path:
            path = Path(artifact_annotation.path)
        else:
            path = _get_outputs_path(artifact_annotation)
        # Automatically create the parent directory (if required)
        path.parent.mkdir(parents=True, exist_ok=True)
        return path

    if not artifact_annotation.path:
        # Path is added to the spec automatically when built. As it isn't present in the annotation itself,
        # we need to add it back in for the runner.
        artifact_annotation.path = artifact_annotation._get_default_inputs_path()

    if artifact_annotation.loader == ArtifactLoader.json.value:
        path = Path(artifact_annotation.path)
        with path.open() as f:
            return json.load(f)

    if artifact_annotation.loader == ArtifactLoader.file.value:
        path = Path(artifact_annotation.path)
        return path.read_text()

    if artifact_annotation.loader is None:
        return artifact_annotation.path

    raise RuntimeError(f"Artifact {artifact_annotation.name} was not given a value")

It will raise "FileNotFoundError: [Errno 2] No such file or directory: '/tmp/hera-inputs/artifacts/data'".

To Reproduce

Any Input inherited class with the following declaration:

    data: Annotated[Any, Artifact(name="data", optional=True, loader=ArtifactLoader.json)] = ""

Expected behavior

We can't catch the exception, which in my opinion doesn't make sense when "Optional" is to True ?
I believe this was not the intended logic when you implemented "Optional". Since the loader is "called" before the function and out of our control, there should be a condition whenever Optional is True to just return 'None' ?

You would maybe ask me "why don't you just remove 'loader'" for this specific Input class ?": Actually I can, but the reason is:

  1. It's really important for me to keep my code standardized/consistent as much as possible and 'avoid' such conditional case. This Input class is currently used over all my tasks as "base" Input class.
  2. I keep an 'interoperable' approach of my tasks (they could be used as entry-point task or 'passing' task, which mean that either direct 'Parameter' value are provided to the task, or Artifact-passing are provided).

Proposed solution

When parameter optional is specified, the get_annotated_artifact_value should return None is the file opening fail (non-existing file, error IO, etc.).
Code:

    if artifact_annotation.loader == ArtifactLoader.json.value:
        path = Path(artifact_annotation.path)
        try:
            with path.open() as f:
                return json.load(f)
        except FileNotFoundError:
            return None
    if artifact_annotation.loader == ArtifactLoader.file.value:
        path = Path(artifact_annotation.path)
        try:
            with path.open() as f:
                return path.read_text()
        except FileNotFoundError:
            return None
    if artifact_annotation.loader is None:
        return artifact_annotation.path

As a temporary solution, i will implement a custom 'loader' that do it, but I believe this temporary solution should be the mainstream approach for Hera.

I reported this as a "bug", but could not find other category suitable.

Let me know your vision on this.
Br,
Sam

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions