Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Hera I/O models #858

Closed
samj1912 opened this issue Nov 10, 2023 · 1 comment
Closed

Improve Hera I/O models #858

samj1912 opened this issue Nov 10, 2023 · 1 comment
Assignees
Labels
type:enhancement A general enhancement

Comments

@samj1912
Copy link
Collaborator

Is your feature request related to a problem? Please describe.

Currently hera i/o with annotated params can become extremely verbose. The output syntax is especially error-prone.

Describe the solution you'd like

Introduce a HeraInput and HeraOutput class which derives from pydantic basemodel. When this is present, hera script runner will expand out the top level class attributes to params.

from hera.workflows.io import Input, Output


class AnotherModel(BaseModel):
	...

class MyInput(InputModel):
	my_param: int = 1
	another_param: Annotated[int, Parameter(name="another-param") = 2
	another_artifact: Annotated[AnotherModel, Artifact(...)]


class MyOutput(OutputModel):
	my_output: str = ""
	another_output: Annotated[Path, Parameter("test"]]

def my_function(my_input: MyInput, another_param_inline: int) -> MyOutput:
	...
	outputs = MyOutput(exit_code=10)
	outputs.my_output = "foo"
	outputs.another_output.write("foo")
	outputs.result = "foo"
	return outputs

Describe alternatives you've considered

Additional context

@samj1912 samj1912 added the type:enhancement A general enhancement label Nov 10, 2023
@elliotgunton elliotgunton self-assigned this Jan 10, 2024
samj1912 pushed a commit that referenced this issue Jan 30, 2024
**Pull Request Checklist**
- [x] Part of #858 
- [x] Tests added
- [ ] ~Documentation/examples added~ See #939 
- [x] [Good commit messages](https://cbea.ms/git-commit/) and/or PR
title

**Description of PR**
Currently hera i/o with annotated params can become extremely verbose.
The output syntax is especially error-prone.

This PR introduces custom Input/Output BaseModels for users to subclass,
which allow a cleaner arrangement of inputs and outputs for functions.
These are available under the `script_pydantic_io` experimental feature
flag.

With these Pydantic input/output models, the following should be noted:
* duplicated param names (for normal Parameters as well as the new
models) are now detected in Hera rather than when linted by Argo (as
well as duplicated artifact names). Parameters and Artifacts having the
same name is legal in the Argo spec as they exist in different scopes
e.g.
```yaml
...
      inputs:
        parameters:
          - name: my-name
            default: test
        artifacts:
          - name: my-name
            path: /tmp
            optional: true
...
```
* `exit_code` and `result` are reserved attributes for the
`RunnerOutput`. A user trying to use their own parameters with these
names would have to be specified with an annotated parameter e.g.
`my_exit_code: Annotated[int, Parameter(name="exit_code")]` (TBC with a
test)
* Scripts cannot have a return tuple containing any `RunnerOutput` to
avoid multiple `exit_code`s being specified. @samj1912 / @flaviuvadan
this is up for debate but I think would encourage better practices to
discourage tuples and have a single script template outputting a single
`RunnerOutput` subclass, and it keeps the logic clearer from the Hera
side. Users can still use inline output parameters alongside the
`RunnerOutput` return annotation
* Multiple input parameters when using a `RunnerInput` in the function
params **is not legal**
* A `RunnerInput`'s `__fields__` as defined by pydantic are used to
"explode" the input class into constituent parameters for the Argo spec.
i.e. using the following class as an input param to a script function:
```py
class MyInput(RunnerInput):
     my_input_str: str
     my_input_int: int

@script(constructor="runner")
def my_func(my_input: MyInput):
    ...
```
will create the script template `my_func` in yaml with Parameters
`my_input_str` and `my_input_int`, NOT `my_input`, [see the
example](https://github.com/argoproj-labs/hera/blob/92f11d341eb29d2501b9ee5be57a703160b35e24/docs/examples/workflows/experimental/script_pydantic_io.md)

---------

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
elliotgunton added a commit that referenced this issue Feb 12, 2024
Part of #858, follow up to #920  
* Enables users to use Pydantic V2 objects in their scripts while
maintaining V1 usage internally for Hera
* RunnerInput/Output classes are created in hera.workflows.io depending
on the value of _PYDANTIC_VERSION - users will automatically get classes
using their (possibly pinned) version of Pydantic
* I have not yet managed to get an explicit test that uses the automatic
import of V1 classes - we may just have to rely on the Pydantic V1 CI
check.

---------

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
@elliotgunton
Copy link
Collaborator

Released in 5.14.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

2 participants