Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Pydantic V1 IO models for use with Hera Runner #920

Merged
merged 15 commits into from
Jan 30, 2024
Merged

Conversation

elliotgunton
Copy link
Collaborator

@elliotgunton elliotgunton commented Jan 10, 2024

Pull Request Checklist

Description of PR
Currently hera i/o with annotated params can become extremely verbose. The output syntax is especially error-prone.

This PR introduces custom Input/Output BaseModels for users to subclass, which allow a cleaner arrangement of inputs and outputs for functions. These are available under the script_pydantic_io experimental feature flag.

With these Pydantic input/output models, the following should be noted:

  • duplicated param names (for normal Parameters as well as the new models) are now detected in Hera rather than when linted by Argo (as well as duplicated artifact names). Parameters and Artifacts having the same name is legal in the Argo spec as they exist in different scopes e.g.
...
      inputs:
        parameters:
          - name: my-name
            default: test
        artifacts:
          - name: my-name
            path: /tmp
            optional: true
...
  • exit_code and result are reserved attributes for the RunnerOutput. A user trying to use their own parameters with these names would have to be specified with an annotated parameter e.g. my_exit_code: Annotated[int, Parameter(name="exit_code")] (TBC with a test)
  • Scripts cannot have a return tuple containing any RunnerOutput to avoid multiple exit_codes being specified. @samj1912 / @flaviuvadan this is up for debate but I think would encourage better practices to discourage tuples and have a single script template outputting a single RunnerOutput subclass, and it keeps the logic clearer from the Hera side. Users can still use inline output parameters alongside the RunnerOutput return annotation
  • Multiple input parameters when using a RunnerInput in the function params is not legal
  • A RunnerInput's __fields__ as defined by pydantic are used to "explode" the input class into constituent parameters for the Argo spec. i.e. using the following class as an input param to a script function:
class MyInput(RunnerInput):
     my_input_str: str
     my_input_int: int

@script(constructor="runner")
def my_func(my_input: MyInput):
    ...

will create the script template my_func in yaml with Parameters my_input_str and my_input_int, NOT my_input, see the example

@elliotgunton elliotgunton force-pushed the pydantic-io branch 3 times, most recently from b6af175 to 43a25f8 Compare January 12, 2024 14:14
@elliotgunton elliotgunton added type:enhancement A general enhancement semver:minor A change requiring a minor version bump labels Jan 12, 2024
Copy link

codecov bot commented Jan 12, 2024

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (b3fc378) 80.7% compared to head (a0dd261) 81.2%.
Report is 1 commits behind head on main.

❗ Current head a0dd261 differs from pull request most recent head 221ecae. Consider uploading reports for the commit 221ecae to get more accurate results

Files Patch % Lines
src/hera/workflows/runner.py 93.8% 1 Missing and 3 partials ⚠️
src/hera/workflows/io.py 95.2% 0 Missing and 3 partials ⚠️
src/hera/workflows/script.py 94.1% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main    #920     +/-   ##
=======================================
+ Coverage   80.7%   81.2%   +0.4%     
=======================================
  Files         50      51      +1     
  Lines       3908    4045    +137     
  Branches     793     843     +50     
=======================================
+ Hits        3157    3287    +130     
- Misses       564     565      +1     
- Partials     187     193      +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@elliotgunton elliotgunton force-pushed the pydantic-io branch 16 times, most recently from fe9ff79 to 3a0a39f Compare January 17, 2024 17:13
@samj1912
Copy link
Collaborator

samj1912 commented Jan 18, 2024

@elliotgunton can you provide a brief examples/details in the PR description of the functionality that is currently enabled?

Specifically I am curious how we handle things when a user -

  • has duplicate param names between a non annotated param/annotated param/normal param
  • has duplicate names between params/artifacts (this should be valid?)
  • has a parameter/artifact called output/exit code
  • has multiple output return types
  • has multiple input types
  • how is the input object constructed from argo params
  • I see we added support for defaults as well (is it also added back to the normal params apart from hera io models?)

Having that in the PR description would help future readers understand what is happening.

src/hera/shared/serialization.py Outdated Show resolved Hide resolved
src/hera/workflows/io.py Outdated Show resolved Hide resolved
@elliotgunton elliotgunton force-pushed the pydantic-io branch 3 times, most recently from 5bc252c to f37fbea Compare January 18, 2024 16:00
@elliotgunton elliotgunton changed the title [WIP] Add IO models for use with Hera Runner Add IO models for use with Hera Runner Jan 22, 2024
@elliotgunton elliotgunton force-pushed the pydantic-io branch 2 times, most recently from b553cbb to 66a4443 Compare January 24, 2024 14:53
@elliotgunton elliotgunton changed the title Add IO models for use with Hera Runner Add Pydantic V1 IO models for use with Hera Runner Jan 24, 2024
Comment on lines +451 to +452
if len(inspect.signature(source).parameters) != 1:
raise SyntaxError("Only one function parameter can be specified when using a RunnerInput.")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one function param that is a RunnerInput subclass is allowed

Comment on lines +367 to +368
if isinstance(annotation, type) and issubclass(annotation, RunnerOutput):
raise ValueError("RunnerOutput cannot be part of a tuple output")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one RunnerOutput is allowed as the function return type. Note that we also haven't dealt with nested Annotated types above.

@elliotgunton
Copy link
Collaborator Author

@samj1912 ready for re-review! Self-reviewed to double check this PR stands on its own to implement for Pydantic V1 and signposted the single input/output we discussed 🚀

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
* Add annotations test only in this commit

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
* Tidy up kwarg mapping code - may need another refactor
* TODO artifact outputs

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
* Add environment variable to script templates using the feature
* Use actual error message for ValueError when flag not enabled

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
* Also allow setting defaults of a RunnerInput param via its func param default
* Remove runner tests that use multiple inputs
* Add script annotations test for using a RunnerInput in a List

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
@samj1912 samj1912 merged commit 7998f2e into main Jan 30, 2024
20 checks passed
@samj1912 samj1912 deleted the pydantic-io branch January 30, 2024 14:36
elliotgunton added a commit that referenced this pull request Feb 12, 2024
Part of #858, follow up to #920  
* Enables users to use Pydantic V2 objects in their scripts while
maintaining V1 usage internally for Hera
* RunnerInput/Output classes are created in hera.workflows.io depending
on the value of _PYDANTIC_VERSION - users will automatically get classes
using their (possibly pinned) version of Pydantic
* I have not yet managed to get an explicit test that uses the automatic
import of V1 classes - we may just have to rely on the Pydantic V1 CI
check.

---------

Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver:minor A change requiring a minor version bump type:enhancement A general enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants