[RFC] Use pydantic #42

Closed
wants to merge 20 commits into from

@gadomski (Member) commented Sep 18, 2023

Overview

This is a total rewrite of the library to use pydantic under the hood for validation and input/output schema generation. Key points:

  • All tasks are now pydantic models, and their input and output data structures are also pydantic models. This allows jsonschema generation for input, output, and task config.
  • Custom subclasses of Task are provided for common use-cases:

    Class definition     Key method
    -------------------  ----------------------------------------------------------
    Task                 process(self, input: List[Any]) -> List[Any]
    StacOutTask          process_to_items(self, input: List[Any]) -> List[Item]
    StacInStacOutTask    process_items(self, input: List[Item]) -> List[Item]
    OneToManyTask        process_one_to_many(self, input: Any) -> List[Any]
    OneToOneTask         process_one_to_one(self, input: Any) -> Any
    ToItemTask           process_to_item(self, input: Any) -> Item
    ItemTask             process_item(self, item: Item) -> Item
    HrefTask             process_href(self, href: str) -> Item

  • Models are also included for Payload and its substructures. This will hopefully solidify the data model for payloads (in the past there has been confusion about, for example, whether process is a dict or a list)
  • Handling has been moved up to Payload, so tasks just worry about themselves
  • I'm using stac-asset instead of fsspec for reads (and, eventually, writes)
  • An example is provided of how to make a simple item using rio-stac, both as a standalone file and as a package. See the examples/ directory for more information.
  • CLI provides stac-task list for showing available tasks, and stac-task jsonschema TASK MODEL for dumping the jsonschema
  • I've changed the package name to stac_task (from stactask) -- I figure this is a good chance to bring the naming in line with other repos AND clearly signal these very breaking changes to downstreams
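
To make the class table above more concrete, here is a minimal sketch of how the specialised task classes could layer on top of the base process method. It is not the code from this PR: it uses only the standard library (no pydantic, so no validation or jsonschema generation), it represents a STAC Item as a plain dict, and the inheritance chain (HrefTask building on OneToOneTask) as well as the MinimalItemTask class and the s3://bucket/... hrefs are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

# Assumption: a STAC Item is modelled as a plain dict here.
# The PR itself describes pydantic models for these structures.
Item = Dict[str, Any]


class Task(ABC):
    """Base class: consumes a list of inputs and produces a list of outputs."""

    @abstractmethod
    def process(self, input: List[Any]) -> List[Any]:
        ...


class OneToOneTask(Task):
    """Maps every input value to exactly one output value."""

    def process(self, input: List[Any]) -> List[Any]:
        return [self.process_one_to_one(value) for value in input]

    @abstractmethod
    def process_one_to_one(self, input: Any) -> Any:
        ...


class HrefTask(OneToOneTask):
    """Turns each href into one Item (hypothetical layering on OneToOneTask)."""

    def process_one_to_one(self, input: Any) -> Item:
        return self.process_href(str(input))

    @abstractmethod
    def process_href(self, href: str) -> Item:
        ...


class MinimalItemTask(HrefTask):
    """Hypothetical user-defined task: builds a bare-bones Item per href."""

    def process_href(self, href: str) -> Item:
        return {
            "type": "Feature",
            "id": href.rsplit("/", 1)[-1],  # file name doubles as the Item id
            "assets": {"data": {"href": href}},
        }


# Placeholder hrefs; nothing is read from these locations.
items = MinimalItemTask().process(["s3://bucket/a.tif", "s3://bucket/b.tif"])
print([item["id"] for item in items])
```

In this arrangement a task author only implements the narrowest method that fits (here process_href), and the list-in, list-out contract of process is inherited. The other rows of the table would presumably follow the same pattern.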

Opening as a draft PR for visibility and feedback.

Task list

  • Flesh out documentation
  • Unit test every generic class
  • Get the plugin story sorted out
  • Provide more examples of building custom tasks and how they would be called / invoked
  • Asset download/upload helpers
    • As an aside, I'm not sure how much of this we actually want to put in stac-task, and how much should be farmed out to stac-asset
  • Provide some common "helper" tasks, such as a pystac-client search. Maybe even include rio-stac via an extra dependencies set.
  • Update GitHub branch protection for the new CI job names
  • Build and deploy documentation with Read the Docs

Includes:
- pydantic under the hood
- move handler out of task into payload
- STAC models
@gadomski (Member, Author) commented

Closing as OBE (overtaken by events). Maybe we'll revisit someday.

@gadomski gadomski closed this Dec 28, 2023
@gadomski gadomski deleted the rewrite branch December 28, 2023 14:30
@gadomski gadomski restored the rewrite branch December 28, 2023 14:30