metadata schema for IO managers #9617

Open
sryza opened this issue Sep 7, 2022 · 2 comments

Comments

@sryza
Contributor

sryza commented Sep 7, 2022

E.g.

@io_manager(output_definition_metadata_schema={"database": str, "schema": str})
def my_io_manager():
    ...

@asset(metadata={"database": "abc", "schema": "xyz"})
def my_asset():
    ...
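
To make the proposal concrete, here is a minimal sketch of the kind of check such a schema could drive. It is plain Python and calls no Dagster API: `validate_metadata` is a hypothetical helper, and `output_definition_metadata_schema` above is the proposed argument, not an existing one. Allowing extra keys is also an assumption.

```python
from typing import Any, Mapping


def validate_metadata(schema: Mapping[str, type], metadata: Mapping[str, Any]) -> None:
    """Raise ValueError if a schema key is missing or its value has the wrong type.

    Keys in `metadata` that the schema does not mention are ignored (an assumption).
    """
    missing = sorted(key for key in schema if key not in metadata)
    if missing:
        raise ValueError(f"missing metadata keys: {missing}")
    for key, expected in schema.items():
        if not isinstance(metadata[key], expected):
            raise ValueError(
                f"metadata[{key!r}] should be {expected.__name__}, "
                f"got {type(metadata[key]).__name__}"
            )


SCHEMA = {"database": str, "schema": str}

# Passes silently: both keys are present and both values are strings.
validate_metadata(SCHEMA, {"database": "abc", "schema": "xyz"})
```

With a check like this run at definition time, an asset that omits `schema` or passes a non-string value would fail early instead of when the IO manager first handles its output.
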
@chasleslr
Contributor

chasleslr commented Mar 17, 2023

+1

I keep running into this scenario: an IO manager needs to be configured for a given asset, but using metadata for this feels like a "hack" because it bypasses all of the great validation Dagster provides for config schemas. For example, I often want to use a common SparkDataFrameIOManager and configure it differently for certain assets (e.g. partitionBy, mode), but currently it seems that this has to be configured either at the instance level or through asset metadata.
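
As a rough illustration of why the metadata route feels like a hack, the sketch below parses per-asset write options out of an untyped metadata dict by hand. `SparkWriteOptions`, its fields, defaults, and the list of allowed modes are all hypothetical, and no Spark or Dagster call is made; this is the sort of validation a config schema would otherwise provide.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Tuple

# Assumed set of accepted write modes, for illustration only.
ALLOWED_MODES = {"append", "overwrite", "ignore", "error", "errorifexists"}


@dataclass(frozen=True)
class SparkWriteOptions:
    """Hypothetical per-asset options an IO manager might read from metadata."""

    mode: str = "overwrite"
    partition_by: Tuple[str, ...] = ()

    @classmethod
    def from_metadata(cls, metadata: Mapping[str, Any]) -> "SparkWriteOptions":
        mode = metadata.get("mode", "overwrite")
        if mode not in ALLOWED_MODES:
            raise ValueError(f"unsupported mode: {mode!r}")
        partition_by = metadata.get("partitionBy", ())
        # Accept a single column name as well as a list of names.
        if isinstance(partition_by, str):
            partition_by = (partition_by,)
        return cls(mode=mode, partition_by=tuple(partition_by))


options = SparkWriteOptions.from_metadata({"mode": "append", "partitionBy": ["date"]})
```

Every IO manager that reads metadata this way has to repeat this kind of parsing, which is the duplication a declared schema would remove.
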

@danielgafni
Contributor

Another way to do this is to return a tuple of the actual object and its I/O parameters (saved with pickle). It's much uglier, but also more powerful, since the parameters can be complex (arbitrary Python objects).
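
One possible reading of that tuple approach, as a minimal sketch: `TupleAwareStore` is a hypothetical stand-in for an IO manager, written in plain Python with `pickle` rather than against Dagster's `IOManager` interface, and the `subdir` parameter is invented for illustration.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict, Tuple


class TupleAwareStore:
    """Hypothetical store that accepts (object, io_params) tuples as outputs."""

    def __init__(self, base_dir: Path) -> None:
        self.base_dir = base_dir

    def handle_output(self, name: str, value: Tuple[Any, Dict[str, Any]]) -> Path:
        # Split the returned tuple into the real payload and its I/O parameters.
        obj, params = value
        path = self.base_dir / params.get("subdir", "") / f"{name}.pkl"
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("wb") as f:
            pickle.dump(obj, f)
        return path

    def load_input(self, path: Path) -> Any:
        # Downstream consumers get only the payload back, not the parameters.
        with path.open("rb") as f:
            return pickle.load(f)


def my_asset() -> Tuple[Any, Dict[str, Any]]:
    # The asset returns its value together with the parameters for storing it.
    return [1, 2, 3], {"subdir": "raw"}


store = TupleAwareStore(Path(tempfile.mkdtemp()))
saved_path = store.handle_output("my_asset", my_asset())
loaded = store.load_input(saved_path)
```

The trade-off is that every asset's return type now carries storage concerns, and nothing checks the parameters until the store unpacks them.
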
