Add storage_kind to assets, similar to compute_kind #14475
Comments
Hi @danielgafni! Thanks for the suggestion -- this is something we've had in the back of our minds for a while... We've generally conceptualized this as a "storage kind", and (as you mention) potentially having it sourced from the IOManager. This would be displayed independently from the "compute_kind" (e.g. you do your computation in Pandas, then store it in Snowflake vs. you do your computation in Pandas, then store it on S3). Tons of interesting things could be done with this -- for example, you could create views for "all assets stored in Snowflake", or, if we enforce that this tag is only sourced from a resource, you could imagine a "delete asset" command literally invoking a resource to delete the asset (as opposed to the current "wipe asset" command, which just removes materializations).
@OwenKephart any plans to pick this up?
This would be very useful, indeed. I would personally like two distinct tags:
Why I would like two separate tags:
Anyway, being able to search assets by their destination resource/storage and by data format would be useful. And when discovering or reading an asset's lineage, having tags showing where and how it is stored would also be very useful. In my opinion, having default storage/asset_kind tags assigned by the IOManager could be great. But for use cases like external assets or manual dumping, being able to define formats on the asset decorator would be desirable.
I'd prefer one tag rather than several, just on the grounds of simplicity. The additional abstraction of 'medium' vs 'format' isn't always meaningful, and where it is, it can already be encapsulated in the proposed single tag. If we want to tag assets with a variety of metadata - possibly even in dict format - I believe there's already a proposal for this?
Hi all! There is a recent proposal for functionality along these lines here: #19737, which I'd definitely encourage you to add your thoughts to.
@OwenKephart how is that the same thing? We are talking about adding additional UI labels here for the storage on an asset.
@ion-elgreco: I think there is overlap. Proposal #19737 aims to provide a generic system of tags for assets. A tag that describes an asset's output could be part of it. Dagster could unify tag definitions with this system. For example, if it is designed as a set of key/value pairs, the UI could search and pick such tags to allow advanced search or improve asset display (by adding a storage label, for example).
Yes, it can work similarly to how currently
There can be a special interface like:
@asset(storage_kind="parquet", compute_kind="polars", labels={"foo": "bar"})
and internally Dagster could insert them into the labels dictionary:
{
    "storage_kind": "parquet",
    "compute_kind": "polars",
    "foo": "bar",
}
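A minimal sketch of the merge proposed in this comment, in plain Python. The function name `merge_kind_labels` is hypothetical and is not part of Dagster's API; raising an error when a user-supplied label collides with one of the dedicated arguments is also an assumption, since the comment does not say how conflicts should be handled.

```python
from typing import Dict, Optional


def merge_kind_labels(
    storage_kind: Optional[str] = None,
    compute_kind: Optional[str] = None,
    labels: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper (not a Dagster API): fold the dedicated kind
    arguments and the free-form labels into one dictionary."""
    merged: Dict[str, str] = {}
    if storage_kind is not None:
        merged["storage_kind"] = storage_kind
    if compute_kind is not None:
        merged["compute_kind"] = compute_kind
    for key, value in (labels or {}).items():
        # Assumption: a user label must not silently override a dedicated argument.
        if key in merged:
            raise ValueError(f"label {key!r} conflicts with a dedicated argument")
        merged[key] = value
    return merged


result = merge_kind_labels(
    storage_kind="parquet", compute_kind="polars", labels={"foo": "bar"}
)
```

With the arguments from the comment, `result` holds the three keys `storage_kind`, `compute_kind` and `foo`.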
I also believe that actually storage_kind should be a property of Out. So a multi_asset could have different storage kinds for different outputs.
Yes, I'd like this.
Note that as of 1.7.9: Dagster will now display a “storage kind” tag on assets in the UI, similar to the existing compute kind. To set storage kind for an asset, set its dagster/storage_kind tag.
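A small sketch of what setting that tag could look like. The helper `with_storage_kind` and the `team` tag are hypothetical and only illustrate building the tags dictionary in plain Python. The commented-out decorator usage assumes Dagster 1.7.9 or later is installed and that the asset is declared with the `tags` argument of `@asset`.

```python
from typing import Dict, Optional

# Tag key named in the 1.7.9 note above.
STORAGE_KIND_TAG = "dagster/storage_kind"


def with_storage_kind(kind: str, tags: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Hypothetical convenience helper (not part of Dagster): return a copy
    of `tags` with the storage kind tag set to `kind`."""
    merged = dict(tags or {})
    merged[STORAGE_KIND_TAG] = kind
    return merged


asset_tags = with_storage_kind("snowflake", {"team": "analytics"})

# Assumed usage with Dagster 1.7.9 or later installed:
#
#   from dagster import asset
#
#   @asset(compute_kind="pandas", tags=asset_tags)
#   def orders():
#       ...
```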
@garethbrickman then the issue can be closed, right?
What's the use case?
It seems like compute_kind was initially created for ops (is this why it's named op_tags in the frontend code?).
90% of my assets have Python compute kind :)
Would be nice to see Parquet logo in Dagit, for example.
Perhaps a similar storage_kind (which would represent the data type) could be added to assets?
Alternatively, it could be assigned to the IOManager.