I've been a native ops/jobs user on Dagster since 0.1.0 or even earlier, I can't remember anymore, lmao. Anyway.
I would like to suggest that graph_assets be treated like real assets. The asset decorator gives us lots of parameters to use, and since the point of the graph_asset decorator is to convert a DAG of ops into a single asset, I think it should have the same features as a normal asset.
Why is that? Because sometimes doing something like this
import pandas as pd
from dagster import BackfillPolicy, asset

# daily_partitions, RawDataAPI and _daily_partition_seq are defined elsewhere
@asset(
    compute_kind="api",
    partitions_def=daily_partitions,
    metadata={"partition_expr": "created_at"},
    backfill_policy=BackfillPolicy.single_run(),
)
def users(context, api: RawDataAPI):
    """A table containing all users data"""
    # during a backfill the partition range will span multiple days
    # during a single run the partition range will be for a single day
    first_partition, last_partition = context.asset_partitions_time_window_for_output()
    partition_seq = _daily_partition_seq(first_partition, last_partition)
    all_users = []
    # partitions are fetched one after another, in a single step
    for partition in partition_seq:
        resp = api.get_users(partition)
        users = pd.read_json(resp.json())
        all_users.append(users)
    return pd.concat(all_users)
just for the sake of using the @asset decorator isn't nice when you could use the most powerful feature in Dagster so far, which is DynamicOut. Instead of processing the partitions in a linear for loop, each partition would be processed in its own process, all in parallel. Faster. Better to visualize. Better to debug. Imagine one partition of this asset hits a problem: the whole for loop would crash.
I'm giving this example knowing you could just use a multiprocessing lib here, but for me it's better to use something Dagster already gives us.
"I have a dream. The dream in which I build entire pipelines with OPs and they're just assets in the end, the best of two worlds, having your steps and also having them as your assets with all the capabilities that Dagster can offer." - MELO, Ismael. (2024)
And this is just one use case of many that may exist. I don't know if what I said makes sense to you guys, but to me it makes a lot of sense, hehe. If I'm crazy, please let me know :)
Ideas of implementation
No response
Additional information
No response
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.
For me, the fundamental issue with ops and graph assets is that you can't configure the ops in a graph to be executed in a single execution using an in-memory IO manager, on any executor.