So I started working on output caching for tasks; however, I quickly ran into a question: where should the cached data be stored?
I began implementing it analogously to #78 by putting the outputs into a task's State, which is then fed into the FlowRunner via task_states. However, after a successful first run the returned state is Success, which effectively acts as an output cache but with no input validation to ensure the cache is still valid.
We could continue down this route, and let the TaskRunner perform the necessary cache-validation checks when the task has a Success state, but this might alter the fundamental behavior of the TaskRunner when it is provided with a Success state (for example, if the TaskRunner receives a Success state whose cache is not valid anymore).
Two other options I immediately see:
the Task itself maintains the cache
prefect.Context somehow maintains a cache for all tasks via a dictionary _cached_tasks
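To make the second option a bit more concrete, here is a rough sketch. It does not use prefect.Context itself: a plain dictionary stands in for it, and the `store` and `lookup` helpers, the key layout, and the "inputs must match exactly" rule are all assumptions for illustration only.

```python
from typing import Any, Dict, Tuple

# Stand-in for prefect.Context (assumption): a plain dict with a `_cached_tasks` mapping.
context: Dict[str, Any] = {"_cached_tasks": {}}


def store(task_name: str, inputs: Dict[str, Any], result: Any) -> None:
    """Record a task's result together with the inputs that produced it (hypothetical helper)."""
    context["_cached_tasks"][task_name] = (dict(inputs), result)


def lookup(task_name: str, inputs: Dict[str, Any]) -> Tuple[bool, Any]:
    """Return (hit, result); a hit requires the cached inputs to equal the current ones."""
    entry = context["_cached_tasks"].get(task_name)
    if entry is None:
        return False, None
    cached_inputs, result = entry
    if cached_inputs != inputs:
        # Inputs changed, so the cached output can no longer be trusted.
        return False, None
    return True, result
```

Keying by task name alone means the cache is only as wide as the context that holds it, which is exactly the cross-Flow question raised here.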
The answer should probably depend on how we envision this output cache behaving in relation to our server and whether the cache should work across Flows or not.
Another option is to create a new State, CachedState, which inherits from Success but is handled differently: the TaskRunner will rerun the task if the cache is no longer valid. (I think I actually prefer this option the most.)
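A minimal sketch of how this could behave, using stand-in classes rather than the real State and TaskRunner. The names `cached_inputs`, `is_valid_for`, and `run_task` are hypothetical, and "valid" is assumed to mean "same inputs as last time":

```python
from typing import Any, Callable, Dict, Optional


class State:
    """Stand-in for a task state (not the real class)."""

    def __init__(self, result: Any = None) -> None:
        self.result = result


class Success(State):
    pass


class CachedState(Success):
    """A Success that also remembers the inputs that produced its result."""

    def __init__(self, result: Any = None,
                 cached_inputs: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(result=result)
        self.cached_inputs = dict(cached_inputs or {})

    def is_valid_for(self, inputs: Dict[str, Any]) -> bool:
        # Assumed validation rule: the cache holds only for identical inputs.
        return self.cached_inputs == inputs


def run_task(fn: Callable[..., Any], inputs: Dict[str, Any],
             state: Optional[State] = None) -> State:
    """Hypothetical stand-in for the TaskRunner's handling of an incoming state."""
    # Check CachedState first, since it is also an instance of Success.
    if isinstance(state, CachedState):
        if state.is_valid_for(inputs):
            return state  # cache is still valid: skip the run
        # otherwise fall through and rerun the task
    elif isinstance(state, Success):
        return state  # a plain Success keeps its existing behavior
    result = fn(**inputs)
    return CachedState(result=result, cached_inputs=inputs)
```

The point of the subclass is that a plain Success is still never rerun, so the existing TaskRunner behavior is untouched; only a CachedState opts in to validation.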
Currently, tasks have a dummy "checkpoint" attribute, which also needs to be handled.