PIN 16: Use `cache_for` and `cache_validator` kwargs against a task's checkpointed Result
#2619
Labels
enhancement
Current behavior
Please describe how the feature works today
Currently users can cache data output from a task run between flow runs via two paths:
- `target` and `result` kwargs on tasks, which enforce cache behavior using checkpointed results, based on whether a specific location name exists in persistent storage
- `cache_for` / `cache_key` / `cache_validator` kwargs on tasks, which enforce cache behavior based on whether a matching entry exists in `prefect.context.caches`, held in memory by the current Python process

The former works only against a path name existing (it supports no other terms, such as a duration or arbitrary validation against the deserialized Python object), but it is persistent: a flow run started later can use the same data as long as it is configured to look at the same location. The latter provides more cache validation options, since it supports arbitrary validation functions against the cached value, but it only works for as long as `prefect.context.caches` exists in memory.
Proposed behavior
Please describe your proposed change to the current behavior
Implement the part of PIN 16 that describes honoring kwargs like `cache_for` and `cache_validator` against not just the in-memory cache but also the serialized object from a prior task run, which should have been persisted by the Result interface. This gives us a persistent cache 'for free', since Result subclasses already write to their storage backend during pipeline execution. From there, it becomes useful to extend just the Result subclasses to include more cache-like backends.
Example
Please give an example of how the enhancement would be useful
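To make the scenario in this section concrete, here is a minimal standard-library sketch of the check being proposed. It is not Prefect's API: `cached_task` is a hypothetical stand-in for the task decorator, `pickle` stands in for whatever serialization a Result subclass uses, the returned `(state name, value)` tuple is a simplification of Prefect's state objects, and measuring the `cache_for` window from the file's modification time is an assumption this issue does not specify.

```python
import pickle
import time
from datetime import timedelta
from functools import wraps
from pathlib import Path


def cached_task(location, cache_for, cache_validator=None):
    """Hypothetical decorator: reuse a persisted result while it is still fresh."""

    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            path = Path(location).expanduser()
            # Pre-run check: look for a persisted result at the location.
            if path.exists():
                # Assumption: the result's age is taken from the file's mtime.
                age = time.time() - path.stat().st_mtime
                if age <= cache_for.total_seconds():
                    value = pickle.loads(path.read_bytes())
                    # Optional arbitrary validation against the deserialized object.
                    if cache_validator is None or cache_validator(value):
                        return "Cached", value
            # Cache miss: run the task and persist its result for later runs.
            value = fn(*args, **kwargs)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(pickle.dumps(value))
            return "Success", value

        return wrapper

    return decorator


# Mirrors the scenario in this section: a 10-second window and ~/.prefect/hello.
@cached_task(location="~/.prefect/hello", cache_for=timedelta(seconds=10))
def hello():
    return "hello"
```

In this sketch, calling `hello()` twice within 10 seconds runs the function body only once; the second call reads the persisted value back, even from a fresh Python process, which is the persistence the in-memory `prefect.context.caches` path cannot offer.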
For example, a task decorator that sets `cache_for` to 10 seconds and points the task's Result at `~/.prefect/hello` would check, during the pre-pipeline checks for this task run, for a file at `~/.prefect/hello` and use the data from that location as that task run's `Cached` state's result, as long as the rerun was within 10 seconds of the first run.
Other thoughts/ideas:
`target` kwarg, since the `target` kwarg is purposefully only about location existence, not any other cache validation.