Cache key policies #13791

Merged
merged 18 commits into from
Jun 5, 2024

Conversation

cicdw
Member

@cicdw cicdw commented Jun 5, 2024

This PR introduces the final user-facing feature in the transactions story: cache policies (formerly known as both idempotency key policies and transaction key policies).

At its core, the idea with cache policies is relatively simple: they expose configuration for how your task should cache its results across executions. In this spirit, this PR provides a menu of options for configuring this behavior. Some examples include:

  • NONE: never cache
  • INPUTS: cache based on a hash of the inputs to the task
  • TASKDEF: cache based on the source code of the task function
  • providing a cache_key_fn: in a future PR this will create a Custom policy definition

This gets interesting in two respects. First, these policies can be combined through basic operations to create more advanced policies:

from prefect.records.cache_policies import INPUTS, TASKDEF

# a policy that hashes all inputs _except_ any input named "foo"
my_policy = INPUTS - "foo" 

# a policy that uses both hashed inputs _as well as_ task definition 
another_policy = INPUTS + TASKDEF 
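
As a rough illustration of what composing policies could mean for the resulting cache key, here is a minimal, self-contained sketch. It does not use Prefect and is not the implementation in this PR: `ToyInputs`, `ToyTaskDef`, `ToyCompound`, and `compute_key` are hypothetical names, and hashing the function's bytecode is only a stand-in for "the source code of the task function".

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


def _digest(*parts: str) -> str:
    """Hash one or more string parts into a single hex key."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()


@dataclass
class ToyInputs:
    """Hypothetical policy: key depends on the inputs, minus excluded names."""

    exclude: List[str] = field(default_factory=list)

    def compute_key(self, fn: Callable, inputs: Dict[str, Any]) -> str:
        kept = {k: v for k, v in inputs.items() if k not in self.exclude}
        return _digest(json.dumps(kept, sort_keys=True, default=repr))


@dataclass
class ToyTaskDef:
    """Hypothetical policy: key depends on the function definition.

    Assumption: bytecode is used here as a stand-in for source code.
    """

    def compute_key(self, fn: Callable, inputs: Dict[str, Any]) -> str:
        return _digest(fn.__code__.co_code.hex())


@dataclass
class ToyCompound:
    """Hypothetical compound policy: combines the keys of its members."""

    policies: List[Any]

    def compute_key(self, fn: Callable, inputs: Dict[str, Any]) -> str:
        return _digest(*(p.compute_key(fn, inputs) for p in self.policies))


def add(x, y, foo=None):
    return x + y


# Roughly the shape of (INPUTS - "foo") + TASKDEF in this toy model.
policy = ToyCompound([ToyInputs(exclude=["foo"]), ToyTaskDef()])

key_a = policy.compute_key(add, {"x": 1, "y": 2, "foo": "a"})
key_b = policy.compute_key(add, {"x": 1, "y": 2, "foo": "b"})
key_c = policy.compute_key(add, {"x": 1, "y": 3, "foo": "a"})
```

In this toy model, `key_a` and `key_b` match because only the excluded `foo` differs, while `key_c` differs because `y` changed.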

Second, at a deeper level, these policies configure the transactional behavior of tasks by specifying the location of the transaction record. A record is written only when a task commits.
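
To make the "record is written on commit" idea concrete, here is a toy sketch under stated assumptions. `ToyRecordStore` and `ToyTransaction` are hypothetical and are not Prefect's transaction API. The key plays the role of the record location that a cache policy would compute, and nothing is written unless the block exits without an error.

```python
from typing import Any, Dict, Optional

_MISSING = object()  # sentinel meaning "nothing staged yet"


class ToyRecordStore:
    """Hypothetical in-memory store mapping cache keys to records."""

    def __init__(self) -> None:
        self._records: Dict[str, Any] = {}

    def read(self, key: str) -> Optional[Any]:
        return self._records.get(key)

    def write(self, key: str, value: Any) -> None:
        self._records[key] = value


class ToyTransaction:
    """Hypothetical transaction: stages a result, writes it only on commit."""

    def __init__(self, store: ToyRecordStore, key: str) -> None:
        self.store = store
        self.key = key
        self._staged: Any = _MISSING

    def stage(self, value: Any) -> None:
        self._staged = value

    def __enter__(self) -> "ToyTransaction":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Commit (write the record) only if the body finished without error.
        if exc_type is None and self._staged is not _MISSING:
            self.store.write(self.key, self._staged)
        return False  # never swallow exceptions


store = ToyRecordStore()

# A successful "task": the record is written at its key on commit.
with ToyTransaction(store, key="abc") as txn:
    txn.stage(42)

# A failing "task": the staged value is discarded and no record is written.
try:
    with ToyTransaction(store, key="def") as txn:
        txn.stage(0)
        raise RuntimeError("task failed")
except RuntimeError:
    pass
```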

cc: @EmilRex

@cicdw cicdw requested a review from a team as a code owner June 5, 2024 03:06
Collaborator

@chrisguidry chrisguidry left a comment

I dig it! It's a nice interface, just have a think about whether we want those add/subtract operations to be idempotent

src/prefect/records/cache_policies.py (outdated)
Comment on lines +27 to +55
    def __sub__(self, other: str) -> "CompoundCachePolicy":
        if not isinstance(other, str):
            raise TypeError("Can only subtract strings from key policies.")
        if isinstance(self, Inputs):
            exclude = self.exclude or []
            return Inputs(exclude=exclude + [other])
        elif isinstance(self, CompoundCachePolicy):
            new = Inputs(exclude=[other])
            policies = self.policies or []
            return CompoundCachePolicy(policies=policies + [new])
        else:
            new = Inputs(exclude=[other])
            return CompoundCachePolicy(policies=[self, new])

    def __add__(self, other: "CachePolicy") -> "CompoundCachePolicy":
        # adding _None is a no-op
        if isinstance(other, _None):
            return self
        elif isinstance(self, _None):
            return other

        if isinstance(self, CompoundCachePolicy):
            policies = self.policies or []
            return CompoundCachePolicy(policies=policies + [other])
        elif isinstance(other, CompoundCachePolicy):
            policies = other.policies or []
            return CompoundCachePolicy(policies=policies + [self])
        else:
            return CompoundCachePolicy(policies=[self, other])
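
On the idempotency question: in the diff above, `exclude + [other]` appends unconditionally, so subtracting the same name twice leaves a duplicate entry in `exclude`. One possible way to make subtraction idempotent is sketched below. `DedupInputs` is a hypothetical stand-in and not the class from this PR; it uses an immutable tuple and returns `self` when the name is already excluded.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DedupInputs:
    """Hypothetical inputs policy whose subtraction is idempotent."""

    exclude: Tuple[str, ...] = ()

    def __sub__(self, other: str) -> "DedupInputs":
        if not isinstance(other, str):
            raise TypeError("Can only subtract strings from key policies.")
        # Subtracting an already-excluded name changes nothing.
        if other in self.exclude:
            return self
        return DedupInputs(exclude=self.exclude + (other,))


once = DedupInputs() - "foo"
twice = DedupInputs() - "foo" - "foo"
```

With this variant, `once` and `twice` compare equal, so repeating a subtraction cannot produce a distinct policy object or, depending on how keys are derived, a different key.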
Collaborator

🔥 Awesome API

@cicdw cicdw merged commit 8861ab3 into main Jun 5, 2024
27 checks passed
@cicdw cicdw deleted the key-policies branch June 5, 2024 16:30
2 participants