Skip to content

Conversation

@anuthebananu
Copy link
Contributor

@anuthebananu anuthebananu commented Apr 22, 2025

Summary & Motivation

[EDIT 04/23] - based on PR feedback, limiting this PR to docstring only, will remove direct usages of SerializableTimeDelta in cloud code.


I ran into a pretty subtle "bug" with SerializableTimeDelta - or an accessibility issue, really - when trying to use it for some datetime arithmetic. This PR adds some docstrings and a total_seconds() method as guardrails to hopefully make it less likely to bite someone else in future.

Let's say you create a SerializableTimeDelta from a datetime.timedelta that is 24 hours long:

td = datetime.timedelta(hours=24)
serializable_td = SerializableTimeDelta.from_timedelta(td)

If you now want to get the duration of serializable_td in seconds, you may naturally reach for the seconds attribute. You would expect this to equal 86,400 seconds. However, this is not the case - it would actually be 0!

# This resolves to 0
seconds = serializable_td.seconds 

# This resolves to 86400
seconds = serializable_td.to_timestamp().total_seconds()

So I added a total_seconds() method that is pass through to datetime.timedelta.total_seconds and some docstrings warning not to use the days, seconds or microseconds properties for any datetime arithmetic.

How I Tested These Changes

bk

Changelog

Insert changelog entry or delete this section.

Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@anuthebananu anuthebananu changed the title guardrails for fetching SerializableTimeDelta duration Guardrails for getting duration of a SerializableTimeDelta Apr 22, 2025
@anuthebananu anuthebananu marked this pull request as ready for review April 22, 2025 19:42
@OwenKephart
Copy link
Contributor

agreed that this stuff is very confusing but generally would prefer for SerializableTimeDeltas to not really make their way into our application code -- basically I think they should be thought of as purely an implementation detail and code that interacts with them should basically always do so by accessing a property that does the conversion for you on the class that has SerializableTimeDelta values.

do you have the example where in code this specific bug came up? just trying to visualize the scenario a bit better

Copy link
Contributor Author

anuthebananu commented Apr 22, 2025

The exact example I laid out in the PR description (trying to get timedelta duration for a 24 hour SerializableTimeDelta) caused a bug with asset freshness evaluation, where basically every asset in dagster-open-platform was failing freshness when I attached a freshness policy with fail_window of 24 hours. Fixed here: https://github.com/dagster-io/internal/pull/15236

Hear you on not using SerializableTimeDelta directly in code - if you still feel strongly, I can:

  • remove total_seconds() helper method added in this PR
  • add a helper or properties on TimeWindowFreshnessPolicy to convert fail_window and warn_window from SerializableTimeDelta to plain old timedelta for us

Copy link
Contributor Author

If we don't want SerializableTimeDelta to be used in app code, it might be worth noting this in the docstring as well.

Copy link
Collaborator

@smackesey smackesey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strong agree with @OwenKephart that we don't want SerializableTimeDelta in application code. Any place it pops up it should be turned into a regular timedelta.

  • remove total_seconds() helper method added in this PR
  • add a helper or properties on TimeWindowFreshnessPolicy to convert fail_window and warn_window from SerializableTimeDelta to plain old timedelta for us

Yes ^^

Copy link
Contributor

@OwenKephart OwenKephart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

req'ing changes for queue management but yeah I think that we're on the same page here -- in fact I think we really want to rename the fields fail_window and warn_window to serializable_fail_window and serializable_warn_window to make room for the fail_window and warn_window properties to be the "correct" properties to set. my bad on missing that on the initial round of reviews

that might have some backcompat implications for our internal users if these things are already getting serialized, we can talk through the pros/cons of some different approaches but if it's just Nick using these right now we could probably just wrap the current deserialization callsite in a try/except so that we ignore deserialization errors. this would mean that:

  1. when we merge this in and push to cloud, all existing freshness policies would be disabled (but no errors would occur)
  2. once Nick updates the dagster version, freshness policies would get turned back on.

it's possible to avoid turning freshness policies off in (1) with some extra effort, it's just mildly complicated to do this in a way that doesn't have us serializing the old storage names forever so it might be nice to avoid that extra work if the above series of events is acceptable

Copy link
Contributor Author

I think it's acceptable to turn off freshness for a day or so, as long as we give Roach a heads up.

I will stack a separate PR for the field renaming.

@anuthebananu anuthebananu changed the title Guardrails for getting duration of a SerializableTimeDelta Clarify SerializableTimeDelta Apr 23, 2025
@anuthebananu anuthebananu changed the title Clarify SerializableTimeDelta Clarify SerializableTimeDelta is for internal usage only in docstring Apr 23, 2025
@anuthebananu anuthebananu changed the title Clarify SerializableTimeDelta is for internal usage only in docstring Clarify SerializableTimeDelta is for internal use only in docstring Apr 23, 2025
@anuthebananu anuthebananu requested a review from smackesey April 23, 2025 15:05
Copy link
Contributor Author

anuthebananu commented Apr 24, 2025

Merge activity

  • Apr 24, 10:04 AM EDT: A user started a stack merge that includes this pull request via Graphite.
  • Apr 24, 10:04 AM EDT: @anuthebananu merged this pull request with Graphite.

@anuthebananu anuthebananu merged commit 43769df into master Apr 24, 2025
6 checks passed
@anuthebananu anuthebananu deleted the anurag/serializable-time-delta-improvements branch April 24, 2025 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants