Please reconsider 'guessing' whether a timestamp is in milliseconds or seconds #7940
Comments
This has been around since the very earliest days of pydantic, and it has saved me personally hours in combined time not having to think about whether something is in seconds or milliseconds. I won't be changing the default. But I would consider a config switch or similar to stop using the watershed. |
Thanks for the quick reply. I will certainly agree the behavior is more convenient 😄, and probably saves many developers a lot of time. Although I am curious: is there another validation library or language that also does it this way? Part of the reason for my bemusement is that I've never encountered this type of behavior before 😅. Regarding a potential config switch: how about an option called |
Solution sounds good to me, "infer" or "guess" might be better than "auto". |
I just realized you're the same person who asked the question on HN, sorry for my dumb initial response. 🙏 By the way, to fix this I think you'll need to update speedate, then pydantic-core. |
Thanks for mentioning. Combined with your twitter post today regarding issue submission quality, I was concerned I hadn't brought it up correctly.
I do feel compelled to say that I disagree with keeping the default behavior like this. If you've already made up your mind, I won't argue — it's your project after all, and there might be some context I'm missing. Part of me really wants to write a blog post to convince the world 🔥 😉 , but it's a lot more constructive to work with you all on this 🤝
Happy to submit the PRs — if you're OK with a slow pace. I'll need to brush up on Rust, and the codebase in general. |
No, you definitely weren't the trigger for my twitter idea. I'm not pretending this is a democracy, and I'm not resorting to the tyranny of the masses / hegemony of the upvote. More to the point, we definitely can't change the behaviour anyway until V3. Obviously happy to accept the config switch asap. The only thing I'd consider in V3 is to move away from a single watershed to ranges which are accepted, with errors outside those ranges. That would mean it doesn't "suddenly" change. E.g. timestamps are only accepted 1980-2100 or whatever in "infer" mode. |
Indeed, this is the way. It would remove any ambiguous cases, in favor of an explicit boundary 🙌 . Actually, I think you can even move the upper bound way up 🤔. The only requirement is that there is no overlap between the numerical ranges... Sadly we can't move it all the way to
or alternatively, put the lower bound at 1985 and you don't need an upper bound — maybe cleaner? |
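To make the range idea concrete, here is a minimal sketch in plain Python. It is not pydantic or speedate code: the name `infer_datetime`, the `RANGES` table and the 1980-2100 window (taken from the "or whatever" example above) are all hypothetical. It only illustrates that, once the per-unit ranges do not overlap, every value is either unambiguous or rejected.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical accepted window, borrowed from the "1980-2100" example.
LOWER = (datetime(1980, 1, 1, tzinfo=timezone.utc) - EPOCH).total_seconds()
UPPER = (datetime(2100, 1, 1, tzinfo=timezone.utc) - EPOCH).total_seconds()

# The same window expressed per unit: (low, high, divisor to get seconds).
RANGES = {
    "seconds": (LOWER, UPPER, 1),
    "milliseconds": (LOWER * 1_000, UPPER * 1_000, 1_000),
}

# The scheme only works if the numeric ranges do not overlap.
assert UPPER < LOWER * 1_000


def infer_datetime(value: float) -> datetime:
    """Return the datetime for `value`, or raise if it fits no accepted range."""
    for low, high, divisor in RANGES.values():
        if low <= value < high:
            return EPOCH + timedelta(seconds=value / divisor)
    raise ValueError(f"timestamp {value!r} is outside every accepted range")
```

With this window, a millisecond timestamp from mid-1970 falls in the gap between the two ranges and raises a `ValueError` instead of being silently reinterpreted as seconds.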
Agree that decisions shouldn't be taken this way. Although, discussion could yield interesting perspectives from people that use Pydantic daily.
Makes total sense to save this for a breaking release. Also logical to make the transition itself as small as possible. Assuming that defaults are trivial to change, the big decision can be deferred if you like. What's the ballpark date for V3? This will also help me plan my work on the PRs. |
I guess V3 should be expected in the middle of next year. But as I said, we can add the config flag asap. |
I'd be happy to review your PRs for the config flag in both |
I'm also interested in making sure we can pin down the unit used for timestamps. Our use case is to validate data pipelines created by clients and running with polars. @ariebovenberg I would be more than happy to work on |
Yes, more units should be fine once we have agreed ranges. |
@PrettyWood Thanks for the offer, but I'd like to take a stab at it myself first. I'll let you know if it doesn't work out. My 2 cents on more time units: Makes sense to support it, but note that this would make the "infer" behavior very broad. I'd personally prefer that a timestamp 1,000,000 times the usual size raise an exception, instead of being silently interpreted as nanoseconds 🤔. An alternative would be |
A naive implementation of this in |
@samuelcolvin could you have a look at the speedate PR? Some open questions are blocking progress at the moment. |
@PrettyWood you're welcome to take over this issue, if you like. I won't have time for it in the near future. Feel free to build on my earlier "work in progress" PRs, or to start over 👍 |
Ok I'll take some time to work on it soon |
Initial Checks
Description
The following surprising behavior recently tripped me up. When parsing datetimes from unix timestamps, Pydantic determines automagically whether you mean seconds or milliseconds:
(from the docs)
This seems sensible at first...but if you're working with milliseconds, Pydantic will decide that most timestamps in 1970 and 1969 are just too small, and they will be interpreted as seconds instead — leading to wildly different times.
While most Pydantic users won't encounter this, it is not a stretch to imagine some Pydantic users are handling data from the 1970s and their validation library, ironically, is silently changing their timestamps.
If you'd rather not change this, please consider adding a big red warning to the docs on the implications of this behavior on timestamps in 1970 and 1969.
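The effect can be reproduced without pydantic. The sketch below is a standalone illustration, not the library's implementation: `guess_datetime` is a hypothetical helper, and the watershed value of `2e10` is an assumption chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed watershed for illustration: values whose absolute value exceeds it
# are treated as milliseconds, everything else as seconds.
MS_WATERSHED = 2e10
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def guess_datetime(value: float) -> datetime:
    """Interpret `value` as seconds, or as milliseconds above the watershed."""
    if abs(value) > MS_WATERSHED:
        value /= 1000
    return EPOCH + timedelta(seconds=value)


# A millisecond timestamp for 1 June 1970.
intended = datetime(1970, 6, 1, tzinfo=timezone.utc)
millis = int((intended - EPOCH).total_seconds() * 1000)

# The value is below the watershed, so it is read as seconds instead.
parsed = guess_datetime(millis)
print(intended.isoformat())
print(parsed.isoformat())
```

Under these assumptions the millisecond value for June 1970 is read as seconds and lands centuries in the future, while a present-day millisecond timestamp is converted as expected.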
Example Code
Python, Pydantic & OS Version