Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FromSql/ToSql<Timestamp> to Pg backend for OffsetDateTime #3973

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 22 additions & 0 deletions diesel/src/pg/types/date_and_time/time.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,11 @@ use crate::sql_types::{Date, Time, Timestamp, Timestamptz};
// Postgres timestamps start from January 1st 2000.
const PG_EPOCH: PrimitiveDateTime = datetime!(2000-1-1 0:00:00);

fn to_primitive_datetime(dt: OffsetDateTime) -> PrimitiveDateTime {
let dt = dt.to_offset(UtcOffset::UTC);
PrimitiveDateTime::new(dt.date(), dt.time())
Copy link
Member

@Ten0 Ten0 Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems arbitrary.
I think I would have expected the one that just drops timezone information (turns into naive) without converting to UTC, but arguably that's arbitrary as well.

In fact, it seems that there's really no non-arbitrary way to implicitly drop timezone information.

=> I would prefer that we do not do this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is arbitrary, but this is also just mirroring the current behavior of the existing MySql implementation.

If all your timestamps are sanitized internally to UTC it's reasonably common for applications to store without timezone to save space. Since local tz obviously doesn't work, the only reasonable assumption is UTC.

Copy link
Member

@weiznich weiznich Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main difference to mysql here is that postgresql has a distinct timestamp type (Timestamp with time zone), which does exactly what you describe. So it seems to be a bit odd to reimplement that behavior here.

Given that there is no non-arbitrary way to implicitly drop timezone information I personally would prefer to keep that step explicit for the user. I would go as far and argue that accepting the mysql impl was a mistake back then. Maybe we should put a #[deprecated] on that impl instead?

Copy link
Member

@Ten0 Ten0 Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does trait implementation deprecation work now? 🤩
(Or do we just want to use the without-deprecated feature?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if rustc will show the deprecation warning, but we have the without-deprectaed feature flag for such cases.

}

#[cfg(all(feature = "time", feature = "postgres_backend"))]
impl FromSql<Timestamp, Pg> for PrimitiveDateTime {
fn from_sql(bytes: PgValue<'_>) -> deserialize::Result<Self> {
Expand Down Expand Up @@ -44,6 +49,23 @@ impl ToSql<Timestamp, Pg> for PrimitiveDateTime {
}
}

// Delegate offset datetimes in terms of UTC primitive datetimes; this stores everything in the DB as UTC
#[cfg(all(feature = "time", feature = "postgres_backend"))]
Ten0 marked this conversation as resolved.
Show resolved Hide resolved
impl ToSql<Timestamp, Pg> for OffsetDateTime {
Copy link
Member

@Ten0 Ten0 Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OffsetDateTime is timezone-aware, Timestamp isn't.

OffsetDateTime should be allowed to map to Timestamptz.
PrimitiveDateTime should be allowed to map to Timestamp.

However, implementing ToSql<Timestamp, _> for OffsetDateTime would let users implicitly drop timezone information by writing table::non_timezone_aware_field.eq(timezone_aware_data), which is bug-prone.
This goes strongly against the general principle of non-implicit-cast otherwise used in Rust.

=> I would prefer that we do not do this, and let users write table::non_timezone_aware_field.eq(timezone_aware_data.forget_offset()) if they really want to - which they probably don't: they probably want to update their database schema to be timezone-aware instead (that is, by using the Timestamptz type instead of Timestamp, if they are feeding actually feeding timezone-aware data to their database).

NB: I couldn't find an actual OffsetDateTime::forget_offset function in time (or one named differently that would do that), which I find very surprising (I don't use time), but that would be an issue to solve via a PR in time, not in diesel. (But then maybe such a function doesn't exist because the behavior of a function with a such signature may be unclear because choosing whether to convert by going to UTC or by forgetting offset is arbitrary.) (In any case you can write PrimitiveDateTime::new(dt.date(), dt.time()) until that gets merged)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While strictly true, this winds up being incredibly verbose in practice. In particular where users of diesel don't control their own schema, it's convenient to specify the UTC convention at the row mapping. This is probably why the MySql impl exists.

When deriving Queryable for large structs on Pg you currently have to write the field mapping as PrimitiveDateTime. When doing almost anything with the data later (including serializing for REST output) you wind up with a lot of assume_utc() calls. This becomes confusing to read, because "hey this Postgres time is in UTC" annotation is separated from the database call that populated the PrimitiveDateTime.

In the other direction when storing rows, you wind up having to write wrapper fns to manually convert OffsetDateTime into PrimitiveDateTime for each and every insertion. This is easy to get wrong: if you simply call date() and time() without first explicitly converting to UTC you can easily wind up inadvertently storing an incorrect timestamp. There is no impl in time to perform this conversion, making it almost deliberately awkward. I assume the time crate author has a philosophical reasoning here, in that most people should store and operate on OffsetDateTime instead of PrimitiveDateTime.

Copy link
Member

@Ten0 Ten0 Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case where you don't control your database schema, I think you have two reasonable options:

  • Use serialize_as/serialize_fn/deserialize_as/deserialize_fn on your Insertable/Queryable struct - this removes the need for a conversion at every struct construction (resp. usage)
  • Since it turns out that the actual wire serialization for the Timestamptz type is precisely that of serializing a Timestamp as Utc (cf the module this PR updates), you could update your schema.rs to replace Timestamp by Timestamptz there after the CLI generates it, in effect asserting that when a Timestamp is used in this table, it is indeed expected to be an UTC timestamp. You would then appropriately use timezone-aware throughout your rust application, guaranteeing that these are read and inserted as such, and avoiding non-timezone-aware types in your application (which is best in any case).

This is probably why the MySql impl exists

As outlined by weiznich the reason is more likely that MySQL has a different meaning than Postgres for the Timestamp type, where it is actually timezone-aware (in that it converts provided time from local to UTC for storage):
https://dev.mysql.com/doc/refman/8.0/en/datetime.html

So it looks like on the MySQL side, Timestamp should be allowed to map from timezone-aware types, and DateTime should be allowed to map from non-timezone-aware types. (It looks like it may be possible to extract the appropriate information to behave properly in every case from the MysqlTime that gets deserialized. I'm surprised that we assume utc instead of reading time_zone_displacement (doc: https://dev.mysql.com/worklog/task/?id=10828) but that may be correct.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. I didn't realize the Timestamp behavior was that different, but I guess it does make sense here.

you could update your schema.rs to replace Timestamp by Timestamptz there after the CLI generates it, in effect asserting that when a Timestamp is used in this table, it is indeed expected to be an UTC timestamp.

This is a really clever approach. I was able to make this switch in my own code, and things generally worked as you've described. The only place I ran into issues was with a gt comparison against now... to me that indicates the NOW() typing in Postgres actually isn't correct either. The Postgres documentation seems to indicate that the type returned should be Timestamptz and not naked Timestamp. I was able to work around it by doing a local now_utc and doing a local subtraction, but probably something to add to a corrections list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC now has AsExpression implementations for both Timestamp and Timestamptz, so one shouldn't hit any error if .into_sql()ing it to Timestamptz. I'm surprised that doesn't work implicitly anymore... It's maybe a matter of Nullable/not Nullable...

fn to_sql<'b>(&'b self, out: &mut Output<'b, '_, Pg>) -> serialize::Result {
let prim = to_primitive_datetime(*self);
<PrimitiveDateTime as ToSql<Timestamp, Pg>>::to_sql(&prim, &mut out.reborrow())
}
}

#[cfg(all(feature = "time", feature = "postgres_backend"))]
impl FromSql<Timestamp, Pg> for OffsetDateTime {
fn from_sql(bytes: PgValue<'_>) -> deserialize::Result<Self> {
let prim = <PrimitiveDateTime as FromSql<Timestamp, Pg>>::from_sql(bytes)?;
Ok(prim.assume_utc())
}
}

#[cfg(all(feature = "time", feature = "postgres_backend"))]
impl FromSql<Timestamptz, Pg> for PrimitiveDateTime {
Copy link
Member

@Ten0 Ten0 Mar 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: I see that we already do implement this, both here and in the chrono module, and I don't like it for the same reasons.
But here we probably don't want to remove it until the next major release, because that would otherwise be breaking.

fn from_sql(bytes: PgValue<'_>) -> deserialize::Result<Self> {
Expand Down
4 changes: 2 additions & 2 deletions diesel/src/sql_types/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -331,14 +331,14 @@ pub struct Time;
/// - [`std::time::SystemTime`][SystemTime] (PG only)
/// - [`chrono::NaiveDateTime`][NaiveDateTime] with `feature = "chrono"`
/// - [`time::PrimitiveDateTime`] with `feature = "time"`
/// - [`time::OffsetDateTime`] with `feature = "time"` (MySQL only)
/// - [`time::OffsetDateTime`] with `feature = "time"`
///
/// ### [`FromSql`](crate::deserialize::FromSql) impls
///
/// - [`std::time::SystemTime`][SystemTime] (PG only)
/// - [`chrono::NaiveDateTime`][NaiveDateTime] with `feature = "chrono"`
/// - [`time::PrimitiveDateTime`] with `feature = "time"`
/// - [`time::OffsetDateTime`] with `feature = "time"` (MySQL only)
/// - [`time::OffsetDateTime`] with `feature = "time"`
///
/// [SystemTime]: std::time::SystemTime
#[cfg_attr(
Expand Down
Loading