-
-
Notifications
You must be signed in to change notification settings - Fork 415
Open
Description
- I am on the latest Pendulum version.
- I have searched the issues of this repo and believe that this is not a duplicate.
- OS version and name: AmazonLinux2023
- Pendulum version: 3.0.0
Issue
My team uses Airflow's SQLiteHook to store some data on disk between tasks. We timestamp the column by creating some pandas DataFrame and inserting into the SQLite file with the hook.
data_interval_end = pendulum.now()
df = df.rename(columns=str.lower).astype({"user_id": int}).assign(date_created=data_interval_end)
sqlite_hook.insert_rows(
failure_event_sqlite_table,
list(df[["user_id", "shift_id", "successes", "failures", "date_created"]].itertuples(index=False)),
)
This has (and still does) make rows that look like this:
user_id shift_id successes failures date_created
0 1 11 1.0 0.1 2023-09-25T00:00:00+00:00
In later tasks, we may pull out some data with a query that looks like this:
sqlite_hook.get_pandas_df(
f"select * from {failure_event_sqlite_table} where date_created = ?", parameters=(data_interval_end,)
)
This used to generate a query with a timestamp in the same format, but now it generates a query with a timestamp that looks like this (note the missing T)
where date_created = '2023-07-04 18:00:00+00:00'
This is inconsistent behavior breaks all of these queries - the pendulum timestamps should serialize and deserialize in the same format.
What you think should happen instead?
The previous behavior was to make statements like this:
where date_created = '2023-07-04T18:00:00+00:00'
But it's fine either way, as long as it's consistent.
Metadata
Metadata
Assignees
Labels
No labels