-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Python] Add support for +/- infinity datetime
objects
#8294
Conversation
@@ -405,12 +448,31 @@ py::object PythonObject::FromValue(const Value &val, const LogicalType &type) { | |||
case LogicalTypeId::TIMESTAMP_TZ: { | |||
D_ASSERT(type.InternalType() == PhysicalType::INT64); | |||
auto timestamp = val.GetValueUnsafe<timestamp_t>(); | |||
|
|||
InfinityType infinity = InfinityType::NONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm slightly concerned with the existing code that we can overflow Python's more limited date range. Is that handled elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we do anything to prevent that no
def test_getattr_spaces(self): | ||
rel = duckdb.sql('select 42 as "hello world"') | ||
assert rel['hello world'].fetchall()[0][0] == 42 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an issue, but If you use .editorconfig, it will add trailing line feeds to Python files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably run a formatter/linter on Python code in the repo at this point, @Mause any thoughts on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Elliana suggested we should use black to format the python code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually have a pr with that I was looking to submit once the big python PR's had been merged (because otherwise we'll end up a bunch of conflicts)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! LGTM. Perhaps @Mause wants to have a look as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a couple of minor comments
def test_getattr_spaces(self): | ||
rel = duckdb.sql('select 42 as "hello world"') | ||
assert rel['hello world'].fetchall()[0][0] == 42 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually have a pr with that I was looking to submit once the big python PR's had been merged (because otherwise we'll end up a bunch of conflicts)
@@ -434,6 +496,12 @@ py::object PythonObject::FromValue(const Value &val, const LogicalType &type) { | |||
|
|||
auto date = val.GetValueUnsafe<date_t>(); | |||
int32_t year, month, day; | |||
if (!duckdb::Date::IsFinite(date)) { | |||
if (date == date_t::infinity()) { | |||
return py::module::import("datetime").attr("date").attr("max"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not caching the min/max values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think our import cache does anything other than remove a few extra string comparisons.
We haven't measured this, but from working with the code for a while I am confident pybind caches the imports itself
'TIMESTAMP_MS', | ||
'TIMESTAMP_S' | ||
]: | ||
# This query is not supported in core duckdb but doesn't deterministically produce an error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea why it doesn't deterministically error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No clue, I think it has to do with an earlier under/overflow from an unchecked expression.
I noticed INFINITY and -INFINITY are not really implemented for TIMESTAMP_S/MS/NS, as I am having to check for the truncated versions of the timestamp_t::infinity
value.
struct timestamp_ms_t : public timestamp_t { // NOLINT
static constexpr timestamp_t infinity() {
return timestamp_t(int64_t(NumericLimits<int64_t>::Maximum() / 1000) * 1000);
}
static constexpr timestamp_t ninfinity() {
return timestamp_t(int64_t(-NumericLimits<int64_t>::Maximum() / 1000) * 1000);
}
};
So it might be a result of that
}; | ||
struct timestamp_ms_t : public timestamp_t { // NOLINT | ||
static constexpr timestamp_t infinity() { | ||
return timestamp_t(int64_t(NumericLimits<int64_t>::Maximum() / 1000) * 1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's always use timestamp_t::infinity
.
@hawkfish could you perhaps have a look at this, it looks like there is an issue when casting infinity to non-microsecond timestamp types, e.g.:
D select 'infinity'::timestamp_ms as ts;
┌───────────────────────────┐
│ ts │
│ timestamp_ms │
├───────────────────────────┤
│ 294247-01-10 04:00:54.775 │
└───────────────────────────┘
… they're bugged currently
Thanks! |
This PR fixes #8143 and #8214
To fix 8214, we now add double quotes around the identifier when performing the internal lookup
To fix 8143 we now produce
datetime.datetime.max
anddatetime.date.max
values forINFINITY
timestamps and dates respectively.For
-INFINITY
we produce the.min
variant of the datetime object.This also adds support for reading those values as
INFINITY
and-INFINITY