Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump Coral to 1.0.120 #9530

Merged
merged 1 commit into from Oct 12, 2021
Merged

Conversation

losipiuk
Copy link
Member

@losipiuk losipiuk commented Oct 6, 2021

It also adds test for CAST(timestamp AS DECIMAL()); which apparently is
not working great.

@cla-bot cla-bot bot added the cla-signed label Oct 6, 2021
@losipiuk losipiuk requested a review from findepi October 6, 2021 15:29
@losipiuk losipiuk force-pushed the lo/hive-views-cast-timestamp branch 2 times, most recently from 9fb64f8 to 4e22f76 Compare October 6, 2021 16:30
onHive().executeQuery("DROP VIEW IF EXISTS cast_timestamp_as_decimal_view");
onHive().executeQuery("CREATE VIEW cast_timestamp_as_decimal_view AS SELECT CAST(a_timestamp as DECIMAL(10,0)) a_cast_timestamp FROM cast_timestamp_as_decimal");

// this currently misbehave; Trino returns 631307594 while Hive returns 631261694
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a problem

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

@ljfgem ljfgem Oct 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, but I tried on our hive and trino, both are 631282394.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

631282394 is yet a third option it seems
@ljfgem for the record, what is "out hive and trino" versions?
and, do you run the test using bin/ptl test run or?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment in code was coming from execution on HIve 1.2.
If I run on HDP3 result is different: 631282394

Hive 3.1.0 (HDP3):
ON Hive: 631282394
ON Trino: 631307594

Hive 1.2:
ON Hive: 631261694
ON Trino: 631307594

It looks it boils down to the fact that value returned to by to_unixtime in Trino depends on client session timezone.
If I do SET TIME ZONE 'UTC' then I get 631282394 in Trino (same as Hive 3.1).

To get 631261694 (as Hive 1.2) I need to do SET TIME ZONE 'Asia/Kathmandu' in Trino (Asia/Kathmandu is timezone used by Hive JVM in product tests).

We cannot do anything about incompatibility in behaviour of Hive 1.2 and 3.1 I guess. But I think we should not use to_unixtime in rewrites, as the returned values must not depend on session timezone.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think we should not use to_unixtime in rewrites, as the returned values must not depend on session timezone

Or rather we need to wrap argument in with_timezone"(..., 'UTC') before passing it to to_unixtime, so it is session timezone agnostic. I do that in linkedin/coral#109.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining! Looks like it is because our hive/trino are using UTC timezone.

Or rather we need to wrap argument in with_timezone"(..., 'UTC') before passing it to to_unixtime, so it is session timezone agnostic.

But since it doesn't specify the timezone that Hive uses, I think such a wrapper couldn't help, or we need to assume Hive is using UTC timezone?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining! Looks like it is because our hive/trino are using UTC timezone.

Or rather we need to wrap argument in with_timezone"(..., 'UTC') before passing it to to_unixtime, so it is session timezone agnostic.

But since it doesn't specify the timezone that Hive uses, I think such a wrapper couldn't help, or we need to assume Hive is using UTC timezone?

It will help for Hive 3.1 as the semantics of casting timestamp to decimal there is not timezone dependent.
I think we need to live with the fact that 1.2 will behave incorrectly. We have same situation with from_utc_timestamp

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, opened a PR: linkedin/coral#169
Feel free to review it, thanks!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @losipiuk thanks for the explanation - looks like I fell prey to having both my Hive/Trino setup in UTC and did not catch that.

It also adds test for CAST(timestamp AS DECIMAL()); which apparently is
not working great.
@losipiuk
Copy link
Member Author

CI: #8691

@losipiuk losipiuk merged commit c994b15 into trinodb:master Oct 12, 2021
@github-actions github-actions bot added this to the 364 milestone Oct 12, 2021
@losipiuk losipiuk mentioned this pull request Oct 26, 2021
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants