-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HIVE-27847: Casting NUMERIC types to TIMESTAMP is prohibited #5278
Conversation
<property> | ||
<name>hive.strict.timestamp.conversion</name> | ||
<value>false</value> | ||
</property> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test HS2 clusters inherit this hive-site.xml, then this param is stored in where new HiveConf()
picks up.
In order to make sure that test cases use the param not in hive-site but in session params, I'd like to remove it.
@@ -1,4 +1,6 @@ | |||
--! qt:dataset:src | |||
set hive.strict.timestamp.conversion=false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of the removal of the param from data/conf/hive-site.xml
.
@@ -84,6 +84,7 @@ public TestVectorFilterCompare() { | |||
// Arithmetic operations rely on getting conf from SessionState, need to initialize here. | |||
SessionState ss = new SessionState(new HiveConf()); | |||
ss.getConf().setVar(HiveConf.ConfVars.HIVE_COMPAT, "latest"); | |||
ss.getConf().setBoolVar(HiveConf.ConfVars.HIVE_STRICT_TIMESTAMP_CONVERSION, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of the removal of the param from data/conf/hive-site.xml
.
if (ss != null) { | ||
final boolean strict = ss.getConf().getBoolVar(ConfVars.HIVE_STRICT_TIMESTAMP_CONVERSION); | ||
final PrimitiveGrouping grouping = PrimitiveObjectInspectorUtils.getPrimitiveGrouping(tsInputTypes[0]); | ||
if (strict && grouping == PrimitiveGrouping.NUMERIC_GROUP) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect this validation is required only at compile-time.
@@ -4,6 +4,8 @@ create table test_num_ts_input(begin string, ts string); | |||
|
|||
insert into test_num_ts_input values('1653209895687','2022-05-22T15:58:15.931+07:00'),('1653209938316','2022-05-22T15:58:58.490+07:00'),('1653209962021','2022-05-22T15:59:22.191+07:00'),('1653210021993','2022-05-22T16:00:22.174+07:00'); | |||
|
|||
set hive.vectorized.execution.enabled=false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem was not reproduced without this session param.
-- SHA512 is used | ||
MASK_HASH(text_timestamp), | ||
-- Java's Pattern doesn't support it, then it fails with hive.use.googleregex.engine=false | ||
text_timestamp RLIKE '\\p{Katakana}+' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following files would touch SessionState or HiveConf.
git grep -l -e 'org.apache.hadoop.hive.ql.session.SessionState' -e 'org.apache.hadoop.hive.conf.HiveConf' | grep 'ql/src/java/org/apache/hadoop/hive/ql/udf/generic'
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAssertTrueOOM.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseArithmetic.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseCompare.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentAuthorizer.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentCatalog.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDatabase.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentGroups.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentSchema.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentUser.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUnixTime.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLoggedInUser.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMaskHash.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRegExp.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRestrictInformationSchema.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTimestamp.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUnixTimeStamp.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSQLSchema.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/InstantFormatter.java
I didn't add tests for the following UDFs because they are marked as "runtime constant" and the issue will unlikely happen. As another point, it is hard to configure some of them in qtest.
CURRENT_AUTHORIZER(),
CURRENT_CATALOG(),
CURRENT_DATABASE(),
CURRENT_DATE(),
CURRENT_GROUPS(),
CURRENT_SCHEMA(),
CURRENT_TIMESTAMP(),
CURRENT_USER(),
LOGGED_IN_USER(),
RESTRICT_INFORMATION_SCHEMA()
This seems to be unrelated and temporal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a minor comment to clarify, the change looks good to me! Thank you @okumin for check and PR!
-- On HiveServer2 | ||
SELECT | ||
-- DECIMAL because of hive.compat=latest | ||
TYPEOF(cint1 / cint2), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this TYPEOF(cint1 / cint2)
happens only at compile stage, should we change it to cint1 / cint2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right! I wonder what is the most reasonable way to identify if it is DOUBLE or DECIMAL...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you say, I found we can simply remove TYPEOF since the text expression or precision of DOUBLE and DECIMAL is different. This is the diff between hive.compat=latest
vs default.
< 3333.3333333333335 false 1970-01-01 02:46:40 1970-01-01 01:02:03 4 1970-01-01 09:46:40 4 -21477 -21477 ca764b57c635a893c91b0edaea84bca142e065990e1a66b9c60cf855777af51070444210d30e359fa81e5a77c68a073224e0c7343d957f556816618a10baa37c false
< 4000.0 false 1970-01-01 05:33:20 1970-01-02 04:05:06 5 1970-01-01 12:33:20 4 75906 75906 7882085a4ed36e7c675fcd12083eefa9a208f1fdcfba10b3a986bb5b1d43e23da3ba87b3933feebfed80ba9e05e4e73f72231cb8022453e562e034e8b5c603b9 false
< 4285.714285714285 false 1970-01-01 08:20:00 1970-01-03 07:08:09 6 1970-01-01 15:20:00 4 173289 173289 d9899e12fce35a4f65cd63aee1e953dacd9e91229d52dd17ddad8e44c17d2d081c0a5519a96637663bc6baa2d7c96d6f1a9d6944ac7668c7926ebc10c9f3b81b false
---
> 3333.33333333333 false 1970-01-01 02:46:40 1970-01-01 01:02:03 4 1970-01-01 09:46:40 4 -21477 -21477 ca764b57c635a893c91b0edaea84bca142e065990e1a66b9c60cf855777af51070444210d30e359fa81e5a77c68a073224e0c7343d957f556816618a10baa37c false
> 4000.00000000000 false 1970-01-01 05:33:20 1970-01-02 04:05:06 5 1970-01-01 12:33:20 4 75906 75906 7882085a4ed36e7c675fcd12083eefa9a208f1fdcfba10b3a986bb5b1d43e23da3ba87b3933feebfed80ba9e05e4e73f72231cb8022453e562e034e8b5c603b9 false
> 4285.71428571429 false 1970-01-01 08:20:00 1970-01-03 07:08:09 6 1970-01-01 15:20:00 4 173289 173289 d9899e12fce35a4f65cd63aee1e953dacd9e91229d52dd17ddad8e44c17d2d081c0a5519a96637663bc6baa2d7c96d6f1a9d6944ac7668c7926ebc10c9f3b81b false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed the change and am waiting for CI.
db8edc8
|
…5278) (Shohei Okumiya, reviewed by Zhihua Deng)
What changes were proposed in this pull request?
Configure GenericUDFTimestamp.java correctly.
https://issues.apache.org/jira/browse/HIVE-27847
Why are the changes needed?
hive.strict.timestamp.conversion
andhive.int.timestamp.conversion.in.seconds
are not propagated to runtime.Does this PR introduce any user-facing change?
Yes. This is a bug fix.
Is the change a dependency upgrade?
No.
How was this patch tested?
Integration tests are added.