HIVE-25268: Restore the functionality of date_format UDF #2409

Merged
merged 7 commits into apache:master on Jun 28, 2021

@guptanikhil007 guptanikhil007 commented Jun 19, 2021

What changes were proposed in this pull request?

Use the newer java.time APIs instead of the older date and time APIs to compute the result, restoring the previous functionality.

Why are the changes needed?

The date_format UDF gives:

  1. incorrect time values for dates prior to 1901-01-01 00:00:00 if the time zone is not UTC
  2. incorrect date and time values for dates prior to October 15, 1582 (the default Gregorian calendar cutover)

These changes fix both issues.
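
For illustration only, the second issue can be reproduced outside Hive with a minimal sketch. It is not the patch itself: the class name `DateFormatComparison` and the helpers `formatLegacy` and `formatModern` are hypothetical, and it assumes the older code path formatted epoch milliseconds through `SimpleDateFormat`, which is backed by a hybrid Julian/Gregorian calendar, while `java.time` uses the proleptic Gregorian (ISO) calendar. It does not attempt to reproduce the pre-1901 time zone issue.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, not part of Hive.
class DateFormatComparison {
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Legacy path: SimpleDateFormat uses GregorianCalendar, which switches
    // to the Julian calendar for instants before October 15, 1582.
    static String formatLegacy(long epochMillis, String zoneId) {
        SimpleDateFormat sdf = new SimpleDateFormat(PATTERN, Locale.ROOT);
        sdf.setTimeZone(TimeZone.getTimeZone(zoneId));
        return sdf.format(new Date(epochMillis));
    }

    // java.time path: ISO chronology, i.e. proleptic Gregorian for all dates.
    static String formatModern(long epochMillis, String zoneId) {
        return DateTimeFormatter.ofPattern(PATTERN, Locale.ROOT)
                .format(Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zoneId)));
    }

    public static void main(String[] args) {
        // 1400-01-14 01:01:10 UTC in the proleptic Gregorian calendar.
        long millis = LocalDateTime.of(1400, 1, 14, 1, 1, 10)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();

        // The legacy formatter prints a Julian calendar date, several days off.
        System.out.println("legacy: " + formatLegacy(millis, "UTC"));
        // The java.time formatter prints 1400-01-14 01:01:10.
        System.out.println("modern: " + formatModern(millis, "UTC"));
    }
}
```

For instants after the 1582 cutover the two calendars coincide, so in UTC both helpers return the same string and only older dates expose the difference.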

Does this PR introduce any user-facing change?

The functionality is restored for the date_format UDF

How was this patch tested?

  1. Unit tests
  2. Local tests with MiniHS2 functionality

@guptanikhil007 guptanikhil007 changed the title from "[WIP] HIVE-25268: Restore the functionality of date_format UDF" to "HIVE-25268: Restore the functionality of date_format UDF" Jun 20, 2021

guptanikhil007 commented Jun 20, 2021

@zabetak @jcamachor @sankarh Please review the change in the APIs

select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

set hive.local.time.zone=Africa/Johannesburg;
select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
Contributor

Do we have tests for other formats (to ensure DateTimeFormat doesn't break anything)? The wiki doc also needs to be updated.

Contributor Author

All the existing tests written for the SimpleDateFormat formatter pass, except for the milliseconds change that I mentioned in my comment.

Contributor Author

Once this patch is merged, I will update the Hive wiki as well.

Contributor

+1 to @sankarh's comments. The existing test cases are fine. Could you please add some more time-zone-based test cases?

Contributor Author

done

@guptanikhil007
Contributor Author

Rebased on latest master

@sankarh sankarh merged commit f7a21ab into apache:master Jun 28, 2021
@guptanikhil007 guptanikhil007 deleted the update_date_format branch June 28, 2021 04:30
dengzhhu653 pushed a commit to dengzhhu653/hive that referenced this pull request Dec 15, 2022
…1900 if the local timezone is other than UTC (Nikhil Gupta, reviewed by Ashish Sharma, Stamatis Zampetakis, Sankar Hariappan)

Signed-off-by: Sankar Hariappan <sankarh@apache.org>
Closes (apache#2409)