Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add quarter support in temporal #1836

Merged
merged 1 commit into from Jun 11, 2022

Conversation

MazterQyou
Copy link
Contributor

Which issue does this PR close?

Closes #1835.

Rationale for this change

Getting date's quarter is widely supported across different RDBMS and is trivial to implement.
This allows, for instance, extending apache/arrow-datafusion's date_part SQL builtin to support quarter field extraction.

What changes are included in this PR?

This PR adds a quarter extractor to temporal, and a few related tests.

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jun 10, 2022
@codecov-commenter
Copy link

Codecov Report

Merging #1836 (ccc7fb4) into master (bd33489) will increase coverage by 0.00%.
The diff coverage is 87.50%.

❗ Current head ccc7fb4 differs from pull request most recent head d905c47. Consider uploading reports for the commit d905c47 to get more accurate results

@@           Coverage Diff           @@
##           master    #1836   +/-   ##
=======================================
  Coverage   83.48%   83.48%           
=======================================
  Files         201      201           
  Lines       56838    56878   +40     
=======================================
+ Hits        47451    47485   +34     
- Misses       9387     9393    +6     
Impacted Files Coverage Δ
arrow/src/compute/kernels/temporal.rs 95.77% <87.50%> (-1.36%) ⬇️
parquet/src/encodings/encoding.rs 93.46% <0.00%> (-0.20%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bd33489...d905c47. Read the comment docs.

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, only some minor nits

#[test]
fn test_temporal_array_date64_quarter() {
//1514764800000 -> 2018-01-01
//1566275025000 -> 2019-08-20
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth checking a month on the leading edge, e.g. first day of April

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's worth it considering the first test is already at the beginning of a quarter.
If you think it's a good idea to test that specifically, I could either change the first test to April 1st, or add another date to the test.

trait ChronoDateQuarter {
fn quarter(&self) -> u32;

fn quarter0(&self) -> u32;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason to have quarter0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no particular reason to have quarter0 at the moment, but chrono implements those 0 getters so I followd. I thought it might be more future-proof that way, besides the fact one could be based on another.
If you think quarter0 should not be there because it serve no purpose at the moment, I can refactor the trait to have only quarter fn.

arrow/src/compute/kernels/temporal.rs Show resolved Hide resolved
@@ -183,6 +209,34 @@ where
Ok(b.finish())
}

/// Extracts the quarter of a given temporal array as an array of integers
pub fn quarter<T>(array: &PrimitiveArray<T>) -> Result<Int32Array>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW there was a similar PR from @ovr in DataFusion https://github.com/apache/arrow-datafusion/pull/1667/files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. This is regarding date_trunc function though, while date_part in DataFusion, for instance, uses arrow-rs's temporal.

FYI I had an attempt to bring quarter to date_part in DF without adding it to arrow-rs first, but this required a refactor of the macro and made things look weird, while having quarter here is a nice addition and makes the same addition in DF simple and elegant rather than hacky.

@MazterQyou MazterQyou closed this Jun 11, 2022
@MazterQyou MazterQyou deleted the upstream/temporal-quarter branch June 11, 2022 02:27
@MazterQyou MazterQyou restored the upstream/temporal-quarter branch June 11, 2022 02:30
@MazterQyou MazterQyou reopened this Jun 11, 2022
@tustvold tustvold merged commit 36ceb22 into apache:master Jun 11, 2022
@MazterQyou MazterQyou deleted the upstream/temporal-quarter branch June 11, 2022 11:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add quarter support in temporal kernels
4 participants