Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: wrong result of range function #8313

Merged
merged 6 commits into from
Nov 27, 2023

Conversation

smallzhongfeng
Copy link
Contributor

Which issue does this PR close?

Closes #8311.

Rationale for this change

When step is a negative number, the decrement calculation is performed from front to back instead of using the step_by method under iterator, because step_by stipulates that the parameter type is usize

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added physical-expr Physical Expressions sqllogictest labels Nov 23, 2023
@Veeupup
Copy link
Contributor

Veeupup commented Nov 23, 2023

thank you for fixing it!

Copy link
Contributor

@Weijun-H Weijun-H left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smallzhongfeng 👍

Comment on lines 749 to 759
let value;
if step < 0 {
while stop < start {
values.push(start);
start += step;
}
} else {
value = (start..stop).step_by(step as usize);
values.extend(value.clone());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let value;
if step < 0 {
while stop < start {
values.push(start);
start += step;
}
} else {
value = (start..stop).step_by(step as usize);
values.extend(value.clone());
}
if step < 0 {
// Decreasing range
values.extend((stop..start).rev().step_by((-step) as usize));
} else {
// Increasing range
values.extend((start..stop).step_by(step as usize));
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emm, this is good advice, but it should be
values.extend((stop+1..start+1).rev().step_by((-step) as usize));
Because the stop of the range is not taken. 🤔️

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and I think we should add a test case for it 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I added ut for this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind putting the test in sqllogoictest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/apache/arrow-datafusion/blob/b648d4e22e82989c65523e62312e1995a1543888/datafusion/sqllogictest/test_files/array.slt#L2748C1-L2752C23
I think there is currently this test, but the results were wrong before, and I have corrected them.

@@ -741,13 +741,22 @@ pub fn gen_range(args: &[ArrayRef]) -> Result<ArrayRef> {
let mut offsets = vec![0];
for (idx, stop) in stop_array.iter().enumerate() {
let stop = stop.unwrap_or(0);
let start = start_array.as_ref().map(|arr| arr.value(idx)).unwrap_or(0);
let mut start = start_array.as_ref().map(|arr| arr.value(idx)).unwrap_or(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut start = start_array.as_ref().map(|arr| arr.value(idx)).unwrap_or(0);
let start = start_array.as_ref().map(|arr| arr.value(idx)).unwrap_or(0);

@@ -2752,7 +2752,7 @@ select range(5),
range(1, -5, 1)
;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
;
range(1, -5, -1)
;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the test to the sqllogistest, overall LGTM 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated~

@@ -2514,6 +2520,67 @@ mod tests {
.is_null(0));
}

#[test]
fn test_array_range() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove test here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jayzhan211 and @Weijun-H I think what is happening is that people follow the model of the existing code -- so as long as there are tests in array_expressions (rather than .slt) that is what people will model.

What do you think about migrating (as part of another PR) all the existing tests to sqllogictests -- this would likely result in all subsequent PRs following the sqllogictests model

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I endorse this proposal as it promises to enhance convenience in directing individuals to incorporate new tests, thereby minimizing potential misunderstandings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb @Weijun-H
There is one problem actually, function like align_array_dimensions we have a unit test for it but it is not possible to port to slt file. We should either not test it directly but rely on array function test that call it or port it to datafusion::common::utils, and test it there. I think the latter is better so we can check the coverage more easily.

btw, should we introduce array_utils in common. I think it will grow larger and larger quickly.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @smallzhongfeng and @Weijun-H and @jayzhan211 for the reviews

@@ -2514,6 +2520,67 @@ mod tests {
.is_null(0));
}

#[test]
fn test_array_range() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jayzhan211 and @Weijun-H I think what is happening is that people follow the model of the existing code -- so as long as there are tests in array_expressions (rather than .slt) that is what people will model.

What do you think about migrating (as part of another PR) all the existing tests to sqllogictests -- this would likely result in all subsequent PRs following the sqllogictests model

@alamb alamb merged commit 9dfb224 into apache:main Nov 27, 2023
22 checks passed
@alamb
Copy link
Contributor

alamb commented Nov 27, 2023

Thank you everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
physical-expr Physical Expressions sqllogictest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The range function results in an error when step is a negative number
5 participants