Improve error message for timestamp queries outside supported range #5730

Merged (4 commits) on May 16, 2024
Changes from 3 commits
29 changes: 29 additions & 0 deletions arrow-cast/src/cast/mod.rs
@@ -8057,6 +8057,35 @@ mod tests {
test_cast_string_to_decimal256_overflow(overflow_array);
}

#[test]
fn test_cast_outside_supported_range_for_nanoseconds() {
const EXPECTED_ERROR_MESSAGE: &str = "The dates that can be represented as nanoseconds have to be between 1677-09-21T00:12:44.0 and 2262-04-11T23:47:16.854775804";

let array = StringArray::from(vec![Some("1650-01-01 01:01:01.000001")]);

let cast_options = CastOptions {
safe: false,
format_options: FormatOptions::default(),
};

let result = cast_string_to_timestamp::<i32, TimestampNanosecondType>(
&array,
&None::<Arc<str>>,
&cast_options,
);

assert!(result.is_err());
let err = result.unwrap_err();
assert_eq!(
err.to_string(),
format!(
"Cast error: Overflow converting {} to Nanosecond. {}",
array.value(0),
EXPECTED_ERROR_MESSAGE
)
);
}

#[test]
fn test_cast_date32_to_timestamp() {
let a = Date32Array::from(vec![Some(18628), Some(18993), None]); // 2021-1-1, 2022-1-1
7 changes: 5 additions & 2 deletions arrow-cast/src/cast/string.rs
@@ -112,8 +112,11 @@ fn cast_string_to_timestamp_impl<O: OffsetSizeTrait, T: ArrowTimestampType, Tz:
.map(|v| {
v.map(|v| {
let naive = string_to_datetime(tz, v)?.naive_utc();
- T::make_value(naive).ok_or_else(|| {
- ArrowError::CastError(format!(
+ T::make_value(naive).ok_or_else(|| match T::UNIT {
+ TimeUnit::Nanosecond => ArrowError::CastError(format!(
Contributor

I wonder if you would be willing to improve the messages for the other types (like Microsecond, etc)?

Contributor Author

Yes sure, I can work on it. Thanks for the feedback

Contributor Author (@Abdi-29, May 14, 2024)

@alamb I did some research about the range of the other types but I couldn't find anything. Do you have any tips for me on where I can search for them?

Contributor

I think you could create an array with T::Min of the underlying type

something like (untested):

let min_date = DateTimeArray::from(vec![i32::MIN, i32::MAX]);
pretty_print(min_date)

Contributor Author

thanks, I'll try it now

Contributor

I tried

    let a = Arc::new(TimestampMicrosecondArray::from(vec![Some(i64::MIN), Some(i64::MAX)]));
    println!("{}", pretty_format_columns("foo", &[a]).unwrap());

And I got

+--------------------------------------------------------------------------------------------------------+
| foo                                                                                                    |
+--------------------------------------------------------------------------------------------------------+
| ERROR: Cast error: Failed to convert -9223372036854775808 to datetime for Timestamp(Microsecond, None) |
| ERROR: Cast error: Failed to convert 9223372036854775807 to datetime for Timestamp(Microsecond, None)  |
+--------------------------------------------------------------------------------------------------------+

So if you can figure out the min/max values allowed for that type, I think that code would show you the string values.

However, to be honest, what we have in this PR is better than main; I think we could merge it as is.
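
As a rough way to explore the ranges discussed above, the sketch below converts the `i64` extremes to calendar date-times for each unit using only the Rust standard library. It does not use arrow or chrono; `civil_from_days` and `epoch_ticks_to_string` are hypothetical helper names introduced here for illustration, and the calendar arithmetic is the common days-to-civil-date algorithm for the proleptic Gregorian calendar. Note that it prints the raw `i64` limits for nanoseconds, which differ slightly from the bounds quoted in this PR's error message. For the coarser units the extremes land hundreds of thousands of years (or more) from the epoch, which is presumably why the pretty-printer above reports a conversion error for them.

```rust
/// Days since 1970-01-01 -> (year, month, day), proleptic Gregorian calendar.
/// Hypothetical helper, not part of arrow or chrono.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Render `value` ticks since the Unix epoch, where one second is
/// `ticks_per_second` ticks (1, 1_000, 1_000_000 or 1_000_000_000).
fn epoch_ticks_to_string(value: i64, ticks_per_second: i64) -> String {
    let secs = value.div_euclid(ticks_per_second);
    // Sub-second part, scaled to nanoseconds so every unit prints alike.
    let frac_ns = value.rem_euclid(ticks_per_second) * (1_000_000_000 / ticks_per_second);
    let days = secs.div_euclid(86_400);
    let sod = secs.rem_euclid(86_400); // second of day
    let (y, m, d) = civil_from_days(days);
    let (hh, mm, ss) = (sod / 3_600, sod % 3_600 / 60, sod % 60);
    format!("{y:04}-{m:02}-{d:02}T{hh:02}:{mm:02}:{ss:02}.{frac_ns:09}")
}

fn main() {
    let units = [
        ("Second", 1_i64),
        ("Millisecond", 1_000),
        ("Microsecond", 1_000_000),
        ("Nanosecond", 1_000_000_000),
    ];
    for (name, ticks_per_second) in units {
        println!(
            "{name}: {} .. {}",
            epoch_ticks_to_string(i64::MIN, ticks_per_second),
            epoch_ticks_to_string(i64::MAX, ticks_per_second)
        );
    }
}
```

For nanoseconds this yields 1677-09-21T00:12:43.145224192 and 2262-04-11T23:47:16.854775807. Whether arrow accepts the full `i64` range for a given unit depends on the underlying datetime conversion, so the printed values are upper limits on what an error message could quote, not confirmed supported bounds.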

Contributor Author

Ah I see, thanks. I'll open another issue to fix this

+ "Overflow converting {naive} to Nanosecond. The dates that can be represented as nanoseconds have to be between 1677-09-21T00:12:44.0 and 2262-04-11T23:47:16.854775804"
+ )),
+ _ => ArrowError::CastError(format!(
"Overflow converting {naive} to {:?}",
T::UNIT
))
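
The shape of this change can be sketched in isolation, assuming stand-in types: the `TimeUnit` enum and `CastError` struct below are simplified placeholders defined locally, not arrow's real `TimeUnit` or `ArrowError`, and `overflow_error` is a hypothetical function name. The "Cast error: " prefix is taken from the expected string in the test added by this PR.

```rust
use std::fmt;

/// Local stand-in for arrow's time unit enum (assumption, simplified).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum TimeUnit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Local stand-in for the cast error variant (assumption, simplified).
#[derive(Debug)]
struct CastError(String);

impl fmt::Display for CastError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Cast error: {}", self.0)
    }
}

/// Build the overflow error: nanoseconds get the extra range hint,
/// every other unit keeps the generic message.
fn overflow_error(naive: &str, unit: TimeUnit) -> CastError {
    match unit {
        TimeUnit::Nanosecond => CastError(format!(
            "Overflow converting {naive} to Nanosecond. The dates that can be represented as nanoseconds have to be between 1677-09-21T00:12:44.0 and 2262-04-11T23:47:16.854775804"
        )),
        _ => CastError(format!("Overflow converting {naive} to {unit:?}")),
    }
}

fn main() {
    println!("{}", overflow_error("1650-01-01 01:01:01.000001", TimeUnit::Nanosecond));
    println!("{}", overflow_error("1650-01-01 01:01:01.000001", TimeUnit::Microsecond));
}
```

Keeping the wildcard arm means units without a known range still produce the original message, which is why extending the hint to other units could be deferred to a follow-up issue.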