Describe the bug
Date32Type::parse returns None for date strings whose year falls outside chrono::NaiveDate's range (Jan 1, 262145 BCE to Dec 31, 262143 CE).
Ideally, Date32 should be able to represent a much larger range (i32 days from the epoch; years ≈ ±5,881,580).
The extended-year branch in parse_date calls chrono::NaiveDate::from_ymd_opt, which limits the supported range (arrow-rs/arrow-cast/src/parse.rs, line 612 in 7abb225):
return NaiveDate::from_ymd_opt(year, month, day);
In the end, all we need is an i32, so we can avoid the NaiveDate detour.
To Reproduce
Try to parse a date outside NaiveDate's supported range:
use arrow_array::types::Date32Type;
use arrow_cast::parse::Parser;

fn main() {
    // Works: year is within chrono::NaiveDate's supported range.
    assert_eq!(Date32Type::parse("+29349-01-26"), Some(10_000_000));
    // Fails today: returns None. This should be accepted because the resulting
    // day offset is still representable by Date32.
    assert_eq!(Date32Type::parse("+2739877-01-03"), Some(1_000_000_000));
}
Expected behavior
Every date string whose day offset fits in a Date32 should parse successfully.
Additional context
arrow-array = "58.2.0"
arrow-cast = "58.2.0"
An easy approach would be:
The Gregorian calendar repeats exactly every 400 years (146,097 days), so we can split the year y into an era ( era = y.div_euclid(400) ) and a year within that era ( yoe = y.rem_euclid(400) ). The year within the era is always in 0..=399, which is well inside chrono's supported range, so chrono can still validate the month and day and compute the day offset within the era. The era then contributes a whole number of 146,097-day cycles:
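// Assuming era = y.div_euclid(400) and yoe = y.rem_euclid(400) are both i32.
let nd = NaiveDate::from_ymd_opt(yoe, month, day)?;
let in_era = i64::from(nd.num_days_from_ce() - EPOCH_DAYS_FROM_CE);
// Widen to i64 before multiplying so era * 146_097 cannot overflow i32.
days = i32::try_from(i64::from(era) * 146_097 + in_era).ok();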
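For comparison, here is a minimal std-only sketch of the same era decomposition that skips chrono entirely. It is not the arrow-cast implementation: it swaps in the well-known "days from civil" formula (years shifted to start in March) for the in-era part, and the names days_since_epoch, days_in_month, is_leap and DAYS_PER_ERA are hypothetical helpers introduced only for illustration.

```rust
/// Days in one full 400-year Gregorian cycle.
const DAYS_PER_ERA: i64 = 146_097;

/// Proleptic Gregorian leap-year rule (also holds for year 0 and negative years).
fn is_leap(year: i64) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in `month` of `year`, or None for an invalid month.
fn days_in_month(year: i64, month: u32) -> Option<u32> {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => Some(31),
        4 | 6 | 9 | 11 => Some(30),
        2 => Some(if is_leap(year) { 29 } else { 28 }),
        _ => None,
    }
}

/// Hypothetical helper: days since 1970-01-01 for a proleptic Gregorian date.
/// Returns None if the date is invalid or the offset does not fit in an i32.
fn days_since_epoch(year: i64, month: u32, day: u32) -> Option<i32> {
    if day == 0 || day > days_in_month(year, month)? {
        return None;
    }
    // Shift the year so it starts in March; a leap day then falls at the
    // very end of the shifted year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // always 0..=399
    let mp = i64::from((month + 9) % 12); // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + i64::from(day) - 1; // day of shifted year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era, 0..=146_096
    // 719_468 is the number of days from 0000-03-01 to 1970-01-01.
    let days = era.checked_mul(DAYS_PER_ERA)?.checked_add(doe - 719_468)?;
    i32::try_from(days).ok()
}
```

Under these assumptions, days_since_epoch(29349, 1, 26) and days_since_epoch(2739877, 1, 3) give the two offsets used in the reproduction above, and years far beyond the i32 day range yield None instead of overflowing.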
Another option would be to change the parse_date signature itself, or is there a better alternative?