Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up the offsets checking #1675

Closed
HaoYang670 opened this issue May 8, 2022 · 1 comment · Fixed by #1684
Closed

Speed up the offsets checking #1675

HaoYang670 opened this issue May 8, 2022 · 1 comment · Fixed by #1684
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog

Comments

@HaoYang670
Copy link
Contributor

HaoYang670 commented May 8, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

let start_offset: usize = start_offset
.try_into()
.map_err(|_| {
ArrowError::InvalidArgumentError(format!(
"Offset invariant failure: could not convert start_offset {} to usize in slot {}",
start_offset, i))
})?;
let end_offset: usize = end_offset
.try_into()
.map_err(|_| {
ArrowError::InvalidArgumentError(format!(
"Offset invariant failure: Could not convert end_offset {} to usize in slot {}",
end_offset, i+1))
})?;
if start_offset > offset_limit {
return Err(ArrowError::InvalidArgumentError(format!(
"Offset invariant failure: offset for slot {} out of bounds: {} > {}",
i, start_offset, offset_limit))
);
}
if end_offset > offset_limit {

In each iteration, we fully check start_offset and end_offset, which means that all elements (except the first and the last) in the offsets_buffer are checked twice. This is somewhat wasteful.

Describe alternatives you've considered
Hoist the to_usize() checking and offset_limit checking outside the zip iteration.

Additional context
Add any other context or screenshots about the feature request here.

@HaoYang670 HaoYang670 added the enhancement Any new improvement worthy of a entry in the changelog label May 8, 2022
@HaoYang670
Copy link
Contributor Author

HaoYang670 commented May 8, 2022

Hopefully, there would be some performance improvement.

@HaoYang670 HaoYang670 changed the title Simplify the offsets checking Speed up the offsets checking May 14, 2022
@alamb alamb added the arrow Changes to the arrow crate label May 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants