This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Improved dictionary invariants #1137

Merged
merged 1 commit into from
Jul 5, 2022

jorgecarleitao
Owner

This PR changes the internal invariant of DictionaryArray so that it only contains keys that:

  • can be cast to usize
  • are smaller than the length of the values

This allows removing bounds checks when iterating over the values via the keys.
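
As a rough illustration of why validating keys at construction makes unchecked lookups sound, here is a minimal sketch. It does not use arrow2: `Dict`, its fields, and its methods are hypothetical stand-ins, and the real implementation may differ in every detail.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a dictionary-encoded array (not arrow2's type).
pub struct Dict {
    // Fields are private so the invariant cannot be broken after construction.
    keys: Vec<i32>,
    values: Vec<String>,
}

impl Dict {
    /// Checks both invariants once, up front.
    pub fn try_new(keys: Vec<i32>, values: Vec<String>) -> Result<Self, String> {
        for &key in &keys {
            // Invariant 1: the key must be convertible to `usize`.
            let index = usize::try_from(key)
                .map_err(|_| format!("key {} cannot be converted to usize", key))?;
            // Invariant 2: the key must be smaller than the values' length.
            if index >= values.len() {
                return Err(format!("key {} is out of bounds (len {})", key, values.len()));
            }
        }
        Ok(Self { keys, values })
    }

    /// Iterates over the values via the keys without bounds checks.
    pub fn iter(&self) -> impl Iterator<Item = &str> + '_ {
        let values = &self.values;
        self.keys.iter().map(move |&key| {
            // SAFETY: `try_new` verified that every key is non-negative and
            // smaller than `values.len()`, and neither field is mutated afterwards.
            unsafe { values.get_unchecked(key as usize).as_str() }
        })
    }
}
```

In this sketch the cost of validation is paid once in `try_new`, so every later pass over the keys can skip the per-element check.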

Backward incompatible changes

  • DictionaryArray::from_data was replaced by try_new, try_new_unchecked and try_from_keys
  • DictionaryKey now only implements NativeType + TryFrom<usize> + TryInto<usize>
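
To make the new key bounds concrete, the sketch below defines a hypothetical `Key` trait with the same conversion bounds. `Copy` stands in for arrow2's NativeType, and the helper functions `key_to_index` and `index_to_key` are illustrative names, not part of the library.

```rust
use std::convert::{TryFrom, TryInto};

/// Hypothetical trait mirroring the bounds listed above; `Copy` is an
/// assumption standing in for arrow2's `NativeType`.
pub trait Key: Copy + TryFrom<usize> + TryInto<usize> {}

impl Key for i8 {}
impl Key for i32 {}
impl Key for u64 {}

/// Converts a key to an index; `None` if it does not fit in `usize`
/// (for example, a negative key).
pub fn key_to_index<K: Key>(key: K) -> Option<usize> {
    TryInto::<usize>::try_into(key).ok()
}

/// Converts an index to a key; `None` if the key type is too narrow.
pub fn index_to_key<K: Key>(index: usize) -> Option<K> {
    <K as TryFrom<usize>>::try_from(index).ok()
}
```

Both directions are fallible, which is the point of the bounds: a signed or narrow key type cannot silently wrap when it is used as an index or built from one.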


codecov bot commented Jul 3, 2022

Codecov Report

Merging #1137 (a461a93) into main (b3583b6) will increase coverage by 0.03%.
The diff coverage is 78.15%.

@@            Coverage Diff             @@
##             main    #1137      +/-   ##
==========================================
+ Coverage   83.49%   83.52%   +0.03%     
==========================================
  Files         366      366              
  Lines       35635    35799     +164     
==========================================
+ Hits        29752    29902     +150     
- Misses       5883     5897      +14     
Impacted Files                                 Coverage Δ
src/array/dictionary/iterator.rs               100.00% <ø> (+41.66%) ⬆️
src/array/equal/dictionary.rs                  100.00% <ø> (ø)
src/array/equal/mod.rs                         80.00% <0.00%> (-3.59%) ⬇️
src/compute/arithmetics/mod.rs                 74.10% <ø> (ø)
src/io/parquet/read/statistics/dictionary.rs   48.57% <ø> (ø)
src/compute/cast/dictionary_to.rs              22.41% <7.14%> (-4.40%) ⬇️
src/io/avro/read/nested.rs                     64.53% <50.00%> (-0.60%) ⬇️
src/io/parquet/read/deserialize/dictionary.rs  79.06% <65.62%> (-2.69%) ⬇️
src/array/growable/dictionary.rs               77.31% <72.22%> (-0.21%) ⬇️
src/io/json/read/deserialize.rs                72.55% <83.33%> (-0.07%) ⬇️
... and 28 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

jorgecarleitao marked this pull request as ready for review July 3, 2022 20:24
jorgecarleitao merged commit 78a2a63 into main Jul 5, 2022
jorgecarleitao deleted the dict branch July 5, 2022 15:20