This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Added support for Extension (logical) type #359

Merged: jorgecarleitao merged 1 commit into main from extension2 on Aug 31, 2021


jorgecarleitao (Owner) commented Aug 30, 2021

Adds support for Arrow's Extension "DataType", enabling users to define and use custom data types from Arrow's ecosystem.

Note that, from now on, an array's data_type alone is not sufficient to downcast it. I have updated the user guide with the new rules. The easiest way to think about them:

  • enum PhysicalType has a one-to-one relationship to concrete arrays (e.g. PhysicalType::Utf8 <-> Utf8Array<i32>)
  • enum DataType has a many-to-one relationship with a PhysicalType, which can be obtained via .to_physical_type() -> PhysicalType.

A corollary of this change is that, to downcast arrays correctly, users should now match on array.data_type().to_physical_type(). This only matters for the Extension type, which boxes the DataType that indicates its own logical type.
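To make the downcasting rule concrete, here is a minimal, self-contained sketch. It does not use the crate itself: the `PhysicalType`, `DataType`, `Array`, `Int32Array` and `Utf8Array` definitions below are heavily simplified stand-ins written for illustration, the `Extension(name, inner, metadata)` layout is an assumption, and `describe` is a hypothetical helper. It only shows the idea that an extension type resolves to the physical type of the `DataType` it boxes, and that the physical type is what selects the concrete array.

```rust
use std::any::Any;

/// Simplified stand-in: one variant per concrete array type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PhysicalType {
    Int32,
    Utf8,
}

/// Simplified stand-in for the logical types.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Date32,
    Utf8,
    /// Assumed layout: (name, boxed inner data type, optional metadata).
    Extension(String, Box<DataType>, Option<String>),
}

impl DataType {
    /// Many logical types map to one physical type.
    pub fn to_physical_type(&self) -> PhysicalType {
        match self {
            DataType::Int32 | DataType::Date32 => PhysicalType::Int32,
            DataType::Utf8 => PhysicalType::Utf8,
            // An extension type has the physical type of the type it boxes.
            DataType::Extension(_, inner, _) => inner.to_physical_type(),
        }
    }
}

/// Toy array trait, just enough to demonstrate downcasting.
pub trait Array {
    fn as_any(&self) -> &dyn Any;
    fn data_type(&self) -> &DataType;
}

pub struct Int32Array {
    pub data_type: DataType,
    pub values: Vec<i32>,
}

pub struct Utf8Array {
    pub data_type: DataType,
    pub values: Vec<String>,
}

impl Array for Int32Array {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn data_type(&self) -> &DataType {
        &self.data_type
    }
}

impl Array for Utf8Array {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn data_type(&self) -> &DataType {
        &self.data_type
    }
}

/// Hypothetical helper: picks the concrete array by physical type,
/// not by matching on the logical `DataType` directly.
pub fn describe(array: &dyn Array) -> String {
    match array.data_type().to_physical_type() {
        PhysicalType::Int32 => {
            let a = array.as_any().downcast_ref::<Int32Array>().unwrap();
            format!("Int32Array with {} values", a.values.len())
        }
        PhysicalType::Utf8 => {
            let a = array.as_any().downcast_ref::<Utf8Array>().unwrap();
            format!("Utf8Array with {} values", a.values.len())
        }
    }
}
```

In this sketch, a `Utf8Array` whose `data_type` is `DataType::Extension(.., Box::new(DataType::Utf8), ..)` reaches the `PhysicalType::Utf8` branch of `describe`. A match on the logical `DataType::Utf8` alone would have missed it, which is the situation the new rule is meant to avoid.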

Thanks to @sundy-li for the initial implementation at #338 and follow-up discussions at #350.

Closes #361

@jorgecarleitao jorgecarleitao added the feature A new feature label Aug 30, 2021
codecov bot commented Aug 30, 2021

Codecov Report

Merging #359 (e1bd078) into main (145b7c8) will increase coverage by 0.07%.
The diff coverage is 79.49%.

@@            Coverage Diff             @@
##             main     #359      +/-   ##
==========================================
+ Coverage   81.23%   81.30%   +0.07%     
==========================================
  Files         326      326              
  Lines       20969    21000      +31     
==========================================
+ Hits        17035    17075      +40     
+ Misses       3934     3925       -9     
Impacted Files Coverage Δ
src/array/display.rs 57.89% <ø> (ø)
src/array/fixed_size_list/mod.rs 54.90% <0.00%> (+0.90%) ⬆️
src/array/list/mod.rs 80.00% <0.00%> (-0.89%) ⬇️
src/datatypes/field.rs 19.32% <0.00%> (+0.68%) ⬆️
src/ffi/schema.rs 57.91% <0.00%> (-0.27%) ⬇️
src/array/dictionary/mod.rs 69.76% <50.00%> (+1.58%) ⬆️
src/scalar/mod.rs 61.90% <50.00%> (-4.77%) ⬇️
src/io/json_integration/schema.rs 44.85% <55.73%> (+0.83%) ⬆️
src/array/union/mod.rs 82.65% <69.23%> (+2.24%) ⬆️
src/array/struct_.rs 41.33% <80.00%> (ø)
... and 17 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jorgecarleitao jorgecarleitao force-pushed the extension2 branch 4 times, most recently from 1eef2d0 to a4f7ff3 Compare August 31, 2021 11:11
@jorgecarleitao jorgecarleitao changed the title Added support for Extension logical type Added support for Extension (logical) type Aug 31, 2021
@jorgecarleitao jorgecarleitao merged commit d2df4a5 into main Aug 31, 2021
@jorgecarleitao jorgecarleitao deleted the extension2 branch August 31, 2021 18:30
Development

Successfully merging this pull request may close these issues.

Added Extension to DataType
2 participants