New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal]) #13377
Conversation
This requires bigdecimal 3.1.0 or later for BigDecimal#scale. Arrow::Array.new([BigDecimal]) detects the max precision and scale from BigDecimals and creates suitable Arrow::Decimal{128,256}DataType automatically. This also truncates given BigDecimal when the specified Arrow::Decimal{128,256}DataType doesn't have enough and scale. This still doesn't check precision. If an user specifies data that have too much precision, the data are used as-is.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kou I left some comments.
precision = [builder_info[:precision] || 0, value.precision].max | ||
scale = [builder_info[:scale] || 0, value.scale].max |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 2 lines don't permit overwriting the value's precision and the scale with smaller values by specifying the precision and the scale in builder_info
. Is it intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you provide an example case?
Arrow::Array.new([BigDecimal("1.1"), BigDecimal("11.11")])
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misreading the code. It's OK now. But, I guess we may need to permit to pass precision and scale to Arrow::Array.new
as optional arguments likeArrow::Array.new([BigDecimal("1.1"), BigDecimal("11.11")], scale: 1)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If users know an expected data type, they should use Arrow::XXXArray.new
instead of Arrow::Array.new
such as Arrow::Decimal128Array.new({scale: ..., precision: ...}, [BigDecimal(...). ...])
.
Arrow::Array.new
may not build a Arrow::Decimal128Array
. In the case, the given optional arguments are just ignored. It may confuse users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
values: array.to_a, | ||
}) | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it unnecessary to test with BigDecimal::NAN
and BigDecimal::INFINITY
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't notice them.
What is the expected behavior with them? It seems that decimal classes in C++ can't represent NAN
and INFINITY
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throwing FloatDomainError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should build an Apache Arrow array with Arrow::ArrayBuilder.build
instead of raising an exception. I'll use Arrow::StringArray
for the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, It makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a validation for BigDecimal::NAN
and BigDecimal::INFINITY
to Arrow::Decimal{128,256}ArrayBuilder
. If users append BigDecimal::NAN
or BigDecimal::INFINITY
, FloatDomainError
is raised.
+1 |
…pache#13377) This requires bigdecimal 3.1.0 or later for BigDecimal#scale. Arrow::Array.new([BigDecimal]) detects the max precision and scale from BigDecimals and creates suitable Arrow::Decimal{128,256}DataType automatically. This also truncates given BigDecimal when the specified Arrow::Decimal{128,256}DataType doesn't have enough and scale. This still doesn't check precision. If an user specifies data that have too much precision, the data are used as-is. Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…pache#13377) This requires bigdecimal 3.1.0 or later for BigDecimal#scale. Arrow::Array.new([BigDecimal]) detects the max precision and scale from BigDecimals and creates suitable Arrow::Decimal{128,256}DataType automatically. This also truncates given BigDecimal when the specified Arrow::Decimal{128,256}DataType doesn't have enough and scale. This still doesn't check precision. If an user specifies data that have too much precision, the data are used as-is. Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
This requires bigdecimal 3.1.0 or later for BigDecimal#scale.
Arrow::Array.new([BigDecimal]) detects the max precision and scale
from BigDecimals and creates suitable Arrow::Decimal{128,256}DataType
automatically.
This also truncates given BigDecimal when the specified
Arrow::Decimal{128,256}DataType doesn't have enough and scale. This
still doesn't check precision. If an user specifies data that have too
much precision, the data are used as-is.