Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-10600: [Go] Implement Decimal256 #13792

Merged
merged 13 commits into from
Aug 5, 2022

Conversation

zeroshade
Copy link
Member

No description provided.

@github-actions
Copy link

github-actions bot commented Aug 3, 2022

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't go through everything, assuming much of the code here is copied and adapted from decimal128?

go/arrow/datatype_fixedwidth.go Outdated Show resolved Hide resolved
go/arrow/datatype_fixedwidth.go Outdated Show resolved Hide resolved
func fromPositiveFloat32(v float32, prec, scale int32) (Num, error) {
var pscale float32
if scale >= -76 && scale <= 76 {
pscale = float32PowersOfTen[scale+76]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to cast v to float64 and then call fromPositive64? You would 1) remove a lot of code 2) get better precision (float32PowersOfTen contains some zeros and infinities for very small or large scales)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So i tried that originally when I was writing the ToFloat functionalities for decimal128 and found (very) small precision issues which I assumed would carry through to the From functions. though right now trying this locally I can't reproduce those precision issues, maybe it was a windows thing? I don't know. That said, given that those precision issues would be outside the bounds of what is considered even reasonable for a float, I agree with you that it's probably just easier/better to cast to float64 and use the fromPositiveFloat64 / tofloat64Positive functions and remove the excess code. I'll go do that.

go/arrow/decimal256/decimal256.go Outdated Show resolved Hide resolved
go/arrow/decimal256/decimal256.go Outdated Show resolved Hide resolved
if n == (Num{}) {
return 0
}
return int(1 | (int64(n.arr[3]) >> 63))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the purpose here to avoid branching? (otherwise why not a simple conditional?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the purpose here was to minimize branching as this could be potentially called frequently. I actually pulled this from code in Go's internals

go/arrow/decimal256/decimal256.go Outdated Show resolved Hide resolved

switch to.ID() {
case arrow.DECIMAL256:
return NewDecimal256Scalar(s.Value, to), nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is missing some casting of s.Value in case the scales don't match.
And even if the scales match, should probably check that the value still fits in the new precision.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed this, calling Rescale and FitsInPrecision

@@ -297,6 +298,8 @@ func (s *Decimal128) CastTo(to arrow.DataType) (Scalar, error) {
switch to.ID() {
case arrow.DECIMAL128:
return NewDecimal128Scalar(s.Value, to), nil
case arrow.DECIMAL256:
return NewDecimal256Scalar(decimal256.FromDecimal128(s.Value), to), nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below: should take into account differences in scale or precision.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed calling Rescale and FitsInPrecision

go/arrow/array/decimal256_test.go Show resolved Hide resolved
@zeroshade
Copy link
Member Author

I'm gonna bug you again @wolfeidau to have a look as we still lack many Go developers here who can take a look at these. 😄 After this, the union PR and one more PR afterwards (implementing unions in the IPC handling) Go will officially support all of the Arrow data types!

Copy link
Contributor

@wolfeidau wolfeidau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again only one small observation.

Happy to do what i can to review this, great work getting this last type working!

// all values in v are appended and considered valid.
func (b *Decimal256Builder) AppendValues(v []decimal256.Num, valid []bool) {
if len(v) != len(valid) && len(valid) != 0 {
panic("len(v) != len(valid) && len(valid) != 0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this missing the convention i have seen with the message being prefixed by "arrow: "?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point! I've added the prefixes.

Thanks for the reviews! technically Go has better Type support than C++ now! Since C++ doesn't support Float16 lol!

@pitrou
Copy link
Member

pitrou commented Aug 5, 2022

There seem to be a lot of Go CI failures suddenly.

@zeroshade
Copy link
Member Author

@pitrou Yea, when I merged the Union type PR it added new requirements to the Builder types which I hadn't implemented in the Decimal256Builder so the build failed. I just had to add the new required interface methods to the Decimal256Builder and now all should be good.

@zeroshade zeroshade merged commit 6d1bc62 into apache:master Aug 5, 2022
@zeroshade zeroshade deleted the arrow-10600-decimal256 branch August 5, 2022 16:28
@ursabot
Copy link

ursabot commented Aug 5, 2022

Benchmark runs are scheduled for baseline = 6a3fb97 and contender = 6d1bc62. 6d1bc62 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.14% ⬆️0.71%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.39% ⬆️0.92%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 6d1bc624 ec2-t3-xlarge-us-east-2
[Failed] 6d1bc624 test-mac-arm
[Finished] 6d1bc624 ursa-i9-9960x
[Finished] 6d1bc624 ursa-thinkcentre-m75q
[Finished] 6a3fb97a ec2-t3-xlarge-us-east-2
[Finished] 6a3fb97a test-mac-arm
[Finished] 6a3fb97a ursa-i9-9960x
[Finished] 6a3fb97a ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot
Copy link

ursabot commented Aug 5, 2022

['Python', 'R'] benchmarks have high level of regressions.
test-mac-arm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants