Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arithmetic division for all types #8120

Merged
merged 8 commits into from Sep 8, 2019

Conversation

@bcardiff
Copy link
Member

commented Aug 27, 2019

Note: Final behavior described at #8120 (comment)


This PR makes / work as arithmetic division for all types.

1 / 2 # => 0.5

As a reminder, for integer or floor division // should be used.

1 // 2 # => 0

Given a binary operator , the result type is usually the type of the left hand side: typeof(a ⨁ b) == typeof(a). This helps to reduce the unions in expressions like a ⨁= b.

For / this rule does not apply since the type might be a different to the operands. We want anInt32 / anInt32 to be Float32.

There are 17 "numeric" types in the stdlib: [U]Int{8,16,32,64,128}, Float{32,64}, Big{Int,Float,Decimal,Rational}, Complex. (Actually, Complex does not inherit from Number.)

I propose typeof(a / b) == max(fa(typeof(a)), fa(typeof(b))), where fa(t) defines a float affinity type.

  • Float32 if t = [U]Int{8,16,32,64}
  • Float64 if t = [U]Int128
    • 1 / UInt128::MAX ≈ 2.938e-39 < 1.175e-38 ≈ Float32::MIN_POSITIVE
  • BigFloat if t = BigInt
  • t if t = Float{32,64},Big{Float,Decimal,Rational},Complex

And max is computed under the partial order inferred by {Float32 < Float64, Float64 < BigFloat < BigDecimal < BigRational, Float64 < Complex}

All the combinations can be viewed in the following table:

Screen Shot 2019-08-27 at 13 52 20

The max relation is not defined for Complex and Big{Int,Float,Decimal,Rational}, they are extensions to the basic numbers but they do not interact well within each other. And I don't see us adding BigRationalComplex in the std-lib.

Note: I checked if typeof(a / b) == fa(typeof(a)) would be a good rule, to be more similar to the other operators . But checking how things like loop { a = a / 2 } and loop { a = 2 / a } would work under one or another make me decide in favor of typeof(a / b) == max(fa(typeof(a)), fa(typeof(b))).

I have a couple of leftovers or further thoughts after these changes:

  • The former float division is still needed, but it was only exposed as a primitive through "/". "fdiv" can now be used as primitive method name, but will be available after the release.
  • Although Big and Complex can be moved out of the std-lib, having the right behavior was worth it.
  • I was not sure if BigDecimal < BigRational or BigRational < BigDecimal. Rational has more precision, but if working with decimal they are usually preferred to stay. But this should also impact other operators. So I stay went with BigDecimal < BigRational.
  • It might make sense to move the minimal operators (those that operate within the same types) to a core module and leave the extensions to make things more comfortable in the stdlib. I tried to move the implementation a bit in that direction so see how it feels.
  • While reviewing all numbers as a whole there are different criteria applied here and there: are the conversions handled in to_X methods or in initialize/self.new? Is there a generic implementation overridden or implementation should be only in the concrete classes?
  • I would like to have more specs related to the contract in numbers in spec/support/number.cr

I tried to avoid changing as less as possible, but there is still some work to do in order to get some neat implementation. But for now is an improvement and I needed to fill some missing gaps in Big* related to conversions.

Fixes #2968

@asterite

This comment has been minimized.

Copy link
Member

commented Aug 27, 2019

For / this rule does not apply since the type might be a different to the operands. We want anInt32 / anInt32 to be Float32.

I disagree here. We should use Float64 because it's the default float type for float literals, and basically what one should use except in rare cases (interfacing with C).

@asterite

This comment has been minimized.

Copy link
Member

commented Aug 27, 2019

The former float division is still needed, but it was only exposed as a primitive through "/". "fdiv" can now be used as primitive method name, but will be available after the release.

But fdiv will eventually be deprecated, right?

@bcardiff

This comment has been minimized.

Copy link
Member Author

commented Aug 27, 2019

I disagree here. We should use Float64 because it's the default float type for float literals, and basically what one should use except in rare cases (interfacing with C).

Oops, my brain must have default it to Float32 like Int32.

But fdiv will eventually be deprecated, right?

I'm not sure. It is convenient to have a proper implementation of // since in that operation we want to keep the type of the left hand side.


If we make fa(t) = Float64 if t = [U]Int{8,16,32,64}, then the only combination that will return Float32 will be Float32 / Float32. So if a program wants to explicitly work with them it will need to be cast at the beginning.

Would that be comfortable enough?

@asterite

This comment has been minimized.

Copy link
Member

commented Aug 27, 2019

I'm not sure. It is convenient to have a proper implementation of // since in that operation we want to keep the type of the left hand side.

What's the difference between how // and fdiv work?

If we make fa(t) = Float64 if t = [U]Int{8,16,32,64}, then the only combination that will return Float32 will be Float32 / Float32. So if a program wants to explicitly work with them it will need to be cast at the beginning.

Would that be comfortable enough?

I think so. Float32 is rarely used.

@bcardiff

This comment has been minimized.

Copy link
Member Author

commented Aug 27, 2019

What's the difference between how // and fdiv work?

// is floor division.
fdiv is float division.

  def //(other)
    self.fdiv(other).floor
  end

Also, in this pr fdiv is self.class.new(self / other) to honor lhs rule, but I think fdiv should be the primitive and / be defined on top of it for the chosen type combinations.

@asterite

This comment has been minimized.

Copy link
Member

commented Aug 27, 2019

Ooooh... sorry, I got confused. I meant to ask the difference between / and fdiv. I think only one should remain, and the "primitive" one be hidden with a separate name (because people coming from Ruby might incorrectly use it). But I don't understand why / can't do the right thing and then remove fdiv.

@asterite

This comment has been minimized.

Copy link
Member

commented Aug 27, 2019

In any case having / and fdiv right now is not a big issue, we can always deprecate and remove fdiv later.

@ysbaddaden

This comment has been minimized.

Copy link
Member

commented Aug 27, 2019

I second Asterite. I believe the cases where / should return a Float32 is when either lhs or rhs is a Float32 and the other isn't more significant (not Float64, BigX or Complex).

@bcardiff

This comment has been minimized.

Copy link
Member Author

commented Aug 28, 2019

defining fa(t) as

  • Float64 if t = [U]Int{8,16,32,64,128}
  • BigFloat if t = BigInt
  • t if t = Float{32,64},Big{Float,Decimal,Rational},Complex

would lead to

Screen Shot 2019-08-28 at 12 07 17

The only a / b : Float32 is when both a, b : Float32

@bcardiff

This comment has been minimized.

Copy link
Member Author

commented Aug 28, 2019

I disabled some Big* specs in 32 bits. It seems that we stumble on that in the past already 7282cf3 .

This should be ready for review.

@ysbaddaden

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

I still don't quite understand why Int / Float32 would return a Float64? Isn't the float type in a binary operation significant? I mean typeof(1 + 1_f32) => Float32.

@asterite

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

Ah, yes, I think Int / Float32 should be Float32. Float64 should only be the default when two integers are. involved. And when doing Int / Float64 of course.

@bcardiff

This comment has been minimized.

Copy link
Member Author

commented Aug 29, 2019

@ysbaddaden, but [U]Int128 / Float32 => Float64, right?

It’s good that I kept the colored table 😅

@asterite

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

I don't think [U]Int128 / Float32 should be an exception. Any primitive integer divided by Float32 should be Float32.

@bcardiff

This comment has been minimized.

Copy link
Member Author

commented Aug 29, 2019

But then to keep the typing rule symmetrical there are some operations that will not fit:

1f32 / UInt128::MAX ≈ 2.938e-39 < 1.175e-38 ≈ Float32::MIN_POSITIVE

It also bugs me a little that if Int / Float32 is Float32, then the typing rule formulation is a bit more complex, but is paperwork on the formality side. I will leave on the side.

@asterite

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

I think Float64 by default might not be bad then. You can always convert to Float32 without losing precision. And Float32 is not common at all.

@ysbaddaden

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

@bcardiff the same reasoning applies to all other binary operations with a Float32. Lower precision means that the value will be rounded, so 1_f32 + UInt128::MAX is Infinity (actually 3.402823669209385e+38) and 1_f32 / UInt128::MAX is 0_f32 (actually 2.938735877055719e-39). I believe this is expected behavior.

I don't understand why 1 + Float32 returns a Float32 but 1 / Float32 would suddenly be changed to return a Float64 with higher precision. That's unexpected. If the rounding/low precision is a problem, one should use a Float64 (the default) not a Float32 in its operations (uncommon).

There are use cases where we want lower precision. While documenting on alternative float sizes (16, 80 an 128-bit) I read that AI / ML algorithms sometimes cast to Float16 to specifically lose precision, which reduces memory usage, but also improves results: AI behaves better with blurry results, being too precise is counter productive.

@bcardiff

This comment has been minimized.

Copy link
Member Author

commented Aug 29, 2019

@ysbaddaden yes I recall the same use cases even losing some precision.

For native types:

  • Int / Float should keep the float (whatever it is) and
  • FloatA / FloatB should max the floats,
  • IntA / IntB should max using the float affinitiy

For extensions:

  • NumA / NumB should max using the float affinitiy

with that, the table will look like follows

Screen Shot 2019-08-29 at 11 57 28

Separating native types and extensions seem needed because otherwise Float64 * BigInt would've been Float64 according to the rules for native types, instead of BigFloat which I think is preferred.

👍 ?

Copy link
Member

left a comment

👍

spec/std/int_spec.cr Outdated Show resolved Hide resolved
spec/std/int_spec.cr Outdated Show resolved Hide resolved
spec/std/int_spec.cr Outdated Show resolved Hide resolved
spec/std/int_spec.cr Outdated Show resolved Hide resolved
@bcardiff bcardiff force-pushed the bcardiff:feature/arithmetic-division branch from 5943f3d to ddbaf9d Sep 6, 2019
@bcardiff bcardiff force-pushed the bcardiff:feature/arithmetic-division branch from ddbaf9d to 1cffcc5 Sep 6, 2019
@bcardiff

This comment has been minimized.

Copy link
Member Author

commented Sep 6, 2019

Rebased on master and apply some suggested changes in the spec. I triggered a test build so I can check early how good we to ship this in 0.31.0

@bcardiff bcardiff added this to the 0.31.0 milestone Sep 8, 2019
@bcardiff bcardiff merged commit c10aef4 into crystal-lang:master Sep 8, 2019
6 checks passed
6 checks passed
ci/circleci: check_format Your tests passed on CircleCI!
Details
ci/circleci: test_darwin Your tests passed on CircleCI!
Details
ci/circleci: test_linux Your tests passed on CircleCI!
Details
ci/circleci: test_linux32 Your tests passed on CircleCI!
Details
ci/circleci: test_preview_mt Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@bcardiff bcardiff deleted the bcardiff:feature/arithmetic-division branch Sep 21, 2019
gewo added a commit to gewo/maxminddb.cr that referenced this pull request Sep 24, 2019
crystal-lang/crystal#8120 changed the behavior of `/` to always return
floats.
@gewo gewo referenced this pull request Sep 24, 2019
@elorest

This comment has been minimized.

Copy link
Contributor

commented Oct 7, 2019

Do other languages do this? I'm so used to 3/2==1 being true.

@asterite

This comment has been minimized.

Copy link
Member

commented Oct 8, 2019

@elorest Nim, Python, Elm, Haskell, Perl, to name a few.

@elorest

This comment has been minimized.

Copy link
Contributor

commented Oct 8, 2019

@asterite thanks.

MatthiasWinkelmann added a commit to MatthiasWinkelmann/cadmium that referenced this pull request Oct 16, 2019
The / operator was changed to return floats: crystal-lang/crystal#8120. This change preserves previous behaviour (floor).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.