Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arithmetic division for all types #8120

Merged
merged 8 commits into from Sep 8, 2019

Conversation

bcardiff
Copy link
Member

@bcardiff bcardiff commented Aug 27, 2019

Note: Final behavior described at #8120 (comment)


This PR makes / work as arithmetic division for all types.

1 / 2 # => 0.5

As a reminder, for integer or floor division // should be used.

1 // 2 # => 0

Given a binary operator , the result type is usually the type of the left hand side: typeof(a ⨁ b) == typeof(a). This helps to reduce the unions in expressions like a ⨁= b.

For / this rule does not apply since the type might be a different to the operands. We want anInt32 / anInt32 to be Float32.

There are 17 "numeric" types in the stdlib: [U]Int{8,16,32,64,128}, Float{32,64}, Big{Int,Float,Decimal,Rational}, Complex. (Actually, Complex does not inherit from Number.)

I propose typeof(a / b) == max(fa(typeof(a)), fa(typeof(b))), where fa(t) defines a float affinity type.

  • Float32 if t = [U]Int{8,16,32,64}
  • Float64 if t = [U]Int128
    • 1 / UInt128::MAX ≈ 2.938e-39 < 1.175e-38 ≈ Float32::MIN_POSITIVE
  • BigFloat if t = BigInt
  • t if t = Float{32,64},Big{Float,Decimal,Rational},Complex

And max is computed under the partial order inferred by {Float32 < Float64, Float64 < BigFloat < BigDecimal < BigRational, Float64 < Complex}

All the combinations can be viewed in the following table:

Screen Shot 2019-08-27 at 13 52 20

The max relation is not defined for Complex and Big{Int,Float,Decimal,Rational}, they are extensions to the basic numbers but they do not interact well within each other. And I don't see us adding BigRationalComplex in the std-lib.

Note: I checked if typeof(a / b) == fa(typeof(a)) would be a good rule, to be more similar to the other operators . But checking how things like loop { a = a / 2 } and loop { a = 2 / a } would work under one or another make me decide in favor of typeof(a / b) == max(fa(typeof(a)), fa(typeof(b))).

I have a couple of leftovers or further thoughts after these changes:

  • The former float division is still needed, but it was only exposed as a primitive through "/". "fdiv" can now be used as primitive method name, but will be available after the release.
  • Although Big and Complex can be moved out of the std-lib, having the right behavior was worth it.
  • I was not sure if BigDecimal < BigRational or BigRational < BigDecimal. Rational has more precision, but if working with decimal they are usually preferred to stay. But this should also impact other operators. So I stay went with BigDecimal < BigRational.
  • It might make sense to move the minimal operators (those that operate within the same types) to a core module and leave the extensions to make things more comfortable in the stdlib. I tried to move the implementation a bit in that direction so see how it feels.
  • While reviewing all numbers as a whole there are different criteria applied here and there: are the conversions handled in to_X methods or in initialize/self.new? Is there a generic implementation overridden or implementation should be only in the concrete classes?
  • I would like to have more specs related to the contract in numbers in spec/support/number.cr

I tried to avoid changing as less as possible, but there is still some work to do in order to get some neat implementation. But for now is an improvement and I needed to fill some missing gaps in Big* related to conversions.

Fixes #2968

@asterite
Copy link
Member

For / this rule does not apply since the type might be a different to the operands. We want anInt32 / anInt32 to be Float32.

I disagree here. We should use Float64 because it's the default float type for float literals, and basically what one should use except in rare cases (interfacing with C).

@asterite
Copy link
Member

The former float division is still needed, but it was only exposed as a primitive through "/". "fdiv" can now be used as primitive method name, but will be available after the release.

But fdiv will eventually be deprecated, right?

@bcardiff
Copy link
Member Author

bcardiff commented Aug 27, 2019

I disagree here. We should use Float64 because it's the default float type for float literals, and basically what one should use except in rare cases (interfacing with C).

Oops, my brain must have default it to Float32 like Int32.

But fdiv will eventually be deprecated, right?

I'm not sure. It is convenient to have a proper implementation of // since in that operation we want to keep the type of the left hand side.


If we make fa(t) = Float64 if t = [U]Int{8,16,32,64}, then the only combination that will return Float32 will be Float32 / Float32. So if a program wants to explicitly work with them it will need to be cast at the beginning.

Would that be comfortable enough?

@asterite
Copy link
Member

I'm not sure. It is convenient to have a proper implementation of // since in that operation we want to keep the type of the left hand side.

What's the difference between how // and fdiv work?

If we make fa(t) = Float64 if t = [U]Int{8,16,32,64}, then the only combination that will return Float32 will be Float32 / Float32. So if a program wants to explicitly work with them it will need to be cast at the beginning.

Would that be comfortable enough?

I think so. Float32 is rarely used.

@bcardiff
Copy link
Member Author

What's the difference between how // and fdiv work?

// is floor division.
fdiv is float division.

  def //(other)
    self.fdiv(other).floor
  end

Also, in this pr fdiv is self.class.new(self / other) to honor lhs rule, but I think fdiv should be the primitive and / be defined on top of it for the chosen type combinations.

@asterite
Copy link
Member

Ooooh... sorry, I got confused. I meant to ask the difference between / and fdiv. I think only one should remain, and the "primitive" one be hidden with a separate name (because people coming from Ruby might incorrectly use it). But I don't understand why / can't do the right thing and then remove fdiv.

@asterite
Copy link
Member

In any case having / and fdiv right now is not a big issue, we can always deprecate and remove fdiv later.

@ysbaddaden
Copy link
Contributor

I second Asterite. I believe the cases where / should return a Float32 is when either lhs or rhs is a Float32 and the other isn't more significant (not Float64, BigX or Complex).

@bcardiff
Copy link
Member Author

defining fa(t) as

  • Float64 if t = [U]Int{8,16,32,64,128}
  • BigFloat if t = BigInt
  • t if t = Float{32,64},Big{Float,Decimal,Rational},Complex

would lead to

Screen Shot 2019-08-28 at 12 07 17

The only a / b : Float32 is when both a, b : Float32

@bcardiff
Copy link
Member Author

I disabled some Big* specs in 32 bits. It seems that we stumble on that in the past already 7282cf3 .

This should be ready for review.

@ysbaddaden
Copy link
Contributor

ysbaddaden commented Aug 29, 2019

I still don't quite understand why Int / Float32 would return a Float64? Isn't the float type in a binary operation significant? I mean typeof(1 + 1_f32) => Float32.

@asterite
Copy link
Member

Ah, yes, I think Int / Float32 should be Float32. Float64 should only be the default when two integers are. involved. And when doing Int / Float64 of course.

@bcardiff
Copy link
Member Author

bcardiff commented Aug 29, 2019

@ysbaddaden, but [U]Int128 / Float32 => Float64, right?

It’s good that I kept the colored table 😅

@asterite
Copy link
Member

I don't think [U]Int128 / Float32 should be an exception. Any primitive integer divided by Float32 should be Float32.

@bcardiff
Copy link
Member Author

But then to keep the typing rule symmetrical there are some operations that will not fit:

1f32 / UInt128::MAX ≈ 2.938e-39 < 1.175e-38 ≈ Float32::MIN_POSITIVE

It also bugs me a little that if Int / Float32 is Float32, then the typing rule formulation is a bit more complex, but is paperwork on the formality side. I will leave on the side.

@asterite
Copy link
Member

I think Float64 by default might not be bad then. You can always convert to Float32 without losing precision. And Float32 is not common at all.

@ysbaddaden
Copy link
Contributor

ysbaddaden commented Aug 29, 2019

@bcardiff the same reasoning applies to all other binary operations with a Float32. Lower precision means that the value will be rounded, so 1_f32 + UInt128::MAX is Infinity (actually 3.402823669209385e+38) and 1_f32 / UInt128::MAX is 0_f32 (actually 2.938735877055719e-39). I believe this is expected behavior.

I don't understand why 1 + Float32 returns a Float32 but 1 / Float32 would suddenly be changed to return a Float64 with higher precision. That's unexpected. If the rounding/low precision is a problem, one should use a Float64 (the default) not a Float32 in its operations (uncommon).

There are use cases where we want lower precision. While documenting on alternative float sizes (16, 80 an 128-bit) I read that AI / ML algorithms sometimes cast to Float16 to specifically lose precision, which reduces memory usage, but also improves results: AI behaves better with blurry results, being too precise is counter productive.

@bcardiff
Copy link
Member Author

@ysbaddaden yes I recall the same use cases even losing some precision.

For native types:

  • Int / Float should keep the float (whatever it is) and
  • FloatA / FloatB should max the floats,
  • IntA / IntB should max using the float affinitiy

For extensions:

  • NumA / NumB should max using the float affinitiy

with that, the table will look like follows

Screen Shot 2019-08-29 at 11 57 28

Separating native types and extensions seem needed because otherwise Float64 * BigInt would've been Float64 according to the rules for native types, instead of BigFloat which I think is preferred.

👍 ?

Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

spec/std/int_spec.cr Outdated Show resolved Hide resolved
spec/std/int_spec.cr Outdated Show resolved Hide resolved
spec/std/int_spec.cr Outdated Show resolved Hide resolved
spec/std/int_spec.cr Outdated Show resolved Hide resolved
@bcardiff
Copy link
Member Author

bcardiff commented Sep 6, 2019

Rebased on master and apply some suggested changes in the spec. I triggered a test build so I can check early how good we to ship this in 0.31.0

@elorest
Copy link
Contributor

elorest commented Oct 7, 2019

Do other languages do this? I'm so used to 3/2==1 being true.

@asterite
Copy link
Member

asterite commented Oct 8, 2019

@elorest Nim, Python, Elm, Haskell, Perl, to name a few.

@elorest
Copy link
Contributor

elorest commented Oct 8, 2019

@asterite thanks.

@j8r
Copy link
Contributor

j8r commented Nov 23, 2020

For the floor division of a Float, it will be logical to return an Integer instead of a Float.
According to the definition of the floor division:

[..] the floor function is the function that takes as input a real number x, and gives as output the greatest integer less than or equal to x

For now we have 3.0 // 2 #=> 1.0

What do you think? It will make sense on the mathematical level, but maybe not on the Crystal language one, where usually calling a method on a type returns the same type.

@j8r
Copy link
Contributor

j8r commented Nov 23, 2020

For example, in Julia:

x = floor(1 / 2)
print(typeof(x)) #=> Float64
x = floor(Int, 1 / 2)
print(typeof(x)) #=> Int

I guess the actual behavior in Crystal makes also sense, I'm just going to add to_i32.

@asterite
Copy link
Member

The reason is that the range of a Float64 is bigger than that of an Int32 (or Int64). So if you do big_float // 1 you really want a float as a result because it can't be represented as an integer.

@j8r
Copy link
Contributor

j8r commented Nov 23, 2020

Another solution I came across for those wanting more mathematical "correctness" is to use Big.
For my side I won't because of performance reasons.

yous added a commit to yous/basehangul-crystal that referenced this pull request Jul 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make / be float division
6 participants