# Type-instability of log function #5805

Closed
opened this Issue Feb 13, 2014 · 33 comments

None yet

### 12 participants

Contributor
commented Feb 13, 2014
 I just noticed that the `log` function shows some kind of type instability: ``````julia> log(e) 1 julia> typeof(ans) Int64 `````` but ``````julia> log10(10) 1.0 julia> typeof(ans) Float64 `````` The same happens with `log2(2)` or computing `log(x,x)` for any `Float64`; in particular for `log(e,e)`. Is this the expected behaviour?
Member
 `e` is special: `typeof(e)` is `MathConst{:e}`, so this isn't type instability. I don't know the logic of returning `1` versus `1.0`, though. I guess one could argue that `1` is more exact.
Contributor
commented Feb 13, 2014
 The relevant source is in `constants.jl`: `log(::MathConst{:e}) = 1 # use 1 to correctly promote expressions like log(x)/log(e)` although I'm not sure I understand the comment.
Owner
 I don't think the statement about `log2(2)` is correct; I get float 1.0 for `log(x,x)`, `log(e,e)`, `log2(2)`, etc.
Owner
 Ah, "the same" was ambiguous --- I guess it refers to the 1.0, not the whole 1 vs. 1.0 thing.
Owner
 Using an integer `1` for `log(e)` gives a Float32 for `log(2.0f0)/log(e)`.
Contributor
commented Feb 13, 2014
 I do not understand either the comment of the relevant line in the source code. My point is the following: The idea of type stability, as I understand it, is to define functions such that you get the same type consistently. That's the philosophy behind the behaviour of `sqrt`, just to quote an example. Another is that `1/1=1.0`. That is the reason that I would have anticipated that `log(e)` yields `1.0` (instead on `1`) just as `log2(2)=1.0` or, using another `MathConst`, `log(pi,pi)=1.0`.
Owner
 Unfortunately the concept of "consistent" is not consistent. As @simonster said, `e` is not a `Float64`, but a different type. That allows you to do this: ``````julia> big(e) 2.718281828459045235360287471352662497757247093699959574966967627724076630353555e+00 with 256 bits of precision ``````
Member
 Put another way: if you force `e` to be a float, you get the expected result: ```julia> log(float(e)) 1.0 julia> log(1.0e) 1.0```
Owner
 Unfortunately the concept of "consistent" is not consistent. Cute as that phrase is, that's not in fact what's going on here. Rather, @lbenet's notion of type stability is not quite what we mean by it: The idea of type stability, as I understand it, is to define functions such that you get the same type consistently. The return type does not need to be universally consistent, but it should only depend on the types of its arguments, not their values. A function is type-stable if the type of of the value it returns only depends on the types of its arguments. In the case of `log(e)` it's ok to return an `Int` because we always return an `Int` when the argument to `log` is of type `MathConst{:e}`.
Contributor
commented Feb 13, 2014
 Thanks for clarifying type stability, and sorry for my lack of understanding. I simply found curious that `log(e)` returns an `Int` while `log2(2)` or `log(pi,pi)` return a `Float64`, or `log(big(e))` returns a `BigFloat`.
Contributor
commented Feb 13, 2014
 It still seems weird to me from an aesthetic/consistency perspective that `log(e)!==log2(2)` and `log(1*e)===log(1. * e)!==log(e)`, even if it's fine from the type-stability performance perspective.
Member
commented Feb 13, 2014
 Reasoning about the behavior of `MathConst`s can be surprisingly tricky; see e.g. #5537. I don't see any good reason why we need to have `log(e)!==log(e,e)`, though. Arguably, having `log(e)===1` is even inconsistent with the Euler's definition of the logarithm as the limit of `(n * x^(1/n) - 1)` as `n->Inf`; it doesn't make sense to definite the quantity whose limit is being taken to be restricted to integers, and I'm not sure it's conceptually worth the trouble to define the limit over some field to be a quantity that lies not in that field, but in a subfield.
Contributor
 Yeah, if the only case where you get an `Int` is when you pass `e`, when can it be useful? I mean, if you know you're passing `e` to `log()` and you know you want an `Int`, you already know you'll get `1` as a result, and you could write it by hand as well. ;-) Also, this special-casing does not exist for `cos(pi)`, which returns a `-1.0`.
Owner
 I agree the behavior is a bit marginal; `log(e) = 1` is basically the one and only definition of that kind in constants.jl. It's hard to know where to stop adding definitions for `MathConst`, but we've erred on the side of minimalism and should probably stick with that.
Owner
 I would be totally fine with getting rid of this definition – this is not meant to be a mini symbolic math system. I think that @stevengj is the one who added it, so we may want to let him argue for it's retention, however.
Owner
 Agree with Jeff's point: we probably shouldn't have `log(e)` unless we're going to have `sin(pi)`.
Owner
 The rationale for defining `log(e)` was that dividing by a logarithm of a constant is quite common as a way of factoring out the base of a logarithm from expressions, and it seemed logical to avoid the extra computation in the case where the base is `e` (in code that is generic about the base). It should return an integer so as to avoid unnecessary promotion when combined with, e.g., single-precision expressions. I think `log(e, x)` should be defined to be equivalent to `log(x)`, in any case. I don't have any problem with `sin(pi)=0` and `cos(pi)=-1` etc., but I think the need for these is less pronounced.
Owner
 @jiahao, I don't really understand your point about Euler. Everyone agrees that log(e) is an integer, regardless of whether you get that integer as a limit of non-integral values; the convention that we identify the integers with a subset of the reals is uncontroversial. I don't think Euler would have considered 1 and 1.0 to be distinct values. The question of what type to return for `log(e)` in Julia is purely one of technical convenience, it seems to me.
Contributor
commented Feb 14, 2014
 The rationale for defining log(e) was that dividing by a logarithm of a constant is quite common as a way of factoring out the base of a logarithm from expressions, and it seemed logical to avoid the extra computation in the case where the base is e (in code that is generic about the base). It should return an integer so as to avoid unnecessary promotion when combined with, e.g., single-precision expressions. @stevengj, thanks for the explanation. Note that `log2(2f0)` results in `1.0f0` and not 1. Then, `log2(2)` and `log2(2.0)` should be defined to yield 1, and something similar should be implemented for `log10` and any `log(x,x)`.
Owner
 @lbenet, the type instability problems introduced by making `log2(2) = 1` will overwhelm any possible efficiency gains from the saved computation. (In any case, I think if people are going to do calculations with logarithms in a generic base, they will generally be using `log` rather than `log2` or `log10`.) In contrast, as was explained above, `log(e) = 1` is actually type-stable because `e` has a special type (unlike `2` or `10`, which have the same type).
Contributor
 If the goal is that `log(e)` returns `1` of the most appropriate type for efficiency, it looks like it should actually return a new `MathConst` equal to `1`, since that's really what it means. Probably not worth it, though.
Member
commented Feb 15, 2014
 `MathConst` is about, as I understand it, allowing certain irrational constants to have arbitrary precision. `1` only has one significant digit.
Contributor
 @pao Yeah, but they also have the property that they are automatically converted to the type of other arguments in arithmetic operations, which is not the case of `Float32` and `Float64`. Anyway, I was just willing to point to where this reasoning might conceptually lead us. I don't think that's the way to go.
Owner
 @pao, as Kahan says, you are perhaps confusing accuracy with precision when you talk about "significant digits" (≠ accurate digits). `1` is exactly represented in every numeric type, regardless of the precision. There is no error in using `1` to represent log(e), regardless of what type we return. (As another example from Kahan, 3.000000000000 is very precise, but it is simultaneously a very inaccurate approximation for π.) Or perhaps this is what you were trying to say? @nalimilan, an `Int` constant is also automatically promoted to the appropriate floating-point type when it is used in arithmetic operations with those types. Actually, it is better than that, because if you multiply an `Int` by a `BigFloat` then it uses a special-purpose routine that is more efficient than promoting the integer to a `BigFloat` first. The only question, from my perspective, is avoiding unwanted promotions (when combined with floating-point types, e.g. `Float32`) while maximizing efficiency. Avoiding unwanted promotions rules out returning a `Float64`, but any hardware integer type is fine, and using `Int` seems to result in the smallest number of instructions. A `MathConst` is totally unnecessary. In fact, assuming `log(e) = 1` is inlined, LLVM tends to optimize it out of multiplications anyway, and often seems to inline any needed promotion.
Member
commented Feb 16, 2014
 Or perhaps this is what you were trying to say? I think it was what I was trying to say, but the way you say it overcomplicates the trivial point I was making, which was that you don't need arbitrary precision to specify `1`. `MathConst` is overkill.
Contributor
 @stevengj OK, got it.
Contributor
 (Beating my favorite dead horse.) Returning a Float64 wouldn't be a problem if mixed floating point types promoted away from Float64. You don't get Float32 or Float16 values by accident, whereas Float64 appears everywhere, e.g. floating point literals unless you append f0, constructors like ones unless you add a type specifier, and as discussed in this issue, mathematical operations on MathConst like cos(pi).
Owner
 What about BigFloat with Float64 in that case? Should that produce Float64 or BigFloat?
Owner
 It's pretty widespread in languages that operations promote to the widest operand; I don't think Julia should buck that trend. Anyway, please open a separate issue if you want to continue on that point. On this issue, is there a resolution?
Member
commented Feb 16, 2014
 I originally thought of the limit argument as saying that the limit of an expression where pointwise every value is only float-representable should have a limit that is a float (even if the final value is exactly presentable as a non-float). While I do see the point about Float64 polluting integers and lower-precision floats, I'm still not convinced, though, that having the inconsistency that `log(e) !== log(e, e)` is a good thing.
Owner
 I think there's two reasonable courses of action: get rid of the special case for `log(e)`. add special cases for all the obvious identities involving math constants: `log(e,e) == 1`, etc. Since math consts are primarily so that you can use them as literals in generic code that produce correct values of the appropriate type and precision and the only cases where I can see `log(e) == 1` being useful are when you have the `e` value coming from outside the scope in which the expression occurs, I'm mildly in favor of (1) rather than (2).
Owner
 @jiahao, my suggestion is to simply add a log(e,x)=log(x) method.
Member
commented Feb 16, 2014
 Yes, I think that would address my concern.
added a commit that referenced this issue Feb 17, 2014
 stevengj `log(e,x) = log(x), as discussed in #5805` `c1ed5e7`