upcasting of arithmetic to Int #1078

Closed
dcampbell24 opened this issue Jul 23, 2012 · 8 comments
Labels
status:won't change Indicates that work won't continue on an issue or pull request

Comments

@dcampbell24
Contributor

While working on adding to The Computer Language Benchmark code, I have found that the types of expressions are often counter-intuitive and not what I want. The problem is that if I have a::Int8 and I do "a + 5", the type of the expression is Int, and to get the expected type (Int8) I must do a lot of manual coercion. I also can't write "a::Int8 = 5" or even "a::Int8 = int8(5)". This makes any arithmetic that does not use the default numeric type troublesome. I suggest treating numeric literals and constants the way Go does. Besides making it easier to work with types other than Int, this would allow more precise numeric constants and let constants be used more flexibly.

@dcampbell24
Contributor Author

This is related and also seems strange to me.

julia> a = int8(2)
2

julia> b = int8(2)
2

julia> typeof(a + b)
Int64

@JeffBezanson
Sponsor Member

a::Int8 = 5 works for local variables, but not globals since declaring types for globals is not supported yet. There is something to be said for making numeric literals arbitrary precision, but that is a separate issue from how arithmetic works. The question is, why do you need all intermediate results to be Int8? Our idea was to give the correct answer as long as it fits in an Int, and only do conversions to smaller types as needed for I/O or storage.
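A minimal sketch of the local-variable case, assuming current Julia syntax rather than the version discussed in this thread; `typed_local` is a hypothetical name used only for illustration. Each assignment to a typed local is converted to the declared type, so the arithmetic can stay in Int while the stored value stays Int8.

```julia
# typed_local is a hypothetical function, not taken from the thread.
function typed_local()
    a::Int8 = 5      # declare a local of type Int8; the literal 5 is converted on assignment
    a = a + 1        # the right-hand side is computed as an Int, then converted back to Int8
    return a
end

typed_local()        # an Int8 holding 6
```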

@dcampbell24
Contributor Author

If I have a large collection of values, I may want them to be smaller than Int to save memory. I want functions that transform the values to return values of the same type so that the size and type of the collection will not change. I may just be taking the wrong approach, but here is an example where I was not sure what to do other than put uint8() calls everywhere. The issue may not be so much that the intermediate value is not an int8, but that the return type is also not an int8, and that the function returns in many places. Being able to automatically convert the return type would also work.

@ViralBShah
Member

So, in this example, there will be unnecessary upcasting and downcasting?

julia> a = ones(Int32,5)
5-element Int32 Array:
 1
 1
 1
 1
 1

julia> a[5] = int32(1)+int32(5)
6

julia> typeof(a)
Array{Int32,1}

If I am going through the trouble of using a smaller integer type, it must be for a reason. If I want the right thing to happen, I wouldn't specify the types. I would much rather that the intermediates remain in the same precision.

@StefanKarpinski
Sponsor Member

@davekong: in your example, I wonder why you're working specifically with Int8 values here? Why not write this for arbitrary integer values and then wrap a call to that in a method that casts to the correct type? Something like this:

function shift(cell_::Int, dir::Int)
    div5_mod2(x) = div(x-1,5) % 2 != 0
    if dir == E
        return cell_ + 1
    elseif dir == ESE
        if div5_mod2(cell_)
            return cell_ + 7
        else
            return cell_ + 6
        end
    elseif dir == SE
        if div5_mod2(cell_)
            return cell_ + 6
        else
            return cell_ + 5
        end
    elseif dir == S
        return cell_ + 10
    elseif dir == SW
        if div5_mod2(cell_)
            return cell_ + 5
        else
            return cell_ + 4
        end
    elseif dir == WSW
        if div5_mod2(cell_)
            return cell_ + 4
        else
            return cell_ + 3
        end
    elseif dir == W
        return cell_ - 1
    elseif dir == WNW
        if div5_mod2(cell_)
            return cell_ - 6
        else
            return cell_ - 7
        end
    elseif dir == NW
        if div5_mod2(cell_)
            return cell_ - 5
        else
            return cell_ - 6
        end
    elseif dir == N
        return cell_ - 10
    elseif dir == NE
        if div5_mod2(cell_)
            return cell_ - 4
        else
            return cell_ - 5
        end
    elseif dir == ENE
        if div5_mod2(cell_)
            return cell_ - 3 
        else
            return cell_ - 4
        end
    end
    return cell_
end
shift(cell_::Int8, dir::Int8) = int8(shift(int(cell_),int(dir)))

[Note that your div5_mod2 operation is probably not doing what you intended, since the / operator does not do truncated integer division in Julia.] The last method could be generalized to something like this:

shift{T<:Integer}(cell_::T, dir::T) = convert(T,shift(int(cell_),int(dir)))
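For readers on current Julia, where the `f{T}(...)` method form and the lowercase `int()` converter are no longer accepted, the same wrapper idea can be sketched with a `where` clause. This is an illustration under that assumption, not the code from the thread; `step_east` is a hypothetical stand-in for `shift`, reduced to a single direction.

```julia
# Core logic written once for the native Int type (hypothetical stand-in for shift).
step_east(cell::Int) = cell + 1

# Generic wrapper: widen to Int, run the core logic, convert back to the caller's type.
# convert throws an InexactError if the result does not fit in T.
step_east(cell::T) where {T<:Integer} = convert(T, step_east(Int(cell)))

step_east(Int8(5))     # an Int8 holding 6
step_east(UInt16(9))   # a UInt16 holding 10
```

The Int method is the more specific one, so calls with a plain Int never go through the wrapper, and every other integer type shares the single generic definition.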

The notion that there is "unnecessary upcasting and downcasting" going on here is a bit naive. CPUs generally work at maximal efficiency with values of the "native" register size: 32 bits on a 32-bit platform, 64 bits on a 64-bit platform. In other words, precisely the size of a Julia Int. Manipulating smaller values is probably slower and requires more work, not less. In order to add 8-bit values and get an 8-bit result, you need to sign extend each byte to a larger size to fill a register, add the larger values, and extract the low byte of the result. In effect, by forcing each operation to have an 8-bit result, you'd be forcing the CPU to do more upcasting and downcasting: upcasting to do the actual operations and downcasting to get an 8-bit result after doing each operation. Of course, LLVM is probably smart enough to just use native-sized integer arithmetic throughout instead. By the same token, when you do have a situation where you upcast and then downcast in a way that could be done more efficiently with a single smaller type (e.g. 32-bit operations on a 64-bit machine that has 32-bit registers), LLVM is probably smart enough to just do it in the smaller type if that's the right thing to do. Fiddling around with non-native sizes for arithmetic operations is generally not helpful. Use native arithmetic and let LLVM do its thing.

There's also the issue of generated code bloat. That's part of the design here. By doing all core arithmetic with Ints, we tend to reduce the exhibited polymorphism for many kinds of operations, in turn reducing the number of different versions of each operation that need to be generated and kept around.
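The three steps described above for an 8-bit add (sign extend, add at native width, extract the low byte) can be written out explicitly. This is a minimal sketch in current Julia syntax; `add8` is a hypothetical helper, and what the compiler actually emits for it is not something this example demonstrates.

```julia
# add8 is a hypothetical helper that spells out each step by hand.
function add8(a::Int8, b::Int8)
    wide_a = Int(a)              # sign extend to the native word size
    wide_b = Int(b)
    wide_sum = wide_a + wide_b   # addition at native width
    return wide_sum % Int8       # keep only the low byte; wraps around on overflow
end

add8(Int8(2), Int8(2))       # an Int8 holding 4
add8(Int8(127), Int8(1))     # wraps around to -128
```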

@StefanKarpinski
Sponsor Member

Another relatively simple way to force Int8 results here would be to declare ret::Int8 and then assign to that typed location:

function shift(cell_::Int8, dir::Int8)
    ret::Int8
    div5_mod2(x) = div(x-1,5) % 2 != 0
    if dir == E
        ret = cell_ + 1
    elseif dir == ESE
        if div5_mod2(cell_)
            ret = cell_ + 7
        else
            ret = cell_ + 6
        end
    elseif dir == SE
        if div5_mod2(cell_)
            ret = cell_ + 6
        else
            ret = cell_ + 5
        end
    elseif dir == S
        ret = cell_ + 10
    elseif dir == SW
        if div5_mod2(cell_)
            ret = cell_ + 5
        else
            ret = cell_ + 4
        end
    elseif dir == WSW
        if div5_mod2(cell_)
            ret = cell_ + 4
        else
            ret = cell_ + 3
        end
    elseif dir == W
        ret = cell_ - 1
    elseif dir == WNW
        if div5_mod2(cell_)
            ret = cell_ - 6
        else
            ret = cell_ - 7
        end
    elseif dir == NW
        if div5_mod2(cell_)
            ret = cell_ - 5
        else
            ret = cell_ - 6
        end
    elseif dir == N
        ret = cell_ - 10
    elseif dir == NE
        if div5_mod2(cell_)
            ret = cell_ - 4
        else
            ret = cell_ - 5
        end
    elseif dir == ENE
        if div5_mod2(cell_)
            ret = cell_ - 3 
        else
            ret = cell_ - 4
        end
    else
        ret = cell_
    end
    return ret
end

@JeffBezanson
Sponsor Member

Return type declaration is a planned feature.
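In current Julia a return type can be annotated on the function signature, which converts every returned value to that type. A minimal sketch under that assumption; `next_cell` and its `eastward` argument are hypothetical names, not part of the code in this thread.

```julia
# next_cell is a hypothetical function used only to show the annotation.
function next_cell(cell::Int8, eastward::Bool)::Int8
    if eastward
        return cell + 1   # computed as an Int, converted to Int8 on return
    end
    return cell - 1       # same conversion applies at every return point
end

next_cell(Int8(5), true)    # an Int8 holding 6
next_cell(Int8(5), false)   # an Int8 holding 4
```

The conversion applies to every return statement, which addresses the case raised earlier of a function that returns in many places.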

@JeffBezanson
Sponsor Member

See #1090.


4 participants