## C05 Fast Numbers
### Numbers in Julia, their layout, and storage
#### Integers
By default, Julia integers are stored as native machine integers, so `Int` is 64 bits wide on a 64-bit system.
- `Sys.WORD_SIZE`: the word size, in bits, of the machine Julia is running on.
- `bitstring()`: displays the underlying binary representation of a number.
- `isbitstype()`: returns `true` for types such as integers and floats, whose representation is simply a fixed set of bits. Operations on these types are heavily optimized.
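
As a small sketch of how these pieces relate, the snippet below checks that `Int` matches the machine word size and looks at the bits of a narrower integer type. The use of `Int8` is only an illustrative choice, made to keep the bit strings short.

```julia
# Int is an alias for the native signed integer type of the platform.
native = Sys.WORD_SIZE == 64 ? Int64 : Int32
int_is_native = (Int === native)

# The size of Int in bits equals the machine word size.
bits_match = (sizeof(Int) * 8 == Sys.WORD_SIZE)

# bitstring works for any bits type; Int8 gives an 8-character string.
pos = bitstring(Int8(3))    # "00000011"
neg = bitstring(Int8(-3))   # "11111101" (two's complement)
```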

In [40]:
using BenchmarkTools

In [41]:
Sys.WORD_SIZE

64

In [42]:
bitstring(3)

"0000000000000000000000000000000000000000000000000000000000000011"

In [43]:
bitstring(-3)

"1111111111111111111111111111111111111111111111111111111111111101"

In [44]:
isbitstype(Int)

true

In [45]:
isbitstype(String)

false

In [46]:
myadd(x,y) = x + y;

In [47]:
@code_native myadd(1,2)

	.text
	.file	"myadd"
	.globl	julia_myadd_1587                # -- Begin function julia_myadd_1587
	.p2align	4, 0x90
	.type	julia_myadd_1587,@function
julia_myadd_1587:                       # @julia_myadd_1587
; ┌ @ /home/zpp/code/Julia-HPC/04-Numbers.ipynb:1 within `myadd`
# %bb.0:                                # %top
	push	rbp
	mov	rbp, rsp
; │┌ @ int.jl:87 within `+`
	lea	rax, [rdi + rsi]
; │└
	pop	rbp
	ret
.Lfunc_end0:
	.size	julia_myadd_1587, .Lfunc_end0-julia_myadd_1587
; └
                                        # -- End function
	.section	".note.GNU-stack","",@progbits


#### Integer overflow
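- The largest positive integer of an $N$-bit signed type has a sign bit of 0 followed by $N-1$ ones, which equals $2^{N-1} - 1$:
$$
0 \underbrace{1 1 1 \cdots 1}_{N-1}
$$
- The most negative integer has a sign bit of 1 followed by $N-1$ zeros, which equals $-2^{N-1}$:
$$
1 \underbrace{0 0 0 \cdots 0}_{N-1}
$$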
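
The two bit patterns above can be checked directly on a small type. This is a sketch that uses `Int8` ($N = 8$) purely to keep the strings readable; the same reasoning applies to `Int64`.

```julia
# Largest Int8: sign bit 0 followed by seven ones.
max_bits = bitstring(typemax(Int8))   # "01111111"

# Smallest Int8: sign bit 1 followed by seven zeros.
min_bits = bitstring(typemin(Int8))   # "10000000"

# The values agree with 2^(N-1) - 1 and -2^(N-1) for N = 8.
max_ok = (typemax(Int8) == 2^7 - 1)
min_ok = (typemin(Int8) == -2^7)
```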

In [48]:
typemax(Int64)

9223372036854775807

In [49]:
2^63 - 1

9223372036854775807

In [50]:
sum([2^(i) for i in 63:-1:0])

-1

In [51]:
2^64 == 2^65

true
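
The results above follow from the fact that Julia's fixed-size integer arithmetic wraps around silently (it is arithmetic modulo $2^{64}$ for `Int64`): `2^63` wraps to `typemin(Int64)`, and `2^64` and `2^65` both wrap to `0`. The sketch below illustrates this, and also shows `Base.checked_add`, which throws an `OverflowError` instead of wrapping. The `try`/`catch` wrapper is only there to record the outcome for illustration.

```julia
# Going one past the maximum wraps around to the minimum.
wraps = (typemax(Int64) + 1 == typemin(Int64))

# 2^63 does not fit in an Int64, so it wraps to typemin; 2^64 wraps to 0.
pow63_is_min = (2^63 == typemin(Int64))
pow64_is_zero = (2^64 == 0)

# Checked arithmetic raises an error instead of wrapping silently.
overflow_detected = try
    Base.checked_add(typemax(Int64), 1)
    false
catch err
    err isa OverflowError
end
```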

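#### Solutions to integer overflow: BigInt or Float
- `BigInt`: arbitrary-precision integers that never overflow. <mark>Slower than regular integers, but more precise than floats.</mark>
- `Float64`: floating point covers a much larger range at machine speed, but with limited precision.
- `Int128`, `Int64`, and `Int32` run at similar speeds, while `BigInt` takes considerably longer and allocates memory, as the benchmarks below show.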
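
To make the trade-off concrete, here is a minimal sketch comparing the three options on values near $2^{64}$. The variable names are illustrative only, and `widen` is shown as one further option that the list above does not mention.

```julia
# Float64 has a huge range but only 53 bits of precision:
# adding 1 to 2^64 is lost to rounding.
f = 2.0^64
float_lost_one = (f + 1 == f)

# BigInt is exact at any size, at the cost of heap allocation.
b = big(2)^64
big_kept_one = (b + 1 - b == 1)

# widen() moves to the next larger fixed-size type (Int64 -> Int128),
# which avoids this particular overflow while staying a fixed-size integer.
w = widen(typemax(Int64)) + 1
```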

In [52]:
big(2) ^ 64

18446744073709551616

In [53]:
x = rand(Int32);
y = rand(Int32);

In [54]:
@btime $(BigInt(y)) * $(BigInt(x))

  62.760 ns (3 allocations: 48 bytes)


2425388779603638717

In [55]:
@btime $(Int64(y)) * $(Int64(x))

  2.392 ns (0 allocations: 0 bytes)


2425388779603638717

In [56]:
@btime $(Int128(y)) * $(Int128(x))

  2.744 ns (0 allocations: 0 bytes)


2425388779603638717

In [57]:
@btime $(Int32(y)) * $(Int32(x))

  2.390 ns (0 allocations: 0 bytes)


-1434568259

#### Floating point
A `Float64` occupies 64 bits, which are split into three fields:
1. The first bit is the sign: the number is positive if it is zero, and negative if it is one.
2. The next 11 bits are the exponent field $n$, which is interpreted as $2^{n-1023}$. For `2.5`, whose bits are shown below, this field is 10000000000 in binary, or 1,024 in decimal. Thus, the value of the exponent is $2^{1024-1023}$, which is $2^{1}$, or $2$.
3. The last 52 bits are known as the significand. This set of bits is interpreted as a binary fraction $1.b_{1} b_{2} \cdots b_{52}$, which represents the real number:
$$
1 + \sum_{i=1}^{52}b_{i} \times 2^{-i}
$$
- Putting the three parts together:
$$
(-1)^{\text{sign}} \times 2^{n - 1023} \times \left(1 + \sum_{i=1}^{52}b_{i} \times 2^{-i}\right)
$$
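
The formula can also be evaluated with bit operations rather than string parsing. The following is a sketch under the assumption that the input is a normal `Float64` (not zero, subnormal, `Inf`, or `NaN`); `decode_float64` is a hypothetical helper name, not something from Julia's standard library.

```julia
# Hypothetical helper: split a normal Float64 into its three fields
# and rebuild the value from the formula above.
function decode_float64(x::Float64)
    bits = reinterpret(UInt64, x)           # raw 64-bit pattern
    sign = Int(bits >> 63)                  # 1 sign bit
    n = Int((bits >> 52) & 0x7ff)           # 11 exponent bits
    frac = bits & 0x000fffffffffffff        # 52 significand bits
    significand = 1 + frac / 2.0^52         # 1.b1 b2 ... b52
    value = (-1.0)^sign * 2.0^(n - 1023) * significand
    return (sign = sign, exponent = n, significand = significand, value = value)
end

d = decode_float64(2.5)
# d.sign == 0, d.exponent == 1024, d.significand == 1.25, d.value == 2.5
```

For `2.5` the significand bits are `0100...0`, so the fraction is $1 + 2^{-2} = 1.25$, and $2^{1} \times 1.25 = 2.5$.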

In [58]:
bitstring(2.5)

"0100000000000100000000000000000000000000000000000000000000000000"

In [59]:
bitstring(-2.5)

"1100000000000100000000000000000000000000000000000000000000000000"

In [219]:
function floatcalculation(x::String)
    # Split the 64-character bit string into its three fields.
    sign_value = parse(Int64, x[1])
    exponent_value = parse.(Int64, collect(x[2:12]))
    significand_value = parse.(Int64, collect(x[13:end]))

    signn = (-1)^sign_value
    # Use a floating-point base: an Int base overflows for large exponents
    # and throws a DomainError for negative ones (values below 1.0).
    exponent = 2.0^(sum(exponent_value .* [2^(i-1) for i in 11:-1:1]) - 1023)
    signif = 1 + sum(significand_value .* [1/2^i for i in 1:length(significand_value)])

    return signn * exponent * signif
end

floatcalculation (generic function with 1 method)

In [235]:
t = bitstring(-2.0897)
floatcalculation(t)

-2.0897

In [189]:
function floatbits(x::Float64)
    b = bitstring(x)
    b[1:1] * "|" * b[2:12] * "|" * b[13:end]
end

floatbits (generic function with 1 method)

In [67]:
floatbits(5.0)

"0|10000000001|0100000000000000000000000000000000000000000000000000"

In [69]:
2^10

1024

In [70]:
2^(2^10 + 1 - (2^10 - 1))

4

#### Floating point accuracy


In [236]:
0.1 > 1//10

true

In [238]:
Rational(1,10)

1//10

In [241]:
float(big(Rational(.1)))

0.1000000000000000055511151231257827021181583404541015625
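
The cells above show that the literal `0.1` is stored as the nearest representable `Float64`, which is slightly larger than the rational `1//10`. A short sketch of the practical consequences follows; the choice of `0.1 + 0.2` as the example is mine, not from the original notebook.

```julia
# Rounding errors accumulate, so exact equality on floats is fragile.
a = 0.1 + 0.2
exactly_equal = (a == 0.3)          # false

# isapprox (also written ≈) compares within a relative tolerance.
approx_equal = isapprox(a, 0.3)     # true

# eps(1.0) is the gap between 1.0 and the next larger Float64, i.e. 2^-52.
gap = eps(1.0)
```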

### Trading performance for accuracy

### Subnormal numbers