# Integers and Floating-Point numbers

Julia takes care of most of this in the background, but there are situations where it helps to understand how numbers are stored. This matters most when an operation mixes types of different size or 'precision'.

You can think of 1 as an integer 'literal' and 1.0 as a floating-point 'literal'.

Floating-point numbers (floats) are stored in a form close to 'scientific notation'. A floating-point number has a significand, such as 3.75, and an exponent that scales it by a power of the base. In base 10, 0.375 is $3.75 \times 10^{-1}$. On the computer the base is 2 rather than 10.

$value = significand \times base^{exponent}$

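To convert 0.375 to binary, repeatedly multiply the fractional part by 2 and record the integer part: 0.375 x 2 = 0.75 (integer part is 0), 0.75 x 2 = 1.5 (integer part is 1), 0.5 x 2 = 1.0 (integer part is 1). Reading off the digits gives $0.011_2$, so 0.375 becomes $1.1_2 \times 2^{-2}$.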
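The conversion above can be checked directly. The following is a minimal sketch: 'significand' and 'exponent' are Base functions, while 'fraction_bits' is a hypothetical helper written only for this illustration.

```julia
# Decompose 0.375 into significand × 2^exponent (base 2)
x = 0.375
s = significand(x)   # 1.5, which is 1.1 in binary
e = exponent(x)      # -2
println(s, " x 2^", e, " = ", s * 2.0^e)

# Hypothetical helper: binary digits after the point, by repeated doubling
function fraction_bits(frac, n)
    bits = Int[]
    for _ in 1:n
        frac *= 2
        d = floor(Int, frac)   # the integer part is the next binary digit
        push!(bits, d)
        frac -= d              # keep only the fractional part
    end
    return bits
end

println(fraction_bits(0.375, 3))   # digits 0, 1, 1, i.e. 0.011 in binary
```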

Integers do not have this format; they store the raw binary number, and come in signed and unsigned forms. An 'unsigned' integer can only take non-negative values, while a 'signed' integer can take positive and negative values. For the same number of bits, an unsigned integer can therefore store larger numbers. In a signed 8-bit integer, the first bit from the left indicates the sign of the number and the remaining 7 bits carry the value (negative numbers are stored in two's complement).
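As a small sketch of the signed and unsigned interpretations, 'bitstring' shows the stored bits and 'reinterpret' reads the same bits as another type:

```julia
# The same 8 bits, read as unsigned or as signed
println(bitstring(UInt8(5)))       # 00000101
println(bitstring(Int8(5)))        # 00000101
println(bitstring(Int8(-5)))       # 11111011 (leftmost bit set: negative)

# Largest value each 8-bit type can hold
println(Int(typemax(Int8)))        # 127
println(Int(typemax(UInt8)))       # 255

# Reading the bit pattern 0xfb (11111011) as a signed number
println(reinterpret(Int8, 0xfb))   # -5
```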


### Integer Types

Signed

- Int8
- Int16
- Int32
- Int64
- Int128

Unsigned

- UInt8
- UInt16
- UInt32
- UInt64
- UInt128

UInt128 means that 128 bits are used to store the number, giving the range from $0$ to $2^{128} - 1$. Int128 also uses 128 bits, but the first bit says whether the number is positive or negative, giving the range from $-2^{127}$ to $2^{127} - 1$.
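The ranges can be listed for several of these types with 'typemin' and 'typemax'. This is only a sketch; the values are converted to BigInt so that they all print in decimal.

```julia
# Print the bit width and range of a few integer types
for T in (Int8, UInt8, Int16, UInt16, Int128, UInt128)
    nbits = 8 * sizeof(T)          # sizeof gives bytes
    println(rpad(string(T), 8), nbits, " bits: ",
            BigInt(typemin(T)), " to ", BigInt(typemax(T)))
end

# The formulas from the text, checked with arbitrary-precision integers
println(typemax(UInt128) == big(2)^128 - 1)   # true
println(typemin(Int128) == -(big(2)^127))     # true
println(typemax(Int128) == big(2)^127 - 1)    # true
```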

### Floats

- Float16 (half precision)
- Float32 (single precision)
- Float64 (double precision)

Float32 has 4 bytes (32 bits) and has 3 parts: the sign (1 bit), the exponent (8 bits) and the significand, or mantissa (23 bits). The sign bit is '0' for a positive number and '1' for a negative number.

*(each 'byte' has 8 bits)*
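The three parts can be read off with 'bitstring'. The sketch below assumes the usual IEEE 754 single-precision layout (1 sign bit, 8 exponent bits stored with a bias of 127, 23 fraction bits); the variable names are only for illustration.

```julia
# Split the 32 bits of a Float32 into its three fields
bits = bitstring(0.375f0)          # 32 characters, one per bit

sign_field     = bits[1:1]         # 1 bit
exponent_field = bits[2:9]         # 8 bits, stored with a bias of 127
fraction_field = bits[10:32]       # 23 bits, the leading 1 is implicit

println(sign_field)                # 0 (positive)
println(exponent_field)            # 01111101
println(fraction_field)            # 10000000000000000000000

# Remove the bias to recover the exponent of 1.1 (binary) x 2^-2
println(parse(Int, exponent_field, base=2) - 127)   # -2
```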

In [1]:
# integer literals are Int64 on a 64-bit system and Int32 on a 32-bit system
a = 1
print( typeof(a) )

Int64

In [2]:
# floating-point literals are Float64 regardless of the word size
b = 1.0
print( typeof(b) )

Float64

In [3]:
# get the 'word size' number of bits 
Sys.WORD_SIZE

64

In [4]:
Int

Int64

In [5]:
UInt

UInt64

In [6]:
UInt16

UInt16

## Hexadecimal (hex) is a format to efficiently represent binary data

- 11011010 is 'DA'
- 1010 is 'A'

Instead of working with binary we can work with 'hex', which is a small step above raw binary. Each hex digit stands for 4 bits, so 32 bits is 8 hex digits.
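The correspondence between the two notations can be sketched with 'string' and its 'base' and 'pad' keywords:

```julia
# Binary to hex and back: each hex digit covers 4 bits
println(string(0b11011010, base=16))      # da
println(string(0xda, base=2))             # 11011010
println(string(0x0a, base=2, pad=4))      # 1010

# Both literals denote the same value
println(0xda == 0b11011010)               # true
```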

Julia allows for hexadecimal notation directly.

In [7]:
hex1 = 0x1

0x01

In [8]:
typeof(hex1)

UInt8

In [11]:
#convert to Integer representation
parse(Int, "0x1")

1

In [12]:
#convert to Integer representation
Int(0x1)

1

In [13]:
#convert to Integer representation, automatic promotion to Int when mixed with an Int
hex1 + 0

1

In [14]:
0x10 + 0

16

In [15]:
0x0f + 0

15

In [16]:
Int(0xaf)

175

In [19]:
# convert a number to hexadecimal (base 16)
string(122, base=16)

"7a"

### similarly, Octal representations exist (base 8)

In [20]:
oct1 = 0o071

0x39

In [21]:
oct1 + 0

57

In [22]:
typeof(oct1)

UInt8

In [25]:
Int(oct1)

57

### binary can be directly represented as well (base 2)

In [23]:
bin1 = 0b0110

0x06

In [24]:
typeof( bin1 )

UInt8

In [26]:
Int(bin1)

6

In [27]:
bin1 + 0

6

In [28]:
bin2 = 0b011100111111110110

0x0001cff6

In [30]:
#a literal with more digits gets a larger type so that all the bits fit
typeof( bin2 )

UInt32

In [33]:
# the leading zeros here count for the storage size
bin3 = 0b00000000000000000000000011100111111110110

0x000000000001cff6

In [34]:
typeof(bin3)

UInt64

Integer literals too large for Int64 become Int128, and literals too large even for Int128 become a BigInt
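A short sketch of how the type of a decimal integer literal grows with its size:

```julia
# The type of a decimal literal depends on how large it is
println(typeof(9223372036854775807))    # Int64  (this is typemax(Int64))
println(typeof(9223372036854775808))    # Int128 (one more than typemax(Int64))

# 2^127 is one more than typemax(Int128), so this literal is a BigInt
println(typeof(170141183460469231731687303715884105728))   # BigInt
```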

In [44]:
#get the min of a Int64
typemin(Int64)

-9223372036854775808

In [53]:
#get the max of a Int64
typemax(Int64)

9223372036854775807

In [58]:
typemax(UInt32)

0xffffffff

In [59]:
Int( typemax(UInt32) )

4294967295

In [60]:
typemin(UInt32)

0x00000000

In [66]:
Int128( typemax(UInt64) )

18446744073709551615

In [67]:
#this causes an error since a signed Int64 cannot hold the max of an unsigned UInt64
Int64( typemax(UInt64) )

LoadError: InexactError: check_top_bit(Int64, 18446744073709551615)

# Overflow

This is important to be aware of:

- If your value exceeds the maximum allowed for its type, then the value **wraps around**
- This resembles 'modular' arithmetic, and it applies to signed and unsigned integers alike
- If the two operands have different types (for example Int32 and Int64), the narrower one is first promoted to the wider type, so the result may fit without overflow

Get the max of Int32 and add 1 to it: the literal 1 is an Int64 (on a 64-bit system), so the result is promoted to Int64 and does not overflow. If the widest default type, Int64, is already in use, then an overflow happens and the result wraps around to a negative number.

In [95]:
max32 = typemax(Int32)

2147483647

In [96]:
typeof(max32)

Int32

In [97]:
max32 + 1

2147483648

In [98]:
typeof(max32 + 1)

Int64

In [99]:
max64 = typemax(Int64)

9223372036854775807

In [100]:
typeof(max64)

Int64

In [101]:
max64 + 1

-9223372036854775808

In [102]:
typeof(max64 + 1)

Int64

In [103]:
max64 + 1 == typemin(Int64)

true
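Unsigned integers wrap in the same way when both operands have the same type. The sketch below also uses 'Base.checked_add', which raises an error instead of wrapping; the variable names are only for illustration.

```julia
# UInt8 + UInt8 stays UInt8, so the result wraps around
wrapped_up   = typemax(UInt8) + 0x01
wrapped_down = typemin(UInt8) - 0x01
println(Int(wrapped_up))      # 0
println(Int(wrapped_down))    # 255

# Mixing with an Int literal promotes first, so nothing wraps here
println(typeof(typemax(UInt8) + 1) == Int)    # true

# Checked arithmetic throws an OverflowError instead of wrapping
caught = try
    Base.checked_add(typemax(Int64), 1)
    false
catch err
    err isa OverflowError
end
println(caught)               # true
```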

### to use really big numbers they have to be declared as 'big'

In [106]:
#becomes negative from overflow causing a wrapping around
b1 = 10^40

-5047021154770878464

In [110]:
#works correctly now as the number operated on is 'big'
big(10)^20

100000000000000000000

In [109]:
#will not work correctly: 10^20 overflows as an Int64 before big() is applied
big(10^20)

7766279631452241920

### E-notation

In [112]:
2.5e-4 == 0.00025

true

In [116]:
-1.23e3 == -1230

true

### Float32 literals can be written in a manner similar to E-notation by using an 'f' instead of an 'e'

In [125]:
f1 = 0.124f0

0.124f0

In [130]:
0.124f-5

1.24f-6
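A brief sketch of what the 'f' changes: the literal becomes a Float32, which carries fewer significant bits than a Float64.

```julia
# 'f' gives a Float32 literal, 'e' (or no exponent) gives a Float64
println(typeof(0.124f0))        # Float32
println(typeof(0.124e0))        # Float64

# Machine epsilon: the gap between 1.0 and the next representable number
println(eps(Float32) == 2f0^-23)    # true
println(eps(Float64) == 2.0^-52)    # true

# 0.1 is rounded differently at each precision, so the two values differ
println(0.1f0 == 0.1)               # false
println(Float32(0.1) == 0.1f0)      # true
```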

## Underscore '_' makes large numbers more readable, or very small ones readable

In [131]:
n1 = 1000000000

1000000000

In [132]:
n1 == 1_000_000_000

true

In [133]:
n2 = 0.0000000001

1.0e-10

In [134]:
n2 == 0.000_000_0001

true

### Underscores can be used in hex and binary too

In [135]:
h1 = 0xbeef_feed_1234

0x0000beeffeed1234

In [137]:
h1 + 1

0x0000beeffeed1235

In [138]:
Int(h1)

209937983410740

In [139]:
b1 = 0b1010_1101

0xad

In [140]:
b1 + 1

174

In [141]:
Int( b1 )

173

### Inf and NaN

In [142]:
NaN

NaN

In [143]:
typeof(NaN)

Float64

In [144]:
Inf

Inf

In [145]:
typeof(Inf)

Float64

In [146]:
Inf == Inf

true

In [148]:
#NaN is not equal to anything, not even to itself
NaN == NaN

false

In [171]:
NaN != NaN

true

In [149]:
Inf16

Inf16

In [151]:
Inf32

Inf32

In [152]:
1/Inf

0.0

In [153]:
1/0 

Inf

In [154]:
1/0 == Inf

true

In [155]:
2e-10 / 0 == Inf

true

In [156]:
1e4 + Inf

Inf

In [157]:
0/0

NaN

In [160]:
isnan(1)

false

In [161]:
isnan(Inf)

false

In [163]:
isnan( 0 / 0 )

true

In [164]:
isnan( 1 / 0 )

false

In [165]:
isnan( Inf / 0 )

false

In [166]:
Inf + Inf

Inf

In [167]:
Inf + NaN

NaN

In [168]:
Inf^2

Inf

In [169]:
0 * Inf

NaN

In [170]:
Inf / Inf

NaN

In [172]:
Inf < Inf + 1

false

In [173]:
Inf < Inf*2

false

## Arbitrary Sizes

These are defined as BigInt and BigFloat. The key point is that once a Big value has participated in an arithmetic operation, the other operands are promoted and the result is a Big type as well.

In [177]:
typemax(Int64) + 10

-9223372036854775799

In [178]:
BigInt(typemax(Int64)) + 10

9223372036854775817

In [179]:
BigFloat(typemax(Int64)) + 10

9.223372036854775817e+18

In [181]:
typeof( BigFloat(typemax(Int64)) + 10 )

BigFloat

In [182]:
typeof( BigFloat(typemax(Int64)) / 10_000_000 )

BigFloat

In [183]:
typeof( BigFloat(typemax(Int64)) / typemax(Int64) )

BigFloat

In [184]:
typeof( BigInt(typemax(Int64)) / typemax(Int64) )

BigFloat

In [185]:
BigInt(typemax(Int64)) / typemax(Int64)

1.0
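As the last two cells show, '/' on a BigInt gives a BigFloat. A minimal sketch of how to stay with integers, using integer division ('÷' or 'div') or a rational ('//'):

```julia
n = BigInt(typemax(Int64)) * 3

println(typeof(n / 3))              # BigFloat
println(typeof(n ÷ 3))              # BigInt
println(typeof(div(n, 3)))          # BigInt
println(n ÷ 3 == typemax(Int64))    # true

# A rational keeps the exact numerator and denominator
println(typeof(n // 7))             # Rational{BigInt}
```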

A Big can be made from a string literal

In [186]:
big1 = big"111111111222222222223333333334444444555555556666667777"

111111111222222222223333333334444444555555556666667777

In [187]:
typeof(big1)

BigInt

In [189]:
big2 = big"11111111122222222222333333333444444455555555666666777.7"

1.111111112222222222233333333344444445555555566666677770000000000000000000000004e+52

In [190]:
typeof(big2)

BigFloat

In [191]:
big3 = parse( BigInt, "111111111222222222223333333334444444555555556666667777")

111111111222222222223333333334444444555555556666667777

In [192]:
typeof( big3 )

BigInt

In [193]:
(big"2")^100

1267650600228229401496703205376

In [194]:
string( (big"2")^100 )

"1267650600228229401496703205376"

In [195]:
string( (big"2")^100, base=16 )

"10000000000000000000000000"

### Literal coefficients

To make formulae read more like written equations, a numeric literal can be placed directly in front of a variable or a parenthesised expression as a coefficient

In [200]:
x = 10
2x^2 - 10(10x-1) + 10

-780

In [201]:
x = 3
10^2x

1000000
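A literal coefficient binds more tightly than most operators, which explains the result of 10^2x above. The comparisons below are a sketch of how these expressions are grouped:

```julia
x = 3

println(2x^2 == 2 * (x^2))      # true: the power applies to x only
println(10^2x == 10^(2x))       # true: the exponent is 2x
println(1/2x == 1/(2x))         # true: the coefficient binds tighter than '/'
println(1/2x == (1/2) * x)      # false
println(-2x == (-2) * x)        # true
```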

## unicode for mathematical operations

In [202]:
4 / 2

2.0

In [203]:
# type \div and then tab after it without a space
4 ÷ 2

2

In [204]:
plus = +

+ (generic function with 206 methods)

In [206]:
plus(1,1)

2

# Vectorized 'dot' operations

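The '.' (dot) goes directly in front of an operator (as in .+) or directly after a function name (as in sin.). When the left operand is a numeric literal, there must be a space before the dot so that it is not read as part of the number: write 1 .+ x, not 1.+x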

In [216]:
# ok
1 + 10

11

In [217]:
# not ok
[1,2,3] + 10

LoadError: MethodError: no method matching +(::Vector{Int64}, ::Int64)
For element-wise addition, use broadcasting with dot syntax: array .+ scalar
Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...) at operators.jl:591
  +(::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:87
  +(::Rational, ::Integer) at rational.jl:313
  ...

In [218]:
# now it is ok because we use the 'dot' to project to every element
[1,2,3] .+ 10

3-element Vector{Int64}:
 11
 12
 13

In [220]:
#similarly the sin function takes in a single number
sin(pi/2)

1.0

In [221]:
#but it does not take in multiple numbers
sin( [pi/2, pi/4, pi/5, pi/6] )

LoadError: MethodError: no method matching sin(::Vector{Float64})
Closest candidates are:
  sin(::T) where T<:Union{Float32, Float64} at special/trig.jl:29
  sin(::LinearAlgebra.Diagonal) at ~/julia/julia-1.8.2/share/julia/stdlib/v1.8/LinearAlgebra/src/diagonal.jl:674
  sin(::LinearAlgebra.UniformScaling) at ~/julia/julia-1.8.2/share/julia/stdlib/v1.8/LinearAlgebra/src/uniformscaling.jl:173
  ...

In [222]:
#with the 'dot' it is applied to each of the multiple numbers
sin.( [pi/2, pi/4, pi/5, pi/6] )

4-element Vector{Float64}:
 1.0
 0.7071067811865475
 0.5877852522924731
 0.49999999999999994

In [223]:
sin.( [pi/2, pi/4, pi/5, pi/6] ) .^ 2 #square each value individually

4-element Vector{Float64}:
 1.0
 0.4999999999999999
 0.3454915028125263
 0.24999999999999994

In [225]:
# we can use the '@.' macro to do the same thing, it puts a dot on every call and operator in the expression
@. sin( [pi/2, pi/4, pi/5, pi/6] )

4-element Vector{Float64}:
 1.0
 0.7071067811865475
 0.5877852522924731
 0.49999999999999994

In [227]:
@. sin( [pi/2, pi/4, pi/5, pi/6] ) ^ 2 #square each value individually

4-element Vector{Float64}:
 1.0
 0.4999999999999999
 0.3454915028125263
 0.24999999999999994
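The dot syntax is not limited to built-in functions. In the sketch below, 'area' is a hypothetical one-line function defined only for this example, and the other names are illustrative as well.

```julia
# Hypothetical scalar function: area of a circle with radius r
area(r) = pi * r^2

radii = [1.0, 2.0, 3.0]

# Broadcast a user-defined function over a vector
areas = area.(radii)
println(areas ≈ [pi, 4pi, 9pi])           # true

# '@.' dots every operator in the expression
scaled = @. 2 * radii + 1
println(scaled == [3.0, 5.0, 7.0])        # true

# Two arrays of the same shape are combined element by element
products = [1, 2, 3] .* [10, 20, 30]
println(products == [10, 40, 90])         # true
```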

## Equality on multiple variables 

In [234]:
10 == 10

true

In [236]:
# false
10 == [10 10]

false

In [238]:
# works
10 .== [10 10]

1×2 BitMatrix:
 1  1

In [242]:
# works
mat1 = 10 .* ones(2,2)
10 .== ( mat1 )

2×2 BitMatrix:
 1  1
 1  1

In [243]:
# works
mat1 = 10 .* ones(2,2)
mat1[2,1] = 20
10 .== ( mat1 )

2×2 BitMatrix:
 1  1
 0  1

We can also compare whole data structures with each other

In [244]:
mat1 = 10 .* ones(2,2)
mat2 = 10 .* ones(2,2)
mat1 == mat2

true

In [245]:
mat1 = 10 .* ones(2,2)
mat2 = 10 .* ones(2,2)
mat1 .== mat2

2×2 BitMatrix:
 1  1
 1  1

In [247]:
isequal( mat1 , mat2 )

true

In [248]:
isequal.( mat1 , mat2 )

2×2 BitMatrix:
 1  1
 1  1

In [249]:
mat1 = 10 .* ones(2,2)
mat2 = 10 .* ones(2,2)
mat2[2,1] = 20
mat1 .== mat2

2×2 BitMatrix:
 1  1
 0  1

In [250]:
isequal.( mat1 , mat2 )

2×2 BitMatrix:
 1  1
 0  1

In [251]:
isequal( mat1 , mat2 )

false

In [253]:
isequal( ["a","b","c"] , ["a","z","c"] )

false

In [254]:
isequal.( ["a","b","c"] , ["a","z","c"] )

3-element BitVector:
 1
 0
 1

In [255]:
["a","b","c"] == ["a","z","c"]

false

In [256]:
["a","b","c"] .== ["a","z","c"]

3-element BitVector:
 1
 0
 1

In [260]:
d1 = Dict("a"=>1,"b"=>2,"c"=>3)
d2 = Dict("a"=>1,"b"=>2,"c"=>3)

d1 == d2

true

In [262]:
d1["a"] = 11
d1 == d2

false

In [264]:
keys(d1) == keys(d2)

true

In [267]:
values(d1) == values(d2)

false

In [268]:
values(d1) .== values(d2)

3-element BitVector:
 1
 1
 0

In [270]:
struct MyStruct
    x
    y
end

In [272]:
ms1 = MyStruct(1,2)
ms2 = MyStruct(1,2)

ms1 == ms2

true

In [273]:
ms1 = MyStruct(1,2)
ms2 = MyStruct(2,2)

ms1 == ms2

false
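The cells above give the same answers for '==' and 'isequal', but the two differ for special floating-point values. The sketch below also shows a mutable struct, for which the default '==' compares identity; 'MyMutable' is a hypothetical type defined only for this example.

```julia
# '==' follows floating-point rules, 'isequal' treats NaN as equal to itself
println(NaN == NaN)                        # false
println(isequal(NaN, NaN))                 # true
println(-0.0 == 0.0)                       # true
println(isequal(-0.0, 0.0))                # false
println([1, NaN] == [1, NaN])              # false
println(isequal([1, NaN], [1, NaN]))       # true

# Hypothetical mutable type with the same fields as MyStruct
mutable struct MyMutable
    x
    y
end

m1 = MyMutable(1, 2)
m2 = MyMutable(1, 2)
before = (m1 == m2)        # false: two separate mutable objects
println(before)

# Define '==' field by field to compare the contents instead
Base.:(==)(a::MyMutable, b::MyMutable) = a.x == b.x && a.y == b.y
after = (m1 == m2)         # true
println(after)
```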

## useful math functions

sin    cos    tan    cot    sec    csc
sinh   cosh   tanh   coth   sech   csch
asin   acos   atan   acot   asec   acsc
asinh  acosh  atanh  acoth  asech  acsch
sinc   cosc

In [282]:
tan(pi/4)

0.9999999999999999

In [283]:
tanh(pi/4)

0.6557942026326724

In [285]:
#natural logarithm (base e)
log(2.7)

0.9932517730102834

In [286]:
#logarithm with base 3 (the first argument is the base)
log(3,9)

2.0

In [287]:
#natural exponential function
exp(1)

2.718281828459045

In [288]:
# square root
sqrt(144)

12.0

In [289]:
# cube root
cbrt(64)

4.0

# Complex numbers

In [290]:
c1 = 1 + 2im

1 + 2im

In [291]:
c1 + 2

3 + 2im

In [292]:
c1 * 2 

2 + 4im

In [293]:
c1^2

-3 + 4im

In [294]:
c1 / ( 2 + 1im )

0.8 + 0.6im

In [295]:
c1^5.2

56.78660037547435 - 32.96873078937793im

In [296]:
real(c1)

1

In [297]:
imag(c1)

2

In [298]:
conj(c1)

1 - 2im

In [299]:
conj(c1) * c1

5 + 0im

In [301]:
abs(c1)

2.23606797749979

In [302]:
abs2(c1)

5

In [305]:
angle(c1) #radians

1.1071487177940904
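Together, 'abs' and 'angle' give the polar form of a complex number. As a sketch, the number can be rebuilt from them with 'cis', which computes cos(theta) + im*sin(theta):

```julia
c1 = 1 + 2im

r     = abs(c1)        # modulus
theta = angle(c1)      # argument in radians

# Rebuild the number from its polar form r * (cos(theta) + im*sin(theta))
rebuilt = r * cis(theta)
println(rebuilt ≈ c1)                            # true

# abs2 is the squared modulus, equal to conj(c1) * c1
println(abs2(c1) == real(conj(c1) * c1))         # true

# The angle is atan(imaginary part, real part)
println(theta ≈ atan(imag(c1), real(c1)))        # true
```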

In [306]:
sqrt(-1 + 0im)

0.0 + 1.0im

In [307]:
complex(10,20)

10 + 20im

In [309]:
2//7 + c1

9//7 + 2//1*im

In [310]:
real(2//7 + c1)

9//7

In [311]:
numerator( real(2//7 + c1) )

9

In [312]:
denominator( real(2//7 + c1) )

7