# Integers and Floating-Point numbers

Although much of this is taken care of in the background, there are situations where it helps to understand how numbers are stored. This matters most when performing operations on values with different amounts of 'precision'.

You can think of 1 as an integer 'literal' and 1.0 as a floating-point 'literal'.

Floating-point numbers (floats) are close to the 'scientific notation' of numbers. A floating-point number has a significand, like 3.75, and an exponent that scales it by a power of the base. In base 10, 0.375 is 3.75 x 10^-1. On the computer the base is 2, not 10.

$value = significand \times base^{exponent}$

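To convert 0.375 to binary, repeatedly multiply the fractional part by 2 and record the integer part: 0.375 x 2 = 0.75 (integer part is 0), 0.75 x 2 = 1.5 (integer part is 1), 0.5 x 2 = 1.0 (integer part is 1). Reading the integer parts in order gives the binary fraction 0.011, so 0.375 becomes $1.1_2 \times 2^{-2}$.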
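The repeated-doubling conversion can be sketched in code. This is a minimal illustration under the assumption that the input is a fraction between 0 and 1; the helper name `fraction_bits` is made up for this example, while `significand` and `exponent` are standard Julia functions.

```julia
# Hypothetical helper: first `nbits` binary digits of a fraction 0 <= x < 1,
# found by repeatedly doubling and recording the integer part.
function fraction_bits(x::Float64, nbits::Int)
    bits = Char[]
    for _ in 1:nbits
        x *= 2
        if x >= 1
            push!(bits, '1')
            x -= 1
        else
            push!(bits, '0')
        end
    end
    return String(bits)
end

println(fraction_bits(0.375, 3))   # 011, i.e. 0.375 is 0.011 in binary

# Julia can report the normalized form directly: 0.375 = 1.5 * 2^-2,
# and 1.5 in decimal is 1.1 in binary.
println(significand(0.375))        # 1.5
println(exponent(0.375))           # -2
```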

Integers do not have this format; they store the raw number, and come in signed and unsigned forms. An 'unsigned' integer can only take non-negative values, while a 'signed' integer can take positive and negative values. For the same number of bits, an unsigned integer can store larger numbers. If a signed number is stored in 8 bits, the first bit from the left indicates the sign of the number and the remaining bits encode the value.


### Integer Types

Signed

- Int8
- Int16
- Int32
- Int64
- Int128

Unsigned

- UInt8
- UInt16
- UInt32
- UInt64
- UInt128

UInt128 means that 128 bits are used to store a number in the range $0$ to $2^{128} - 1$. Int128 uses the first bit to say whether the number is positive or negative and the remaining 127 bits for the number, giving the range $-2^{128-1}$ to $2^{128-1} - 1$.
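The ranges can be checked directly with `typemin` and `typemax`. The sketch below uses only standard functions; the choice of `Int8(-5)` as the sample value is arbitrary.

```julia
# Range of each fixed-size integer type, printed in decimal.
for T in (Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64)
    println(rpad(string(T), 7), Int128(typemin(T)), " to ", Int128(typemax(T)))
end

# The 128-bit limits match the formulas, checked here with BigInt arithmetic.
println(typemax(UInt128) == big(2)^128 - 1)   # true
println(typemin(Int128) == -big(2)^127)       # true
println(typemax(Int128) == big(2)^127 - 1)    # true

# The same 8 bits read as signed or unsigned (Julia stores negative
# integers in two's complement form).
println(bitstring(Int8(5)))                   # 00000101
println(bitstring(Int8(-5)))                  # 11111011
println(Int(reinterpret(UInt8, Int8(-5))))    # 251
```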

### Floats

- Float16 (half precision)
- Float32 (single precision)
- Float64 (double precision)

Float32 has 4 bytes (32 bits) and has 3 parts: the sign (1 bit), the exponent (8 bits) and the significand or mantissa (23 bits). The sign bit is '0' for a positive number and '1' for a negative number.

*(each 'byte' has 8 bits)*
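The three parts can be seen by slicing the output of `bitstring`. This is a small sketch; the variable names are illustrative, and the bias of 127 is the standard IEEE 754 value for single precision.

```julia
# Split the 32 bits of a Float32 into sign (1), exponent (8) and significand (23).
bits = bitstring(0.375f0)
sign_bit  = bits[1:1]
exp_bits  = bits[2:9]
frac_bits = bits[10:32]
println(sign_bit, " ", exp_bits, " ", frac_bits)

# The stored exponent is biased by 127 for Float32: 125 - 127 = -2.
println(parse(Int, exp_bits, base=2) - 127)   # -2

# The leading 1 of the significand is implicit, so only the ".1" of binary 1.1 is stored.
println(frac_bits[1])                         # 1
```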

In [1]:
# if you are on a 64 bit computer it will be 64 and 32 likewise
a = 1
print( typeof(a) )

Int64

In [2]:
# floating-point literals are Float64 regardless of the machine word size
b = 1.0
print( typeof(b) )

Float64

In [3]:
# get the 'word size' number of bits 
Sys.WORD_SIZE

64

In [4]:
Int

Int64

In [5]:
UInt

UInt64

In [6]:
UInt16

UInt16

## Hexadecimal (hex) is a format to efficiently represent binary data

- 11011010 is 'DA'
- 1010 is 'A'

Instead of working with binary we can work with 'hex'. It is like a small step above machine binary code. Each hex digit stands for 4 bits, so 32 bits is 8 hex digits.

Julia allows for hexadecimal notation directly.
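A quick way to see the 4-bits-per-digit relationship is to print the same value in both bases. The sketch uses `bitstring` and `string` with the `base` and `pad` keywords.

```julia
# Each hex digit corresponds to exactly 4 bits.
println(bitstring(0xda))                            # 11011010
println(string(0b11011010, base=16))                # da
println(string(0xa, base=2))                        # 1010

# Pad to a fixed width when printing: 8 hex digits cover 32 bits.
println(string(typemax(UInt32), base=16, pad=8))    # ffffffff
println(string(255, base=16, pad=8))                # 000000ff
```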

In [7]:
hex1 = 0x1

0x01

In [8]:
typeof(hex1)

UInt8

In [9]:
#convert to Integer representation
parse(Int, "0x1")

1

In [10]:
#convert to Integer representation
Int(0x1)

1

In [11]:
#convert to Integer representation, automatic casting
hex1 + 0

1

In [12]:
0x10 + 0

16

In [13]:
0x0f + 0

15

In [14]:
Int(0xaf)

175

In [15]:
# convert a number to hexadecimal (base 16)
string(122, base=16)

"7a"

### similarly, Octal representations exist (base 8)

In [16]:
oct1 = 0o071

0x39

In [17]:
oct1 + 0

57

In [18]:
typeof(oct1)

UInt8

In [19]:
Int(oct1)

57

### binary can be directly represented as well (base 2)

In [20]:
bin1 = 0b0110

0x06

In [21]:
typeof( bin1 )

UInt8

In [22]:
Int(bin1)

6

In [23]:
bin1 + 0

6

In [24]:
bin2 = 0b011100111111110110

0x0001cff6

In [25]:
# a larger type is chosen when the literal has more digits than a smaller type can hold
typeof( bin2 )

UInt32

In [26]:
# the leading zeros here count for the storage size
bin3 = 0b00000000000000000000000011100111111110110

0x000000000001cff6

In [27]:
typeof(bin3)

UInt64

Integer literals too large for Int64 are stored as Int128, and literals too large even for Int128 become a BigInt.
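The literal sizes can be confirmed with `typeof`. The three literals here are `typemax(Int64)`, one more than that, and one more than `typemax(Int128)`.

```julia
println(typeof(9223372036854775807))                       # Int64
println(typeof(9223372036854775808))                       # Int128
println(typeof(170141183460469231731687303715884105728))   # BigInt
```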

In [28]:
#get the min of a Int64
typemin(Int64)

-9223372036854775808

In [29]:
#get the max of a Int64
typemax(Int64)

9223372036854775807

In [30]:
typemax(UInt32)

0xffffffff

In [31]:
Int( typemax(UInt32) )

4294967295

In [32]:
typemin(UInt32)

0x00000000

In [33]:
Int128( typemax(UInt64) )

18446744073709551615

In [34]:
#this causes an error since the Int64 signed cannot hold the max of an unsigned
Int64( typemax(UInt64) )

LoadError: InexactError: check_top_bit(Int64, 18446744073709551615)

# Overflow

This is important to be aware of:

- If a value exceeds the maximum allowed by its type, then the value **wraps around**, for both signed and unsigned numbers
- This resembles 'modular' arithmetic
- Mixing a smaller integer type with a larger one promotes the result to the larger type, which can hide the overflow

Get the max of Int32 and add 1 to it: the literal 1 is an Int64 on a 64-bit machine, so the result is promoted to Int64 and nothing overflows. If the default type Int64 is used, there is no larger type to promote to, so an overflow happens and the result becomes a negative number from wrapping around.

In [35]:
max32 = typemax(Int32)

2147483647

In [36]:
typeof(max32)

Int32

In [37]:
max32 + 1

2147483648

In [38]:
typeof(max32 + 1)

Int64

In [39]:
max64 = typemax(Int64)

9223372036854775807

In [40]:
typeof(max64)

Int64

In [41]:
max64 + 1

-9223372036854775808

In [42]:
typeof(max64 + 1)

Int64

In [43]:
max64 + 1 == typemin(Int64)

true
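The difference between wrapping and promotion is easier to see when both operands have the same type. The sketch below also shows two standard alternatives to silent wrap-around, `Base.checked_add` and `widen`; treat it as an illustration, not a full guide to safe arithmetic.

```julia
# Same-type arithmetic wraps around, for unsigned and signed types alike.
println(typemax(UInt8) + 0x01 == 0x00)                  # true
println(typemax(Int32) + Int32(1) == typemin(Int32))    # true

# Mixing with the Int64 literal 1 promotes first, so nothing overflows.
println(typeof(typemax(UInt8) + 1))                     # Int64 on a 64-bit machine

# Checked arithmetic raises an error instead of wrapping.
try
    Base.checked_add(typemax(Int64), 1)
catch err
    println(typeof(err))                                # OverflowError
end

# widen moves to the next larger type before the operation.
println(widen(typemax(Int64)) + 1)                      # 9223372036854775808
```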

### to use really big numbers they have to be declared as 'big'

In [44]:
#becomes negative from overflow causing a wrapping around
b1 = 10^40

-5047021154770878464

In [45]:
#works correctly now as the number operated on is 'big'
big(10)^20

100000000000000000000

In [46]:
#will not work correctly as it needs to be a 'big' number to start off with
big(10^20)

7766279631452241920

### E-notation

In [47]:
2.5e-4 == 0.00025

true

In [48]:
-1.23e3 == -1230

true

### Float32 values can be written in a manner similar to E-notation by using an 'f' instead of an 'e'

In [49]:
f1 = 0.124f0

0.124f0

In [50]:
0.124f-5

1.24f-6
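The 'f' form matters because a Float32 carries fewer significant bits than a Float64. A short sketch of the difference, using only standard functions:

```julia
println(typeof(1.5f0))            # Float32
println(typeof(1.5e0))            # Float64

# 0.1 cannot be stored exactly, and the two types round it differently.
println(0.1f0 == 0.1)             # false
println(Float64(0.1f0))           # 0.10000000149011612
println(Float32(0.1) == 0.1f0)    # true

# eps gives the gap between 1.0 and the next representable value.
println(eps(Float32) == 2.0f0^-23)   # true
println(eps(Float64) == 2.0^-52)     # true
```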

## Underscore '_' makes large numbers more readable, or very small ones readable

In [51]:
n1 = 1000000000

1000000000

In [52]:
n1 == 1_000_000_000

true

In [53]:
n2 = 0.0000000001

1.0e-10

In [54]:
n2 == 0.000_000_0001

true

### Underscores can be used in hex and binary too

In [55]:
h1 = 0xbeef_feed_1234

0x0000beeffeed1234

In [56]:
h1 + 1

0x0000beeffeed1235

In [57]:
Int(h1)

209937983410740

In [58]:
b1 = 0b1010_1101

0xad

In [59]:
b1 + 1

174

In [60]:
Int( b1 )

173

## converting between numerical representations

In [61]:
hex_str1 = string( 1234 , base=16 )

"4d2"

In [62]:
oct_str1 = string( 1234 , base=8 )

"2322"

In [63]:
binary_str1 = string( 1234 , base=2 )

"10011010010"

In [64]:
int1 = parse( Int , hex_str1 , base=16 )

1234

In [65]:
int2 = parse( Int , oct_str1 , base=8 )

1234

In [66]:
int3 = parse( Int , binary_str1 , base=2 )

1234

### Inf and NaN

In [67]:
NaN

NaN

In [68]:
typeof(NaN)

Float64

In [69]:
Inf

Inf

In [70]:
typeof(Inf)

Float64

In [71]:
Inf == Inf

true

In [72]:
# NaN is not equal to anything, not even to itself
NaN == NaN

false

In [73]:
NaN != NaN

true

In [74]:
Inf16

Inf16

In [75]:
Inf32

Inf32

In [76]:
1/Inf

0.0

In [77]:
1/0 

Inf

In [78]:
1/0 == Inf

true

In [79]:
2e-10 / 0 == Inf

true

In [80]:
1e4 + Inf

Inf

In [81]:
0/0

NaN

In [82]:
isnan(1)

false

In [83]:
isnan(Inf)

false

In [84]:
isnan( 0 / 0 )

true

In [85]:
isnan( 1 / 0 )

false

In [86]:
isnan( Inf / 0 )

false

In [87]:
Inf + Inf

Inf

In [88]:
Inf + NaN

NaN

In [89]:
Inf^2

Inf

In [90]:
0 * Inf

NaN

In [91]:
Inf / Inf

NaN

In [92]:
Inf < Inf + 1

false

In [93]:
Inf < Inf*2

false

## Arbitrary Sizes

These are defined as BigInt and BigFloat. The key point is that once a Big value has participated in arithmetic, the other operands are converted and the result is also a Big type.

In [94]:
typemax(Int64) + 10

-9223372036854775799

In [95]:
BigInt(typemax(Int64)) + 10

9223372036854775817

In [96]:
BigFloat(typemax(Int64)) + 10

9.223372036854775817e+18

In [97]:
typeof( BigFloat(typemax(Int64)) + 10 )

BigFloat

In [98]:
typeof( BigFloat(typemax(Int64)) / 10_000_000 )

BigFloat

In [99]:
typeof( BigFloat(typemax(Int64)) / typemax(Int64) )

BigFloat

In [100]:
typeof( BigInt(typemax(Int64)) / typemax(Int64) )

BigFloat

In [101]:
BigInt(typemax(Int64)) / typemax(Int64)

1.0
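The conversion rule can be summarized with a few `typeof` checks. Note that `/` always produces a floating-point result, so integer division `÷` is the operator to use when an exact BigInt is wanted.

```julia
println(typeof(big(1) + 1))      # BigInt
println(typeof(big(1) + 1.5))    # BigFloat
println(typeof(big(7) / 2))      # BigFloat
println(typeof(big(7) ÷ 2))      # BigInt
println(big(7) ÷ 2)              # 3

# factorial(25) does not fit in an Int64, but the BigInt version is exact.
println(factorial(big(25)))      # 15511210043330985984000000
```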

a Big can be made from a string literal

In [102]:
big1 = big"111111111222222222223333333334444444555555556666667777"

111111111222222222223333333334444444555555556666667777

In [103]:
typeof(big1)

BigInt

In [104]:
big2 = big"11111111122222222222333333333444444455555555666666777.7"

1.111111112222222222233333333344444445555555566666677770000000000000000000000004e+52

In [105]:
typeof(big2)

BigFloat

In [106]:
big3 = parse( BigInt, "111111111222222222223333333334444444555555556666667777")

111111111222222222223333333334444444555555556666667777

In [107]:
typeof( big3 )

BigInt

In [108]:
(big"2")^100

1267650600228229401496703205376

In [109]:
string( (big"2")^100 )

"1267650600228229401496703205376"

In [110]:
string( (big"2")^100, base=16 )

"10000000000000000000000000"

### Literal coefficients

To make formulae easier to write, and closer to how equations look on paper, a numeric literal placed directly before a variable or a parenthesis acts as a coefficient (implied multiplication).

In [111]:
x = 10
2x^2 - 10(10x-1) + 10

-780

In [112]:
x = 3
10^2x

1000000

## unicode for mathematical operations

In [113]:
4 / 2

2.0

In [114]:
# type \div and then tab after it without a space
4 ÷ 2

2

In [115]:
plus = +

+ (generic function with 207 methods)

In [116]:
plus(1,1)

2

# Vectorized 'dot' operations

When the left operand is a numeric literal, put a space before the '.' (dot) so that the dot is not read as part of the number: write `1 .+ x`, not `1.+x`.

In [117]:
# ok
1 + 10

11

In [118]:
# not ok
[1,2,3] + 10

LoadError: MethodError: no method matching +(::Vector{Int64}, ::Int64)
For element-wise addition, use broadcasting with dot syntax: array .+ scalar

Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...)
   @ Base operators.jl:578
  +(::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8}
   @ Base int.jl:87
  +(::T, ::Integer) where T<:AbstractChar
   @ Base char.jl:237
  ...


In [119]:
# now it is ok because the 'dot' broadcasts the operation to every element
[1,2,3] .+ 10

3-element Vector{Int64}:
 11
 12
 13

In [120]:
#similarly the sin function takes in a single number
sin(pi/2)

1.0

In [121]:
#but it does not take in multiple numbers
sin( [pi/2, pi/4, pi/5, pi/6] )

LoadError: MethodError: no method matching sin(::Vector{Float64})

Closest candidates are:
  sin(::T) where T<:Union{Float32, Float64}
   @ Base special/trig.jl:29
  sin(::LinearAlgebra.UniformScaling)
   @ LinearAlgebra ~/julia/julia-1.9.3/share/julia/stdlib/v1.9/LinearAlgebra/src/uniformscaling.jl:173
  sin(::LinearAlgebra.Hermitian{var"#s971", S} where {var"#s971"<:Complex, S<:(AbstractMatrix{<:var"#s971"})})
   @ LinearAlgebra ~/julia/julia-1.9.3/share/julia/stdlib/v1.9/LinearAlgebra/src/symmetric.jl:732
  ...


In [122]:
#with the 'dot' the function is applied to each of the multiple numbers
sin.( [pi/2, pi/4, pi/5, pi/6] )

4-element Vector{Float64}:
 1.0
 0.7071067811865475
 0.5877852522924731
 0.49999999999999994

In [123]:
sin.( [pi/2, pi/4, pi/5, pi/6] ) .^ 2 #square each value individually

4-element Vector{Float64}:
 1.0
 0.4999999999999999
 0.3454915028125263
 0.24999999999999994

In [124]:
# we can use the 'macro' to do the same thing
@. sin( [pi/2, pi/4, pi/5, pi/6] )

4-element Vector{Float64}:
 1.0
 0.7071067811865475
 0.5877852522924731
 0.49999999999999994

In [125]:
@. sin( [pi/2, pi/4, pi/5, pi/6] ) ^ 2 #square each value individually

4-element Vector{Float64}:
 1.0
 0.4999999999999999
 0.3454915028125263
 0.24999999999999994
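The dot syntax is not limited to built-in functions. As a sketch, any scalar function can be broadcast the same way; `area` and `radii` are names invented for this example.

```julia
# A scalar function written without any array handling.
area(r) = pi * r^2

radii = [1.0, 2.0, 3.0]
println(area.(radii))          # one area per radius

# Broadcasting works across several arguments; a scalar is reused for every element.
println(max.(radii, 2.0))      # [2.0, 2.0, 3.0]

# A numeric literal on the left needs a space before the dot.
println(1 .+ radii)            # [2.0, 3.0, 4.0]

# @. adds the dots to every call and operator in the expression.
hyp = @. sqrt(radii^2 + 1)
println(hyp == sqrt.(radii .^ 2 .+ 1))   # true
```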

## Equality on multiple variables 

In [126]:
10 == 10

true

In [127]:
# false
10 == [10 10]

false

In [128]:
# works
10 .== [10 10]

1×2 BitMatrix:
 1  1

In [129]:
# works
mat1 = 10 .* ones(2,2)
10 .== ( mat1 )

2×2 BitMatrix:
 1  1
 1  1

In [130]:
# works
mat1 = 10 .* ones(2,2)
mat1[2,1] = 20
10 .== ( mat1 )

2×2 BitMatrix:
 1  1
 0  1

We can compare whole data structures, either as a single value (`==`, `isequal`) or element by element (`.==`, `isequal.`)

In [131]:
mat1 = 10 .* ones(2,2)
mat2 = 10 .* ones(2,2)
mat1 == mat2

true

In [132]:
mat1 = 10 .* ones(2,2)
mat2 = 10 .* ones(2,2)
mat1 .== mat2

2×2 BitMatrix:
 1  1
 1  1

In [133]:
isequal( mat1 , mat2 )

true

In [134]:
isequal.( mat1 , mat2 )

2×2 BitMatrix:
 1  1
 1  1

In [135]:
mat1 = 10 .* ones(2,2)
mat2 = 10 .* ones(2,2)
mat2[2,1] = 20
mat1 .== mat2

2×2 BitMatrix:
 1  1
 0  1

In [136]:
isequal.( mat1 , mat2 )

2×2 BitMatrix:
 1  1
 0  1

In [137]:
isequal( mat1 , mat2 )

false

In [138]:
isequal( ["a","b","c"] , ["a","z","c"] )

false

In [139]:
isequal.( ["a","b","c"] , ["a","z","c"] )

3-element BitVector:
 1
 0
 1

In [140]:
["a","b","c"] == ["a","z","c"]

false

In [141]:
["a","b","c"] .== ["a","z","c"]

3-element BitVector:
 1
 0
 1

In [142]:
d1 = Dict("a"=>1,"b"=>2,"c"=>3)
d2 = Dict("a"=>1,"b"=>2,"c"=>3)

d1 == d2

true

In [143]:
d1["a"] = 11
d1 == d2

false

In [144]:
keys(d1) == keys(d2)

true

In [145]:
values(d1) == values(d2)

false

In [146]:
values(d1) .== values(d2)

3-element BitVector:
 1
 1
 0

In [147]:
struct MyStruct
    x
    y
end

In [148]:
ms1 = MyStruct(1,2)
ms2 = MyStruct(1,2)

ms1 == ms2

true

In [149]:
ms1 = MyStruct(1,2)
ms2 = MyStruct(2,2)

ms1 == ms2

false
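The comparisons above use `==` and `isequal` interchangeably, but they differ in a few corner cases, and `===` asks a different question altogether (whether two values are the very same object). A minimal sketch of the differences:

```julia
a = [1, 2, 3]
b = [1, 2, 3]
println(a == b)                            # true, same contents
println(a === b)                           # false, two separate arrays in memory
println(a === a)                           # true

# NaN is where == and isequal disagree.
println(NaN == NaN)                        # false
println(isequal(NaN, NaN))                 # true
println([1.0, NaN] == [1.0, NaN])          # false
println(isequal([1.0, NaN], [1.0, NaN]))   # true

# Signed zeros are the opposite case.
println(0.0 == -0.0)                       # true
println(isequal(0.0, -0.0))                # false
```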

## useful math functions

sin    cos    tan    cot    sec    csc
sinh   cosh   tanh   coth   sech   csch
asin   acos   atan   acot   asec   acsc
asinh  acosh  atanh  acoth  asech  acsch
sinc   cosc

In [150]:
tan(pi/4)

0.9999999999999999

In [151]:
tanh(pi/4)

0.6557942026326724

In [152]:
#natural logarithm (base e)
log(2.7)

0.9932517730102834

In [153]:
#logarithm with base 3
log(3,9)

2.0

In [154]:
#natural exponential function
exp(1)

2.718281828459045

In [155]:
# square root
sqrt(144)

12.0

In [156]:
# cube root
cbrt(64)

4.0

# Complex numbers

In [157]:
c1 = 1 + 2im

1 + 2im

In [158]:
c1 + 2

3 + 2im

In [159]:
c1 * 2 

2 + 4im

In [160]:
c1^2

-3 + 4im

In [161]:
c1 / ( 2 + 1im )

0.8 + 0.6im

In [162]:
c1^5.2

56.78660037547435 - 32.96873078937793im

In [163]:
real(c1)

1

In [164]:
imag(c1)

2

In [165]:
conj(c1)

1 - 2im

In [166]:
conj(c1) * c1

5 + 0im

In [167]:
abs(c1)

2.23606797749979

In [168]:
abs2(c1)

5

In [169]:
angle(c1) #radians

1.1071487177940904

In [170]:
sqrt(-1 + 0im)

0.0 + 1.0im

In [171]:
complex(10,20)

10 + 20im

In [172]:
2//7 + c1

9//7 + 2//1*im

In [173]:
real(2//7 + c1)

9//7

In [174]:
numerator( real(2//7 + c1) )

9

In [175]:
denominator( real(2//7 + c1) )

7

# Characters

'Chars' are single characters which together make up a string. They have their own type, `Char`.

In [176]:
c1 = 'a'

'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

In [177]:
typeof( c1 )

Char

### convert char to a number and back

In [178]:
c = Int( c1 )

97

In [179]:
Char( 97 )

'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

### use arithmetic to move to a different Char

In [180]:
Char( Int(c1) + 10 )

'k': ASCII/Unicode U+006B (category Ll: Letter, lowercase)

In [181]:
Char( Int(c1) + 200 )

'ĩ': Unicode U+0129 (category Ll: Letter, lowercase)

### check for number to Char validity

In [182]:
isvalid( Char, 10^3  )

true

In [183]:
isvalid( Char, 10^10  )

false

### use hex to define a char position

In [184]:
isvalid( Char, 0x4da )

true

In [185]:
Char( 0x4da )

'Ӛ': Unicode U+04DA (category Lu: Letter, uppercase)

In [186]:
Char( 0x000061 )

'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

In [187]:
Char( 0x0000000000000000061 )

'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

In [188]:
Char( 0x065 )

'e': ASCII/Unicode U+0065 (category Ll: Letter, lowercase)

In [189]:
Char( 0o115)

'M': ASCII/Unicode U+004D (category Lu: Letter, uppercase)

## Unicode characters (in single quotes) can also be defined by hexadecimal digits \u and 1-4 hexadecimal digits or \U and 1-8 hexadecimal digits

In [190]:
'\u61'

'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

In [191]:
Int( '\u61' )

97

In [192]:
'\U61'

'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

In [193]:
Int( '\U61' )

97

In [194]:
# tab escape character
Int( '\t' )

9

In [195]:
# new line escape character
Int( '\n' )

10

### since there is a numerical value at the core of the character type, comparison and arithmetic operations can be done

In [196]:
'A' < 'a'

true

In [197]:
'b' - 'Z'

8
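A few more operations follow from the same idea. This is an illustrative sketch; the string "HAL" is just sample data.

```julia
println('a' + 1)                    # b
println(collect('a':'e'))           # ['a', 'b', 'c', 'd', 'e']
println('7' - '0')                  # 7, a digit character turned into its value

# Shift every character of a string by one code point.
println(map(c -> c + 1, "HAL"))     # IBM

# Adding two Chars is not defined; only Char and integer can be combined.
println(hasmethod(+, Tuple{Char, Char}))   # false
```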

### putting quotes inside strings

In [198]:
"""This string has a "quote" and is ok"""

"This string has a \"quote\" and is ok"

In [199]:
print("""This string has a "quote" and is ok""")

This string has a "quote" and is ok

In [200]:
"This has \"quotes\" as well and is ok"

"This has \"quotes\" as well and is ok"

In [201]:
print("This has \"quotes\" as well and is ok")

This has "quotes" as well and is ok

long lines can be broken up by escaping the newline with a backslash

In [202]:
"A very long\
long line of text"

"A very longlong line of text"

In [203]:
str1 = "A very long\
long line of text"

"A very longlong line of text"

In [204]:
print(str1)

A very longlong line of text

In [205]:
str1[ begin ]

'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

In [206]:
str1[ 1 ]

'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

In [207]:
str1[end]

't': ASCII/Unicode U+0074 (category Ll: Letter, lowercase)

In [208]:
str1[begin:end]

"A very longlong line of text"

In [209]:
str1[begin+2:end-2]

"very longlong line of te"

In [210]:
str1[begin:begin+1] * str1[begin+2:end-2] * str1[end-1:end]

"A very longlong line of text"

In [211]:
typeof(str1[1:4])

String
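The indices used above are byte positions, which coincide with character positions only for ASCII text. The sketch below shows what changes for a string of Greek letters, assuming the default UTF-8 `String` type; the sample string is arbitrary.

```julia
s = "αβγ"
println(length(s))               # 3 characters
println(ncodeunits(s))           # 6 bytes in UTF-8
println(collect(eachindex(s)))   # [1, 3, 5]
println(s[1])                    # α
println(s[nextind(s, 1)])        # β
println(s[end])                  # γ

# s[2] would throw a StringIndexError because byte 2 is inside 'α'.
println(isvalid(s, 2))           # false
```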

### mixing unicode and ascii characters together in a string

In [212]:
"\u61 , and b is \u62"

"a , and b is b"

In [213]:
str2 = "This is a cool string"

for c in str2
    println(c)
end

T
h
i
s
 
i
s
 
a
 
c
o
o
l
 
s
t
r
i
n
g


In [214]:
foreach( display , str2 )

'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)

'h': ASCII/Unicode U+0068 (category Ll: Letter, lowercase)

'i': ASCII/Unicode U+0069 (category Ll: Letter, lowercase)

's': ASCII/Unicode U+0073 (category Ll: Letter, lowercase)

' ': ASCII/Unicode U+0020 (category Zs: Separator, space)

'i': ASCII/Unicode U+0069 (category Ll: Letter, lowercase)

's': ASCII/Unicode U+0073 (category Ll: Letter, lowercase)

' ': ASCII/Unicode U+0020 (category Zs: Separator, space)

'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

' ': ASCII/Unicode U+0020 (category Zs: Separator, space)

'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)

'o': ASCII/Unicode U+006F (category Ll: Letter, lowercase)

'o': ASCII/Unicode U+006F (category Ll: Letter, lowercase)

'l': ASCII/Unicode U+006C (category Ll: Letter, lowercase)

' ': ASCII/Unicode U+0020 (category Zs: Separator, space)

's': ASCII/Unicode U+0073 (category Ll: Letter, lowercase)

't': ASCII/Unicode U+0074 (category Ll: Letter, lowercase)

'r': ASCII/Unicode U+0072 (category Ll: Letter, lowercase)

'i': ASCII/Unicode U+0069 (category Ll: Letter, lowercase)

'n': ASCII/Unicode U+006E (category Ll: Letter, lowercase)

'g': ASCII/Unicode U+0067 (category Ll: Letter, lowercase)

In [215]:
foreach( c -> println(c) , str2 )

T
h
i
s
 
i
s
 
a
 
c
o
o
l
 
s
t
r
i
n
g


In [216]:
foreach( c -> begin 
        print(c, " ")
        end, str2 )

T h i s   i s   a   c o o l   s t r i n g 

In [217]:
'a' * 'b'

"ab"

In [218]:
reduce( * , ['a','b','c','d'] )

"abcd"

### String operations

In [219]:
"alex" == "Alex"

false

In [220]:
"alex" < "Alex"

false

In [221]:
"alex" > "Alex"

true

In [222]:
"alex" != "Alex"

true

In [223]:
"1 is less than 2" == "1 is less than $(1+1)"

true

In [224]:
findfirst( 'l' , "bubble bath" )

5

In [225]:
findfirst( 'b' , "bubble bath" )

1

### 'findfirst' also accepts a string, like 'indexOf' in other languages, and returns the index range of the match

In [226]:
findfirst( "bat" , "bubble bath" )

8:10

In [227]:
findlast( 'b' , "bubble bath" )

8

In [228]:
# find first occurrence after an offset
findnext( 'b' , "bubbles with bath", 5 )

14

### 'occursin' is a very useful function when working with strings

In [229]:
occursin("world", "hello world!")

true

In [230]:
occursin('!', "hello world!")

true

# regular expressions

## using 'findall' with regexp

A 'regexp' literal is a string with the letter 'r' before it.

In [231]:
str3 = "The cat chased another cat in the cathedral"
pattern = "cat"

"cat"

In [232]:
regex_pattern = r"\bcat\b"

r"\bcat\b"

In [233]:
matches = findall(regex_pattern, str3)

2-element Vector{UnitRange{Int64}}:
 5:7
 24:26

In [234]:
occursin( r"\bcat\b" , "The cat chased another cat in the cathedral" )

true

In [235]:
# find cat and dog words
findall( r"\b(cat|dog)\b" , "The cat chased another cat in the cathedral but there was no dog around" )

3-element Vector{UnitRange{Int64}}:
 5:7
 24:26
 62:64

In [236]:
patterns = [r"\bcat\b", r"\bdog\b"]
for pattern in patterns
    println( findall( pattern , "The cat chased another cat in the cathedral but there was no dog around" ) )
end

UnitRange{Int64}[5:7, 24:26]
UnitRange{Int64}[62:64]


In [237]:
# match returns the first occurrence of the regular expression in the string (or nothing if there is none)
match( r"\b(cat|dog)\b" , "The cat chased another cat in the cathedral but there was no dog around" )

RegexMatch("cat", 1="cat")
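The `RegexMatch` object returned by `match` carries the captured groups. A short sketch of how they can be read, with a made-up page-range string as input:

```julia
m = match(r"(\d+)-(\d+)", "pages 10-20 and 35-40")
println(m.match)                                  # 10-20
println(m.captures)                               # the two captured groups
println(parse(Int, m[1]) + parse(Int, m[2]))      # 30

# match returns nothing when the pattern is absent, so check before using it.
println(match(r"dog", "only cats here") === nothing)   # true

# eachmatch iterates over every match, not just the first.
for hit in eachmatch(r"(\d+)-(\d+)", "pages 10-20 and 35-40")
    println(hit.match, " starts at index ", hit.offset)
end
```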

# Functions

In [238]:
# this fails here only because the name f1 was already assigned a value earlier (f1 = 0.124f0)
function f1(x,y)
    return x*y
end
f1(2,4)

LoadError: cannot define function f1; it already has a value

In [239]:
#inline function
f2(x,y) = x*y
f2(2,4)

8

In [240]:
# anonymous function (lambda) stored in a variable, like in JS
var_f = (x,y) -> x*y
var_f(2,4)

8

In [241]:
# the value of the last statement is returned by default
function f3(x,y)
    x+y
end

f3 (generic function with 1 method)

In [242]:
array_of_functions = [var_f,f3]

2-element Vector{Function}:
 #7 (generic function with 1 method)
 f3 (generic function with 1 method)

In [243]:
array_of_functions[1](2,4)

8

In [244]:
array_of_functions[2](2,4)

6

In [245]:
#function names can have unicode in the names
function Δ(x,y)
    return x-y
end

Δ (generic function with 1 method)

In [246]:
Δ(10,10.1)

-0.09999999999999964

## return a function

In [247]:
function f3a(x)
    return x^2 + x
end
println( f3a(2) )

function f3b(x)
    return x -> x^2 + x
end
println( f3b(2) )
println( f3b(2)(2) )
tmp_f = f3b(2)
println( tmp_f(2) )

6
#9
6
6
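In `f3b` the inner `x` shadows the outer argument, so the value passed to `f3b` is never used. A returned function becomes more useful when it captures the outer argument (a closure). The sketch below uses the invented name `make_adder` to show this.

```julia
# Hypothetical example: make_adder returns a new function that remembers n.
function make_adder(n)
    return x -> x + n
end

add5 = make_adder(5)
add100 = make_adder(100)
println(add5(1))                          # 6
println(add100(1))                        # 101
println(map(make_adder(10), [1, 2, 3]))   # [11, 12, 13]
```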


# Ternary

it is a one liner if / else

condition ? do A if true : do B if false

they can be nested

In [248]:
x = 10
x > 8 ? print("bigger than 8") : print("not bigger than 8")

bigger than 8

In [249]:
x > 12 ? print("bigger than 12") : print("not bigger than 12")

not bigger than 12

In [250]:
# nested
x > 12 ? print("bigger than 12") : x==11 ? print("x is 11") : print("x is NOT 11")

x is NOT 11

# Tuples

A tuple looks like a function's argument list or its return values.

Important aspects:
- Fixed length
- Can hold values of any type
- Cannot be changed (immutable)

In [251]:
x1 = -5 
t1 = (1, 9+1+x1)

(1, 5)

In [252]:
t1[2]

5

In [253]:
length(t1)

2

In [254]:
# fails, immutable
t1[1] = 10

LoadError: MethodError: no method matching setindex!(::Tuple{Int64, Int64}, ::Int64, ::Int64)

In [255]:
# reassignment ok
t1 = [1,2,3,4]

4-element Vector{Int64}:
 1
 2
 3
 4

In [256]:
# Tuple from vector
t2 = Tuple([1,2,33,44])

(1, 2, 33, 44)

In [257]:
typeof( t2 )

NTuple{4, Int64}

In [258]:
# mixed types
t3 = (-1, 2//3, [1,2,3], "Hello", ("tupleString",3,2,1,[3,2,1]), "end")

(-1, 2//3, [1, 2, 3], "Hello", ("tupleString", 3, 2, 1, [3, 2, 1]), "end")

In [259]:
t3[3:4]

([1, 2, 3], "Hello")

In [260]:
t3[ [3,4] ]

([1, 2, 3], "Hello")

In [261]:
t3[ [3,4,end] ]

([1, 2, 3], "Hello", "end")

In [262]:
t3[ [begin,4,end-1] ]

(-1, "Hello", ("tupleString", 3, 2, 1, [3, 2, 1]))

In [263]:
#we can mutate inner objects
t3[3][1] = 111

111

In [264]:
t3

(-1, 2//3, [111, 2, 3], "Hello", ("tupleString", 3, 2, 1, [3, 2, 1]), "end")

In [265]:
#we can mutate inner objects
t3[3][:] .+= 10^3

3-element view(::Vector{Int64}, :) with eltype Int64:
 1111
 1002
 1003

In [266]:
t3

(-1, 2//3, [1111, 1002, 1003], "Hello", ("tupleString", 3, 2, 1, [3, 2, 1]), "end")

# Named Tuples

Names can be attached to tuple values so that each value can be accessed by name with the dot syntax, as well as by index. Unlike dictionaries, a string key cannot be used; the names are accessed as 'Symbols' (for example `t4[:c]`).

In [267]:
t4 = ( a=1 , b="Hi" , c=[1,2,3] )

(a = 1, b = "Hi", c = [1, 2, 3])

In [268]:
t4.b

"Hi"

In [269]:
t4[3]

3-element Vector{Int64}:
 1
 2
 3

In [270]:
# get the names
keys(t4)

(:a, :b, :c)

In [271]:
t4[:c]

3-element Vector{Int64}:
 1
 2
 3

In [272]:
# won't work
t4["c"]

LoadError: MethodError: no method matching getindex(::NamedTuple{(:a, :b, :c), Tuple{Int64, String, Vector{Int64}}}, ::String)

Closest candidates are:
  getindex(::NamedTuple, ::Int64)
   @ Base namedtuple.jl:136
  getindex(::NamedTuple, ::Symbol)
   @ Base namedtuple.jl:137
  getindex(::NamedTuple, ::Tuple{Vararg{Symbol}})
   @ Base namedtuple.jl:138
  ...


In [273]:
values( t4 )

(1, "Hi", [1, 2, 3])

# Destructuring Assignments and Return values

In [274]:
(a,b,c) = 10:12

10:12

In [275]:
println(a)
println(b)
println(c)

10
11
12


In [276]:
d,e,f = 22:24

22:24

In [277]:
println(d)
println(e)
println(f)

22
23
24


In [278]:
g,h,i = ["hello","world","!"]

3-element Vector{String}:
 "hello"
 "world"
 "!"

In [279]:
println(g)
println(h)
println(i)

hello
world
!


In [280]:
function f5(x,y)
    return x+y, x-y
end

f5 (generic function with 1 method)

In [281]:
f5(3,4)

(7, -1)

In [282]:
a,b = f5(3,4)

(7, -1)

In [283]:
println(a)
println(b)

7
-1


In [284]:
(c,d) = f5(3,4)

(7, -1)

In [285]:
println(c)
println(d)

7
-1


In [286]:
#ignore the first value returned
(_,f) = f5(5,3)

(8, 2)

In [287]:
println(f)

2


In [288]:
#returns a tuple
t3 = f5(10,4)

(14, 6)

In [289]:
println( typeof( t3 ) )

Tuple{Int64, Int64}


In [290]:
t3[:]

(14, 6)

In [291]:
#return tuple with parenthesis
function f6(x,y)
    return (x+y, x-y)
end

f6 (generic function with 1 method)

In [292]:
f6(4,9)

(13, -5)

In [293]:
#return an array
function f7(x,y)
    return [x+y, x-y]
end

f7 (generic function with 1 method)

In [294]:
f7(7,2)

2-element Vector{Int64}:
 9
 5

In [295]:
val1, val2 = f7(7,2)

2-element Vector{Int64}:
 9
 5

In [296]:
println( val1 )
println( val2 )

9
5


In [297]:
vals = f7(7,2)

2-element Vector{Int64}:
 9
 5

In [298]:
vals

2-element Vector{Int64}:
 9
 5

In [299]:
# using array indices
ar1 = Array{Any,1}(undef,2)
ar1[1], ar1[2] = [11,22]

2-element Vector{Int64}:
 11
 22

In [300]:
ar1

2-element Vector{Any}:
 11
 22

## splat and slurp

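`x...` in a function call does splat: it spreads a collection into separate arguments

`x...` in a function definition, or on the left-hand side of an assignment, does slurp: it collects the remaining values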

In [301]:
ar2 = [1,2,3,4,5]
a, b, c... = ar2

5-element Vector{Int64}:
 1
 2
 3
 4
 5

In [302]:
println(a)
println(b)
println(c)

1
2
[3, 4, 5]


## variable arguments

In [303]:
function f_varargs(a,b,c...)
    println(a)
    println(b)
    println(c)
end

f_varargs (generic function with 1 method)

In [304]:
f_varargs(1,2,[11,22,"Hi",44])

1
2
(Any[11, 22, "Hi", 44],)


In [305]:
f_varargs(1,2,[11,22,"Hi",44], "cool str", 4/3)

1
2
(Any[11, 22, "Hi", 44], "cool str", 1.3333333333333333)


In [306]:
# won't work, not enough required input arguments
f_varargs([1,2,3])

LoadError: MethodError: no method matching f_varargs(::Vector{Int64})

Closest candidates are:
  f_varargs(::Any, ::Any, ::Any...)
   @ Main In[303]:1


In [307]:
f_varargs([1,2,3]...)

1
2
(3,)
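Splat and slurp use the same `...` syntax and differ only in where it appears. A compact sketch of the three positions, with invented names `total`, `nums`, `head` and `tail`:

```julia
# Slurp in a definition: extra positional arguments are collected into a tuple.
total(x, rest...) = reduce(+, rest; init=x)

println(total(1))          # 1
println(total(1, 2, 3))    # 6

# Splat in a call: a collection is spread into separate arguments.
nums = [1, 2, 3]
println(total(nums...))    # 6
println(max(nums...))      # 3

# Slurp on the left of an assignment: collects what is left over.
head, tail... = nums
println(head)              # 1
println(tail)              # [2, 3]
```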


# Optional Arguments

In [308]:
function f6(x,y,z=30)
    println(x)
    println(y)
    println(z)
end

f6 (generic function with 2 methods)

In [309]:
f6(1,2,3)

1
2
3


In [310]:
f6(1,2)

1
2
30


## keyword arguments

In [311]:
function f7(x,y; z=1, param1=0.5)
    println("x=$x, y=$y, z=$z, param1=$param1")
end

f7 (generic function with 1 method)

In [312]:
f7(1,2,z=3,param1=4)

x=1, y=2, z=3, param1=4


In [313]:
f7(1,2,param1=4,z=3)

x=1, y=2, z=3, param1=4


In [314]:
f7(1,2,param1=4)

x=1, y=2, z=1, param1=4


In [315]:
f7(1,2)

x=1, y=2, z=1, param1=0.5


### extra keyword arguments

In [316]:
function f8(x,y; z=1,kwargs...)
    println("x=$x, y=$y, z=$z, kwargs=$kwargs")
end

f8 (generic function with 1 method)

In [317]:
f8(1,2,z=4,fun="sure",r=1.0,p1=(3,2))

x=1, y=2, z=4, kwargs=Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:fun, :r, :p1), Tuple{String, Float64, Tuple{Int64, Int64}}}}(:fun => "sure", :r => 1.0, :p1 => (3, 2))


# Do-Block

- the function passed to `map` can be a lambda or a function name, but also a `do` block, which is good for multiline code

In [318]:
function f8(x,y)
    tmp1 = x + y
    tmp2 = x * y
    return tmp1 * tmp2
end

map( row -> f8(row...) , eachrow(rand(3,2)) )

3-element Vector{Float64}:
 1.0834608497241933
 0.686044505530873
 0.2088338164493965

In [319]:
map( row -> begin
        x,y = row
        tmp1 = x + y
        tmp2 = x * y
        return tmp1 * tmp2
    end, eachrow(rand(3,2)) )

3-element Vector{Float64}:
 0.20042783998464547
 0.10116018888675375
 0.05946887069198231

In [320]:
map( eachrow(rand(3,2)) ) do row
    x,y = row
    tmp1 = x + y
    tmp2 = x * y
    return tmp1 * tmp2
end

3-element Vector{Float64}:
 0.03397233742670432
 0.00048489945790067203
 0.051712137961591444
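A `do` block works with any function whose first argument is a function, not only with `map`. The helper `apply_to_pairs` below is hypothetical and exists only to show where the block ends up.

```julia
# Hypothetical helper that takes a function as its FIRST argument,
# which is the position a do-block fills in.
function apply_to_pairs(f, xs, ys)
    return [f(x, y) for (x, y) in zip(xs, ys)]
end

result = apply_to_pairs([1, 2, 3], [10, 20, 30]) do x, y
    s = x + y
    p = x * y
    s * p          # the last expression is the value of the block
end
println(result)    # [110, 880, 2970]
```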

# function composition

the \circ operator (type \circ and then tab) can be used to compose multiple functions

instead of f(g(x,y)) you can do (f$\circ$g)(x,y)

In [321]:
f9(x,y) = x*10 + x/10   # y is accepted but not used
f10(x) = x+1000

f10 (generic function with 1 method)

In [322]:
f10( f9(10,20) )

1101.0

In [323]:
(f10 ∘ f9)(10,20)

1101.0
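Composition extends to more than two functions and produces an ordinary function that can be stored or passed along. A small sketch, where `root_abs_sum` is an invented name:

```julia
# Functions are applied right to left: sum first, then abs, then sqrt.
root_abs_sum = sqrt ∘ abs ∘ sum
println(root_abs_sum([-3, -6]))                    # 3.0

# A composed function can be passed to map like any other.
println(map(uppercase ∘ strip, ["  a ", " b"]))    # ["A", "B"]
```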

# function piping

some languages allow chaining of calls such as x.f1().f2(), in which the output of one becomes the input of the next; here we use '|>' for that

In [324]:
1:4 |> sum |> x-> x^3

1000

In [325]:
#multiple values independently with broadcast
1:5 .|> x->x^2 .|> x-> x+1000

5-element Vector{Int64}:
 1001
 1004
 1009
 1016
 1025

In [326]:
(1:5 .|> x->x^2 .|> x-> x+1000) |> sum

5055

# Multiple dispatch

allows you to define several functions (methods) with the same name; the method that runs is chosen based on the 'type' of the arguments passed

In [327]:
function f_md(x::Int)
    println("Int passed")
end
function f_md(x::AbstractFloat)
    println("Float passed")
end
function f_md(x::String)
    println("String passed")
end


f_md (generic function with 3 methods)

In [328]:
f_md(6.1)
f_md("Hello!")
f_md(3)

Float passed
String passed
Int passed
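Dispatch picks the most specific method that matches, and it considers every argument, not just the first. The function `describe` below is a made-up example of both points.

```julia
describe(x) = "something else"                   # fallback for any type
describe(x::Number) = "a number"
describe(x::Integer) = "an integer"              # more specific than Number
describe(x::Integer, y::Integer) = "two integers"
describe(x::Number, y::Number) = "two numbers"

println(describe(3))          # an integer
println(describe(3.5))        # a number
println(describe("hi"))       # something else
println(describe(1, 2))       # two integers
println(describe(1, 2.0))     # two numbers
println(length(methods(describe)))   # 5
```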


## compound expressions

In [329]:
a = 1
b = 2
c = begin
        tmp1 = a - b
        tmp2 = a + b
        tmp1 / tmp2
    end
c

-0.3333333333333333

In [333]:
# equivalently
c = (tmp1 = a - b; tmp2 = a + b; tmp1 / tmp2)

-0.3333333333333333

In [334]:
# same
c = begin tmp1 = a - b; tmp2 = a + b; tmp1 / tmp2 end

-0.3333333333333333

In [335]:
# values declared inside an if can be used outside but beware of undefined
function f5(x,y)
    if( x == y )
        test = "equal"
    end
    println(test)
end
f5(1,1)
f5(1,2)

equal


LoadError: UndefVarError: `test` not defined

In [339]:
#if blocks can return a value
function f6(x,y)
    if( x == y)
        return "equal"
    else
        return "not equal"
    end
end

print(f6(3,3))

equal

In [340]:
#if blocks don't need a return keyword but return the last statement value by default
function f6(x,y)
    if( x == y)
        "equal"
    else
        "not equal"
    end
end

print(f6(4,3))

not equal

# short if statement

In [342]:
1>0 && println("greater than zero")

greater than zero


In [345]:
#last element can be an expression
1>0 && (x = (1, 2, 3))

(1, 2, 3)
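The `&&` form runs its right-hand side only when the left is true, and `||` does the opposite. A common use is guard clauses at the top of a function; `safe_sqrt` here is a hypothetical helper written for illustration.

```julia
# Hypothetical helper: guard clauses written with || and &&.
function safe_sqrt(x)
    x isa Real || return "not a real number"   # runs when the test is false
    x < 0 && return "negative input"           # runs when the test is true
    return sqrt(x)
end

println(safe_sqrt(9))        # 3.0
println(safe_sqrt(-1))       # negative input
println(safe_sqrt("nine"))   # not a real number
```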

In [348]:
# in a script, a loop at global scope must mark the outer variable as 'global' to modify it (the REPL and notebooks are more lenient)
i=0
while i <= 3
    println(i)
    global i += 1
end
println("i = $i")

0
1
2
3
i = 4
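The `global` keyword is only needed at the top level. Inside a function the loop variables are local, which is usually the cleaner option. A minimal sketch with an invented function `count_up`:

```julia
# Inside a function, the loop can update local variables without `global`.
function count_up(limit)
    i = 0
    total = 0
    while i <= limit
        total += i
        i += 1
    end
    return total
end

println(count_up(3))    # 6, the sum 0 + 1 + 2 + 3
```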


In [353]:
while true
    println(i)
    if i >= 3
        break
    end
    global i += 1
end

4


In [350]:
for j = 1:3
    println(j)
end
#or 
for j in 1:3
    println(j)
end
# fails: j is local to the for loop and does not exist outside it
println("j=$j")

1
2
3
1
2
3


LoadError: UndefVarError: `j` not defined

In [351]:
for i in [1,4,0]
    println(i)
end

1
4
0


In [352]:
for s ∈ ["foo","bar","baz"]
    println(s)
end

foo
bar
baz


In [354]:
for j = 1:1000
   println(j)
   if j >= 3
       break
   end
end

1
2
3


### continue keyword

In [355]:
for i = 1:10
    if i % 3 != 0
        continue
    end
    println(i)
end

3
6
9


## nested for-loops single line

In [356]:
for i = 1:2, j = 3:4
   println((i, j))
end

(1, 3)
(1, 4)
(2, 3)
(2, 4)


# zip

you can zip multiple arrays together and iterate over them in step; iteration stops when the shortest one runs out

In [357]:
for (j, k) in zip([1 2 3], [4 5 6 7])
   println((j,k))
end

(1, 4)
(2, 5)
(3, 6)


In [359]:
for (j, k, m) in zip( [1 2 3], [4 5 6 7], [-4,-444,-444] )
   println( (j,k,m) )
end

(1, 4, -4)
(2, 5, -444)
(3, 6, -444)
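Two related patterns are worth knowing alongside `zip`: `enumerate`, which pairs each element with its position, and building a `Dict` from zipped keys and values. The data in this sketch is made up.

```julia
labels = ["x", "y", "z"]
amounts = [10, 20, 30, 40]       # one element longer than labels

pairs_list = collect(zip(labels, amounts))
println(length(pairs_list))      # 3, the extra 40 is dropped

# enumerate pairs each element with its position.
for (i, label) in enumerate(labels)
    println(i, " => ", label)
end

# zip is also a compact way to build a Dict.
lookup = Dict(zip(labels, amounts))
println(lookup["y"])             # 20
```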


# try / catch

sometimes code cannot execute and produces errors (Exceptions), like sqrt(-1); these need to be 'handled'

In [361]:
try
   sqrt("ten")
catch e
    println("bad input to sqrt fn")
    println(e)
end

bad input to sqrt fn
MethodError(sqrt, ("ten",), 0x00000000000082ee)


In [362]:
try
   1 / "Hi"
catch e
    println("bad division")
    println(e)
end

bad division
MethodError(/, (1, "Hi"), 0x00000000000082ee)


In [363]:
function f8(x)

    local res

    try
        res = sqrt(x)
    catch err        
        println(err)        
    else
        println("res was successful = $res")
    finally
        println("finally no matter what")
    end
    
    println("very very end")
end


f8 (generic function with 2 methods)

In [364]:
f8(1)

res was successful = 1.0
finally no matter what
very very end


In [365]:
f8(-1)

DomainError(-1.0, "sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).")
finally no matter what
very very end
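A `catch` block can inspect the exception type and decide whether to handle it or pass it on. The sketch below assumes a hypothetical helper `parse_or_default`; `parse`, `tryparse`, `rethrow` and `throw` are standard.

```julia
# Hypothetical helper: turn one specific failure into a fallback value.
function parse_or_default(str, default)
    try
        return parse(Int, str)
    catch err
        if err isa ArgumentError
            return default       # the text was not a valid integer
        else
            rethrow()            # anything unexpected is passed on
        end
    end
end

println(parse_or_default("42", 0))     # 42
println(parse_or_default("4x2", 0))    # 0

# tryparse avoids the exception entirely by returning nothing on failure.
println(tryparse(Int, "4x2") === nothing)   # true

# throw raises an exception of your own choosing.
try
    throw(DomainError(-1, "negative values are not allowed here"))
catch err
    println(err isa DomainError)       # true
end
```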
