# __Chapter 2: Data Types and Structures__

<br>
Tyler J. Brough <br>
Last Update: February 25, 2021 <br>
<br>
<br>

## 2.1 Simple Types (Scalar)

<br>

The built-in basic data types are the following:

* Single quotes produce a `Char`

<br>

In [11]:
x = 'a'

'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

In [12]:
typeof(x)

Char

<br>

* Double quotes produce a `String`

<br>

In [13]:
x = "a"
typeof(x)

String

<br>

* The Boolean type takes values `true` and `false`

<br>

In [14]:
x = true
typeof(x)

Bool

<br>

* The "default" integer type is `Int64`

<br>

In [15]:
x = 123
typeof(x)

Int64

<br>

* The "default" float type is `Float64`

<br>

In [16]:
x = 123.
typeof(x)

Float64

<br>

* There are other types, such as `Complex{T}`

<br>

In [17]:
a = 1 + 2im
typeof(a)

Complex{Int64}

<br>

* And `Rational{T}` as well

<br>

In [18]:
a = 2 // 3
typeof(a)

Rational{Int64}

<br>

* We can also use type annotations:

<br>

In [32]:
## Type annotations on global variables are not allowed (before Julia 1.8), so we wrap them in a function
function thetypes()
    a::Int64 = 123
    println("a is of type: $(typeof(a))")
    
    b::Float64 = 123.
    println("b is of type: $(typeof(b))")
    
    c::Char = 'a'
    println("c is of type: $(typeof(c))")
    
    d::String = "a"
    println("d is of type: $(typeof(d))")
end

thetypes (generic function with 1 method)

In [33]:
thetypes()

a is of type: Int64
b is of type: Float64
c is of type: Char
d is of type: String
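
An annotation on a local variable does more than document the type: the assigned value is converted to the annotated type, and an error is raised when no exact conversion exists. The sketch below illustrates this; the function names `annotated_demo` and `bad_annotation` are hypothetical and not part of the chapter.

```julia
# Hypothetical helpers illustrating annotated local variables.
function annotated_demo()
    a::Int64 = 2.0        # 2.0 is converted to the Int64 value 2
    b::Float64 = 3        # 3 is converted to the Float64 value 3.0
    return a, b
end

function bad_annotation()
    c::Int64 = 2.5        # 2.5 has no exact Int64 representation
    return c
end

a, b = annotated_demo()
println(typeof(a), " ", typeof(b))   # Int64 Float64

# The failing conversion raises an InexactError rather than truncating.
err = try
    bad_annotation()
catch e
    e
end
println(err isa InexactError)        # true
```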


<br>
<br>

#### 2.1.1 Basic Mathematical Operations

The standard arithmetic operations are supported with the usual operators:

In [5]:
1 + 2 # addition

3

In [6]:
4 - 2 # subtraction

2

In [7]:
3 * 2 # multiplication

6

In [8]:
12 / 4 # division

3.0

In [9]:
2 ^ 5 # exponentiation

32

In [11]:
exp(1) # the natural exponential function

2.718281828459045

In [12]:
ℯ # \euler + TAB

ℯ = 2.7182818284590...

In [14]:
MathConstants.e # another way to access Euler's number

ℯ = 2.7182818284590...

In [16]:
12 ÷ 2 # integer (truncating) division with \div + TAB

6

In [18]:
3 % 2 # remainder (modulus)

1
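
The division-related operators above differ in ways that only show up with non-divisible or negative operands. A short sketch comparing them (the operand values are chosen purely for illustration):

```julia
println(7 / 2)        # 3.5  (`/` on two integers returns a Float64)
println(7 ÷ 2)        # 3    (`÷`, also spelled div(7, 2), truncates toward zero)
println(-7 ÷ 2)       # -3
println(-7 % 2)       # -1   (`%` is rem: the result takes the sign of the dividend)
println(mod(-7, 2))   # 1    (mod: the result takes the sign of the divisor)
```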

In [20]:
π # \pi + TAB

π = 3.1415926535897...

In [22]:
σ = 1.0 # \sigma + TAB

1.0

In [23]:
θ = 0.75 # \theta + TAB

0.75

In [24]:
ρ = -1 # \rho + TAB

-1

In [25]:
λ = 7.0 # \lambda + TAB

7.0

#### __2.1.2 Strings__

The `String` type in Julia behaves in some ways like a specialized array of individual characters. Unlike arrays, strings are immutable: `a = "abc"; a[2] = 'B'` raises an error.

<br>

A single-line string is written between a pair of double quotes, while a multi-line string can use triple quotes:

In [1]:
a = "a string"

"a string"

In [2]:
b = "a string\non multiple rows\n"

"a string\non multiple rows\n"

In [3]:
c = """
a string
on multiple rows
"""

"a string\non multiple rows\n"

In [4]:
a[3]

's': ASCII/Unicode U+0073 (category Ll: Letter, lowercase)
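
The array analogy has a limit worth knowing: string indices count bytes of the UTF-8 encoding, not characters. For ASCII text the two coincide, but not for other characters. A minimal sketch, using a Greek string chosen only for illustration:

```julia
s = "αβγ"                      # each Greek letter occupies 2 bytes in UTF-8
println(length(s))             # 3 characters
println(ncodeunits(s))         # 6 bytes
println(s[1])                  # α  (index 1 is the start of a character)

# Index 2 falls in the middle of 'α', so s[2] would throw a StringIndexError.
println(isvalid(s, 2))         # false
println(nextind(s, 1))         # 3, the byte index where the second character starts

chars = collect(s)             # Vector{Char}, safe to index by character position
println(chars[2])              # β
```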

Julia supports most typical string operations. For example:

* `split(s, " ")` splits on the given delimiter; `split(s)` with no delimiter splits on whitespace

* `join([s1,s2], "")` 

* `replace(s, "toSearch" => "toReplace")`

* `strip(s)` removes leading and trailing whitespace

<br>

In [1]:
s = "Darth Vader"
split(s, " ")

2-element Array{SubString{String},1}:
 "Darth"
 "Vader"

In [3]:
s1 = "Qui-Gon"
s2 = "Jinn"
join([s1, s2], " ")

"Qui-Gon Jinn"

In [4]:
s = "obj.toSearch"
replace(s, "toSearch" => "toReplace")

"obj.toReplace"

In [6]:
s = " Ramana Maharshi "
strip(s)

"Ramana Maharshi"

To convert strings representing numbers to integers or floats, use `parse`:

In [7]:
myint = parse(Int,"2017")

2017

In [8]:
typeof(myint)

Int64

To convert integers and floats to strings, use `string`:

In [9]:
mystring = string(123)

"123"

In [10]:
typeof(mystring)

String
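
The same `parse` function handles floats, and `tryparse` is a non-throwing variant for input that may not be numeric. A brief sketch (the input strings are made up for illustration):

```julia
myfloat = parse(Float64, "3.14")
println(myfloat)                       # 3.14

# tryparse returns `nothing` instead of throwing when the text is not a number.
bad = tryparse(Int, "twenty")
println(bad === nothing)               # true

good = tryparse(Int, "2017")
println(good)                          # 2017
```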

<br>

##### __Concatenation__

There are several ways to concatenate strings:

* Using the concatenation operator: `*`

In [34]:
firstname = "Robert "
lastname = "Zimmerman"
fullname = firstname * lastname
println(fullname)

Robert Zimmerman


* Using the `string` function: `string(str1, str2, str3)`

In [35]:
fullname = string("Obi-Wan", " ", "Kenobi")
println(fullname)

Obi-Wan Kenobi


* Using interpolation, that is, inserting variables into a string with the dollar sign:

In [36]:
println("Who is $firstname $lastname?")

Who is Robert  Zimmerman?
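
Interpolation is not limited to bare variable names: any expression can be wrapped in `$( ... )`. The sketch below redefines `firstname` without the trailing space and introduces an `age` variable purely for illustration.

```julia
firstname = "Robert"
lastname = "Zimmerman"
age = 79

msg = "$firstname $lastname will be $(age + 1) next year."
println(msg)   # Robert Zimmerman will be 80 next year.

# A literal dollar sign must be escaped with a backslash.
price = "The ticket costs \$$(2 * 10)."
println(price) # The ticket costs $20.
```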


## 2.2 Arrays (Lists)

Arrays are N-dimensional mutable containers. We will look at one-dimensional arrays in this section.

<br>

There are several ways to create arrays:

In [50]:
a = [1, 2, 3]

3-element Array{Int64,1}:
 1
 2
 3

In [51]:
typeof(a)

Array{Int64,1}

In [52]:
size(a)

(3,)

In [53]:
b = [1 2 3]

1×3 Array{Int64,2}:
 1  2  3

In [54]:
size(b)

(1, 3)

In [55]:
## Empty arrays:
a = [] 

Any[]

In [56]:
a = Int64[]

Int64[]

In [58]:
typeof(a)

Array{Int64,1}

In [62]:
b = Float64[]

Float64[]

In [63]:
typeof(b)

Array{Float64,1}

In [64]:
## n-element zero array
a = zeros(10)

10-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

In [65]:
## Or like this
a = zeros(Int64, 10)

10-element Array{Int64,1}:
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0

In [66]:
a = ones(10) # Or ones(Int64, 10)

10-element Array{Float64,1}:
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0

In [68]:
## Using the Vector{T}() alias
a = Vector{Float64}(undef, 10)

10-element Array{Float64,1}:
 2.410389252e-314
 2.323371707e-314
 2.3232910124e-314
 2.323201962e-314
 2.3230231735e-314
 2.323201962e-314
 2.323054588e-314
 2.322841539e-314
 2.322841539e-314
 2.574861666e-314

In [71]:
## Directly using the constructor
a = Array{Int64, 1}(undef, 3)

3-element Array{Int64,1}:
 4852148848
 4852148944
 4852148976

In [72]:
## Using the fill function
a = fill(42, 3)

3-element Array{Int64,1}:
 42
 42
 42

In [73]:
## Using random number generators

a = rand(3)  ## Uniform
b = randn(3) ## Standard Normal

3-element Array{Float64,1}:
  0.10228583036703465
 -0.8795619998587627
  1.0951245272547943

<br>

Square brackets `[]` are used to access the elements of an array.

<br>

In [74]:
b[2]

-0.8795619998587627

<br>

The range syntax `from:step:to` is supported for building and slicing arrays, and in most cases it is very fast.

<br>

In [78]:
## Use the `collect` function to transform the iterator into an array
a = collect(2:2:10)

5-element Array{Int64,1}:
  2
  4
  6
  8
 10

In [79]:
## a few examples

In [80]:
collect(4:2:8)

3-element Array{Int64,1}:
 4
 6
 8

In [81]:
collect(8:-2:4)

3-element Array{Int64,1}:
 8
 6
 4

In [82]:
reverse(a)

5-element Array{Int64,1}:
 10
  8
  6
  4
  2

In [83]:
collect(a[end:-1:1])

5-element Array{Int64,1}:
 10
  8
  6
  4
  2

In [84]:
## the keyword `end` gets the last element in an array
a[end]

10

In [85]:
## You can use it to slice an array
a[3:end]

3-element Array{Int64,1}:
  6
  8
 10
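
One detail about slicing that is easy to miss: a slice such as `a[2:3]` allocates a new array, whereas a view refers to the original memory. A small sketch of the difference (the value 99 is arbitrary):

```julia
a = collect(2:2:10)        # [2, 4, 6, 8, 10]

b = a[2:3]                 # slicing allocates a new array
b[1] = 99
println(a)                 # [2, 4, 6, 8, 10]  (unchanged)

v = @view a[2:3]           # a view shares memory with `a`
v[1] = 99
println(a)                 # [2, 99, 6, 8, 10]
```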

In [86]:
## You can use the `vcat` command
y = vcat(2015, 2025:2030, 2100)

8-element Array{Int64,1}:
 2015
 2025
 2026
 2027
 2028
 2029
 2030
 2100

<br>

There are many functions that operate on arrays. I will demonstrate some of them below. Use the help command (`?` in the REPL) to look up additional details.

<br>

In [90]:
a = [1, 2, 3]
b = 4
push!(a, b)

4-element Array{Int64,1}:
 1
 2
 3
 4

In [92]:
b = [4, 5, 6]
append!(a, b)

6-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6

In [93]:
c = vcat(1, [2, 3], [4, 5])

5-element Array{Int64,1}:
 1
 2
 3
 4
 5

In [94]:
pop!(a) # remove an element from the end of the array

6

In [95]:
popfirst!(a)

1

In [96]:
a = collect(1:10)
deleteat!(a, 4)

9-element Array{Int64,1}:
  1
  2
  3
  5
  6
  7
  8
  9
 10

In [97]:
pushfirst!(a, 4)

10-element Array{Int64,1}:
  4
  1
  2
  3
  5
  6
  7
  8
  9
 10

In [98]:
sort!(a)

10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

In [100]:
a = [1, 1, 2, 3, 4, 4, 5]
unique!(a)

5-element Array{Int64,1}:
 1
 2
 3
 4
 5

In [101]:
a = collect(1:10)
reverse!(a)

10-element Array{Int64,1}:
 10
  9
  8
  7
  6
  5
  4
  3
  2
  1

In [102]:
in(4, a)

true

In [103]:
length(a)

10

In [104]:
maximum(a)

10

In [107]:
## Note that `maximum` (reduces a collection) is different from `max` (compares its arguments)
max(1, 2, 3, 99)

99

In [108]:
## Or use the splat operator
max(a...)

10
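
To make the distinction between `maximum` and `max` concrete, here is a small side-by-side sketch. The array values are arbitrary; `extrema` and the dot-broadcast form are included as related tools rather than anything used above.

```julia
a = [3, 10, 7]

println(maximum(a))     # 10   (reduces a collection)
println(max(3, 10, 7))  # 10   (compares its arguments)
println(max.(a, 5))     # [5, 10, 7]  (dot syntax applies max element-wise)
println(extrema(a))     # (3, 10)     (minimum and maximum together)
```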

In [109]:
minimum(a)

1

In [110]:
sum(a)

55

In [111]:
cumsum(a)

10-element Array{Int64,1}:
 10
 19
 27
 34
 40
 45
 49
 52
 54
 55

In [112]:
empty!(a)

Int64[]

In [114]:
a = rand(10)
b = vec(a)

10-element Array{Float64,1}:
 0.7651855304035577
 0.8960571234291288
 0.5006792871068055
 0.47835045800549736
 0.722541273313176
 0.9835777628532685
 0.011221762215448017
 0.30322914007198687
 0.6945808749473015
 0.6959090766309974

In [122]:
using Random
a = collect(1:10)
shuffle!(a)

10-element Array{Int64,1}:
  9
  3
  1
  8
  2
  5
  6
 10
  4
  7

In [123]:
isempty(a)

false

In [126]:
a = [1, 4, 2, 4, 3, 4, 4, 4, 5, 4, 6, 4, 7, 4, 8, 4, 9, 4, 10, 4]
findall(x -> x == 4, a) # this uses an anonymous (lambda) function (more next chapter!)

11-element Array{Int64,1}:
  2
  4
  6
  7
  8
 10
 12
 14
 16
 18
 20

In [127]:
## Can nest these functions
deleteat!(a, findall(x -> x == 4, a))

9-element Array{Int64,1}:
  1
  2
  3
  5
  6
  7
  8
  9
 10

In [128]:
a

9-element Array{Int64,1}:
  1
  2
  3
  5
  6
  7
  8
  9
 10

In [129]:
enumerate(a)

enumerate([1, 2, 3, 5, 6, 7, 8, 9, 10])
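
As the output shows, `enumerate` returns a lazy iterator rather than an array. It is typically consumed in a `for` loop, or materialized with `collect`. A minimal sketch with made-up values:

```julia
a = [10, 20, 30]

# Each iteration yields an (index, value) tuple.
for (i, x) in enumerate(a)
    println("element $i is $x")
end

pairs_list = collect(enumerate(a))   # materialize the (index, value) tuples
println(pairs_list)                  # [(1, 10), (2, 20), (3, 30)]
```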

In [138]:
names = ["Marc", "Anne"]
sex = ["M", "F"]
age = [18, 16]
collect(zip(names, sex, age))

2-element Array{Tuple{String,String,Int64},1}:
 ("Marc", "M", 18)
 ("Anne", "F", 16)

#### __2.2.1 Multidimensional and Nested Arrays__

In this section we deal with multidimensional arrays, and with arrays whose elements are themselves arrays.

* `Array{T, 2}` or `Matrix{T}`

In [139]:
a = Array{Float64, 2}

Array{Float64,2}

In [141]:
b = Matrix{Float64} # just an alias for the above

Array{Float64,2}

In [143]:
# A nested array of arrays
a = [[1, 2, 3], [4, 5, 6]]

2-element Array{Array{Int64,1},1}:
 [1, 2, 3]
 [4, 5, 6]

In [144]:
a = [1 4; 2 5; 3 6]

3×2 Array{Int64,2}:
 1  4
 2  5
 3  6
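
Elements of a matrix are accessed with one index per dimension, and a colon selects a whole dimension. The following sketch indexes the 3×2 matrix defined above:

```julia
a = [1 4; 2 5; 3 6]

println(a[2, 1])      # 2          (row 2, column 1)
println(a[:, 2])      # [4, 5, 6]  (the whole second column)
println(a[3, :])      # [3, 6]     (the whole third row, returned as a vector)
println(a[4])         # 4          (linear index; storage is column-major)
```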

In [145]:
a = zeros(2, 3, 4)

2×3×4 Array{Float64,3}:
[:, :, 1] =
 0.0  0.0  0.0
 0.0  0.0  0.0

[:, :, 2] =
 0.0  0.0  0.0
 0.0  0.0  0.0

[:, :, 3] =
 0.0  0.0  0.0
 0.0  0.0  0.0

[:, :, 4] =
 0.0  0.0  0.0
 0.0  0.0  0.0

In [146]:
b = ones(2, 3, 2)

2×3×2 Array{Float64,3}:
[:, :, 1] =
 1.0  1.0  1.0
 1.0  1.0  1.0

[:, :, 2] =
 1.0  1.0  1.0
 1.0  1.0  1.0

In [147]:
a = fill(42, 2, 2, 2)

2×2×2 Array{Int64,3}:
[:, :, 1] =
 42  42
 42  42

[:, :, 2] =
 42  42
 42  42

In [148]:
a = rand(2, 3, 3)

2×3×3 Array{Float64,3}:
[:, :, 1] =
 0.682405  0.874909  0.642113
 0.986359  0.536694  0.852135

[:, :, 2] =
 0.58103   0.678035  0.843086
 0.683732  0.376481  0.796716

[:, :, 3] =
 0.848141  0.0724671  0.0220103
 0.445552  0.254459   0.933494

In [149]:
a = [3x + 2y + z for x in 1:2, y in 2:3, z in 1:2]

2×2×2 Array{Int64,3}:
[:, :, 1] =
  8  10
 11  13

[:, :, 2] =
  9  11
 12  14

In [152]:
a = [[1, 2, 3], [4, 5, 6]]
mask = [[true, true, false], [false, true, false]]
a[mask]

LoadError: ArgumentError: invalid index: Array{Bool,1}[[1, 1, 0], [0, 1, 0]] of type Array{Array{Bool,1},1}

In [153]:
size(a)

(2,)

In [154]:
ndims(a)

1
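
The error above arises because `a` is a one-dimensional array of arrays, so a nested mask is not a valid index for it. Boolean masks do work on a true matrix, and a nested array can be masked one inner array at a time. A sketch of both, where the names `m`, `nested`, `nested_mask`, and `selected` are introduced here only for illustration:

```julia
m = [1 2 3; 4 5 6]
mask = [true true false; false true false]
println(m[mask])            # [1, 2, 5]  (selected in column-major order)

# For an array of arrays, apply each inner mask to the matching inner array.
nested = [[1, 2, 3], [4, 5, 6]]
nested_mask = [[true, true, false], [false, true, false]]
selected = [row[keep] for (row, keep) in zip(nested, nested_mask)]
println(selected)           # [[1, 2], [5]]
```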

In [156]:
a = rand(2, 2, 3)
reshape(a, 3, 2, 2)

3×2×2 Array{Float64,3}:
[:, :, 1] =
 0.2127     0.169897
 0.0552129  0.2688
 0.748385   0.0404856

[:, :, 2] =
 0.131653   0.610439
 0.0387539  0.466162
 0.783017   0.0318343

In [160]:
a = rand(2, 1, 3)
dropdims(a, dims=(2))

2×3 Array{Float64,2}:
 0.40038  0.99335   0.399157
 0.92723  0.857448  0.713128

In [164]:
a = rand(3, 2)
transpose(a)

2×3 LinearAlgebra.Transpose{Float64,Array{Float64,2}}:
 0.182649  0.0112997  0.558284
 0.393008  0.593962   0.22453

## __2.3 Tuples__

A tuple, of type `Tuple{T1, T2, T3}`, is an immutable ordered collection of elements whose types may differ:

In [165]:
t = (1, 2.5, "a")

(1, 2.5, "a")

In [166]:
typeof(t)

Tuple{Int64,Float64,String}

In [167]:
## also without parentheses:
t = 1, 2.5, "a"

(1, 2.5, "a")

_Immutable_ means that once a tuple is created, its elements cannot be added, removed, or changed.

In [168]:
t[1] # index to the first element

1

In [170]:
t[1] = -99 ## this will raise an exception

LoadError: MethodError: no method matching setindex!(::Tuple{Int64,Float64,String}, ::Int64, ::Int64)
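
Since a tuple cannot be modified, the usual pattern is to unpack it or to build a new tuple from its pieces. A minimal sketch (the names `x`, `y`, `z`, and `t2` are illustrative):

```julia
t = (1, 2.5, "a")

x, y, z = t                   # destructuring assigns each element to a name
println(y)                    # 2.5

# "Changing" a tuple means building a new one.
t2 = (-99, t[2:end]...)
println(t2)                   # (-99, 2.5, "a")
println(t)                    # (1, 2.5, "a")  (the original is untouched)
```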

## __2.4 Named Tuples__

In [171]:
nt = (a=1, b=2.5)

(a = 1, b = 2.5)

In [172]:
nt.a

1

In [173]:
nt.b

2.5

In [174]:
keys(nt)

(:a, :b)

In [175]:
values(nt)

(1, 2.5)

In [176]:
collect(nt)

2-element Array{Real,1}:
 1
 2.5

In [177]:
pairs(nt)

pairs(::NamedTuple) with 2 entries:
  :a => 1
  :b => 2.5

In [182]:
person = (firstname="Robert", lastname="Zimmerman", age=79, )

(firstname = "Robert", lastname = "Zimmerman", age = 79)

In [183]:
person.firstname

"Robert"

In [184]:
person.lastname

"Zimmerman"

In [185]:
person.age

79

In [190]:
function bob(firstname, lastname, age)
    println("$firstname $lastname is $age years old.")
end

bob (generic function with 3 methods)

In [191]:
bob(person...) ## you can splat the tuple as arguments to a function

Robert Zimmerman is 79 years old.
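
Named tuples are immutable like ordinary tuples, so updating a field means creating a new named tuple. One way to do that is `merge`, sketched below with made-up field values:

```julia
nt = (a=1, b=2.5)

nt2 = merge(nt, (b=3.0, c="x"))   # later values win; new keys are appended
println(nt2)                      # (a = 1, b = 3.0, c = "x")
println(haskey(nt2, :c))          # true
println(nt)                       # (a = 1, b = 2.5)
```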


## __2.5 Dictionaries__

Dictionaries store mappings from keys to values. A `Dict` is unordered: entries are displayed and iterated in an order that looks random and should not be relied upon. Julia dictionaries are very similar to dictionaries in Python.

In [193]:
d = Dict('a'=>1, 'b'=>2, 'c'=>3)

Dict{Char,Int64} with 3 entries:
  'a' => 1
  'c' => 3
  'b' => 2

In [194]:
d['a']

1

In [201]:
## add a key-value pair on the fly
d['d'] = 4

4

In [202]:
delete!(d, 'b')

Dict{Char,Int64} with 3 entries:
  'a' => 1
  'c' => 3
  'd' => 4

In [203]:
mydict = Dict{Char,Int64}() ## start from an empty dictionary
map((i, j) -> mydict[i] = j, ['a', 'b', 'c'], [1, 2, 3])

3-element Array{Int64,1}:
 1
 2
 3

In [204]:
mydict

Dict{Char,Int64} with 3 entries:
  'a' => 1
  'c' => 3
  'b' => 2

In [205]:
mydict['a']

1

In [207]:
get(mydict, 'a', 0) ## 0 is the default value if missing

1

In [208]:
keys(mydict)

Base.KeySet for a Dict{Char,Int64} with 3 entries. Keys:
  'a'
  'c'
  'b'

In [209]:
values(mydict)

Base.ValueIterator for a Dict{Char,Int64} with 3 entries. Values:
  1
  3
  2

In [210]:
haskey(mydict, 'a')

true

In [211]:
in(('a' => 1), mydict)

true

In [212]:
## iterate through k, v pairs
for (k, v) in mydict
    println("$k is $v")
end

a is 1
c is 3
b is 2
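
Two patterns come up often with dictionaries: accumulating values with `get` and a default, and sorting the keys when a predictable iteration order is needed. A sketch of both, using a hypothetical word list:

```julia
words = ["b", "a", "c", "a", "b", "a"]

counts = Dict{String,Int}()
for w in words
    counts[w] = get(counts, w, 0) + 1   # default to 0 for unseen keys
end

# Sort the keys explicitly when a stable order is needed.
for k in sort(collect(keys(counts)))
    println("$k => $(counts[k])")
end
```

This prints the counts for `a`, `b`, and `c` in alphabetical order, regardless of how the `Dict` stores them internally.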


## __2.6 Sets__

Use `Set{T}` to represent collections of unordered, unique values.

In [213]:
s = Set() ## create an empty zero-element set

Set{Any}()

In [214]:
s = Set([1, 2, 2, 3, 4]) ## initialize with an array of values

Set{Int64} with 4 elements:
  4
  2
  3
  1

In [216]:
push!(s, 5)

Set{Int64} with 5 elements:
  4
  2
  3
  5
  1

In [217]:
delete!(s, 1)

Set{Int64} with 4 elements:
  4
  2
  3
  5

Set operations are allowed:

In [219]:
set1 = Set([1, 2, 3, 4])
set2 = Set([3, 4, 5, 6])
intersect(set1, set2)

Set{Int64} with 2 elements:
  4
  3

In [220]:
union(set1, set2)

Set{Int64} with 6 elements:
  4
  2
  3
  5
  6
  1

In [221]:
setdiff(set1, set2)

Set{Int64} with 2 elements:
  2
  1
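
Beyond intersection, union, and difference, sets support membership and subset tests. A short sketch reusing `set1` and `set2` from above; `symdiff` is shown as one further operation:

```julia
set1 = Set([1, 2, 3, 4])
set2 = Set([3, 4, 5, 6])

println(3 in set1)                                   # true
println(issubset(Set([1, 2]), set1))                 # true
println(symdiff(set1, set2) == Set([1, 2, 5, 6]))    # true (elements in exactly one set)
```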

## 2.7 Memory and Copy Issues

Please see the chapter for details.
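
As a brief orientation, the sketch below shows the three behaviors that usually matter here: plain assignment (no copy), `copy` (shallow), and `deepcopy`. The variable names and values are illustrative and are not taken from the chapter.

```julia
a = [1, 2, 3]
b = a                 # `b` is another name for the same array
b[1] = 99
println(a[1])         # 99

c = copy(a)           # shallow copy: a new outer array
c[1] = 0
println(a[1])         # 99 (unchanged)

nested = [[1, 2], [3]]
shallow = copy(nested)      # inner arrays are still shared
shallow[1][1] = -1
println(nested[1][1])       # -1

deep = deepcopy(nested)     # inner arrays are copied too
deep[1][1] = 7
println(nested[1][1])       # -1 (unchanged)
```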

## 2.8 Various Notes on Data Types

Please see the chapter for details.

#### 2.8.1 Random Numbers

In [222]:
## random float in [0, 1)
rand()

0.4151818668818319

In [224]:
## random integer in [a, b]
rand(1:10)

7

In [225]:
## random float in a:b with precision to the second decimal place
rand(1.0:0.01:10.0)

9.13
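
Because these draws differ on every run, it is often useful to seed the generator so results can be reproduced. A minimal sketch using the standard-library `Random` module; the seed value 1234 is arbitrary:

```julia
using Random

Random.seed!(1234)       # the seed value 1234 is arbitrary
x = rand(3)

Random.seed!(1234)       # resetting the seed replays the same stream
y = rand(3)

println(x == y)          # true
```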

In [235]:
## random floats drawn from a particular distribution (Uniform, Normal, Poisson, ...)
using StatsKit.Distributions ## `Uniform` is provided by Distributions, not by Random
rand(Uniform(0, 1), 2, 3)

2×3 Array{Float64,2}:
 0.430239  0.00557312  0.111077
 0.204796  0.510879    0.288856

In [232]:
using StatsKit.Distributions

In [236]:
rand(Uniform(0, 1), 2, 3)

2×3 Array{Float64,2}:
 0.111379  0.927599  0.713103
 0.600087  0.683229  0.225987

In [237]:
rand(Normal(10, 2), 5, 5)

5×5 Array{Float64,2}:
  9.45199  9.03614   8.47969   7.73148  11.0408
 10.8523   9.14099  10.6642    7.31505   8.50401
 10.7057   9.65153  13.8258   11.4881   10.4956
  7.83528  8.52566  10.0036   11.251     6.42139
  8.31054  9.9088    8.79128  10.0573    9.0759

In [239]:
x = rand(Normal(0, 1), 10_000_000) # generate 10 million floats very quickly

10000000-element Array{Float64,1}:
 -0.7549169996300645
 -1.1269423528418
  0.4365320304996664
 -0.2944074350841612
  0.5483036543176085
 -1.9362897670339991
  0.012218296507594777
 -0.47365759845410227
  0.19401605880570996
 -1.4079799667072872
 -0.8910765130846043
  0.3129544810241108
 -0.32705840105192757
  ⋮
  1.4981095703485852
  0.665790579856924
 -0.9566642846285326
  1.5356750416950729
  0.11876123475110811
  0.44355987516781065
 -0.3034460299805314
 -0.39334056431833053
  0.493374101193763
 -1.0534228899805547
  2.32766863334846
 -0.903959014816474

In [241]:
y = rand(Beta(16, 16), 1000)

1000-element Array{Float64,1}:
 0.4636813656837811
 0.5870554107222898
 0.6282577321068474
 0.5795277021160102
 0.4050393555532643
 0.4183894776779732
 0.4723367616500721
 0.4969525908653988
 0.5680628457505805
 0.3476815727621883
 0.5414487920718752
 0.5668817314901754
 0.5025355032926666
 ⋮
 0.5803787307087193
 0.40937138697027015
 0.4440835138602639
 0.4030817555708699
 0.486235793824721
 0.4814330458455863
 0.551847554061124
 0.36330024643910974
 0.4216244568115568
 0.7186760167981301
 0.5662428205380474
 0.3719712957587214

#### 2.8.2 Missing, Nothing, and NaN

See chapter for details.
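
For quick reference, the sketch below shows how the three values behave in Base Julia. It is only an illustration and does not replace the chapter's discussion.

```julia
println(missing + 1)                        # missing (missing propagates)
println(ismissing(missing))                 # true
println(sum(skipmissing([1, missing, 3])))  # 4

println(isnothing(nothing))                 # true
println(NaN == NaN)                         # false (NaN is not equal to itself)
println(isnan(0.0 / 0.0))                   # true
```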