# Julia for Data Science - Chapter 03

---
## Data Types

In [1]:
x = 123 # Julia infers the type from the literal... here, Int64

123

In [2]:
y = "hello world!" # julia assumes type String

"hello world!"

In [3]:
typeof(x) # 64 bit integer b/c this is a 64-bit machine

Int64

In [4]:
typeof(y)

String

In [7]:
# type casting
z = Int32(x)

123

In [8]:
typeof(z)

Int32

In [9]:
w = Int32("whatever") # can't cast String to Integer

MethodError: MethodError: no method matching Int32(::String)
Closest candidates are:
  Int32(!Matched::Union{Bool, Int32, Int64, UInt32, UInt64, UInt8, Int128, Int16, Int8, UInt128, UInt16}) at boot.jl:732
  Int32(!Matched::Float32) at float.jl:700
  Int32(!Matched::Float64) at float.jl:679
  ...
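A constructor such as `Int32(...)` converts between numeric types but does not read numbers out of text. A minimal sketch of the usual alternative, using `parse` and `tryparse` from Base (the variable names `n` and `m` are only illustrative):

```julia
# parse() reads a number from its text representation
n = parse(Int32, "123")
println(n, " ", typeof(n))

# tryparse() returns `nothing` instead of throwing when the text is not a number
m = tryparse(Int32, "whatever")
println(m === nothing)
```

`parse` throws an `ArgumentError` on invalid input, so `tryparse` is the safer choice when the text comes from an untrusted source.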

| Data Type | Sample Values |
|-----------|---------------|
| Int8      | 98, -123 |
| Int32     | 2134112, -2199996 |
| Int64     | 123123123123121, -1234123451234 |
| Float32   | 12312312.3223, -12312312.3223 |
| Float64   | 12332523452345.345233343, -123333312312.3223232 |
| Bool      | true, false (notice that the contents of this type of variable are always lowercase in Julia) |
| Char      | 'a', '?' (notice single quotes) |
| String    | "some word or sentence", " " (notice double quotes) |
| BigInt    | 3454893223743457239848953894985240398349234435234532 |
| BigFloat  | 3454893223743457239848953894985240398349234435234532.3432 |
| Array     | [1, 2322433423, 0.12312312, false, 'c', "whatever"] |

`BigInt` and `BigFloat` are special because (1) they are arbitrary-precision types, so their size is limited only by available memory, and (2) they are normally created with a constructor, e.g. `x = BigInt(123)`, rather than through a type annotation such as `x::Int64`.
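As a rough illustration of creating arbitrary-precision values, the sketch below uses the `BigInt` constructor, the `BigFloat` string constructor, and the `big"..."` string macro from Base. The variable names are hypothetical and not part of the chapter.

```julia
a = big"3454893223743457239848953894985240398349234435234532"  # string macro -> BigInt
b = BigInt(2)^100          # constructor, then exact integer arithmetic
c = BigFloat("0.1")        # parsed at BigFloat precision, not from a Float64
println(typeof(a), " ", typeof(b), " ", typeof(c))
println(b > typemax(Int64))   # far beyond the Int64 range
```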

---
## Arrays

### Array basics

Arrays in Julia are like lists in Python. They're mutable.

In [11]:
p = [1, 2322433423, 0.12312312, false, 'c', "whatever"];

In [13]:
p[1]

1

In [14]:
p[2]

2322433423

In [16]:
p[3]

0.12312312

Notice that indexing in Julia is 1-based like R, not 0-based like Python.

In [20]:
p[end] # keyword 'end' accesses the last element in an array

"whatever"

In [25]:
p[end-1] # you can operate on 'end' as an integer index

'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)

In [26]:
p[end-2]

false
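Because arrays are mutable, elements can be overwritten, appended and removed in place. A small sketch, using a hypothetical array `q` so that `p` stays unchanged for the cells that follow:

```julia
q = Any[1, 2.5, "three"]   # hypothetical array, separate from p
q[1] = 100                 # overwrite the first element in place
push!(q, 'x')              # append to the end
pop!(q)                    # remove and return the last element ('x')
println(q)
println(length(q))
```

The trailing `!` in `push!` and `pop!` is a Julia naming convention for functions that modify their argument.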

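You can preallocate arrays, and arrays can be multidimensional. With `undef` the elements are left uninitialized: numeric arrays show whatever happened to be in memory, and `Any` arrays show `#undef`.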

In [30]:
Z = Array{Int64}(undef, 3, 4)

3×4 Array{Int64,2}:
 4556378120  4522542000  4522542160  4522542288
 4522541936  4522542064  4522542224  4522542320
 4522541968  4522542128  4522542256  4522542352

In [31]:
Z = Array{Any}(undef, 3, 1)

3×1 Array{Any,2}:
 #undef
 #undef
 #undef
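When the starting values matter, an array filled with known contents is usually preferable to `undef`. A minimal sketch with `zeros`, `ones` and `fill` from Base (the names `Z0`, `Z1` and `Zf` are illustrative):

```julia
Z0 = zeros(Int64, 3, 4)     # 3×4 matrix of 0
Z1 = ones(Float64, 2, 2)    # 2×2 matrix of 1.0
Zf = fill("n/a", 3)         # 3-element vector, every entry "n/a"
println(size(Z0), " ", sum(Z0))
println(Zf)
```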

### Accessing multiple elements in an array

In [32]:
p[1:3]

3-element Array{Any,1}:
          1         
 2322433423         
          0.12312312

In [33]:
p[end-2:end]

3-element Array{Any,1}:
 false          
      'c'       
      "whatever"

In [34]:
p[[1,4]] # gets the first and fourth element of the array p

2-element Array{Any,1}:
     1
 false

In [35]:
# this is a more realistic example
ind = [1,4]
p[ind]

2-element Array{Any,1}:
     1
 false

### Multidimensional arrays

In [36]:
A = Array{Int64}(undef, 3, 4);
A[:] = 1:12;
A

3×4 Array{Int64,2}:
 1  4  7  10
 2  5  8  11
 3  6  9  12

In [42]:
# B = Array{Int64}(undef, 1, 2, 3)
# B[:] = 1:6;
# B

In [43]:
# B[:]

In [44]:
A[2, 3] # returns element in 2nd row, 3rd column (8)

8

In [47]:
A[3,:] # returns the entire 3rd row as a 4-element Array (syntax sugar)

4-element Array{Int64,1}:
  3
  6
  9
 12

In [48]:
A[3,1:end] # does the same as above... more verbose

4-element Array{Int64,1}:
  3
  6
  9
 12

In [49]:
A[:,:] # accesses the entire matrix

3×4 Array{Int64,2}:
 1  4  7  10
 2  5  8  11
 3  6  9  12

In [50]:
A[1:end,1:end] # same as above

3×4 Array{Int64,2}:
 1  4  7  10
 2  5  8  11
 3  6  9  12

In [56]:
A[:] # returns the 2-D Array as a 1-D Array

12-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
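The outputs above follow from Julia storing arrays in column-major order: `A[:] = 1:12` fills the first column, then the second, and so on. A short sketch of the same idea using `reshape` and `vec` (the name `A2` is hypothetical):

```julia
A2 = reshape(collect(1:12), 3, 4)   # same layout as A above
println(A2[2, 3])                   # row 2, column 3
println(A2[8])                      # linear index 8 is the same element
println(vec(A2) == collect(1:12))   # flattening walks down the columns
```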

---
## Dictionaries

Julia dictionaries are like Python dictionaries (hash tables). The syntax for creating them is slightly different: key-value pairs are written with `=>`, which resembles Ruby's hash syntax.

In [57]:
a = Dict()

Dict{Any,Any} with 0 entries

In [59]:
b = Dict("one" => 1, "two" => 2, "three" => 3, "four" => 4)

Dict{String,Int64} with 4 entries:
  "two"   => 2
  "four"  => 4
  "one"   => 1
  "three" => 3

In [60]:
b["three"]

3

In [61]:
b["five"]

KeyError: KeyError: key "five" not found
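To avoid the `KeyError`, a lookup can be guarded or given a default. The sketch below uses `haskey` and `get` from Base; the default value `0` is an arbitrary choice for illustration.

```julia
b = Dict("one" => 1, "two" => 2, "three" => 3, "four" => 4)
println(haskey(b, "five"))      # check before indexing
println(get(b, "five", 0))      # fall back to a default (0 here) when missing
b["five"] = 5                   # assignment adds the key
println(b["five"])
```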

---
## Basic Commands and Functions

### `print()` and `println()`

In [64]:
print(p)
print(p)

Any[1, 2322433423, 0.123123, false, 'c', "whatever"]Any[1, 2322433423, 0.123123, false, 'c', "whatever"]

In [65]:
println(p)
println(p)

Any[1, 2322433423, 0.123123, false, 'c', "whatever"]
Any[1, 2322433423, 0.123123, false, 'c', "whatever"]


In [66]:
print("cheese", 'd', 123, true)

cheesed123true

In [68]:
println("cheese", 'd', 123, true)

cheesed123true

In [70]:
?(print)

search: print println printstyled sprint isprint prevind parentindices precision



```
print([io::IO], xs...)
```

Write to `io` (or to the default output stream [`stdout`](@ref) if `io` is not given) a canonical (un-decorated) text representation of values `xs` if there is one, otherwise call [`show`](@ref). The representation used by `print` includes minimal formatting and tries to avoid Julia-specific details.

Printing `nothing` is not allowed and throws an error.

# Examples

```jldoctest
julia> print("Hello World!")
Hello World!
julia> io = IOBuffer();

julia> print(io, "Hello", ' ', :World!)

julia> String(take!(io))
"Hello World!"
```


### `typemax()` and `typemin()`

In [72]:
typemax(Int64)

9223372036854775807

In [73]:
typemin(Int64)

-9223372036854775808

In [74]:
typemax(Int8)

127

In [75]:
typemax(Float64)

Inf

In [76]:
typemin(Float64)

-Inf
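These limits matter because fixed-width integer arithmetic in Julia wraps around on overflow rather than raising an error. A brief sketch of what happens at the boundary, with `big` shown as one possible way to sidestep it:

```julia
println(typemax(Int64) + 1 == typemin(Int64))   # wraps around silently
println(typemax(Int8) + Int8(1))                # Int8 arithmetic wraps too
println(big(typemax(Int64)) + 1)                # BigInt avoids the overflow
```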

### `collect()`

`collect(ElementType, X)` returns an array containing all the items of `X`, where `X` is any data type that corresponds to a kind of range (usually referred to as a "collection"), and `ElementType` is the type of elements of `X` that you wish to obtain (this parameter is usually omitted).

In [77]:
1:5

1:5

In [78]:
collect(Int8, 1:5)

5-element Array{Int8,1}:
 1
 2
 3
 4
 5

In [79]:
collect(1:5)

5-element Array{Int64,1}:
 1
 2
 3
 4
 5
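`collect` is not limited to integer ranges. A small sketch with a string, a stepped range, and an explicit element type (the variable names are illustrative):

```julia
chars = collect("abc")            # a String is a collection of Char
floats = collect(Float64, 1:3)    # convert elements while collecting
evens = collect(2:2:10)           # start:step:stop range
println(chars)
println(floats)
println(evens)
```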

### `show()`

`show()` prints the contents of an array on a single line, without the type and size header that the notebook displays.

In [82]:
collect(1:5)

5-element Array{Int64,1}:
 1
 2
 3
 4
 5

In [81]:
show(collect(1:5))

[1, 2, 3, 4, 5]

In [83]:
show([123, 134])

[123, 134]

In [84]:
show([123 345])

[123 345]

In [85]:
1 == 1

true

In [86]:
[123, 234] == [123 234]

false

In [87]:
typeof([123, 234])

Array{Int64,1}

In [88]:
typeof([123 234])

Array{Int64,2}

In [89]:
[123, 234]

2-element Array{Int64,1}:
 123
 234

In [90]:
[123 234]

1×2 Array{Int64,2}:
 123  234

In [91]:
[1 2
 3 4]

2×2 Array{Int64,2}:
 1  2
 3  4
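The comparison `[123, 234] == [123 234]` is `false` because the two arrays have different shapes, even though they hold the same numbers. A sketch that makes the shapes explicit, with hypothetical names `v` and `r`:

```julia
v = [123, 234]      # commas: a 1-D column vector
r = [123 234]       # spaces: a 1×2 matrix (one row)
println(size(v), " ", size(r))
println(v == r)          # different shapes, so not equal
println(v == vec(r))     # flatten the row first and they match
```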

In [92]:
show(collect(1:50))

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]

### `range()`

In [111]:
range(0, stop=5) # same as 0:5

0:5

In [113]:
range(0, 5, length=11)

0.0:0.5:5.0

In [114]:
range(0, 5, step=0.5) # 0:0.5:5

0.0:0.5:5.0

In [108]:
show(collect(0:0.5:5)) # start:step:end

[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

In [105]:
show(collect(range(0, 5, length=11)))

[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
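A range stores only its endpoints and spacing, yet it can be queried and indexed like an array without calling `collect`. A minimal sketch, assuming the keyword form `stop=` that the first cell of this section already uses:

```julia
r = range(0, stop=5, length=11)   # keyword form of stop
println(first(r), " ", step(r), " ", last(r))
println(length(r))
println(r[3])                     # ranges are indexable like arrays
println(collect(r) == collect(0:0.5:5))
```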

---
## Mathematical Functions

### `round()`

In [115]:
round(123.45)

123.0

In [118]:
?round

search: round rounding RoundUp RoundDown RoundToZero RoundingMode RoundNearest



```
round(z::Complex[, RoundingModeReal, [RoundingModeImaginary]])
round(z::Complex[, RoundingModeReal, [RoundingModeImaginary]]; digits=, base=10)
round(z::Complex[, RoundingModeReal, [RoundingModeImaginary]]; sigdigits=, base=10)
```

Return the nearest integral value of the same type as the complex-valued `z` to `z`, breaking ties using the specified [`RoundingMode`](@ref)s. The first [`RoundingMode`](@ref) is used for rounding the real components while the second is used for rounding the imaginary components.

# Example

```jldoctest
julia> round(3.14 + 4.5im)
3.0 + 4.0im
```

---

```
round([T,] x, [r::RoundingMode])
round(x, [r::RoundingMode]; digits::Integer=0, base = 10)
round(x, [r::RoundingMode]; sigdigits::Integer, base = 10)
```

Rounds the number `x`.

Without keyword arguments, `x` is rounded to an integer value, returning a value of type `T`, or of the same type of `x` if no `T` is provided. An [`InexactError`](@ref) will be thrown if the value is not representable by `T`, similar to [`convert`](@ref).

If the `digits` keyword argument is provided, it rounds to the specified number of digits after the decimal place (or before if negative), in base `base`.

If the `sigdigits` keyword argument is provided, it rounds to the specified number of significant digits, in base `base`.

The [`RoundingMode`](@ref) `r` controls the direction of the rounding; the default is [`RoundNearest`](@ref), which rounds to the nearest integer, with ties (fractional values of 0.5) being rounded to the nearest even integer. Note that `round` may give incorrect results if the global rounding mode is changed (see [`rounding`](@ref)).

# Examples

```jldoctest
julia> round(1.7)
2.0

julia> round(Int, 1.7)
2

julia> round(1.5)
2.0

julia> round(2.5)
2.0

julia> round(pi; digits=2)
3.14

julia> round(pi; digits=3, base=2)
3.125

julia> round(123.456; sigdigits=2)
120.0

julia> round(357.913; sigdigits=4, base=2)
352.0
```

!!! note
    Rounding to specified digits in bases other than 2 can be inexact when operating on binary floating point numbers. For example, the [`Float64`](@ref) value represented by `1.15` is actually *less* than 1.15, yet will be rounded to 1.2.

    # Examples

    ```jldoctest; setup = :(using Printf)
    julia> x = 1.15
    1.15

    julia> @sprintf "%.20f" x
    "1.14999999999999991118"

    julia> x < 115//100
    true

    julia> round(x, digits=1)
    1.2
    ```


# Extensions

To extend `round` to new numeric types, it is typically sufficient to define `Base.round(x::NewType, r::RoundingMode)`.

---

```
round(dt::TimeType, p::Period, [r::RoundingMode]) -> TimeType
```

Return the `Date` or `DateTime` nearest to `dt` at resolution `p`. By default (`RoundNearestTiesUp`), ties (e.g., rounding 9:30 to the nearest hour) will be rounded up.

For convenience, `p` may be a type instead of a value: `round(dt, Dates.Hour)` is a shortcut for `round(dt, Dates.Hour(1))`.

```jldoctest
julia> round(Date(1985, 8, 16), Dates.Month)
1985-08-01

julia> round(DateTime(2013, 2, 13, 0, 31, 20), Dates.Minute(15))
2013-02-13T00:30:00

julia> round(DateTime(2016, 8, 6, 12, 0, 0), Dates.Day)
2016-08-07T00:00:00
```

Valid rounding modes for `round(::TimeType, ::Period, ::RoundingMode)` are `RoundNearestTiesUp` (default), `RoundDown` (`floor`), and `RoundUp` (`ceil`).

---

```
round(x::Period, precision::T, [r::RoundingMode]) where T <: Union{TimePeriod, Week, Day} -> T
```

Round `x` to the nearest multiple of `precision`. If `x` and `precision` are different subtypes of `Period`, the return value will have the same type as `precision`. By default (`RoundNearestTiesUp`), ties (e.g., rounding 90 minutes to the nearest hour) will be rounded up.

For convenience, `precision` may be a type instead of a value: `round(x, Dates.Hour)` is a shortcut for `round(x, Dates.Hour(1))`.

```jldoctest
julia> round(Dates.Day(16), Dates.Week)
2 weeks

julia> round(Dates.Minute(44), Dates.Minute(15))
45 minutes

julia> round(Dates.Hour(36), Dates.Day)
2 days
```

Valid rounding modes for `round(::Period, ::T, ::RoundingMode)` are `RoundNearestTiesUp` (default), `RoundDown` (`floor`), and `RoundUp` (`ceil`).

Rounding to a `precision` of `Month`s or `Year`s is not supported, as these `Period`s are of inconsistent length.


In [119]:
round(Int64, 123.45)

123

In [121]:
round(123.45678; digits=2)

123.46

In [123]:
round(1234.567; sigdigits=3)

1230.0
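`round` has close relatives in Base for rounding in a fixed direction. The sketch below compares `floor`, `ceil` and `trunc`, and shows the ties-to-even behaviour described in the help text above; the value of `x` is just the one used earlier.

```julia
x = 123.45
println(floor(x), " ", ceil(x), " ", trunc(x))   # down, up, toward zero
println(floor(Int, x))                           # pass a type to get an integer back
println(round(2.5), " ", round(3.5))             # ties go to the nearest even integer
println(round(-123.45), " ", trunc(-123.99))
```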