# Table of Contents
1. [Variables](#️⃣-variables)
2. [Structures](#🏚-structures)
3. [Operators](#❗-operators)
4. [Functions](#🆒-functions)
5. [For Loop](#➰-for-loop)
6. [While Loop](#➰-while-loop)
7. [Native Data](#🖥-native-data)
    1. [Broadcasting Operators and Functions](#broadcasting-operators-and-functions)
    2. [Functions with a bang](#functions-with-a-bang)
    3. [Strings](#🧵-strings)
    4. [String Manipulation](#🔗-string-manipulation)
    5. [Dictionaries](#📔-dictionaries)
    6. [Splat Operator](#⏺-splat-operator)
8. [Julia Dates](#📆-julia-dates)
    1. [Date and DateTime](#date-and-datetime-types)
9. [Random](#🔢-random-numbers)
10. [DataFrames](#📑-dataframes)
11. [Filter and Subset](#🔍-filter-and-subset-dataframe)
    1. [Filter](#filter)
    2. [Subset](#subset)
12. [Select](#👆-select)
    1. [Select Renames](#renaming-colums-using-select)

#### Libraries

In [1]:
using Dates
using DataFrames
using CSV

# #️⃣ Variables

The most commonly used types:
* Integer: ```Int64```
* Real Numbers: ```Float64```
* Boolean: ```Bool```
* Strings: ```String```

In [2]:
name = "Julia"
age = 11
active = true
currency = 4.56

println(typeof(name))
println(typeof(age))
println(typeof(active))
println(typeof(currency))

String
Int64
Bool
Float64


# 🏚 Structures

In [108]:
struct Language
    name::String
    title::String
    year_of_birth::Int64
    fast::Bool
end

mutable struct MutLanguage
    name::String
    title::String
    year_of_birth::Int64
    fast::Bool
end

### You can list field names of structs

In [109]:
fieldnames(Language)

(:name, :title, :year_of_birth, :fast)

## Use of Structs

In [110]:
julia = Language("Julia", "Rapidus", 2012, true)
python = Language("Python", "Letargicus", 1991, false)

Language("Python", "Letargicus", 1991, false)

In [111]:
function printLanguage(lang::Language)
    type = typeof(lang)
    name = lang.name
    title = lang.title
    year = lang.year_of_birth
    fast = lang.fast ? "Yes" : "No"
    println("Type: $type; Name: $name, Title: $title, Year: $year, Fast: $fast")
end

function printLanguage(lang::MutLanguage)
    type = typeof(lang)
    name = lang.name
    title = lang.title
    year = lang.year_of_birth
    fast = lang.fast ? "Yes" : "No"
    println("Type: $type; Name: $name, Title: $title, Year: $year, Fast: $fast")
end

printLanguage (generic function with 2 methods)

In [112]:
printLanguage(julia)
printLanguage(python)

Type: Language; Name: Julia, Title: Rapidus, Year: 2012, Fast: Yes
Type: Language; Name: Python, Title: Letargicus, Year: 1991, Fast: No

### Mutable Struct

In [113]:
julia_mutable = MutLanguage("Julia", "Rapidus", 2012, true)
printLanguage(julia_mutable)

julia_mutable.title = "Python Obliteratus"
printLanguage(julia_mutable)

Type: MutLanguage; Name: Julia, Title: Rapidus, Year: 2012, Fast: Yes
Type: MutLanguage; Name: Julia, Title: Python Obliteratus, Year: 2012, Fast: Yes
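
For contrast, here is a minimal sketch of what happens when the same kind of assignment is attempted on an immutable struct. The ```Point``` and ```MutPoint``` types are hypothetical and exist only for this illustration.

```julia
# Hypothetical types used only for illustration.
struct Point
    x::Int64
end

mutable struct MutPoint
    x::Int64
end

p = Point(1)
mp = MutPoint(1)

mp.x = 2    # allowed: fields of a mutable struct can be reassigned

# Assigning to a field of an immutable struct throws an error,
# so it is caught here to keep the example running.
immutable_assignment_failed = try
    p.x = 2
    false
catch
    true
end

println(mp.x)
println(immutable_assignment_failed)
```

The immutable value keeps its original field, which is why ```MutLanguage``` is needed whenever a field has to change after construction.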


# ❗ Operators

There are three logical operators:
* ```!```: **NOT**
* ```&&```: **AND**
* ```||```: **OR**

In [10]:
!true

false

In [11]:
(false && true) || (!false)

true

In [12]:
(6 isa Real) && (6 isa Int64)

true

## Numeric operators

1. **Equality**
    * == "equal"
    * != or ≠ "not equal"
2. **Less Than**
    * < "less than"
    * <= or ≤ "less than or equal to"
3. **Greater Than**
    * \> "greater than"
    * \>= or ≥ "greater than or equal to"

In [13]:
1 == 1

true

In [14]:
1 >= 10

false

Comparisons also work between different numeric types:

In [15]:
1 == 1.0

true

In [16]:
(1 != 10) || (3.14 <= 2.71)

true

# 🆒 Functions

In [17]:
function function_name(arg1, arg2)
    result = "Do stuff"
    return result
end

function_name (generic function with 1 method)

In [18]:
function logarithm(x::Real, base::Real = 2.7182818284590)
    return log(base, x)
end

logarithm (generic function with 2 methods)
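
The output reports two methods because a positional argument with a default value produces one method that takes the argument and one that omits it. A small sketch of this, using a hypothetical ```scale``` function that is not part of the original notebook:

```julia
# `scale` is a hypothetical function used only for illustration.
scale(x::Real, factor::Real = 2) = x * factor

println(scale(3))                 # uses the default factor
println(scale(3, 10))             # overrides the default
println(length(methods(scale)))   # one method per call form
```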

In [19]:
function_name("abc", "mummy")

"Do stuff"

In [20]:
f_name(a::Int64, b::Int64) = a + b

f_name (generic function with 1 method)

In [21]:
f_name(4, 6)

10

# ➰ For loop

In [22]:
for i in 1:10
    println(i)
end

1
2
3
4
5
6
7
8
9
10


# ➰ While Loop

In [23]:
n = 0
while n < 3
    global n += 1
    println(n)
end

1
2
3


# 🖥 Native Data

Julia has several native data structures. Each one represents some form of structured data, and the collections among them can hold homogeneous or heterogeneous elements. Collections can be iterated with a ```for``` loop.

Data Types:
* ```String```
* ```Tuple```
* ```NamedTuple```
* ```UnitRange```
* ```Array```
* ```Pair```
* ```Dict```
* ```Symbol```

## Broadcasting Operators and Functions

We can broadcast mathematical operations like ```*``` (multiplication) or ```+``` (addition) using the dot operator. For example, broadcasted addition would imply a change from ```+``` to ```.+```:

In [24]:
[1, 2, 3] .+ 1

3-element Vector{Int64}:
 2
 3
 4

Broadcasting also works with functions: append a dot to the function name.

In [25]:
logarithm.([1, 2, 3])

3-element Vector{Float64}:
 0.0
 0.6931471805599569
 1.0986122886681282
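
Broadcasting also accepts a mix of arrays and scalars, and the ```@.``` macro adds the dots for a whole expression. A minimal sketch with made-up input values:

```julia
x = [1, 5, 10]

# Scalars are reused for every element, so each value is clamped to the range 2..8.
clamped = clamp.(x, 2, 8)

# The @. macro adds a dot to every call and operator in the expression,
# so this is the same as x .^ 2 .+ 1.
y = @. x^2 + 1

println(clamped)
println(y)
```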

## Functions with a bang ```!```

It is a Julia convention to append a bang ```!``` to names of functions that modify one or more of their arguments. This convention warns the user that the function is **not pure**, i.e., that it has *side effects*. A function with side effects is useful when you want to update a large data structure or variable container without having all the overhead from creating a new instance.

In [26]:
function add_one!(V)
    for i in eachindex(V)
        V[i] += 1
    end
    
    return nothing
end

add_one! (generic function with 1 method)

In [27]:
data = [1, 2, 3]
add_one!(data)

data

3-element Vector{Int64}:
 2
 3
 4
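
The standard library follows the same convention. A short sketch contrasting ```sort```, which returns a new vector, with ```sort!```, which modifies its argument:

```julia
data = [3, 1, 2]

sorted_copy = sort(data)          # returns a new sorted vector
unchanged = data == [3, 1, 2]     # the original is untouched at this point

sort!(data)                       # sorts `data` in place

println(sorted_copy)
println(unchanged)
println(data)
```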

## 🧵 Strings

In [28]:
typeof("This is String")

String

In [29]:
text = "
This is a big multiline string.
As you can see.
It is still a String to Julia.
"

println(typeof(text))
println(text)

String

This is a big multiline string.
As you can see.
It is still a String to Julia.



In [30]:
s = """
    This is a big multiline string with a nested "quotation".
    As you can see.
    It is still a String to Julia.
    """

println(typeof(s))
println(s)

String
This is a big multiline string with a nested "quotation".
As you can see.
It is still a String to Julia.



## ➕ String Concatenation

Strings are concatenated with ```*```, not ```+```:

In [31]:
hello = "Hello"
goodbye = "Goodbye"

hello * goodbye

"HelloGoodbye"

In [32]:
join([hello, goodbye], " ")

"Hello Goodbye"

In [33]:
println("$hello $goodbye")

Hello Goodbye


## 🔗 String Manipulation

In [34]:
julia_string = "Julia is an amazing open source programming language"

"Julia is an amazing open source programming language"

In [35]:
contains(julia_string, "Julia")

true

In [36]:
startswith(julia_string, "Julia")

true

In [37]:
endswith(julia_string, "Julia")

false

In [38]:
lowercase(julia_string)

"julia is an amazing open source programming language"

In [39]:
uppercase(julia_string)

"JULIA IS AN AMAZING OPEN SOURCE PROGRAMMING LANGUAGE"

In [40]:
titlecase(julia_string)

"Julia Is An Amazing Open Source Programming Language"

In [41]:
split(julia_string, " ")

8-element Vector{SubString{String}}:
 "Julia"
 "is"
 "an"
 "amazing"
 "open"
 "source"
 "programming"
 "language"

### String to a number

In [42]:
parse(Int64, "123")

123

In [43]:
# parse(Int64, "asd") # -- throws an ArgumentError because the string is not a valid integer

In [44]:
tryparse(Int64, "asd")

In [45]:
tryparse(Int64, "1abc")
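
```tryparse``` returns ```nothing``` instead of throwing when the string is not a valid number, which is why the two cells above show no output. A minimal sketch of handling that result; the ```parse_or_default``` helper is hypothetical:

```julia
# Hypothetical helper: fall back to a default value when parsing fails.
# `something` returns its first argument that is not `nothing`.
parse_or_default(s, default) = something(tryparse(Int64, s), default)

println(tryparse(Int64, "asd") === nothing)
println(parse_or_default("123", 0))
println(parse_or_default("1abc", 0))
```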

# 📔 Dictionaries

In [46]:
my_dict = Dict([("one", 1), ("two", 2)])

Dict{String, Int64} with 2 entries:
  "two" => 2
  "one" => 1

In [47]:
my_dict = Dict("one" => 1, "two" => 2)

Dict{String, Int64} with 2 entries:
  "two" => 2
  "one" => 1

In [48]:
my_dict["one"]

1

In [49]:
my_dict["three"] = 3

3

In [50]:
my_dict["three"]

3

In [51]:
"two" in keys(my_dict)

true
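
The same membership test can be written with ```haskey```, and ```get``` returns a fallback value when a key is absent. A small sketch on a separate, made-up dictionary so that ```my_dict``` is left alone:

```julia
scores = Dict("one" => 1, "two" => 2)

println(haskey(scores, "one"))      # membership test on the keys
println(get(scores, "three", 0))    # 0 is the default for a missing key

# Iterating a Dict yields key => value pairs; the order is not guaranteed.
for (key, value) in scores
    println("$key => $value")
end
```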

In [52]:
delete!(my_dict, "three")

Dict{String, Int64} with 2 entries:
  "two" => 2
  "one" => 1

In [53]:
popped_value = pop!(my_dict, "two") # Keep deleted value

2

In [54]:
A = ["one", "two", "three"]
B = [1, 2, 3]

my_dict = Dict(zip(A, B))

my_dict

Dict{String, Int64} with 3 entries:
  "two"   => 2
  "one"   => 1
  "three" => 3

# ⏺ Splat operator

In Julia, you can expand a collection into a sequence of function arguments with ```...```:

In [55]:
my_collection = [1, 2, 3]
add_elements(a, b, c) = a + b + c

add_elements (generic function with 1 method)

In [56]:
# before
add_elements(my_collection[1], my_collection[2], my_collection[3])

6

In [57]:
# after
add_elements(my_collection...)

6
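
The same ```...``` syntax works in the other direction in a function definition, where it collects any number of arguments into a tuple. A minimal sketch; ```sum_all``` and ```limits``` are hypothetical names:

```julia
# In a definition, `xs...` collects all arguments into a tuple.
sum_all(xs...) = sum(xs)

println(sum_all(1, 2, 3))

# In a call, `...` spreads a collection (here a tuple) into separate arguments.
limits = (3, 7, 5)
println(max(limits...))
```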

# 📆 Julia Dates

## ```Date``` and ```DateTime``` Types

The ```Dates``` standard library module has two types for working with dates:
1. ```Date```: representing time in days
2. ```DateTime```: representing time with millisecond precision

In [58]:
Date(1987) # year

1987-01-01

In [59]:
Date(1987, 9) # year, month

1987-09-01

In [60]:
Date(1987, 9, 13) # year, month, day

1987-09-13

In [61]:
DateTime(1987, 9, 13, 21) # year, month, day, hour

1987-09-13T21:00:00

In [62]:
DateTime(1987, 9, 13, 21, 21) # year, month, day, hour, minute

1987-09-13T21:21:00

In [63]:
DateTime(Year(1987), Month(9), Day(13), Hour(21), Minute(21))

1987-09-13T21:21:00

### Parsing Dates

In [64]:
Date("19870913", "yyyymmdd")

1987-09-13

In [65]:
DateTime("1987-09-13T21:21:00", "yyyy-mm-ddTHH:MM:SS")

1987-09-13T21:21:00

In [66]:
Date("19870913", dateformat"yyyymmdd")

1987-09-13
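
The ```dateformat"..."``` string macro builds the format object once, which helps when the same format is reused for many strings. A minimal sketch with made-up input strings that also turns a date back into text with ```Dates.format```:

```julia
using Dates

fmt = dateformat"yyyymmdd"              # format object created once
raw = ["19870913", "20210101"]          # made-up sample input
parsed = [Date(s, fmt) for s in raw]    # reuse the same format for every string

println(parsed)
println(Dates.format(parsed[1], "dd/mm/yyyy"))
```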

In [67]:
my_birthday = Date("1987-09-13")

1987-09-13

In [68]:
year(my_birthday)

1987

In [69]:
month(my_birthday)

9

In [70]:
day(my_birthday)

13

In [71]:
yearmonth(my_birthday)

(1987, 9)

In [72]:
monthday(my_birthday)

(9, 13)

In [73]:
yearmonthday(my_birthday)

(1987, 9, 13)

In [74]:
dayofweek(my_birthday)

7

In [75]:
dayname(my_birthday)

"Sunday"

In [76]:
dayofweekofmonth(my_birthday) # second Sunday of September

2

In [77]:
my_birthday + Day(90)

1987-12-12

In [78]:
my_birthday + Day(90) + Month(2) + Year(1)

1989-02-11

In [79]:
today() - my_birthday

13231 days

In [80]:
DateTime(today()) - DateTime(my_birthday)

1143158400000 milliseconds

In [81]:
for date in Date("2021-01-01"):Day(1):Date("2021-01-07")
    println(date)
end

2021-01-01
2021-01-02
2021-01-03
2021-01-04
2021-01-05
2021-01-06
2021-01-07


In [82]:
for date in Date("2021-01-01"):Day(3):Date("2021-01-07")
    println(date)
end

2021-01-01
2021-01-04
2021-01-07


In [83]:
for date in Date("2021-01-01"):Month(1):Date("2021-03-01")
    println(date)
end

2021-01-01
2021-02-01
2021-03-01


In [84]:
date_interval = Date("2021-01-01"):Month(1):Date("2021-03-01")
typeof(date_interval)

StepRange{Date, Month}

In [85]:
collected_date_interval = collect(date_interval)

3-element Vector{Date}:
 2021-01-01
 2021-02-01
 2021-03-01

In [86]:
collected_date_interval[end]

2021-03-01

In [87]:
collected_date_interval .+ Day(10)

3-element Vector{Date}:
 2021-01-11
 2021-02-11
 2021-03-11

# 🔢 Random Numbers

In [88]:
using Random: seed!

Two functions are used most often:
* ```rand```: a random element of a data structure; with no argument, a ```Float64``` in [0, 1)
* ```randn```: random number from standard normal distribution (mean 0 and standard deviation 1)
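
Both functions return different values on every run, so the outputs in this section will not match a fresh execution. The ```seed!``` function imported above makes a sequence reproducible. A minimal sketch with an arbitrary seed value:

```julia
using Random: seed!

seed!(123)              # 123 is an arbitrary seed
first_draw = rand(3)

seed!(123)              # resetting the seed replays the same sequence
second_draw = rand(3)

println(first_draw == second_draw)
```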

### rand

In [89]:
rand()

0.6179989070822092

In [90]:
rand(3)

3-element Vector{Float64}:
 0.10744384560400977
 0.21742581630896074
 0.7629251062013356

In [91]:
rand(1.0:10.0)

7.0

In [92]:
rand(2:2:20)

10

In [93]:
rand(2:2:20, 3)

3-element Vector{Int64}:
  8
  4
 20

In [94]:
rand((42, "Julia", 3.14))

42

In [95]:
rand([1, 2, 3])

3

In [96]:
rand(Dict(:one => 1, :two => 2))

:one => 1

In [97]:
rand(1.0:3.0, (2, 2))

2×2 Matrix{Float64}:
 1.0  2.0
 1.0  2.0

### randn

In [98]:
randn()

0.37891423571522187

In [99]:
randn((2, 2))

2×2 Matrix{Float64}:
 0.427803  -0.43792
 0.111006  -1.20213

# 📑 DataFrames

In [100]:
names = ["Sally", "Bob", "Alice", "Hank"]
grades = [1, 5, 8.5, 4]

df = DataFrame(; name= names, grade_2023= grades)

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Bob | 5.0 |
| 3 | Alice | 8.5 |
| 4 | Hank | 4.0 |


In [101]:
function grades_2023()
    name = ["Sally", "Bob", "Alice", "Hank"]
    grade_2023 = [1, 5, 8.5, 4]

    DataFrame(; name, grade_2023)
end

df = grades_2023()

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Bob | 5.0 |
| 3 | Alice | 8.5 |
| 4 | Hank | 4.0 |


Overwrite the contents of ```df```:

In [102]:
df = DataFrame(name= ["Malice"], grade_2023= ["10"])

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | String |
| 1 | Malice | 10 |


Restore it by calling the function again:

In [103]:
df = grades_2023()

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Bob | 5.0 |
| 3 | Alice | 8.5 |
| 4 | Hank | 4.0 |


In [104]:
DataFrame(σ = ["a", "a", "a"], δ = [π, π/2, π/3])

| Row | σ | δ |
| --- | --- | --- |
|  | String | Float64 |
| 1 | a | 3.14159 |
| 2 | a | 1.5708 |
| 3 | a | 1.0472 |


In [105]:
df

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Bob | 5.0 |
| 3 | Alice | 8.5 |
| 4 | Hank | 4.0 |


In [106]:
function grades_2023(names::Vector{Int})
    df = grades_2023()
    df[names, :]
end

grades_2023([2, 1])

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Bob | 5.0 |
| 2 | Sally | 1.0 |


In [107]:
df

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Bob | 5.0 |
| 3 | Alice | 8.5 |
| 4 | Hank | 4.0 |


# 🔍 Filter and Subset DataFrame

## Filter

```filter``` was added to DataFrames.jl earlier than ```subset```, is more powerful, and is more consistent with the syntax of Julia's Base library.

In [114]:
grades_2023()

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Bob | 5.0 |
| 3 | Alice | 8.5 |
| 4 | Hank | 4.0 |


We can filter rows by using ```filter(source => f::Function, df)```.

In [115]:
equals_alice(name::String) = name == "Alice"

equals_alice (generic function with 1 method)

We can pass this function as an argument to ```filter```:

In [116]:
filter(:name => equals_alice, grades_2023())

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Alice | 8.5 |


```filter``` also works on vectors:

In [117]:
filter(equals_alice, ["Alice", "Bob", "Dave"])

1-element Vector{String}:
 "Alice"

We can also use an **anonymous function**:

In [118]:
filter(:name => n -> n == "Alice", grades_2023())

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Alice | 8.5 |


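We can make it even shorter with partial function application: ```==("Alice")``` returns a function that compares its argument with ```"Alice"```.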

In [119]:
filter(:name => ==("Alice"), grades_2023())

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Alice | 8.5 |


To get rows that are not Alice simply use ```!=```

In [120]:
filter(:name => !=("Alice"), grades_2023())

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Bob | 5.0 |
| 3 | Hank | 4.0 |


A more complex filter: keep the people whose names start with A or B and whose grade is above 6.

In [121]:
function complex_filter(name, grade)::Bool
    desired_name = startswith(name, 'A') || startswith(name, 'B')
    desired_grade = 6 < grade

    return desired_grade && desired_name
end

complex_filter (generic function with 1 method)

In [122]:
filter([:name, :grade_2023] => complex_filter, grades_2023())

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Alice | 8.5 |


## Subset

```subset``` makes it easier to work with missing values. In contrast to ```filter```, it operates on complete columns rather than on rows or single values. To reuse the functions defined above, wrap them in ```ByRow```:

In [123]:
subset(grades_2023(), :name => ByRow(equals_alice))

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Alice | 8.5 |
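
Because ```subset``` hands the whole column to the function, a function that is not wrapped in ```ByRow``` can use column-level information, as long as it returns one Boolean per row. A minimal sketch, assuming the DataFrames package is installed; the comparison against the column mean is a made-up condition for illustration:

```julia
using DataFrames
using Statistics: mean

# Same data as grades_2023(), repeated so the block is self-contained.
df = DataFrame(name = ["Sally", "Bob", "Alice", "Hank"], grade_2023 = [1, 5, 8.5, 4])

# Without ByRow the function receives the entire column,
# so each grade can be compared against the column mean.
above_average = subset(df, :grade_2023 => g -> g .> mean(g))

println(above_average.name)
```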


We can also use anonymous functions

In [125]:
subset(grades_2023(), :name => ByRow(name -> name == "Alice"))

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Alice | 8.5 |


In [124]:
subset(grades_2023(), :name => ByRow(==("Alice")))

| Row | name | grade_2023 |
| --- | --- | --- |
|  | String | Float64 |
| 1 | Alice | 8.5 |


A more complex use of ```subset```:

In [126]:
function salaries()
    names = ["John", "Hank", "Karen", "Zed"]
    salary = [1_900, 2_800, 2_800, missing]
    DataFrame(; names, salary)
end
salaries()

| Row | names | salary |
| --- | --- | --- |
|  | String | Int64? |
| 1 | John | 1900 |
| 2 | Hank | 2800 |
| 3 | Karen | 2800 |
| 4 | Zed | missing |


In this ```DataFrame``` the salary value for Zed is missing.

If we want to use ```filter``` on this ```DataFrame``` it will throw an error:

In [128]:
# filter(:salary => >(2_000), salaries()) # -- TypeError: non-boolean (Missing) used in boolean context

```subset``` will also fail, but its error message points us towards an easy solution:

In [130]:
# subset(salaries(), :salary => ByRow(>(2_000))) # -- ArgumentError: missing was returned in condition number 1 but only true or false are allowed; pass skipmissing=true to skip missing values

As the message suggests, we can add the ```skipmissing= true``` argument:

In [131]:
subset(salaries(), :salary => ByRow(>(2_000)), skipmissing= true)

| Row | names | salary |
| --- | --- | --- |
|  | String | Int64? |
| 1 | Hank | 2800 |
| 2 | Karen | 2800 |
2,Karen,2800


# 👆 Select

```filter``` **removes rows**, ```select``` **removes columns**.

It is, however, more versatile than just removing columns.

In [133]:
function responses()
    id = [1, 2]
    q1 = [28, 61]
    q2 = [:us, :fr]
    q3 = ["F", "B"]
    q4 = ["B", "C"]
    q5 = ["A", "E"]
    DataFrame(; id, q1, q2, q3, q4, q5)
end
responses()

| Row | id | q1 | q2 | q3 | q4 | q5 |
| --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | Int64 | Symbol | String | String | String |
| 1 | 1 | 28 | us | F | B | A |
| 2 | 2 | 61 | fr | B | C | E |


Here, the data represents answers for five questions (q1, q2, …, q5) in a given questionnaire. We will start by “selecting” a few columns from this dataset.

As usual, we use **symbols** to specify columns:

In [134]:
select(responses(), :id, :q1)

| Row | id | q1 |
| --- | --- | --- |
|  | Int64 | Int64 |
| 1 | 1 | 28 |
| 2 | 2 | 61 |


We can also use **strings**

In [135]:
select(responses(), "id", "q1", "q2")

| Row | id | q1 | q2 |
| --- | --- | --- | --- |
|  | Int64 | Int64 | Symbol |
| 1 | 1 | 28 | us |
| 2 | 2 | 61 | fr |


Regular expressions work as well:

In [137]:
select(responses(), r"^q")

| Row | q1 | q2 | q3 | q4 | q5 |
| --- | --- | --- | --- | --- | --- |
|  | Int64 | Symbol | String | String | String |
| 1 | 28 | us | F | B | A |
| 2 | 61 | fr | B | C | E |


To select everything except one or more columns, use **Not**:

In [139]:
select(responses(), Not(:q5, :q1, :id))

| Row | q2 | q3 | q4 |
| --- | --- | --- | --- |
|  | Symbol | String | String |
| 1 | us | F | B |
| 2 | fr | B | C |


You can also mix different selectors:

In [140]:
select(responses(), :q5, Not(:q5))

| Row | q5 | id | q1 | q2 | q3 | q4 |
| --- | --- | --- | --- | --- | --- | --- |
|  | String | Int64 | Int64 | Symbol | String | String |
| 1 | A | 1 | 28 | us | F | B |
| 2 | E | 2 | 61 | fr | B | C |


```q5``` is our first column because it was selected before the other columns.

You can achieve the same thing by using ```:``` instead of ```Not(:q5)```. You can treat ```:``` as "all columns that we haven't included yet":

In [141]:
select(responses(), :q5, :)

| Row | q5 | id | q1 | q2 | q3 | q4 |
| --- | --- | --- | --- | --- | --- | --- |
|  | String | Int64 | Int64 | Symbol | String | String |
| 1 | A | 1 | 28 | us | F | B |
| 2 | E | 2 | 61 | fr | B | C |


Or, to put ```q5``` in the second position:

In [144]:
select(responses(), 1, :q5, :)

| Row | id | q5 | q1 | q2 | q3 | q4 |
| --- | --- | --- | --- | --- | --- | --- |
|  | Int64 | String | Int64 | Symbol | String | String |
| 1 | 1 | A | 28 | us | F | B |
| 2 | 2 | E | 61 | fr | B | C |


### Renaming Colums using Select

In [145]:
select(responses(), 1 => "Participant", :q1 => "Age", :q2 => "Nationality")

| Row | Participant | Age | Nationality |
| --- | --- | --- | --- |
|  | Int64 | Int64 | Symbol |
| 1 | 1 | 28 | us |
| 2 | 2 | 61 | fr |


Using the [splat](#⏺-splat-operator) operator, we can write:

In [146]:
renames = (1 => "Participant", :q1 => "Age", :q2 => "Nationality")
select(responses(), renames...)

| Row | Participant | Age | Nationality |
| --- | --- | --- | --- |
|  | Int64 | Int64 | Symbol |
| 1 | 1 | 28 | us |
| 2 | 2 | 61 | fr |
