# Introduction to Julia
## Julia Language Context and Basic Concepts
by Iga Szczesniak (AIR Centre) and Júlio Hoffimann (Arpeggeo®)

![Julia EO 2024 Banner](figures/JuliaEOlogo.png "Julia EO 2024 Banner")

>**Quick Facts**
>
>- Julia is an **open-source**, **high-level**, and **high-performance** programming language developed at MIT by Jeff Bezanson, Stefan Karpinski, Viral B. Shah, and Alan Edelman.
>- It was **first released in 2012**, with version 1.0 following in 2018. At the time of writing, the latest version is **v1.10.0**.
>- Julia addresses **the two-language problem** by combining speed close to that of C with a user-friendly syntax similar to Python or MATLAB.
>- It is **dynamically typed** and **just-in-time compiled**.
>- Julia is a **function-oriented** language.
>- Well-suited for **data science applications, technical computing, and scientific machine learning**.

In [1]:
versioninfo()

Julia Version 1.9.0-rc1
Commit 3b2e0d8fbc1 (2023-03-07 07:51 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: 8 × Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, icelake-client)
  Threads: 8 on 8 virtual cores
Environment:
  JULIA_NUM_THREADS = 8


In [2]:
VERSION

v"1.9.0-rc1"

## 1. Getting started
- Defining variables
- Printing 
- Simple math syntax

### Defining variables

In [3]:
name = "JuliaEO"

"JuliaEO"

In [4]:
age = 2 # years
# You can leave comments by typing the hash key

2

In [5]:
typeof(name)
# Press Shift+Enter or click the play button to execute the cell

String

In [6]:
typeof(age)

Int64

In [7]:
# Convert age to a float and assign it to the variable age_float
age_float = convert(Float32, age)

2.0f0
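
`convert` is one of several ways to move between types. The following is a minimal sketch of the related options, assuming nothing beyond Base Julia; all variable names are illustrative and not part of the original notebook.

```julia
# Illustrative names; not part of the original notebook.
years = 2

as_f64 = Float64(years)           # calling the type as a constructor
as_f32 = convert(Float32, years)  # the approach used in the cell above

parsed = parse(Int, "42")         # strings are parsed, not converted
as_text = string(years)           # any value can be turned into a String

# convert only succeeds when the value is exactly representable in the target type
inexact = try
    convert(Int, 2.5)
catch err
    err
end

println(as_f64, " ", as_f32, " ", parsed, " ", as_text)
println(typeof(inexact))
```

Note that `convert(Int, 2.5)` raises an error instead of silently rounding; use `round(Int, 2.5)` or `floor(Int, 2.5)` when rounding is what you want.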

### Printing

In [8]:
println("Hello JuliaEO!")

Hello JuliaEO!


We can use the `$` sign to insert existing variables into a string

In [9]:
println("Name = $name")

Name = JuliaEO


We can use the `*` sign for concatenation

In [10]:
"Name = $name, " * "Age = $age"

"Name = JuliaEO, Age = 2"

### Simple math syntax 

In [11]:
total = 2 + 4 # named `total` because assigning to `sum` would shadow Julia's built-in `sum` function

6

In [12]:
difference = 5 - 3

2

In [13]:
product = 3 * 100

300

In [14]:
quotient = 20 / 4

5.0

In [15]:
power = 10 ^ 2 

100
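
Notice that `20 / 4` returned `5.0` rather than `5`. The sketch below, using only Base Julia and illustrative variable names, summarises how the division-related operators differ.

```julia
# `/` always returns a floating-point result, even for two integers
q_float = 20 / 4        # 5.0

# `÷` (typed \div<TAB>) or div() performs integer division
q_int = 20 ÷ 4          # 5
q_trunc = div(7, 2)     # 3, truncated toward zero

# `%` or rem() gives the remainder
r = 7 % 2               # 1

# `//` builds an exact rational number
ratio = 3 // 6          # 1//2

# Integer powers stay integers; a float base gives a float
p_int = 10 ^ 2          # 100
p_float = 10.0 ^ 2      # 100.0
```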

## 2. Data Structures
Once we start working with many variables, it is better to store the data in structures. 

### Struct

We can define custom data structures that hold multiple pieces of data in a well-defined format with a `struct`.
<br>Note: Types declared with `struct` are **immutable**. Once we assign values to the fields, we cannot change them later.<br>  

In [16]:
struct Location
    name::String
    lat::Float32
    lon::Float32
    island::Bool
end

In [17]:
fieldnames(Location)

(:name, :lat, :lon, :island)

In [18]:
terceira = Location("Terceira", 38.7216, -27.2206, true) # longitudes west of Greenwich are negative

Location("Terceira", 38.7216f0, -27.2206f0, true)

In [19]:
paris = Location("Paris", 48.8566, 2.3522, false)

Location("Paris", 48.8566f0, 2.3522f0, false)
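
To see what immutability means in practice, here is a minimal sketch. `Site` is a hypothetical type introduced only for this illustration, so that it does not clash with `Location`.

```julia
# `Site` is a hypothetical type used only for this illustration.
struct Site
    name::String
    lat::Float32
end

lisbon = Site("Lisbon", 38.7223)

# Reading a field works as usual
println(lisbon.name)

# Assigning to a field of an immutable struct throws an error
outcome = try
    lisbon.name = "LIS"
    "field updated"
catch err
    "update rejected: $(typeof(err))"
end
println(outcome)

# To "change" a value, build a new instance instead
lisbon_renamed = Site("LIS", lisbon.lat)
println(lisbon_renamed)
```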

### Mutable Struct

If we want to update any of the variables, we use a `mutable struct`.

In [20]:
mutable struct MutableLocation
    name::String
    lat::Float32
    lon::Float32
    island::Bool
end

In [21]:
terceira_mutable = MutableLocation("Terceira", 38.7216, -27.2206, true)

MutableLocation("Terceira", 38.7216f0, -27.2206f0, true)

In [22]:
terceira_mutable.name = "TER"

"TER"

In [23]:
terceira_mutable

MutableLocation("TER", 38.7216f0, -27.2206f0, true)

## 3. Native Data Structures 

### Tuples
Tuples are ordered sequences of elements that allow indexing. They are **immutable**, meaning their elements cannot be updated.

In [24]:
my_tuple = (1, "hello", 3.14)

(1, "hello", 3.14)

In [25]:
my_tuple[1]

1

In [26]:
typeof(my_tuple)

Tuple{Int64, String, Float64}

>Note: Julia uses 1-based indexing, not 0-based indexing like Python.

Named Tuples are almost like regular Tuples, but they assign a name to each element.

In [27]:
named_tuple = (number = 1, word = "hello", pi = "3.14")

(number = 1, word = "hello", pi = "3.14")

In [28]:
named_tuple[1]

1

In [29]:
named_tuple.word

"hello"

In [30]:
typeof(named_tuple)

NamedTuple{(:number, :word, :pi), Tuple{Int64, String, String}}

### Dictionaries
Dictionaries store key-value pairs of related data, e.g. a contacts list. They are **mutable**, allowing us to add, update, and remove entries, but they are unordered, so we look values up by key rather than by position.

In [31]:
contactslist = Dict("Marta" => "555 333", "Alice" => "222 888")

Dict{String, String} with 2 entries:
  "Alice" => "222 888"
  "Marta" => "555 333"

In [32]:
# Accessing elements from a dictionary 
contactslist["Alice"]

"222 888"

In [33]:
# Adding another entry to a dictionary 
contactslist["Emma"] = "111 000"

"111 000"

In [34]:
contactslist

Dict{String, String} with 3 entries:
  "Alice" => "222 888"
  "Marta" => "555 333"
  "Emma"  => "111 000"

In [35]:
# Removing elements from a dictionary 
pop!(contactslist, "Alice")

"222 888"

In [36]:
contactslist

Dict{String, String} with 2 entries:
  "Marta" => "555 333"
  "Emma"  => "111 000"

Dictionaries are not ordered, so we can't index into them by position. Here `1` is treated as a key, and no such key exists.

In [37]:
contactslist[1]

KeyError: key 1 not found

But they are mutable and we can update their values.


In [38]:
contactslist["Emma"] = "222 000"
contactslist

Dict{String, String} with 2 entries:
  "Marta" => "555 333"
  "Emma"  => "222 000"

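A few more dictionary operations are worth knowing, especially for avoiding a `KeyError`. This is a minimal sketch using only Base Julia; `phonebook` is an illustrative dictionary, separate from `contactslist`.

```julia
# `phonebook` is an illustrative dictionary, not part of the original notebook.
phonebook = Dict("Marta" => "555 333", "Emma" => "222 000")

# haskey checks for a key without risking a KeyError
println(haskey(phonebook, "Emma"))    # true
println(haskey(phonebook, "Alice"))   # false

# get returns a fallback value when the key is missing
number = get(phonebook, "Alice", "unknown")
println(number)

# Iterating yields key => value pairs; the order is not guaranteed
for (person, phone) in phonebook
    println(person, ": ", phone)
end

# keys (and values) can be collected into ordinary arrays
names_sorted = sort(collect(keys(phonebook)))
println(names_sorted)
```
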
### Arrays
Arrays are **mutable** and **ordered**, which means we can update their elements and index into them.  

In [39]:
bestmovies = ["Harry Potter", "Interstellar", "Black Swan", "Oppenheimer", "Moonrise Kingdom"]

5-element Vector{String}:
 "Harry Potter"
 "Interstellar"
 "Black Swan"
 "Oppenheimer"
 "Moonrise Kingdom"

In [40]:
bestmovies[1]

"Harry Potter"

In [41]:
bestmovies[1] = "The Godfather"

"The Godfather"

In [42]:
bestmovies

5-element Vector{String}:
 "The Godfather"
 "Interstellar"
 "Black Swan"
 "Oppenheimer"
 "Moonrise Kingdom"

We can also edit an array by using the `push!` and `pop!` functions. `push!` adds an element to the end of an array and `pop!` removes the last element of an array.

In [43]:
push!(bestmovies, "Lost in Translation")

6-element Vector{String}:
 "The Godfather"
 "Interstellar"
 "Black Swan"
 "Oppenheimer"
 "Moonrise Kingdom"
 "Lost in Translation"

And then remove it.

In [44]:
pop!(bestmovies)

"Lost in Translation"

In [45]:
bestmovies

5-element Vector{String}:
 "The Godfather"
 "Interstellar"
 "Black Swan"
 "Oppenheimer"
 "Moonrise Kingdom"

Arrays can have any number of dimensions, and they can store other arrays too.

In [46]:
numbers = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

3-element Vector{Vector{Int64}}:
 [1, 2, 3]
 [4, 5]
 [6, 7, 8, 9]

In [47]:
rand(4,5)

4×5 Matrix{Float64}:
 0.914189  0.40487    0.89453    0.221824  0.119625
 0.3516    0.103428   0.595414   0.630268  0.355562
 0.9532    0.0618298  0.0916936  0.206443  0.13951
 0.190692  0.26316    0.75798    0.92707   0.281857

**Common arrays**

In [48]:
zeros(3,3)

3×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0

In [49]:
ones(Int64, 3,3)

3×3 Matrix{Int64}:
 1  1  1
 1  1  1
 1  1  1

**Concatenation of arrays**

`cat()` is a built-in function for concatenating arrays along a specified dimension.

In [50]:
a = [1, 2, 3, 4] 
b = [5, 10, 15, 20] 

4-element Vector{Int64}:
  5
 10
 15
 20

In [51]:
cat(a, b, dims=1) # Same as vcat

8-element Vector{Int64}:
  1
  2
  3
  4
  5
 10
 15
 20

In [52]:
cat(a, b, dims=2) # same as hcat

4×2 Matrix{Int64}:
 1   5
 2  10
 3  15
 4  20

Alternatively, we could use `vcat()` to concatenate the given arrays along dimension 1 and `hcat()` to concatenate them along dimension 2.

In [53]:
vcat(a, b) 

8-element Vector{Int64}:
  1
  2
  3
  4
  5
 10
 15
 20

In [54]:
hcat(a, b) 

4×2 Matrix{Int64}:
 1   5
 2  10
 3  15
 4  20
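
Julia's array literal syntax offers a shorthand for the same operations. A minimal sketch, with illustrative vectors `u` and `w` that mirror `a` and `b` from the cells above:

```julia
# Illustrative vectors; not part of the original notebook.
u = [1, 2, 3, 4]
w = [5, 10, 15, 20]

stacked = [u; w]     # semicolon: vertical concatenation, like vcat(u, w)
side = [u w]         # space: horizontal concatenation, like hcat(u, w)

println(stacked == vcat(u, w))
println(side == hcat(u, w))
println(size(stacked), " ", size(side))
```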

**Array Inspection**

In [55]:
eltype(bestmovies)

String

In [56]:
length(bestmovies) # Total number of elements

5

In [57]:
ndims(bestmovies) # Number of dimensions

1

In [58]:
size(bestmovies) # Array’s dimensions

(5,)

In [59]:
size(bestmovies, 1)

5

**Indexing**

In [60]:
# Example vector
my_vec = [1, 2, 3, 4, 5]

5-element Vector{Int64}:
 1
 2
 3
 4
 5

In [61]:
my_vec[3]

3

In [62]:
my_vec[end]

5

In [63]:
# Example matrix
my_mat = [[1 2 3]
          [4 5 6]
          [7 8 9]]

3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

In [64]:
my_mat[2,1]

4

**Slicing**

In [65]:
my_vec[2:4]

3-element Vector{Int64}:
 2
 3
 4

In [66]:
my_mat[2, :] # Access the 2nd row of the matrix 

3-element Vector{Int64}:
 4
 5
 6
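
One detail that often surprises newcomers is that a slice is a copy, so modifying it leaves the original array untouched. The sketch below contrasts a slice with a view created by `@view`; `grid` is an illustrative matrix, not a variable from the original notebook.

```julia
# Illustrative data; `grid` is not a variable from the original notebook.
grid = [1 2 3; 4 5 6; 7 8 9]

row_copy = grid[2, :]          # slicing allocates a new vector
row_copy[1] = -1               # the matrix is unchanged
println(grid[2, 1])            # 4

row_view = @view grid[2, :]    # a view shares memory with the matrix
row_view[1] = -1               # this writes through to `grid`
println(grid[2, 1])            # -1
```

Views avoid allocating memory, which can matter for large arrays, at the price of aliasing the original data.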

**Manipulations**

In [67]:
my_mat[2,2] = 100

100

In [68]:
my_mat

3×3 Matrix{Int64}:
 1    2  3
 4  100  6
 7    8  9

In [69]:
my_mat[3, :] = [99,99,99]

3-element Vector{Int64}:
 99
 99
 99

In [70]:
my_mat

3×3 Matrix{Int64}:
  1    2   3
  4  100   6
 99   99  99

**Multiplication**

In [71]:
A = rand(0:9,3,3)

3×3 Matrix{Int64}:
 5  3  0
 2  8  4
 0  1  1

In [72]:
x = fill(1.0, (3,))

3-element Vector{Float64}:
 1.0
 1.0
 1.0

In [73]:
b = A*x

3-element Vector{Float64}:
  8.0
 14.0
  2.0

**Transposition**

In [74]:
A'

3×3 adjoint(::Matrix{Int64}) with eltype Int64:
 5  2  0
 3  8  1
 0  4  1

or

In [75]:
transpose(A)

3×3 transpose(::Matrix{Int64}) with eltype Int64:
 5  2  0
 3  8  1
 0  4  1

In [76]:
A'A

3×3 Matrix{Int64}:
 29  31   8
 31  74  33
  8  33  17
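
`A'` and `transpose(A)` gave the same values above because `A` is real. The sketch below, with illustrative matrices `M` and `C`, shows where the two differ and how `*` differs from its broadcast form `.*`.

```julia
# Illustrative matrices; the names are not from the original notebook.
M = [1 2; 3 4]
C = [1+2im 3; 4 5-1im]

# For real matrices, adjoint (') and transpose agree
println(M' == transpose(M))

# For complex matrices, ' also conjugates each entry; transpose does not
println(C' == transpose(C))
println(C' == conj(transpose(C)))

# `*` is matrix multiplication, `.*` multiplies element by element
println(M * M)
println(M .* M)
```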

## 4. Loops

### While loop

The syntax for a `while` loop is:

```julia
while *condition*
    *loop body*
end
```

For example, we could use `while` to count or to iterate over the elements of an array.

In [77]:
n = 0
while n < 10
    n += 1
    println(n)
end

1
2
3
4
5
6
7
8
9
10
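
Iterating over an array with `while` follows the same pattern, with an index as the counter. This is a minimal sketch; `print_all` is a hypothetical helper, and the loop is wrapped in a function so that the counter `i` is an ordinary local variable.

```julia
# `print_all` is a hypothetical helper used only for this illustration.
function print_all(items)
    i = 1
    while i <= length(items)
        println(i, ": ", items[i])
        i += 1
    end
    return i - 1   # number of items visited
end

visited = print_all(["Terceira", "Faial", "Pico"])
println("visited ", visited, " items")
```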


### For loop

The syntax for a `for` loop is:

```julia
for *var* in *loop iterable*
    *loop body*
end
```

We could use a `for` loop to generate the same result as in the example above.

In [78]:
for n in 1:10
    println(n)
end

1
2
3
4
5
6
7
8
9
10


In [79]:
for n ∈ 1:10
    println(n)
end

1
2
3
4
5
6
7
8
9
10
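
`for` loops work on any iterable, not only ranges. The sketch below shows a few common patterns from Base Julia; `labels` and `counts` are illustrative arrays that do not appear in the original notebook.

```julia
# Illustrative data; these names are not from the original notebook.
labels = ["a", "b", "c"]
counts = [10, 20, 30]

# Iterate directly over the elements
for label in labels
    println(label)
end

# enumerate yields (index, element) pairs
for (i, label) in enumerate(labels)
    println(i, " => ", label)
end

# zip walks several collections in lockstep
for (label, count) in zip(labels, counts)
    println(label, ": ", count)
end

# eachindex gives the valid indices of an array
for i in eachindex(labels)
    println(labels[i], " has count ", counts[i])
end
```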


## 5. Conditionals

#### with the `if` keyword
In Julia, the syntax for a conditional statement is:

```julia
if *condition 1*
    *option 1*
elseif *condition 2*
    *option 2*
else
    *option 3*
end
```

For example, we want to write a conditional statement that prints a number if the number is even and the string "odd" if the number is odd.

In [80]:
x = 1
if (x % 2 == 0)
    println(x)
else
    println("odd")
end

odd


#### with ternary operators
In Julia, we can rewrite this code using the ternary operator 

```julia
a ? b : c
```

which is equivalent to 

```julia
if a
    b
else
    c
end
```

In [81]:
(x % 2 == 0) ? x : "odd"

"odd"

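Ternary expressions are most useful inside short function definitions. A minimal sketch follows; `parity` and `classify` are hypothetical helpers used only for illustration.

```julia
# `parity` and `classify` are hypothetical helpers used only for illustration.
parity(n) = n % 2 == 0 ? "even" : "odd"

# Ternaries can be chained, reading like if / elseif / else
classify(n) = n < 0 ? "negative" :
              n == 0 ? "zero" : "positive"

println(parity(3))      # odd
println(classify(-5))   # negative
println(classify(0))    # zero

# Short-circuit && evaluates the right-hand side only when the left is true
iseven(4) && println("4 is even")
```
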
## 6. Functions

The basic syntax for defining functions in Julia uses the `function` and `end` keywords:

In [82]:
function sayhi(name)
    println("Hi $name, it's great to see you!")
end

sayhi (generic function with 1 method)

In [83]:
sayhi("Julia")

Hi Julia, it's great to see you!


In [84]:
function f(x)
    x^2
end

f (generic function with 1 method)

In [85]:
f(15)

225

Alternatively, we could have declared the above functions in just a single line

In [86]:
sayhi2(name) = println("Hi $name, it's great to see you!")

sayhi2 (generic function with 1 method)

In [87]:
f2(x) = x^2

f2 (generic function with 1 method)

In [88]:
sayhi2("Iga")

Hi Iga, it's great to see you!


In [89]:
f2(15)

225

Or we could have declared them as anonymous functions

In [90]:
sayhi3 = name -> println("Hi $name, it's great to see you!")

#11 (generic function with 1 method)

In [91]:
f3 = x -> x^2

#13 (generic function with 1 method)

In [92]:
sayhi3("Eva")

Hi Eva, it's great to see you!


In [93]:
f3(15)

225
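
Anonymous functions are most often passed straight to other functions rather than stored in a variable. The sketch below shows that, together with default arguments, keyword arguments, and multiple return values; all names are illustrative and not part of the original notebook.

```julia
# All names here are illustrative and not part of the original notebook.

# Anonymous functions as arguments to higher-order functions
squares = map(x -> x^2, [1, 2, 3])
evens = filter(x -> x % 2 == 0, 1:10)

# Positional arguments can have defaults; keyword arguments follow a semicolon
function greet(name, greeting="Hi"; punctuation="!")
    return "$greeting $name$punctuation"
end

# A function can return several values as a tuple
minmax_of(xs) = (minimum(xs), maximum(xs))
smallest, largest = minmax_of([3, 5, 2])

println(squares)
println(evens)
println(greet("Julia"))
println(greet("Julia", "Hello"; punctuation="?"))
println(smallest, " ", largest)
```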

In [94]:
# Exercise
# Try to make a function that prints "Hello Julia" if the given name is "Julia" and prints "Hello World" if the given name is different. 
# Insert your code here







In [95]:
# Answer
function print_hello(name)
    if name == "Julia"
        println("Hello Julia")
    else
        println("Hello World")
    end 
end

print_hello (generic function with 1 method)

In [96]:
print_hello("Alex")

Hello World


### Mutating vs non-mutating functions

Using a non-mutating function `sort` without `!` 

In [97]:
v = [3, 5, 2]

3-element Vector{Int64}:
 3
 5
 2

In [98]:
sort(v)

3-element Vector{Int64}:
 2
 3
 5

In [99]:
v

3-element Vector{Int64}:
 3
 5
 2

By convention, a function whose name ends with an exclamation mark `!`, such as `sort!`, mutates its arguments. Let's see.

In [100]:
sort!(v)

3-element Vector{Int64}:
 2
 3
 5

In [101]:
v

3-element Vector{Int64}:
 2
 3
 5
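
The `!` suffix is only a naming convention, so we can follow it in our own code. A minimal sketch, where `double!` and `doubled` are hypothetical helpers:

```julia
# `double!` and `doubled` are hypothetical helpers used only for illustration.

# Mutating version: overwrites the elements of its argument in place
function double!(xs)
    for i in eachindex(xs)
        xs[i] *= 2
    end
    return xs
end

# Non-mutating version: works on a copy and leaves the input alone
doubled(xs) = double!(copy(xs))

data = [3, 5, 2]
println(doubled(data), " ", data)   # data is unchanged
println(double!(data), " ", data)   # data now holds the doubled values
```

Defining the non-mutating version in terms of the mutating one is a common pattern, since it keeps the logic in a single place.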

### Broadcasting

1) We can apply a function element-wise to arrays or collections without explicitly using loops by *broadcasting* using the dot `.` syntax. 

In [102]:
f.([1, 2, 3])

3-element Vector{Int64}:
 1
 4
 9

2) Multiply each element of A by the scalar x.

In [103]:
A = [1 2 3; 4 5 6; 7 8 9]

3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

In [104]:
x = 2

2

In [105]:
result = A .* x

3×3 Matrix{Int64}:
  2   4   6
  8  10  12
 14  16  18
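
Broadcasting also works between arrays of different shapes and with functions we define ourselves. A minimal sketch, with illustrative names that are not part of the original notebook:

```julia
# Illustrative names; not part of the original notebook.
col = [1, 2, 3]          # a 3-element column vector
row = [10 20 30]         # a 1×3 row matrix

# The two shapes are expanded against each other, giving a 3×3 matrix
table = col .+ row

# @. adds a dot to every call and operator in the expression
scaled = @. 2 * col^2 + 1

# Any function can be broadcast, including our own
halve(x) = x / 2
halves = halve.(col)

println(table)
println(scaled)
println(halves)
```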

## 7. Multiple Dispatch 
Multiple dispatch is a key feature of Julia that makes your code *generic* and *fast*. 

Julia is a dynamically typed language, meaning we don't need to specify the types of our input arguments when declaring a function. Type annotations are optional because Julia determines the types of the arguments at run time. However, we can still explicitly specify types if we want to:

In [106]:
function add(x::Int, y::Int)
    return x + y
end

add (generic function with 1 method)

In [107]:
function add(x::Float64, y::Float64)
    return x + y
end

add (generic function with 2 methods)

Defining `add` a second time does not overwrite or replace the first definition. We are simply adding another **method** to the generic function called `add`.

In [108]:
result_int = add(5, 3)         

8

In [109]:
result_float = add(3.5, 2.5)  

6.0

We can use the `methods` function to see how many methods there are for the function.

In [110]:
methods(add)

To see which method is dispatched when we call a generic function, we can use `@which`.

In [111]:
@which add(8, 9)
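
With only the two `add` methods defined above, a mixed call such as `add(5, 3.0)` has no matching method and raises a `MethodError`. The sketch below shows how a method on an abstract type can act as a fallback; `combine` is a hypothetical function used so that the example does not modify `add`.

```julia
# `combine` is a hypothetical function used only for this illustration.
combine(x::Int, y::Int) = "two integers"
combine(x::Float64, y::Float64) = "two floats"

# A method on the abstract type Number acts as a fallback
combine(x::Number, y::Number) = "some other numbers"

# Dispatch considers every argument, not just the first
combine(x::String, y::Number) = "a string and a number"

println(combine(1, 2))
println(combine(1.0, 2.0))
println(combine(1, 2.0))      # falls back to the Number method
println(combine("a", 3))
println(length(methods(combine)))
```

Julia always picks the most specific method that matches the argument types, so the `Int` and `Float64` methods still win over the `Number` fallback.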

## 8. Package Manager
Julia has a built-in package manager called `Pkg`, which simplifies reusing other people's code and removes the need for external tools such as Anaconda and pip. Here are some of its functions:

- Installing and uninstalling packages.
- Creating project environments with Project.toml.
- Ensuring complete project reproducibility via Manifest.toml.
- Defining compatibility with dependencies.
- Facilitating development of packages.

In [112]:
using Pkg

In [113]:
Pkg.status()

Status `~/Desktop/coding/juliaeo24_notebook/Project.toml`
⌃ [336ed68f] CSV v0.10.11
⌃ [13f3f980] CairoMakie v0.11.3
  [324d7699] CategoricalArrays v0.10.8
  [a93c6f00] DataFrames v1.6.1
⌅ [ee78f7c6] Makie v0.20.2
  [f43a241f] Downloads v1.6.0
  [10745b16] Statistics v1.9.0
Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`


In [114]:
Pkg.activate(".")

  Activating project at `~/Desktop/coding/juliaeo24_notebook`


The first time you use a package on a given Julia installation, you need to use the package manager to add it.

In [115]:
Pkg.add("CSV")

    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Project.toml`
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Manifest.toml`


In [116]:
Pkg.status()

Status `~/Desktop/coding/juliaeo24_notebook/Project.toml`
⌃ [336ed68f] CSV v0.10.11
⌃ [13f3f980] CairoMakie v0.11.3
  [324d7699] CategoricalArrays v0.10.8
  [a93c6f00] DataFrames v1.6.1
⌅ [ee78f7c6] Makie v0.20.2
  [f43a241f] Downloads v1.6.0
  [10745b16] Statistics v1.9.0
Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`


In [117]:
Pkg.add("DataFrames") 

   Resolving package versions...
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Project.toml`
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Manifest.toml`


In [118]:
Pkg.add("Statistics")

   Resolving package versions...
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Project.toml`
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Manifest.toml`


Every time you use Julia (start a new session at the REPL or open a notebook for the first time) you need to load the package with the `using` keyword.

In [119]:
using DataFrames

## 9. Tabular data
The commonly used packages for working with tabular data in Julia are **CSV.jl**, **Downloads.jl**, **DataFrames.jl**, and **Statistics.jl**. Now, we will explore some of their features in the *Exercise_1.ipynb*.

In [120]:
Pkg.add("Downloads")

   Resolving package versions...
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Project.toml`
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Manifest.toml`


In [121]:
using DataFrames, CSV, Downloads 

In [122]:
df = CSV.read("data/vessel_locations.csv", DataFrame)

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | dev_lon_deg_wgs84 | dev_lat_deg_wgs84 | battery_v | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 0 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 03:10:48 | -112 | 6.2 | -27.2673 | 38.6557 | 0.0 | 0.0 | 0.0 | Poor |
| 2 | 1 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 03:22:19 | -120 | 0.2 | -27.2675 | 38.6555 | 0.0 | 0.0 | 0.0 | Poor |
| 3 | 2 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:41:10 | -100 | 10.2 | -27.2671 | 38.6556 | 0.0 | 0.0 | 0.0 | Fair |
| 4 | 3 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:43:30 | -105 | 9.2 | -27.2672 | 38.6554 | 0.0 | 0.0 | 0.0 | Fair |
| 5 | 4 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 12:18:26 | -105 | 8.2 | -27.2666 | 38.6562 | 0.0 | 0.0 | 0.0 | Fair |
| 6 | 5 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 15:38:22 | -102 | 9.8 | -27.2673 | 38.6556 | 0.0 | 0.0 | 0.0 | Fair |
| 7 | 6 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:57:15 | -108 | 8.2 | -27.2668 | 38.6561 | 0.0 | 0.0 | 0.0 | Fair |
| 8 | 7 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:59:34 | -105 | 8.0 | -27.267 | 38.6561 | 0.0 | 0.0 | 0.0 | Fair |
| 9 | 8 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:01:53 | -107 | 8.0 | -27.267 | 38.656 | 0.0 | 0.0 | 0.0 | Fair |
| 10 | 9 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:04:12 | -105 | 9.2 | -27.2671 | 38.6558 | 0.0 | 0.0 | 0.0 | Fair |


In [123]:
df.gw_eui

76321-element PooledArrays.PooledVector{String31, UInt32, Vector{UInt32}}:
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 ⋮
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"

In [124]:
df[!, :gw_eui]

76321-element PooledArrays.PooledVector{String31, UInt32, Vector{UInt32}}:
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 "7076FF0056060728"
 ⋮
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"
 "7276FF000B03183D"

In [125]:
unique(df.gw_eui)

9-element Vector{String31}:
 "7076FF0056060728"
 "7076FF0056060729"
 "7076FF00560705D1"
 "7076FF00560705D7"
 "7076FF00560720B7"
 "7276FF000B031748"
 "7276FF000B031818"
 "7276FF000B03183A"
 "7276FF000B03183D"

In [126]:
df[1, :gw_eui]

"7076FF0056060728"

In [127]:
df[1:2, :gw_eui]

2-element PooledArrays.PooledVector{String31, UInt32, Vector{UInt32}}:
 "7076FF0056060728"
 "7076FF0056060728"

**Filtering**

In [128]:
# For a vector
filter(x-> x>3, [1,2,3,4,5])

2-element Vector{Int64}:
 4
 5

In [129]:
# For a dataframe
signal_excellent(signal_quality) = signal_quality == "Excellent"

signal_excellent (generic function with 1 method)

In [130]:
filter(:signal_quality => signal_excellent, df)

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | dev_lon_deg_wgs84 | dev_lat_deg_wgs84 | battery_v | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 21193 | 0200000000000000 | 7076FF0056060729 | 2022-05-07 19:46:37 | -54 | 13.5 | -28.3902 | 38.4219 | 0.0 | 0.0 | 0.0 | Excellent |
| 2 | 21194 | 0200000000000000 | 7076FF0056060729 | 2022-05-07 19:49:14 | -56 | 12.0 | -28.3901 | 38.4219 | 0.0 | 0.0 | 0.0 | Excellent |
| 3 | 21195 | 0200000000000000 | 7076FF0056060729 | 2022-05-07 19:51:56 | -54 | 12.2 | -28.3901 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 4 | 21197 | 0200000000000000 | 7076FF0056060729 | 2022-05-09 08:52:11 | -58 | 11.0 | -28.3902 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 5 | 21198 | 0200000000000000 | 7076FF0056060729 | 2022-05-09 08:54:47 | -56 | 11.8 | -28.3903 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 6 | 21200 | 0200000000000000 | 7076FF0056060729 | 2022-05-09 09:00:32 | -56 | 12.2 | -28.3903 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 7 | 21203 | 0200000000000000 | 7076FF0056060729 | 2022-05-12 12:02:22 | -54 | 11.8 | -28.3902 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 8 | 21204 | 0200000000000000 | 7076FF0056060729 | 2022-05-16 13:30:27 | -71 | 11.5 | -28.3902 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 9 | 21205 | 0200000000000000 | 7076FF0056060729 | 2022-05-17 11:44:15 | -60 | 14.0 | -28.3903 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 10 | 21206 | 0200000000000000 | 7076FF0056060729 | 2022-05-17 11:47:35 | -59 | 10.5 | -28.3903 | 38.4219 | 0.0 | 0.0 | 0.0 | Excellent |


**Subset**

In [131]:
subset(df, :signal_quality => ByRow(==("Fair")))

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | dev_lon_deg_wgs84 | dev_lat_deg_wgs84 | battery_v | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 2 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:41:10 | -100 | 10.2 | -27.2671 | 38.6556 | 0.0 | 0.0 | 0.0 | Fair |
| 2 | 3 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:43:30 | -105 | 9.2 | -27.2672 | 38.6554 | 0.0 | 0.0 | 0.0 | Fair |
| 3 | 4 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 12:18:26 | -105 | 8.2 | -27.2666 | 38.6562 | 0.0 | 0.0 | 0.0 | Fair |
| 4 | 5 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 15:38:22 | -102 | 9.8 | -27.2673 | 38.6556 | 0.0 | 0.0 | 0.0 | Fair |
| 5 | 6 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:57:15 | -108 | 8.2 | -27.2668 | 38.6561 | 0.0 | 0.0 | 0.0 | Fair |
| 6 | 7 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:59:34 | -105 | 8.0 | -27.267 | 38.6561 | 0.0 | 0.0 | 0.0 | Fair |
| 7 | 8 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:01:53 | -107 | 8.0 | -27.267 | 38.656 | 0.0 | 0.0 | 0.0 | Fair |
| 8 | 9 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:04:12 | -105 | 9.2 | -27.2671 | 38.6558 | 0.0 | 0.0 | 0.0 | Fair |
| 9 | 10 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 19:05:30 | -102 | 9.0 | -27.2671 | 38.6557 | 0.0 | 0.0 | 0.0 | Fair |
| 10 | 11 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 23:00:17 | -101 | 8.8 | -27.2671 | 38.6552 | 0.0 | 0.0 | 0.0 | Fair |


In [132]:
subset(df, :signal_quality => ByRow(name -> name == "Fair"))

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | dev_lon_deg_wgs84 | dev_lat_deg_wgs84 | battery_v | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 2 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:41:10 | -100 | 10.2 | -27.2671 | 38.6556 | 0.0 | 0.0 | 0.0 | Fair |
| 2 | 3 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:43:30 | -105 | 9.2 | -27.2672 | 38.6554 | 0.0 | 0.0 | 0.0 | Fair |
| 3 | 4 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 12:18:26 | -105 | 8.2 | -27.2666 | 38.6562 | 0.0 | 0.0 | 0.0 | Fair |
| 4 | 5 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 15:38:22 | -102 | 9.8 | -27.2673 | 38.6556 | 0.0 | 0.0 | 0.0 | Fair |
| 5 | 6 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:57:15 | -108 | 8.2 | -27.2668 | 38.6561 | 0.0 | 0.0 | 0.0 | Fair |
| 6 | 7 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:59:34 | -105 | 8.0 | -27.267 | 38.6561 | 0.0 | 0.0 | 0.0 | Fair |
| 7 | 8 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:01:53 | -107 | 8.0 | -27.267 | 38.656 | 0.0 | 0.0 | 0.0 | Fair |
| 8 | 9 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:04:12 | -105 | 9.2 | -27.2671 | 38.6558 | 0.0 | 0.0 | 0.0 | Fair |
| 9 | 10 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 19:05:30 | -102 | 9.0 | -27.2671 | 38.6557 | 0.0 | 0.0 | 0.0 | Fair |
| 10 | 11 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 23:00:17 | -101 | 8.8 | -27.2671 | 38.6552 | 0.0 | 0.0 | 0.0 | Fair |


**Select**

In [133]:
select(df, :gw_eui, :timestamp_utc_iso_string)

| Row | gw_eui | timestamp_utc_iso_string |
|---|---|---|
|  | String31 | String31 |
| 1 | 7076FF0056060728 | 2022-03-16 03:10:48 |
| 2 | 7076FF0056060728 | 2022-03-16 03:22:19 |
| 3 | 7076FF0056060728 | 2022-03-16 05:41:10 |
| 4 | 7076FF0056060728 | 2022-03-16 05:43:30 |
| 5 | 7076FF0056060728 | 2022-03-16 12:18:26 |
| 6 | 7076FF0056060728 | 2022-03-16 15:38:22 |
| 7 | 7076FF0056060728 | 2022-03-16 23:57:15 |
| 8 | 7076FF0056060728 | 2022-03-16 23:59:34 |
| 9 | 7076FF0056060728 | 2022-03-17 00:01:53 |
| 10 | 7076FF0056060728 | 2022-03-17 00:04:12 |


In [134]:
select(df, :gw_eui, :timestamp_utc_iso_string, :signal_quality)

| Row | gw_eui | timestamp_utc_iso_string | signal_quality |
|---|---|---|---|
|  | String31 | String31 | String31 |
| 1 | 7076FF0056060728 | 2022-03-16 03:10:48 | Poor |
| 2 | 7076FF0056060728 | 2022-03-16 03:22:19 | Poor |
| 3 | 7076FF0056060728 | 2022-03-16 05:41:10 | Fair |
| 4 | 7076FF0056060728 | 2022-03-16 05:43:30 | Fair |
| 5 | 7076FF0056060728 | 2022-03-16 12:18:26 | Fair |
| 6 | 7076FF0056060728 | 2022-03-16 15:38:22 | Fair |
| 7 | 7076FF0056060728 | 2022-03-16 23:57:15 | Fair |
| 8 | 7076FF0056060728 | 2022-03-16 23:59:34 | Fair |
| 9 | 7076FF0056060728 | 2022-03-17 00:01:53 | Fair |
| 10 | 7076FF0056060728 | 2022-03-17 00:04:12 | Fair |


In [135]:
select(df, Not(:battery_v))

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | dev_lon_deg_wgs84 | dev_lat_deg_wgs84 | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 0 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 03:10:48 | -112 | 6.2 | -27.2673 | 38.6557 | 0.0 | 0.0 | Poor |
| 2 | 1 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 03:22:19 | -120 | 0.2 | -27.2675 | 38.6555 | 0.0 | 0.0 | Poor |
| 3 | 2 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:41:10 | -100 | 10.2 | -27.2671 | 38.6556 | 0.0 | 0.0 | Fair |
| 4 | 3 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:43:30 | -105 | 9.2 | -27.2672 | 38.6554 | 0.0 | 0.0 | Fair |
| 5 | 4 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 12:18:26 | -105 | 8.2 | -27.2666 | 38.6562 | 0.0 | 0.0 | Fair |
| 6 | 5 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 15:38:22 | -102 | 9.8 | -27.2673 | 38.6556 | 0.0 | 0.0 | Fair |
| 7 | 6 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:57:15 | -108 | 8.2 | -27.2668 | 38.6561 | 0.0 | 0.0 | Fair |
| 8 | 7 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:59:34 | -105 | 8.0 | -27.267 | 38.6561 | 0.0 | 0.0 | Fair |
| 9 | 8 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:01:53 | -107 | 8.0 | -27.267 | 38.656 | 0.0 | 0.0 | Fair |
| 10 | 9 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:04:12 | -105 | 9.2 | -27.2671 | 38.6558 | 0.0 | 0.0 | Fair |


In [136]:
select(df, Not([:dev_lon_deg_wgs84, :dev_lat_deg_wgs84]))

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | battery_v | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 0 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 03:10:48 | -112 | 6.2 | 0.0 | 0.0 | 0.0 | Poor |
| 2 | 1 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 03:22:19 | -120 | 0.2 | 0.0 | 0.0 | 0.0 | Poor |
| 3 | 2 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:41:10 | -100 | 10.2 | 0.0 | 0.0 | 0.0 | Fair |
| 4 | 3 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:43:30 | -105 | 9.2 | 0.0 | 0.0 | 0.0 | Fair |
| 5 | 4 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 12:18:26 | -105 | 8.2 | 0.0 | 0.0 | 0.0 | Fair |
| 6 | 5 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 15:38:22 | -102 | 9.8 | 0.0 | 0.0 | 0.0 | Fair |
| 7 | 6 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:57:15 | -108 | 8.2 | 0.0 | 0.0 | 0.0 | Fair |
| 8 | 7 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:59:34 | -105 | 8.0 | 0.0 | 0.0 | 0.0 | Fair |
| 9 | 8 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:01:53 | -107 | 8.0 | 0.0 | 0.0 | 0.0 | Fair |
| 10 | 9 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:04:12 | -105 | 9.2 | 0.0 | 0.0 | 0.0 | Fair |


**Categorical data**

In [137]:
import Pkg; Pkg.add("CategoricalArrays")

   Resolving package versions...
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Project.toml`
  No Changes to `~/Desktop/coding/juliaeo24_notebook/Manifest.toml`


In [138]:
using CategoricalArrays 
Pkg.instantiate()

In [139]:
sort(df, :signal_quality)

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | dev_lon_deg_wgs84 | dev_lat_deg_wgs84 | battery_v | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 21193 | 0200000000000000 | 7076FF0056060729 | 2022-05-07 19:46:37 | -54 | 13.5 | -28.3902 | 38.4219 | 0.0 | 0.0 | 0.0 | Excellent |
| 2 | 21194 | 0200000000000000 | 7076FF0056060729 | 2022-05-07 19:49:14 | -56 | 12.0 | -28.3901 | 38.4219 | 0.0 | 0.0 | 0.0 | Excellent |
| 3 | 21195 | 0200000000000000 | 7076FF0056060729 | 2022-05-07 19:51:56 | -54 | 12.2 | -28.3901 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 4 | 21197 | 0200000000000000 | 7076FF0056060729 | 2022-05-09 08:52:11 | -58 | 11.0 | -28.3902 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 5 | 21198 | 0200000000000000 | 7076FF0056060729 | 2022-05-09 08:54:47 | -56 | 11.8 | -28.3903 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 6 | 21200 | 0200000000000000 | 7076FF0056060729 | 2022-05-09 09:00:32 | -56 | 12.2 | -28.3903 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 7 | 21203 | 0200000000000000 | 7076FF0056060729 | 2022-05-12 12:02:22 | -54 | 11.8 | -28.3902 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 8 | 21204 | 0200000000000000 | 7076FF0056060729 | 2022-05-16 13:30:27 | -71 | 11.5 | -28.3902 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 9 | 21205 | 0200000000000000 | 7076FF0056060729 | 2022-05-17 11:44:15 | -60 | 14.0 | -28.3903 | 38.4218 | 0.0 | 0.0 | 0.0 | Excellent |
| 10 | 21206 | 0200000000000000 | 7076FF0056060729 | 2022-05-17 11:47:35 | -59 | 10.5 | -28.3903 | 38.4219 | 0.0 | 0.0 | 0.0 | Excellent |
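
Sorting a plain string column is alphabetical, which is why "Excellent" comes first. An ordered categorical column is the tool meant for ranking such labels. As a rough illustration of the idea using only Base Julia, the sketch below sorts labels by an explicit rank; the level order in `quality_order` is an assumption about how the labels compare, and the sample `labels` vector is made up for the example.

```julia
# The level order below is an assumption; the notebook itself does not define it.
quality_order = ["Very Poor", "Poor", "Fair", "Good", "Excellent"]
rank = Dict(label => i for (i, label) in enumerate(quality_order))

# Made-up sample of labels, for illustration only
labels = ["Fair", "Excellent", "Poor", "Very Poor", "Good", "Fair"]

alphabetical = sort(labels)                           # plain string ordering
by_quality = sort(labels; by = label -> rank[label])  # ordering by rank

println(alphabetical)
println(by_quality)
```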


**Groupby**

In [140]:
groupby(df, :gw_eui)

First group (`gw_eui = 7076FF0056060728`):

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | dev_lon_deg_wgs84 | dev_lat_deg_wgs84 | battery_v | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 0 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 03:10:48 | -112 | 6.2 | -27.2673 | 38.6557 | 0.0 | 0.0 | 0.0 | Poor |
| 2 | 1 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 03:22:19 | -120 | 0.2 | -27.2675 | 38.6555 | 0.0 | 0.0 | 0.0 | Poor |
| 3 | 2 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:41:10 | -100 | 10.2 | -27.2671 | 38.6556 | 0.0 | 0.0 | 0.0 | Fair |
| 4 | 3 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 05:43:30 | -105 | 9.2 | -27.2672 | 38.6554 | 0.0 | 0.0 | 0.0 | Fair |
| 5 | 4 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 12:18:26 | -105 | 8.2 | -27.2666 | 38.6562 | 0.0 | 0.0 | 0.0 | Fair |
| 6 | 5 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 15:38:22 | -102 | 9.8 | -27.2673 | 38.6556 | 0.0 | 0.0 | 0.0 | Fair |
| 7 | 6 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:57:15 | -108 | 8.2 | -27.2668 | 38.6561 | 0.0 | 0.0 | 0.0 | Fair |
| 8 | 7 | 0200000000000000 | 7076FF0056060728 | 2022-03-16 23:59:34 | -105 | 8.0 | -27.267 | 38.6561 | 0.0 | 0.0 | 0.0 | Fair |
| 9 | 8 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:01:53 | -107 | 8.0 | -27.267 | 38.656 | 0.0 | 0.0 | 0.0 | Fair |
| 10 | 9 | 0200000000000000 | 7076FF0056060728 | 2022-03-17 00:04:12 | -105 | 9.2 | -27.2671 | 38.6558 | 0.0 | 0.0 | 0.0 | Fair |

Last group (`gw_eui = 7276FF000B03183D`):

| Row | Column1 | dev_eui | gw_eui | timestamp_utc_iso_string | rssi_dbm | snr_db | dev_lon_deg_wgs84 | dev_lat_deg_wgs84 | battery_v | gw_lon_deg_wgs84 | gw_lat_deg_wgs84 | signal_quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Int64 | String31 | String31 | String31 | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | String31 |
| 1 | 75709 | 0200000000000000 | 7276FF000B03183D | 2022-02-25 17:22:06 | -125 | -5.0 | -27.2336 | 38.6935 | 0.0 | 0.0 | 0.0 | Very Poor |
| 2 | 75710 | 0200000000000000 | 7276FF000B03183D | 2022-02-25 20:59:37 | -115 | 6.2 | -27.2014 | 38.6822 | 0.0 | 0.0 | 0.0 | Poor |
| 3 | 75711 | 0200000000000000 | 7276FF000B03183D | 2022-02-25 21:00:44 | -100 | 9.5 | -27.1898 | 38.6879 | 0.0 | 0.0 | 0.0 | Fair |
| 4 | 75712 | 0200000000000000 | 7276FF000B03183D | 2022-02-26 15:10:27 | -118 | 2.5 | -27.2017 | 38.6821 | 0.0 | 0.0 | 0.0 | Poor |
| 5 | 75713 | 0200000000000000 | 7276FF000B03183D | 2022-02-26 15:11:33 | -94 | 7.8 | -27.1903 | 38.6876 | 0.0 | 0.0 | 0.0 | Good |
| 6 | 75714 | 0200000000000000 | 7276FF000B03183D | 2022-02-26 15:12:06 | -102 | 9.5 | -27.1827 | 38.6929 | 0.0 | 0.0 | 0.0 | Fair |
| 7 | 75715 | 0200000000000000 | 7276FF000B03183D | 2022-02-26 19:21:07 | -102 | 11.5 | -27.183 | 38.7864 | 0.0 | 0.0 | 0.0 | Fair |
| 8 | 75716 | 0200000000000000 | 7276FF000B03183D | 2022-02-26 19:47:56 | -114 | 9.2 | -27.1607 | 38.7092 | 0.0 | 0.0 | 0.0 | Poor |
| 9 | 75717 | 0200000000000000 | 7276FF000B03183D | 2022-02-26 19:48:29 | -112 | 8.8 | -27.1674 | 38.7038 | 0.0 | 0.0 | 0.0 | Poor |
| 10 | 75718 | 0200000000000000 | 7276FF000B03183D | 2022-02-26 19:49:02 | -102 | 6.8 | -27.1747 | 38.6986 | 0.0 | 0.0 | 0.0 | Fair |


**Join**

Defining our own dataframe using the `DataFrame` function

In [141]:
grades_2023 = DataFrame(; name=["Sally", "Bob", "Alice", "Hank"],
grade_2023=[1, 5, 8.5, 4])

| Row | name | grade_2023 |
|---|---|---|
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Bob | 5.0 |
| 3 | Alice | 8.5 |
| 4 | Hank | 4.0 |


and creating a new one to join 

In [142]:
grades_2024 = DataFrame(; name=["Bob 2", "Sally", "Hank"],
grade_2024=[9.5, 9.5, 6])

| Row | name | grade_2024 |
|---|---|---|
|  | String | Float64 |
| 1 | Bob 2 | 9.5 |
| 2 | Sally | 9.5 |
| 3 | Hank | 6.0 |


`innerjoin`

In [143]:
innerjoin(grades_2023, grades_2024, on=:name)

| Row | name | grade_2023 | grade_2024 |
|---|---|---|---|
|  | String | Float64 | Float64 |
| 1 | Sally | 1.0 | 9.5 |
| 2 | Hank | 4.0 | 6.0 |


`outerjoin`

In [144]:
outerjoin(grades_2023, grades_2024, on=:name)

| Row | name | grade_2023 | grade_2024 |
|---|---|---|---|
|  | String | Float64? | Float64? |
| 1 | Sally | 1.0 | 9.5 |
| 2 | Hank | 4.0 | 6.0 |
| 3 | Bob | 5.0 | missing |
| 4 | Alice | 8.5 | missing |
| 5 | Bob 2 | missing | 9.5 |


`leftjoin`

In [145]:
leftjoin(grades_2023, grades_2024, on=:name)

| Row | name | grade_2023 | grade_2024 |
|---|---|---|---|
|  | String | Float64 | Float64? |
| 1 | Sally | 1.0 | 9.5 |
| 2 | Hank | 4.0 | 6.0 |
| 3 | Bob | 5.0 | missing |
| 4 | Alice | 8.5 | missing |


`rightjoin`

In [146]:
rightjoin(grades_2023, grades_2024, on=:name)

| Row | name | grade_2023 | grade_2024 |
|---|---|---|---|
|  | String | Float64? | Float64 |
| 1 | Sally | 1.0 | 9.5 |
| 2 | Hank | 4.0 | 6.0 |
| 3 | Bob 2 | missing | 9.5 |


`semijoin`

In [147]:
semijoin(grades_2023, grades_2024, on=:name)

| Row | name | grade_2023 |
|---|---|---|
|  | String | Float64 |
| 1 | Sally | 1.0 |
| 2 | Hank | 4.0 |


`antijoin`

In [148]:
antijoin(grades_2023, grades_2024, on=:name)

| Row | name | grade_2023 |
|---|---|---|
|  | String | Float64 |
| 1 | Bob | 5.0 |
| 2 | Alice | 8.5 |


**Variable transformation**

In [149]:
plus_one(grades) = grades .+ 1

plus_one (generic function with 1 method)

In [150]:
transform(grades_2023, :grade_2023 => plus_one)

| Row | name | grade_2023 | grade_2023_plus_one |
|---|---|---|---|
|  | String | Float64 | Float64 |
| 1 | Sally | 1.0 | 2.0 |
| 2 | Bob | 5.0 | 6.0 |
| 3 | Alice | 8.5 | 9.5 |
| 4 | Hank | 4.0 | 5.0 |


In [151]:
# Rename
transform(grades_2023, :grade_2023 => plus_one => :grade_2023_2)

| Row | name | grade_2023 | grade_2023_2 |
|---|---|---|---|
|  | String | Float64 | Float64 |
| 1 | Sally | 1.0 | 2.0 |
| 2 | Bob | 5.0 | 6.0 |
| 3 | Alice | 8.5 | 9.5 |
| 4 | Hank | 4.0 | 5.0 |


## 10. The Community
Julia has an amazing [User Community](https://julialang.org/community/) that is ready to help you! The main channels where you can seek help if you encounter a problem, bug, or simply have a question to learn more about Julia are:
- Forum https://discourse.julialang.org/
- Julia Slack https://julialang.slack.com/
- Open an issue on GitHub!