# Introduction to Julia

by Karl Zhu

This notebook is an introduction to the Julia language and its commonly used IJulia/Jupyter notebook interface. It is based in part on past years' materials for this lecture:
- 2024: ([Sean Lo and Zikai Xiong](https://github.com/angkoulouras/15.S60_2024/tree/main/5_linear_programming))
- 2023: ([Haoyue Wang and Shuvomoy Das Gupta](https://github.com/alexschmid3/15.S60_2023/tree/main/6_linear_programming))
- 2021: ([Irra Na and Lea Kapelevich](https://github.com/adelarue/cos_2021/blob/main/6%2B7_julia_and_jump/Intro%20to%20Julia-%20complete.ipynb))
- 2018: ([Arthur Delarue](https://github.com/PhilChodrow/cos_2018/tree/master/6_julia_and_jump))
- 2017: ([Sebastien Martin and Miles Lubin](https://philchodrow.github.io/cos_2017/5_julia_and_jump/intro-julia-jupyter.ipynb))

![](figures/founders.png)


#### 2009: 
Stefan Karpinski is frustrated while developing a network simulation tool.

August 2009: Karpinski speaks to Viral Shah 

Development begins at MIT.

Goals for the Julia language: \
Combine the pros and iron out the cons of each of the scientific computing languages into one open-source, liberally licensed language \
Efficiency and speed \
Simplicity \
Parallel computing simplified \
Dynamism

#### 2012

first public release with a liberal MIT license \
Julia v0.2.0 (now unmaintained)

#### 2014

second release: Julia v0.3.0

.....

#### Mar 2021

Julia v1.6 released (new long-term support (LTS) version)

#### Nov 2021

Julia v1.7

#### Aug 2022

Julia v1.8

#### May 2023

Julia v1.9

#### Dec 2023

Julia v1.10

#### Oct 2024

Julia v1.11

## Why Julia?

From the [Julia website](https://julialang.org/):
- Fast: Julia programs "automatically compile to efficient native code via LLVM", achieving close to C speeds with close to Python syntax! This allows it to solve the two-language problem in scientific computing.  
- Dynamic: it is "dynamically typed and feels like a scripting language", and can be used interactively, e.g. via the REPL.
- Reproducible: you can "recreate the same Julia environment every time, across platforms" - very important for ensuring consistent results in scientific computing!
- Composable: Julia uses [multiple dispatch](https://docs.julialang.org/en/v1/manual/methods/) which promotes writing safe and correct code.  

For an OR PhD student:
- **Fast and easy to code**: thanks to Julia's JIT compiler and dynamic typing.
- **JuMP has excellent documentation and is currently well-maintained**. This is largely thanks to Oscar Dowson (https://github.com/odow) who is currently paid full-time to maintain and extend JuMP. 

In particular, Julia is an MIT creation, and JuMP was first developed by MIT ORC students Miles Lubin (current BDFL), Iain Dunning, and Joey Huchette.

### 1) It's a **high-level** language with easy-to-use syntax:

Easy to use and learn, with a similar syntax to Python/Matlab.
It is possible to do complicated computations quickly.
For example, solving $Ax = b$ with $A = \begin{pmatrix}
 1 & 2 & 3\\ 
 2 & 1 & 2\\ 
 3 & 2 & 1
\end{pmatrix}$ and $b = \begin{pmatrix}
 1 \\ 
 1 \\ 
 1 
\end{pmatrix}$ is as simple as:

In [1]:
A = [1 2 3
     2 1 2
     3 2 1]

b = [1,1,1]
A \ b

3-element Vector{Float64}:
 0.24999999999999997
 8.326672684688673e-17
 0.25

Many language features (`for`, `if`, `while`, `continue` keywords, indexing, list comprehensions and generator expressions) are similar to those in Python, so if you know Python you nearly know Julia! [Here's](https://docs.julialang.org/en/v1/manual/noteworthy-differences) a helpful page if you are coming from Python specifically (or other languages like R).

Also, it's interactive! We see this in action right now in our Jupyter notebook, but you could also launch an interactive Julia session in the REPL.

### 2) It is fast!

A **high-performance** language:

Julia is fast. Thanks to multiple dispatch, a strong type system, and just-in-time compilation, it can reach performance comparable to C and Fortran.

(Note: this figure was taken from [Julia Micro-benchmarks](https://julialang.org/benchmarks/), which uses Julia 1.0 released in 2018 and e.g. Python 3.6.)

![figures/Julia-benchmarks.png](figures/Julia-benchmarks.png)

Julia's just-in-time compilation makes plain `for` loops fast. Here's a simple example comparing Julia vs Python:

In [2]:
# Julia 
function julia_nested_loops(n)
    total = 0
    for i in 1:n
        for j in 1:n
            total += i * j
        end
    end
    return total
end

print("Julia execution time: ")
@time julia_nested_loops(10000)  # Run with n = 10,000

# vs Python
using PyCall

# Import Python's time module (pyimport replaces the deprecated @pyimport macro)
py_time = pyimport("time")

# Define the Python nested loop function in Julia
py"""
def python_nested_loops(n):
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total
"""

# Measure execution time in Python
start_time = py_time.time()
py"python_nested_loops"(10000)
end_time = py_time.time()
println("Python execution time: ", end_time - start_time, " seconds")

Julia execution time:   0.000001 seconds
Python execution time: 3.736337184906006 seconds


This is great because we don't need to worry about vectorizing the code, which speeds up development and makes the code easier to read.

### 3) The Two-Language Problem

- Before Julia: you might have one easy, interactive scripting language for prototyping and quick development, and one compiled language for performance-sensitive code. You might start with the scripting language and then have to transfer / rewrite your code in the compiled language. Some examples:
    - Python --> Python + Cython for performance-sensitive code
    - Python --> Python's fast compiled libraries (e.g. numpy, scikit-learn, various ML libraries) for performance-sensitive code, with base Python as the "glue"
- Now you can start with Julia for your prototyping, and *stay in Julia* while optimizing it for fast performance!
- (Optional: watch Julia co-founder Viral Shah talk about the two-language problem and how Julia solves it [here](https://www.youtube.com/watch?v=Cr3lPsRAFmY).)
    

### 4) (the unreasonable effectiveness of) Multiple Dispatch

- In compiled languages like C and Fortran, one needs to declare the type of each variable (e.g. Int, Float, Char...), so that the compiler can create efficient machine code for each function.
- In interpreted languages like Python and MATLAB, we don't have to declare the type of variables. The type of each variable is determined at run time, at the same time as its value, which is slow.
- Julia aims to bridge the two paradigms, by "walking like Python and running like C". It does not require type declarations (MATLAB-like syntax), but still creates efficient compiled machine code.
- The trick is: a function can have several "methods", each defined for a different combination of argument types (a method without type annotations accepts any types). When the function is called, Julia selects the most specific method matching the types of all the arguments, and the compiler produces a method instance, which is machine code specialized to that specific choice of argument types.
- (Optional: see this [talk](https://www.youtube.com/watch?v=kc9HwsxE1OY) on how it works as a substitute for object-oriented programming)


In [3]:
# When we write a function, it can have many "methods"
+(1, 2)

3

In [4]:
methods(+)

In [5]:
function my_function(x)
    println("Default output")
end

function my_function(x::Int) # only called when x is an integer
    println("You gave me an integer!")
end

methods(my_function)

In [6]:
my_function(1.0)
my_function(1)
my_function("ORC")

Default output
You gave me an integer!
Default output


In [7]:
# you can check which method will be dispatched to with @which
@which +(1, 2)
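
The `my_function` example above dispatches on a single argument. A minimal sketch of dispatch on the types of *all* arguments follows; the `Shape`, `Circle` and `Square` types and the `collide` function are hypothetical names introduced only for illustration.

```julia
# Hypothetical types, introduced only for this illustration
abstract type Shape end
struct Circle <: Shape end
struct Square <: Shape end

# Fallback method: applies to any pair of shapes
collide(a::Shape, b::Shape) = "generic collision"
# More specific methods: selected when both argument types match
collide(a::Circle, b::Circle) = "circle-circle collision"
collide(a::Circle, b::Square) = "circle-square collision"

println(collide(Circle(), Circle()))
println(collide(Circle(), Square()))
println(collide(Square(), Circle()))  # no specific method, so the fallback is used
```

Note that the method is chosen from the combination of both argument types, not just the first one, which is what distinguishes multiple dispatch from single-dispatch object-oriented method calls.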

### 5) Type stability

- A function is type stable if the types of all arguments are enough information for the compiler to infer the type of every variable and expression within the function.
- If a function is type stable, Julia is able to create efficient machine code.
- If a function is not type stable, the compiler will produce machine code full of “if”s, covering all options of what the type of each variable could be. This is comprehensive (you still get correct code) but the resulting machine code is slow.


In [8]:
function add(x::Number, y::Number)
    z = x + y
    return z
end
# Number is an abstract type which contains many possible concrete types, such as Int8, Int64, Float64...
# If the type of x is Float64 and the type of y is Int8, then the type of z is Float64 -- this is because we saw earlier that + has many methods
# This function is type stable.

add (generic function with 1 method)

In [9]:
function largest(a::Float64, b::Int64)
    if a > b
        c = a
    else
        c = b    
    end
    return c
end
# If a>b, the type of c is Float64
# If a<=b, the type of c is Int64
# The type of c depends on not only the types of a and b, but also the values of a and b.
# This function is not type stable

# we can make it type stable:
function largest_stable(a::Float64, b::Int64)
    b = b*1.0
    if a > b
        c = a
    else
        c = b    
    end
    return c
end

largest_stable (generic function with 1 method)

In [10]:
# We can make it more generally type stable (not just Floats and Ints) by using promote():
function largest_stable_new(a::Number, b::Number)
    a, b = promote(a, b)
    if a > b
        return a
    else 
        return b
    end
end

largest_stable_new (generic function with 1 method)

In [11]:
@show largest_stable_new(1, 0.5)
@show largest_stable_new(1, 1.5)
@show largest_stable_new(1, 1//2)
;

largest_stable_new(1, 0.5) = 1.0
largest_stable_new(1, 1.5) = 1.5
largest_stable_new(1, 1 // 2) = 1//1
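
One way to check type stability programmatically is to ask the compiler for the inferred return type. The sketch below assumes that `Base.return_types` (an internal reflection helper, not part of Julia's stable public API) behaves as in recent 1.x releases. The function names are hypothetical stand-ins for `largest` and `largest_stable_new` above.

```julia
# Same logic as `largest`: the return type depends on the *values* of a and b
largest_unstable(a::Float64, b::Int64) = a > b ? a : b

# Same idea as `largest_stable_new`: promote to a common type first, then compare
function largest_promoted(a::Number, b::Number)
    a, b = promote(a, b)
    return a > b ? a : b
end

# Ask the compiler what it infers for the argument types (Float64, Int64)
unstable_type = only(Base.return_types(largest_unstable, (Float64, Int64)))
stable_type = only(Base.return_types(largest_promoted, (Float64, Int64)))

@show unstable_type
@show isconcretetype(unstable_type)
@show stable_type
@show isconcretetype(stable_type)
```

A non-concrete inferred return type (such as a `Union`) is a hint that the function is not type stable for those argument types.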


### Example: find entering variable with minimum reduced cost

Complete `find_entering_var` below, which returns the minimum reduced cost and index of the entering variable (with the minimum reduced cost), inside an iteration of the simplex method. 

If no variable has negative reduced cost, we will simply return zeros for `min_rc` and `min_idx`. If multiple variables have the lowest reduced cost, we will return the first of these (the comparison `rc < min_rc` is strict, so later ties do not replace the current minimizer).

Remember the vector of reduced costs of the nonbasic variables is given by:
$$
rc_N = c_N - A_N'\pi
$$
and the $i^{th}$ reduced cost is
$$
rc_i = c_i - A_i' \pi
$$
where $A_i$ is the $i^{th}$ column of $A$.

In [12]:
using LinearAlgebra
function find_entering_var(
    A::Matrix{Float64}, 
    c::Vector{Float64}, 
    pi::Vector{Float64}, 
    var_status::Vector{Int}, 
)
    # var_status[i] > 0 (its position in the basis) if variable i is basic, otherwise var_status[i] = 0
    min_rc = 0
    min_idx = 0
    for k in eachindex(var_status)
        # only check nonbasic variables
        if iszero(var_status[k])
            rc = c[k] - dot(A[:, k], pi)
            if rc < min_rc
                min_rc = rc
                min_idx = k
            end
        end
    end
    return (min_rc, min_idx)
end

find_entering_var (generic function with 1 method)

In [13]:
# test your function by running this cell

using Random
function make_data(T::Type)
    Random.seed!(1)
    basic_idxs = [2, 4, 6]
    A = T[3 2 1 2 1 0 0; 1 1 1 1 0 1 0; 4 3 3 4 0 0 1]
    B = A[:, basic_idxs]
    B_inv = inv(B) # note this would never happen inside the algorithm, we always have B_inv available
    b = T[225, 117, 420]
    c = -T[19, 13, 12, 17, 0, 0, 0]
    c_b = c[basic_idxs]
    x_b = B_inv * b
    var_status = [0, 1, 0, 2, 0, 3, 0] # one entry per column of A (7 variables); 0 means nonbasic
    pi = B_inv' * c_b
    return (A, b, c, B_inv, pi, var_status, basic_idxs)
end
(A, b, c, B_inv, pi, var_status, basic_idxs) = make_data(Float64)

find_entering_var(A, c, pi, var_status) # should be (-1.5, 1)

(-1.5, 1)
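
As a cross-check, the reduced costs of all variables can also be computed in one vectorized expression, $c - A'\pi$. The sketch below rebuilds the same data as `make_data(Float64)`. The names `pi_dual` and `rc` are introduced here for illustration, and the dual vector is obtained by solving $B'\pi = c_B$ directly rather than by forming `B_inv`.

```julia
using LinearAlgebra

# Same data as in make_data(Float64)
A = Float64[3 2 1 2 1 0 0; 1 1 1 1 0 1 0; 4 3 3 4 0 0 1]
c = -Float64[19, 13, 12, 17, 0, 0, 0]
basic_idxs = [2, 4, 6]

B = A[:, basic_idxs]
pi_dual = B' \ c[basic_idxs]   # solves B' * pi = c_B
rc = c - A' * pi_dual          # reduced costs of all 7 variables

# Basic variables have (numerically) zero reduced cost, so the most
# negative entry belongs to a nonbasic variable.
min_rc, min_idx = findmin(rc)
@show rc
@show (min_rc, min_idx)        # approximately (-1.5, 1)
```

Computing all reduced costs at once is convenient for checking, while the loop in `find_entering_var` avoids touching the basic columns.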

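- We can introduce a type parameter `T` so that the same method accepts different element types.
- `Real` is an abstract type whose subtypes include `Float32` and `Float64`, as well as integer and rational types such as `Int64` and `Rational{Int64}`.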
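
A small sketch of what the `where {T <: Real}` constraint does: all arguments annotated with `T` must share the same element type, and that type must be a subtype of `Real`. The function `same_eltype_dot` is a hypothetical name used only for this illustration.

```julia
# Hypothetical helper: both vectors must have the same element type T <: Real
same_eltype_dot(x::Vector{T}, y::Vector{T}) where {T <: Real} = sum(x .* y)

@show Float64 <: Real
@show Rational{Int} <: Real
@show same_eltype_dot([1.0, 2.0], [3.0, 4.0])
@show same_eltype_dot([1//2, 1//3], [2//1, 3//1])

# Mixing element types does not match the signature, so no method applies
mixed_call_fails = try
    same_eltype_dot([1.0, 2.0], [3, 4])
    false
catch err
    err isa MethodError
end
@show mixed_call_fails
```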

In [14]:
function find_entering_var(
    A::Matrix{T}, 
    c::Vector{T}, 
    pi::Vector{T}, 
    var_status::Vector{Int},
) where {T <: Real}
    min_rc = 0
    min_idx = 0
    for k in eachindex(var_status)
        # only check nonbasic variables
        if iszero(var_status[k])         
            rc = c[k] - dot(A[:, k], pi)
            if rc < min_rc
                min_rc = rc
                min_idx = k
            end
        end
    end
    return (min_rc, min_idx)
end

find_entering_var (generic function with 2 methods)

In [15]:
# let's generate some rational data
(A, b, c, B_inv, pi, var_status, basic_idxs) = make_data(Rational{Int})
@show A
@show b
@show c
;

A = Rational{Int64}[3//1 2//1 1//1 2//1 1//1 0//1 0//1; 1//1 1//1 1//1 1//1 0//1 1//1 0//1; 4//1 3//1 3//1 4//1 0//1 0//1 1//1]
b = Rational{Int64}[225//1, 117//1, 420//1]
c = Rational{Int64}[-19//1, -13//1, -12//1, -17//1, 0//1, 0//1, 0//1]


In [16]:
# test that this "just works" by running this cell
(min_rc, min_idx) = find_entering_var(A, c, pi, var_status) # should be (-3//2, 1)

(-3//2, 1)

#### Let's check if it is type stable

In [17]:
@code_warntype find_entering_var(A, c, pi, var_status)

MethodInstance for find_entering_var(::Matrix{Rational{Int64}}, ::Vector{Rational{Int64}}, ::Vector{Rational{Int64}}, ::Vector{Int64})
  from find_entering_var(A::Matrix{T}, c::Vector{T}, pi::Vector{T}, var_status::Vector{Int64}) where T<:Real @ Main ~/Documents/repos/15.S60_2025/5_linear_programming/jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_Y162sZmlsZQ==.jl:1
Static Parameters
  T = Rational{Int64}
Arguments
  #self#::Core.Const(find_entering_var)
  A::Matrix{Rational{Int64}}
  c::Vector{Rational{Int64}}
  pi::Vector{Rational{Int64}}
  var_status::Vector{Int64}
Locals
  @_6::Union{Nothing, Tuple{Int64, Int64}}
  min_idx::Int64
  min_rc::Union{Rational{Int64}, Int64}
  k::Int64
  rc::Rational{Int64}
Body:

#### We can make it type stable by changing the type of min_rc at initialization

In [18]:
function find_entering_var(
    A::Matrix{T}, 
    c::Vector{T}, 
    pi::Vector{T}, 
    var_status::Vector{Int}
) where {T <: Real}
    min_rc = zero(T) # <----------------
    min_idx = 0
    for k in eachindex(var_status)
        # only check nonbasic variables
        if iszero(var_status[k])
            rc = c[k] - dot(A[:, k], pi)
            if rc < min_rc
                min_rc = rc
                min_idx = k
            end
        end
    end
    return (min_rc, min_idx)
end

@code_warntype find_entering_var(A, c, pi, var_status)

MethodInstance for find_entering_var(::Matrix{Rational{Int64}}, ::Vector{Rational{Int64}}, ::Vector{Rational{Int64}}, ::Vector{Int64})
  from find_entering_var(A::Matrix{T}, c::Vector{T}, pi::Vector{T}, var_status::Vector{Int64}) where T<:Real @ Main ~/Documents/repos/15.S60_2025/5_linear_programming/jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_Y201sZmlsZQ==.jl:1
Static Parameters
  T = Rational{Int64}
Arguments
  #self#::Core.Const(find_entering_var)
  A::Matrix{Rational{Int64}}
  c::Vector{Rational{Int64}}
  pi::Vector{Rational{Int64}}
  var_status::Vector{Int64}
Locals
  @_6::Union{Nothing, Tuple{Int64, Int64}}
  min_idx::Int64
  min_rc::Rational{Int64}
  k::Int64
  rc::Rational{Int64}
Body::Tuple{Rational{Int64}, Int

For those interested, read more [here](https://arxiv.org/abs/2109.01950).

## Exercise (optional)
Complete the function `find_leaving_var` to return `(min_ratio, min_idx)`, i.e. the minimum and the minimizer of:
$$
\min_{k: e_k' B^{-1} A_i > 0} \frac{e_k' B^{-1}b}{e_k' B^{-1} A_i}
$$
If $e_k' B^{-1} A_i \leq 0$ for all $k$, return `(Inf, 0)`, i.e. an infinite ratio and index zero. Assume you are provided the vectors `B_inv_A_i = B \ A_i` and `x_b = B \ b`, as well as a list of basic indices as input.

Test for correctness and type stability by running the box below. 

In [19]:
# hint:
@show typeof(Inf)
@show typeof(Float64(Inf))
@show typeof(Rational{Int}(Inf))

typeof(Inf) = Float64
typeof(Float64(Inf)) = Float64
typeof(Rational{Int}(Inf)) = Rational{Int64}


Rational{Int64}

In [20]:
function find_leaving_var(x_b::Vector{T}, B_inv_A_i::Vector{T}, basic_idxs::Vector{Int}) where {T <: Real}
    min_ratio = T(Inf)
    min_idx = 0
    for k in eachindex(B_inv_A_i)
        if B_inv_A_i[k] > 0
            ratio = x_b[k] / B_inv_A_i[k]
            if ratio < min_ratio
                min_ratio = ratio
                min_idx = k
            end
        end
    end
    return (min_ratio, min_idx)
end

find_leaving_var (generic function with 1 method)

In [21]:
# use our data and entering variable from before
(A, b, c, B_inv, pi, var_status, basic_idxs) = make_data(Float64)
(_, entering_idx) = find_entering_var(A, c, pi, var_status)
x_b = B_inv * b
B_inv_A_i = B_inv * A[:, entering_idx]


(min_ratio, leaving_idx) = find_leaving_var(x_b, B_inv_A_i, basic_idxs) # should be approximately (15.0, 1)

(15.000000000000004, 1)
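
The floating-point result above is only approximately 15. As a sketch of why, the same ratio test can be repeated in exact rational arithmetic. The code below rebuilds the basis from columns 2, 4 and 6 of `A` and uses the entering column of variable 1. The names `d`, `candidates` and `leaving_pos` are introduced here for illustration, and the sketch assumes at least one entry of `d` is positive.

```julia
B = Rational{Int}[2 2 0; 1 1 1; 3 4 0]   # columns 2, 4, 6 of A
b = Rational{Int}[225, 117, 420]
A_1 = Rational{Int}[3, 1, 4]             # entering column (variable 1)

B_inv = inv(B)
x_b = B_inv * b                          # current basic solution
d = B_inv * A_1                          # B^{-1} A_i for the entering variable

# Ratio test over the rows with a positive entry in d
# (assumes `candidates` is nonempty; otherwise the problem is unbounded)
candidates = [k for k in eachindex(d) if d[k] > 0]
min_ratio, pos = findmin(x_b[candidates] ./ d[candidates])
leaving_pos = candidates[pos]

@show x_b d min_ratio leaving_pos
```

Only the first row has a positive denominator here, so the minimum ratio is exactly $30 / 2 = 15$.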

In [22]:
@code_warntype find_leaving_var(x_b, B_inv_A_i, basic_idxs)

MethodInstance for find_leaving_var(::Vector{Float64}, ::Vector{Float64}, ::Vector{Int64})
  from find_leaving_var(x_b::Vector{T}, B_inv_A_i::Vector{T}, basic_idxs::Vector{Int64}) where T<:Real @ Main ~/Documents/repos/15.S60_2025/5_linear_programming/jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_X62sZmlsZQ==.jl:1
Static Parameters
  T = Float64
Arguments
  #self#::Core.Const(find_leaving_var)
  x_b::Vector{Float64}
  B_inv_A_i::Vector{Float64}
  basic_idxs::Vector{Int64}
Locals
  @_5::Union{Nothing, Tuple{Int64, Int64}}
  min_idx::Int64
  min_ratio::Float64
  k::Int64
  ratio::Float64
Body::Tuple{Float64, Int64}
1 ─       (min_ratio = ($(Expr(:static_parameter, 1)))(Main.Inf))
│         (min_idx = 0)
│   %3  = Main.eac

## Why not Julia?

If Julia is so awesome, why do people not use it? Here are some (crowd-sourced) reasons:
- Hard to write: clunky syntax, type stability restrictions
- Hard to debug: reading long, confusing stack traces
- Slow startup
- No inheritance from concrete types
- Immature package ecosystem
- Immature online resources / documentation

### 1) Hard to write / 2) Hard to debug

The good news is that error messages and stack traces, especially from Julia 1.10 onwards, are a lot friendlier than in previous versions of Julia:
- [Easier error messages](https://julialang.org/blog/2023/12/julia-1.10-highlights/#new_parser_written_in_julia)
- [Less verbose stacktraces](https://julialang.org/blog/2023/12/julia-1.10-highlights/#improvements_in_stacktrace_rendering)

### 3) Slow startup

It's a common criticism of Julia that because packages have to be compiled when using them, it might take a really long time to set up your working environment. This time adds up whenever you restart your Julia session regardless of the reason (a hung program, clearing the global namespace etc.) This is known as the "Time to First ___" problem (TTFX). 

The good news is that Julia 1.9 introduced [package extensions](https://julialang.org/blog/2023/04/julia-1.9-highlights/#package_extensions) and [caching of native code](https://julialang.org/blog/2023/04/julia-1.9-highlights/#caching_of_native_code), and Julia 1.10 further pushed the envelope in [package load time](https://julialang.org/blog/2023/12/julia-1.10-highlights/#package_load_time_improvements)! Hopefully this means that your TTFX problem is now within an acceptable range whenever you restart the kernel.

There is also Revise.jl, for people who develop and test their own Julia packages (similar to the idea of "editable installs" in Python), which reduces the need to restart the Julia kernel every time you make a change to your package.

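Another common criticism is the compilation time incurred when a function is run for the first time: the first call with a given combination of argument types includes just-in-time compilation, while later calls with the same argument types reuse the compiled code.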
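
A minimal sketch of this first-call effect, timing the same call twice with `@elapsed`. The helper `sum_of_squares` is a hypothetical function used only for illustration, and the actual timings depend on your machine and Julia version.

```julia
# Hypothetical helper used only for this illustration
sum_of_squares(v) = sum(x -> x^2, v)

v = collect(1.0:1000.0)

first_call = @elapsed sum_of_squares(v)    # includes compilation for Vector{Float64}
second_call = @elapsed sum_of_squares(v)   # reuses the already compiled code

println("first call:  ", first_call, " seconds")
println("second call: ", second_call, " seconds")
```

The first timing is typically much larger than the second, even though both calls do the same work.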

### 4) No inheritance from concrete types

In Python you might be used to an object-oriented paradigm: for example, defining a `Person` class with certain common attributes (age, nationality) and then defining subclasses such as `Student` and `Teacher` with class-specific attributes and methods. When trying to replicate the same design in Julia you run into problems:

In [23]:
struct Person
    age::Int
    nationality::String
end

struct Student <: Person # <: is the subtyping relation
    grade::String
end

ErrorException: invalid subtyping in definition of Student: can only subtype abstract types.

This is a conscious design decision in Julia: concrete types cannot "subclass each other". Specifically, `Person` is a concrete type which cannot be subtyped. Julia forces you to choose one of two options:
- Have an abstract `Person` which cannot be instantiated but can be subtyped;
- Have a concrete `Person` type which can be instantiated but cannot be subtyped further by another user-defined type.

The benefit of making you choose is that once you define a concrete type (e.g. `Student`), the methods you write that take a `Student` instance cannot be overridden through subtyping by someone else who uses your code. Julia favours composition over inheritance and shallow type hierarchies, which takes some getting used to when coming from e.g. Python, but it becomes intuitive!
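
One possible way to express the `Person` / `Student` / `Teacher` design with an abstract supertype plus composition is sketched below. The names `AbstractPerson`, `PersonInfo`, `age` and `describe` are hypothetical choices for this illustration, not a prescribed pattern.

```julia
# Abstract supertype: cannot be instantiated, but can be subtyped
abstract type AbstractPerson end

# Shared fields live in a small concrete struct (composition)
struct PersonInfo
    age::Int
    nationality::String
end

struct Student <: AbstractPerson
    info::PersonInfo
    grade::String
end

struct Teacher <: AbstractPerson
    info::PersonInfo
    subject::String
end

# Shared behaviour is written once against the abstract type...
age(p::AbstractPerson) = p.info.age
describe(p::AbstractPerson) = "person aged $(age(p))"
# ...and specialized for a concrete type where needed
describe(s::Student) = "student in grade $(s.grade), aged $(age(s))"

println(describe(Student(PersonInfo(20, "US"), "A")))
println(describe(Teacher(PersonInfo(40, "FR"), "math")))
```

Here `Teacher` falls back to the generic `describe` method, while `Student` gets its own, which is multiple dispatch doing the work that method overriding does in class-based languages.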

Further reading:
- [Abstract vs. concrete types](https://docs.julialang.org/en/v1/manual/types)
- a [Discourse post](https://discourse.julialang.org/t/method-inheritance-the-julian-way/67198) on Julian inheritance patterns

### 5) Immature package ecosystem

Compared to a language like Python, it is true that Julia has been around for much less time, and "mega-packages" (packages in Python which are so commonly used and widely documented that they are associated with Python itself, such as `numpy` and `pandas`) don't exist to the same scale in Julia. However, the flip side is that Julia is a new and fast-growing community, and packages are being created / ported from other languages at a very fast pace!

Here are some comprehensive packages for whatever you might want to do in scientific computing:
- Statistics: `StatsBase.jl` and `Statistics.jl`
- Machine learning: `MLJ.jl`, `Flux.jl` and `Knet.jl` for deep learning
- Data tools: `DataFrames.jl`, `CSV.jl`, `Arrow.jl` and `Spark.jl` for big data
- Data visualization: `Plots.jl`, `Makie.jl`
- Optimization: `JuMP.jl` and `Optim.jl`
- Differential equations: `DifferentialEquations.jl`

### 6) Immature online resources / documentation

Here are some places to learn more about Julia if you are interested:
- The official Julia documentation is a good place to start; in particular, [Performance tips](https://docs.julialang.org/en/v1/manual/performance-tips) can help you quickly debug slow parts of your code, and the [Style guide](https://docs.julialang.org/en/v1/manual/style-guide) points you toward the Julian way of coding.
- The [Discourse](https://discourse.julialang.org) forum is a good place for Julia FAQs (the Julia-specific version of StackOverflow)
- The [YouTube](https://youtube.com/@TheJuliaLanguage?si=-KIkpckbzjIlM2R7) channel contains talks for various topics in Julia (usually JuliaCon proceedings) where you can learn more about a specific part of the language / specific package
- The Slack channel (link on Julia's homepage) gets you access to the Julia community, where you can ask questions, discuss Julia in general, see what's going on at JuliaCon etc.