# <img src="https://github.com/JuliaLang/julia-logo-graphics/raw/master/images/julia-logo-color.png" height="100" /> _for Pythonistas_

> TL;DR: _Julia looks and feels a lot like Python, only much faster. It's dynamic, expressive, extensible, with batteries included, in particular for Data Science_.

This notebook is an **introduction to Julia for Python programmers**.

It will go through the most important Python features (such as functions, basic types, list comprehensions, exceptions, generators, modules, packages, and so on) and show you how to code them in Julia.

# Getting Started with Julia in Colab/Jupyter
You can either run this notebook in Google Colab, or using Jupyter on your own machine.

## Running on Google Colab
1. Work on a copy of this notebook: _File_ > _Save a copy in Drive_ (you will need a Google account). Alternatively, you can download the notebook using _File_ > _Download .ipynb_, then upload it to [Colab](https://colab.research.google.com/).
2. Execute the following cell (click on it and press Ctrl+Enter) to install Julia, IJulia (the Jupyter kernel for Julia) and other packages. You can update `JULIA_VERSION` and the other parameters, if you know what you're doing. Installation takes 2-3 minutes.
3. Reload this page (press Ctrl+R, or ⌘+R, or the F5 key) and continue to the _Checking the Installation_ section.

* _Note_: If your Colab Runtime gets reset (e.g., due to inactivity), repeat steps 2 and 3.

In [None]:
%%shell
set -e

#---------------------------------------------------#
JULIA_VERSION="1.5.1" # any version ≥ 0.7.0
JULIA_PACKAGES="IJulia BenchmarkTools PyCall PyPlot"
JULIA_PACKAGES_IF_GPU="CUDA"
JULIA_NUM_THREADS=4
#---------------------------------------------------#

if [ -n "$COLAB_GPU" ] && [ -z `which julia` ]; then
  # Install Julia
  JULIA_VER=`cut -d '.' -f -2 <<< "$JULIA_VERSION"`
  echo "Installing Julia $JULIA_VERSION on the current Colab Runtime..."
  BASE_URL="https://julialang-s3.julialang.org/bin/linux/x64"
  URL="$BASE_URL/$JULIA_VER/julia-$JULIA_VERSION-linux-x86_64.tar.gz"
  wget -nv $URL -O /tmp/julia.tar.gz # -nv means "not verbose"
  tar -x -f /tmp/julia.tar.gz -C /usr/local --strip-components 1
  rm /tmp/julia.tar.gz

  # Install Packages
  if [ "$COLAB_GPU" = "1" ]; then
      JULIA_PACKAGES="$JULIA_PACKAGES $JULIA_PACKAGES_IF_GPU"
  fi
  for PKG in `echo $JULIA_PACKAGES`; do
    echo "Installing Julia package $PKG..."
    julia -e 'using Pkg; pkg"add '$PKG'; precompile;"'
  done

  # Install kernel and rename it to "julia"
  echo "Installing IJulia kernel..."
  julia -e 'using IJulia; IJulia.installkernel("julia", env=Dict(
      "JULIA_NUM_THREADS"=>"'"$JULIA_NUM_THREADS"'"))'
  KERNEL_DIR=`julia -e "using IJulia; print(IJulia.kerneldir())"`
  KERNEL_NAME=`ls -d "$KERNEL_DIR"/julia*`
  mv -f $KERNEL_NAME "$KERNEL_DIR"/julia  

  echo ''
  echo "Successfully installed `julia -v`!"
  echo "Please reload this page (press Ctrl+R, ⌘+R, or the F5 key) then"
  echo "jump to the 'Checking the Installation' section."
fi

Installing Julia 1.4.2 on the current Colab Runtime...
2020-07-02 00:00:58 URL:https://storage.googleapis.com/julialang2/bin/linux/x64/1.4/julia-1.4.2-linux-x86_64.tar.gz [99093958/99093958] -> "/tmp/julia.tar.gz" [1]
Installing Julia package IJulia...
    Cloning default registries into `~/.julia`
    Cloning registry from "https://github.com/JuliaRegistries/General.git"
      Added registry `General` to `~/.julia/registries/General`
  Resolving package versions...
  Installed VersionParsing ── v1.2.0
  Installed MbedTLS_jll ───── v2.16.6+0
  Installed SoftGlobalScope ─ v1.0.10
  Installed ZeroMQ_jll ────── v4.3.2+4
  Installed Parsers ───────── v1.0.6
  Installed Conda ─────────── v1.4.1
  Installed JSON ──────────── v0.21.0
  Installed IJulia ────────── v1.21.2
  Installed ZMQ ───────────── v1.2.1
  Installed MbedTLS ───────── v1.0.2
Downloading artifact: MbedTLS
######################################################################## 100.0%
Downloading artif



## Running This Notebook Locally
If you prefer to run this notebook on your machine instead of Google Colab:

* Download this notebook (File > Download .ipynb)
* Install [Julia](https://julialang.org/downloads/)
* Run the following command in a terminal to install `IJulia` (the Jupyter kernel for Julia), and a few packages we will use:
```bash
julia -e 'using Pkg
            pkg"add IJulia; precompile;"
            pkg"add BenchmarkTools; precompile;"
            pkg"add PyCall; precompile;"
            pkg"add PyPlot; precompile;"'
```

* Next, go to the directory containing this notebook:

    ```bash
    cd /path/to/notebook/directory
    ```

* Start Jupyter Notebook:

    ```bash
    julia -e 'using IJulia; IJulia.notebook()'
    ```

    Or replace `notebook()` with `jupyterlab()` if you prefer JupyterLab.

    If you do not already have [Jupyter](https://jupyter.org/install) installed, IJulia will propose to install it. If you agree, it will automatically install a private Miniconda (just for Julia), and install Jupyter and Python inside it.

* Lastly, open this notebook and skip directly to the next section.

## Checking the Installation
The `versioninfo()` function should print your Julia version and some other info about the system (if you ever ask for help or file an issue about Julia, you should always provide this information).

In [1]:
versioninfo()

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, broadwell)
Environment:
  JULIA_NUM_THREADS = 4


# Writing/Reading Files
The `do` syntax we saw earlier is helpful when using the `open()` function:


In [260]:
open("test.txt", "w") do f
    write(f, "This is a test.\n")
    write(f, "I repeat, this is a test.\n")
end

open("test.txt") do f
    for line in eachline(f)
        println("[$line]")
    end
end

[This is a test.]
[I repeat, this is a test.]


The `open()` function automatically closes the file at the end of the block. Notice that the line feeds `\n` at the end of each line are not returned by the `eachline()` function. So the equivalent Python code is:

```python
# PYTHON
with open("test.txt", "w") as f:
    f.write("This is a test.\n")
    f.write("I repeat, this is a test.\n")

with open("test.txt") as f:
    for line in f.readlines():
        line = line.rstrip("\n")
        print(f"[{line}]")
```

Alternatively, you can read the whole file into a string:

In [261]:
open("test.txt") do f
    s = read(f, String)
end

"This is a test.\nI repeat, this is a test.\n"

Or more concisely:

In [262]:
s = read("test.txt", String)

"This is a test.\nI repeat, this is a test.\n"

The Python equivalent is:

```python
# PYTHON
with open("test.txt") as f:
    s = f.read()
```

# Exceptions

Julia's exceptions behave very much like in Python:

In [263]:
a = [1]
try
    push!(a, 2)
    #throw("Oops") # try uncommenting this line
    push!(a, 3)
catch ex
    println(ex)
    push!(a, 4)
finally
    push!(a, 5)
end
println(a)

[1, 2, 3, 5]


The equivalent Python code is:

```python
# PYTHON
a = [1]
try:
    a.append(2)
    #raise Exception("Oops") # try uncommenting this line
    a.append(3)
except Exception as ex:
    print(ex)
    a.append(4)
finally:
    a.append(5)

print(a)
```

There is a whole hierarchy of standard exceptions which can be thrown, just like in Python. For example:

In [264]:
choice = 1 # try changing this value (from 1 to 4)
try
    choice == 1 && open("/foo/bar/i_dont_exist.txt")
    choice == 2 && sqrt(-1)
    choice == 3 && push!(a, "Oops")
    println("Everything worked like a charm")
catch ex
    if ex isa SystemError
        println("Oops. System error #$(ex.errnum) ($(ex.prefix))")
    elseif ex isa DomainError
        println("Oh no, I could not compute sqrt(-1)")
    else
        println("I got an unexpected error: $ex")
    end
end

Oops. System error #2 (opening file "/foo/bar/i_dont_exist.txt")


Compare this with Python's equivalent code:

```python
# PYTHON
import math

choice = 3 # try changing this value (from 1 to 4)
try:
    if choice == 1:
        open("/foo/bar/i_dont_exist.txt")
    if choice == 2:
        math.sqrt(-1)
    if choice == 3:
        #a.append("Ok") # this would actually work
        raise TypeError("Oops") # so let's fail manually
    print("Everything worked like a charm")
except OSError as ex:
    print(f"Oops. OS error #{ex.errno} ({ex.strerror})")
except ValueError:
    print("Oh no, I could not compute sqrt(-1)")
except Exception as ex:
    print(f"I got an unexpected error: {ex}")
```


A few things to note here:

* Julia only allows a single `catch` block which handles all possible exceptions.
* `obj isa SomeClass` is a shorthand for `isa(obj, SomeClass)` which is equivalent to Python's `isinstance(obj, SomeClass)`.

|Julia|Python
|-----|------
|`try`<br />&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />`catch ex`<br />&nbsp;&nbsp;&nbsp;&nbsp;`if ex isa SomeError`<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />&nbsp;&nbsp;&nbsp;&nbsp;`else`<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />&nbsp;&nbsp;&nbsp;&nbsp;`end`<br />`finally`<br />&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />`end` | `try:`<br />&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />`except SomeException as ex:`<br />&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />`except Exception as ex:`<br />&nbsp;&nbsp;&nbsp;&nbsp;`...`<br />`finally:`<br />&nbsp;&nbsp;&nbsp;&nbsp;`...`
|`throw(any_value)` | `raise SomeException(...)`
| `obj isa SomeType`<br />or<br /> `isa(obj, SomeType)` | `isinstance(obj, SomeType)`

Note that the Julia versions used in this notebook do not support the equivalent of Python's `try / except / else` construct (an `else` clause for `try` blocks was only added in Julia 1.8). You need to write something like this:

In [265]:
catch_exception = true
try
    println("Try something")
    #error("ERROR: Catch me!") # try uncommenting this line
    catch_exception = false
    #error("ERROR: Don't catch me!") # try uncommenting this line
    println("No error occurred")
catch ex
    if catch_exception
        println("I caught this exception: $ex")
    else
        throw(ex)
    end
finally
    println("The end")
end
println("After the end")

Try something
No error occurred
The end
After the end


The equivalent Python code is shorter, but it's fairly uncommon:

```python
# PYTHON
try:
    print("Try something")
    #raise Exception("Catch me!") # try uncommenting this line
except Exception as ex:
    print(f"I caught this exception: {ex}")
else:
    #raise Exception("Don't catch me!") # try uncommenting this line
    print("No error occurred")
finally:
    print("The end")

print("After the end")
```

# Docstrings
It's good practice to add docstrings to every function you export. The docstring is placed just _before_ the definition of the function:

In [266]:
"Compute the square of number x"
square(x::Number) = x^2

square

You can retrieve a function's docstring using the `@doc` macro:

In [267]:
@doc square

Compute the square of number x
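
For comparison, here is a minimal Python sketch of the same idea. In Python, the docstring goes _inside_ the function body, right after the signature, and it can be read back through the `__doc__` attribute (or displayed with `help(square)`):

```python
# PYTHON
def square(x):
    """Compute the square of number x"""
    return x ** 2

# The docstring is stored on the function object itself
print(square.__doc__)
```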


The docstring is displayed when asking for help:

In [268]:
?square

search: square Square Square_v2 MySquares AbstractSquare lastdayofquarter



Compute the square of number x


Docstrings follow the [Markdown format](https://en.wikipedia.org/wiki/Markdown).
A typical docstring starts with the signature of the function, indented by 4 spaces, so it will get syntax highlighted as Julia code.
It also includes an `Examples` section with Julia REPL outputs:

In [269]:
"""
    cube(x::Number)

Compute the cube of `x`.

# Examples
```julia-repl
julia> cube(5)
125
julia> cube(im)
0 - 1im
```
"""
cube(x) = x^3

cube

Instead of using `julia-repl` code blocks for the examples, you can use `jldoctest` to mark these examples as doctests (similar to Python's doctests).

The help gets nicely formatted:

In [270]:
?cube

search: cube Cdouble



```
cube(x::Number)
```

Compute the cube of `x`.

# Examples

```julia-repl
julia> cube(5)
125
julia> cube(im)
0 - 1im
```
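
Since `jldoctest` blocks are similar to Python's doctests, here is a small sketch of what a Python doctest looks like, using the standard `doctest` module. The examples in the docstring are executed and their output is compared with the expected text. `run_docstring_examples()` prints nothing when every example passes:

```python
# PYTHON
import doctest

def cube(x):
    """
    Compute the cube of x.

    >>> cube(5)
    125
    >>> cube(-2)
    -8
    """
    return x ** 3

# Run the examples found in cube's docstring; silent if they all pass
doctest.run_docstring_examples(cube, {"cube": cube}, name="cube")
```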


When there are several methods for a given function, it is common to give general information about the function in the first method (usually the most generic), and only add docstrings to other methods if they add useful information (without repeating the general info).

Alternatively, you may attach the general information to the function itself:

In [271]:
"""
    foo(x)

Compute the foo of the bar
"""
function foo end  # declares the foo function

# foo(x::Number) behaves normally, no need for a docstring
foo(x::Number) = "baz"

"""
    foo(x::String)

For strings, compute the qux of the bar instead.
"""
foo(x::String) = "qux"

foo

In [272]:
?foo

search: foo floor pointer_from_objref waitforbuttonpress OverflowError



```
foo(x)
```

Compute the foo of the bar

---

```
foo(x::String)
```

For strings, compute the qux of the bar instead.


# Macros

We have seen a few macros already: `@which`, `@assert`, `@time`, `@benchmark`, `@btime` and `@doc`. You guessed it: all macros start with an `@` sign.

What is a macro? It is a function which can fully inspect the expression that follows it, and apply any transformation to that code at parse time, before compilation.

This makes it possible for anyone to effectively extend the language in any way they please. Whereas C/C++ macros just do simple text replacement, **Julia macros are powerful meta-programming tools**.

On the flip side, this also means that **each macro has its own syntax and behavior**.

**A personal opinion**: in my experience, languages that provide great flexibility typically attract a community of programmers with a tinkering mindset, who will _love_ to experiment with all the fun features the language has to offer. This is great for creativity, but it can also be a nuisance if the community ends up producing too much experimental code, without much care for code reliability, API stability, or even for simplicity. By all means, let's be creative, let's experiment, but _with great power comes great responsibility_: let's also value reliability, stability and simplicity.

That said, to give you an idea of what macro definitions look like in Julia, here's a simple toy macro that replaces `a + b` expressions with `a - b`, and leaves other expressions alone.

In [273]:
macro addtosub(x)
  if x.head == :call && x.args[1] == :+ && length(x.args) == 3
    Expr(:call, :-, x.args[2], x.args[3])
  else
    x
  end
end

@addtosub 10 + 2

8

In this macro definition, `:call`, `:+` and `:-` are **symbols**. These are similar to strings, only more efficient and less flexible. They are typically used as identifiers, such as keys in dictionaries.

If you're curious, the macro works because the parser converts `10 + 2` to `Expr(:call, :+, 10, 2)` and passes this expression to the macro (before compilation). The `if` statement checks that the expression is a function call, where the called function is the `+` function, with two arguments. If so, then the macro returns a new expression, corresponding to a call to the `-` function, with the same arguments. So `a + b` becomes `a - b`.
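
Python has no macros, but its standard `ast` module gives a rough feel for what working on an expression tree means. The sketch below is only an analogy: it operates at runtime on a source string (not at parse time like a Julia macro), and the `addtosub()` helper is a name made up for this illustration. It parses an expression, swaps a top-level `+` for `-`, then compiles and evaluates the result:

```python
# PYTHON
import ast

def addtosub(source):
    # Parse the source string into an expression tree (an ast.Expression node)
    tree = ast.parse(source, mode="eval")
    expr = tree.body
    # Only rewrite a top-level binary `+`; leave other expressions alone
    if isinstance(expr, ast.BinOp) and isinstance(expr.op, ast.Add):
        expr.op = ast.Sub()
    # Compile the (possibly modified) tree and evaluate it
    return eval(compile(tree, "<addtosub>", "eval"))

print(addtosub("10 + 2"))  # 8
print(addtosub("10 * 2"))  # 20
```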

For more info, check out [this page](https://docs.julialang.org/en/v1/manual/metaprogramming/).

## Special Prefixed Strings

`py"..."` strings are defined by the `PyCall` module. Writing `py"something"` is equivalent to writing `@py_str "something"`. In other words, anyone can write a macro that defines a new kind of prefixed string. For example, if you write the `@ok_str` macro, it will be called when you write `ok"something"`.

Another example is the `Pkg` module which defines the `@pkg_str` macro: this is why you can use `pkg"..."` to interact with the `Pkg` module. This is how `pkg"add PyCall; precompile;"` worked (at the end of the very first cell). This downloaded, installed and precompiled the `PyCall` module.

# Modules
In Python, a module must be defined in a dedicated file. In Julia, modules are independent from the file system. You can define several modules per file, or define one module across multiple files, it's up to you. Let's create a simple module containing two submodules, each containing a variable and a function:

In [274]:
module ModA
    pi = 3.14
    square(x) = x^2

    module ModB
        e = 2.718
        cube(x) = x^3
    end

    module ModC
        root2 = √2
        relu(x) = max(0, x)
    end
end

Main.ModA

The default module is `Main`, so whatever we define is put in this module (except when defining a package, as we will see). This is why `ModA`'s full name is `Main.ModA`.

We can now access the contents of these modules by providing the full paths:

In [275]:
Main.ModA.ModC.root2

1.4142135623730951

Since our code runs in the `Main` module, we can leave out the `Main.` part:

In [276]:
ModA.ModC.root2

1.4142135623730951

Alternatively, you can use `import`:

In [277]:
import Main.ModA.ModC.root2

root2

1.4142135623730951

Or we can use `import` with a relative path. In this case, we need to prefix `ModA` with a dot `.` to indicate that we want the module `ModA` located in the current module:

In [278]:
import .ModA.ModC.root2

root2

1.4142135623730951

Alternatively, we can `import` the submodule:

In [279]:
import .ModA.ModC

ModC.root2

1.4142135623730951

When you want to import more than one name from a module, you can use this syntax:

In [280]:
import .ModA.ModC: root2, relu

This is equivalent to this more verbose syntax:

In [281]:
import .ModA.ModC.root2, .ModA.ModC.relu

Nested modules do <u>not</u> automatically have access to names in enclosing modules. To import names from a parent module, use `..x`. From a grand-parent module, use `...x`, and so on.

In [282]:
module ModD
    d = 1
    module ModE
        try
            println(d)
        catch ex
            println(ex)
        end
    end
    module ModF
        f = 2
        module ModG
            import ..f
            import ...d
            println(f)
            println(d)
        end
    end
end

UndefVarError(:d)
2
1


Main.ModD

Instead of `import`, you can use `using`. It is analogous to Python's `from foo import *`. It only gives access to names which were explicitly exported using `export` (similar to the way `from foo import *` in Python only imports names listed in the module's `__all__` list):

In [283]:
module ModH
    h1 = 1
    h2 = 2
    export h1
end

Main.ModH

In [284]:
using .ModH

println(h1)

try
    println(h2)
catch ex
    ex
end

1


UndefVarError(:h2)
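
To make the comparison with `__all__` concrete, here is a hedged Python sketch. To stay self-contained, it builds a throwaway module in memory (the name `modh` is made up for this example) instead of writing a `modh.py` file, and it runs the star-import in a separate namespace so that we can inspect which names were brought in:

```python
# PYTHON
import sys
import types

# Build a small module in memory; only h1 is listed in __all__
modh = types.ModuleType("modh")
exec("h1 = 1\nh2 = 2\n__all__ = ['h1']", modh.__dict__)
sys.modules["modh"] = modh

# Run the star-import in its own namespace, then look at what it imported
namespace = {}
exec("from modh import *", namespace)
print("h1" in namespace)  # True
print("h2" in namespace)  # False

# Names missing from __all__ are still reachable through the module itself
print(modh.h2)  # 2
```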

Note that `using Foo` not only imports all exported names (like Python's `from foo import *`), it also imports `Foo` itself (similarly, `using Foo.Bar` imports `Bar` itself):


In [285]:
ModH

Main.ModH

Even if a name is not exported, you can always access it using its full path, or using `import`:

In [286]:
ModH.h2

2

In [287]:
import .ModH.h2

h2

2

You can also import individual names like this:

In [288]:
module ModG
    g1 = 1
    g2 = 2
    export g2
end

using .ModG: g1, g2

println(g1)
println(g2)

1
2


Notice that this syntax gives you access to any name you want, whether or not it was exported. In other words, whether a name is exported or not only affects the `using Foo` syntax.

Importantly, when you want to extend a function which is defined in a module, you must import the function using `import`, or you must specify the function's path:

In [289]:
module ModH
    double(x) = x * 2
    triple(x) = x * 3
end

import .ModH: double
double(x::AbstractString) = repeat(x, 2)

ModH.triple(x::AbstractString) = repeat(x, 3)

println(double(2))
println(double("Two"))

println(ModH.triple(3))
println(ModH.triple("Three"))

4
TwoTwo
9
ThreeThreeThree




You must never extend a function imported with `using`, unless you provide the function's path:

In [290]:
module ModI
    quadruple(x) = x * 4
    export quadruple
end

using .ModI
ModI.quadruple(x::AbstractString) = repeat(x, 4) # OK
println(quadruple(4))
println(quadruple("Four"))

#quadruple(x::AbstractString) = repeat(x, 4) # uncomment to see the error

16
FourFourFourFour


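In the Julia versions used in this notebook, there is no equivalent of Python's `import foo as x` (see [this issue](https://github.com/JuliaLang/julia/issues/1255); an `import Foo as X` syntax was added in Julia 1.6), but you can do something like this: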

In [291]:
import .ModI: quadruple
x = quadruple

quadruple (generic function with 2 methods)

In general, a module named `Foo` will be defined in a file named `Foo.jl` (along with its submodules). However, if the module becomes too big for a single file, you can split it into multiple files and include these files in `Foo.jl` using the `include()` function.

For example, let's create three files: `Awesome.jl`, `great.jl` and `amazing/Fantastic.jl`, where:
* `Awesome.jl` defines the `Awesome` module and includes the other two files
* `great.jl` just defines a function
* `amazing/Fantastic.jl` defines the `Fantastic` submodule

In [292]:
code_awesome = """
module Awesome
include("great.jl")
include("amazing/Fantastic.jl")
end
"""

code_great = """
great() = "This is great!"
"""

code_fantastic = """
module Fantastic
fantastic = true
end
"""

open(f->write(f, code_awesome), "Awesome.jl", "w")
open(f->write(f, code_great), "great.jl", "w")
mkdir("amazing")
open(f->write(f, code_fantastic), "amazing/Fantastic.jl", "w")

38

If we try to execute `import Awesome` now, it won't work since Julia does not search in the current directory by default. Let's change this:

In [293]:
pushfirst!(LOAD_PATH, ".")

4-element Array{String,1}:
 "."
 "@"
 "@v#.#"
 "@stdlib"

Now when we import the `Awesome` module, Julia will look for a file named `Awesome.jl` in the current directory, or for `Awesome/src/Awesome.jl`, or for `Awesome.jl/src/Awesome.jl`. If it does not find any of these, it will look in the other places listed in the `LOAD_PATH` array (we will discuss this in more detail in the "Package Management" section).

In [294]:
import Awesome
println(Awesome.great())
println("Is fantastic? ", Awesome.Fantastic.fantastic)

┌ Info: Precompiling Awesome [top-level]
└ @ Base loading.jl:1260


This is great!
Is fantastic? true


Let's restore the original `LOAD_PATH`:

In [295]:
popfirst!(LOAD_PATH)

"."
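
Python's counterpart of `LOAD_PATH` is the `sys.path` list (which, unlike Julia's, usually already contains the script's directory, or the current directory in interactive sessions). As a rough sketch of the same push / import / pop sequence, the code below writes a tiny module to a temporary directory (`awesome_demo` is a made-up module name), adds that directory to the front of `sys.path`, imports the module, then restores the path:

```python
# PYTHON
import importlib
import os
import sys
import tempfile

# Create a throwaway directory containing awesome_demo.py
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "awesome_demo.py"), "w") as f:
    f.write('def great():\n    return "This is great!"\n')

sys.path.insert(0, module_dir)   # similar to pushfirst!(LOAD_PATH, ".")
importlib.invalidate_caches()    # make sure the new file is noticed
awesome_demo = importlib.import_module("awesome_demo")
print(awesome_demo.great())

sys.path.remove(module_dir)      # restore the original search path
```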

In short:

|Julia | Python
|------|-------
|`import Foo` | `import foo`
|`import Foo.Bar` | `from foo import bar`
|`import Foo.Bar: a, b` | `from foo.bar import a, b`
|`import Foo.Bar.a, Foo.Bar.b` | `from foo.bar import a, b`
|`import .Foo` | `from . import foo`
|`import ..Foo.Bar` | `from ..foo import bar`
|`import ...Foo.Bar` | `from ...foo import bar`
|`import .Foo: a, b` | `from .foo import a, b`
||
|`using Foo` | `from foo import *; import foo`
|`using Foo.Bar` | `from foo.bar import *; from foo import bar`
|`using Foo.Bar: a, b` | `from foo.bar import a, b`

|Extending function `Foo.f()` | Result
|-----------------------------|--------
|`import Foo.f  # or Foo: f` <br />`f(x::Int64) = ...`  | OK
|`import Foo`<br />`Foo.f(x::Int64) = ...` | OK
|`using Foo`<br />`Foo.f(x::Int64) = ...` | OK
|`import Foo.f # or Foo: f`<br />`Foo.f(x::Int64) = ...` | `ERROR: Foo not defined`
|`using Foo`<br />`f(x::Int64) = ...` | `ERROR: Foo.f must be explicitly imported`
|`using Foo: f`<br />`f(x::Int64) = ...` | `ERROR: Foo.f must be explicitly imported`

# Scopes
Julia has two types of scopes: global and local.

Every module has its own global scope, independent from all other global scopes. There is no overarching global scope.

Modules, macros and types (including structs) can only be defined in a global scope.

Most code blocks, including `function`, `struct`, `for`, `while`, etc., have their own local scope. For example:

In [296]:
for q in 1:3
    println(q)
end

try
    println(q) # q is not available here
catch ex
    ex
end

1
2
3


UndefVarError(:q)
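
This is a difference from Python, where a `for` loop does not create a new scope: the loop variable is still available (with its last value) after the loop ends. A quick sketch for comparison:

```python
# PYTHON
for q in range(1, 4):
    print(q)

print(q)  # q is still defined here, and holds the last value: 3
```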

A local scope inherits from its parent scope:

In [297]:
z = 5
for i in 1:3
    w = 10
    println(i * w * z) # i and w are local, z is from the parent scope
end

50
100
150


An inner scope can assign to a variable in the parent scope, if the parent scope is not global:

In [298]:
for i in 1:3
    s = 0
    for j in 1:5
        s = j # variable s is from the parent scope
    end
    println(s)
end

5
5
5


You can force a variable to be local by using the `local` keyword:

In [299]:
for i in 1:3
    s = 0
    for j in 1:5
        local s = j # variable s is local now
    end
    println(s)
end

0
0
0


To assign to a global variable, you must declare the variable as `global` in the local scope:

In [300]:
for i in 1:3
    global p
    p = i
end
p

3

There is one exception to this rule: when executing code directly in the REPL (since Julia 1.5) or in IJulia, you do not need to declare a variable as `global` if the global variable already exists:

In [301]:
s = 0
for i in 1:3
    s = i # implicitly global s: only in REPL Julia 1.5+ or IJulia
end
s

3

In functions, assigning to a variable which is not explicitly declared as global always makes it local (even in the REPL and IJulia):

In [302]:
s, t = 1, 2 # globals

function foo()
   s = 10 * t # s is local, t is global
end

println(foo())
println(s)

20
1
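
Python behaves the same way here: inside a function, assigning to a name makes it local unless it is declared `global` (or `nonlocal`), while names that are only read are looked up in the enclosing scopes. A small sketch for comparison:

```python
# PYTHON
s, t = 1, 2

def foo():
    s = 10 * t  # s is a new local variable, t is read from the enclosing scope
    return s

print(foo())  # 20
print(s)      # 1 (the outer s is unchanged)
```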


Just like in Python, functions can capture variables from the enclosing scope (not from the scope the function is called from):

In [303]:
t = 1

foo() = t # foo() captures t from the global scope

function bar()
    t = 5 # this is a new local variable
    println(foo()) # foo() still uses t from the global scope
end

bar()

1


In [304]:
function quz()
    global t
    t = 5 # we change the global t
    println(foo()) # and this affects foo()
end

quz()

5


Closures work much like in Python:

In [305]:
function create_multiplier(n)
    function mul(x)
        x * n # variable n is captured from the parent scope
    end
end

mul2 = create_multiplier(2)
mul2(5)

10

An inner function can modify variables from its parent scope:

In [306]:
function create_counter()
    c = 0
    inc() = c += 1 # this inner function modifies the c from the outer function
end

cnt = create_counter()
println(cnt())
println(cnt())

1
2
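
In Python, the same counter needs an explicit `nonlocal` declaration, because an assignment such as `c += 1` would otherwise make `c` local to the inner function. A sketch of the Python counterpart:

```python
# PYTHON
def create_counter():
    c = 0
    def inc():
        nonlocal c  # without this line, `c += 1` raises UnboundLocalError
        c += 1
        return c
    return inc

cnt = create_counter()
print(cnt())  # 1
print(cnt())  # 2
```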


Consider the following code, and see if you can figure out why it prints the same result multiple times:

In [307]:
funcs = []
i = 1
while i ≤ 5
    push!(funcs, ()->i^2)
    global i += 1
end
for fn in funcs
    println(fn())
end

36
36
36
36
36


The answer is that there is a single variable `i`, which is captured by all 5 closures. By the time these closures are executed, the value of `i` is 6, so the square is 36, for every closure.

If we use a `for` loop, we don't have this problem, since a new local variable is created at every iteration:

In [308]:
funcs = []
for i in 1:5
    push!(funcs, ()->i^2)
end
for fn in funcs
    println(fn())
end

1
4
9
16
25


Any local variable created within a `for` loop, a `while` loop or a comprehension also gets a new copy at each iteration. So we could code the above example like this:

In [309]:
funcs = []
i = 1
while i ≤ 5  # since we are in a while loop...
    global i
    local j = i # ...and j is created here, it's a new `j` at each iteration
    push!(funcs, ()->j^2)
    i += 1
end
for fn in funcs
    println(fn())
end

1
4
9
16
25


Another way to get the same result is to use a `let` block, which also creates a new local variable every time it is executed:

In [310]:
funcs = []
i = 0
while i < 5
    let i=i
        push!(funcs, ()->i^2)
    end
    global i += 1
end
for fn in funcs
    println(fn())
end

0
1
4
9
16


This `let i=i` block defines a new local variable `i` at every iteration, and initializes it with the value of `i` from the parent scope. Therefore each closure captures a different local variable `i`.
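
Python programmers may know this pitfall as "late binding closures". One difference is worth noting: in Python, a `for` loop does _not_ create a new variable at each iteration, so the problem shows up with `for` loops too. A common workaround, binding the current value through a default argument, plays a role similar to `let i=i`. A minimal sketch:

```python
# PYTHON
funcs = []
for i in range(1, 6):
    funcs.append(lambda: i**2)  # all lambdas share the same variable i
late = [fn() for fn in funcs]
print(late)   # [25, 25, 25, 25, 25]

funcs = []
for i in range(1, 6):
    funcs.append(lambda i=i: i**2)  # the default argument stores the current value
bound = [fn() for fn in funcs]
print(bound)  # [1, 4, 9, 16, 25]
```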

Variables in a `let` block are initialized from left to right, so they can access variables on their left:

In [311]:
a = 1
let a=a+1, b=a
    println("a=$a, b=$b")
end

a=2, b=2


In this example, the local variable `a` is initialized with the value of `a + 1`, where `a` comes from the parent scope (i.e., it's the global `a` in this case). However, `b` is initialized with the value of the local `a`, since it now hides the variable `a` from the parent scope.

Default values in function arguments also have this left-to-right scoping logic:

In [312]:
a = 1
foobar(a=a+1, b=a) = println("a=$a, b=$b")
foobar()
foobar(5)

a=2, b=2
a=5, b=5

In this example, the first argument's default value is `a + 1`, where `a` comes from the parent scope (i.e., the global `a` in this case). However, the second argument's default value is `a`, where `a` in this case is the value of the first argument (<u>not</u> the parent scope's `a`).
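
Python differs here: default values are evaluated once, when the `def` statement runs, in the enclosing scope, so a default cannot refer to an earlier parameter. In the sketch below, both defaults are computed from the outer `a`:

```python
# PYTHON
a = 1

def foobar(a=a + 1, b=a):  # both defaults use the outer a (which is 1)
    return f"a={a}, b={b}"

print(foobar())   # a=2, b=1
print(foobar(5))  # a=5, b=1
```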

Note that `if` blocks and `begin` blocks do <u>not</u> have their own local scope, they just use the parent scope:

In [313]:
a = 1
if true
    a = 2 # same `a` as above
end
a

2

In [314]:
a = 1
begin
    a = 2  # same `a` as above
end
a

2

# Package Management


## Basic Workflow
The simplest way to write a Julia program is to create a `.jl` file somewhere and run it using `julia`. You would usually do this with your favorite editor, but in this notebook we must do this programmatically. For example:

In [315]:
code = """
println("Hello world")
"""

open(f->write(f, code), "my_program1.jl", "w")

23

Then let's run the program using a shell command:

In [316]:
;julia my_program1.jl

Hello world


If you need to use a package which is not part of the standard library, such as `PyCall`, you first need to install it using Julia's package manager `Pkg`:

In [317]:
using Pkg
Pkg.add("PyCall")

   Updating registry at `~/.julia/registries/General`
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]


Alternatively, in interactive mode, you can enter the `Pkg` mode by typing `]`, then type a command:

In [318]:
]add PyCall

  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]


You can also precompile the new package to avoid the compilation delay when the package is first used:

In [319]:
]add PyCall; precompile;

  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]
Precompiling project...


One last alternative is to use `pkg"..."` strings to run commands in your programs:

In [320]:
pkg"add PyCall; precompile;"

  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [no changes]
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [no changes]
Precompiling project...


Now you can import `PyCall` in any of your Julia programs:

In [321]:
code = """
using PyCall
py"print('1 + 2 =', 1 + 2)"
"""

open(f->write(f, code), "my_program2.jl", "w")

41

In [322]:
;julia my_program2.jl

1 + 2 = 3


You can also add packages by providing their URL (typically on GitHub). This is useful when you want to use a package which is not in the [official Julia Package registry](https://github.com/JuliaRegistries/General), or when you want the very latest version of a package:

In [323]:
]add https://github.com/JuliaLang/Example.jl

    Cloning git-repo `https://github.com/JuliaLang/Example.jl`
   Updating git-repo `https://github.com/JuliaLang/Example.jl`
  Resolving package versions...
   Updating `~/.julia/environments/v1.4/Project.toml`
 [7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)
   Updating `~/.julia/environments/v1.4/Manifest.toml`
 [7876af07] + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)


You can install a specific package version like this:

In [324]:
]add PyCall@1.91.3

[32m[1m  Resolving[22m[39m package versions...
[32m[1m  Installed[22m[39m PyCall ─ v1.91.3
[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Project.toml`
 [90m [438e738f][39m[95m ↓ PyCall v1.91.4 ⇒ v1.91.3[39m
[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Manifest.toml`
 [90m [438e738f][39m[95m ↓ PyCall v1.91.4 ⇒ v1.91.3[39m
[32m[1m   Building[22m[39m PyCall → `~/.julia/packages/PyCall/kAhnQ/deps/build.log`


If you only specify version `1` or version `1.91`, Julia will get the latest version with that prefix. For example, `]add PyCall@1.91` would install the latest version `1.91.x`.

You can also update a package to its latest version:

In [325]:
]update PyCall

[32m[1m   Updating[22m[39m registry at `~/.julia/registries/General`


[?25l[2K

[32m[1m   Updating[22m[39m git-repo `https://github.com/JuliaRegistries/General.git`


[?25h

[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Project.toml`
 [90m [438e738f][39m[93m ↑ PyCall v1.91.3 ⇒ v1.91.4[39m
[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Manifest.toml`
 [90m [438e738f][39m[93m ↑ PyCall v1.91.3 ⇒ v1.91.4[39m


You can update all packages to their latest versions:

In [326]:
]update

[32m[1m   Updating[22m[39m registry at `~/.julia/registries/General`


[?25l[2K

[32m[1m   Updating[22m[39m git-repo `https://github.com/JuliaRegistries/General.git`


[?25h[?25l[2K

[32m[1m   Updating[22m[39m git-repo `https://github.com/JuliaLang/Example.jl`


[?25h

[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Project.toml`
[90m [no changes][39m
[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Manifest.toml`
[90m [no changes][39m


If you don't want a particular package to be updated the next time you call `]update`, you can pin it:

In [327]:
]pin PyCall

[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Project.toml`
 [90m [438e738f][39m[93m ~ PyCall v1.91.4 ⇒ v1.91.4 ⚲[39m
[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Manifest.toml`
 [90m [438e738f][39m[93m ~ PyCall v1.91.4 ⇒ v1.91.4 ⚲[39m


To unpin the package:

In [328]:
]free PyCall

[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Project.toml`
 [90m [438e738f][39m[93m ~ PyCall v1.91.4 ⚲ ⇒ v1.91.4[39m
[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Manifest.toml`
 [90m [438e738f][39m[93m ~ PyCall v1.91.4 ⚲ ⇒ v1.91.4[39m


You can also run the tests defined in a package:

In [329]:
]test Example

[32m[1m    Testing[22m[39m Example
[32m[1mStatus[22m[39m `/tmp/jl_2kZjcq/Manifest.toml`
 [90m [7876af07][39m[37m Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)[39m
 [90m [2a0f44e3][39m[37m Base64 [39m
 [90m [8ba89e20][39m[37m Distributed [39m
 [90m [b77e0a4c][39m[37m InteractiveUtils [39m
 [90m [56ddb016][39m[37m Logging [39m
 [90m [d6f4376e][39m[37m Markdown [39m
 [90m [9a3f8284][39m[37m Random [39m
 [90m [9e88b42a][39m[37m Serialization [39m
 [90m [6462fe0b][39m[37m Sockets [39m
 [90m [8dfed614][39m[37m Test [39m
[32m[1m    Testing[22m[39m Example tests passed 


Of course, you can remove a package:

In [330]:
]rm Example

[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Project.toml`
 [90m [7876af07][39m[91m - Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)[39m
[32m[1m   Updating[22m[39m `~/.julia/environments/v1.4/Manifest.toml`
 [90m [7876af07][39m[91m - Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)[39m


Lastly, you can check which packages are installed using `]status` (or `]st` for short):

In [331]:
]st

[32m[1mStatus[22m[39m `~/.julia/environments/v1.4/Project.toml`
 [90m [6e4b80f9][39m[37m BenchmarkTools v0.5.0[39m
 [90m [052768ef][39m[37m CUDA v1.0.2[39m
 [90m [7073ff75][39m[37m IJulia v1.21.2[39m
 [90m [438e738f][39m[37m PyCall v1.91.4[39m
 [90m [d330b81b][39m[37m PyPlot v2.9.0[39m


For more `Pkg` commands, type `]help`.

|Julia (in interactive mode) | Python (in a terminal)
|-----|------
|`]status` | `pip freeze`<br />or<br />`conda list`
|`]add Foo` | `pip install foo`<br />or<br />`conda install foo`
|`]add Foo@1.2` | `pip install foo==1.2`<br />or<br />`conda install foo=1.2`
|`]update Foo` | `pip install --upgrade foo`<br />or<br />`conda update foo`
|`]pin Foo` | `foo==<version>` in `requirements.txt`<br /> or<br />`foo=<version>` in `environment.yml`
|`]free Foo` | `foo` in `requirements.txt`<br />or<br />`foo` in `environment.yml`
|`]test Foo` | `python -m unittest foo`
|`]rm Foo` | `pip uninstall foo`<br />or<br />`conda remove foo`
|`]help` | `pip --help`


This workflow is fairly simple, but it means that all of your programs will be using the same version of each package. This is analogous to installing packages using `pip install` without using virtual environments.
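
In a script, where the `]` shortcuts are not available, the same operations can be written as calls to functions of the `Pkg` module. Here is a minimal sketch: the `example_package_roundtrip()` helper is hypothetical (it is not part of `Pkg`), and it is only defined here, not called, since running it requires network access and modifies the active project:

```julia
using Pkg

# Hypothetical helper, for illustration only: each line is the function-call
# counterpart of the Pkg-mode command shown in the trailing comment.
function example_package_roundtrip()
    Pkg.add("Example")                                         # ]add Example
    Pkg.add(Pkg.PackageSpec(name="Example", version="0.5"))    # ]add Example@0.5
    Pkg.pin("Example")                                         # ]pin Example
    Pkg.free("Example")                                        # ]free Example
    Pkg.update("Example")                                      # ]update Example
    Pkg.test("Example")                                        # ]test Example
    Pkg.status()                                               # ]status
    Pkg.rm("Example")                                          # ]rm Example
end
```

Calling `example_package_roundtrip()` would run these steps in order against whichever project is currently active.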


## Projects

If you want to have multiple projects, each with different libraries and library versions, you should define **projects**. These are analogous to Python virtual environments.

A project is just a directory containing a `Project.toml` file and a `Manifest.toml` file:

```
my_project/
    Project.toml
    Manifest.toml
```

* `Project.toml` is similar to a `requirements.txt` file (for pip) or `environment.yml` (for conda): it lists the dependencies of the project, and compatibility constraints (e.g., `SomeDependency = "2.5"`).
* `Manifest.toml` is an automatically generated file which lists the exact versions and unique IDs (UUIDs) of all the packages that Julia found, based on `Project.toml`. It includes all the implicit dependencies of the project's packages. This is useful to reproduce an environment precisely. It is analogous to the output of `pip freeze`.
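
To make this more concrete, here is a small sketch that parses the contents of a `Project.toml` file. It assumes Julia 1.6 or later, where the `TOML` standard library is available, and the file contents are inlined as a string so that the example is self-contained:

```julia
using TOML  # standard library in Julia 1.6+ (assumption: newer than the 1.4 outputs shown in this notebook)

# Contents of a small Project.toml, inlined for the sake of the example
project_text = """
[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"

[compat]
Example = "0.3"
"""

project = TOML.parse(project_text)  # returns nested dictionaries

# List each dependency with its UUID and its compat constraint, if any
for (name, uuid) in sort(collect(project["deps"]))
    constraint = get(project["compat"], name, "any version")
    println(name, " (", uuid, "): ", constraint)
end
```

The `[deps]` table maps package names to UUIDs, while the optional `[compat]` table holds the version constraints.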

By default, the active project is located in `~/.julia/environments/v#.#` (where `#.#` is the Julia version you are using, such as 1.4). You can set a different project when starting Julia:

```bash
# BASH
julia --project=/path/to/my_project
```

Or you can set the `JULIA_PROJECT` environment variable:

```bash
# BASH
export JULIA_PROJECT=/path/to/my_project
julia
```

Or you can just activate a project directly in Julia (this is analogous to running `source my_project/env/bin/activate` when using virtualenv):

In [332]:
Pkg.activate("my_project")

[32m[1m Activating[22m[39m new environment at `/content/my_project/Project.toml`


The `my_project` directory does not exist yet, but it gets created automatically, along with the `Project.toml` and `Manifest.toml` files, when you first add a package:

In [333]:
]add PyCall

[32m[1m  Resolving[22m[39m package versions...
[32m[1m   Updating[22m[39m `/content/my_project/Project.toml`
 [90m [438e738f][39m[92m + PyCall v1.91.4[39m
[32m[1m   Updating[22m[39m `/content/my_project/Manifest.toml`
 [90m [8f4d0f93][39m[92m + Conda v1.4.1[39m
 [90m [682c06a0][39m[92m + JSON v0.21.0[39m
 [90m [1914dd2f][39m[92m + MacroTools v0.5.5[39m
 [90m [69de0a69][39m[92m + Parsers v1.0.6[39m
 [90m [438e738f][39m[92m + PyCall v1.91.4[39m
 [90m [81def892][39m[92m + VersionParsing v1.2.0[39m
 [90m [2a0f44e3][39m[92m + Base64 [39m
 [90m [ade2ca70][39m[92m + Dates [39m
 [90m [8ba89e20][39m[92m + Distributed [39m
 [90m [b77e0a4c][39m[92m + InteractiveUtils [39m
 [90m [8f399da3][39m[92m + Libdl [39m
 [90m [37e2e46d][39m[92m + LinearAlgebra [39m
 [90m [56ddb016][39m[92m + Logging [39m
 [90m [d6f4376e][39m[92m + Markdown [39m
 [90m [a63ad114][39m[92m + Mmap [39m
 [90m [de0858da][39m[92m + Printf [39m
 [90m [9

You can also add a package via its URL:

In [334]:
]add https://github.com/JuliaLang/Example.jl

[?25l[2K

[32m[1m   Updating[22m[39m git-repo `https://github.com/JuliaLang/Example.jl`


[?25h

[32m[1m  Resolving[22m[39m package versions...
[32m[1m   Updating[22m[39m `/content/my_project/Project.toml`
 [90m [7876af07][39m[92m + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)[39m
[32m[1m   Updating[22m[39m `/content/my_project/Manifest.toml`
 [90m [7876af07][39m[92m + Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl)[39m


Let's also add a package with a specific version:

In [335]:
]add Example@0.3

[32m[1m  Resolving[22m[39m package versions...
[32m[1m  Installed[22m[39m Example ─ v0.3.3
[32m[1m   Updating[22m[39m `/content/my_project/Project.toml`
 [90m [7876af07][39m[95m ↓ Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl) ⇒ v0.3.3[39m
[32m[1m   Updating[22m[39m `/content/my_project/Manifest.toml`
 [90m [7876af07][39m[95m ↓ Example v0.5.4 #master (https://github.com/JuliaLang/Example.jl) ⇒ v0.3.3[39m


The `Project.toml` and `Manifest.toml` files have now been created:

In [336]:
;find my_project

my_project
my_project/Manifest.toml
my_project/Project.toml


Notice that the packages we added to the project were _not_ placed in the `my_project` directory itself. They were saved in the `~/.julia/packages` directory, the compiled files were placed in the `~/.julia/compiled` directory, logs were written to `~/.julia/logs`, and so on.

If several projects use the same package, it will only be downloaded and built once (well, once per version). The `~/.julia/packages` directory can hold multiple versions of the same package, so it's fine if different projects use different versions of the same package. There will be no conflict, no "dependency hell".


The `Project.toml` file just says that the project depends on `PyCall` and `Example`, and it specifies the UUID of each of these packages:

In [337]:
print(read("my_project/Project.toml", String))

[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"


UUIDs are useful to avoid name conflicts. If several people name their package `CoolStuff`, then the UUID will clarify which one we are referring to.

The `Manifest.toml` file is much longer, since it contains all the packages which `PyCall` and `Example` depend on, along with their versions (except for the standard library packages), and the dependency graph. This file should never be modified manually:


In [338]:
print(read("my_project/Manifest.toml", String))

# This file is machine-generated - editing it directly is not advised

[[Base64]]
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"

[[Conda]]
deps = ["JSON", "VersionParsing"]
git-tree-sha1 = "7a58bb32ce5d85f8bf7559aa7c2842f9aecf52fc"
uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d"
version = "1.4.1"

[[Dates]]
deps = ["Printf"]
uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"

[[Distributed]]
deps = ["Random", "Serialization", "Sockets"]
uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"

[[Example]]
git-tree-sha1 = "276fa06109ac5c80035cff711b0a18ad5b3117cc"
uuid = "7876af07-990d-54b4-ab0e-23690620f79a"
version = "0.3.3"

[[InteractiveUtils]]
deps = ["Markdown"]
uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"

[[JSON]]
deps = ["Dates", "Mmap", "Parsers", "Unicode"]
git-tree-sha1 = "b34d7cef7b337321e97d22242c3c2b91f476748e"
uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
version = "0.21.0"

[[Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"

[[LinearAlgebra]]
deps = ["Libdl"]
uuid = "37e2e46d-f89d-

Note that `Manifest.toml` contains the precise version of the `Example` package that was installed, but the `Project.toml` file does not specify that version `0.3` is required. That's because Julia cannot know whether your project is supposed to work only with versions `0.3.x`, or whether it could work with other versions as well. So if you want to specify a version constraint for the `Example` package, you must add it manually in `Project.toml`. You would normally use your favorite editor to do this, but in this notebook we'll update `Project.toml` programmatically:

In [339]:
append_config = """

[compat]
Example = "0.3"
"""

open(f->write(f, append_config), "my_project/Project.toml", "a")

26

Here is the updated `Project.toml` file:

In [340]:
print(read("my_project/Project.toml", String))

[deps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"

[compat]
Example = "0.3"


Now if we try to replace `Example` 0.3 with version 0.2, we get an error:

In [341]:
try
    pkg"add Example@0.2"
catch ex
    ex
end

[32m[1m  Resolving[22m[39m package versions...


Pkg.Resolve.ResolverError("empty intersection between Example@0.2 and project compatibility 0.3", nothing)
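
The resolver complains because the compat entry `"0.3"` is read as a caret specifier: any version from `0.3.0` up to (but not including) `0.4.0` is accepted. The `satisfies_compat_03()` helper below is hypothetical and only illustrates that rule using `VersionNumber` comparisons; it is not how `Pkg` implements it:

```julia
# Hypothetical helper: true when v falls in the range allowed by the
# compat entry "0.3", i.e. 0.3.0 <= v < 0.4.0
satisfies_compat_03(v::VersionNumber) = v"0.3.0" <= v < v"0.4.0"

for v in (v"0.2.9", v"0.3.0", v"0.3.3", v"0.4.0")
    println(v, " => ", satisfies_compat_03(v))
end
```

Version `0.2` lies outside this range, which is why `add Example@0.2` has an empty intersection with the project's compatibility constraint.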

Now you can run a program based on this project, and it will be able to use all the packages which have been added to this project, with their specific versions. If you import a package which was not explicitly added to this project, Julia will fall back to the default project:

In [342]:
code = """
import PyCall # found in the project
import PyPlot # not found, so falls back to default project
println("Success!")
"""

open(f->write(f, code), "my_program3.jl", "w")

117

In [343]:
;julia --project=my_project my_program3.jl

Success!


## Packages
Falling back to the default project is fine, as long as you run the code on your own machine, but if you want to share your code with other people, it would be brittle to count on packages installed in _their_ default project. Instead, if you plan to share your code, you should clearly specify which packages it depends on, and use only these packages. Such a shareable project is called a **package**.

A package is a regular project (as defined above), but with a few extras:
* the `Project.toml` file must specify a `name`, a `version` and a `uuid`.
* there must be a `src/PackageName.jl` file containing a module named `PackageName`.
* you generally want to specify the `authors` and `description`, and maybe also the `license`, `repository` (e.g., the package's GitHub URL), and some `keywords`, but all of these are optional.

It is very easy to create a new package using the `]generate` command. To define the `authors` field, `Pkg` will look up the `user.name` and `user.email` git config entries, so let's define them before we generate the package:

In [344]:
;git config --global user.name "Alice Bob"

In [345]:
;git config --global user.email "alice.bob@example.com"

In [346]:
]generate MyPackages/Hello

[32m[1m Generating[22m[39m  project Hello:
    MyPackages/Hello/Project.toml
    MyPackages/Hello/src/Hello.jl


This generated the `MyPackages/Hello/Project.toml` file (along with the enclosing directories) and the `MyPackages/Hello/src/Hello.jl` file. Let's take a look at the `Project.toml` file:

In [347]:
print(read("MyPackages/Hello/Project.toml", String))

name = "Hello"
uuid = "b1200148-98bf-43d1-9bb1-85f7b4552217"
authors = ["Alice Bob <alice.bob@example.com>"]
version = "0.1.0"


Notice that the project has no dependencies yet, but it has a name, a unique UUID, and a version (plus an author).

Note: if `Pkg` does not find your name or email in the git config, it falls back to environment variables (`GIT_AUTHOR_NAME`, `GIT_COMMITTER_NAME`, `USER`, `USERNAME`, `NAME` and `GIT_AUTHOR_EMAIL`, `GIT_COMMITTER_EMAIL`, `EMAIL`).

And let's look at the `src/Hello.jl` file:

In [348]:
print(read("MyPackages/Hello/src/Hello.jl", String))

module Hello

greet() = print("Hello World!")

end # module


Let's try to use the `greet()` function from the `Hello` package:

In [349]:
try
    import Hello
    Hello.greet()
catch ex
    ex
end

ArgumentError("Package Hello not found in current path:\n- Run `import Pkg; Pkg.add(\"Hello\")` to install the Hello package.\n")

Julia could not find the `Hello` package. When you're working on a package, don't forget to activate it first!

In [350]:
]activate MyPackages/Hello

[32m[1m Activating[22m[39m environment at `/content/MyPackages/Hello/Project.toml`


In [351]:
import Hello
Hello.greet()

┌ Info: Precompiling Hello [b1200148-98bf-43d1-9bb1-85f7b4552217]
└ @ Base loading.jl:1260


Hello World!

It works!

If the `Hello` package depends on other packages, we must add them:

In [352]:
]add PyCall Example

[32m[1m  Resolving[22m[39m package versions...
[32m[1m  Installed[22m[39m Example ─ v0.5.3
[32m[1m   Updating[22m[39m `/content/MyPackages/Hello/Project.toml`
 [90m [7876af07][39m[92m + Example v0.5.3[39m
 [90m [438e738f][39m[92m + PyCall v1.91.4[39m
[32m[1m   Updating[22m[39m `/content/MyPackages/Hello/Manifest.toml`
 [90m [8f4d0f93][39m[92m + Conda v1.4.1[39m
 [90m [7876af07][39m[92m + Example v0.5.3[39m
 [90m [682c06a0][39m[92m + JSON v0.21.0[39m
 [90m [1914dd2f][39m[92m + MacroTools v0.5.5[39m
 [90m [69de0a69][39m[92m + Parsers v1.0.6[39m
 [90m [438e738f][39m[92m + PyCall v1.91.4[39m
 [90m [81def892][39m[92m + VersionParsing v1.2.0[39m
 [90m [2a0f44e3][39m[92m + Base64 [39m
 [90m [ade2ca70][39m[92m + Dates [39m
 [90m [8ba89e20][39m[92m + Distributed [39m
 [90m [b77e0a4c][39m[92m + InteractiveUtils [39m
 [90m [8f399da3][39m[92m + Libdl [39m
 [90m [37e2e46d][39m[92m + LinearAlgebra [39m
 [90m [56ddb016][39m

You must not use any package which has not been added to the project. If you do, you will get a warning.

Once you are happy with your package, you can deploy it to GitHub (or anywhere else). Then you can add it to your own projects just like any other package.

If you want to make your package available to the world via the official Julia registry, you just need to send a Pull Request to https://github.com/JuliaRegistries/General. However, it's highly recommended to automate this using the [Registrator.jl](https://github.com/JuliaRegistries/Registrator.jl) GitHub app.

If you want to use other registries (including private registries), check out [this page](https://julialang.github.io/Pkg.jl/v1.4/registries/#).

Also check out the [`PkgTemplates`](https://github.com/invenia/PkgTemplates.jl) package, which provides more sophisticated templates for creating new packages, for example with continuous integration, code coverage tests, etc.

## Fixing Issues in a Dependency
Sometimes you may run into an issue inside one of the packages your project depends on. When this happens, you can use `Pkg`'s `dev` command to fix the issue. For example, let's pretend the `Example` package has a bug:

In [353]:
]dev Example

    Cloning git-repo `https://github.com/JuliaLang/Example.jl.git`
  Resolving package versions...
   Updating `/content/MyPackages/Hello/Project.toml`
 [7876af07] ↑ Example v0.5.3 ⇒ v0.5.4 [`~/.julia/dev/Example`]
   Updating `/content/MyPackages/Hello/Manifest.toml`
 [7876af07] ↑ Example v0.5.3 ⇒ v0.5.4 [`~/.julia/dev/Example`]


This command cloned the repo into `~/.julia/dev/Example`:

In [354]:
;ls -l "~/.julia/dev"

total 4
drwxr-xr-x 7 root root 4096 Jul  2 00:06 Example


It also updated the `Hello` package's `Manifest.toml` file to ensure the package now uses the `Example` clone. You can see this using `]status`:

In [355]:
]st

[36m[1mProject [22m[39mHello v0.1.0
[32m[1mStatus[22m[39m `/content/MyPackages/Hello/Project.toml`
 [90m [7876af07][39m[37m Example v0.5.4 [`~/.julia/dev/Example`][39m
 [90m [438e738f][39m[37m PyCall v1.91.4[39m


So you would now go ahead and edit the clone and fix the bug. Of course, you would also want to send a PR to the package's owners so the source package gets fixed. Once that happens, you can go back to the official `Example` package easily:

In [356]:
]free Example

[32m[1m   Updating[22m[39m `/content/MyPackages/Hello/Project.toml`
 [90m [7876af07][39m[95m ↓ Example v0.5.4 [`~/.julia/dev/Example`] ⇒ v0.5.3[39m
[32m[1m   Updating[22m[39m `/content/MyPackages/Hello/Manifest.toml`
 [90m [7876af07][39m[95m ↓ Example v0.5.4 [`~/.julia/dev/Example`] ⇒ v0.5.3[39m


In [357]:
]st

[36m[1mProject [22m[39mHello v0.1.0
[32m[1mStatus[22m[39m `/content/MyPackages/Hello/Project.toml`
 [90m [7876af07][39m[37m Example v0.5.3[39m
 [90m [438e738f][39m[37m PyCall v1.91.4[39m


## Instantiating a Project
If you want to run someone else's project and you want to make sure you are using the exact same package versions, you can clone the project, and assuming it has a `Manifest.toml` file, you can activate the project and run `]instantiate` to install all the appropriate packages. For example, let's instantiate the `Registrator.jl` project:

In [358]:
;git clone https://github.com/JuliaRegistries/Registrator.jl

Cloning into 'Registrator.jl'...


In [359]:
]activate Registrator.jl

[32m[1m Activating[22m[39m environment at `/content/Registrator.jl/Project.toml`


In [360]:
]instantiate

[32m[1m  Installed[22m[39m TableTraits ───────────────── v1.0.0
[32m[1m  Installed[22m[39m AutoHashEquals ────────────── v0.2.0
[32m[1m  Installed[22m[39m Hiccup ────────────────────── v0.2.2
[32m[1m  Installed[22m[39m DataAPI ───────────────────── v1.2.0
[32m[1m  Installed[22m[39m Lazy ──────────────────────── v0.14.0
[32m[1m  Installed[22m[39m WebSockets ────────────────── v1.5.2
[32m[1m  Installed[22m[39m JSON2 ─────────────────────── v0.3.1
[32m[1m  Installed[22m[39m HTTP ──────────────────────── v0.8.14
[32m[1m  Installed[22m[39m IniFile ───────────────────── v0.5.0
[32m[1m  Installed[22m[39m ZMQ ───────────────────────── v1.2.0
[32m[1m  Installed[22m[39m GitForge ──────────────────── v0.1.5
[32m[1m  Installed[22m[39m AssetRegistry ─────────────── v0.1.0
[32m[1m  Installed[22m[39m TimeToLive ────────────────── v0.3.0
[32m[1m  Installed[22m[39m DataValueInterfaces ───────── v1.0.0
[32m[1m  Installed[22m[39m IteratorInterfa

Usually, that's all you need to know about projects and packages, but let's look a bit under the hood, so you can handle less common cases.

## Load Path
When you import a package, Julia searches for it in the environments listed in the `LOAD_PATH` array. An **environment** can be a project or a directory containing a bunch of packages directly. By default, the `LOAD_PATH` array contains three elements:

In [361]:
LOAD_PATH

3-element Array{String,1}:
 "@"
 "@v#.#"
 "@stdlib"

Here's what these elements mean:
* `"@"` represents the active project, if any: that's the project activated via `--project`, `JULIA_PROJECT`, `]activate` or `Pkg.activate()`.
* `"@v#.#"` represents the default shared project for the version of Julia we are running. That's why it is used by default when there is no active project.
* `"@stdlib"` represents the standard library. This is not a project: it's a directory containing many packages.

If you want to see the actual paths, you can call `Base.load_path()`:

In [362]:
Base.load_path()

3-element Array{String,1}:
 "/content/Registrator.jl/Project.toml"
 "/root/.julia/environments/v1.4/Project.toml"
 "/usr/local/share/julia/stdlib/v1.4"

You can change the load path if you want to. For example, if you want Julia to look only in the active project and in the standard library, without looking in the default project, then you can set the `JULIA_LOAD_PATH` environment variable to `"@:@stdlib"`.

If you try to run `my_program3.jl` this way, it will successfully import `PyCall`, but it will fail to import `PyPlot`, since it is not listed in `Project.toml` (however, it would successfully import any package from the standard library):

In [363]:
try
    withenv("JULIA_LOAD_PATH"=>"@:@stdlib") do
        run(`julia --project=my_project my_program3.jl`)
    end
catch ex
    ex
end

ERROR: LoadError: ArgumentError: Package PyPlot not found in current path:
- Run `import Pkg; Pkg.add("PyPlot")` to install the PyPlot package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] exec_options(::Base.JLOptions) at ./client.jl:288
 [4] _start() at ./client.jl:484
in expression starting at /content/my_program3.jl:2


ProcessFailedException(Base.Process[Process(`julia --project=my_project my_program3.jl`, ProcessExited(1))])

You can also modify the `LOAD_PATH` array programmatically, for example to make all the packages in the `my_packages/` directory available to the project:

In [364]:
push!(LOAD_PATH, "my_packages")

4-element Array{String,1}:
 "@"
 "@v#.#"
 "@stdlib"
 "my_packages"

Now any package added to this directory will be directly available to us:

In [365]:
]generate my_packages/Hello2

[32m[1m Generating[22m[39m  project Hello2:
    my_packages/Hello2/Project.toml
    my_packages/Hello2/src/Hello2.jl


In [366]:
using Hello2
Hello2.greet()

┌ Info: Precompiling Hello2 [b76a3422-75bc-4a82-ad3b-dff89fdf93f4]
└ @ Base loading.jl:1260


Hello World!

This is a convenience for development, as we didn't have to push this package to a repository or even add it to the project. However, it's just for development: once you're happy with your package, make sure to push it to a repo, and add it to the project normally.

## Depots
As we saw earlier, new packages you add to a project are placed in the `~/.julia/packages` directory, logs are placed in `~/.julia/logs`, and so on.

A directory like `~/.julia` which contains `Pkg`-related content is called a **depot**. Julia installs all new packages in the default depot, which is the first directory in the `DEPOT_PATH` array (this array can be modified manually in Julia, or set via the `JULIA_DEPOT_PATH` environment variable):

In [367]:
DEPOT_PATH

3-element Array{String,1}:
 "/root/.julia"
 "/usr/local/local/share/julia"
 "/usr/local/share/julia"

The default depot needs to be writeable for the current user, since that's where new packages will be written to (as well as logs and other stuff). The other depots can be read-only: they're typically used for private package registries.

You can occasionally run the `]gc` command, which will remove all unused package versions (`Pkg` will use the logs to locate existing projects).

In summary: when some code runs `using Foo` or `import Foo`, the `LOAD_PATH` is used to determine _which_ specific package `Foo` refers to, while the `DEPOT_PATH` is used to determine _where_ it is. The exception is when the `LOAD_PATH` contains directories which directly contain packages: for these packages, the `DEPOT_PATH` is not used.
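
As a rough illustration, `Base.find_package()` reports which file an `import` would load. Note that it is an internal helper rather than part of the official API, so treat this as a sketch; `NoSuchPackage12345` is a made-up name:

```julia
# Which file would `import Test` load? The name is resolved through LOAD_PATH.
println(Base.find_package("Test"))

# A name that no environment in LOAD_PATH knows about yields `nothing`
println(Base.find_package("NoSuchPackage12345"))

# The first depot is where newly added packages get installed
println(joinpath(first(DEPOT_PATH), "packages"))
```

The first call prints a path inside the standard library directory, since `Test` is found through the `"@stdlib"` entry of the load path.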

# Parallel Computing
Julia supports coroutines (aka green threads), multithreading (without a [GIL](https://en.wikipedia.org/wiki/Global_interpreter_lock), unlike CPython!), multiprocessing and distributed computing.

## Coroutines
Let's go back to the `fibonacci()` generator function:

In [368]:
function fibonacci(n)
    Channel() do ch
        a, b = 1, 1
        for i in 1:n
            put!(ch, a)
            a, b = b, a + b
        end
    end
end

for f in fibonacci(10)
    println(f)
end

1
1
2
3
5
8
13
21
34
55


Under the hood, `Channel() do ... end` creates a `Channel` object, and spawns an asynchronous `Task` to execute the code in the `do ... end` block. The task is scheduled to execute immediately, but when it calls the `put!()` function on the channel to yield a value, it blocks until another task calls the `take!()` function to grab that value. You do not see the `take!()` function explicitly in this code example, since it is executed automatically in the `for` loop, in the main task. To demonstrate this, we can just call the `take!()` function 10 times to get all the items from the channel:

In [369]:
ch = fibonacci(10)
for i in 1:10
    println(take!(ch))
end

1
1
2
3
5
8
13
21
34
55


This channel is bound to the task, therefore it is automatically closed when the task ends. So if we try to get one more element, we will get an exception:

In [370]:
try
    take!(ch)
catch ex
    ex
end

InvalidStateException("Channel is closed.", :closed)
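
By the way, the channel created by `Channel() do ... end` is untyped and unbuffered. The sketch below, which assumes Julia 1.3 or later, uses a hypothetical `fibonacci_buffered()` variant with a `Channel{Int}` that can hold up to 3 pending values, so the producer only blocks when the buffer is full:

```julia
# Hypothetical variant of fibonacci(): typed channel with a buffer of 3 items
function fibonacci_buffered(n)
    Channel{Int}(3) do ch
        a, b = 1, 1
        for i in 1:n
            put!(ch, a)  # blocks only when 3 values are already waiting
            a, b = b, a + b
        end
    end
end

# collect() takes values until the channel is closed
println(collect(fibonacci_buffered(10)))
```

Because the channel is typed, `collect()` returns a `Vector{Int}` rather than a `Vector{Any}`.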

Here is a more explicit version of the `fibonacci()` function:

In [371]:
function fibonacci(n)
  function generator_func(ch, n)
    a, b = 1, 1
    for i in 1:n
        put!(ch, a)
        a, b = b, a + b
    end
  end
  ch = Channel()
  task = @task generator_func(ch, n) # creates a task without starting it
  bind(ch, task) # the channel will be closed when the task ends
  schedule(task) # start running the task asynchronously
  ch
end

fibonacci (generic function with 1 method)

And here is a more explicit version of the `for` loop:

In [372]:
ch = fibonacci(10)
while isopen(ch)
  value = take!(ch)
  println(value)
end

1
1
2
3
5
8
13
21
34
55


Note that asynchronous tasks (also called "coroutines" or "green threads") are not actually run in parallel: they cooperate to alternate execution. Some functions, such as `put!()`, `take!()`, and many I/O functions, interrupt the current task's execution, at which point it lets Julia's scheduler decide which task should resume its execution. This is just like Python's coroutines.
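
Here is a small sketch of this cooperative behavior. Two tasks (arbitrarily labeled A and B) append to a shared array and call `yield()` to let the scheduler switch tasks; since these tasks take turns rather than running simultaneously, no lock is needed:

```julia
events = String[]  # shared by both tasks

@sync begin  # wait for both tasks to finish
    @async for i in 1:3
        push!(events, "A$i")
        yield()  # hand control back to the scheduler
    end
    @async for i in 1:3
        push!(events, "B$i")
        yield()
    end
end

println(events)
```

Each task's own entries always appear in order; how the two tasks interleave is up to the scheduler.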

For more details on coroutines and tasks, see [the manual](https://docs.julialang.org/en/v1/manual/control-flow/#man-tasks-1).

## Multithreading
Julia also supports multithreading. Currently, you need to specify the number of OS threads upon startup, by setting the `JULIA_NUM_THREADS` environment variable (or using the `-t` command-line argument in Julia 1.5+). In the first cell, we configured the IJulia kernel so that this environment variable is set:

In [373]:
ENV["JULIA_NUM_THREADS"]

"4"

The actual number of threads started by Julia may be lower than that, as it is limited to the number of available cores on the machine (thanks to hyperthreading, each physical core may run two threads). Here is the number of threads that were actually started:

In [374]:
using Base.Threads
nthreads()

2

Now let's run 10 tasks across these threads:

In [375]:
@threads for i in 1:10
    println("thread #", threadid(), " is starting task #$i")
    sleep(rand()) # pretend we're actually working
    println("thread #", threadid(), " is finished")
end

thread #1 is starting task #1
thread #2 is starting task #6
thread #2 is finished
thread #2 is starting task #7
thread #1 is finished
thread #1 is starting task #2
thread #2 is finished
thread #2 is starting task #8
thread #1 is finished
thread #1 is starting task #3
thread #1 is finished
thread #1 is starting task #4
thread #2 is finished
thread #2 is starting task #9
thread #1 is finished
thread #1 is starting task #5
thread #1 is finished
thread #2 is finished
thread #2 is starting task #10
thread #2 is finished


Here is a multithreaded version of the `estimate_pi()` function. Each thread computes part of the sum, and the parts are added at the end:

In [376]:
function parallel_estimate_pi(n)
    s = zeros(nthreads())
    nt = n ÷ nthreads()
    @threads for t in 1:nthreads()
        for i in (1:nt) .+ nt*(t - 1)
            @inbounds s[t] += (isodd(i) ? -1 : 1) / (2i + 1)
        end
    end
    return 4.0 * (1.0 + sum(s))
end

@btime parallel_estimate_pi(100_000_000)

  128.853 ms (16 allocations: 1.63 KiB)


3.1415926635894196

The `@inbounds` macro is an optimization: it tells the Julia compiler not to add any bounds check when accessing the array. It's safe in this case since the `s` array has one element per thread, and `t` varies from `1` to `nthreads()`, so there is no risk of `s[t]` being out of bounds.

Let's compare this with the single-threaded implementation:

In [377]:
@btime estimate_pi(100_000_000)

  134.263 ms (0 allocations: 0 bytes)


3.141592663589326

If you are running this notebook on Colab, the parallel implementation is probably no faster than the single-threaded one. That's because the Colab Runtime only has a single CPU, so there is no benefit from multithreading (plus there is a bit of overhead for managing threads). However, on my 8-core machine, using 16 threads, the parallel implementation is about 6 times faster than the single-threaded one.

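Julia has a `mapreduce()` function which makes it easy to implement reductions like the one in `parallel_estimate_pi()`. Note that Base's `mapreduce()` runs on a single thread, so despite its name, the function below is not actually parallel: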

In [378]:
function parallel_estimate_pi2(n)
    4.0 * mapreduce(i -> (isodd(i) ? -1 : 1) / (2i + 1), +, 0:n)
end

parallel_estimate_pi2 (generic function with 1 method)

In [379]:
@btime parallel_estimate_pi2(100_000_000)

  106.664 ms (0 allocations: 0 bytes)


3.1415926635897917

The `mapreduce()` function is well optimized: in the run above, it is about 20% faster than `parallel_estimate_pi()` (roughly 107 ms versus 129 ms).

You can also spawn a task using `Threads.@spawn`. It will get executed on any one of the running threads (it will not start a new thread):

In [380]:
task = Threads.@spawn begin
    println("Thread starting")
    sleep(1)
    println("Thread stopping")
    42 # result
end

println("Hello!")

println("The result is: ", fetch(task))


Hello!
Thread starting
Thread stopping
The result is: 42


The `fetch()` function waits for the task to finish, and fetches the result. You can also just call `wait()` if you don't need the result.
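
`Threads.@spawn` and `fetch()` combine nicely with `mapreduce()`. Here is a minimal sketch that splits the series into chunks, spawns one task per chunk, and adds up the partial results. The names `term()` and `threaded_estimate_pi()` are my own for this illustration, and the chunking strategy is just one reasonable choice:

```julia
using Base.Threads

# One term of the series (same formula as above)
term(i) = (isodd(i) ? -1 : 1) / (2i + 1)

function threaded_estimate_pi(n; ntasks=nthreads())
    chunk = cld(n + 1, ntasks)  # number of terms per task (terms go from 0 to n)
    tasks = map(1:ntasks) do k
        r = ((k - 1) * chunk):min(k * chunk - 1, n)
        # init=0.0 makes this safe even if a chunk happens to be empty
        Threads.@spawn mapreduce(term, +, r; init=0.0)
    end
    return 4.0 * sum(fetch, tasks)  # fetch() waits for each task's partial sum
end

println(threaded_estimate_pi(1_000_000))
```

Each task only touches its own local accumulator, so no synchronization is needed until the final `sum()`.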

Last but not least, you can use channels to synchronize and communicate across tasks, even if they are running across separate threads:

In [381]:
ch = Channel()
task1 = Threads.@spawn begin
    for i in 1:5
        sleep(rand())
        put!(ch, i^2)
    end
    println("Finished sending!")
    close(ch)
end

task2 = Threads.@spawn begin
    foreach(v->println("Received $v"), ch)
    println("Finished receiving!")
end

wait(task2)

Received 1
Received 4
Received 9
Received 16
Finished sending!
Received 25
Finished receiving!
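
A more compact way to write the producer side is to pass a `do` block to the `Channel` constructor: the block runs in its own task, and the channel is closed automatically when the block returns. Here is a small sketch using a typed, buffered channel (`squares_channel()` is a name I made up for this example):

```julia
# Returns a channel that will yield the first n squares.
# spawn=true lets the producer task run on any available thread.
function squares_channel(n)
    return Channel{Int}(n; spawn=true) do ch
        for i in 1:n
            put!(ch, i^2)
        end
    end
end

# Iterating over a channel stops once it is closed and empty
for v in squares_channel(5)
    println("Received $v")
end
```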


For more details about multithreading, check out [this page](https://docs.julialang.org/en/v1/manual/parallel-computing/#man-multithreading-1).

## Multiprocessing & Distributed Programming
Julia can spawn multiple Julia processes upon startup if you specify the number of processes via the `-p` argument. You can also spawn extra processes from Julia itself:

In [382]:
using Distributed
addprocs(4)
workers() # array of worker process ids

4-element Array{Int64,1}:
 2
 3
 4
 5

The main process has id 1:

In [383]:
myid()

1

The `@everywhere` macro lets you run any code on all processes (the main process and every worker):

In [384]:
@everywhere println("Hi! I'm worker $(myid())")

Hi! I'm worker 1
      From worker 4:	Hi! I'm worker 4
      From worker 3:	Hi! I'm worker 3
      From worker 2:	Hi! I'm worker 2
      From worker 5:	Hi! I'm worker 5


You can also execute code on a particular worker by using `@spawnat <worker id> <statement>`:

In [385]:
@spawnat 3 println("Hi! I'm worker $(myid())")

Future(3, 1, 14, nothing)

If you specify `:any` instead of a worker id, Julia chooses the worker for you:

In [386]:
@spawnat :any println("Hi! I'm worker $(myid())")

      From worker 3:	Hi! I'm worker 3


Future(2, 1, 15, nothing)

The `@spawnat` macro returns immediately, and its output is a `Future` object (`@everywhere`, on the other hand, waits until the code has finished running on every process). You can call `fetch()` on a `Future` to wait for the result:

In [387]:
result = @spawnat 3 1+2+3+4
fetch(result)

10

If you import some package in the main process, it is <u>not</u> automatically imported in the workers. For example, the following code fails because the worker does not know what `pyimport` is:

In [388]:
using PyCall

result = @spawnat 4 (np = pyimport("numpy"); np.log(10))

try
    fetch(result)
catch ex
    ex
end

      From worker 2:	Hi! I'm worker 2


RemoteException(4, CapturedException(UndefVarError(:pyimport), Any[(#121 at macros.jl:87, 1), (#101 at process_messages.jl:290, 1), (run_work_thunk at process_messages.jl:79, 1), (run_work_thunk at process_messages.jl:88, 1), (#94 at task.jl:358, 1)]))

You must use `@everywhere` or `@spawnat` to import the packages you need in each worker:

In [389]:
@everywhere using PyCall

result = @spawnat 4 (np = pyimport("numpy"); np.log(10))

fetch(result)

2.302585092994046

Similarly, if you define a function in the main process, it is <u>not</u> automatically available in the workers. You must define the function in every worker:

In [390]:
@everywhere addtwo(n) = n + 2
result = @spawnat 4 addtwo(40)
fetch(result)

42
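
To put these pieces together, here is a rough sketch that spreads the π series over the workers: a helper is defined everywhere, each worker gets a chunk via `@spawnat`, and the main process fetches the partial sums. The names `partial_sum()` and `distributed_estimate_pi()` are hypothetical, and starting 2 workers when none exist is an arbitrary choice for this example:

```julia
using Distributed

# Start two local workers if this session has none yet
nprocs() == 1 && addprocs(2)

# The helper must be defined on every process
@everywhere partial_sum(r) = mapreduce(i -> (isodd(i) ? -1 : 1) / (2i + 1), +, r; init=0.0)

function distributed_estimate_pi(n)
    ws = workers()
    chunk = cld(n + 1, length(ws))  # number of terms per worker (terms go from 0 to n)
    futures = Future[]
    for (k, w) in enumerate(ws)
        r = ((k - 1) * chunk):min(k * chunk - 1, n)
        f = @spawnat w partial_sum(r)  # returns a Future right away
        push!(futures, f)
    end
    return 4.0 * sum(fetch, futures)   # fetch() waits for each partial result
end

println(distributed_estimate_pi(1_000_000))
```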

You can use a `Future` inside `@everywhere` or `@spawnat`, as long as you wrap it in a `fetch()` call:

In [391]:
M = @spawnat 2 rand(5)
result = @spawnat 3 fetch(M) .* 10.0
fetch(result)

5-element Array{Float64,1}:
 4.475589942138973
 3.7844448153428067
 6.199227766558075
 8.66410018066203
 3.364462310811107

In this example, worker 2 creates a random array, then worker 3 fetches this array and multiplies each element by 10, then the main process fetches the result and displays it.

## GPU
Julia has excellent GPU support. As you may know, GPUs are devices which can run thousands of threads in parallel. Each thread is slower and more limited than on a CPU, but there are so many of them that plenty of tasks can be executed much faster on a GPU than on a CPU, provided these tasks can be parallelized.

Let's check which GPU device is installed:

In [392]:
;nvidia-smi

Thu Jul  2 00:08:11 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

If you're running on Colab, your runtime will generally have an Nvidia Tesla K80 GPU with 12GB of RAM installed, but sometimes you will get another GPU, such as an Nvidia Tesla T4 (16GB) or an Nvidia Tesla P100.

If no GPU is detected, go to _Runtime_ > _Change runtime type_, set _Hardware accelerator_ to _GPU_, then go to _Runtime_ > _Factory reset runtime_, reinstall Julia by running the first cell again, reload the page, and come back here. If you're running on your own machine, make sure you have a compatible GPU card installed, with the appropriate drivers.

Now let's create a large matrix and time how long it takes to square it on the CPU:

In [393]:
using BenchmarkTools

M = rand(2^11, 2^11)

function benchmark_matmul_cpu(M)
    M * M
    return
end

benchmark_matmul_cpu(M) # warm up
@btime benchmark_matmul_cpu($M)

  436.690 ms (2 allocations: 32.00 MiB)


Notes:
* For benchmarking, we wrapped the operation in a function which returns `nothing`.
* Why do we have a "warm up" line? Well, since Julia compiles code on the fly the first time it is executed, it's good practice to execute the operation we want to benchmark at least once before starting the benchmark, or else the benchmark will include the compilation time.
* We used `$M` instead of `M` on the last line. This is a feature of the `@btime` macro: it evaluates `M` before benchmarking takes place, to avoid the extra delay that is incurred when [benchmarking with global variables](https://docs.julialang.org/en/latest/manual/performance-tips/#Avoid-global-variables-1).

Now let's benchmark this same operation on the GPU: 

In [394]:
using CUDA

# Copy the data to the GPU. Creates a CuArray:
M_on_gpu = cu(M)

# Alternatively, create a new random matrix directly on the GPU:
#M_on_gpu = CUDA.CURAND.rand(2^11, 2^11)

function benchmark_matmul_gpu(M)
    CUDA.@sync M * M
    return
end

benchmark_matmul_gpu(M_on_gpu) # warm up
@btime benchmark_matmul_gpu($M_on_gpu)

Downloading artifact: CUDA10.1
Downloading artifact: CUDNN+CUDA10.1
Downloading artifact: CUTENSOR+CUDA10.1

│   caller = llvm_compat(::VersionNumber) at compatibility.jl:181
└ @ CUDA /root/.julia/packages/CUDA/42B9G/deps/compatibility.jl:181


  2.360 ms (9 allocations: 368 bytes)


That's _much_ faster (185x faster in my test on Colab with an Nvidia Tesla P100 GPU).

Importantly:
* Before the GPU can work on some data, it needs to be copied to the GPU (or generated there directly).
* The `CUDA.@sync` macro waits for the GPU operation to complete. Without it, the operation would happen in parallel on the GPU, while execution would continue on the CPU. So we would just be timing how long it takes to _start_ the operation, not how long it takes to complete.
* In general, you don't need `CUDA.@sync`, since many operations (including `cu()`) call it implicitly, and it's usually a good idea to let the CPU and GPU work in parallel. Typically, the GPU will be working on the current batch of data while the CPU works on preparing the next batch.

Of course, the speed up will vary depending on the matrix size and the GPU type. Moreover, copying the data from the CPU to the GPU is often the slowest part of the operation, but we only benchmarked the matrix multiplication itself. Let's see what we get if we include the data transfer in the benchmark:

That's still much faster than on the CPU.

Let's check how much RAM we have left on the GPU:

In [395]:
CUDA.memory_status()

Effective GPU memory usage: 99.93% (15.888 GiB/15.899 GiB)
CUDA allocator usage: 15.594 GiB
BinnedPool usage: 15.594 GiB (16.000 MiB allocated, 15.578 GiB cached)


Julia's Garbage Collector will free CUDA arrays like any other objects, when there are no more references to them. However, `CUDA.jl` uses a memory pool to make allocations faster on the GPU, so don't be surprised if the allocated memory on the GPU does not go down immediately. Moreover, IJulia keeps a reference to the output of each cell, so if you let any cell output a `CuArray`, it will only be released when you execute `Out[<cell number>]=0`. If you want to force the Garbage Collector to run, you can run `GC.gc()`. To reclaim memory from the memory pool, use `CUDA.reclaim()`:

In [396]:
GC.gc()
CUDA.reclaim()

16726884352

Many other operations are implemented for `CuArray` (`+`, `-`, etc.), as well as dotted operations (`.+`, `exp.()`, etc.). Importantly, loop fusion also works on the GPU. For example, if we want to compute `M .* M .+ M`, without loop fusion the GPU would first compute `M .* M` and create a temporary array, then it would add `M` to that array, like this:

In [397]:
function benchmark_without_fusion(M)
    P = M .* M
    CUDA.@sync P .+ M
    return
end

benchmark_without_fusion(M_on_gpu) # warm up
@btime benchmark_without_fusion($M_on_gpu)

  676.534 μs (140 allocations: 4.30 KiB)


Instead, loop fusion ensures that the array is only traversed once, without the need for a temporary array:

In [398]:
function benchmark_with_fusion(M)
    CUDA.@sync M .* M .+ M
    return
end

benchmark_with_fusion(M_on_gpu) # warm up
@btime benchmark_with_fusion($M_on_gpu)

  387.141 μs (87 allocations: 3.36 KiB)


That's _much_ faster (75% faster in my test on Colab). 😃
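
Loop fusion is not specific to the GPU: the same dotted syntax fuses on regular CPU arrays too. Here is a small CPU-only sketch (the array size and variable names are arbitrary) showing the unfused version, the fused version using the `@.` macro, and a fused in-place version that writes into a preallocated array:

```julia
A = rand(Float32, 512, 512)

# Without fusion: two passes over the data and one temporary array
P = A .* A
unfused = P .+ A

# With fusion: a single pass; @. adds the dots for you
fused = @. A * A + A

# Fused and in place: the result is written directly into `out`
out = similar(A)
out .= A .* A .+ A

println(fused ≈ unfused)  # true
println(out ≈ unfused)    # true
```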

Lastly, you can actually **write your own GPU kernels in Julia**! In other words, rather than using GPU operations implemented in the `CUDA.jl` package (or others), you can write Julia code that will be compiled for the GPU, and executed there. This can occasionally be useful to speed up some algorithms where the standard kernels don't suffice. For example, here's a GPU kernel which implements `u .+= v`, where `u` and `v` are two (large) vectors:

In [399]:
function worker_gpu_add!(u, v)
    index = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    index ≤ length(u) && (@inbounds u[index] += v[index])
    return
end

function gpu_add!(u, v)
    numblocks = ceil(Int, length(u) / 256)
    @cuda threads=256 blocks=numblocks worker_gpu_add!(u, v)
    return u
end

gpu_add! (generic function with 1 method)

This code example is adapted from the [`CUDA.jl` package's documentation](https://juliagpu.gitlab.io/CUDA.jl/tutorials/introduction/), which I highly encourage you to check out if you're interested in writing your own kernels. Here are the key parts to understand this example, starting from the end:
* The `gpu_add!()` function first calculates `numblocks`, the number of blocks of threads to start, then it uses the `@cuda` macro to spawn `numblocks` blocks of GPU threads, each with 256 threads, and each thread runs `worker_gpu_add!(u, v)`.
* The `worker_gpu_add!()` function computes `u[index] += v[index]` for a single value of `index`: in other words, each thread will just update a single value in the vector! Let's see how the index is computed:
  * The `@cuda` macro spawned many blocks of 256 threads each. These blocks are organized in a grid, which is one-dimensional by default, but it can be up to three-dimensional. Therefore each thread and each block have an `(x, y, z)` coordinate in this grid. See this diagram from the [Nvidia blog post](https://developer.nvidia.com/blog/even-easier-introduction-cuda/):<br />
<img src="https://juliagpu.gitlab.io/CUDA.jl/tutorials/intro1.png" width="600"/>.
  * `threadIdx().x` returns the current GPU thread's `x` coordinate within its block (one difference with the diagram is that Julia is 1-indexed).
  * `blockIdx().x` returns the current block's `x` coordinate in the grid.
  * `blockDim().x` returns the block size along the `x` axis (in this example, it's 256).
  * `gridDim().x` returns the number of blocks in the grid, along the `x` axis (in this example it's `numblocks`).
  * So the `index` that each thread must update in the array is `(blockIdx().x - 1) * blockDim().x + threadIdx().x`.
* As explained earlier, the `@inbounds` macro is an optimization that tells Julia that the index is guaranteed to be in bounds, so there's no need for it to check.
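
If the index arithmetic feels abstract, you can emulate it on the CPU with two plain loops standing in for `blockIdx().x` and `threadIdx().x`. This is just an illustration (the `emulate_launch()` helper is hypothetical and not part of `CUDA.jl`, and nothing runs in parallel here), but it shows that every element is visited exactly once:

```julia
# CPU-only emulation of the kernel's index computation (no GPU needed)
function emulate_launch(n; threads=256)
    numblocks = ceil(Int, n / threads)  # same formula as in gpu_add!()
    hits = zeros(Int, n)                # how many times each element is visited
    for block in 1:numblocks, thread in 1:threads  # blockIdx().x, threadIdx().x
        index = (block - 1) * threads + thread
        index ≤ n && (hits[index] += 1) # same guard as in the kernel
    end
    return numblocks, hits
end

numblocks, hits = emulate_launch(1000)
println(numblocks)                 # 4 (i.e., 1024 threads for 1000 elements)
println(all(h -> h == 1, hits))    # true
```

The last block has 24 threads whose index is beyond the end of the vector, which is why the kernel needs the `index ≤ length(u)` check.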

Now writing your own GPU kernel won't seem like something only top experts with advanced C++ skills can do: you can do it too!

Let's check that the kernel works as expected:

In [400]:
u = rand(2^20)
v = rand(2^20)

u_on_gpu = cu(u)
v_on_gpu = cu(v)

u .+= v
gpu_add!(u_on_gpu, v_on_gpu)

@assert Array(u_on_gpu) ≈ u

Yes, it works well!

Note: the `≈` operator checks whether the operands are approximately equal within the float precision limit.

Let's benchmark our custom kernel:

In [401]:
function benchmark_custom_assign_add!(u, v)
    CUDA.@sync gpu_add!(u, v)
    return
end

benchmark_custom_assign_add!(u_on_gpu, v_on_gpu)
@btime benchmark_custom_assign_add!($u_on_gpu, $v_on_gpu)

  98.689 μs (52 allocations: 1.31 KiB)


Let's see how this compares to `CUDA.jl`'s implementation:

In [402]:
function benchmark_assign_add!(u, v)
    CUDA.@sync u .+= v
    return
end

benchmark_assign_add!(u_on_gpu, v_on_gpu)
@btime benchmark_assign_add!($u_on_gpu, $v_on_gpu)

  137.072 μs (70 allocations: 1.89 KiB)


How about that? Our custom kernel is faster than `CUDA.jl`'s kernel! But to be fair, our kernel would not work with huge vectors, since there's a limit to the number of blocks & threads you can spawn (see [Table 15](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications) in CUDA's documentation). To support such huge vectors, we need each worker to run a loop like this:

In [403]:
function worker_gpu_add!(u, v)
    index = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    stride = blockDim().x * gridDim().x
    for i = index:stride:length(u)
        @inbounds u[i] += v[i]
    end
    return
end

worker_gpu_add! (generic function with 1 method)

This way, if `@cuda` is executed with a smaller number of blocks than needed to have one thread per array item, the workers will loop appropriately.
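
Here is a CPU-only sketch of this strided loop, assuming a deliberately small grid of 2 blocks of 256 threads (`emulate_grid_stride()` is a hypothetical helper written for this illustration). Even though there are far fewer threads than elements, each element is still updated exactly once:

```julia
# CPU-only emulation of the strided loop (no GPU needed)
function emulate_grid_stride(n; threads=256, blocks=2)
    hits = zeros(Int, n)
    stride = threads * blocks  # blockDim().x * gridDim().x
    for block in 1:blocks, thread in 1:threads
        index = (block - 1) * threads + thread
        for i in index:stride:n  # same loop as in the kernel
            hits[i] += 1
        end
    end
    return hits
end

println(all(h -> h == 1, emulate_grid_stride(10_000)))  # true
```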

This should get you started! For more info, check out [`CUDA.jl`'s documentation](https://juliagpu.gitlab.io/CUDA.jl/).

# Command Line Arguments

Command line arguments are available via `ARGS`:



In [404]:
ARGS

1-element Array{String,1}:
 "/root/.local/share/jupyter/runtime/kernel-4b7aa9c6-4581-4d7b-acea-4e4dfaf036c8.json"

Unlike Python's `sys.argv`, the first element of this array is <u>not</u> the program name. If you need the program name, use `PROGRAM_FILE` instead:

In [405]:
PROGRAM_FILE

"/root/.julia/packages/IJulia/DrVMH/src/kernel.jl"

You can get the current module, directory, file or line number:

In [406]:
@__MODULE__, @__DIR__, @__FILE__, @__LINE__

(Main, "/content", "In[406]", 1)

The equivalent of Python's `if __name__ == "__main__"` is:

In [407]:
if abspath(PROGRAM_FILE) == @__FILE__
    println("Starting the program")
end
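
Putting `ARGS` and this guard together, a small script could look like the sketch below. The file name `sum_args.jl` and the `sum_int_args()` function are made up for this example; you would run it with something like `julia sum_args.jl 1 2 3`:

```julia
# Hypothetical script: sum_args.jl

# Add up the arguments that parse as integers, ignoring the others
function sum_int_args(args)
    total = 0
    for a in args
        v = tryparse(Int, a)  # returns nothing if `a` is not an integer
        v === nothing || (total += v)
    end
    return total
end

# Only runs when the file is executed as a script, not when it is included
if abspath(PROGRAM_FILE) == @__FILE__
    println("Sum of integer arguments: ", sum_int_args(ARGS))
end
```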

# Memory Management

Let's check how many megabytes of RAM are available:

In [408]:
free() = println("Available RAM: ", Sys.free_memory() ÷ 10^6, " MB")

free()

Available RAM: 3120 MB


If a variable holds a large object that you don't need anymore, you can either wait until the variable falls out of scope, or set it to `nothing`. Either way, the memory will only be freed when the Garbage Collector does its magic, which may not be immediate. In general, you don't have to worry about that, but if you want, you can always call the GC directly:

In [409]:
function use_ram()
    M = rand(10000, 10000) # use about 800 MB of RAM (10^8 Float64 values, 8 bytes each)
    println("sum(M)=$(sum(M))")
end # M will be freed by the GC eventually after this

use_ram()

M = rand(10000, 10000) # use about 800 MB of RAM
println("sum(M)=$(sum(M))")
M = nothing

GC.gc() # rarely needed

sum(M)=4.9997184380985916e7
sum(M)=5.000422876376158e7


In [410]:
free()

Available RAM: 1528 MB
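
To estimate how much memory an array needs before you allocate it, multiply the number of elements by the element size. Here is a tiny sketch (`array_megabytes()` is a hypothetical helper, not a Julia function); for an existing array, `sizeof()` gives the size of its data in bytes:

```julia
# Hypothetical helper: size in MB of the data of a dense array
array_megabytes(T, dims...) = prod(dims) * sizeof(T) / 10^6

println(array_megabytes(Float64, 10_000, 10_000))  # 800.0

A = rand(1000, 1000)
println(sizeof(A))  # 8000000 (bytes)
```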


# Thanks!

I hope you enjoyed this introduction to Julia! I recommend you join the friendly and helpful Julia community on Slack or Discourse.

Cheers!

Aurélien Geron