![Julia logo](images/julia-logo.png)

<div style="text-align: center">

# Getting Started with Julia

## Nebraska.Code()

### July 14, 2021

#### David W. Body

#### Twitter: @david_body

# Looks like Python, feels like Lisp, runs like Fortran.

</div>

## Introduction and Background

* Created by Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B. Shah in 2009
* [Released publicly in 2012](https://julialang.org/blog/2012/02/why-we-created-julia)
* [Version 1.0 released August 2018](https://julialang.org/blog/2018/08/one-point-zero)
* [Version 1.6.1 released April 2021](https://julialang.org/downloads/)

#### Main language features

* Multiple dispatch (methods are selected by the runtime types of all arguments)
* Dynamic type system ("optional" typing)
* High performance (approaching C, Fortran, etc.)
* Built-in package manager
* Lisp-like macros and metaprogramming
* Interoperability with Python, R, C, Fortran
* Designed for parallel and distributed computing
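
To make the first two bullets concrete, here is a minimal sketch of multiple dispatch with optional type annotations. The `Pet`, `Dog`, and `Cat` types and the `meets` function are hypothetical names invented for this illustration; they are not part of Julia or any package.

```julia
# Hypothetical types, invented for this illustration.
abstract type Pet end

struct Dog <: Pet
    name::String
end

struct Cat <: Pet
    name::String
end

# One generic function with several methods. Julia picks the method
# from the runtime types of *all* arguments, not just the first one.
meets(a::Dog, b::Dog) = "$(a.name) sniffs $(b.name)"
meets(a::Dog, b::Cat) = "$(a.name) chases $(b.name)"
meets(a::Cat, b::Dog) = "$(a.name) hisses at $(b.name)"
meets(a::Pet, b::Pet) = "$(a.name) ignores $(b.name)"  # generic fallback

# Type annotations are optional: this method accepts any argument
# for which `2 * x` is defined.
double(x) = 2 * x

rex, tom = Dog("Rex"), Cat("Tom")
println(meets(rex, tom))   # uses the (Dog, Cat) method
println(meets(tom, rex))   # uses the (Cat, Dog) method
println(meets(tom, tom))   # no (Cat, Cat) method, so the fallback runs
println(double(21), " ", double(1.5))
```

Swapping the argument order selects a different method, which is the part that single-dispatch object systems cannot express directly.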

Today we'll only hit a few highlights and scratch the surface.

The goal is to give everyone an idea of what Julia is like so you can decide whether you want to learn more.

## Riddler example: Python vs Julia

### Riddler Express

https://fivethirtyeight.com/features/so-you-want-to-tether-your-goat-now-what/

> From Luke Robinson, a serenading stumper:

> My daughter really likes to hear me sing “The Unbirthday Song” from “Alice in Wonderland” to her. She also likes to sing it to other people. Obviously, the odds of my being able to sing it to her on any random day are 364 in 365, because I cannot sing it on her birthday. The question is, though, how many random people would she expect to be able to sing it to on any given day before it became more likely than not that she would encounter someone whose birthday it is? In other words, what is the expected length of her singing streak?

First let's look at a [**Python** simulation](Unbirthday%20Riddler%20-%20Python.ipynb) to calculate the approximate expected length of the singing streak.

Then let's compare a **Julia** simulation.

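```julia
using Statistics: mean

# Simulate one singing streak: count the people sung to before
# meeting someone whose birthday is today.
function trial()
    n = 0
    singing = true
    while singing
        # Treat day 1 as "today"; each birthday is uniform on 1:365.
        if rand(1:365) == 1
            singing = false
        else
            n += 1
        end
    end
    return n
end

# Run `n_trials` independent trials and return the mean streak length.
function do_trials(n_trials)
    trials = zeros(Int, n_trials)
    for i in 1:n_trials
        trials[i] = trial()
    end
    return mean(trials)
end
```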

```julia
@time begin
    n_trials = 1_000_000
    result = do_trials(n_trials)
    println("Expected streak length: $result")
end
```
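
One caveat when reading the timing: the first call to a Julia function includes just-in-time compilation, so timing a second call usually gives a fairer number. The sketch below is one possible more compact variant of the same simulation, not a replacement for the code above. The names `trial_compact` and `estimate` are invented here for illustration.

```julia
using Statistics: mean

# Same experiment as `trial` above: count the people sung to before
# meeting someone whose birthday is today (day 1, chosen arbitrarily).
function trial_compact()
    n = 0
    while rand(1:365) != 1
        n += 1
    end
    return n
end

# Average over a generator, so no intermediate array is allocated.
estimate(n_trials) = mean(trial_compact() for _ in 1:n_trials)

estimate(10)  # warm-up call, so compilation is not included in the timing
@time result = estimate(1_000_000)
println("Estimated streak length: $result")
```

The result is random, so it will differ from run to run, but with a million trials it should land close to the exact answer.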

-------------

#### A better (exact) solution

The fastest code is code that doesn't exist.

The [Geometric distribution](https://en.wikipedia.org/wiki/Geometric_distribution) is the probability distribution of the number $Y$ of failures in a sequence of Bernoulli trials before the first success. The probability mass function for the Geometric distribution is

$${\Pr(Y=k)=(1-p)^{k}p}$$

for $k = 0, 1, 2, 3, \ldots$, where $p$ is the probability of success for each Bernoulli trial.

The mean of the Geometric distribution is

$$E(Y) = \frac{1 - p}{p}$$

In our case, $p$ is the probability that a random person we encounter has a birthday today, so

$$p = \frac{1}{365}$$

and therefore

$$
\begin{aligned}
E(Y) &= \frac{1 - \frac{1}{365}}{\frac{1}{365}} \\
     &= \frac{365 - \frac{365}{365}}{\frac{365}{365}} \\
     &= 365 - 1 \\
     &= 364
\end{aligned}
$$
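
As a quick sanity check on the derivation, the same value can be computed in Julia, both exactly with rational arithmetic and numerically from the definition of the mean. This is only a sketch: the variable names are chosen here for illustration, and the series is truncated at an arbitrary large $k$ where the remaining tail is negligible.

```julia
# Exact rational arithmetic: `//` builds a Rational, so nothing is rounded.
p = 1 // 365
exact_mean = (1 - p) / p
println(exact_mean == 364)

# Cross-check against the definition E(Y) = sum over k of k * Pr(Y = k),
# using the probability mass function (1 - p)^k * p from above.
pf = 1 / 365
series_mean = sum(k * (1 - pf)^k * pf for k in 0:20_000)
println(isapprox(series_mean, 364; atol = 1e-6))
```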

---

# Rationale for Julia

From [Why We Created Julia](https://julialang.org/blog/2012/02/why-we-created-julia/) (published in 2012 and worth reading in its entirety):

> We are greedy: we want more.
>
> We want a language that's open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that's homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.
>
> (Did we mention it should be as fast as C?)


# The "two-language problem"

Languages like MATLAB, Python, and R, which are commonly used for data analysis and scientific computing, are nice because they can be used interactively. They are great for exploratory analysis and prototyping, but they tend to be slow, especially when processing large amounts of data.

So computationally intensive tasks end up being rewritten in C, C++, or Fortran.

**Julia's goal is for as much of the standard library and third-party packages as possible to be written in pure Julia.** That makes the ecosystem more accessible and makes it easier for domain experts to create their own packages.

For example, compare the source code statistics for R and Julia from GitHub (captured June 15, 2021).

| R                                            | Julia                                                |
| -------------------------------------------- | ---------------------------------------------------- |
| ![R source stats](images/r-source-stats.png) | ![Julia source stats](images/julia-source-stats.png) |
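
You can also check this from inside Julia. The sketch below asks where the method behind `sum([1, 2, 3])` is defined. `@which` comes from the `InteractiveUtils` standard library, which the REPL and notebooks load automatically. The exact file and line reported depend on your Julia version, so no output is shown here.

```julia
using InteractiveUtils  # provides @which; already loaded in the REPL

# Ask Julia which method handles this call and where it is defined.
m = @which sum([1, 2, 3])

println(m.module)  # the module that owns the method
println(m.file)    # expected to be a Julia (.jl) source file
println(m.line)    # line number within that file
```

In an interactive session, `@less sum([1, 2, 3])` or `@edit sum([1, 2, 3])` opens that source directly.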
