Data Science Fundamentals: Python |
[Table of Contents](../index.ipynb)
- - - 
<!--NAVIGATION-->
Module 17. [Overview](./00_julia_overview.ipynb) | [Getting Started](./01_julia_started.ipynb) | **[Commands](./02_julia_commands.ipynb)** | [Package: Gadfly](./03_julia_gadfly.ipynb)

# Commands

Our approach is aimed at those who already have at least some knowledge of programming — perhaps experience with Python, MATLAB, R, C or similar

In particular, we assume you have some familiarity with fundamental programming concepts such as

- variables
- loops
- conditionals (if/else)

Example: Plotting a White Noise Process
---

To begin, let’s suppose that we want to simulate and plot the white noise process $\epsilon_0, \epsilon_1, \ldots, \epsilon_T$, where each draw $\epsilon_t$ is independent standard normal

In other words, we want to generate figures that look something like this:

![caption](files/test_program_1.png)

In brief,

- `using PyPlot` makes the functionality in PyPlot available for use
    - In particular, it pulls the names exported by the PyPlot module into the global scope
    - One of these is `plot()`, which in turn calls the plot function from Matplotlib
- `randn()` is a Julia function from the standard library for generating standard normals
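
The program being described is not reproduced in these notes. A minimal sketch of what it might look like is given here, assuming the PyPlot package is installed. The names ts_length and epsilon_values match those used later in this section, while the "b-" line style is only an illustrative choice:

```julia
using PyPlot                        # assumes PyPlot.jl (and Matplotlib) are installed

ts_length = 100                     # number of draws to simulate
epsilon_values = randn(ts_length)   # array of independent standard normal draws
plot(epsilon_values, "b-")          # line plot; "b-" requests a solid blue line
```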

Importing Functions
---

The effect of the statement `using PyPlot` is to make all the names exported by the PyPlot module available in the global scope

If you prefer to be more selective, you can replace `using PyPlot` with `import PyPlot: plot`

Now only the plot function is accessible

Since our program uses only the plot function from this module, either would have worked in the previous example
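
The same distinction can be tried without a plotting package. The sketch below uses the Statistics standard library as a stand-in (an assumption made for illustration, not part of the original example) and imports only the name mean:

```julia
import Statistics: mean    # bring in only the name `mean`

xs = [1.0, 2.0, 3.0]
m = mean(xs)               # works: `mean` was imported explicitly

# Names that were not imported, such as `median`, have to be qualified with
# the module name, which requires importing the module itself
import Statistics
med = Statistics.median(xs)
```

Selective imports keep the global scope small, at the cost of listing each name you need.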

Arrays
---

The function call epsilon_values = randn(ts_length) creates one of the most fundamental Julia data types: an array

In [10]:
typeof(epsilon_values)

Array{Float64,1}

In [11]:
epsilon_values


100-element Array{Float64,1}:
 -0.97022  
 -0.768929 
 -0.395993 
  0.475183 
  0.638632 
 -0.398748 
 -0.0803466
  1.00184  
 -0.704845 
 -1.43405  
  0.434751 
 -0.0922058
 -0.464083 
  ⋮        
 -0.854893 
  0.755098 
 -0.843619 
  0.277663 
 -0.903036 
 -1.51335  
 -0.659376 
 -0.740348 
  1.13515  
 -0.895744 
  0.02137  
 -1.55444  

The information from typeof() tells us that epsilon_values is an array of 64 bit floating point values, of dimension 1

Julia arrays are quite flexible — they can store heterogeneous data for example

In [12]:
x = [10, "foo", false]

3-element Array{Any,1}:
    10     
      "foo"
 false     

Notice now that the data type is recorded as Any, since the array contains mixed data

The first element of x is an integer

In [13]:
typeof(x[1])

Int64

The second is a string

In [14]:
typeof(x[2])

String

The third is the boolean value false

In [15]:
typeof(x[3])

Bool

Notice from the above that

- array indices start at 1 (unlike Python, where arrays are zero-based)
- array elements are referenced using square brackets (unlike MATLAB and Fortran)

Julia contains many functions for acting on arrays; we’ll review them later

For now here are several examples, applied to the same array x = [10, "foo", false]

The first example just returns the length of the array

The second, pop!(), pops the last element off the array and returns it

In doing so it changes the array (by dropping the last element)

Because of this we call pop! a mutating method

It’s conventional in Julia that mutating methods end in ! to remind the user that the function has other effects beyond just returning a value

The function push!() is similar, except that it appends its second argument to the array
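
The calls discussed above are not shown in these notes. A short sketch of them follows; the variable names n, last_item and n_after_pop are introduced here only for illustration:

```julia
x = [10, "foo", false]    # the same mixed-type array as above

n = length(x)             # number of elements in x
last_item = pop!(x)       # removes the last element of x and returns it
n_after_pop = length(x)   # x has been shortened by one element

push!(x, "bar")           # appends "bar" to the end of x, modifying x in place
```

Both pop!() and push!() change x itself rather than returning a modified copy, which is what the trailing ! signals.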

For Loops
---

Although there’s no need in terms of what we wanted to achieve with our program, for the sake of learning syntax let’s rewrite our program to use a for loop
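
The rewritten program is not shown in these notes. One way it could look is sketched below. Starting from an empty Float64 array and appending with push!() is an assumption chosen to match the description that follows, and the plotting step is left out because it does not change:

```julia
ts_length = 100
epsilon_values = Float64[]          # empty array that can hold 64-bit floats

for i in 1:ts_length
    push!(epsilon_values, randn())  # append one standard normal draw per pass
end
```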

Here we first declared epsilon_values to be an empty array for storing 64 bit floating point numbers

The for loop then populates this array by successive calls to randn()

Called without an argument, randn() returns a single float

Like all code blocks in Julia, the end of the for loop code block (which is just one line here) is indicated by the keyword end

The word `in` in the for loop can be replaced by the symbol `=`

The expression 1:ts_length creates an iterator that is looped over — in this case the integers from 1 to ts_length

Iterators are memory efficient because the elements are generated on the fly rather than stored in memory
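
As a small illustration of these points, the sketch below loops over a range using = in place of in, and converts a range to an ordinary array only when one is needed. The names r, squares, as_array and n_big are hypothetical:

```julia
r = 1:5                      # a range object; its elements are produced on demand

squares = Int[]
for i = r                    # `=` used in place of `in`
    push!(squares, i^2)
end

as_array = collect(r)        # materialise the range as an array when required
n_big = length(1:1_000_000)  # the length is known without storing every element
```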

In Julia you can also loop directly over arrays themselves, like so

In [17]:
words = ["foo", "bar"]
for word in words
    println("Hello $word")
end

Hello foo
Hello bar


While Loops
---

The syntax for the while loop contains no surprises
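
The example itself is missing from these notes. A minimal sketch follows, assuming the same ts_length and epsilon_values names as before. Testing the length of the array, rather than keeping a separate counter, is a choice made here for brevity:

```julia
ts_length = 100
epsilon_values = Float64[]

while length(epsilon_values) < ts_length   # condition is checked before each pass
    push!(epsilon_values, randn())
end
```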

The next example does the same thing with a condition and the break statement
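
Again the code is not reproduced here, so the following is only a sketch of the idea: an unconditional loop that is ended from the inside by an if test and break

```julia
ts_length = 100
epsilon_values = Float64[]

while true                                  # loop until explicitly stopped
    push!(epsilon_values, randn())
    if length(epsilon_values) == ts_length
        break                               # leave the loop once enough draws exist
    end
end
```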

User-Defined Functions
---

For the sake of the exercise, let’s now go back to the for loop but restructure our program so that generation of random variables takes place within a user-defined function
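
The restructured program is not shown in these notes. A sketch consistent with the description that follows might be written as below; the argument name n and the variable data are assumptions, and plotting is omitted:

```julia
function generate_data(n)
    epsilon_values = Vector{Float64}(undef, n)  # uninitialised storage for n floats
    for i in 1:n
        epsilon_values[i] = randn()             # fill each slot with a normal draw
    end
    return epsilon_values
end

ts_length = 100
data = generate_data(ts_length)
```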

Here

- function is a Julia keyword that indicates the start of a function definition
- generate_data is an arbitrary name for the function
- return is a keyword indicating the return value

More Useful Functions
---

Of course the function generate_data is completely contrived

We could just write the following and be done
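
The line in question is presumably a direct call to randn(), along these lines (the name data is an assumption):

```julia
ts_length = 100
data = randn(ts_length)   # the built-in call already returns the whole array
```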

To do something more useful, the next function will be passed a choice of probability distribution and respond by plotting a histogram of observations

In doing so we’ll make use of the Distributions package
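
The function is not reproduced in these notes. The sketch below is one plausible version, assuming both Distributions and PyPlot are installed. The names plot_histogram, distribution, n and lp are taken from the discussion that follows, while the use of plt.hist for the histogram is an assumption:

```julia
using Distributions   # assumes Distributions.jl is installed
using PyPlot          # assumes PyPlot.jl is installed

function plot_histogram(distribution, n)
    epsilon_values = rand(distribution, n)   # n draws from the given distribution
    plt.hist(epsilon_values)                 # Matplotlib histogram via PyPlot's `plt`
end

lp = Laplace()
plot_histogram(lp, 500)
```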

First, lp = Laplace() creates an instance of a data type defined in the Distributions module that represents the Laplace distribution

The name lp is bound to this object

When we make the function call plot_histogram(lp, 500), the code in the body of the function plot_histogram is run with

- the name distribution bound to the same object as lp
- the name n bound to the integer 500

How It Works
---

Consider the function call rand(distribution, n)

This looks like something of a mystery

The function rand() is defined in the base library such that rand(n) returns n uniform random variables on [0,1)

In [21]:
rand(3)

3-element Array{Float64,1}:
 0.760911
 0.480226
 0.910422

On the other hand, distribution points to an instance of a data type representing the Laplace distribution, and that data type has been defined in a third party package

So how can it be that rand() is able to take this kind of object as an argument and return the output that we want?

The answer in a nutshell is multiple dispatch

This refers to the idea that functions in Julia can have different behavior depending on the types of the arguments that they’re passed

Hence in Julia we can take an existing function and give it a new behavior by defining how it acts on a new type of object

Julia works out which function definition to apply in a given setting by looking at the types of the objects the function is called on

In Julia these alternative versions of a function are called methods
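
To make this concrete, here is a small sketch that adds new methods to rand() for a made-up type. BiasedCoin is hypothetical and exists only for this illustration; it is not how the Distributions package is implemented:

```julia
# Hypothetical type for illustration: a coin that lands heads with probability p
struct BiasedCoin
    p::Float64
end

# New methods for the existing function `rand`, selected whenever the first
# argument is a BiasedCoin
Base.rand(c::BiasedCoin) = rand() < c.p ? :heads : :tails
Base.rand(c::BiasedCoin, n::Integer) = [rand(c) for _ in 1:n]

u = rand(3)                        # existing method: three uniform draws on [0, 1)
flips = rand(BiasedCoin(0.5), 3)   # new method: three coin flips
```

The call rand(3) still behaves as before, because Julia picks the method from the types of the arguments. Calling methods(rand) lists every method currently defined for the function.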

- - -

Copyright © 2020 Qualex Consulting Services Incorporated.