---
title: "Reference Material"
subtitle: "Intro and Tips for Julia"
author: Tyler Ransom
date: ECON 6343, University of Oklahoma
output:
  xaringan::moon_reader:
    css: ['default', 'metropolis', 'metropolis-fonts', 'ou-colors.css']
    # self_contained: true
    nature:
      highlightStyle: github
      highlightLines: true
      countIncrementalSlides: false
      ratio: '16:9'
---
# Julia
- Julia is a scientific computing language
- Similar in function to Python, R, or Matlab
- Aims to be a "high-level" language that is performant enough for intensive applications
- "High-level" in the sense that you can run code interactively, without a separate compilation step
- "Performant" in the sense that it can sometimes run as fast as C, C++ or FORTRAN
- ... and can be considerably faster than Python, R or Matlab.
---
# Julia speed benchmarks (Source: [julialang.org](https://julialang.org/benchmarks/))
.center[
```{r,echo=FALSE}
knitr::include_graphics("benchmarks.svg")
```
]
---
# What makes Julia different?
- .hi[just in time] (JIT) compilation
- .hi[rich type system], which can yield massive performance gains
- .hi[multiple dispatch] (i.e. the same function can be reused for different types of inputs)
- .hi[metaprogramming] (like macros in Stata)
- .hi[loops don't slow you down] (compared to Matlab or R, where they are very slow)
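
A minimal sketch of multiple dispatch. The function name `whatis` is made up for this illustration and is not part of Julia or any package:

```julia
# `whatis` is a hypothetical function; Julia chooses which method to run
# based on the types (and number) of the arguments it receives.
whatis(x::Integer) = "an integer"
whatis(x::AbstractFloat) = "a floating-point number"
whatis(x::AbstractString) = "a string"
whatis(x, y) = "two arguments"

println(whatis(1))        # an integer
println(whatis(1.0))      # a floating-point number
println(whatis("one"))    # a string
println(whatis(1, 2.0))   # two arguments
```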
---
# Learning Julia
- There are lots of resources for learning Julia
- Ultimately, learning is through experience
- [Julia homepage](https://julialang.org/)
- [Documentation](https://docs.julialang.org/en/v1/)
- [Julia Discourse](https://discourse.julialang.org/)
- [YouTube](https://www.youtube.com/user/JuliaLanguage)
- [Cheat sheet](https://juliadocs.github.io/Julia-Cheat-Sheet/)

I regularly use all of these resources as I program in Julia
---
# Installing Julia
Go [here](https://julialang.org/downloads/) and follow the instructions for your computer's operating system
---
# Julia REPL
REPL = .hi[R]ead .hi[E]val .hi[P]rint .hi[L]oop (i.e. the interactive console)

- Open Julia and you should see a prompt that says `julia> `

Stuff the REPL can do:

- basic calculator functions; e.g. `sqrt(π)` which returns 1.77245
- up arrow for last command
- `;` enters shell mode (where you can issue system commands from inside Julia)
- `?` enters help mode, e.g. `?sqrt`
- `]` opens the package manager; e.g. `] add LinearAlgebra` (note: may [take a while](https://twitter.com/chrispdanko/status/1256382196895682560?s=20))
---
# Basic operations (see [cheatsheet](https://juliadocs.github.io/Julia-Cheat-Sheet/) for more details)
- .hi[Array indexing:] use `[]`, e.g. `X[5,2]`
- .hi[Show output:] use `println()`, e.g. `println("size of X is ", size(X))`
- .hi[Assignment:] use `=`, e.g. `X = rand(15,3)`
- .hi[Commenting:] use `#` for single line, `#= ... =#` for multi-line
- .hi[Element-wise operators:] must put a `.` in front, e.g. `x .+ y` if `x` and `y` are arrays
- .hi[Load installed package:] `using Random`
- .hi[Execute script:] `include("myfile.jl")` $\equiv$ `do myfile.do` or `source('myfile.R')`
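
The operations above can be combined in a few lines. This is only a sketch; the vectors `x` and `y` and the seed value are arbitrary choices for illustration:

```julia
using Random                 # load a package (Random ships with Julia)

Random.seed!(1234)           # arbitrary seed so the random draws are reproducible
X = rand(15, 3)              # assignment: a 15×3 matrix of uniform draws
println("size of X is ", size(X))   # prints: size of X is (15, 3)
println("X[5,2] is ", X[5, 2])      # array indexing: row 5, column 2

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
z = x .+ y                   # element-wise addition
#= a multi-line comment:
   z now holds [11.0, 22.0, 33.0] =#
println(z)
```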
---
# Creating and executing a Julia script
- In Stata or R, you create a script and then execute it
- The same thing is true in Julia, but with a slight difference
- In Julia, even scripts should have functions wrapped around them
- The following is the contents of `myscript.jl`
```julia
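# using Package1, Package2   # load whatever packages the script needs (none are needed here)

function scriptwrapper()
    X = [ones(15,1) rand(15,3)]   # 15×4 design matrix: a constant plus three regressors
    y = randn(15,1)               # simulated outcome
    β = X\y                       # compute OLS coefficients
    return β
end

βhat = scriptwrapper()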
```
- Then execute this script at the REPL by typing `include("myscript.jl")`
---
# Why do I need to wrap everything in a function?
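- Julia's JIT compiler compiles and optimizes code one function at a time, so code inside a function can be specialized to the types of its inputs
- Code at the top level of a script works with global variables, whose types can change at any time, so it is much harder to optimize
- This is where the speed gains come from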
- An added benefit is that it [promotes good programming practices](https://twitter.com/tyleransom/status/1227633474733060097?s=20)
- Putting everything in a function encourages you to abstract
- Abstraction usually leads to performance gains (see p. 22 [here](https://web.stanford.edu/~gentzkow/research/CodeAndData.pdf))
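
A rough sketch of the difference, assuming a simple sum-of-squares task. The names `sumsq_global` and `sumsq_local` are hypothetical and chosen only for this illustration:

```julia
v = rand(1_000)            # a non-constant global: its type could change at any time

function sumsq_global()
    s = 0.0
    for x in v             # reads the global `v`, so the compiler cannot assume its type
        s += x^2
    end
    return s
end

function sumsq_local(w)
    s = 0.0
    for x in w             # `w` is an argument, so the method is compiled for its concrete type
        s += x^2
    end
    return s
end

println(sumsq_global() ≈ sumsq_local(v))   # same answer either way
```

Both versions return the same number. Timing each call with `@time` (after one warm-up call) is one way to compare their speed on your own machine.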
---
# The most common error message you'll receive
- Julia is obsessive about types
- `1.0` is different from `1` (the former is a `Float64` while the latter is an `Int64`)
- This matters: e.g. some functions only accept `Integer` inputs

If you type this: `ones(π,1)`

You'll get this:

```julia
ERROR: MethodError: no method matching ones(::Irrational{:π}, ::Int64)
Closest candidates are:
ones(::Union{Integer, AbstractUnitRange}...) at array.jl:448
ones(::Type{T}, ::Union{Integer, AbstractUnitRange}...) where T at array.jl:449
Stacktrace:
[1] top-level scope at none:0
```
---
# `MethodError`
- The error message on the previous slide is saying that you are violating the rules of the function you're calling
- The solution is to read the error message and note that this function requires `Integer` types as inputs
- You will also encounter error messages in the following common situations:
- You are supplying the wrong number of inputs to a function
- You are trying to call a function or object that Julia can't find
- (either you haven't loaded a required package, or you haven't defined that function yet)
- To resolve errors, copy the error message into a search engine and see what comes up
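
One possible way to fix the `ones(π,1)` example, sketched under the assumption that the intended dimension is the integer part of the number:

```julia
# The failing call from the previous slide, wrapped in try/catch so the script keeps running
try
    ones(π, 1)
catch err
    println(err isa MethodError)    # true
end

# Passing integer dimensions works
A = ones(3, 1)                      # 3×1 matrix of Float64 ones

# If the dimension comes from a non-integer value, convert it explicitly first
n = floor(Int, float(π))            # 3
B = ones(n, 1)
println(size(A) == size(B))         # true
```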
---
# Cool Julia features
- My favorite feature is the ability to use Greek symbols in programming
- To write these, type the LaTeX code for the symbol and then press Tab
- e.g. `\pi`+Tab = `π`
- Another cool feature is the `Distributions.jl` package
- This package allows a user to specify any desired probability distribution
- The user can take draws from it, compute quantiles or probabilities, etc.
- You can also double index an object
- e.g. `X = rand(15,2)` followed by `X[2,:][2]` (though this example is silly)
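
A short sketch of the `Distributions.jl` workflow described above, assuming the package is already installed (`] add Distributions`). The standard normal is just an example distribution:

```julia
using Distributions          # third-party package, not shipped with Julia

d = Normal(0.0, 1.0)         # standard normal distribution
draws = rand(d, 5)           # five random draws
q = quantile(d, 0.975)       # 97.5th percentile, about 1.96
p = cdf(d, 0.0)              # P(X ≤ 0)

println(length(draws), " draws; q ≈ ", round(q, digits = 2), "; p = ", p)
```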
---
# Comprehensions
- Another excellent feature is known as comprehensions
- They let you build an object from a formula in a single line of code
- e.g. computing a present value $\sum_{t=1}^T \beta^t Y_t$
- `PV = sum([β^t*Y[t] for t=1:T])`
- Comprehensions allow for much lighter syntax than in other languages
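
The present-value formula above, worked through with made-up numbers (`β = 0.9`, `T = 3`, and a constant income of 100 are assumptions for illustration only):

```julia
# Hypothetical inputs: a discount factor and a constant income stream
β = 0.9
T = 3
Y = [100.0, 100.0, 100.0]

# Comprehension: builds the vector of discounted terms, then sums it
PV = sum([β^t * Y[t] for t = 1:T])

# Generator: the same sum without allocating the intermediate vector
PV_gen = sum(β^t * Y[t] for t = 1:T)

println(PV ≈ 243.9)        # true, since 90 + 81 + 72.9 = 243.9
println(PV ≈ PV_gen)       # true
```

Dropping the square brackets turns the comprehension into a generator, which can be handy when `T` is large.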
---
# Data Input and Output
- .hi[Read a CSV file:] `using CSV, DataFrames; data = CSV.read("filename.csv", DataFrame)`
- .hi[Write a CSV file:] `using CSV; CSV.write("filename.csv", data)`
- .hi[Save a Julia Object:] `using JLD; save("filename.jld", "object_key", object, ...)`
- [this is like `.dta` (Stata) or `.rda` (R)]
- .hi[Load a Julia Object:] `using JLD; d = load("filename.jld") # Returns a Dict`
- A `Dict` (dictionary) is a collection of key-value pairs [kind of like a named `list()` in R]
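
To get a feel for working with a `Dict`, here is a small sketch built by hand rather than loaded from a file. The keys and values are invented for illustration:

```julia
# A Dict maps keys to values; these entries are made up
d = Dict("beta" => [0.5, 1.2], "nobs" => 74)

println(d["nobs"])                  # look up a value by its key: 74
d["converged"] = true               # add a new key-value pair
println(haskey(d, "beta"))          # check whether a key exists: true
println(sort(collect(keys(d))))     # keys are unordered, so sort them for display
```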
---
# Running a regression
- The `CSV` package is usually used in concert with the `DataFrames` package
- To run Generalized Linear Model (GLM) regressions, use the `GLM` package
```julia
using CSV, DataFrames, GLM, HTTP, CategoricalArrays
# load Stata auto dataset (i.e. `sysuse auto` in Stata)
url = "https://tyleransom.github.io/teaching/MetricsLabs/auto.csv"
auto = CSV.read(HTTP.get(url).body, DataFrame)
# set `rep78` variable to be categorical
auto.rep78 = categorical(auto.rep78)
# run basic regression (`reg price mpg foreign headroom i.rep78` in Stata)
lm(@formula(price ~ mpg + foreign + headroom + rep78), auto)
```
---
# Piping in Julia
- In R, `%>%` can be used to create piping chains
- This makes code more readable
- e.g. `x %>% mean() %>% log()` instead of `log(mean(x))`
- In Julia, you can pipe with `|>`, e.g. `x |> sum |> log`
- Note: `|>` is `|` and then `>` (it doesn't show up separately in this font)
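
A brief sketch of piping on a small vector. The anonymous-function step and the element-wise `.|>` form go slightly beyond the example above and are shown as optional extras:

```julia
x = [1.0, 2.0, 3.0, 4.0]

a = x |> sum |> log          # same as log(sum(x))
println(a == log(sum(x)))    # true

# Functions that need extra arguments can be wrapped in an anonymous function
b = x |> sum |> (s -> round(s / length(x), digits = 1))
println(b)                   # 2.5

# `.|>` applies the piped function to each element
c = x .|> sqrt
println(c == sqrt.(x))       # true
```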
---
# Where to go from here
- [Problem Set 1](https://github.com/OU-PhD-Econometrics/fall-2021/blob/master/ProblemSets/PS1-julia-intro/PS1.pdf) will provide ample opportunity to practice Julia
- There are other tips and tricks that you will pick up over time
- Either from the [cheatsheet](https://juliadocs.github.io/Julia-Cheat-Sheet/) or on the [Discourse site](https://discourse.julialang.org/)
- The Discourse community is really nice, even if (like me) you ask a stupid question
- Happy coding!