-
Notifications
You must be signed in to change notification settings - Fork 180
/
Copy pathTQ01-core-functions-in-tidyquant.Rmd
443 lines (296 loc) · 18.9 KB
/
TQ01-core-functions-in-tidyquant.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
---
title: "Core Functions in tidyquant"
author: "Matt Dancho"
date: "`r Sys.Date()`"
output:
rmarkdown::html_vignette:
toc: true
toc_depth: 2
vignette: >
%\VignetteIndexEntry{Core Functions in tidyquant}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, echo = FALSE, message = FALSE, warning = FALSE}
knitr::opts_chunk$set(message = FALSE,
warning = FALSE,
fig.width = 8,
fig.height = 4.5,
fig.align = 'center',
out.width='95%',
dpi = 150)
```
> A few core functions with a lot of power
# Overview
The `tidyquant` package has a __few core functions with a lot of power__. Few functions means less of a learning curve for the user, which is why there are only a handful of functions the user needs to learn to perform the vast majority of financial analysis tasks. The main functions are:
* __Get a Stock Index, `tq_index()`, or a Stock Exchange, `tq_exchange()`__: Returns the stock symbols and various attributes for every stock in an index or exchange. Eighteen indexes and three exchanges are available.
* __Get Quantitative Data, `tq_get()`__: A one-stop shop to get data from various web-sources.
* __Transmute, `tq_transmute()`, and Mutate, `tq_mutate()`, Quantitative Data__: Perform and scale financial calculations completely within the `tidyverse`. These workhorse functions integrate the `xts`, `zoo`, `quantmod`, and `TTR` packages.
* __Performance analysis, `tq_performance()`, and portfolio aggregation, `tq_portfolio()`__: The `PerformanceAnalytics` integration enables analyzing performance of assets and portfolios. Because of the breadth of this topic, refer to [Performance Analysis with tidyquant](TQ05-performance-analysis-with-tidyquant.html) for a tutorial on these functions.
# Prerequisites
Load the `tidyquant` package to get started.
```r
# Loads tidyquant, lubridate, xts, quantmod, TTR
library(tidyverse)
library(tidyquant)
```
```{r, include=FALSE}
# Load for R CMD CHECK
library(dplyr)
library(lubridate)
library(tidyquant)
```
# 1.0 Retrieve Consolidated Symbol Data
## 1.1 Stock Indexes
A wide range of stock index / exchange lists can be retrieved using `tq_index()`. To get a full list of the options, use `tq_index_options()`.
```{r}
tq_index_options()
```
Set `x` as one of the options in the list of options above to get the desired stock index / exchange.
```{r, eval = FALSE}
tq_index("SP500")
```
The data source is State Street Global Advisors.
## 1.2 Stock Exchanges
Stock lists for three stock exchanges are available: NASDAQ, NYSE, and AMEX. If you forget, just use `tq_exchange_options()`. We can easily get the full list of stocks on the NASDAQ exchange.
```{r, eval=FALSE}
tq_exchange("NASDAQ")
```
# 1.0 Get Quantitative Data
The `tq_get()` function is used to collect data by changing the `get` argument. The data sources:
1. __Yahoo Finance__ - Daily stock data
2. __FRED__ - Economic data
3. __Quandl__ - Economic, Energy, & Financial Data API
4. __Tiingo__ - Financial API with sub-daily stock data and crypto-currency
5. __Alpha Vantage__ - Financial API with sub-daily, ForEx, and crypto-currency data
6. __Bloomberg__ - Financial API. Paid account is required.
Use `tq_get_options()` to see the full list.
```{r}
tq_get_options()
```
## 2.1 Yahoo! Finance
The stock prices can be retrieved succinctly using `get = "stock.prices"`. This returns stock price data from Yahoo Finance.
```{r}
aapl_prices <- tq_get("AAPL", get = "stock.prices", from = " 1990-01-01")
aapl_prices
```
We can get multiple stocks:
```{r}
stocks <- c("AAPL", "META", "NFLX") %>%
tq_get(from = "2013-01-01",
to = "2017-01-01")
stocks
```
Yahoo Japan stock prices can be retrieved using a similar call, `get = "stock.prices.japan"`.
```{r, eval = F}
x8411T <- tq_get("8411.T", get = "stock.prices.japan", from = "2016-01-01", to = "2016-12-31")
```
The data source is Yahoo Finance (https://finance.yahoo.com/) and Yahoo Finance Japan.
## 2.2 FRED Economic Data
A wealth of economic data can be extracted from the Federal Reserve Economic Data (FRED) database. The FRED contains over 10K data sets that are free to use. See the [FRED categories](https://fred.stlouisfed.org/categories) to narrow down the data base and to get data codes. The [WTI Crude Oil Prices](https://fred.stlouisfed.org/series/DCOILWTICO) are shown below.
```{r,}
wti_price_usd <- tq_get("DCOILWTICO", get = "economic.data")
wti_price_usd
```
## 2.3 Nasdaq Data Link (Quandl) API
[Quandl](https://data.nasdaq.com/) provides access to a vast number of financial and economic databases.
The Quandl packages must be installed separately.
```r
install.packages("Quandl")
```
### Authentication
To make full use of the integration we recommend you set your api key. To do this create or sign into your Quandl account and go to your account api key page.
```{r, eval = F}
quandl_api_key("<your-api-key>")
```
### Search
Searching Quandl from within the R console is possible with `quandl_search()`, a wrapper for `Quandl::Quandl.search()`. An example search is shown below. The only required argument is `query`. You can also visit the [Quandl Search](https://data.nasdaq.com/search) webpage to search for available database codes.
```{r, eval = F}
quandl_search(query = "Oil", database_code = "NSE", per_page = 3)
```
### Getting Quandl Data
Getting data is integrated into `tq_get()`. Two get options exist to retrieve Quandl data:
1. `get = "quandl"`: Get's Quandl time series data. A wrapper for `Quandl()`.
2. `get = "quandl.datatable"`: Gets Quandl datatables (larger data sets that may not be time series). A wrapper for `Quandl.datatable()`.
Getting data from Quandl can be achieved in much the same way as the other "get" options. Just pass the "codes" for the data along with desired arguments for the underlying function.
The following uses `get = "quandl"` and the "WIKI" database to download daily stock prices for AAPL in 2016. The output is a tidy data frame.
```{r, eval = F}
c("WIKI/AAPL") %>%
tq_get(get = "quandl",
from = "2016-01-01",
to = "2016-12-31")
```
The following time series options are available to be passed to the underlying `Quandl()` function:
* `start_date` (`from`) = "yyyy-mm-dd" | `end_date` (`to`) = "yyyy-mm-dd"
* `column_index` = numeric column number (e.g. 1)
* `rows` = numeric row number indicating first n rows (e.g. 100)
* `collapse` = "none", "daily", "weekly", "monthly", "quarterly", "annual"
* `transform` = "none", "diff", "rdiff", "cumul", "normalize"
Here's an example to get period returns of the adj.close (column index 11) using the `column_index`, `collapse` and `transform` arguments.
```{r, eval = F}
"WIKI/AAPL" %>%
tq_get(get = "quandl",
from = "2007-01-01",
to = "2016-12-31",
column_index = 11,
collapse = "annual",
transform = "rdiff")
```
Datatables are larger data sets. These can be downloaded using `get = "quandl.datatable"`. Note that the time series arguments do not work with data tables.
Here's several examples of [Zacks Fundamentals Collection B](https://data.nasdaq.com/databases/ZFB/documentation/about)
```{r, eval = F}
# Zacks Fundamentals Collection B (DOW 30 Available to non subscribers)
tq_get("ZACKS/FC", get = "quandl.datatable") # Zacks Fundamentals Condensed
tq_get("ZACKS/FR", get = "quandl.datatable") # Zacks Fundamental Ratios
tq_get("ZACKS/MT", get = "quandl.datatable") # Zacks Master Table
tq_get("ZACKS/MKTV", get = "quandl.datatable") # Zacks Market Value Supplement
tq_get("ZACKS/SHRS", get = "quandl.datatable") # Zacks Shares Out Supplement
```
## 2.4 Tiingo API
The Tiingo API is a free source for stock prices, cryptocurrencies, and intraday feeds from the IEX (Investors Exchange). This can serve as an alternate source of data to Yahoo! Finance.
### Authentication
To make full use of the integration you need to get an API key and then set your api key. If you don't have one already, go to Tiingo account and get your FREE API key. You can then set it as follows:
```{r, eval=FALSE}
tiingo_api_key('<your-api-key>')
```
### Getting Tiingo Data
The `tidyquant` package provides convenient wrappers to the `riingo` package (R interface to Tiingo). Here's how `tq_get()` maps to `riingo`:
- Tiingo Prices: `tq_get(get = "tiingo") = riingo::riingo_prices()`
- Tiingo IEX Data: `tq_get(get = "tiingo.iex") = riingo::riingo_iex_prices()`
- Tiingo Crypto Data: `tq_get(get = "tiingo.crypto") = riingo::riingo_crypto_prices()`
```{r, eval=F}
# Tiingo Prices (Free alternative to Yahoo Finance!)
tq_get(c("AAPL", "GOOG"), get = "tiingo", from = "2010-01-01")
# Sub-daily prices from IEX ----
tq_get(c("AAPL", "GOOG"),
get = "tiingo.iex",
from = "2020-01-01",
to = "2020-01-15",
resample_frequency = "5min")
# Tiingo Bitcoin in USD ----
tq_get(c("btcusd"),
get = "tiingo.crypto",
from = "2020-01-01",
to = "2020-01-15",
resample_frequency = "5min")
```
## 2.5 Alpha Vantage API
[Alpha Vantage](https://www.alphavantage.co/) provides access to a real-time and historical financial data. The `alphavantager` package, a lightweight R interface, has been integrated into `tidyquant` as follows. The benefit of the integration is the __scalability since we can now get multiple symbols returned in a tidy format__. You will need to install it first.
```r
install.packages("alphavantager")
```
### Authentication
To make full use of the integration you need to get an API key and then set your api key. If you don't have one already, go to [Alpha Vantage](https://www.alphavantage.co/) account and get your FREE API key. You can then set it as follows:
```{r, eval = F}
# install.packages("alphavantager")
av_api_key("<your-api-key>")
```
### Getting Alpha Vantage Data
Getting data is simple as the structure follows the [Alpha Vantage API documentation](https://www.alphavantage.co/documentation/). For example, if you wish to retrieve intraday data at 5 minute intervals for META and MSFT, you can build the parameters `x = c("META", "MSFT"), get = "alphavantager", av_fun = "TIME_SERIES_INTRADAY", interval = "5min"`. The familiar `x` and `get` are the same as you always use. The `av_fun` argument comes from `alphavantager::av_get()` and the Alpha Vantage documentation. The `interval` argument comes from the docs as well.
```{r, eval = F}
# Scaling is as simple as supplying multiple symbols
c("META", "MSFT") %>%
tq_get(get = "alphavantage", av_fun = "TIME_SERIES_INTRADAY", interval = "5min")
```
## 2.6 Bloomberg
[Bloomberg](https://www.bloomberg.com/professional/solution/bloomberg-terminal/) provides access to arguably the most comprehensive financial data and is actively used by most major financial institutions that work with financial data. The `Rblpapi` package, an R interface to Bloomberg, has been integrated into `tidyquant` as follows. The benefit of the integration is the __scalability since we can now get multiple symbols returned in a tidy format__.
### Authentication
To make full use of the integration you need to have a Bloomberg Terminal account (Note this is not a free service). If you have Bloomberg Terminal running on your machine, you can connect as follows:
```{r, eval = F}
# install.packages("Rblpapi")
Rblpapi::blpConnect()
```
### Getting Bloomberg Data
Getting data is simple as the structure follows the [Rblpapi API documentation](https://CRAN.R-project.org/package=Rblpapi). For example, if you wish to retrieve monthly data for SPX Index and AGTHX Equity, you can build the `tq_get` parameters as follows:
- `x = c('SPX Index','ODMAX Equity')`
- `get = "rblpapi"`
- `rblpapi_fun = "bdh"` Note that "bdh" is the default, and options include "bdh" (Bloomberg Data History), "bds" (Bloomberg Data Set), and "bdp" (Bloomberg Data Point)
- `from / to` These get passed to `start.date` and `end.date` and can be provided in "YYYY-MM-DD" character format. Note that `start.date` and `end.date` from `Rblpapi` can be used but must be converted to date or datetime.
- Other arguments: These are options that depend on the `rblpapi_fun`. See `Rblpapi` documentation.
```{r, eval = F}
# Get Bloomberg data in a tidy data frame
my_bloomberg_data <- c('SPX Index','ODMAX Equity') %>%
tq_get(get = "Rblpapi",
rblpapi_fun = "bdh",
fields = c("PX_LAST"),
options = c("periodicitySelection" = "WEEKLY"),
from = "2016-01-01",
to = "2016-12-31")
```
# 3.0 Mutate Quantitative Data
Mutating functions enable the `xts`/`zoo`, `quantmod` and `TTR` functions to shine. We'll touch on the mutation functions briefly using the `FANG` data set, which consists of daily prices for META, AMZN, GOOG, and NFLX from the beginning of 2013 to the end of 2016. We'll apply the functions to grouped data sets to get a feel for how each works
```{r}
FANG
```
For a detailed walkthrough of the compatible functions, see the next vignette in the series, [R Quantitative Analysis Package Integrations in tidyquant](TQ02-quant-integrations-in-tidyquant.html).
## 3.1 Transmute Quantitative Data, tq_transmute
Transmute the results of `tq_get()`. Transmute here holds almost the same meaning as in `dplyr`, only the newly created columns will be returned, but with `tq_transmute()`, the number of rows returned can be different than the original data frame. This is important for changing periodicity. An example is periodicity aggregation from daily to monthly.
```{r}
FANG %>%
group_by(symbol) %>%
tq_transmute(select = adjusted, mutate_fun = to.monthly, indexAt = "lastof")
```
Let's go through what happened. `select` allows you to easily choose what columns get passed to `mutate_fun`. In example above, `adjusted` selects the "adjusted" column from `data`, and sends it to the mutate function, `to.monthly`, which mutates the periodicity from daily to monthly. Additional arguments can be passed to the `mutate_fun` by way of `...`. We are passing the `indexAt` argument to return a date that matches the first date in the period.
### Working with non-OHLC data
Returns from FRED, Oanda, and other sources do not have open, high, low, close (OHLC) format. However, this is not a problem with `select`. The following example shows how to transmute WTI Crude daily prices to monthly prices. Since we only have a single column to pass, we can leave the `select` argument as `NULL` which selects all columns by default. This sends the price column to the `to.period` mutate function.
```{r, message=FALSE, warning=FALSE}
wti_prices <- tq_get("DCOILWTICO", get = "economic.data")
wti_prices %>%
tq_transmute(mutate_fun = to.period,
period = "months",
col_rename = "WTI Price")
```
## 3.2 Mutate Quantitative Data, tq_mutate
Adds a column or set of columns to the tibble with the calculated attributes (hence the original tibble is returned, mutated with the additional columns). An example is getting the `MACD` from `close`, which mutates the original input by adding MACD and Signal columns. Note that we can quickly rename the columns using the `col_rename` argument.
```{r}
FANG %>%
group_by(symbol) %>%
tq_mutate(select = close,
mutate_fun = MACD,
col_rename = c("MACD", "Signal"))
```
Note that a mutation can occur if, and only if, the mutation has the same structure of the original tibble. In other words, the calculation must have the same number of rows and row.names (or date fields), otherwise the mutation cannot be performed.
### Mutate rolling regressions with rollapply
A very powerful example is applying __custom functions__ across a rolling window using `rollapply`. A specific example is using the `rollapply` function to compute a rolling regression. This example is slightly more complicated so it will be broken down into three steps:
1. Get returns
2. Create a custom function
3. Apply the custom function across a rolling window using `tq_mutate(mutate_fun = rollapply)`
_Step 1: Get Returns_
First, get combined returns. The asset and baseline returns should be in wide format, which is needed for the `lm` function in the next step.
```{r}
fb_returns <- tq_get("META", get = "stock.prices", from = "2016-01-01", to = "2016-12-31") %>%
tq_transmute(adjusted, periodReturn, period = "weekly", col_rename = "fb.returns")
xlk_returns <- tq_get("XLK", from = "2016-01-01", to = "2016-12-31") %>%
tq_transmute(adjusted, periodReturn, period = "weekly", col_rename = "xlk.returns")
returns_combined <- left_join(fb_returns, xlk_returns, by = "date")
returns_combined
```
_Step 2: Create a custom function_
Next, create a custom regression function, which will be used to apply over the rolling window in Step 3. An important point is that the "data" will be passed to the regression function as an `xts` object. The `timetk::tk_tbl` function takes care of converting to a data frame for the `lm` function to work properly with the columns "fb.returns" and "xlk.returns".
```{r}
regr_fun <- function(data) {
coef(lm(fb.returns ~ xlk.returns, data = timetk::tk_tbl(data, silent = TRUE)))
}
```
_Step 3: Apply the custom function_
Now we can use `tq_mutate()` to apply the custom regression function over a rolling window using `rollapply` from the `zoo` package. Internally, since we left `select = NULL`, the `returns_combined` data frame is being passed automatically to the `data` argument of the `rollapply` function. All you need to specify is the `mutate_fun = rollapply` and any additional arguments necessary to apply the `rollapply` function. We'll specify a 12 week window via `width = 12`. The `FUN` argument is our custom regression function, `regr_fun`. It's extremely important to specify `by.column = FALSE`, which tells `rollapply` to perform the computation using the data as a whole rather than apply the function to each column independently. The `col_rename` argument is used to rename the added columns.
```{r}
returns_combined %>%
tq_mutate(mutate_fun = rollapply,
width = 12,
FUN = regr_fun,
by.column = FALSE,
col_rename = c("coef.0", "coef.1"))
returns_combined
```
As shown above, the rolling regression coefficients were added to the data frame.
## 3.3 _xy Variants, tq_mutate_xy and tq_transmute_xy
Enables working with mutation functions that require two primary inputs (e.g. EVWMA, VWAP, etc).
### Mutate with two primary inputs
EVWMA (exponential volume-weighted moving average) requires two inputs, price and volume. To work with these columns, we can switch to the xy variants, `tq_transmute_xy()` and `tq_mutate_xy()`. The only difference is instead of the `select` argument, you use `x` and `y` arguments to pass the columns needed based on the `mutate_fun` documentation.
```{r, message=FALSE, warning=FALSE}
FANG %>%
group_by(symbol) %>%
tq_mutate_xy(x = close, y = volume,
mutate_fun = EVWMA, col_rename = "EVWMA")
```