-
Notifications
You must be signed in to change notification settings - Fork 4
/
tstools.Rmd
582 lines (427 loc) · 20.8 KB
/
tstools.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
---
title: "tstools"
subtitle: "A Time Series Toolbox for Official Statistics"
author: "Matthias Bannert"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{A Time Series Toolbox for Official Statistics}
%\VignetteEngine{knitr::rmarkdown}
%\usepackage[utf8]{inputenc}
---
# About tstools
The *tstools* package provides convenience functions to process, plot and export time series. It was designed for users from the fields of official statistics and macroeconomics. The package is focused on regular time series of monthly and quarterly as well as yearly frequency. By summer 2018, most of the functionality provided by *tstools* deals with plotting and exporting time series and so does this manual.
<!-- Admittedly this a constraint, but being limited to this kind of data also provides the opportunity to choose default parameters and do a lot of parameter guessing. In turn, being able to use defaults and guessed parameters enables quicker specification and less configuration for the user. -->
## Why yet another time series package?
If you have ever thought (or heard) ``I can't believe it's so disgusting to create simple plots with 2 y-axes of different scales.``, or ``This R thing can't do time series bar charts 'properly' - even Excel can do this. I don't get the hype.`` or ``Why isn't the 2010 label in the middle of the year?`` **tstools** is for you.
In other words, whenever the 'business' directly works with code, they wonder why, e.g., legends are not placed automatically where they ought to be. In addition to make plotting more convenient and fun, the main goal of **tstools** is to provide a simple environment for production reports and a stand alone plots that focus around time series.
Instead of claiming that the business is wrong, limits flexibility or that their visual concepts are flawed, **tstools** tries to provide a solution that helps economists and establishment statisticians to work and plot conveniently using R.
Replace *automatically* with *by default* and you understand how **tstools** essentially works. The package uses R's base plot functionality and sets a plethora of defaults, that in combination with each other make the plots look nifty. All the package does when it comes to plotting, is try to guess what the user wants when they call *tsplot*.
The following sections will show some examples of popular time series plots that *used* to be hard to get in R:
- (line) charts with 2 y-axes but matching grids
- charts with highlighted time spans
- slick fan charts
- time series bar charts with negative and positive growth contributions
- charts with a continous time x-axis
- charts with a default legend
# Graphs
*tstools* produces base R plots. Hence all resulting plots can simply be extended by further calls to base R plot functions. Base R plots look rather technical and raw, which is why **tstools** tries to set a ton of useful defaults to make time series plots look fresh and clean from the start.
## Basic usage
Plotting with *tstools* is easy. There is only one generic plotting function called **tsplot**. Depending on what time series objects are passed on to the function, the method dispatcher chooses the right method and plots the graph. The following sections will walk through several applied plotting examples. Horizontal grids that suit two axes, automatic shifting of series to the middle of the period, colors, line types, filling up started years and many other features come as convenient defaults. Yet, all of these defaults can be adjusted using *themes*.
## Before we get started...
... let's set up some data for our reproducible examples first.
Let's do this separately so we can really focus on the **tstools** specific code in our discussion of the examples. Don't get confused, plotting is really easy, creating random data or reading in data in order to make the examples reproducible is what costs us some extra lines of code here.
```{r,echo=TRUE,message=FALSE,warning=FALSE}
library(tstools)
data(KOF)
short <- window(KOF$kofbarometer,
start = c(2007, 1),
end = c(2014, 1)
)
# list of time series
ts1 <- ts(runif(40, -10, 40), start = c(1995, 1), freq = 4)
ts2 <- ts(runif(80, 0, 50), start = c(2000, 1), freq = 12)
tslist <- list()
tslist$ts1 <- ts1
tslist$ts2 <- ts2
# data for stacked bar charts...
tsb1 <- ts(runif(30, -30, 20), start = c(2010, 1), frequency = 4)
tsb2 <- ts(runif(30, 0, 50), start = c(2010, 1), frequency = 4)
tsb3 <- ts(runif(30, 0, 50), start = c(2010, 1), frequency = 4)
min_series <- ts(runif(10, -10, 40), start = c(1995, 1), freq = 4)
min_series_2 <- ts(runif(25, -20, 40), start = c(1995, 1), freq = 12)
min_series_3 <- ts(runif(25, -20, 40), start = c(1995, 1), freq = 4)
min_li <- list(
series1 = min_series,
series2 = min_series_2,
series3 = min_series_3
)
missings <- ts(c(1, 2, 10, 3, 5, 6, NA, NA, 3, 2, 5, 3, 1, 1),
start = c(1995, 1), freq = 4
)
```
## Single time series: line chart
The most basic example of a time series plot is a time series line chart. The object *short* is of class ts.
```{r,fig.width = 7,fig.height=6,message=FALSE}
tsplot(short)
```
## Multiple time series (same y-axis) in one line chart
The function *tsplot* can handle multiple time series objects or lists at once. You can either throw multiple comma separated time series objects at *tsplot*
```{r,fig.width = 7,fig.height=6}
tsplot(ts1, ts2, auto_legend = FALSE)
```
or a list of time series...
```{r,fig.width = 7,fig.height=6}
tsplot(tslist, auto_legend = FALSE)
```
Even though *tsplot* supports *mts* and single time series, we clearly recommend to pass lists of time series to *tsplot* for the best and most tested experience. Lists are a convenient way to add legends to time series charts as *tsplot* automatically uses names of list elements in its legend. You may also define your lists adhoc within the *tsplot* call like so:
```{r,fig.width = 7,fig.height=6}
tsplot(list(
"Time Series 1" = ts1,
"Time Series 2" = ts2
))
```
## Auto-scale grids are there to help you, how to configure them!
The latest release has considerably improved finding suitable scale and grids
automatically. One improvement is the ability to detect not only the minimum
necessary range, but also to check whether some value is so close the x-axis
that most users prefer an extra grid to have a little extra breathing room.
If there's less than 15 percent of the bottom or grid left, *tsplot* automatically adds another
grid.
The following example tweaks this 15 percent margin to an exaggerated 70 percent,
in order to show that an extra grid up top is added because the *y_tick_margin*
parameter implies now that 70 percent of the outer most grid needs to stay clean.
```{r}
tsplot(short,
theme = init_tsplot_theme(y_tick_margin = .7)
)
```
## Manual value ticks
Often you just want to have a fixed scale, e.g., for an index that ranges from 0 to 100. Simply use the ``manual_value_ticks_l`` and ``manual_value_ticks_r`` arguments to specify manual ticks and grids.
In case you use 2 y-axes make sure both manual value tick vectors are of the same length.
```{r}
tsplot(KOF["kofbarometer"],
manual_value_ticks_l = seq(60, 120, by = 20)
)
```
## Fan Charts: Plotting Confidence Intervals
*tsplot* can assign confidence intervals to every
time series line. Simply choose the line and confidence level and define an upper
and lower bound *ts* object to draw shaded confidence bounds around a respective series. A confidence interval definition is basically a nested list. For the sake of clear code, it is recommended to define the CI list separately like so:
```{r,fig.width = 7,fig.height=6}
# Define confidence intervals
ci <- list(
"KOF Barometer" = list(
"80" = list(
lb = KOF$baro_lo_80,
ub = KOF$baro_hi_80
),
"95" = list(
lb = KOF$baro_lo_95,
ub = KOF$baro_hi_95
)
)
)
tsplot(list("KOF Barometer" = KOF$baro_point_fc),
ci = ci
)
```
The KOF data example dataset contains an *auto.arima* point forecast of the KOF Barometer as well as upper and lower bound at the 80% and 95% confidence level. Notice that *tsplot* does **not** do any forecasting or estimation of confidence bands, it simply takes some time series as upper and lower bounds and assigns them to a particular series. Thus it's agnostic of the estimation method.
## Stacked Bar Chart
Sometimes we want to display time series as bar charts. Most plotting engines
understand bar charts as something that has a categorical x-axis. So even if
you have time on the x-axis, periods are treated as categories,
which implies that a bar is centered above the category tick for that period.
*tstools* treats the x-axis for bar charts as continous and allows a quarterly series to truly represent an entire quarter. Note that stacked bar charts imply that all involved series have the same frequency.
```{r,fig.width = 7,fig.height=6}
tsplot(tsb1, tsb2, tsb3,
left_as_bar = TRUE,
auto_legend = FALSE,
theme = init_tsplot_theme(bar_gap = 10)
)
```
Notice that the gap size of the gap between the bars can be adjusted using the **bar_gap** theme parameter.
## Sum as line in stacked bar charts
One of the reasons for using bar charts with time series is
to add up positive and negative contributions. In this case it is also
helpful to be able to add the sum of the components to plot on a per period
basis. The following draws a line on top of the bars that represents the sum.
```{r,fig.width = 7,fig.height=6}
tsl <- list(tsb1, tsb2, tsb3)
tsplot(tsl,
left_as_bar = TRUE,
manual_value_ticks_l = seq(-40, 100, by = 20),
auto_legend = FALSE,
theme = init_tsplot_theme(sum_as_line = TRUE)
)
```
## Stacked bar charts with different start and end dates
It is even possible, to produce stacked bar charts from time series
of different start and end dates.
```{r,fig.width = 7,fig.height=6}
tsb1 <- ts(runif(30, -30, 20), start = c(2010, 1), frequency = 4)
tsb2 <- ts(runif(30, 0, 50), start = c(2010, 1), frequency = 4)
tsb3 <- ts(runif(30, 0, 50), start = c(2010, 1), frequency = 4)
tsb4 <- ts(runif(30, -40, 10), start = c(2005, 1), frequency = 4)
tsplot(tsb1, tsb2, tsb3, tsb4,
left_as_bar = TRUE,
auto_legend = FALSE
)
```
### Grouped bar charts
When different variables got the same scale, but cannot be aggregated
we want to display time series bars next to each other instead of stacking
them. In **tstools** stacking is the default but it can easily be tweaked using
the *group_bar_chart parameter.
```{r,fig.width = 7,fig.height=6}
tsb1 <- ts(runif(20, -30, 20), start = c(2010, 1), frequency = 12)
tsb2 <- ts(runif(20, 0, 50), start = c(2010, 1), frequency = 12)
tsb3 <- ts(runif(20, 0, 50), start = c(2010, 1), frequency = 12)
tsplot(tsb1, tsb2, tsb3,
left_as_bar = TRUE,
group_bar_chart = TRUE,
auto_legend = FALSE
)
```
## Stacked Area Charts
Stacked area charts are another way of stacking time series.
They might be more illustrative than stacked bar charts, but are subject to
certain limitations: plots always start at zero and all series need to be
either positive or negative.
```{r,fig.width = 7,fig.height=6}
set.seed(123)
tslist <- generate_random_ts(4,
starts = 1987:1990,
ranges_min = 1,
ranges_max = 3
)
tsplot(tslist, left_as_band = TRUE)
```
## Multiple Y-axis with different scales (line charts)
In order to compare indicators it's covenient in some domains to plot
two time series of completely different scale, e.g., a growth rate and
an indicator indexed at 100, to each other. Whenever the absolute level
is not overly interesting but rather the lead-lag structure and the
co-movement, 2 y-axes with different scales are popular. Hence *tsplot*
introduces a second argument, *tsr* (time series right), which takes either
an object of class ts or a list of time series.
```{r,fig.width = 7,fig.height=6}
data(KOF)
tsplot(KOF$kofbarometer,
tsr = KOF$reference, auto_legend = FALSE
)
```
## Multiple Y-axes with different scales (bar and line charts)
Sometimes you want a bar chart on one axis and an line chart on the other.
Guess what, *tstools* also has a convenient way of creating these. Simply
provide a list of time series to both the ... argument and the tsr argument
and choose *left_as_bar = TRUE*. Note that the line chart is automatically moved
to the middle of the quarterly bar.
```{r,fig.width = 7,fig.height=6}
tsb1 <- ts(runif(30, -30, 20), start = c(2010, 1), frequency = 4)
tsb2 <- ts(runif(30, 0, 30), start = c(2010, 1), frequency = 4)
tsb3 <- ts(runif(30, 0, 30), start = c(2010, 1), frequency = 4)
tsr1 <- ts(runif(30, -4, 6), start = c(2010, 1), frequency = 4)
tsplot(tsb1, tsb2, tsb3,
tsr = tsr1,
left_as_bar = TRUE,
auto_legend = FALSE
)
```
## Y-Grids: automatic vs. manual
*tstools* tries to guess a reasonable number of ticks (and horizontal grids).
This can be tricky when several time series and multiple axes are involved.
*tstools'* standard procedure uses value ranges and a logarithm based algorithm to find the order of magnitude of a scale. Further *tstools* brute forces through a number of reasonable tick counts and chooses a suitable number of ticks. In case there is more than one y-axis the choice will be passed on to the other axis.
## Using another function
However, there are countless possibilities and the number of ticks and grids may come down to a matter of personal taste. Hence, *tstools* provides not only the flexibility to set grids manually, you can even pass another algorithm implemented in you very own R function that gives back a vector of ticks. Simply pass a function to the ``find_ticks_function`` argument. Currently range, and potential tick count are fixed as arguments to these functions, but hopefully passing other sets of arguments will be possible soon.
## Tweaking the defaults: Themes
Font size, line color, bar color, grid color, show or not show grid, and a plethora of
other options would lead to a ton of parameters. If you had to specify all of those,
it would be time consuming task to create a quick explorative plot. So *tstools*
suggests many defaults to many parameters and stores these parameters in lists called
themes. To tweak a default, simply initialize the default theme, tweak a single list element
and pass the entire theme to *tsplot*. By doing so you can also define
properties of multiple plots just by passing the new theme to the *tsplot* call.
```{r}
def_theme <- init_tsplot_theme()
names(def_theme)
```
Please take a look at the help file (`?init_tsplot_theme`) for a comprehensive
list and documenatation of all theme parameters. The hand-picked examples below
should give you an idea of how much you can do with themes and how to use 'em.
### Highlight windows: mark a period
Let's assume the last 2 years of the time series are a forecast which should be
highlight by a shaded area behind the actual series. If you know in advance
which default parameter -- in this case *highlight_window = FALSE* --
you want to overwrite, I recommend to do so right in your calll to *init_tsplot_theme*.
The cool thing about it is: R Studio's auto auggest helps you find the parameter.
**Pro tip**: If you do not plan to reuse your theme in another plot, specify the
parameter directly in function call.
```{r,fig.width = 7,fig.height=6}
tsplot(tsb1, tsb2, tsb3,
left_as_bar = TRUE,
theme = init_tsplot_theme(highlight_window = TRUE)
)
```
### Add a Box Around Your Plot
```{r,fig.width = 7,fig.height=6}
tt <- init_tsplot_theme(use_box = TRUE)
tsplot(tsb1, tsb2, tsb3,
tsr = tsr1,
left_as_bar = TRUE,
theme = tt
)
```
### Change line types...
Note how you can simply change existing themes
```{r,fig.width = 7,fig.height=6}
tt$lty <- c(3, 2, 1)
tsplot(tsb1, tsb2, tsb3,
theme = tt
)
```
### Adjust the highlight window
```{r,fig.width = 7,fig.height=6}
nt <- init_tsplot_theme(highlight_window = TRUE)
nt$highlight_window_start <- c(2017, 1)
nt$highlight_window_end <- c(2018, 1)
tsplot(tsb1, tsb2,
theme = nt
)
```
### Handling missings (NA Handling)
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(missings,
theme = init_tsplot_theme(
NA_continue_line = TRUE,
show_points = TRUE
)
)
```
### Fill up Year With NAs
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(ts2)
```
## #Fill up Year With NAs
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(ts2,
theme = init_tsplot_theme(fill_year_with_nas = FALSE)
)
```
### Legends
## Assign Names to Single Objects
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(
"An arbitrary ts object" = ts1,
"another ts object" = ts2
)
```
## Legends Based on Names of List Elements
```{r,echo=TRUE}
names(tslist)
```
## Legends Based on Names of List Elements
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(tslist)
```
## Legends: multiple columns
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(min_li,
theme = init_tsplot_theme(legend_col = 2)
)
```
## Legends Left and Right Y-Axis
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(KOF["kofbarometer"],
tsr = KOF["reference"]
)
```
## Force all Legends to the Left
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(KOF["kofbarometer"],
tsr = list("reference (right scale)" = KOF$reference),
theme = init_tsplot_theme(legend_all_left = TRUE)
)
```
## Line Breaks in Legends
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot("Some like\n loooong legends\n with so many words" = ts1)
```
## Remember: auto_legend = FALSE
```{r,fig.width = 7,fig.height=6,echo=TRUE}
tsplot(KOF[1], auto_legend = FALSE)
```
<!-- ## Customizing the plots created by tstools -->
<!-- Plots generated by *tsplot* are simply a bunch of layers of base R plots. So anything -->
<!-- that can be done with a base R plot can also done with a time series plot generated -->
<!-- by *tsplot*. Here are just some applied examples to show what's possible. -->
<!-- ### Want to show a red line to highlight 6? -->
<!-- ```{r,fig.width = 7,fig.height=6} -->
<!-- tsplot(tsb1,tsb2,tsb3,tsr=tsr1, -->
<!-- manual_value_ticks_l = seq(-40,80,by=20), -->
<!-- manual_value_ticks_r = seq(-4,8, by=2), -->
<!-- left_as_bar = TRUE,auto_legend=FALSE) -->
<!-- abline(h=6,col="red") -->
<!-- ``` -->
<!-- ### Add some text to a plot -->
<!-- ```{r,fig.width = 7,fig.height=6} -->
<!-- tsplot(tsb1,tsb2,tsb3, -->
<!-- manual_value_ticks_l = seq(-40,80,by=20), -->
<!-- left_as_bar = TRUE,auto_legend=FALSE) -->
<!-- text(2017.5,-35,"2.3%") -->
<!-- text(2016.5,-35,"1.3%") -->
<!-- ``` -->
# Export your chart to .pdf
Saving your **tstools** chart to a .pdf document is as easy and convenient as
```{r,eval=FALSE}
tsplot(KOF[1],
output_format = "pdf",
theme = init_tsplot_print_theme(output_wide = TRUE)
)
```
Notice the optional print theme and *output_wide* parameter: You can export 4:3 (default) format .pdf files as well as 16:9 (wide) using the convenience *output_format* theme parameter. Notice also that you do not necessarily have to use the print theme. What is important is not to wrap the call in `pdf()` and `dev.off()` calls
when using the `output_format="pdf"` option.
# Export lists of time series
The latest release of **tstools** contains a major overhaul of its export
functionality. Exporting time series to csv is up to 400 times faster than before,
thanks to some profiling sessions and the inclusion of data.table (thanks Matt Dowle for the
awesome package). While this may not be that big of an achievement for some, those
who export hundreds of thousands or millions of time series will like it.
Besides the default .csv, .json, .xlsx, and .RData are available.
.csv allows for wide format and transposed wide format output.
## Csv: long format (default), wide format, transposed wide format.
```{r,eval=FALSE}
data(KOF)
write_ts(KOF, file.path(tempdir(), "test_export"), "csv")
```
```{r,eval=FALSE}
write_ts(KOF, file.path(tempdir(), "test_export_wide_trans"),
"csv",
wide = TRUE,
transpose = TRUE
)
```
*transpose = TRUE* moves the time to the header (x-axis) and places all variables in rows
below each other. Transposing data is a good solution if you have a larger amount
of variables and at max about 200-300 periods.
# Frequenctly asked questions (FAQs)
## 1. Can I combine **tsplot** calls with **ggplot** (themes)?
No, **tsplot** is base R based. You can simply add enhancements to the plot using base R calls. Add text, additional lines by calling functions such as `mtext()`, `abline()` etc.
## 2. I set `legend_col=1` but **tsplot** seems to ignore it. Why is that?
You're probably using 2 Y-axes. The default for two Y-axes is the have a left aligned and a right aligned axis. This allows you to leave out additions like '(left scale)', '(right scale)'. However, you can simply force all legends left.
```{r,eval=FALSE,echo=TRUE}
init_tsplot_theme(
legend_all_left = TRUE,
legend_col = 1
)
```
## 3. How can I change of my lines, bars and areas?
We have multiple pre-defined vectors for different chart types.
All of them are easy to adjust. Also
```{r,eval=FALSE,echo=TRUE}
init_tsplot_theme(
band_fill_color = c("#FF0000", "#00FF00"),
line_colors = c("#FF0000", "#00FF00"),
bar_fill_color = c("#FF0000", "#00FF00")
)
```