Replicate Verbout et al 2006

Gopi Goteti edited this page Jan 21, 2014 · 1 revision

Stephanie Verbout and others used data from the Storm Prediction Center to examine the evolving nature of the historical tornado record in the United States (Verbout, Stephanie M., et al. "Evolution of the US tornado database: 1954-2003." Weather and forecasting 21.1 (2006): 86-93.).

The below code produces numbers consistent, if not identical, to those from Verbout et al. The discrepancy could be due to the differences in Tornado data used (due to regular quality control the latest data used here may not be identical to the one used by SPC).

Read raw data add column names similar to those used by Elsner et al 2013 and consistent with documentation.

library(plyr)

torn <- read.csv("data/1950-2012_torn.csv", header = FALSE, sep = ",", as.is = TRUE)

colnames(torn) <- c("OM", "YEAR", "MONTH", "DAY", "DATE", "TIME", "TIMEZONE", 
    "STATE", "FIPS", "STATENUMBER", "FSCALE", "INJURIES", "FATALITIES", "LOSS", 
    "CROPLOSS", "SLAT", "SLON", "ELAT", "ELON", "LENGTH", "WIDTH", "NS", "SN", 
    "SG", "F1", "F2", "F3", "F4")

Tornadoes spanning multiple counties are listed separately for each county. Thus, a single tornado could appear multiple times. Identify unique tornadoes using YEAR, OM and NS. Check for uniqueness; especially, tornadoes spanning multiple years (i.e, those which begin on 12/31 and end on 1/1); need to check only those with NS > 1.

dec31 <- subset(torn, MONTH == 12 & DAY == 31 & NS != 1)
jan01 <- subset(torn, MONTH == 1 & DAY == 1 & NS != 1)
if (nrow(dec31) > 0 & nrow(jan01) > 0) {
    stop("check! unique id assignment may not be accurate!")
}
torn$id <- paste(torn$YEAR, torn$MONTH, torn$OM, torn$NS, sep = "-")

str(torn)
## 'data.frame':    58169 obs. of  29 variables:
##  $ OM         : int  1 1 1 2 3 4 5 6 7 8 ...
##  $ YEAR       : int  1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 ...
##  $ MONTH      : int  1 1 1 1 1 1 1 1 1 2 ...
##  $ DAY        : int  3 3 3 3 3 13 25 25 26 11 ...
##  $ DATE       : chr  "1950-01-03" "1950-01-03" "1950-01-03" "1950-01-03" ...
##  $ TIME       : chr  "11:00:00" "11:00:00" "11:10:00" "11:55:00" ...
##  $ TIMEZONE   : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ STATE      : chr  "MO" "MO" "IL" "IL" ...
##  $ FIPS       : int  29 29 17 17 39 5 29 17 48 48 ...
##  $ STATENUMBER: int  1 1 1 2 1 1 2 3 1 2 ...
##  $ FSCALE     : int  3 3 3 3 1 3 2 2 2 2 ...
##  $ INJURIES   : int  3 3 0 3 1 1 5 0 2 0 ...
##  $ FATALITIES : int  0 0 0 0 0 1 0 0 0 0 ...
##  $ LOSS       : num  6 6 5 5 4 3 5 5 0 4 ...
##  $ CROPLOSS   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ SLAT       : num  38.8 38.8 38.8 39.1 40.9 ...
##  $ SLON       : num  -90.2 -90.2 -90.1 -89.3 -84.6 ...
##  $ ELAT       : num  38.8 38.8 38.8 39.1 0 ...
##  $ ELON       : num  -90 -90.1 -90 -89.2 0 ...
##  $ LENGTH     : num  9.5 6.2 3.3 3.6 0.1 0.6 2.3 0.1 4.7 9.9 ...
##  $ WIDTH      : int  150 150 100 130 10 17 300 100 133 400 ...
##  $ NS         : int  2 2 2 1 1 1 1 1 1 1 ...
##  $ SN         : int  0 1 1 1 1 1 1 1 1 1 ...
##  $ SG         : int  1 2 2 1 1 1 1 1 1 1 ...
##  $ F1         : int  0 189 119 135 161 113 93 91 47 39 ...
##  $ F2         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ F3         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ F4         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ id         : chr  "1950-1-1-2" "1950-1-1-2" "1950-1-1-2" "1950-1-2-1" ...
head(torn)
##   OM YEAR MONTH DAY       DATE     TIME TIMEZONE STATE FIPS STATENUMBER
## 1  1 1950     1   3 1950-01-03 11:00:00        3    MO   29           1
## 2  1 1950     1   3 1950-01-03 11:00:00        3    MO   29           1
## 3  1 1950     1   3 1950-01-03 11:10:00        3    IL   17           1
## 4  2 1950     1   3 1950-01-03 11:55:00        3    IL   17           2
## 5  3 1950     1   3 1950-01-03 16:00:00        3    OH   39           1
## 6  4 1950     1  13 1950-01-13 05:25:00        3    AR    5           1
##   FSCALE INJURIES FATALITIES LOSS CROPLOSS  SLAT   SLON  ELAT   ELON
## 1      3        3          0    6        0 38.77 -90.22 38.83 -90.03
## 2      3        3          0    6        0 38.77 -90.22 38.82 -90.12
## 3      3        0          0    5        0 38.82 -90.12 38.83 -90.03
## 4      3        3          0    5        0 39.10 -89.30 39.12 -89.23
## 5      1        1          0    4        0 40.88 -84.58  0.00   0.00
## 6      3        1          1    3        0 34.40 -94.37  0.00   0.00
##   LENGTH WIDTH NS SN SG  F1 F2 F3 F4         id
## 1    9.5   150  2  0  1   0  0  0  0 1950-1-1-2
## 2    6.2   150  2  1  2 189  0  0  0 1950-1-1-2
## 3    3.3   100  2  1  2 119  0  0  0 1950-1-1-2
## 4    3.6   130  1  1  1 135  0  0  0 1950-1-2-1
## 5    0.1    10  1  1  1 161  0  0  0 1950-1-3-1
## 6    0.6    17  1  1  1 113  0  0  0 1950-1-4-1
tail(torn)
##           OM YEAR MONTH DAY       DATE     TIME TIMEZONE STATE FIPS
## 58164 425349 2012    12  25 2012-12-25 20:09:00        3    AL    1
## 58165 425351 2012    12  25 2012-12-25 20:26:00        3    AL    1
## 58166 425352 2012    12  25 2012-12-25 20:32:00        3    AL    1
## 58167 425358 2012    12  25 2012-12-25 22:24:00        3    AL    1
## 58168 425359 2012    12  25 2012-12-25 22:29:00        3    AL    1
## 58169 421430 2012    12  26 2012-12-26 13:58:00        3    NC   37
##       STATENUMBER FSCALE INJURIES FATALITIES LOSS CROPLOSS  SLAT   SLON
## 58164           0      1        0          0 0.00        0 32.05 -86.85
## 58165           0      0        0          0 0.00        0 32.15 -86.68
## 58166           0      2        0          0 0.00        0 32.20 -86.60
## 58167           0      2        2          0 0.00        0 31.73 -86.15
## 58168           0      1        0          0 0.00        0 32.35 -86.05
## 58169           0      1        0          0 0.01        0 34.79 -76.67
##        ELAT   ELON LENGTH WIDTH NS SN SG  F1 F2 F3 F4               id
## 58164 32.07 -86.82   2.03   300  1  1  1  85  0  0  0 2012-12-425349-1
## 58165 32.16 -86.67   0.62    75  1  1  1  85  0  0  0 2012-12-425351-1
## 58166 32.31 -86.51   9.55   900  1  1  1  85  0  0  0 2012-12-425352-1
## 58167 31.88 -85.96  15.70   600  1  1  1 109  0  0  0 2012-12-425358-1
## 58168 32.36 -86.04   0.72   100  1  1  1 101  0  0  0 2012-12-425359-1
## 58169 34.80 -76.67   0.91   100  1  1  1  31  0  0  0 2012-12-421430-1

Function to summarize counts of unique number of tornadoes by year and month.

count_unique_tornadoes <- function(in_df, monthly_stats = TRUE) {

    require(plyr)

    if (monthly_stats) {
        # some months dont have data; assign NAs to those months
        mon_totals <- expand.grid(MONTH = seq(1, 12), stringsAsFactors = FALSE)

        # number of unique tornadoes per month
        mon_torn <- ddply(.data = in_df, .variables = .(MONTH), .fun = function(x_df) length(unique(x_df$id)), 
            .drop = FALSE)

        mon_totals <- merge(mon_totals, mon_torn, by = "MONTH", all = TRUE)

        # output matrix
        out_mat <- c(nrow(in_df), length(unique(in_df$id)), mon_totals$V1)
        out_mat <- matrix(out_mat, nrow = 1)
        colnames(out_mat) <- c("N_total", "N_unique", month.abb)
    } else {
        # output matrix
        out_mat <- c(nrow(in_df), length(unique(in_df$id)))
        out_mat <- matrix(out_mat, nrow = 1)
        colnames(out_mat) <- c("N_total", "N_unique")
    }

    return(out_mat)
}

Reproduce stats on counts of tornadoes produced by Verbout et al in Figure 1 of their paper. Counts of tornadoes by F-scale.

events_all <- ddply(.data = torn, .variables = .(YEAR), .fun = count_unique_tornadoes, 
    monthly_stats = FALSE)

# F1 or greater events
events_f1 <- ddply(.data = subset(torn, !(FSCALE %in% c("-9", "0"))), .variables = .(YEAR), 
    .fun = count_unique_tornadoes, monthly_stats = FALSE)
head(events_f1)
##   YEAR N_total N_unique
## 1 1950     187      182
## 2 1951     197      196
## 3 1952     210      204
## 4 1953     357      337
## 5 1954     457      439
## 6 1955     415      402
# F2 or greater events
events_f2 <- ddply(.data = subset(torn, !(FSCALE %in% c("-9", "0", "1"))), .variables = .(YEAR), 
    .fun = count_unique_tornadoes, monthly_stats = FALSE)
head(events_f2)
##   YEAR N_total N_unique
## 1 1950     104       99
## 2 1951     109      108
## 3 1952     130      124
## 4 1953     212      193
## 5 1954     251      233
## 6 1955     215      202
# F3 or greater events
events_f3 <- ddply(.data = subset(torn, !(FSCALE %in% c("-9", "0", "1", "2"))), 
    .variables = .(YEAR), .fun = count_unique_tornadoes, monthly_stats = FALSE)
head(events_f3)
##   YEAR N_total N_unique
## 1 1950      33       31
## 2 1951      27       27
## 3 1952      58       52
## 4 1953      70       59
## 5 1954      60       46
## 6 1955      49       39
# F4 or greater events
events_f4 <- ddply(.data = subset(torn, !(FSCALE %in% c("-9", "0", "1", "2", 
    "3"))), .variables = .(YEAR), .fun = count_unique_tornadoes, monthly_stats = FALSE)
head(events_f4)
##   YEAR N_total N_unique
## 1 1950       7        7
## 2 1951       5        5
## 3 1952      20       18
## 4 1953      27       22
## 5 1954      11        7
## 6 1955      18       10
event_stats <- cbind(events_all, events_f1, events_f2, events_f3, events_f4)
event_stats <- event_stats[, c(1, 3, 6, 9, 12, 15)]
colnames(event_stats) <- c("YEAR", "events_all", "F1+", "F2+", "F3+", "F4+")

str(event_stats)
## 'data.frame':    63 obs. of  6 variables:
##  $ YEAR      : int  1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 ...
##  $ events_all: int  201 260 240 422 550 593 504 858 564 604 ...
##  $ F1+       : int  182 196 204 337 439 402 368 600 383 415 ...
##  $ F2+       : int  99 108 124 193 233 202 198 331 185 193 ...
##  $ F3+       : int  31 27 52 59 46 39 51 98 40 41 ...
##  $ F4+       : int  7 5 18 22 7 10 13 26 5 7 ...
event_stats
##    YEAR events_all F1+ F2+ F3+ F4+
## 1  1950        201 182  99  31   7
## 2  1951        260 196 108  27   5
## 3  1952        240 204 124  52  18
## 4  1953        422 337 193  59  22
## 5  1954        550 439 233  46   7
## 6  1955        593 402 202  39  10
## 7  1956        504 368 198  51  13
## 8  1957        858 600 331  98  26
## 9  1958        564 383 185  40   5
## 10 1959        604 415 193  41   7
## 11 1960        616 451 221  48   7
## 12 1961        697 520 298  72   6
## 13 1962        657 442 201  39   6
## 14 1963        463 343 181  35   5
## 15 1964        704 508 266  51  12
## 16 1965        897 604 346  98  31
## 17 1966        585 374 176  31   7
## 18 1967        926 577 313  72  17
## 19 1968        657 451 214  51  12
## 20 1969        608 396 193  54   7
## 21 1970        653 461 234  44   9
## 22 1971        889 697 320  82  11
## 23 1972        741 567 222  41   4
## 24 1973       1102 883 386  85  14
## 25 1974        945 720 339 131  36
## 26 1975        919 609 236  45  11
## 27 1976        834 572 231  63  16
## 28 1977        852 514 182  47  10
## 29 1978        789 353 128  24   5
## 30 1979        855 468 150  34   6
## 31 1980        866 597 199  37   5
## 32 1981        782 499 179  30   7
## 33 1982       1047 673 252  65   7
## 34 1983        931 581 209  62   4
## 35 1984        907 534 182  56  15
## 36 1985        684 376 117  38   9
## 37 1986        765 411 140  24   3
## 38 1987        656 316  77  15   3
## 39 1988        702 419 108  27   3
## 40 1989        856 487 123  21  11
## 41 1990       1133 596 211  56  15
## 42 1991       1132 444 149  46   7
## 43 1992       1297 599 186  57  14
## 44 1993       1173 440 116  36   6
## 45 1994       1082 388 116  35   5
## 46 1995       1236 414 129  31  11
## 47 1996       1173 430 117  23   3
## 48 1997       1148 405 124  39  10
## 49 1998       1424 541 159  43   8
## 50 1999       1339 509 186  64  13
## 51 2000       1075 352  85  23   3
## 52 2001       1215 404 126  29   6
## 53 2002        933 311  96  31   5
## 54 2003       1374 483 128  35   8
## 55 2004       1817 601 131  28   5
## 56 2005       1263 448 105  21   1
## 57 2006       1103 417 125  32   2
## 58 2007       1098 424 125  32   5
## 59 2008       1692 708 210  59  10
## 60 2009       1156 454 106  22   2
## 61 2010       1280 511 170  45  13
## 62 2011       1692 897 280  84  23
## 63 2012        940 364 123  28   4