Replicate NOAA SPC Stats

Gopi Goteti edited this page Jan 21, 2014 · 1 revision

The Storm Prediction Center published stats on number of tornadoes by year and month and also the number of fatalities and injuries. See SPC's stats here and here. The below code successfully reproduces most of the numbers from SPC. The discrepancy could be due to the differences in Tornado data used (due to regular quality control the latest data used here may not be identical to the one used by SPC).

Read raw data add column names similar to those used by Elsner et al 2013 and consistent with documentation.

library(plyr)

torn <- read.csv("data/1950-2012_torn.csv", header = FALSE, sep = ",", as.is = TRUE)

colnames(torn) <- c("OM", "YEAR", "MONTH", "DAY", "DATE", "TIME", "TIMEZONE", 
    "STATE", "FIPS", "STATENUMBER", "FSCALE", "INJURIES", "FATALITIES", "LOSS", 
    "CROPLOSS", "SLAT", "SLON", "ELAT", "ELON", "LENGTH", "WIDTH", "NS", "SN", 
    "SG", "F1", "F2", "F3", "F4")

Tornadoes spanning multiple counties are listed separately for each county. Thus, a single tornado could appear multiple times. Identify unique tornadoes using YEAR, OM and NS. Check for uniqueness; especially, tornadoes spanning multiple years (i.e, those which begin on 12/31 and end on 1/1); need to check only those with NS > 1.

dec31 <- subset(torn, MONTH == 12 & DAY == 31 & NS != 1)
jan01 <- subset(torn, MONTH == 1 & DAY == 1 & NS != 1)
if (nrow(dec31) > 0 & nrow(jan01) > 0) {
    stop("check! unique id assignment may not be accurate!")
}
torn$id <- paste(torn$YEAR, torn$MONTH, torn$OM, torn$NS, sep = "-")

Formatted data for use in subsequent analysis.

str(torn)
## 'data.frame':    58169 obs. of  29 variables:
##  $ OM         : int  1 1 1 2 3 4 5 6 7 8 ...
##  $ YEAR       : int  1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 ...
##  $ MONTH      : int  1 1 1 1 1 1 1 1 1 2 ...
##  $ DAY        : int  3 3 3 3 3 13 25 25 26 11 ...
##  $ DATE       : chr  "1950-01-03" "1950-01-03" "1950-01-03" "1950-01-03" ...
##  $ TIME       : chr  "11:00:00" "11:00:00" "11:10:00" "11:55:00" ...
##  $ TIMEZONE   : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ STATE      : chr  "MO" "MO" "IL" "IL" ...
##  $ FIPS       : int  29 29 17 17 39 5 29 17 48 48 ...
##  $ STATENUMBER: int  1 1 1 2 1 1 2 3 1 2 ...
##  $ FSCALE     : int  3 3 3 3 1 3 2 2 2 2 ...
##  $ INJURIES   : int  3 3 0 3 1 1 5 0 2 0 ...
##  $ FATALITIES : int  0 0 0 0 0 1 0 0 0 0 ...
##  $ LOSS       : num  6 6 5 5 4 3 5 5 0 4 ...
##  $ CROPLOSS   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ SLAT       : num  38.8 38.8 38.8 39.1 40.9 ...
##  $ SLON       : num  -90.2 -90.2 -90.1 -89.3 -84.6 ...
##  $ ELAT       : num  38.8 38.8 38.8 39.1 0 ...
##  $ ELON       : num  -90 -90.1 -90 -89.2 0 ...
##  $ LENGTH     : num  9.5 6.2 3.3 3.6 0.1 0.6 2.3 0.1 4.7 9.9 ...
##  $ WIDTH      : int  150 150 100 130 10 17 300 100 133 400 ...
##  $ NS         : int  2 2 2 1 1 1 1 1 1 1 ...
##  $ SN         : int  0 1 1 1 1 1 1 1 1 1 ...
##  $ SG         : int  1 2 2 1 1 1 1 1 1 1 ...
##  $ F1         : int  0 189 119 135 161 113 93 91 47 39 ...
##  $ F2         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ F3         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ F4         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ id         : chr  "1950-1-1-2" "1950-1-1-2" "1950-1-1-2" "1950-1-2-1" ...
head(torn)
##   OM YEAR MONTH DAY       DATE     TIME TIMEZONE STATE FIPS STATENUMBER
## 1  1 1950     1   3 1950-01-03 11:00:00        3    MO   29           1
## 2  1 1950     1   3 1950-01-03 11:00:00        3    MO   29           1
## 3  1 1950     1   3 1950-01-03 11:10:00        3    IL   17           1
## 4  2 1950     1   3 1950-01-03 11:55:00        3    IL   17           2
## 5  3 1950     1   3 1950-01-03 16:00:00        3    OH   39           1
## 6  4 1950     1  13 1950-01-13 05:25:00        3    AR    5           1
##   FSCALE INJURIES FATALITIES LOSS CROPLOSS  SLAT   SLON  ELAT   ELON
## 1      3        3          0    6        0 38.77 -90.22 38.83 -90.03
## 2      3        3          0    6        0 38.77 -90.22 38.82 -90.12
## 3      3        0          0    5        0 38.82 -90.12 38.83 -90.03
## 4      3        3          0    5        0 39.10 -89.30 39.12 -89.23
## 5      1        1          0    4        0 40.88 -84.58  0.00   0.00
## 6      3        1          1    3        0 34.40 -94.37  0.00   0.00
##   LENGTH WIDTH NS SN SG  F1 F2 F3 F4         id
## 1    9.5   150  2  0  1   0  0  0  0 1950-1-1-2
## 2    6.2   150  2  1  2 189  0  0  0 1950-1-1-2
## 3    3.3   100  2  1  2 119  0  0  0 1950-1-1-2
## 4    3.6   130  1  1  1 135  0  0  0 1950-1-2-1
## 5    0.1    10  1  1  1 161  0  0  0 1950-1-3-1
## 6    0.6    17  1  1  1 113  0  0  0 1950-1-4-1
tail(torn)
##           OM YEAR MONTH DAY       DATE     TIME TIMEZONE STATE FIPS
## 58164 425349 2012    12  25 2012-12-25 20:09:00        3    AL    1
## 58165 425351 2012    12  25 2012-12-25 20:26:00        3    AL    1
## 58166 425352 2012    12  25 2012-12-25 20:32:00        3    AL    1
## 58167 425358 2012    12  25 2012-12-25 22:24:00        3    AL    1
## 58168 425359 2012    12  25 2012-12-25 22:29:00        3    AL    1
## 58169 421430 2012    12  26 2012-12-26 13:58:00        3    NC   37
##       STATENUMBER FSCALE INJURIES FATALITIES LOSS CROPLOSS  SLAT   SLON
## 58164           0      1        0          0 0.00        0 32.05 -86.85
## 58165           0      0        0          0 0.00        0 32.15 -86.68
## 58166           0      2        0          0 0.00        0 32.20 -86.60
## 58167           0      2        2          0 0.00        0 31.73 -86.15
## 58168           0      1        0          0 0.00        0 32.35 -86.05
## 58169           0      1        0          0 0.01        0 34.79 -76.67
##        ELAT   ELON LENGTH WIDTH NS SN SG  F1 F2 F3 F4               id
## 58164 32.07 -86.82   2.03   300  1  1  1  85  0  0  0 2012-12-425349-1
## 58165 32.16 -86.67   0.62    75  1  1  1  85  0  0  0 2012-12-425351-1
## 58166 32.31 -86.51   9.55   900  1  1  1  85  0  0  0 2012-12-425352-1
## 58167 31.88 -85.96  15.70   600  1  1  1 109  0  0  0 2012-12-425358-1
## 58168 32.36 -86.04   0.72   100  1  1  1 101  0  0  0 2012-12-425359-1
## 58169 34.80 -76.67   0.91   100  1  1  1  31  0  0  0 2012-12-421430-1

Function to summarize counts of unique number of tornadoes by year and month.

count_unique_tornadoes <- function(in_df, monthly_stats = TRUE) {

    require(plyr)

    if (monthly_stats) {
        # some months dont have data; assign NAs to those months
        mon_totals <- expand.grid(MONTH = seq(1, 12), stringsAsFactors = FALSE)

        # number of unique tornadoes per month
        mon_torn <- ddply(.data = in_df, .variables = .(MONTH), .fun = function(x_df) length(unique(x_df$id)), 
            .drop = FALSE)

        mon_totals <- merge(mon_totals, mon_torn, by = "MONTH", all = TRUE)

        # output matrix
        out_mat <- c(nrow(in_df), length(unique(in_df$id)), mon_totals$V1)
        out_mat <- matrix(out_mat, nrow = 1)
        colnames(out_mat) <- c("N_total", "N_unique", month.abb)
    } else {
        # output matrix
        out_mat <- c(nrow(in_df), length(unique(in_df$id)))
        out_mat <- matrix(out_mat, nrow = 1)
        colnames(out_mat) <- c("N_total", "N_unique")
    }

    return(out_mat)
}

Reproduce stats on number of unique events produced by SPC.

event_stats <- ddply(.data = torn, .variables = .(YEAR), .fun = count_unique_tornadoes)

str(event_stats)
## 'data.frame':    63 obs. of  15 variables:
##  $ YEAR    : int  1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 ...
##  $ N_total : int  206 261 246 443 570 606 518 868 576 614 ...
##  $ N_unique: int  201 260 240 422 550 593 504 858 564 604 ...
##  $ Jan     : int  7 2 12 14 2 3 2 17 11 16 ...
##  $ Feb     : int  20 10 27 16 17 4 47 5 20 20 ...
##  $ Mar     : int  21 6 43 40 62 43 31 38 15 43 ...
##  $ Apr     : int  15 26 37 47 113 99 85 216 76 30 ...
##  $ May     : int  61 57 34 94 101 148 79 228 68 226 ...
##  $ Jun     : int  28 76 34 111 107 153 65 147 128 73 ...
##  $ Jul     : int  23 23 27 32 45 49 92 55 121 63 ...
##  $ Aug     : int  13 27 16 24 49 33 42 20 46 38 ...
##  $ Sep     : int  3 9 1 5 21 15 16 17 24 58 ...
##  $ Oct     : int  2 2 NA 6 14 23 29 18 9 24 ...
##  $ Nov     : int  4 12 6 12 2 20 7 59 45 11 ...
##  $ Dec     : int  4 10 3 21 17 3 9 38 1 2 ...
event_stats
##    YEAR N_total N_unique Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1  1950     206      201   7  20  21  15  61  28  23  13   3   2   4   4
## 2  1951     261      260   2  10   6  26  57  76  23  27   9   2  12  10
## 3  1952     246      240  12  27  43  37  34  34  27  16   1  NA   6   3
## 4  1953     443      422  14  16  40  47  94 111  32  24   5   6  12  21
## 5  1954     570      550   2  17  62 113 101 107  45  49  21  14   2  17
## 6  1955     606      593   3   4  43  99 148 153  49  33  15  23  20   3
## 7  1956     518      504   2  47  31  85  79  65  92  42  16  29   7   9
## 8  1957     868      858  17   5  38 216 228 147  55  20  17  18  59  38
## 9  1958     576      564  11  20  15  76  68 128 121  46  24   9  45   1
## 10 1959     614      604  16  20  43  30 226  73  63  38  58  24  11   2
## 11 1960     619      616   9  28  28  70 201 125  42  48  21  18  25   1
## 12 1961     721      697   1  31 124  74 137 107  77  27  53  14  36  16
## 13 1962     667      657  12  25  37  41 200 171  78  51  24  11   5   2
## 14 1963     476      463  15   5  49  84  71  90  62  26  33  13  15  NA
## 15 1964     719      704  14   2  36 157 134 137  63  79  25  22  17  18
## 16 1965     918      897  21  32  34 123 273 147  85  61  64  16  34   7
## 17 1966     593      585   1  28  12  80  98 126 100  58  22  29  20  11
## 18 1967     939      926  39   8  42 149 116 210  90  28 139  36   8  61
## 19 1968     668      657   5   7  28 102 142 136  56  66  25  14  44  32
## 20 1969     617      608   3   5   8  68 145 137  98  70  20  26   5  23
## 21 1970     671      653   9  16  25 117  88 134  81  55  54  50  10  14
## 22 1971     904      889  19  83  40  75 166 199 100  50  47  38  16  56
## 23 1972     747      741  33   7  69  96 140 114 115  59  49  34  17   8
## 24 1973    1115     1102  33  10  80 150 250 224  80  51  69  25  81  49
## 25 1974     974      945  24  23  36 267 144 194  59 107  25  45  13   8
## 26 1975     924      919  52  45  84 108 188 196  79  60  34  12  39  22
## 27 1976     848      834  12  36 180 113 155 169  84  38  35  11  NA   1
## 28 1977     861      852   5  17  64  88 228 132  99  82  65  25  24  23
## 29 1978     802      789  23   7  17 107 213 148 143  65  20   7   9  30
## 30 1979     865      855  16   4  53 123 112 150 132 126  69  47  21   2
## 31 1980     876      866   5  11  41 137 203 217  95  73  37  43   3   1
## 32 1981     786      782   2  25  33  84 187 223  98  64  26  32   7   1
## 33 1982    1066     1047  18   3  60 150 329 196  95  34  38   9  19  96
## 34 1983     939      931  13  41  71  65 249 178  99  76  19  13  49  58
## 35 1984     925      907   1  27  73 176 169 242  72  47  17  49  30   4
## 36 1985     704      684   2   7  38 134 182  82  51 108  40  18  19   3
## 37 1986     779      765  NA  30  76  84 173 134  88  67  65  26  17   5
## 38 1987     662      656   6  19  38  20 126 132 163  63  19   1  55  14
## 39 1988     704      702  17   4  28  58 132  63 103  61  76  19 121  20
## 40 1989     872      856  14  18  43  82 231 252  59  36  31  30  57   3
## 41 1990    1151     1133  11  57  86 108 243 329 106  60  45  35  18  35
## 42 1991    1144     1132  29  11 157 204 335 216  64  46  26  21  20   3
## 43 1992    1312     1297  15  29  55  53 137 399 213 115  81  34 146  20
## 44 1993    1186     1173  17  34  48  85 177 313 242 112  65  55  19   6
## 45 1994    1094     1082  13   9  58 205 161 234 155 120  30  51  42   4
## 46 1995    1253     1236  36   7  49 130 393 216 162  53  19  74  79  18
## 47 1996    1181     1173  35  14  71 177 235 128 202  72 101  68  55  15
## 48 1997    1154     1148  50  23 102 114 225 193 188  84  32 100  25  12
## 49 1998    1440     1424  47  72  72 182 310 376  82  61 104  86  26   6
## 50 1999    1364     1339 212  22  56 176 310 289 100  79  56  17   7  15
## 51 2000    1079     1075  16  56 103 136 241 136 148  52  47  64  50  26
## 52 2001    1232     1215   5  31  33 135 240 249 120  69  84 117 110  22
## 53 2002     957      933   3   2  47 116 204  97  68  86  61  58  94  97
## 54 2003    1415     1374  NA  18  43 156 542 292 167  44  32  26  53   1
## 55 2004    1842     1817   3   9  50 125 509 268 124 179 295  79 150  26
## 56 2005    1267     1263  33  10  62 132 122 317 138 123 133  18 149  26
## 57 2006    1142     1103  47  12 147 244 139 120  70  80  84  76  42  42
## 58 2007    1117     1098  21  52 170 167 252 128  69  75  52  86   7  19
## 59 2008    1739     1692  84 147 129 189 462 292  95 101 111  21  15  46
## 60 2009    1182     1156   6  36 115 226 201 270 118  60   8  65   3  48
## 61 2010    1315     1280  30   1  33 139 305 321 146  55  57 108  53  32
## 62 2011    1777     1692  16  63  75 758 326 160 101  57  51  23  47  15
## 63 2012     957      940  79  57 155 206 121 111  37  38  39  37   7  53

Reproduce stats on fatalities and injuries produced by SPC. Note that Tornado segments could each have different fatalities and injuries.The data row corresp to the overall event has the max fatalities/injuries.

torn_fat <- aggregate(cbind(FATALITIES, INJURIES) ~ id, data = torn, FUN = max)
torn_fat$YEAR <- as.numeric(substr(torn_fat$id, 1, 4))

# aggregate by year and month
fat_stats <- aggregate(cbind(FATALITIES, INJURIES) ~ YEAR, data = torn_fat, 
    FUN = sum)


str(fat_stats)
## 'data.frame':    63 obs. of  3 variables:
##  $ YEAR      : num  1950 1951 1952 1953 1954 ...
##  $ FATALITIES: int  70 34 230 523 36 129 81 193 67 58 ...
##  $ INJURIES  : int  659 524 1915 5131 715 926 1308 1976 535 734 ...
fat_stats
##    YEAR FATALITIES INJURIES
## 1  1950         70      659
## 2  1951         34      524
## 3  1952        230     1915
## 4  1953        523     5131
## 5  1954         36      715
## 6  1955        129      926
## 7  1956         81     1308
## 8  1957        193     1976
## 9  1958         67      535
## 10 1959         58      734
## 11 1960         46      737
## 12 1961         52     1087
## 13 1962         30      551
## 14 1963         31      538
## 15 1964         73     1143
## 16 1965        301     5197
## 17 1966         98     2030
## 18 1967        114     2144
## 19 1968        131     2522
## 20 1969         66     1311
## 21 1970         73     1355
## 22 1971        159     2718
## 23 1972         27      976
## 24 1973         89     2396
## 25 1974        366     6824
## 26 1975         60     1457
## 27 1976         44     1195
## 28 1977         43      771
## 29 1978         53      919
## 30 1979         84     3014
## 31 1980         28     1157
## 32 1981         24      798
## 33 1982         64     1276
## 34 1983         34      756
## 35 1984        122     2499
## 36 1985         94     1299
## 37 1986         15      536
## 38 1987         59     1018
## 39 1988         32      688
## 40 1989         50     1270
## 41 1990         53     1177
## 42 1991         39      864
## 43 1992         39     1323
## 44 1993         33      948
## 45 1994         69     1074
## 46 1995         30      928
## 47 1996         26      703
## 48 1997         68     1033
## 49 1998        130     1868
## 50 1999         94     1838
## 51 2000         41      829
## 52 2001         40      743
## 53 2002         55      966
## 54 2003         54     1088
## 55 2004         35      395
## 56 2005         38      537
## 57 2006         67      988
## 58 2007         81      617
## 59 2008        126     1703
## 60 2009         22      350
## 61 2010         45      701
## 62 2011        553     5483
## 63 2012         70      822