The Tornado Project

The goal of this project is to bring consistency and transparency to the analysis of publicly available tornado data by carrying out that analysis in R. The idea is to replicate the results of relevant studies from the literature. Doing so both validates these studies and exposes any errors they contain. Moreover, one could build on the foundation laid by this project to explore new theories. The initial focus is on probabilistic and stochastic analysis of the tornado hazard, with a view to risk modeling.


Tornado data available from the NOAA Storm Prediction Center and other government agencies around the world has been the focus of many studies (e.g., see select literature). However, there are a number of issues with the culture of publishing only the end result of a scientific analysis:

  • The data itself is changing, both in quantity (additional data is added every season) and in quality (quality control measures appear to have been applied in the recent past, particularly to the data prior to the 1990s). Hence, it is not possible to reproduce the results of these studies exactly, or sometimes even approximately. Since the dataset grows every year, it makes more sense to revise the analyses annually as well.
  • With few exceptions (thanks to R user Prof. James Elsner and colleagues), the studies do not provide the code used in their analysis. Hence, it is not possible to build on top of existing work, and it is not easy to explore or test alternative theories.
  • During the course of this project, several literature studies were found to contain errors; in some cases the errors were serious enough to call the final results of the study into question.
  • Data prior to 1950 in the United States appears to be available only on microfilm. Hopefully, this effort will help make this data more widely available some day.


Immediate Next Steps

  • Replicate the analysis of Meyer et al. (2002)