mortchartgen is a tool for creating charts of mortality trends for different countries, age groups and causes of death, based on data from the WHO Mortality Database. It uses Pandas and matplotlib to generate the charts and stores the data in a MySQL database. A YAML configuration file specifies which charts to generate.
I use the tool to generate charts for the website Mortalitetsdiagram ("mortality charts"). Files for this site (excluding the SVG charts themselves) are included in the subdirectory `mortchart-site`. Currently, the generated charts are in Swedish. I am not affiliated with WHO, and they are not responsible for any interpretations of mortality trends based on charts generated from the tool. The tool is licensed under an ISC license.
It is assumed that you have a working Python setup, as well as access to a MySQL/MariaDB server, with a user privileged to create databases. The unzipped data files will require about 500 MB of disk space. The script
subprocess, and the main script,
`download.py [directory]` in order to download the data files and documentation from the WHO website and unzip them into
- Read the SQL file `setupdb.sql` into the MySQL client, e.g. `mysql --defaults-extra-file=tableimp.cnf < setupdb.sql`. This will create a database `Morticd` with two tables, `Pop` and `Deaths`, as well as a user `whomuser` with select rights granted on these tables, which is used for the SQL queries in the chart generator. You can use the provided file `tableimp.cnf` in this step and the next, as shown in the example, but then you have to adjust the relevant settings in the file (e.g. user, password, host and socket) in order for the database connection to work. For more information about the fields in the tables, consult the WHO documentation.
- Load the unzipped data files into the newly created tables. The file `pop` should be loaded into the table `Pop`, and the files with names starting with `Mort` should all be loaded into the table `Deaths`. The script `tableimp.py` loops through the data files and reads them into the tables using `mysqlimport`. You can call the script with `tableimp.py [directory]`, where `directory` is the download directory specified in step 1. The default configuration is to read the files locally from the client, and this has to be supported by the MySQL server. Otherwise, move the files into a location where the server can read them directly and remove the
`tablemod.py`. This stores tables of population and number of deaths (for the populations and cause-of-death groups specified in `chartgen.yaml`) in a SQLite database, `chartgen.db`. This speeds up the chart generation (see below) by avoiding repeated querying of the MySQL database with regular expressions. Some values in the dictionary
`chartgen.yaml`) may also have to be changed in order for the database connection to work. In particular, you should change `unix_socket` to suit your MySQL server.
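
The idea of caching pre-aggregated tables in SQLite can be sketched as follows. This is only an illustration using the standard library, not the code in `tablemod.py`: the table name `deaths_by_group`, the sample rows, the cause groups and both function names are hypothetical.

```python
import re
import sqlite3

# Hypothetical raw rows: (year, cause code, number of deaths).
RAW_DEATHS = [
    (2000, "I21", 120),
    (2000, "I25", 80),
    (2000, "C34", 60),
    (2001, "I21", 110),
    (2001, "C34", 65),
]

# Hypothetical cause groups, each defined by a regular expression on the code.
CAUSE_GROUPS = {
    "ihd": r"I2[0-5]",
    "lungcancer": r"C3[34]",
}


def build_cache(conn, raw_rows, cause_groups):
    """Aggregate deaths per (cause group, year) once and store the result."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deaths_by_group "
        "(cause TEXT, year INTEGER, deaths INTEGER, PRIMARY KEY (cause, year))"
    )
    totals = {}
    for cause, pattern in cause_groups.items():
        regex = re.compile(pattern)
        for year, code, deaths in raw_rows:
            # The regular expression is applied only here, at cache-build time.
            if regex.fullmatch(code):
                totals[(cause, year)] = totals.get((cause, year), 0) + deaths
    conn.executemany(
        "INSERT OR REPLACE INTO deaths_by_group VALUES (?, ?, ?)",
        [(cause, year, n) for (cause, year), n in totals.items()],
    )
    conn.commit()


def cached_deaths(conn, cause):
    """Read the pre-aggregated series for one cause group with a plain lookup."""
    rows = conn.execute(
        "SELECT year, deaths FROM deaths_by_group WHERE cause = ? ORDER BY year",
        (cause,),
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
build_cache(conn, RAW_DEATHS, CAUSE_GROUPS)
ihd = cached_deaths(conn, "ihd")
```

Once the cache is built, each chart only needs a simple keyed lookup, so the pattern matching cost is paid once rather than per chart.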
## Generate the charts
Call the function
`chartgen.py` in order to generate the charts. This function is automatically called if `chartgen.py` is invoked from the system shell. The charts are saved as SVG files in the subdirectory `mortchart-site/charts`. If you want to skip certain countries, age groups or causes of death, comment out the relevant lines in `chartgen.yaml`. However, the cause `all` cannot be excluded, because it is used to compute the percentage of total deaths for other causes.
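
To illustrate why the cause `all` has to be kept, here is a minimal sketch of the percentage calculation. The dictionary layout, the cause keys other than `all` and the function name `percent_of_total` are assumptions made for the example and are not taken from `chartgen.py`.

```python
def percent_of_total(deaths_by_cause, cause):
    """Return deaths from `cause` as a percentage of deaths from all causes."""
    if "all" not in deaths_by_cause:
        # Without the total there is no denominator for the percentage.
        raise KeyError("the cause 'all' is needed as the denominator")
    total = deaths_by_cause["all"]
    return 100.0 * deaths_by_cause[cause] / total


# Hypothetical death counts for one country, sex, age group and year.
deaths = {"all": 2000, "circ": 800, "cancer": 500}
circ_pct = percent_of_total(deaths, "circ")
```

If `all` were commented out of the configuration, the denominator would be missing for every other cause.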
`chartgen.py` will save the dataframes used to generate the charts as CSV files in the subdirectory `csv`, so that they can be further analysed in other programs.
## Special charts with R
The R script `specchartgen.r` demonstrates how the generated CSV files can be used. It contains the functions `agetrends.plot`, which generates charts showing secular trends for a given combination of sex, cause and an interval of 5-year age groups, `sexratio.trends.plot`, which generates charts showing secular trends for sex ratios for mortality rates/percentages, and `ctrisyear.plot`, which generates charts giving a comparison of mortality between countries for a given cause and year. It can generate scatterplots of female vs male mortality or bar charts for a single sex. The function
`ctrisyear.plot` to generate charts for all causes and age groups in `chartgen.yaml` and for all years in a given sequence and export these as SVG files in the subdirectory `mortchart-site/charts/ctriesyr`. The function `causedist.plot` generates charts of the age-specific distribution of causes of death for a given country, sex and year. By default, the list of causes is read from
`specchartgen.r` can be used to perform so-called Gompertzian analysis of mortality trends. By calling the function
`mortparams.py`, results with parameters can be plotted using the TeX facilities in matplotlib.
In addition to packages used by
`mortparams.py` imports rpy2 for communication with R. The model is fitted with Levenberg-Marquardt nonlinear least squares (using minpack.lm). If `lmortfunc.test` is called with `mortfunc = 'weibull'`, the mortality data is fitted to the two-parameter Weibull function instead of the Gompertz function (cf. Juckett and Rosenberg (1993)). It is also possible to fit survivorship curves, for the subpopulation that dies of a particular cause, instead of mortality curves, if `lmortfunc.test` is called with `type = 'surv'`. Fits of mortality curves corresponding to these survival curves (i.e. normalized to the fraction dying of the given cause) can be obtained by calling the function with `type = rate` (the default) and `normrate = TRUE`. For this normalization, life tables are constructed using LifeTables.
`pc = 'p'` or `pc = 'c'`, the analysis can be performed by period or by birth cohort; the latter, however, is only implemented for unnormalized mortality curves.
`mortparams.py` on an object returned by
`paramsplot` it is possible to plot observed data for a list of years versus the predictions made by the non-linear regressions.
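
As a rough illustration of what fitting a Gompertz curve involves, the sketch below estimates the two parameters of m(x) = a·exp(b·x) by ordinary least squares on the logarithm of the rates. This is a deliberate simplification and not the Levenberg-Marquardt fit performed in the R code; the parametrization, the function name `fit_gompertz_loglinear` and the synthetic data are assumptions made for the example.

```python
import math


def fit_gompertz_loglinear(ages, rates):
    """Fit m(x) = a * exp(b * x) by linear regression of log(m) on x.

    Returns the tuple (a, b). All rates must be strictly positive.
    """
    logs = [math.log(r) for r in rates]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    b = sxy / sxx                      # slope: rate of increase with age
    a = math.exp(mean_y - b * mean_x)  # intercept, transformed back from logs
    return a, b


# Synthetic, noise-free rates for 5-year age groups starting at 40, ..., 85.
ages = list(range(40, 90, 5))
rates = [1e-4 * math.exp(0.09 * x) for x in ages]
a, b = fit_gompertz_loglinear(ages, rates)
```

On real data the log-linear and nonlinear least-squares estimates generally differ, because the two approaches weight the age groups differently.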
## Generate the index page and documentation source
`mortchart-site/indexgen.py` you can generate
`mortchart_site` based on the settings in `chartgen.yaml` and the templates `mortchart-site/jinjatempl` (which use Jinja2). The first file contains a bare form, which you can use to search among the charts in a web browser, and the second file can be used to generate the site documentation in PDF or HTML format.
Run `make pdfbib` in `mortchart-site` in order to generate PDF documentation from the Markdown source. This requires a LaTeX distribution as well as Pandoc (in order to convert Markdown). The HTML documentation is generated automatically when the site is built (see below).
## Generate the Mortchart site
The full site is now generated using Hakyll, a static site generator which is tightly integrated with Pandoc and uses the Haskell compiler GHC. To generate the site for the first time, run `make buildinit` in the directory `mortchart-site` (it will be generated in `mortchart-site/_site`). The program assumes that the charts (both those made by `chartgen.py` and those made by `specchartgen.r`) have been generated. To update the site, run `make build`; if you modify `site.hs`, update with