Tirol, Austria — Obituary Notices

This is the source code of an ongoing investigation of obituary notices in Tirol, Austria.

I'm making the source code available for transparency and reproducibility. If you plan to use this yourself, please make sure you read and understand the LICENSE. This is a Citizen Science project. Consider this highly experimental prototype code.

The aim of this project is to collect, clean, and present this data in a way that is useful for data journalists and researchers, giving us another data vector for investigating and learning more about the COVID-19 pandemic.


Charts and Analysis Results:

Todesanzeigen Tirol pro Woche (obituary notices in Tirol per week)

Todesanzeigen Tirol pro Bezirk pro Woche (per district per week)

Todesanzeigen Tirol pro Gemeinde pro Woche in Landeck (per municipality per week, Landeck district)


Scraping

This step scrapes the raw HTML of the web pages hosting the obituary notices.

On each run, the crawlers create a directory of the form scrape/scrape_<crawler_name>_YYYYMMDD_HHMMSS and save each HTML page to a file named 1.html, 2.html, and so on. The crawlers default to scraping 10 pages; this can be customized with a command-line parameter:

./bin/ 500
./bin/ 800


Parsing

The aim of the parsing step is to transform the scraped HTML into consumable CSV data.


The parse script runs several sub-scripts to parse the different formats retrieved during the crawling step and transforms them into a common format of date,municipality,district,hash. The names are cheaply hashed to make them available for comparison in the final deduplication step.

Note that the crawlers create directory names dynamically, including the current date. To parse them, you'll have to pass the correct directory names as arguments to the individual scripts in ./bin/

The same applies to the deduplication script ./src/parse_deduplicate.js: in this file you'll need to adapt the filenames of the individual parsed CSV files and the desired output filename.
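Conceptually, the deduplication step merges the rows from the individual parsed CSV files and keeps one row per unique notice. The sketch below assumes the dedup key is the (date, hash) pair; the actual key and logic live in ./src/parse_deduplicate.js and may differ.

```javascript
// Keep the first occurrence of each (date, hash) pair; drop the rest.
// Rows are plain objects in the common date/municipality/district/hash format.
function deduplicate(rows) {
  const seen = new Set();
  const out = [];
  for (const row of rows) {
    const key = `${row.date}|${row.hash}`;
    if (!seen.has(key)) {
      seen.add(key);
      out.push(row);
    }
  }
  return out;
}
```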

Analysis & Results

The CSV produced by the parsing step (data/tirol_obituaries_deduped.csv) is the input for a Jupyter notebook used for further investigation and for producing the Vega/Altair charts (note that the charts are not rendered in the GitHub preview of the notebook).

The final result of the processed data can be found here:


Charts and further analysis:

Dev Help

If you're running into trouble with Altair not being able to save PNGs from the notebooks, try running:

npm install -g --force vega-lite vega-cli canvas
