
Create Mapping Statistics

Daniel Fleischhacker edited this page May 23, 2014 · 5 revisions


This guide describes how to generate the mapping statistics as displayed at


  1. Update the extraction framework to the newest version from GitHub
  2. Make sure the newest versions of all modules are compiled and installed locally (install-run)
  3. Download the most recent ontology from the mapping wiki
    1. cd core
    2. ../run download-ontology
    3. Commit new ontology version
  4. Download the most recent mappings from the mapping wiki
    1. cd ../core
    2. ../run download-mappings
    3. Commit new mappings
  5. Download the most recent Wikipedia dumps
    1. cd ../dump
    2. Choose one of the download.*.properties files based on the set of relevant Wikipedia language versions
    3. Adapt the download path in the chosen properties file
    4. ../run download config=download.*.properties
  6. Start the extraction, limited to the data required for the mapping statistics
    1. cd ../dump
    2. Adapt "base-dir" to the download directory
    3. Adapt the "source" parameter. Note that the default is .xml, NOT .xml.bz2, although it is stated differently
    4. ../run stats-extraction
  7. Start statistics extraction
    1. cd ../server
    2. Adapt the base dir in pom.xml for the launcher "stats" to the download directory used in the previous step
    3. ../run stats
  8. If you want to run the statistics server on a different system than the one on which you created the statistics, copy the mappingstats_* files from the folder server/main/src/statistics/ on the generation server to the same folder on the hosting server
  9. Start statistics server
    1. cd ../server
    2. Adapt the server URI in pom.xml for the launcher "server"
    3. If you want to prefer IPv4 on a machine which also supports IPv6, define the environment variable _JAVA_OPTIONS first: export _JAVA_OPTIONS=''
    4. ../run server
    5. The server is now available at the defined URI
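
The command-line parts of steps 3 to 7 and step 9 could be chained in a small shell script. The following is only a sketch under stated assumptions. It assumes it is started from the root of the extraction framework checkout. The helper names `run_in`, `stats_pipeline` and `start_server`, the switches `DRY_RUN` and `PREFER_IPV4`, and the properties file name passed as an argument are all hypothetical. The manual steps (committing, editing the properties files and pom.xml, copying the mappingstats_* files) are not covered. The IPv4 value is also an assumption: the option value is missing in step 9.3, and `java.net.preferIPv4Stack` is the standard JVM system property for preferring IPv4, so it may be what was meant.

```shell
#!/bin/sh
# Sketch only: chains the ../run calls from steps 3-7 and 9.
# Assumption: the current directory is the root of the extraction framework checkout.
# DRY_RUN=1 (the default) only prints the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: run "../run <args>" inside the given module directory.
run_in() {
    module="$1"
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "$module: ../run $*"
    else
        # Subshell, so the caller's working directory stays unchanged.
        (cd "$module" && ../run "$@")
    fi
}

# Steps 3-7. $1 is the chosen download.*.properties file (step 5.2).
# The chain stops at the first failing command.
stats_pipeline() {
    run_in core download-ontology &&
    run_in core download-mappings &&
    run_in dump download "config=$1" &&
    run_in dump stats-extraction &&
    run_in server stats
}

# Step 9. With PREFER_IPV4=1 the JVM is asked to prefer IPv4.
# Assumption: this is the value intended for _JAVA_OPTIONS in step 9.3.
start_server() {
    if [ "${PREFER_IPV4:-0}" = "1" ]; then
        _JAVA_OPTIONS='-Djava.net.preferIPv4Stack=true'
        export _JAVA_OPTIONS
    fi
    run_in server server
}
```

After sourcing the script, `stats_pipeline <your download properties file>` lists the commands it would run. Setting `DRY_RUN=0` before sourcing executes them instead. The script does not commit the downloaded ontology and mappings, so those steps still have to be done by hand between the calls if needed.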