InfoboxProvenanceTracking

A tool for tracking changes to infobox values in Wikipedia articles.

USAGE

After compiling, the .jar can be run with the following parameters:

-earlier, -e
      Earliest timestamp (Date in yyyy-MM-dd) to extract
      Default: 2001-01-02
-help, -h
      Print help information and exit
-language, -lang
      Dump Language
      Default: en
-lastchange, -last
      Only last change to an existing triple will be saved
      Default: false
-later, -l
      Last timestamp (Date in yyyy-MM-dd) to extract
      Default: Current date
      This parameter can also be used to limit the number of loaded revisions.
      For example, if it is set to 2015-01-01, the program loads all revisions
      from 2001-01-02 up to, but not including, 2015-01-01.
      The Wikipedia API does not support trimming the lower bound.
-name, -a
      Name of the Article
-path
      Path to the directory containing the dump
-threads, -t
      Number of threads to run
      Default: 1
-threadsF, -tf
      Number of files processed in parallel
      Default: 1
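
The date window described for -earlier and -later can be read as a half-open interval. The following is a minimal sketch of that reading, not the tool's actual code: the class `RevisionWindow` and the method `inWindow` are hypothetical names, and treating the -earlier bound as inclusive is an assumption (only the exclusive -later bound is stated above).

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of InfoboxProvenanceTracking.
public class RevisionWindow {
    // Same date pattern as the one documented for -earlier and -later.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // Returns true if timestamp lies in [earlier, later).
    // Assumption: the lower bound is inclusive; the upper bound is exclusive.
    public static boolean inWindow(String timestamp, String earlier, String later) {
        LocalDate t = LocalDate.parse(timestamp, FORMAT);
        LocalDate from = LocalDate.parse(earlier, FORMAT);
        LocalDate until = LocalDate.parse(later, FORMAT);
        return !t.isBefore(from) && t.isBefore(until);
    }

    public static void main(String[] args) {
        // A revision from the last day of 2014 falls inside the window.
        System.out.println(inWindow("2014-12-31", "2001-01-02", "2015-01-01"));
        // A revision dated exactly on the -later value is excluded.
        System.out.println(inWindow("2015-01-01", "2001-01-02", "2015-01-01"));
    }
}
```

Under these assumptions, the first call prints true and the second prints false.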

e.g.:
-name miele -last
-path /src/test/resources/inputde -lang de -earlier 2014-10-09 -later 2016-12-30