
How to Use

Marcel Heinz edited this page Mar 11, 2019 · 5 revisions
  • Required technology:

  • How to reproduce results:

    1. The file src/data/init.py serves as the core configuration. Enter the depth level and the root categories there.
    2. Run src/mine/pipeline.py. This creates an annotated dictionary keyed by article title. Most of the data is mined from DBpedia.
    3. Run src/check/seed.py to annotate whether each article is a seed.
    4. src/classify/decision_tree.py configures the decision tree classifier.
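
The two "Run" steps above can be chained from one small script. The following is only a sketch: it assumes the scripts are started from the repository root with the same Python interpreter and need no command-line arguments, none of which is confirmed by this page. The helper names `build_commands` and `run_steps` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Scripts from steps 2 and 3, in the order given above.
# Assumption: paths are relative to the repository root.
RUN_STEPS = [
    "src/mine/pipeline.py",
    "src/check/seed.py",
]


def build_commands(python=sys.executable):
    """Return one command (as an argument list) per script, in order."""
    return [[python, script] for script in RUN_STEPS]


def run_steps(python=sys.executable):
    """Run each script in turn and stop at the first one that fails."""
    for command in build_commands(python):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling `run_steps()` from the repository root would then execute the mining step followed by the seed check. Edit src/data/init.py first, as described in step 1.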

Be careful when inspecting the other scripts. Many of them explore indicators directly, in an active-learning manner.

Having the titles as keys of the article dictionary allows convenient querying in the Python Console of PyCharm. For example, to get all articles with 'language' as the retrieved hypernym:

from data import load_articledict
ad = load_articledict()
# Titles of all articles whose "COPHypernym" annotation contains "language"
[a for a in ad if "COPHypernym" in ad[a] and "language" in ad[a]["COPHypernym"]]
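
To try this kind of query without running the pipeline first, the lookup can be wrapped in a small function and exercised on a hand-made dictionary. This is a sketch under assumptions: the helper `articles_with_hypernym` is hypothetical, the sample entries are invented for illustration, and the `"COPHypernym"` value is assumed to be a list of strings (the real annotation format may differ).

```python
def articles_with_hypernym(articledict, hypernym, key="COPHypernym"):
    """Return the titles whose annotation under `key` contains `hypernym`."""
    return [
        title
        for title, annotations in articledict.items()
        # Articles without the annotation are skipped via the empty default.
        if hypernym in annotations.get(key, [])
    ]


# Invented sample data that stands in for the result of load_articledict().
sample = {
    "Python (programming language)": {"COPHypernym": ["language"]},
    "Guido van Rossum": {"COPHypernym": ["programmer"]},
    "Haskell (programming language)": {},
}

matches = articles_with_hypernym(sample, "language")
print(matches)
```

With the real data, the same call would take the dictionary returned by `load_articledict()` in place of `sample`.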
