Skip to content

t-kuculo/QuoteKG

Repository files navigation

This is the code for creating QuoteKG, a multilingual knowledge graph of quotations. QuoteKG includes nearly one million quotes in 55 languages, said by more than 69,000 people of public interest across a wide range of topics.

Use the WikiquoteDumper first to download Wikiquote dumps in any languages and convert them into JSON format.

Steps

Each step produces intermediate files that can be changed with alterations to the parameters.

  • Run WikiquoteDumper to get language specific json files containing all the quotes
  • Run the WikiquoteToWikidataMapCreator to get a mapping from Wikiquote to Wikidata IDs
  • To separate the json files into files representing people and their quotes run python preprocessing.py
  • To create the quotation corpus pickle file run python main.py
  • To create a mapping between Wikidata IDs and DBpedia/Wikipedia IDs, run sh create_same_as_all_wikis.sh
  • To create the knowledge graph triples, update the parameters "current_date" and "file" in kg_creation.py and then run python kg_creation.py
  • To get the F1 scores for the quote alignment run python evaluation.py

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published