Getting started Wiktionary parser
Andrew Krizhanovsky edited this page Jul 8, 2017 · 16 revisions
This guide helps you convert a Wiktionary database into a machine-readable format. The English and Russian Wiktionaries are currently supported.
A step-by-step guide to launching your Wiktionary parser:
- Download the latest Wiktionary dump (...-pages-articles.xml.bz2, ...-pagelinks.sql.gz and ...-categorylinks.sql.gz)
- MySQL import ‒ import the Wiktionary database into a local MySQL database
- File wikt_parsed_empty_sql ‒ load the empty Wiktionary parsed database into MySQL
- Setup NetBeans for parsing
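
Before starting the MySQL import, it can help to confirm that all three dump files from the first step are in place. The sketch below is only an illustration, not part of the parser: the function names and the `enwiktionary-latest` prefix are assumptions, and the prefix should be replaced with whatever precedes the suffixes in your downloaded file names.

```python
from pathlib import Path

# Suffixes of the three dump files named in the first step of the guide.
DUMP_SUFFIXES = (
    "-pages-articles.xml.bz2",
    "-pagelinks.sql.gz",
    "-categorylinks.sql.gz",
)


def expected_dump_files(prefix):
    """Return the three dump file names for a given dump prefix."""
    return [prefix + suffix for suffix in DUMP_SUFFIXES]


def missing_dump_files(directory, prefix):
    """Return the expected dump files that are not present in `directory`."""
    base = Path(directory)
    return [
        name
        for name in expected_dump_files(prefix)
        if not (base / name).is_file()
    ]


if __name__ == "__main__":
    # "enwiktionary-latest" is a hypothetical prefix; use the one from your download.
    for name in missing_dump_files(".", "enwiktionary-latest"):
        print("missing:", name)
```

Running the script from the download directory prints one line per missing file and nothing when all three are present.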
Optional:
- MySQL Workbench ‒ how to create the empty SQL file for the Wiktionary parsed database
- SQLite ‒ convert the Wiktionary parsed database (MySQL) into an SQLite file
- JNLP and Java WebStart ‒ create the wiwordik-XX.jar and wiwordik-XX.jnlp files
- SQL examples
- Index wordlist index_native ‒ index wordlist for each language (tables index_native, index_de, index_fr, etc.)