Make your own Aard file

yogiks edited this page May 31, 2016 · 1 revision

Instructions

The instructions below show how to generate your own Wiktionary aard file from a Wikimedia XML dump for use in Aard. Kannada Wiktionary is used as the example. I used Ubuntu 14.04 64-bit.

  • Download the Kannada Wiktionary dump:
wget https://dumps.wikimedia.org/knwiktionary/latest/knwiktionary-latest-pages-articles.xml.bz2
  • Install the prerequisites:
sudo apt-get install libicu-dev

sudo apt-get install build-essential

sudo apt-get install python-dev

sudo apt-get install python-virtualenv

sudo apt-get install libevent-dev libxml2-dev libxslt1-dev
  • Create Python virtual environment:
virtualenv env-aard
  • Activate it:
source env-aard/bin/activate
  • Install Aard Tools (inside the activated virtual environment, so no sudo is needed; sudo would install outside the environment):
pip install -e git+https://github.com/aarddict/tools.git#egg=aardtools
  • Get Kannada Wiktionary site info:
aard-siteinfo kn.wiktionary.org > knwiktionary.json
  • Build mwlib article database:
mw-buildcdb --input knwiktionary-latest-pages-articles.xml.bz2 --output knwiktionary-latest.cdb
  • Because a Kannada Wiktionary is being compiled, make sure your system has the Kannada (kn) locale:
sudo locale-gen kn
  • Compile the Kannada dictionary from the Wiktionary article database:
aardc wiki knwiktionary-latest.cdb knwiktionary.json
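
Every file name in the steps above follows from the language code, so the same sequence can be sketched for another Wiktionary. The script below is only an illustrative dry run: it prints the commands and does not execute them. The helper names dump_name, dump_url, and print_steps are hypothetical and not part of Aard Tools. It assumes a simple language code such as kn and that the locale name matches that code, which may not hold for every language.

```shell
#!/bin/sh
# Dry-run sketch: print the build steps for a given Wiktionary language code.
# The helper names are hypothetical; nothing is downloaded or compiled here.

# Name of the dump file for language code $1.
dump_name() {
  echo "${1}wiktionary-latest-pages-articles.xml.bz2"
}

# Download URL of the latest dump for language code $1.
dump_url() {
  echo "https://dumps.wikimedia.org/${1}wiktionary/latest/$(dump_name "$1")"
}

# Print the commands from the steps above, one per line.
# Assumption: the tools are installed in an activated virtualenv;
# prefix the tool commands with sudo if they were installed system-wide.
print_steps() {
  code="$1"
  echo "wget $(dump_url "$code")"
  echo "aard-siteinfo ${code}.wiktionary.org > ${code}wiktionary.json"
  echo "mw-buildcdb --input $(dump_name "$code") --output ${code}wiktionary-latest.cdb"
  echo "sudo locale-gen ${code}"
  echo "aardc wiki ${code}wiktionary-latest.cdb ${code}wiktionary.json"
}

# Default to Kannada (kn) when no language code is given.
print_steps "${1:-kn}"
```

Saved under a name of your choice, for example aard-steps.sh, running it with `sh aard-steps.sh ta` would print the corresponding commands for Tamil Wiktionary, which you can review before running them by hand.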

Open the resulting dictionary in Aard and verify that it has good metadata (description, license, source URL) and that the “View Online” action works. For more detailed instructions on using Aard Tools, see the documentation on the Aard site, which I used as the reference for this page.
