Make your own Aard file
yogiks edited this page May 31, 2016 · 1 revision
The instructions below show how to generate your own Wiktionary Aard file from a Wikimedia XML dump, for use in Aard. Kannada Wiktionary is used as the example. I used Ubuntu 14.04 64-bit.
- Get the latest Kannada Wiktionary XML dump, which contains all the pages/articles of Kannada Wiktionary:
wget https://dumps.wikimedia.org/knwiktionary/latest/knwiktionary-latest-pages-articles.xml.bz2
- Install the prerequisites:
sudo apt-get install libicu-dev
sudo apt-get install build-essential
sudo apt-get install python-dev
sudo apt-get install python-virtualenv
sudo apt-get install libevent-dev libxml2-dev libxslt1-dev
- Create Python virtual environment:
virtualenv env-aard
- Activate it:
source env-aard/bin/activate
- Install Aard Tools into the activated environment (run pip without sudo, so that the environment's own pip is used):
pip install -e git+https://github.com/aarddict/tools.git#egg=aardtools
- Get Kannada Wiktionary site info:
aard-siteinfo kn.wiktionary.org > knwiktionary.json
- Build the mwlib article database:
mw-buildcdb --input knwiktionary-latest-pages-articles.xml.bz2 --output knwiktionary-latest.cdb
- Because a Kannada dictionary is being compiled, make sure your system has the Kannada (kn) locale:
sudo locale-gen kn
- Compile the Kannada dictionary from the Wiktionary article database:
aardc wiki knwiktionary-latest.cdb knwiktionary.json
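Before the Aard Tools steps, it can help to confirm that the commands they rely on are on the PATH of the activated environment and that a Kannada locale is present. The following is only a sketch: the helper names `missing_cmds` and `has_locale` are hypothetical and not part of Aard Tools, and the locale check simply assumes that `locale -a` lists generated locales with names starting with `kn`.

```shell
# Hypothetical helpers for illustration; they are not part of Aard Tools.

# Print the name of every given command that is not found on PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
    done
}

# Succeed if `locale -a` lists a locale whose name starts with the given prefix.
has_locale() {
    locale -a 2> /dev/null | grep -q "^$1"
}

# Example: report anything the build steps above would trip over.
missing=$(missing_cmds aard-siteinfo mw-buildcdb aardc)
[ -z "$missing" ] || echo "Not on PATH: $missing"
has_locale kn || echo "Kannada (kn) locale not found"
```

If a command is reported as missing, the virtual environment is probably not activated in the current shell.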
Open the resulting dictionary in Aard dictionary and verify that it has good metadata (description, license, source URL) and that the “View Online” action works. For more detailed instructions on using Aard Tools, have a look at the documentation on the Aard site. I referred to it when writing these instructions.
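If something looks off in the compiled dictionary, one cheap thing to rule out is a truncated or empty site info file, for example after a failed network request. The sketch below assumes that `knwiktionary.json` is plain JSON, as its extension suggests. The helper name `json_is_valid` and the `PYTHON` override variable are hypothetical; the check itself uses Python's standard `json.tool` module.

```shell
# Hypothetical helper: succeed if the given file parses as JSON.
# Uses the interpreter named in $PYTHON, or `python` by default.
json_is_valid() {
    "${PYTHON:-python}" -m json.tool "$1" > /dev/null 2>&1
}

# Example: only check the site info file if it exists in the current directory.
if [ -e knwiktionary.json ]; then
    json_is_valid knwiktionary.json || echo "knwiktionary.json is not valid JSON"
fi
```

This only confirms that the file is well-formed. Whether the description, license, and source URL are sensible still has to be checked in Aard itself.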