Dict
MediaWiki
StarDict
TEI
Wiktionary
nginx
t
tools
www
README
Wiktionary.pm
fetch_ru_wiktionary_page.pl
generate_ru-any_templates.pl
get_article.pl
get_list.pl
make_test.pl
mw_parser_test.pl
parse_article.pl
process_any-ru.pl
process_ru-any.pl
ru_export_lang_names.pl

README

This is the http://wiktionary-export.nataraj.su/ dictionary conversion
software. It reads Wiktionary data through the MediaWiki API and
converts it into the TEI and StarDict formats.
Currently the software focuses entirely on Russian ('ru') language
conversions. For more information see
http://wiktionary-export.nataraj.su/

INSTALL

The software assumes a modern version of Perl is installed and
requires the MediaWiki::API Perl library, which in turn requires
LWP::UserAgent, URI::Escape, JSON, Encode and Carp.
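
One way to confirm that the modules listed above are available is to
ask perl to load each of them. The following is only a sketch: the
function name check_perl_modules is made up for this example and is not
part of the software.

```shell
# Hypothetical helper (not shipped with the software): try to load each
# named Perl module and report which ones are missing.
check_perl_modules() {
  missing=0
  for module in "$@"; do
    # "perl -MFoo -e 1" exits non-zero when module Foo cannot be loaded.
    if perl -M"$module" -e 1 2>/dev/null; then
      echo "ok       $module"
    else
      echo "MISSING  $module"
      missing=1
    fi
  done
  return "$missing"
}

# Example call, using the module names from the paragraph above:
#   check_perl_modules MediaWiki::API LWP::UserAgent URI::Escape JSON Encode Carp
```

Any module reported as MISSING can then be installed from CPAN or from
the operating system's package manager.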

The test script ./make_test.pl runs some of the software's internal tests.

The software makes extensive use of language codes. These are
the same language codes as used by http://www.wiktionary.org/ and
also by xml:lang tags in XML. Some scripts are run on a per-language
basis, as in:

./process_any-ru.pl en 

whereas

./process_ru-any.pl

takes no language code, because Wiktionary keeps all translations for
the base language in one article.
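
To convert several source languages, the per-language script can be
called once per code, one call after another. The sketch below assumes
the script exits with a non-zero status on failure; the function name
process_languages and the PROCESS_SCRIPT variable are made up for this
example, and any codes other than 'en' are placeholders.

```shell
# Hypothetical wrapper (not shipped with the software): run the
# per-language script for each code in turn and stop at the first failure.
process_languages() {
  # PROCESS_SCRIPT is an assumed override for this sketch; by default the
  # script from this README is used.
  script=${PROCESS_SCRIPT:-./process_any-ru.pl}
  for lang in "$@"; do
    echo "processing $lang"
    "$script" "$lang" || return 1
  done
}

# Example call:
#   process_languages en de fr
```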

Because the scripts share one output directory, only one script
should be run at a time.
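
A simple way to enforce this is to wrap each run in a lock. The sketch
below uses a lock directory, since mkdir either creates it or fails in
a single step. The function name run_exclusive, the WIKT_LOCK_DIR
variable and the default lock path are all assumptions made for this
example; the software itself does not provide any locking.

```shell
# Hypothetical wrapper (not shipped with the software): refuse to start
# a command while another wrapped command is still running.
run_exclusive() {
  # Assumed lock location; any path shared by all runs would do.
  lock_dir=${WIKT_LOCK_DIR:-/tmp/wiktionary-export.lock}
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "another conversion seems to be running ($lock_dir exists)" >&2
    return 1
  fi
  status=0
  "$@" || status=$?
  rmdir "$lock_dir"    # release the lock whether or not the command succeeded
  return "$status"
}

# Example calls:
#   run_exclusive ./process_any-ru.pl en
#   run_exclusive ./process_ru-any.pl
```

If a run is killed before it can release the lock, the lock directory
has to be removed by hand.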