da0a41d @sandropaganotti First Release
authored Feb 28, 2010
Abacus
======

Abacus is an XDXF parser and semantic toolset for Ruby.

Installation
------------

    gem install abacus

Abacus stores dictionary data in a SQLite database. A config.yml file determines the path of the database file; the default config.yml lives in the gem directory, inside the lib folder. You can point to a different config.yml file by setting the constant ABACUS\_CONFIG\_FILEPATH before requiring the gem.

The default config.yml sets the database path to an ".abacus\_db" folder in the current user's home directory. In a production environment you may want to set the ABACUS\_CONFIG\_FILEPATH constant, as described above, to use a different config.yml file.
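
A minimal sketch of the constant-based override described above. The path `config/abacus.yml` is a hypothetical example, and the `rescue` is only there so the snippet also runs on a machine where the gem is not installed:

```ruby
# Hypothetical location of a project-specific config file (an assumption,
# not a path that Abacus itself prescribes).
ABACUS_CONFIG_FILEPATH = File.expand_path('config/abacus.yml')

begin
  # The constant must be defined before this line, since the README says
  # the gem reads it while being required.
  require 'abacus'
rescue LoadError
  warn 'abacus gem not installed; run `gem install abacus` first'
end

puts "Abacus config file: #{ABACUS_CONFIG_FILEPATH}"
```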

Config.yml also stores configuration parameters for some of the tools in the toolset; these are covered later.

After installing Abacus you will want to import some dictionaries in order to start using the semantic tools. The first time you do that, create the database file by running from the command line:

    abacus db:create

To use a different config file, set the ABACUS\_CONFIG\_FILEPATH environment variable:

    ABACUS_CONFIG_FILEPATH=/where/you/want abacus db:create

Then use the import function to load XDXF dictionaries. You can choose from a broad selection at http://xdxf.revdanica.com/down/; I used this one: http://downloads.sourceforge.net/xdxf/comn\_sdict05\_eng\_eng\_main.tar.bz2. At the moment the parser is fairly naive and supports only some tags, but the whole dictionary is already stored as 'raw_data' during the import, so future releases may also improve dictionaries that were imported earlier. The syntax to import a dictionary file is:

    abacus db:xdxf:import filename.xdxf

The import populates the database tables with the contents of the dictionary file. Multiple dictionaries can be added by running the above command multiple times.

Navigate the dictionary
-----------------------

To navigate the dictionary, proceed as follows:

    >> require 'abacus'
    => true
    >> include Abacus
    => Object
    >> Dictionary.all
    => [#<Abacus::Dictionary id: 1, full_name: "English explanatory dictionary (main)", lang_from: "ENG", lang_to: "ENG", description: nil>]
    >> Dictionary.first.articles[1000..1002]
    => [#<Abacus::Article id: 1001, dictionary_id: 1, raw_text: "Camberwell Beauty\nn. a deep purple butterfly, Nymph...">, #<Abacus::Article id: 1002, dictionary_id: 1, raw_text: "Cambodian\nn. & adj. --n. 1 a a native or national o...">, #<Abacus::Article id: 1003, dictionary_id: 1, raw_text: "Cambrian\n\313\210k\303\246mbr\311\252\311\231n adj. & n. --adj. 1 Welsh. 2 ...">]
    >> Dictionary.first.articles.find(13000).article_keys
    => [#<Abacus::ArticleKey id: 13000, the_key: "decipher", raw_text: "decipher">]
    >> ArticleKey.find_by_the_key("ruby").articles.first
    => #<Abacus::Article id: 34912, dictionary_id: 1, raw_text: "ruby\n\313\210ru:b\311\252 n., adj., & v. --n. (pl. -ies) 1 a ra...">
    >> ArticleKey.find_by_the_key("ruby").articles.first.raw_text
    => "ruby\n\313\210ru:b\311\252 n., adj., & v. --n. (pl. -ies) 1 a rare precious stone consisting of corundum with a colour varying from deep crimson or purple to pale rose. 2 a glowing purple-tinged red colour. --adj. of this colour. --v.tr. (-ies, -ied) dye or tinge ruby-colour. \303\270ruby glass glass coloured with oxides of copper, iron, lead, tin, etc. ruby-tail a wasp, Chrysis ignita, with a ruby-coloured hinder part. ruby wedding the fortieth anniversary of a wedding. [ME f. OF rubi f. med.L rubinus (lapis) red (stone), rel. to L rubeus red]"


There are two main models, Article and ArticleKey. Article is the hub for all the article properties and contains the raw text (attribute raw_text) taken from the XML:

    XML FILE:
    <ar><k>ironic</k>
    <tr>aɪˈrɔnɪk</tr> adj. (also ironical) 1 using or displaying irony. 2 in the nature of irony. øøironically adv. [F ironique or LL ironicus f. Gk eironikos dissembling (as IRONY(1))]</ar>

    IRB:
    >> ArticleKey.find_by_the_key('ironic').articles[0].raw_text
    => "ironic\na\311\252\313\210r\311\224n\311\252k adj. (also ironical) 1 using or displaying irony. 2 in the nature of irony. \303\270\303\270ironically adv. [F ironique or LL ironicus f. Gk eironikos dissembling (as IRONY(1))]"
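
The relation between an `<ar>` element and its raw text can be illustrated with a small standalone sketch. It uses REXML from the Ruby distribution and is not Abacus's actual parser; the helper names `flatten_text` and `parse_article` are hypothetical. It assumes the keys are the `<k>` children and the raw text is the element's text with the tags dropped, which is what the example above suggests:

```ruby
require 'rexml/document'

# Shortened version of the article shown above.
XDXF_ARTICLE = <<~XML
  <ar><k>ironic</k>
  <tr>aɪˈrɔnɪk</tr> adj. (also ironical) 1 using or displaying irony.</ar>
XML

# Concatenate every text node below +node+, dropping the markup itself.
def flatten_text(node)
  node.children.map do |child|
    case child
    when REXML::Text    then child.value
    when REXML::Element then flatten_text(child)
    else ''
    end
  end.join
end

# Hypothetical helper: returns the keys and the tag-free text of one article.
def parse_article(xml)
  ar = REXML::Document.new(xml).root
  { keys: ar.get_elements('k').map(&:text), raw_text: flatten_text(ar) }
end

article = parse_article(XDXF_ARTICLE)
puts article[:keys].inspect
puts article[:raw_text]
```

The newline after `</k>` in the source file survives as the `\n` that separates the key from the rest of the text, as in the IRB output above.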

Each article may have one or more article_keys, which contain the linguistic identifiers of the article itself. Each article key can in turn be related to more than one article, provided the articles come from different dictionaries.


FIRST TOOL: HERIGONE MNEMONIC SYSTEM
------------------------------------

(A detailed explanation of this technique is available on Wikipedia: http://en.wikipedia.org/wiki/Herigone%27s\_mnemonic_system.) Within config.yml you can set a list of associations between numbers and letters; the standard one is already included in the default config file.

Here's a sample config.yml file (this is also the default one):

    database:
      adapter: sqlite3
      database: <%=File.join(ENV['HOME'] || ENV['USERPROFILE'] || (Abacus::LIB_ROOT + File::SEPARATOR + ".."),'.abacus_db','abacus')%>
      timeout: 5000

    system:
      default:
        0:
          z,s
        1:
          t,d,th
        2:
          n
        3:
          m
        4:
          r
        5:
          l
        6:
          j,ch,sh,dge
        7:
          k
        8:
          f,ph,v
        9:
          p,b

Following the above example you can create your own config file and define your own system. Multiple systems are supported: simply list them one below the other within config.yml.
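
To make the file format concrete, here is a hedged sketch of how such a file could be read with Ruby's standard `erb` and `yaml` libraries. It is not taken from the Abacus source: the `custom` system, the `'.'` fallback (standing in for `Abacus::LIB_ROOT`, which only exists once the gem is loaded) and the helper names `load_config` and `sounds_for` are all assumptions made for illustration.

```ruby
require 'erb'
require 'yaml'

# Inline copy of the sample config, plus a hypothetical second system.
CONFIG_TEMPLATE = <<~'YAML'
  database:
    adapter: sqlite3
    database: <%= File.join(ENV['HOME'] || ENV['USERPROFILE'] || '.', '.abacus_db', 'abacus') %>
    timeout: 5000

  system:
    default:
      0: z,s
      1: t,d,th
      2: n
      3: m
      4: r
      5: l
      6: j,ch,sh,dge
      7: k
      8: f,ph,v
      9: p,b
    custom:        # hypothetical second system, listed below the first one
      0: s
      1: t
YAML

# Evaluate the ERB tags first, then parse the result as YAML.
def load_config(template)
  YAML.safe_load(ERB.new(template).result)
end

# Sounds associated with one digit of one system, as an array of strings.
def sounds_for(config, system_name, digit)
  config.fetch('system').fetch(system_name).fetch(digit).to_s.split(',')
end

config = load_config(CONFIG_TEMPLATE)
puts config['database']['database']
puts config['system'].keys.inspect
puts sounds_for(config, 'default', 6).inspect
```

The digits are parsed as integer keys, so systems are looked up by name and sounds by number.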

To enhance the imported dictionaries with the Hérigone system, run from the command line:

    abacus db:herigone:generate :default [you can replace default with your system name]

Then you can perform some interesting queries, as follows:

    >> HerigoneNumber.find_by_number(357)
    => #<Abacus::HerigoneNumber id: 19081, system: "default", number: 357>
    >> HerigoneNumber.find_by_number(357).article_keys
    => [#<Abacus::ArticleKey id: 20159, the_key: "hemlock", raw_text: "hemlock">, #<Abacus::ArticleKey id: 26062, the_key: "milk", raw_text: "milk">, #<Abacus::ArticleKey id: 26068, the_key: "milky", raw_text: "milky">]
    >> HerigoneNumber.find_by_number(357).article_keys.map{|a| a.the_key}
    => ["hemlock", "milk", "milky"]
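
Why these three words share the number 357 can be seen with a small letter-based sketch of the default system. This is an illustration only, not the code behind `db:herigone:generate`: the names `sound_table` and `herigone_digits` are hypothetical, and matching the longest spelling first while skipping unmapped letters is an assumed strategy.

```ruby
# Digit => comma-separated sounds, copied from the default system in config.yml.
DEFAULT_SYSTEM = {
  0 => 'z,s', 1 => 't,d,th', 2 => 'n', 3 => 'm', 4 => 'r',
  5 => 'l', 6 => 'j,ch,sh,dge', 7 => 'k', 8 => 'f,ph,v', 9 => 'p,b'
}.freeze

# Flatten the system into [sound, digit] pairs, longest sounds first,
# so that "th" is tried before "t" and "dge" before "d".
def sound_table(system)
  system.flat_map { |digit, sounds| sounds.split(',').map { |s| [s.strip, digit] } }
        .sort_by { |sound, _digit| -sound.length }
end

# Walk the word left to right, emitting a digit for every mapped sound
# and skipping letters (vowels, h, c, ...) that the system does not list.
def herigone_digits(word, system = DEFAULT_SYSTEM)
  table = sound_table(system)
  rest = word.downcase
  digits = +''
  until rest.empty?
    sound, digit = table.find { |s, _d| rest.start_with?(s) }
    if sound
      digits << digit.to_s
      rest = rest[sound.length..-1]
    else
      rest = rest[1..-1]
    end
  end
  digits
end

%w[hemlock milk milky].each do |word|
  puts "#{word} -> #{herigone_digits(word)}"
end
```

All three words reduce to the sounds m, l, k and therefore to the digits 3, 5, 7. The sketch returns a string so that leading zeros are not lost; the `number` attribute in the query above appears to be stored as an integer.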


CONCLUSIONS
-----------

If you need more information, please have a look at the source code or send me an email at sandro dot paganotti at gmail dot com.