BayesianKit - Cocoa Objective-C Framework for a naive Bayesian classifier
=========================================================================

BayesianKit is a Mac OS X framework written by Samuel Mendes in Objective-C 2.0
and released under the BSD 3-clause license. It offers a simple, ready-to-use
implementation of a naive Bayesian classifier. A command-line utility is also
provided.

Dependencies
------------

* [ParseKit](http://parsekit.com), used for the default tokenizer
* [appledoc](http://www.gentlebytes.com/freeware/appledoc), used to generate the
  headers' documentation

Both are included as submodules. After cloning the repository, run:

    git submodule init
    git submodule update

However, appledoc needs [Doxygen](http://www.stack.nl/~dimitri/doxygen), which is
not provided.

Xcode Project
-------------

The BayesianKit project consists of three targets:

- **BayesianKit**: the BayesianKit framework.
- **Bayes**: the command-line utility.
- **Install Documentation**: the script that runs appledoc.

BayesianKit usage
-----------------

The default classifier comes with a tokenizer based on ParseKit, which is quite
efficient when training on source code. It also implements and uses the
Robinson-Fisher combiner on probabilities. Both the tokenizer and the combiner
can be changed; note, however, that they are not saved along with the training
data. Hence, if you load a classifier from a file, you must set the tokenizer
and/or combiner again.

### Creating and Training a new classifier ###

    BKClassifier *classifier = [[BKClassifier alloc] init];
    [classifier trainWithString:@"one two three four five"
                   forPoolNamed:@"english"];
    [classifier trainWithString:@"un deux trois quatre cinq"
                   forPoolNamed:@"french"];

### Saving and reloading the training data ###

    [classifier writeToFile:@"counting.bks"];
    // Another day, in a different process
    BKClassifier *anotherOne;
    anotherOne = [BKClassifier classifierWithContentsOfFile:@"counting.bks"];

### Using the classifier to make a guess ###

    NSDictionary *results = [anotherOne guessWithString:@"three platypuses"];
    NSLog(@"%@", results);

The output is:

    $ {
        english = "0.9999";
    }

Bayes
-----

This tool is intended for quickly testing the classifier and works only with
files. A man page with full details is also provided.

### Installation ###

From the root directory of the project:

    sudo cp build/Release/bayes /usr/local/bin/
    sudo cp docs/man/man1/bayes.1 /usr/local/share/man/man1/

### Training with a save file ###

    bayes -f save.bks -s -t english shakespeare.txt -t french moliere.txt

### Guessing based on this training ###

    bayes -f save.bks -g mystery.txt


LICENSE
=======

Copyright (c) 2010, Samuel Mendes

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.

* Neither the name of ᐱ nor the names of its
  contributors may be used to endorse or promote products derived
  from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Samuel Mendes <samuel.mendes@gmail.com>