<html>
<head>
<!-- Language related information -->
<meta http-equiv=Content-Language content=EN>
<meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<meta name=language content=EN>
<meta name=keywords content="maximum entropy, Natural Language, NLP,
AI, maxent, machine learning">
<Link rel="stylesheet" href="style.css" type="text/css">
<title>The OpenNLP Maxent Homepage</title>
</head>
<body bgcolor="#FFFFFF">
<table width=100% border=0 cellPadding=0 cellSpacing=0>
<tr>
<!-- <td width=18%> </td>-->
<td width=40% valign="top"><img src="onlpmaxent_logo.jpg"></td>
<td width=47% align="right" valign="bottom"><h2>About</h2></td>
<td width=13%> </td>
</tr>
</table>
<table width=100% height=100% border=0 cellPadding=0 cellSpacing=0>
<tr>
<!-- <td width=18%> </td> -->
<td width=12%> </td>
<td width=75% valign="top">
<HR WIDTH="100%">
<!-- <h1>The OpenNLP Maxent Homepage</h1> -->
<center>
[<a href="index.html">Home</a>]
[<a href="about.html">About</a>]
[<a href="howto.html">HOWTO</a>]
[<a href="https://sourceforge.net/project/showfiles.php?group_id=5961">Download</a>]
[<a href="api/index.html">API</a>]
[<a href="https://sourceforge.net/forum/?group_id=5961">Forums</a>]
[<a href="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/maxent/">CVS</a>]
</center>
<h2>The Maximum Entropy Framework</h2>
<p>To explain what maximum entropy
is, it is simplest to quote <a href="http://www-nlp.Stanford.EDU/~manning/">Manning</a>
and <a href="ftp://parcftp.xerox.com/pub/qca/schuetze.html">Sch&uuml;tze</a>*
(p. 589):
</p>
<blockquote>
Maximum entropy modeling is a framework for integrating information
from many heterogeneous information sources for classification.
The data for a classification problem is described as a
(potentially large) number of features. These features can be
quite complex and allow the experimenter to make use of prior
knowledge about what types of informations are expected to be
important for classification. Each feature corresponds to a constraint
on the model. We then compute the maximum entropy model, the
model with the maximum entropy of all the models that satisfy the
constraints. This term may seem perverse, since we have spent
most of the book trying to minimize the (cross) entropy of models, but
the idea is that we do not want to go beyond the data. If we
chose a model with less entropy, we would add `information'
constraints to the model that are not justified by the empirical
evidence available to us. Choosing the maximum entropy model is
motivated by the desire to preserve as much uncertainty as possible.
</blockquote>
<p>
That gives a rough idea of what the maximum entropy framework is:
don't assume anything about your probability distribution beyond what
you have observed.
</p>
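<p>
As a small illustration of that principle, the sketch below compares
three candidate distributions over four outcomes. It is not part of
opennlp.maxent: the class name EntropyDemo and the numbers are made up
for this page, and the only "observation" assumed is that the first two
outcomes together have probability 0.6. All three candidates satisfy
that constraint, but they differ in how much extra structure they assume.
</p>
<pre>
```java
// Illustrative sketch only; EntropyDemo is a hypothetical class,
// not part of the opennlp.maxent package.
final class EntropyDemo {

    /** Shannon entropy in bits; zero-probability outcomes contribute nothing. */
    static double entropy(double[] p) {
        double h = 0.0;
        for (double pi : p) {
            if (pi > 0.0) {
                h -= pi * (Math.log(pi) / Math.log(2.0));
            }
        }
        return h;
    }

    public static void main(String[] args) {
        // Outcomes A, B, C, D. Assumed observation: p(A) + p(B) = 0.6.
        double[] evenSpread = {0.3, 0.3, 0.2, 0.2};   // assumes nothing more
        double[] skewedAB   = {0.5, 0.1, 0.2, 0.2};   // also claims A is likelier than B
        double[] skewedAll  = {0.5, 0.1, 0.39, 0.01}; // also claims D is nearly impossible

        System.out.println("even spread: " + entropy(evenSpread));
        System.out.println("skewed A/B:  " + entropy(skewedAB));
        System.out.println("skewed all:  " + entropy(skewedAll));
    }
}
```
</pre>
<p>
The even spread has the highest entropy of the three, and each
additional unsupported claim lowers it. The maximum entropy model is the
candidate that satisfies the constraints while making no such extra claims.
</p>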
<p>
On the engineering level, maxent is an excellent way of creating
programs which perform very difficult classification tasks very
well. For example, programs using maxent models have matched or
defined the state of the art in precision and recall on tasks like
part-of-speech tagging, sentence detection,
prepositional phrase attachment, and named entity recognition.
An added benefit is that the person creating
a maxent model only needs to inform the training procedure of the
event space, and need not worry about independence between features.
</p>
<p>
While the authors of this implementation of maximum entropy are
mainly interested in using maxent models in natural language
processing, the framework is quite general and useful in a
much wider variety of fields. In fact, maximum entropy modeling
was originally developed for statistical physics.
</p>
<p>
For a very in-depth discussion of how maxent can be used in natural
language processing, see <a
href="http://www.research.ibm.com/people/a/adwaitr/">Adwait
Ratnaparkhi</a>'s <a
href="ftp://ftp.cis.upenn.edu/pub/ircs/tr/98-15/98-15.ps.gz">dissertation</a>.
See also Berger, Della Pietra, and Della Pietra's paper
<a
href="http://citeseer.nj.nec.com/cs?profile=314155%2C67650%2C1%2C0.25%2CDownload&rd=http%3A//www.cs.cmu.edu/%7Eaberger/ps/compling.ps">A
Maximum Entropy Approach to Natural Language Processing</a>, which
provides an excellent introduction and discussion of the framework.
</p>
<p>
*<i><a href="http://www.sultry.arts.usyd.edu.au/fsnlp/promo/">Foundations
of Statistical Natural Language Processing</a></i>. Christopher D. Manning and
Hinrich Sch&uuml;tze.
<br>Cambridge, Mass.: MIT Press, 1999.
</p>
<h2>Implementation</h2>
<p>
We have tried to make the opennlp.maxent implementation easy to
use. To create a model, you need the training data
and implementations of two interfaces in the opennlp.maxent
package, EventStream and ContextGenerator. These have fairly
simple specifications, and example implementations can be found in the
<a href="http://opennlp.sourceforge.net">OpenNLP Tools</a>
preprocessing components. More details are given in the
opennlp.maxent <a href="howto.html">HOWTO</a>.
</p>
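<p>
To give a feel for the division of labour between the two roles, here is
a minimal, self-contained sketch. The types SketchContextGenerator,
SketchEvent and SketchEventStream are hypothetical stand-ins defined in
the sketch itself so that it compiles on its own; they are not the
opennlp.maxent interfaces, and their method names and signatures are
assumptions. The features produced (lowercased word, two-letter suffix,
capitalization) are likewise only an example. Consult the HOWTO and the
API documentation for the real interfaces.
</p>
<pre>
```java
// Hypothetical stand-ins so this sketch compiles by itself.
// They are NOT the opennlp.maxent types; signatures are assumptions.
interface SketchContextGenerator {
    String[] getContext(Object o);
}

final class SketchEvent {
    final String outcome;
    final String[] context;

    SketchEvent(String outcome, String[] context) {
        this.outcome = outcome;
        this.context = context;
    }
}

interface SketchEventStream {
    boolean hasNext();
    SketchEvent nextEvent();
}

/** Turns a word into a few string-valued features (illustrative feature set). */
final class SuffixContextGenerator implements SketchContextGenerator {
    public String[] getContext(Object o) {
        String word = (String) o;
        String lower = word.toLowerCase();
        String suffix = lower.length() > 2 ? lower.substring(lower.length() - 2) : lower;
        boolean capitalized = word.isEmpty() ? false : Character.isUpperCase(word.charAt(0));
        return new String[] {"w=" + lower, "suf=" + suffix, "cap=" + capitalized};
    }
}

/** Walks parallel arrays of words and labels, emitting one event per example. */
final class ArrayEventStream implements SketchEventStream {
    private final String[] words;
    private final String[] labels;
    private final SketchContextGenerator cg;
    private int next = 0;

    ArrayEventStream(String[] words, String[] labels, SketchContextGenerator cg) {
        if (words.length != labels.length) {
            throw new IllegalArgumentException("words and labels must have the same length");
        }
        this.words = words;
        this.labels = labels;
        this.cg = cg;
    }

    public boolean hasNext() {
        return words.length > next;
    }

    public SketchEvent nextEvent() {
        if (!hasNext()) {
            throw new java.util.NoSuchElementException();
        }
        SketchEvent e = new SketchEvent(labels[next], cg.getContext(words[next]));
        next++;
        return e;
    }
}
```
</pre>
<p>
In this sketch the context generator knows nothing about labels and the
event stream knows nothing about features, so the same context generator
can be reused unchanged when the trained model is later applied to
unlabeled input.
</p>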
<p>
We have also put in place some interfaces and code to make it easier
to automate the training and evaluation process (the Evalable
interface and the TrainEval class). It is not necessary to use
this functionality, but if you do, you'll find it much easier to see
how well your models are doing. The
opennlp.grok.preprocess.namefind package is an example of a maximum
entropy component which uses this functionality.
</p>
<p>
We use several techniques to reduce the size of the
models when writing them to disk, which also means that reading in a
model for use is much quicker than with less compact encodings of the
model. This was especially important to us since we use many
maxent models in the Grok library, and we wanted the start-up time and
the physical size of the library to be as small as possible. As of
version 1.2.0, maxent has an io package which greatly simplifies the
process of loading and saving models in different formats.
</p>
<h2>Authors</h2>
<p>The opennlp.maxent package was originally built by <a
href="http://www.cogsci.ed.ac.uk/~jmb/">Jason Baldridge</a>, <a
href="http://www.cis.upenn.edu/~tsmorton/">Tom Morton</a>, and <a
href="http://www.cogsci.ed.ac.uk/~gbierner">Gann Bierner</a>. We
owe a big thanks to <a
href="http://www.research.ibm.com/people/a/adwaitr/">Adwait
Ratnaparkhi</a> for his work on maximum entropy models for natural
language processing applications. His <a
href="ftp://ftp.cis.upenn.edu/pub/ircs/tr/97-08.ps.Z">introduction to
maxent for NLP</a> and <a
href="ftp://ftp.cis.upenn.edu/pub/ircs/tr/98-15/98-15.ps.gz">dissertation</a>
are what really made opennlp.maxent and our Grok maxent components
(POS tagger, end of sentence detector, tokenizer, name finder)
possible!
</p>
<p>Eric Friedman has been steadily improving the efficiency and design
of the package since version 1.2.0.
</p>
<center>
[<a href="index.html">Home</a>]
[<a href="about.html">About</a>]
[<a href="howto.html">HOWTO</a>]
[<a href="https://sourceforge.net/project/showfiles.php?group_id=5961">Download</a>]
[<a href="api/index.html">API</a>]
[<a href="https://sourceforge.net/forum/?group_id=5961">Forums</a>]
[<a href="http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/maxent/">CVS</a>]
</center>
<HR WIDTH="100%">
<h3>
Email: <a href="mailto:tsmorton@users.sourceforge.net">tsmorton@users.sourceforge.net</a>
<br>
<script language="JavaScript">
<!---//
var Months = new Array('January','February','March','April','May','June','July','August','September','October','November','December');
var lm = new Date (document.lastModified);
document.write(Months[lm.getMonth()]+" "+lm.getDate()+" "+lm.getFullYear());
//--->
</script>
<br>
<A href="http://sourceforge.net"> <IMG src="http://sourceforge.net/sflogo.php?group_id=5961&type=1" width="88" height="31" border="0"></A> <br>
</h3>
</td>
<td width=13%> </td>
</tr>
</table>
</body>
</html>