initial commit of development files

commit 8318138715b080dce7e24067d7136ad7c202a0c0 (0 parents), Aaron committed May 17, 2010
@@ -0,0 +1,36 @@
+Standard stuff applies to install. Run this from
+the command line:
+
+ python setup.py install
+
+Documentation is in the doc directory. Any questions
+can be forwarded to abrenzel@millerresource.com. I'm
+usually pretty good about responding.
+
+The LinkedIn client library has lxml as a dependency for
+XML processing. Yes, I know Python has a standard XML
+parser implementation. Yes, I know lxml can be a bear to install
+if you're building from source. Still, there's no faster or
+more full-featured XML parsing tool available for Python. I
+have no plans to include support for the etree parser in the
+standard library.
+
+This package is intended for use with the LinkedIn API.
+You must supply your own API key for this library to work.
+Once you have an API key from LinkedIn, the syntax for instantiating
+an API client object is this:
+
+ mykey = 'mysecretkey'
+ mysecret = 'mysecretsecret'
+ myclient = LinkedInAPI(mykey, mysecret)
+
+From there, you can obtain request tokens, authorization URLs,
+access tokens, and actual LinkedIn data through the LinkedInAPI
+object's methods. The object will handle signing requests, URL
+formatting, and XML parsing for you. Full documentation for these
+methods can be found in the doc directory (or will be there when
+I get it done).
+
+Happy apping!
+
+Aaron
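
The README says the client signs requests itself, and its vocabulary (request tokens, authorization URLs, access tokens) matches OAuth 1.0. Assuming that is the scheme in use, the sketch below shows roughly what HMAC-SHA1 signing involves, following RFC 5849. It is not taken from this library: the function names, the endpoint URL, and the nonce and timestamp values are all hypothetical, chosen only for illustration.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode as OAuth 1.0 expects: only letters, digits, '-', '.', '_', '~' stay bare."""
    return quote(str(value), safe="~")


def signature_base_string(method, url, params):
    # Encode every key and value, then sort the encoded pairs.
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(k + "=" + v for k, v in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def sign_hmac_sha1(method, url, params, consumer_secret, token_secret=""):
    # The signing key is the two encoded secrets joined by '&'.
    # The token secret is empty until a request token has been issued.
    key = pct(consumer_secret) + "&" + pct(token_secret)
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("utf-8"), base.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical request parameters and endpoint, for illustration only.
params = {
    "oauth_consumer_key": "mysecretkey",
    "oauth_nonce": "abc123",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1274054400",
    "oauth_version": "1.0",
}
signature = sign_hmac_sha1(
    "POST", "https://api.example.com/request_token", params, "mysecretsecret"
)
```

The resulting `signature` would travel with the request as the `oauth_signature` parameter. A real client also needs a fresh nonce and timestamp per request, which this sketch hard-codes.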

@@ -0,0 +1,45 @@
+#!/usr/bin/env python
+import nltk
+
+class TextualAnalyzer(object):
+    def __init__(self, txt, source):
+        self.sources = {}
+        self.cfds = {}
+        self.register(txt, source)
+
+    def register(self, txt, source):
+        if source in self.sources:
+            raise KeyError('Source already found in internal dictionary. Please use a different source name.')
+        text = nltk.text.Text(txt)
+        self.sources[source] = {}
+        self.sources[source]['text'] = text.tokens
+        # Text.collocations() only prints and returns None, so use
+        # collocation_list() (available in newer NLTK releases) instead.
+        self.sources[source]['collocations'] = text.collocation_list()
+        self.sources[source]['freq_dist'] = text.vocab()
+
+    def generate_cfd(self, srca, srcb):
+        # ConditionalFreqDist takes one flat list of (condition, sample) pairs;
+        # here each word is the condition and its source is the sample.
+        cdna = [(w, srca) for w in self.sources[srca]['text']]
+        cdnb = [(w, srcb) for w in self.sources[srcb]['text']]
+        self.cfds[srca + ', ' + srcb] = nltk.ConditionalFreqDist(cdnb + cdna)
+
+    def tag(self, source, tagger=None):
+        if tagger is None:
+            self.sources[source]['tagged'] = nltk.pos_tag(self.sources[source]['text'])
+        else:
+            self.sources[source]['tagged'] = tagger.tag(self.sources[source]['text'])
+
+    def chunk(self, source, chunker=None):
+        # .get() avoids a KeyError when the source has not been tagged yet.
+        if not self.sources[source].get('tagged'):
+            self.tag(source)
+        grammar = r"""
+            NP: {<DT|PP>?<JJ.*>*<NN.*>}
+                {<NNP>+}
+            VP: {<JJ.*>?<RB>?<VB+><NN.*>*}
+            """
+        # Use the caller's chunker when one is given, else the regexp grammar above.
+        cp = chunker if chunker is not None else nltk.RegexpParser(grammar)
+        self.sources[source]['chunked'] = cp.parse(self.sources[source]['tagged'])
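
To make it clearer what `generate_cfd` is meant to produce, here is a dependency-free sketch of a conditional frequency distribution. It uses `collections.Counter` as a stand-in for `nltk.ConditionalFreqDist`, so it illustrates the idea rather than NLTK's implementation. The helper name `conditional_freq` and the token lists are hypothetical.

```python
from collections import Counter, defaultdict


def conditional_freq(pairs):
    """Count samples per condition from (condition, sample) pairs."""
    cfd = defaultdict(Counter)
    for condition, sample in pairs:
        cfd[condition][sample] += 1
    return cfd


# Hypothetical token lists standing in for two registered sources.
tokens_a = ["the", "cat", "sat", "the"]
tokens_b = ["the", "dog"]

# Same pair order as generate_cfd: word first, source name second.
pairs = [(w, "a") for w in tokens_a] + [(w, "b") for w in tokens_b]
cfd = conditional_freq(pairs)
print(cfd["the"]["a"], cfd["the"]["b"])  # prints: 2 1
```

With the word as the condition, each lookup answers "how often does this word occur in each source?". Swapping the pair order would instead give a per-source word count.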
@@ -0,0 +1,22 @@
+ Copyright (c) 2010 Aaron Brenzel
+
+ Permission is hereby granted, free of charge, to any person
+ obtaining a copy of this software and associated documentation
+ files (the "Software"), to deal in the Software without
+ restriction, including without limitation the rights to use,
+ copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the
+ Software is furnished to do so, subject to the following
+ conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ OTHER DEALINGS IN THE SOFTWARE.