Commit
Alex Manelis committed Jun 14, 2010
1 parent 7478859, commit 2c17e1a
Showing 21 changed files with 1,307 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
DEMO----------------------------- | ||
Most NLTK modules include demonstration code. Here are some examples involving tokenizing, stemming, and tagging: | ||
|
||
>>> import nltk | ||
>>> nltk.stem.porter.demo() | ||
>>> nltk.stem.lancaster.demo() | ||
>>> nltk.probability.demo() | ||
|
||
Here are some more examples, involving parsing and semantic interpretation: | ||
|
||
>>> import nltk | ||
>>> nltk.chunk.regexp.demo() | ||
>>> nltk.parse.chart.demo() | ||
>>> nltk.sem.evaluate.demo() | ||
>>> nltk.sem.logic.demo() | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
import nltk | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
Hello Alex Manelis, what are you doing | ||
today? I feel like going rock climbing, | ||
what about | ||
|
||
you? |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
import nltk | ||
|
||
for line in open("file.txt"): | ||
    for word in line.split(): | ||
        print word |
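The script in this file uses the Python 2 `print` statement and leaves the file handle open. The following is a minimal Python 3 sketch of the same loop, not the author's code. The helper name `iter_words` is hypothetical, and a temporary file stands in for `file.txt` so that the sketch runs on its own.

```python
import os
import tempfile


def iter_words(path):
    """Yield each whitespace-separated word in the file at `path`."""
    # The `with` block closes the file once the generator is exhausted.
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # split() with no arguments skips blank lines and repeated spaces.
            for word in line.split():
                yield word


# Stand-in for file.txt (an assumption, so the sketch is self-contained).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("Hello Alex Manelis, what are you doing\ntoday?\n\nyou?\n")
    sample_path = tmp.name

words = list(iter_words(sample_path))
os.remove(sample_path)

for word in words:
    print(word)
```

Note that punctuation stays attached to the words (`Manelis,`, `today?`), which is one reason to reach for a real tokenizer instead of `split()`.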
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
#!/usr/bin/python | ||
import nltk | ||
from nltk.tokenize import RegexpTokenizer, word_tokenize | ||
|
||
s = ("Good muffins cost $3.88\nin New York. Please buy metwo of them.\n\nThanks.") | ||
|
||
tokens = word_tokenize(s) | ||
capword = RegexpTokenizer(r'[A-Z]\w+').tokenize(s) | ||
|
||
|
||
|
||
|
||
|
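To see what the pattern `[A-Z]\w+` in this script selects without installing NLTK, here is a rough standard-library sketch in Python 3. It assumes that `RegexpTokenizer`, in its default mode, returns the pattern's matches much as `re.findall` does; that is an assumption about NLTK worth checking against its documentation. The sample string is copied from the script as committed.

```python
import re

text = "Good muffins cost $3.88\nin New York. Please buy metwo of them.\n\nThanks."

# Same pattern as in the script: one uppercase letter followed by one or
# more word characters. The raw string keeps \w from being treated as a
# string escape.
capitalized = re.findall(r"[A-Z]\w+", text)

# Plain whitespace splitting for comparison; punctuation stays attached.
whitespace_tokens = text.split()

print(capitalized)
print(whitespace_tokens[:4])
```

The pattern requires at least two characters, so a single-letter capitalized word such as "I" would not match.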
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,3 @@ | ||
#!/usr/bin/python | ||
import tweetstream | ||
|
||
stream = tweetstream.TweetStream("username", "password") | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
Metadata-Version: 1.0 | ||
Name: tweetstream | ||
Version: 0.3.4 | ||
Summary: Simple Twitter streaming API access | ||
Home-page: http://bitbucket.org/runeh/tweetstream/ | ||
Author: Rune Halvorsen | ||
Author-email: runefh@gmail.com | ||
License: BSD | ||
Description: .. -*- restructuredtext -*- | ||
|
||
########################################## | ||
tweetstream - Simple twitter streaming API | ||
########################################## | ||
|
||
Introduction | ||
------------ | ||
|
||
tweetstream provides a class, TweetStream, that can be used to get | ||
tweets from Twitter's streaming API. An instance of the class can be used as | ||
an iterator. In addition to fetching tweets, the object keeps track of | ||
the number of tweets collected and the rate at which tweets are received. | ||
|
||
Subclasses are available for accessing the "track" and "follow" streams | ||
as well. | ||
|
||
There's also a ReconnectingTweetStream class that handles automatic | ||
reconnecting. | ||
|
||
Twitter's documentation about the streaming API can be found here: | ||
http://apiwiki.twitter.com/Streaming-API-Documentation . | ||
|
||
**Note** that the API is blocking. If for some reason data is not immediately | ||
available, calls will block until enough data is available to yield a tweet. | ||
|
||
Examples | ||
-------- | ||
|
||
Printing all incoming tweets: | ||
|
||
>>> stream = tweetstream.TweetStream("username", "password") | ||
>>> for tweet in stream: | ||
...     print tweet | ||
|
||
|
||
The stream object can also be used as a context, as in this example that | ||
prints the author for each tweet as well as the tweet count and rate: | ||
|
||
>>> with tweetstream.TweetStream("username", "password") as stream: | ||
...     for tweet in stream: | ||
...         print "Got tweet from %-16s\t( tweet %d, rate %.1f tweets/sec)" % ( | ||
...             tweet["user"]["screen_name"], stream.count, stream.rate ) | ||
|
||
|
||
Stream objects can raise ConnectionError or AuthenticationError exceptions: | ||
|
||
>>> try: | ||
...     with tweetstream.TweetStream("username", "password") as stream: | ||
...         for tweet in stream: | ||
...             print "Got tweet from %-16s\t( tweet %d, rate %.1f tweets/sec)" % ( | ||
...                 tweet["user"]["screen_name"], stream.count, stream.rate ) | ||
... except tweetstream.ConnectionError, e: | ||
...     print "Disconnected from twitter. Reason:", e.reason | ||
|
||
To get tweets that relate to specific terms, use the TrackStream: | ||
|
||
>>> words = ["opera", "firefox", "safari"] | ||
>>> with tweetstream.TrackStream("username", "password", words) as stream: | ||
...     for tweet in stream: | ||
...         print "Got interesting tweet:", tweet | ||
|
||
To get only tweets from a set of users, use the FollowStream. The following | ||
would get tweets for users 1, 42 and 8675309: | ||
|
||
>>> users = [1, 42, 8675309] | ||
>>> with tweetstream.FollowStream("username", "password", users) as stream: | ||
...     for tweet in stream: | ||
...         print "Got tweet from:", tweet["user"]["screen_name"] | ||
|
||
|
||
Simple tweet fetcher that sends tweets to an AMQP message server using carrot: | ||
|
||
>>> from carrot.messaging import Publisher | ||
>>> from carrot.connection import AMQPConnection | ||
>>> from tweetstream import TweetStream | ||
>>> amqpconn = AMQPConnection(hostname="localhost", port=5672, | ||
... userid="test", password="test", | ||
... vhost="test") | ||
>>> publisher = Publisher(connection=amqpconn, | ||
... exchange="tweets", routing_key="stream") | ||
>>> with TweetStream("username", "password") as stream: | ||
...     for tweet in stream: | ||
...         publisher.send(tweet) | ||
>>> publisher.close() | ||
|
||
|
||
Changelog | ||
--------- | ||
|
||
See the CHANGELOG file | ||
|
||
Contact | ||
------- | ||
|
||
The author is Rune Halvorsen <runefh@gmail.com>. The project resides at | ||
http://bitbucket.org/runeh/tweetstream . If you find bugs, or have feature | ||
requests, please report them in the project site issue tracker. Patches are | ||
also very welcome. | ||
|
||
License | ||
------- | ||
|
||
This software is licensed under the ``New BSD License``. See the ``LICENCE`` | ||
file in the top distribution directory for the full license text. | ||
|
||
Keywords: twitter | ||
Platform: any | ||
Classifier: License :: OSI Approved :: BSD License | ||
Classifier: Intended Audience :: Developers |
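The metadata file above uses `Field: value` header lines, so its short fields can be read with the standard library's email parser. The following Python 3 sketch illustrates that idea on a shortened copy of the headers. The long multi-line `Description` field is left out on purpose, and the variable names are illustrative only.

```python
from email.parser import Parser

# A shortened copy of the header fields shown above (Description omitted).
PKG_INFO = """\
Metadata-Version: 1.0
Name: tweetstream
Version: 0.3.4
Summary: Simple Twitter streaming API access
License: BSD
Keywords: twitter
Platform: any
Classifier: License :: OSI Approved :: BSD License
Classifier: Intended Audience :: Developers
"""

metadata = Parser().parsestr(PKG_INFO)

name = metadata["Name"]
version = metadata["Version"]
# A field that appears more than once, such as Classifier, is returned as
# a list by get_all().
classifiers = metadata.get_all("Classifier")

print(name, version)
for classifier in classifiers:
    print(classifier)
```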
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
.. -*- restructuredtext -*- | ||
########################################## | ||
tweetstream - Simple twitter streaming API | ||
########################################## | ||
|
||
Introduction | ||
------------ | ||
|
||
tweetstream provides a class, TweetStream, that can be used to get | ||
tweets from Twitter's streaming API. An instance of the class can be used as | ||
an iterator. In addition to fetching tweets, the object keeps track of | ||
the number of tweets collected and the rate at which tweets are received. | ||
|
||
Subclasses are available for accessing the "track" and "follow" streams | ||
as well. | ||
|
||
There's also a ReconnectingTweetStream class that handles automatic | ||
reconnecting. | ||
|
||
Twitter's documentation about the streaming API can be found here: | ||
http://apiwiki.twitter.com/Streaming-API-Documentation . | ||
|
||
**Note** that the API is blocking. If for some reason data is not immediately | ||
available, calls will block until enough data is available to yield a tweet. | ||
|
||
Examples | ||
-------- | ||
|
||
Printing all incoming tweets: | ||
|
||
>>> stream = tweetstream.TweetStream("username", "password") | ||
>>> for tweet in stream: | ||
...     print tweet | ||
|
||
|
||
The stream object can also be used as a context, as in this example that | ||
prints the author for each tweet as well as the tweet count and rate: | ||
|
||
>>> with tweetstream.TweetStream("username", "password") as stream: | ||
...     for tweet in stream: | ||
...         print "Got tweet from %-16s\t( tweet %d, rate %.1f tweets/sec)" % ( | ||
...             tweet["user"]["screen_name"], stream.count, stream.rate ) | ||
|
||
|
||
Stream objects can raise ConnectionError or AuthenticationError exceptions: | ||
|
||
>>> try: | ||
...     with tweetstream.TweetStream("username", "password") as stream: | ||
...         for tweet in stream: | ||
...             print "Got tweet from %-16s\t( tweet %d, rate %.1f tweets/sec)" % ( | ||
...                 tweet["user"]["screen_name"], stream.count, stream.rate ) | ||
... except tweetstream.ConnectionError, e: | ||
...     print "Disconnected from twitter. Reason:", e.reason | ||
|
||
To get tweets that relate to specific terms, use the TrackStream: | ||
|
||
>>> words = ["opera", "firefox", "safari"] | ||
>>> with tweetstream.TrackStream("username", "password", words) as stream: | ||
...     for tweet in stream: | ||
...         print "Got interesting tweet:", tweet | ||
|
||
To get only tweets from a set of users, use the FollowStream. The following | ||
would get tweets for users 1, 42 and 8675309: | ||
|
||
>>> users = [1, 42, 8675309] | ||
>>> with tweetstream.FollowStream("username", "password", users) as stream: | ||
...     for tweet in stream: | ||
...         print "Got tweet from:", tweet["user"]["screen_name"] | ||
|
||
|
||
Simple tweet fetcher that sends tweets to an AMQP message server using carrot: | ||
|
||
>>> from carrot.messaging import Publisher | ||
>>> from carrot.connection import AMQPConnection | ||
>>> from tweetstream import TweetStream | ||
>>> amqpconn = AMQPConnection(hostname="localhost", port=5672, | ||
... userid="test", password="test", | ||
... vhost="test") | ||
>>> publisher = Publisher(connection=amqpconn, | ||
... exchange="tweets", routing_key="stream") | ||
>>> with TweetStream("username", "password") as stream: | ||
...     for tweet in stream: | ||
...         publisher.send(tweet) | ||
>>> publisher.close() | ||
|
||
|
||
Changelog | ||
--------- | ||
|
||
See the CHANGELOG file | ||
|
||
Contact | ||
------- | ||
|
||
The author is Rune Halvorsen <runefh@gmail.com>. The project resides at | ||
http://bitbucket.org/runeh/tweetstream . If you find bugs, or have feature | ||
requests, please report them in the project site issue tracker. Patches are | ||
also very welcome. | ||
|
||
License | ||
------- | ||
|
||
This software is licensed under the ``New BSD License``. See the ``LICENCE`` | ||
file in the top distribution directory for the full license text. |
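The README describes a stream object that works as an iterator, tracks a tweet count and rate, and can be used as a context manager. The Python 3 sketch below shows how such an object could be built. It is not the tweetstream implementation: the class `CountingStream` and the `closed` attribute are hypothetical, and a list of dictionaries stands in for the network connection.

```python
import time


class CountingStream:
    """Hypothetical stand-in for a stream; wraps any iterable of items."""

    def __init__(self, source):
        self._source = iter(source)
        self._started = None
        self.count = 0
        self.closed = False

    @property
    def rate(self):
        """Items per second since the first item was requested."""
        if self._started is None:
            return 0.0
        elapsed = time.monotonic() - self._started
        return self.count / elapsed if elapsed > 0 else 0.0

    def __iter__(self):
        return self

    def __next__(self):
        if self._started is None:
            self._started = time.monotonic()
        # next() blocks for as long as the underlying source blocks.
        item = next(self._source)
        self.count += 1
        return item

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # A real stream would close its connection here.
        self.closed = True
        return False  # do not swallow exceptions


# Made-up items shaped like the tweets in the README examples.
fake_tweets = [
    {"user": {"screen_name": "alice"}, "text": "first"},
    {"user": {"screen_name": "bob"}, "text": "second"},
]

with CountingStream(fake_tweets) as stream:
    for tweet in stream:
        print("Got tweet from %-16s (tweet %d)"
              % (tweet["user"]["screen_name"], stream.count))

print(stream.count, stream.closed)
```

Keeping the counters on the iterator itself lets the loop body read `stream.count` and `stream.rate` directly, which matches how the README examples use them.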