-
Notifications
You must be signed in to change notification settings - Fork 0
bwu/pymongo_for_twitter
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Pymongo Twitter Class ===================== Using sixohsix's twitter package for python, and the pymongo package for python and MongoDB, these classes work together to scrape the Twitter REST API and store the data in a Mongo database. The class was designed for to be low maintenance, easy to monitor and regularly run so that one can accurately monitor the performance of his or her Twitter handle over time. General guidelines to use are shown below. I may be missing points. If you need help using this, please reach out to me at @brain_deer. Data is collected and stored in five different collections. - twitter_channel - twitter_content - twitter_replies - twitter_mentions - twitter_hashtags General guidelines to use: ========================== Requisites: - Installed twitter, pymongo, bitly_api and itertools packages. - A set up mongo database 1. Create a twitter application and authorize given handles to acquire oauth token and oauth token secret. 2. Create a "handles" collection with documents of the following format { "_id": ObjectId("ABCD"), "name": "justinbeiber", "twitter": "JUSTINS_TWITTERID", "twitter_hashtags": { "0": "#followfriday", "1": "#traveltuesday" }, "twitter_oauth_token": "JUSTINS_OAUTH_TOKEN", "twitter_oauth_token_secret": "JUSTINS_TWITTER_OAUTH_TOKEN_SECRET" } 3. Insert CONSUMER_KEY, CONSUMER_SECRET, and DATABASE_NAME in twitter_scrape.py 4. Insert BITLY_USERNAME, BITLY_APIKEY in insights.py 5. Run twitter_scrape.py from your python interpreter Collected Data Points ===================== twitter_channel - stores day by day channel data Example: { "_id": ObjectId("ABCD"), "total_tweets": 5104, "friends_count": 113, "followers_count": 3644, "handle_name": "justinbeiber", "date": "2012-11-28", "total_lists": 89 } ----------------------------------------------- twitter_content - stores tweets made by handle shared to all followers, freezing data after 2 and 7 days. Example: { "_id": ObjectId("ABCD"), "handle_name": "justinbeiber", "date": "2012-11-20 11: 00: 00", "lifetime": { "reply_count": 1, "retweet_count": 11, "bitly_clicks": 119 }, "status_id": "1234", "text": "Check out this pic from my latest concert: http: \/\/t.co\/dcjkeow", "tweet_link": "http: \/\/twitter.com\/#!\/12345\/status\/1234", "url": "bit.ly\/JKDixkl", "2_day": { "reply_count": 1, "retweet_count": 9, "bitly_clicks": 111 }, "7_day": { "reply_count": 1, "retweet_count": 10, "bitly_clicks": 117 } } ----------------------------------------------- twitter_replies - stores tweets made by handle beginning with an @mention, freezing data after 2 and 7 days. Example: { "_id": ObjectId("ABCD"), "handle_name": "justinbeiber", "date": "2012-11-20 11: 00: 00", "lifetime": { "reply_count": 1, "retweet_count": 11, "bitly_clicks": 119 }, "status_id": "1234", "text": "@JustinsFans Check out this pic from my latest concert: http: \/\/t.co\/dcjkeow", "tweet_link": "http: \/\/twitter.com\/#!\/12345\/status\/1234", "url": "bit.ly\/JKDixkl", "2_day": { "reply_count": 1, "retweet_count": 9, "bitly_clicks": 111 }, "7_day": { "reply_count": 1, "retweet_count": 10, "bitly_clicks": 117 } } ----------------------------------------------- twitter_mentions - stores tweets mentioning a specific handle, storing specific information about the user that tweeted. Example: { "_id": ObjectId("ABCD"), "tweet_link": "http: \/\/twitter.com\/#!\/BrianSmith\/status\/1234", "url": null, "handle_name": "justinbeiber", "user": { "statuses_count": 104, "name": "Brian Smith", "friends_count": 26, "followers_count": 839, "location": "", "screen_name": "BrianSmith" }, "status_id": "1234", "date": "2012-11-28 18: 53: 02", "text": "I just had a great time watching @justinbeiber!", "retweet_count": 6, "in_reply_to_status": null } ----------------------------------------------- twitter_hashtags - stores tweets containing a specific search term, storing specific information about the user that tweeted. Example: { "_id": ObjectId("ABCD"), "tweet_link": "http: \/\/twitter.com\/#!\/JoeShmo\/status\/1234", "in_reply_to_screen_name": null, "handle_name": "justinbeiber", "query": "%23baconbits", "user": { "statuses_count": 9065, "name": "Joseph Shmo", "friends_count": 296, "followers_count": 301, "location": "'Jersey", "screen_name": "JoeShmo" }, "status_id": "1234", "date": "2012-11-28 18: 33: 58", "text": "I loveeeee my #baconbits! bit.ly\/Xv4fd", "url": "bit.ly\/Xv4fd", "retweet_count": 36 } -----------------------------------------------
About
Classes that utilize python and mongo to scrape the Twitter Restful API and insert data into a database.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published