Tweet Streamer

Uses "Twitter Streaming API" to get the target tweets(real-time) for a recent high traffic event(s), and storing them in MongoDB. Later, tweets can be filtered using REST API

API Provides :-

Endpoints to start streaming.
Time limit based streaming.
Ordering of tweets based on particular fields.
Searching of words/text in tweets.
Searching of tweets based on username.
Filtering of data in integer fields based on min/max value provided
Filtering of data in string fields based on ending or starting word or a word that the field should contain.
Filtering of data based on starting and ending dates
CSV export support

Setup and Installation :-

Clone the git repository.
Make sure python3.5+ is installed.
Install the requirements listed in requirements.txt

Twitter Authentication :-

Open file settings.py
Replace your-keys* , your-tokens* with your authentication keys and access tokens.

Creating Database and establishing the server

Open the terminal and enter:-

$ cd TweetStreamer
$ mkdir data
$ echo 'mongod --bind_ip=$IP --dbpath=data --nojournal --rest "$@"' > mongod
$ chmod a+x mongod

To start the Mongo Server :-

$ ./mongod

Running the REST Server

$ cd TweetStreamer
$ python app.py

Schema

Fields: In MongoDB, every document tweet will contain following fields -

tweet_text: string,
screen_name : string,
user_name: string,
location: string,
source_device: string,
is_retweeted: boolean,
retweet_count: integer,
country: string,
country_code: string,
reply_count: integer,
favorite_count: integer,
created_at: datetime,
timestamp_ms: long,
lang: string,
hashtags: array,
quote_count: integer

API Endpoints :-

1. To start/trigger streaming :-

GET /stream?keywords=______________

here replace ____ by keywords you want to search seperated by a comma

GET /stream?keywords=________ &time_limit=____________

here replace {any-keyword} by keywords you want to search,
{time-limit} by an integer value in seconds stating the time limit for which you want to stream the tweets.
default time limit is 100 seconds

Response

{
  "status": "success",
  "message": "Started streaming tweets with keywords [u'cricket', u'football', u'Rafale']"
}

2. To fetch the tweets :-

GET /tweets/?limit=_ & page=__

to return a limited number of tweets :- replace limit=______ with an integer value
to return based on page number with 50 tweets per page :- replace page=____ with an integer value

3. To fetch the users :-

GET /users/name=____________

here replace ______ by username to search for a particular user

4. To search in tweets :-

GET /tweets/?search=______

here replace ______ by the appropriate words you want to search for in tweet text

5. To sort/order the tweets with respect to a particular field :-

GET /tweets/?ordering=________

for ascending :- replace ______ by field-name and
for descending :- replace _______ by -field-name

6. To search based on max/min in string columns :-

GET /tweets/string_search?column_name= & start= & end=

here enter the integer column name in which you want to search in the space column_name=______
replace the space in end variable with the end word you want to search in the given column
replace the space in start variable with the starting word you want to search in the given column
enter both in case you want to give both constraints

GET /tweets/string_search?column_name=__ &contain=_

here enter the integer column name in which you want to search in the space column_name=______
replace the contain variable with the word you want the resulting column entries must contain

GET /tweets/string_search?column_name=____ & match=_

here enter the integer column name in which you want to search in the space column_name=______
replace the match variable with the word to list down the entries with that word in the given column

7. To search based on max/min in integer columns :-

GET /tweets/integer_search?column_name= & max= & min=

here enter the integer column name in which you want to search in the space column_name=______
replace the space in max variable with the maximum value for the column name
replace the space in min variable with minimum value for the column name
enter both in case you want to give both constraints

8. To filter based on Date :-

GET /tweets/date?startDate=__ & endDate=_

replace the space in startDate variable with the starting date from which you want the tweets to be displayed
replace the space in endDate variable with the ending date from which you want the tweets to be displayed
enter both in case you want to give both constraints

9. To export tweets data as CSV:-

GET /csv/

Technologies used :-

Python 3.5
Flask Framework
MongoDB
Tweepy (twitter streaming library in python)

Sample Response

[
    {
        "hashtags": [],
        "user_name": "Zesty StL Blues",
        "country": "",
        "is_retweeted": false,
        "location": "St Louis, MO",
        "tweet_text": "Blues defenseman Dunn upset by apparent benching https://t.co/1gYP1FafRDhttps://t.co/FDWEZYTmUi",
        "reply_count": 0,
        "created_at": "Thu Oct 11 10:04:58 +0000 2018",
        "source_device": "<a href=\"http://zestynews.com\" rel=\"nofollow\">Zesty Blues Tweets</a>",
        "favorite_count": 0,
        "country_code": "",
        "timestamp_ms": "1539252298068",
        "lang": "en",
        "screen_name": "zesty_blues",
        "quote_count": 0,
        "retweet_count": 0
    },
    {......},
    {......},
    {......},
    {......}
]

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
README.md		README.md
app.py		app.py
commons.py		commons.py
requirements.txt		requirements.txt
settings.py		settings.py
streamListener.py		streamListener.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Tweet Streamer

Uses "Twitter Streaming API" to get the target tweets(real-time) for a recent high traffic event(s), and storing them in MongoDB. Later, tweets can be filtered using REST API

API Provides :-

Setup and Installation :-

Twitter Authentication :-

Creating Database and establishing the server

Running the REST Server

Schema

API Endpoints :-

1. To start/trigger streaming :-

GET /stream?keywords=______________

GET /stream?keywords=________ &time_limit=____________

2. To fetch the tweets :-

GET /tweets/?limit=_ & page=__

3. To fetch the users :-

GET /users/name=____________

4. To search in tweets :-

GET /tweets/?search=______

5. To sort/order the tweets with respect to a particular field :-

GET /tweets/?ordering=________

6. To search based on max/min in string columns :-

GET /tweets/string_search?column_name= & start= & end=

GET /tweets/string_search?column_name=__ &contain=_

GET /tweets/string_search?column_name=____ & match=_

7. To search based on max/min in integer columns :-

GET /tweets/integer_search?column_name= & max= & min=

8. To filter based on Date :-

GET /tweets/date?startDate=__ & endDate=_

9. To export tweets data as CSV:-

GET /csv/

Technologies used :-

Sample Response

About

Releases

Packages

Languages

bhardwajaditya/TweetStreamer

Folders and files

Latest commit

History

Repository files navigation

Tweet Streamer

Uses "Twitter Streaming API" to get the target tweets(real-time) for a recent high traffic event(s), and storing them in MongoDB. Later, tweets can be filtered using REST API

API Provides :-

Setup and Installation :-

Twitter Authentication :-

Creating Database and establishing the server

Running the REST Server

Schema

API Endpoints :-

1. To start/trigger streaming :-

GET /stream?keywords=______________

GET /stream?keywords=________ &time_limit=____________

2. To fetch the tweets :-

GET /tweets/?limit=_____ & page=______

3. To fetch the users :-

GET /users/name=____________

4. To search in tweets :-

GET /tweets/?search=______

5. To sort/order the tweets with respect to a particular field :-

GET /tweets/?ordering=________

6. To search based on max/min in string columns :-

GET /tweets/string_search?column_name=________ & start=______ & end=______

GET /tweets/string_search?column_name=________ &contain=_______

GET /tweets/string_search?column_name=________ & match=_____

7. To search based on max/min in integer columns :-

GET /tweets/integer_search?column_name=________ & max=______ & min=______

8. To filter based on Date :-

GET /tweets/date?startDate=________ & endDate=_______

9. To export tweets data as CSV:-

GET /csv/

Technologies used :-

Sample Response

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

GET /tweets/?limit=_ & page=__

GET /tweets/string_search?column_name= & start= & end=

GET /tweets/string_search?column_name=__ &contain=_

GET /tweets/string_search?column_name=____ & match=_

GET /tweets/integer_search?column_name= & max= & min=

GET /tweets/date?startDate=__ & endDate=_

Packages