GitHub - gnip/sample-python-connector: Sample Python connector to the Gnip streaming services

#Gnip Python Sample Connector This is a sample app which connects to the Gnip set of streaming APIs in Python. The application is broken down into three basic elements:

A GnipRawStreamClient which connects to the HTTP endpoint and buffers the streaming JSON
A GnipJSONStreamClient which wraps the GnipRawStreamClient and parses the JSON payloads, placing them onto a multiprocessing.Queue()
A set of processors which accept a multiprocessing.Queue() and take action accordingly

Some key notes about this design are as follows:

Modularity: We try to do our best to keep clean separation between different logical pieces of the app.
Multi-Processing: We have separate threads of execution (we are using processes as Python has a global interpreter lock)to handle the different steps of the process. For example, we do not write out to a database on the same thread that we are consuming the data. Rather, we use Queues to communicate between the different parts of the application.
Logging: We log as much relevant data as possible. If this were a production application, we would want to be able to trace what happened when without digging into live code.
Reconnection logic: Sometimes the stream will fail on us. We handle this gracefully by backing off the stream exponentially, and attempting to reconnect until we are successful.

##Requirements

Linux (Tested: Ubuntu 14.02, Multiprocessing libraries do not work on OSX)
Python (Tested: 2.7)
Access to PowerTrack or Decahose stream
pip (to install dependencies)
mysql-server-5.5 (optional)
Vagrant (optional, but we have included a Vagrantfile for convenience)

##Running To run with Linux:

git clone git@github.com:twitterdev/sample-python-connector.git sample-python-connector
cd sample-python-connector
sh script/bootstrap.sh
sh ./start.sh

To run with Vagrant:

vagrant init
vagrant ssh
cd /vagrant
sh ./start.sh

##Whats Next? Now that you can access the data, where do you go from here? The world is your oyster, and we can't wait to see what you do next. There are a few things to keep in mind however. Getting the data in the door is just the first step, proper storage of the data for purposes of keeping in compliance is important. The easiest way to achieve this is to make sure to keep record of both the individual Tweet ids and user ids, to facilitate removing tweets when necessary.

##LICENCE Refer to the LICENCE file

Name		Name	Last commit message	Last commit date
Latest commit History 74 Commits
bin		bin
config		config
script		script
src		src
tests		tests
.gitignore		.gitignore
LICENCE		LICENCE
README.md		README.md
Vagrantfile		Vagrantfile
requirements.txt		requirements.txt
screenshot.png		screenshot.png
start		start

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Languages

License

gnip/sample-python-connector

Folders and files

Latest commit

History

Repository files navigation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages