This repo contains all the files required for the August 13th, 2012 tutorial "Realtime Stream Processing". Below is the computing guide, which outlines the technical requirements for following along.
Rendered slides can be viewed at http://mynameisfiber.github.com/realtimestream/
You will need to have the following installed:
Which require the following dependencies:
The easiest way to install the required programs is to first install Homebrew and then use brew and pip to download the required files. There is also a great tutorial covering steps 1 and 2 of this guide at http://www.moncefbelyamani.com/how-to-install-xcode-homebrew-git-rvm-ruby-on-mac/ (follow steps 1-3 from that tutorial).
- Install GCC from XCode (see steps 1-2 of the tutorial linked above)
- Follow the instructions on http://mxcl.github.com/homebrew/ to install Homebrew (see step 3 of the tutorial linked above)
wget -O "brew_formulas.0.2.tar.gz" "https://github.com/mynameisfiber/realtimestream/blob/gh-pages/formula/brew_formulas.0.2.tar.gz?raw=true"
tar -xvzf "brew_formulas.0.2.tar.gz" -C /usr/local/Library/Formula/
rm "brew_formulas.0.2.tar.gz"
brew install simplequeue pubsub ps_to_http redis python
brew test simplequeue pubsub
pip install numpy "pysimplehttp>=0.2.1" redis ujson host_pool
(You may want to consider using virtualenv.)
* Note: If you get an error installing ujson, don't sweat it. Nothing depends on it, but it is a really great and fast alternative to json/simplejson. Simply run sudo pip install "pysimplehttp>=0.2.1" redis host_pool
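Since ujson is optional, code that wants its speed can fall back to the standard library when it is missing. The following is only a minimal sketch of that common pattern, not something taken from the tutorial materials; the helper names `encode_message` and `decode_message` are hypothetical.

```python
# Prefer ujson when it is installed; otherwise fall back to the
# standard-library json module, which exposes the same dumps/loads calls.
try:
    import ujson as json
except ImportError:
    import json


def encode_message(message):
    """Serialize a dict to a JSON string with whichever backend was imported.

    Hypothetical helper, for illustration only.
    """
    return json.dumps(message)


def decode_message(raw):
    """Parse a JSON string back into Python objects.

    Hypothetical helper, for illustration only.
    """
    return json.loads(raw)


if __name__ == "__main__":
    sample = {"user": "example", "count": 3}
    # A round trip should give back an equal dict with either backend.
    print(decode_message(encode_message(sample)) == sample)
```

The exact whitespace of the encoded string can differ between the two backends, so it is safer to compare decoded objects than raw strings.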
To get started on Ubuntu, we first use apt-get to install the dependencies, then use pip to download the Python libraries, and finally compile the programs in simplehttp manually.
sudo apt-get install make gcc libevent1-dev libcurl4-openssl-dev redis-server
sudo apt-get install ipython python-pip python-redis python-numpy git
sudo pip install "pysimplehttp>=0.2.1" ujson host_pool
* Note: If you get an error installing ujson, don't sweat it. Nothing depends on it, but it is a really great and fast alternative to json/simplejson. Simply run sudo pip install "pysimplehttp>=0.2.1" host_pool
git clone https://github.com/bitly/simplehttp.git
cd simplehttp/simplehttp
make && sudo make install
cd ../pubsub/
make && sudo make install
cd ../simplequeue
make && sudo make install
cd ../pubsubclient
make && sudo make install
cd ../ps_to_http
make && sudo make install
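Once the installation steps have finished, a quick check can confirm that the programs are on your PATH and the Python libraries are importable. The script below is a rough sketch and makes several assumptions: it needs a Python that provides `shutil.which` (3.3 or later), the executable names in `EXECUTABLES` are guessed from the install commands in this guide, and the import names in `MODULES` are assumed to match the pip package names.

```python
import importlib.util
import shutil

# Assumed executable names, based on the programs installed in this guide.
EXECUTABLES = ["simplequeue", "pubsub", "ps_to_http", "redis-server"]

# Assumed import names; they may differ from the pip package names.
MODULES = ["numpy", "redis", "pysimplehttp", "host_pool"]


def missing_executables(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for label, missing in (
        ("executables", missing_executables(EXECUTABLES)),
        ("python modules", missing_modules(MODULES)),
    ):
        if missing:
            print("Missing %s: %s" % (label, ", ".join(missing)))
        else:
            print("All %s found" % label)
```

If something is reported missing, rerun the matching install step before the tutorial. ujson is left out of the list on purpose because it is optional.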
We currently do not support Windows. If you have any success installing the required programs on Windows, please tell us so we can update this section! For the Python requirements, pip and Enthought will be useful.
Enthought, a firm that specializes in scientific applications using Python, is offering a one-year subscription to its Enthought Python Distribution (EPD) to tutorial attendees. For those unfamiliar with Python, EPD makes it simple to set up a complete Python environment. EPD offers a comprehensive array of libraries that include machine learning, statistical, and various data manipulation packages. Subscribers also have access to Enthought’s private repository and package management tools, as well as a series of instructional webinars on various data analysis topics. EPD works just like a standard python environment, so you can still use pip and easy_install as you normally would.
If you’d like a free one-year subscription to the EPD:
1. Register for an Enthought account.
2. Email datagotham@enthought.com and let us know you've registered as part of Datagotham. We will then activate your subscription and email further instructions.
Have fun at the tutorials!