Chapter-wise code for Agile Data, the O'Reilly book

Agile Data the Book

You can buy the book here. You can read the book on O'Reilly OFPS now. Work through the chapter code examples as you go. Don't forget to initialize your Python environment. If any of the requirements fail to install in your virtualenv, try the Linux (apt-get, yum) or OS X (brew, port) packages instead.

Agile Data Code Examples

Set up your Python Virtual Environment

# From project root

# Set up the Python virtualenv
virtualenv -p `which python2.7` venv --distribute
source venv/bin/activate
pip install -r requirements.txt
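
Before working through the chapter code, it can help to confirm that the virtualenv is actually active. The following is a minimal sketch, not part of the book's code: the helper name `in_virtualenv` is hypothetical, and it relies only on the standard `sys` attributes that virtualenv and venv set.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Classic virtualenv (typical for Python 2.7) sets sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases set sys.base_prefix instead;
    # it differs from sys.prefix only inside an environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtualenv active: {0}".format(in_virtualenv()))
    print("interpreter: {0}".format(sys.executable))
```

Run it after `source venv/bin/activate`. If it reports that no virtualenv is active, `pip install -r requirements.txt` is probably installing into the system Python rather than into `venv`.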

Chapter 2: Data

An example spreadsheet is available at ch02/Email Analysis.xlsb. Example Pig code is available at ch02/probability.pig.

Chapter 3: Agile Tools

The full tutorial is in the Chapter 3 README.

Highlight:

Download your Gmail Inbox!

# From ch03

# Download your gmail inbox
cd gmail
./gmail.py -m automatic -u me@gmail.com -p 'my_password_' -s ./email.avro.schema -f '[Gmail]/All Mail' -o /tmp/test_mbox 2>&1 &
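
If you would rather not type your password into the shell (where it ends up in your history), the same command can be launched from Python. This is a rough sketch under stated assumptions: the flags are copied verbatim from the command above, but what each one means is inferred from that example rather than confirmed, and the helpers `build_gmail_command` and `start_download` are hypothetical, not part of the repository.

```python
import getpass
import subprocess


def build_gmail_command(user, password,
                        schema="./email.avro.schema",
                        folder="[Gmail]/All Mail",
                        output="/tmp/test_mbox"):
    """Build the argument list for gmail.py, using the flags from the README example."""
    return [
        "./gmail.py",
        "-m", "automatic",
        "-u", user,
        "-p", password,
        "-s", schema,
        "-f", folder,
        "-o", output,
    ]


def start_download(user):
    """Prompt for the password, then start gmail.py without waiting for it to finish."""
    # getpass keeps the password off the screen and out of shell history.
    password = getpass.getpass("Gmail password: ")
    # stderr=subprocess.STDOUT mirrors the shell's 2>&1; Popen returns
    # immediately, which roughly corresponds to the trailing &.
    return subprocess.Popen(build_gmail_command(user, password),
                            stderr=subprocess.STDOUT)
```

Call `start_download("me@gmail.com")` from the `gmail` directory, and call `.wait()` on the returned process if you want to block until the download finishes. The password is still passed to `gmail.py` as a command-line argument, so it remains visible in the process list while the script runs.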

Chapter 4: To the Cloud!

Chapter 4 tutorial

Chapter 7: Collecting and Displaying Atomic Records

Chapter 7 tutorial

Chapter 8: Creating Charts

Chapter 8 tutorial

Chapter 9: Building Interactive Reports

Chapter 9 tutorial

Chapter 10: Making Predictions

Chapter 10 tutorial

Chapter 11: Driving Actions

Chapter 11 tutorial
