- Agile_Data_Code 137 Chapter-wise code for Agile Data, the O'Reilly book
- Collecting-Data 27 This is a HOWTO for collecting data in Ruby and Python applications and sending it to S3 via Kafka.
- Cloud-Stenography 16 Main Repo
- enron-python-flask-cassandra-pig 15 Hortonworks demo of Enron emails with Pig, Cassandra, Python and Flask
- pig-to-json 14 A Pig UDF that converts tuples and bags to JSON strings
Repositories contributed to
- scikit-learn/scikit-learn 11,629 scikit-learn: machine learning in Python
- infochimps-labs/big_data_for_chimps 149 A Seriously Fun guide to Big Data Analytics in Practice
- networkx/networkx 2,136 Official NetworkX source code repository.
- dcos/dcos-docs 54 Documentation for Datacenter Operating System (DC/OS)
- docker/docker 31,612 Docker - the open-source application container engine