Fast iterative local development and testing of Apache Airflow workflows
Provision the training environment (right now only for the Data Science with Spark on Dataproc trainings).
A utility tool to automate certain tasks with Jupyter notebooks.
A Python grammar for evolutionary algorithms and heuristics
Genetic algorithms and the game of Risk
Skeleton project for Airflow training participants to work on.
A selection of notebooks coming from the GoDataDriven trainings
Support for nested data structures in pandas columns
Python interface to Hive and Presto. 🐝
Extract data from JIRA through REST and create charts.
Ansible scripts to create a Druid cluster
Example project demonstrating easy, concise and typechecked JDBC access
Material for PyData Code Breakfast: Introduction to Deep Learning
The iterative broadcast join example code.
Balancing Heroes and Pokemon in Real Time: A Streaming Variant of Trueskill for Online Ranking
Scripts to provision NiFi on HDInsight
Hadoop smoke testing framework
Sources for our blog
Repo to demonstrate streaming game imbalance
Example of how to use Flink with Kafka
bigger simulations = moar profit
Python interface for igraph