A utility tool to automate certain tasks with Jupyter notebooks.
Support for nested data structures in pandas columns
Fast iterative local development and testing of Apache Airflow workflows
Skeleton project for Apache Airflow training participants to work on.
A Python grammar for evolutionary algorithms and heuristics
Provision the training environment (right now only for the Data Science with Spark on Dataproc trainings)
Genetic algorithms and the game of Risk
A selection of notebooks coming from the GoDataDriven trainings
Python interface to Hive and Presto. 🐝
Extract data from JIRA through REST and create charts.
Ansible scripts to create a Druid cluster
Example project demonstrating easy, concise and typechecked JDBC access
Material for PyData Code Breakfast: Introduction to Deep Learning
The iterative broadcast join example code.
Balancing Heroes and Pokemon in Real Time: A Streaming Variant of Trueskill for Online Ranking
Scripts to provision NiFi to HDInsight
Hadoop smoke testing framework
Sources for our blog
Repo to demonstrate streaming game imbalance
Example of how to use Flink with Kafka
bigger simulations = moar profit