PyData Berlin 2016 Materials


Olivier Grisel, Predictive Modelling with Python

Julia Evans, How to trick a neural network

Wes McKinney, Python Data Ecosystem: Thoughts on Building for the Future


Daniel Kirsch, Functional Programming in Python

Trent McConaghy, BigchainDB: a Scalable Blockchain Database, in Python

David Higgins, Introduction to Julia for Python programmers

Katharina Rasch, What every Data Scientist should know about data anonymization

Alexander Sibiryakov, Frontera: open source, large scale web crawling framework

Thomas Reineking, Plumbing in Python: Pipelines for Data Science Applications

  • Yamal: not yet open-sourced

Ryan Henderson, image-match: a python library for searching for similar images in large corpora

Ian Ozsvald, Statistically Solving Sneezes and Sniffles (a work in progress)

Felix Biessmann, Predicting Political Views From Text

Jie Bao, ExpAn - A Python Library for A/B Testing Analysis

Anne Matthies, Zero-Administration Data Pipelines using AWS Simple Workflow

Daniel Moisset, Bridging the gap: from Data Science to service

Katharine Jarmul, Holy D@t*! How to Deal with Imperfect, Unclean Datasets

Nora Neumann, Usable A/B testing – A Bayesian approach

Frank Kaufer, Building Polyglot Data Science Platform on Big Data Systems

Lukasz Czarnecki, Brand recognition in real-life photos using deep learning

Edouard Fouché, Accelerating Python Analytics by In-Database Processing

Delia Rusu, Estimating stock price correlations using Wikipedia

Jakob van Santen, The IceCube data pipeline from the South Pole to publication

Moritz Neeb, Bayesian Optimization and its application to Neural Networks

Kashif Rasul, What's new in Deep Learning?

Nathan Epstein, Machine Learning at Scale

Ronert Obst and Dat Tran, PySpark in Practice

Jose Quesada, A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and cons

Martina Pugliese, Spotting trends and tailoring recommendations: PySpark on Big Data in fashion

Angelos Kapsimanis, The Simple Leads To The Spectacular (Cancelled)

Anton Dubrau, Using small data in the client instead of big data in the cloud

  • did not respond, yet

Nils Magnus, Dealing with TBytes of Data in Realtime

  • did not respond, yet

Abhishek Thakur, Classifying Search Queries without User Click Data

  • did not respond, yet

Jessica Palmer, Python and TouchDesigner for Interactive Experiments

  • did not respond, yet

Maciej Gryka, Removing Soft Shadows with Hard Data

  • did not respond, yet

Andreas Lattner, Setting up predictive analytics services with Palladium

  • did not respond, yet

Andrej Warkentin, Visualizing

  • did not respond, yet

James Powell, The kwarg problem

  • did not respond, yet

Matthew Honnibal, Designing spaCy: A high-performance natural language processing (NLP) library written in Cython

  • did not respond, yet

Valentine Gogichashvili, Data Integration in the World of Microservices

  • did not respond, yet

Michelle Tran, Chain, Loop & Group: How Celery Empowered our Data Scientists to Take Control of our Data Pipeline

  • did not respond, yet

Guertel Idai, Artificial Body Representation in Robots, Expectation and Surprise

  • did not respond, yet

Robert Meyer, pypet: A Python Toolkit for Simulations and Numerical Experiments

  • did not respond, yet

Juha Suomalainen, Visualizing research data: Challenges of combining different datasources

  • did not respond, yet

Danny Bickson, Python based predictive analytics with GraphLab Create

  • did not respond, yet

Fang Xu, Connecting Keywords to Knowledge Base Using Search Keywords and Wikidata

  • did not respond, yet

Dr. Markus Abel, Python Learns to Control Complex Systems

  • did not respond, yet


Frank Gerhardt, Using Spark - with PySpark

Mike Müller, Single-source Python 2/3

Katharine Jarmul, Data Wrangling with Python

Lev Konstantinovskiy, Practical Word2vec in Gensim

Shoaib Burq, Which city is the cultural capital of Europe? An introduction to Apache PySpark for GeoAnalytics

Lightning Talks

Oliver Zeigermann

Piotr Migdał, Teaching machine learning

Mentioned tools:


Collection of pointers to slides and repositories from speakers at PyData Berlin 2016


