Notes and code for my short talk on the Hypothesis testing framework

cmrosenberg/hypothesis_talk

Property-based testing and the Hypothesis framework

This repository holds the example code and some notes for a talk on property-based testing and the Hypothesis testing framework, which I gave at the Simula bi-weekly tools meetup on Friday, September 28th, 2018.

What is Property-based testing?

As the author of Hypothesis points out in an article on this very question, it is hard to give a crisp definition that captures everything associated with the term "property-based testing". The gist of it is to test that certain properties of your program hold by automatically (and often randomly) generating inputs for which each property should hold. It frees you from the tedium of coming up with (counter)examples by hand in the form of unit tests, and helps you find adversarial examples that break your code.

I recommend reading the aforementioned article if you want to explore the term "property-based testing" more.
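To make the idea concrete without any framework, here is a minimal sketch that uses only the standard library. The function under test (`reverse`) and the helpers `random_int_list` and `check_property` are made up for illustration; they are not part of this repository or of Hypothesis.

```python
import random


def reverse(xs):
    """Function under test: return a reversed copy of a list."""
    return xs[::-1]


def random_int_list(rng, max_len=20):
    """Hypothetical helper: build a random list of small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]


def check_property(prop, runs=200, seed=0):
    """Run `prop` on many generated lists; return the first counterexample or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        xs = random_int_list(rng)
        if not prop(xs):
            return xs
    return None


# A property that should hold: reversing twice gives back the original list.
assert check_property(lambda xs: reverse(reverse(xs)) == xs) is None

# A property that is false in general: reversing leaves a list unchanged.
counterexample = check_property(lambda xs: reverse(xs) == xs)
assert counterexample is not None
```

A framework such as Hypothesis replaces the hand-written generator and loop, and additionally tries to shrink any counterexample it finds to a small one.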

What is Hypothesis?

Hypothesis is a property-based testing framework that mainly targets Python, but also supports other platforms. It helps you write property-based tests by giving you lots of options for automatically generating example data of different kinds (even numpy arrays and pandas dataframes!). It integrates nicely with your existing unit testing framework, making it easy to adopt property-based testing gradually.
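As a rough illustration of what such a test looks like, here is a small sketch using the `given` decorator and the `strategies` module from Hypothesis. The two properties are chosen only for illustration and do not come from the talk.

```python
from hypothesis import given, strategies as st


@given(st.lists(st.integers()))
def test_sorting_is_idempotent(xs):
    # Sorting an already sorted list should change nothing.
    assert sorted(sorted(xs)) == sorted(xs)


@given(st.lists(st.integers()), st.lists(st.integers()))
def test_concatenation_adds_lengths(xs, ys):
    # The length of a concatenation is the sum of the lengths.
    assert len(xs + ys) == len(xs) + len(ys)
```

Because these are ordinary test functions, a runner such as pytest collects them alongside existing unit tests; Hypothesis supplies the arguments.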

Hypothesis is being used by several companies and open source projects. The payment service company Stripe uses it to extensively test their machine learning model training pipeline, for example.

Code examples

The talk revolved around a few code examples where I showcased different aspects of using Hypothesis.

The first example is borrowed from Hillel Wayne's excellent talk: a function that finds the maximal product of two elements from a list of integers. I show how quickly Hypothesis can find an input that breaks a "clever" implementation.

The second example discusses the challenge of checking whether CountVectorizer from scikit-learn behaves as expected when tokenizing on characters. In test_encoder.py I check whether CountVectorizer can handle any Unicode input. In test_restricted, I check whether CountVectorizer can cope when it is explicitly asked not to lowercase its inputs and receives a much more restricted set of Unicode letters.
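A test in the spirit of the restricted case might look like the sketch below. The alphabet (ASCII letters only) and the two properties checked are assumptions made for illustration; the actual tests in this repository are not reproduced here and may check something else.

```python
import string

from hypothesis import given, settings, strategies as st
from sklearn.feature_extraction.text import CountVectorizer


@settings(deadline=None)  # avoid spurious timing failures on slow machines
@given(st.text(alphabet=string.ascii_letters, min_size=1))
def test_char_counts_match_input(text):
    # Tokenize on single characters and keep the case as given.
    vectorizer = CountVectorizer(analyzer="char", lowercase=False)
    counts = vectorizer.fit_transform([text])
    # Every distinct character should become a feature ...
    assert set(vectorizer.vocabulary_) == set(text)
    # ... and the counts should add up to the length of the input.
    assert counts.sum() == len(text)
```

The alphabet is deliberately narrow: with whitespace or arbitrary Unicode in the input, preprocessing steps such as whitespace normalization or lowercasing can change the number of characters, so these simple properties no longer apply as stated.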

The third example shows a simple way of writing an extended unit test for the autograd differentiation library: We use autograd to find the derivative of np.sin, and then check whether it behaves like np.cos for a bunch of input arrays.

Installing the prereqs needed for the code examples

Either run pip install -r requirements.txt or run pip install pytest hypothesis autograd scikit-learn (the PyPI package that provides the sklearn module is named scikit-learn).

Resources

While researching Hypothesis and Property-based testing I benefitted greatly from the following:
