topmodel is a service for evaluating binary classifiers. It comes with built-in metrics and comparisons so that you don't have to build your own from scratch.
You can store your data either locally or in S3.
Here are the graphs topmodel will give you for any binary classifier:
ROC (Receiver operating characteristic) curve
We also use bootstrapping to show the uncertainty on ROC curves and precision/recall curves. Here's an example:
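As a rough illustration of the bootstrapping idea (not topmodel's actual implementation), the sketch below resamples the scored instances with replacement, recomputes the ROC curve for each resample, and reports the spread of the resulting AUC values. The function names `roc_points`, `auc` and `bootstrap_auc_interval`, the number of resamples, and the 95% percentile interval are all assumptions made for this example.

```python
import random


def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points,
    sweeping the decision threshold from high to low."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Consume all items tied at this score so ties produce a single point.
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def bootstrap_auc_interval(labels, scores, n_resamples=500, seed=0):
    """Return an approximate 95% percentile interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    for _ in range(n_resamples):
        # Resample instances with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        n_pos = sum(1 for y in ys if y)
        if n_pos == 0 or n_pos == n:
            continue  # a ROC curve needs both classes
        aucs.append(auc(roc_points(ys, [scores[i] for i in idx])))
    if not aucs:
        raise ValueError("no resample contained both classes")
    aucs.sort()
    lower = aucs[int(0.025 * (len(aucs) - 1))]
    upper = aucs[int(0.975 * (len(aucs) - 1))]
    return lower, upper
```

A wide interval suggests the curve would move noticeably if the evaluation set were drawn again, which is the uncertainty the bootstrapped graphs are meant to convey.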
The idea here is that among all items with score 0.9, you expect 90% of them to be in the target group (marked 'True'). This graph compares the expected rate to the actual rate -- the closer it is to the diagonal line where the two rates are equal, the better.
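The comparison of expected and actual rates can be sketched as follows. This is a minimal example, assuming scores lie in [0, 1] and are grouped into equal-width bins; the function name `calibration_table` and the binning scheme are assumptions for illustration and may differ from what topmodel does internally.

```python
def calibration_table(labels, scores, n_bins=10):
    """Group instances by score and compare the mean score (expected rate)
    with the fraction of 'True' labels (actual rate) in each bin.

    Assumes every score lies in [0, 1].
    Returns a list of (mean_score, actual_rate, count) for non-empty bins.
    """
    # Each bin holds: [count, sum of scores, number of positives]
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for y, s in zip(labels, scores):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[b][0] += 1
        bins[b][1] += s
        bins[b][2] += 1 if y else 0
    return [(total / count, positives / count, count)
            for count, total, positives in bins if count]
```

For a well-calibrated model the first two values in each row are close to each other; plotting one against the other gives a graph of the kind described above.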
There are two graphs that show score distributions for instances labelled 'True' and instances labelled 'False'. The first graph shows the log distribution of the scores:
And the second shows the absolute frequencies:
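One way to compute the numbers behind such graphs is sketched below. It assumes scores lie in [0, 1], uses equal-width bins, and reads "log distribution" as the same per-bin counts shown on a log scale; all three are assumptions, and `score_histograms` and `log_counts` are hypothetical helper names.

```python
import math


def score_histograms(labels, scores, n_bins=10):
    """Count scores per bin, separately for 'True' and 'False' instances.

    Assumes every score lies in [0, 1].
    """
    counts = {True: [0] * n_bins, False: [0] * n_bins}
    for y, s in zip(labels, scores):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        counts[bool(y)][b] += 1
    return counts


def log_counts(counts):
    """Base-10 log of each count; empty bins map to None since log(0) is undefined."""
    return [math.log10(c) if c > 0 else None for c in counts]
```

The log view makes sparsely populated bins visible next to very full ones, while the absolute counts show where most instances actually fall.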
Using topmodel locally
topmodel comes with example data so you can try it out right away. Here's how:
Create a virtualenv
Install the requirements:
pip install -r requirements.txt
Start a topmodel server:
topmodel should now be running at http://localhost:9191.
See a page of metrics for some example data at http://localhost:9191/model/data/test/my_model_name/
You can now add new models for evaluation! (see "How to add a model to topmodel" below for more)
Using topmodel with S3
It's better to store your model data in an S3 bucket, so that you don't lose it. To get this working:
cp config_example.yaml config.yaml
and fill it in with the S3 bucket you want to use and your AWS secret key and access key. topmodel will automatically find models in the bucket as long as they're named correctly (see "How to add a model to topmodel").
Then start topmodel with
How to add a model to topmodel
Create a TSV with columns 'pred_score' and 'actual'. Save it to your_model_name.tsv. The columns should be separated by tabs. In each row:
actual should be 0 or 1 (True/False also work).
pred_score should be the score the model determined.
weight is an optional third column if you want to weight different instances more or less (default is 1).
- See the examples in
- For example:
actual	pred_score
False	0.2
True	0.8
True	0.7
False	0.3
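A file in this layout can be produced with Python's standard `csv` module. The sketch below is illustrative only: `write_scores_tsv` is a hypothetical helper, and it assumes a header row with the column names shown in the example above.

```python
import csv


def write_scores_tsv(path, rows, include_weight=False):
    """Write rows of (actual, pred_score[, weight]) to a tab-separated file."""
    header = ["actual", "pred_score"] + (["weight"] if include_weight else [])
    with open(path, "w", newline="") as f:
        # Tab delimiter and plain "\n" line endings.
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        for row in rows:
            if len(row) != len(header):
                raise ValueError("expected %d fields, got %r" % (len(header), row))
            writer.writerow(row)


# Example: the four rows from the table above.
# write_scores_tsv("scores.tsv",
#                  [(False, 0.2), (True, 0.8), (True, 0.7), (False, 0.3)])
```

Passing `include_weight=True` adds the optional third column, in which case each row needs three values.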
Copy the TSV to S3 at
s3://your-s3-bucket/your_model_name/scores.tsv, or locally to
You're done! Your model should appear at http://localhost:9191/ if you reload.
We'd love for you to contribute. If you run topmodel with
it will autoreload code.
There's example data to test on in
Copyright 2014 Stripe, Inc
Licensed under the MIT license.