# I Know What You Did Last Summer
## Experiment Tracking Tools for Data Science

> Sarah Braden

> Phoenix Data Science Meetup

> 13 September 2016


Possible questions for the crowd:

1. How do data scientists here keep track of their experiments?
2. Has anyone ever gone back to a project, forgotten which model worked best, and had to rerun models to find out?
3. Has anyone ever worked on something late at night, gotten a great result, and then forgotten the exact configuration by morning?

<img style="float: center;" src="img/got_great_result.jpg">

<img style="float: center;" src="img/record_all_the_experiments.jpg">

### What

* Two Python libraries: Sacred and Sumatra
* Why to use an experiment management system
* How to get started
* Discussion of differences
* This talk is available on github (insert link)


### Why

* Reproducibility of experiments is critical to all science, including data science.
* Reduce errors in recording inputs, parameters, and outcomes.
* Automating experiment management makes it easier: Work smarter, not harder
* “I really need an experiment manager, but I want to roll my own experiment management system. It needs to be specific for my needs.” Why reinvent the wheel?

### What should an experiment manager be?

What are the needs of a data scientist?

"Sacred is a tool to help you configure, organize, log and reproduce experiments. It is designed to do all the tedious overhead work that you need to do around your actual experiment in order to:

* keep track of all the parameters of your experiment
* easily run your experiment for different settings
* save configurations for individual runs in a database
* reproduce your results"

# Sumatra

* Documentation: 
    * https://pythonhosted.org/Sumatra/
    * https://pypi.python.org/pypi/Sumatra
* GitHub: https://github.com/open-research/sumatra
* Licence: 2-clause BSD
* Started in 2009
* Supports Python 2.6, 2.7, 3.4, and 3.5

# Sumatra Features
* A command-line tool
* A web interface
* Creates directories and files in the repository
* Automatically creates a directory called “Data” when a project is initialized (the directory name can be customized)
* Requires that you keep your own code in a version control system (Subversion, Mercurial, Git, and Bazaar are currently supported)
* By default, refuses to run until you have committed your changes
* Uses an SQLite database by default, with the option to use PostgreSQL
* Target audience: academics

# How to install Sumatra

The web interface requires Django (>= 1.6) and the django-tagging package. Both are installed automatically if you `pip install sumatra`.

Install directly from the Python Package Index (version 0.7.4 at the time of this talk):
    
    pip install gitpython
    pip install sumatra

If you have downloaded the source package, Sumatra-0.7.0.tar.gz:

    tar xzf Sumatra-0.7.0.tar.gz
    cd Sumatra-0.7.0
    python setup.py install



# Getting started

# Sacred
<img style="float: center;" src="img/Monty_Python_Series.png">

> Every experiment is sacred

> Every experiment is great

> If an experiment is wasted

> God gets quite irate

# Sacred

* Documentation: https://pypi.python.org/pypi/sacred
* Github: https://github.com/IDSIA/sacred
* License: MIT
* Started in 2014

# Sacred Features
* Like Sumatra, Sacred has a command-line interface you can use to change parameters and run different variants.
* Uses MongoDB to store run information.
* Unlike Sumatra, it has no option to force you to commit your work before running.
* Sacred has better test coverage at this time. (Is this still true?)
* Automatic seeding helps control the randomness in your experiments so that the results remain reproducible.
* Does Sacred have a cool web interface like Sumatra? Not yet, but there is a proto-project: https://github.com/Qwlouse/prophet
* One user simply connects to MongoDB from Jupyter notebooks to look at the results. Not a bad option.
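
To illustrate why seeding matters for reproducibility, here is a minimal sketch that uses only the standard library. It does not call Sacred at all: `noisy_experiment` is a made-up stand-in for a run whose result depends on random numbers, and the seed values are arbitrary. Sacred's automatic seeding aims for the same effect without you passing the seed around by hand.

```python
import random


def noisy_experiment(seed):
    """Hypothetical experiment: returns three pseudo-random 'measurements'."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return [rng.random() for _ in range(3)]


first = noisy_experiment(42)
second = noisy_experiment(42)

# Same seed, same sequence of random numbers, same result.
print(first == second)
```

Recording the seed alongside the other parameters is what lets a later rerun match the original result.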


# How to Install Sacred

Install it from the Python Package Index (version 0.6.10):

    pip install sacred

Install manually:

    git clone https://github.com/IDSIA/sacred.git
    cd sacred
    python setup.py install

Recommended:

    pip install numpy pymongo pandas

# Getting Started

    from sacred import Experiment  # central class of the Sacred framework

    ex = Experiment('hello_config')

    @ex.config
    def my_config():
        recipient = "world"
        message = "Hello %s!" % recipient

    @ex.automain
    def my_main(message):
        print(message)

Sacred will run the my_config function and put all variables from its local scope into the configuration of our experiment.

# Run Object

# Observers

Experiments in Sacred collect lots of information about their runs:

* time it was started and time it stopped
* the used configuration
* the result or any errors that occurred
* basic information about the machine it runs on
* packages the experiment depends on and their versions
* all imported local source-files
* files opened with `ex.open_resource`
* files added with `ex.add_artifact`
* custom info

# Observers

To access this information you can use the observer interface. First you need to add an observer like this:

    from sacred.observers import MongoObserver

    # Store run information in a local MongoDB instance (default settings)
    ex.observers.append(MongoObserver.create())


# Capturing stdout / stderr

By default, Sacred captures everything that is written to sys.stdout and sys.stderr and transmits that information to the observers. Sometimes this is unwanted, for example when the output contains lots of live-updated progress bars. To keep the captured output from recording every update written to the console, you can add a captured-out filter to the experiment like this:

    from sacred.utils import apply_backspaces_and_linefeeds

    ex.captured_out_filter = apply_backspaces_and_linefeeds

Here apply_backspaces_and_linefeeds is a simple function that interprets all backspace and linefeed characters like in a terminal and returns the modified text. Any function that takes a string as input and outputs a (modified) string can be used as a captured_out_filter. For a simple example see examples/captured_out_filter.py.
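
Since any string-to-string function qualifies, a custom filter can be quite small. The sketch below is a hypothetical example, not part of Sacred: `collapse_carriage_returns` is a made-up name, and it handles only the carriage-return case in a simplified way by keeping the text after the last `\r` on each line. Attaching it assumes the `ex` object from the earlier example, so that line is left as a comment.

```python
def collapse_carriage_returns(text):
    """Hypothetical filter: keep only the last rewrite of each console line."""
    cleaned_lines = []
    for line in text.split("\n"):
        # A progress bar redraws the line after every "\r";
        # only the final redraw is worth recording.
        cleaned_lines.append(line.split("\r")[-1])
    return "\n".join(cleaned_lines)


raw = "epoch 1: 10%\repoch 1: 50%\repoch 1: 100%\ndone\n"
cleaned = collapse_carriage_returns(raw)
print(cleaned)

# With a Sacred experiment object named `ex` (assumed, as in the example above):
# ex.captured_out_filter = collapse_carriage_returns
```

A real terminal behaves in more subtle ways (a shorter rewrite does not erase a longer line, for instance), which is why the built-in `apply_backspaces_and_linefeeds` is usually the better choice.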


# Capture Function Decorator

# Caveats

By default, Sacred experiments will fail if run in an interactive environment like a REPL or a Jupyter Notebook.

Only variables that are JSON serializable (i.e. numbers, strings, lists, tuples, dictionaries) become part of the configuration. Other variables are ignored.
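
One way to check ahead of time which values would count as JSON serializable is to try them against the standard `json` module. This is a rough sketch under that assumption, not Sacred's own check: `is_json_serializable` is a hypothetical helper and the candidate values are made up for illustration.

```python
import json


def is_json_serializable(value):
    """Hypothetical helper: True if the standard json module accepts the value."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


candidates = {
    "learning_rate": 0.01,
    "layers": [64, 32],
    "name": "baseline",
    "tags": {"a", "b"},  # a set is not JSON serializable
    "opener": open,      # neither is a function
}

serializable = {key: is_json_serializable(value) for key, value in candidates.items()}
print(serializable)
```

Anything that fails this kind of check is worth converting (for example, a set to a list) before relying on it as a configuration entry.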

For running from the command line to work, the automain function needs to be at the end of the file. Otherwise, everything below it is not yet defined when the experiment is run.
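
The ordering caveat follows from how a decorator that runs its function immediately behaves. The sketch below reproduces the effect in plain Python without Sacred: `run_immediately` is a hypothetical stand-in for a decorator that calls the function as soon as it is applied, and `late_helper` plays the part of code defined further down the file.

```python
def run_immediately(func):
    """Hypothetical stand-in: call the decorated function right away."""
    func()
    return func


try:
    @run_immediately
    def main_too_early():
        # late_helper is defined further down, so it does not exist yet
        return late_helper()
except NameError as err:
    error_message = str(err)
else:
    error_message = None


def late_helper():
    return "ready"


print(error_message)
```

Moving the decorated function below `late_helper` makes the error go away, which is the same reason the automain function belongs at the end of the file.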

Under the hood, a Run object is created every time you run an Experiment (this is also the object that `ex.run()` returns). It holds some information about that run (e.g. the final configuration and, later, the result) and is responsible for emitting all the events that observers receive.
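
The relationship between a run and its observers can be pictured with a toy version of the observer pattern. Everything in this sketch is hypothetical: `ToyRun`, `RecordingObserver`, and the `started`/`completed` method names are invented for illustration and are not Sacred's classes or its observer interface.

```python
class RecordingObserver:
    """Hypothetical observer that remembers every event it is sent."""

    def __init__(self):
        self.events = []

    def started(self, config):
        self.events.append(("started", dict(config)))

    def completed(self, result):
        self.events.append(("completed", result))


class ToyRun:
    """Hypothetical run: holds the config, calls main, notifies observers."""

    def __init__(self, main_function, config, observers):
        self.main_function = main_function
        self.config = config
        self.observers = observers
        self.result = None

    def __call__(self):
        for observer in self.observers:
            observer.started(self.config)
        self.result = self.main_function(**self.config)
        for observer in self.observers:
            observer.completed(self.result)
        return self.result


observer = RecordingObserver()
run = ToyRun(lambda message: message.upper(), {"message": "hello"}, [observer])
result = run()
print(observer.events)
```

The point of the design is that the experiment code never talks to the database directly; it only produces a result, and whichever observers are attached decide where the record goes.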

# Looking at the data
Let's use a combination of Jupyter Notebook, pymongo, and pandas!

    import pymongo
    import pandas as pd

    # Connect to the local MongoDB instance and list its databases
    # (newer pymongo releases name this method list_database_names)
    connection = pymongo.MongoClient('127.0.0.1', 27017)
    connection.database_names()

    # Open the 'sacred' database and list its collections
    db = connection['sacred']
    db.collection_names()

    # The runs are read from the 'default.runs' collection
    collection = db.default.runs
    cursor = collection.find()
    # Expand the cursor and construct the DataFrame
    df = pd.DataFrame(list(cursor))

    df.sort_values('heartbeat')
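
Run documents keep the configuration nested inside a `config` field, so comparing parameters across runs is easier once that nesting is flattened into separate columns. The sketch below shows one way to do it with plain dictionaries. `flatten` is a hypothetical helper, and `sample_runs` is hand-written stand-in data with invented values, used so the example runs without a MongoDB instance.

```python
def flatten(document, prefix=""):
    """Hypothetical helper: turn nested dicts into dotted top-level keys."""
    flat = {}
    for key, value in document.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


# Invented stand-ins for documents returned by the cursor
sample_runs = [
    {"status": "COMPLETED", "config": {"recipient": "world", "seed": 1}},
    {"status": "FAILED", "config": {"recipient": "Phoenix", "seed": 2}},
]

rows = [flatten(run) for run in sample_runs]
print(rows[0])

# With pandas available, pd.DataFrame(rows) then gives one column per parameter.
```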

Do I want to talk about this?
Standard Output and Progress Bars
Writing a Custom Observer... 

## Which manager did I choose in the end?

After testing out both libraries I chose Sacred over Sumatra.

I found that Sacred was easier to start using out of the box. 

# Pros and Cons

### Junk slide

Data science experiments and machine learning models often have a nontrivial number of variables. 
Tracking variables with a pen-and-paper notebook, or even a spreadsheet, invites human error. 
Why not automate the experiment tracking process? 
Two Python libraries, Sumatra and Sacred, provide easy-to-use automated experiment tracking frameworks. 
This talk will discuss the differences between the libraries, how to get started using both, 
and how to customize the experiment trackers.  

I'm working on a talk about the importance of Data Science experiment reproducibility through the use of automated experiment trackers. 
I've researched two open source experiment tracking libraries in Python. In the talk I compare the two libraries Sacred and Sumatra. 
The need for automated experiment tracking came from my own need for better record keeping. 

### Final Slide

* help develop these open source projects
* Questions? Twitter: @ifmoonwascookie

### Further Reading
Davison A.P. (2012) Automated capture of experiment context for easier reproducibility in computational research. Computing in Science and Engineering 14: 48-56.

I made this presentation using RISE:

https://github.com/damianavila/RISE