
In The Attic

Hello! cloudera/gertrude is no longer being developed. It will remain available here but will not be updated further. Now, back to the README...

Gertrude: A Multilayer and Multivariate Experiment Framework for the JVM

Gertrude is a Java implementation of the overlapping experiments infrastructure used at Google and first described in Tang et al. (2010). It is designed to be powerful enough to support the types of experiments that data scientists, machine learning researchers, and software engineers need to run when developing data products (e.g., recommendation engines, search ranking algorithms, and large-scale classifiers), although it can also be used for testing new features and UI treatments.

The core of Gertrude has two parts. The first is a Java library that lets developers add experiment flags to their code to control the values of scalar parameters (booleans, ints, doubles, and strings). The second is an external configuration file that defines rules for setting the values of those flags on every request to the server, based on attributes of the request (such as a user's cookie or anonymous login ID).
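To make the idea of request-based flag values concrete, here is a minimal sketch of hash-based diversion in the spirit of the overlapping-experiments design. It is not Gertrude's API: the class `FlagSketch`, its methods, the layer salt, and the bucket ranges are all hypothetical names chosen for illustration, and CRC32 is simply a convenient stand-in for whatever hash a real implementation would use.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical illustration only; none of these names come from Gertrude. */
final class FlagSketch {

    /** Maps a diversion id (e.g. a cookie value) to a bucket in [0, numBuckets). */
    static int bucket(String diversionId, String layerSalt, int numBuckets) {
        CRC32 crc = new CRC32();
        // Salting with the layer name lets different layers divert independently.
        crc.update((layerSalt + ":" + diversionId).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % numBuckets); // getValue() is non-negative
    }

    /**
     * Returns experimentValue when the request's bucket falls in [start, end),
     * and defaultValue otherwise.
     */
    static int intFlag(String diversionId, String layerSalt, int numBuckets,
                       int start, int end, int defaultValue, int experimentValue) {
        int b = bucket(diversionId, layerSalt, numBuckets);
        return (b >= start && b < end) ? experimentValue : defaultValue;
    }

    public static void main(String[] args) {
        // Assumed setup: 1000 buckets, with buckets [0, 100) assigned to the experiment.
        int resultsPerPage = intFlag("user-123", "ranking", 1000, 0, 100, 10, 20);
        System.out.println("results per page for user-123: " + resultsPerPage);
    }
}
```

Because the bucket depends only on the diversion id and the layer salt, the same user sees the same flag value on every request, while a different salt in another layer reshuffles users independently.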

Gertrude has minimal dependencies and is intended to be used as a component library for production servers. The components of the framework are:

  • core: Core API definitions and experiment diversion logic
  • avro: Support for serializing experiment configurations as Apache Avro records
  • curator: Support for loading experiment configurations via Apache Curator, a library of patterns for Apache ZooKeeper
  • file: Support for loading experiment configurations from a file that is monitored for changes
  • server: Example code for creating core experiment classes and configuring them for use with a Java server; a good place to start to see how the framework is used
  • deploy: Simple command-line tool that parses an experiment configuration from a JSON or HOCON file, serializes it as an Avro object, and deploys the serialized object to a ZooKeeper node or file
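As a rough illustration of what "a file that is monitored for changes" can mean in practice, the sketch below reloads a configuration whenever a version stamp (here, the file's modification time) changes. This is an assumption about one possible approach, not the `file` module's actual implementation; `ReloadingConfigSketch`, `forFile`, and `refresh` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical illustration only; none of these names come from Gertrude. */
final class ReloadingConfigSketch {
    private final LongSupplier version;    // e.g. the file's last-modified time
    private final Supplier<String> loader; // reads the current configuration text
    private boolean loaded = false;
    private long lastVersion;
    private String contents = "";

    ReloadingConfigSketch(LongSupplier version, Supplier<String> loader) {
        this.version = version;
        this.loader = loader;
    }

    /** Wires the sketch to a file, using its modification time as the version stamp. */
    static ReloadingConfigSketch forFile(Path path) {
        return new ReloadingConfigSketch(
            () -> {
                try { return Files.getLastModifiedTime(path).toMillis(); }
                catch (IOException e) { throw new UncheckedIOException(e); }
            },
            () -> {
                try { return new String(Files.readAllBytes(path), StandardCharsets.UTF_8); }
                catch (IOException e) { throw new UncheckedIOException(e); }
            });
    }

    /** Reloads if the version stamp changed since the last load; returns true if it reloaded. */
    synchronized boolean refresh() {
        long v = version.getAsLong();
        if (loaded && v == lastVersion) {
            return false;
        }
        contents = loader.get();
        lastVersion = v;
        loaded = true;
        return true;
    }

    synchronized String contents() {
        return contents;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("experiments", ".conf");
        Files.write(path, "layers = []".getBytes(StandardCharsets.UTF_8));
        ReloadingConfigSketch config = forFile(path);
        System.out.println("reloaded: " + config.refresh());
        System.out.println("contents: " + config.contents());
        Files.delete(path);
    }
}
```

A server would typically call `refresh()` from a scheduled background task, so request threads only ever read the most recently loaded configuration.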

Gertrude is alpha code and is under active development, and we welcome new contributors. We will be co-developing Gertrude with Oryx, but Gertrude will remain a stand-alone library.

Gertrude is named for Gertrude Cox, the founder of the Department of Experimental Statistics at North Carolina State University and co-author of one of the classic texts in the field, Experimental Designs.