The Containerized Online Bandit Experimentation (COBE) Platform

A containerized experimentation platform built to monitor online controlled experiments learned under contextual bandit policies in real time. Received an Honorable Mention in the 2023 Docker AI/ML Hackathon.

View Demo · View Devpost Submission · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

The Containerized Online Bandit Experimentation (COBE) Platform is built to monitor the performance of online controlled experiments learned under contextual bandit policies in real time. The COBE Platform seeks to answer questions that standard A/B testing cannot, including the following:

  • What if the chosen variation during the rollout phase of the experimentation process degrades in performance over time?
  • Will personalizing the choice of variation for each user successfully optimize the targeted metric?
  • Is there a faster way to identify better performing variations at a lower opportunity cost?

Companies with an experimentation-first culture can benefit greatly from online controlled experiments that adjust and optimize future decisions based on the data collected from each observation. For example, Stitch Fix supports multi-armed bandits in its experimentation platform, allowing data scientists to implement their own reward models and plug them into the allocation engine via a dedicated microservice for each bandit experiment.

Inspired by Stitch Fix's case study, we built the COBE Platform using $n$ Docker containers, one per variation of the landing page, so that the only difference between containers is the variation under test, as the definition of the experiment requires. An HTTP load balancer container routes each user to one of the $n$ variations. Lastly, a policy learner container learns a policy that aims to optimize the targeted metric using an online contextual bandit system. Shown below is a high-level diagram of the technical architecture of the COBE Platform in its current state for $n = 2$ (one control and one treatment).

stateDiagram

    classDef platform font-family: courier, font-size:16px, fill:transparent, stroke-width:2px
    classDef container font-family: courier, font-size:12px, fill:transparent
    classDef actor font-family: courier, font-size:12px
    classDef none font-family: none, font-size:none

    direction LR
    Users --> LoadBalancer:::container
    Dev --> CobePlatform:::platform

    state CobePlatform {
        direction LR
        LoadBalancer --> WebControl
        LoadBalancer --> WebTreatment
        WebControl --> PolicyLearner
        WebTreatment --> PolicyLearner
        PolicyLearner --> LoadBalancer
    }
    WebControl:::container --> Users:::actor
    WebTreatment:::container --> Users
    Users --> PolicyLearner:::container
    WebControl --> WandB:::platform
    WebTreatment --> WandB
    PolicyLearner --> WandB
    WandB --> Dev:::actor
    Dev --> PolicyLearner
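
The feedback loop in the diagram (the load balancer routes a user, a web container reports a reward, the policy learner updates the routing probabilities) can be sketched in a few lines of Python. This is a minimal illustration only: it uses a plain epsilon-greedy rule rather than the VowpalWabbit learner the platform is built with, and the class name, the context label, and the epsilon value are all assumptions made for the example.

```python
import random
from collections import defaultdict

ARMS = ("control", "treatment")  # n = 2, as in the diagram


class EpsilonGreedyPolicy:
    """Toy contextual bandit: one running mean reward per (context, arm).

    Hypothetical stand-in for the policy learner container; not the
    platform's actual VowpalWabbit-based implementation.
    """

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(lambda: {arm: 0 for arm in ARMS})
        self.means = defaultdict(lambda: {arm: 0.0 for arm in ARMS})

    def probabilities(self, context):
        """Return the routing probability of each arm for this context."""
        means = self.means[context]
        best = max(ARMS, key=lambda arm: means[arm])  # ties go to the first arm
        explore = self.epsilon / len(ARMS)
        return {
            arm: explore + (1.0 - self.epsilon if arm == best else 0.0)
            for arm in ARMS
        }

    def choose(self, context):
        """Sample an arm according to the current probabilities."""
        probs = self.probabilities(context)
        return self.rng.choices(ARMS, weights=[probs[a] for a in ARMS])[0]

    def update(self, context, arm, reward):
        """Fold one observed reward (0 or 1) into the running mean."""
        self.counts[context][arm] += 1
        n = self.counts[context][arm]
        self.means[context][arm] += (reward - self.means[context][arm]) / n
```

For example, after `policy.update("developer", "control", 1)` and `policy.update("developer", "treatment", 0)` with `epsilon=0.2`, `policy.probabilities("developer")` puts most of the probability mass on the control arm while still reserving a share for exploration.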

Built With

  • Python
  • Docker
  • Django
  • JQuery
  • Gunicorn
  • NGINX
  • VowpalWabbit
  • Weights and Biases
  • Ubuntu
  • Visual Studio Code

(back to top)

Getting Started

Prerequisites

  1. Create and sign into your Weights and Biases account.
  2. Locate your API key here, copy it, and add it to the .env file as the value of the WANDB_API_KEY environment variable.
  3. Install Docker Desktop.
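
Before building the containers, it can help to confirm that the key from step 2 actually made it into the .env file. The helper below is a small sketch, assuming the file uses plain KEY=VALUE lines; the function names are hypothetical and not part of the repository.

```python
from pathlib import Path


def read_env(path=".env"):
    """Parse KEY=VALUE lines from a dotenv-style file, skipping blanks and comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, which dotenv-style files commonly allow.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_wandb_key(path=".env"):
    """Return True if WANDB_API_KEY is present and non-empty."""
    return bool(read_env(path).get("WANDB_API_KEY"))
```

Calling `has_wandb_key()` from the repository root returns `False` when the variable is missing or left empty.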

Installation and Usage

  1. Clone the repository to your local environment.

    git clone https://github.com/wirrywoo/cobe-platform.git
  2. Go into the cobe-platform main directory, then build and start the containers.

    cd cobe-platform; docker compose up -d
  3. Open a browser and go to http://127.0.0.1/cobe-platform-demo/?seed=1 to see the control group, or http://127.0.0.1/cobe-platform-demo/?seed=3 to see the treatment group. See the Screenshots section for reference images of the control and treatment versions of the landing page.

  4. Clicking the Sign Me Up! button registers reward = 1 in the logs of the respective Docker container and records the reward in a Weights and Biases project named cobe-platform. Conversely, navigating away or refreshing the page without clicking the button registers reward = 0 in the same locations.

  5. To observe contextual bandits in action, see the following Google Colab notebook, which simulates hundreds of users interacting with the COBE Platform.
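
The reward rule from step 4 (a click on the button yields reward = 1, leaving or refreshing without a click yields reward = 0) can be written down as a small sketch. The `RewardLog` class and its field names are hypothetical; the platform itself writes to the container logs and to Weights and Biases, which this in-memory example does not attempt to reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class RewardLog:
    """In-memory stand-in for the container log (illustrative only)."""

    events: list = field(default_factory=list)

    def record(self, variation, clicked):
        """Store one page view and return the reward assigned to it."""
        reward = 1 if clicked else 0  # click -> 1, navigate away or refresh -> 0
        self.events.append({"variation": variation, "reward": reward})
        return reward


log = RewardLog()
log.record("control", clicked=True)     # user pressed Sign Me Up!
log.record("treatment", clicked=False)  # user left without clicking
```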

(back to top)

Screenshots

Landing Page for Control Group (with Docker Logo): control

Landing Page for Treatment Group (without Docker Logo): treatment

Visualizations

Average Reward Performance of Control vs. Treatment Variations

Under an unobserved cost function used to synthetically generate data for simulation purposes in the provided Google Colab notebook, the control variant (landing page with Docker logo) outperforms the treatment variant (landing page without Docker logo) as the number of observations grows.

simulated_avg_reward
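
An average-reward curve of this kind is a running mean of the rewards observed for each variation. The sketch below shows one way to compute it from a stream of (variation, reward) pairs; the function name and the input format are assumptions for illustration, and the notebook may compute the curve differently.

```python
def running_average_reward(events):
    """Map each variation to its running mean reward after every observation.

    `events` is an iterable of (variation, reward) pairs in arrival order.
    """
    totals, counts, curves = {}, {}, {}
    for variation, reward in events:
        totals[variation] = totals.get(variation, 0) + reward
        counts[variation] = counts.get(variation, 0) + 1
        curves.setdefault(variation, []).append(
            totals[variation] / counts[variation]
        )
    return curves
```

Each list in the result can be plotted against its index to compare the variations over the number of observations.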

Updating NGINX Probabilities from CB Learning

For a fixed user, the probabilities of directing that user to each landing page converge as more users interact with the control and treatment versions. Using myself as an example in the provided Google Colab notebook, the policy recommends showing the control version of the landing page to all users similar to me in terms of click activity and technical skills.

learned_probabilities_for_me
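
This README does not spell out how the learned probabilities reach NGINX. One plausible approach, sketched here purely as an assumption, is to convert them into integer weights for a weighted upstream block. The upstream name, server names, and port below are hypothetical, and weights set this way apply to all traffic rather than to one user's context.

```python
def to_nginx_weights(probabilities, scale=100):
    """Convert routing probabilities into positive integer weights.

    Weights are floored at 1 so that every variation keeps receiving
    some traffic, which a bandit needs in order to keep exploring.
    """
    return {name: max(1, round(p * scale)) for name, p in probabilities.items()}


def render_upstream(name, weights, port=8000):
    """Render a weighted NGINX upstream block as text."""
    lines = [f"upstream {name} {{"]
    for server, weight in weights.items():
        lines.append(f"    server {server}:{port} weight={weight};")
    lines.append("}")
    return "\n".join(lines)


weights = to_nginx_weights({"web-control": 0.9, "web-treatment": 0.1})
config = render_upstream("cobe_backend", weights)
```

The rendered text would still need to be written to the NGINX configuration and picked up with a reload, which is outside the scope of this sketch.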

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Contact

Wilson Cheung - Personal Website - info@wilsoncheung.me

Project Link: https://github.com/wirrywoo/cobe-platform

(back to top)
