Sieve is a simple, API-based a/b testing framework designed for ease and speed of use. It is meant for organizations that prefer to build and operate their own a/b testing framework rather than use a third-party service.
There are several third-party optimization and a/b testing cloud services, but for a variety of reasons -- such as cost, privacy, or flexibility -- you might want to roll your own. Sieve is built exactly for this purpose.
- Easy to use (REST API Driven)
- Integrated client-side SDK, dashboard, and backend analytics.
- You own your data. No need to send it to a third-party service provider.
- Inexpensive. Third-party services can be costly; some charge not per user but for every experiment that a user is a part of.
- Real-time allocation of users to buckets.
- Pluggable design. Determining the statistical significance of an experiment's results depends on the assumed underlying distribution. Sieve comes with chi-square and normal distributions built in, and you can plug in your own distribution if you want.
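To illustrate the kind of check a built-in chi-square distribution performs, here is a minimal sketch for a two-variation experiment, framed as a 2x2 contingency table of converted vs. not-converted users. The function name and table framing are our assumptions for illustration, not part of the Sieve API:

```js
// Sketch: chi-square statistic for a 2x2 table (converted / not converted
// per variation). Illustrative only; not the actual Sieve implementation.
function chiSquare2x2(convA, totalA, convB, totalB) {
    var failA = totalA - convA;
    var failB = totalB - convB;
    var total = totalA + totalB;
    var conv  = convA + convB;
    var fail  = failA + failB;

    // Expected counts under the null hypothesis (no difference between variations)
    var cells = [
        [convA, totalA * conv / total],
        [failA, totalA * fail / total],
        [convB, totalB * conv / total],
        [failB, totalB * fail / total]
    ];

    return cells.reduce(function (chi, cell) {
        var obs = cell[0], exp = cell[1];
        return chi + Math.pow(obs - exp, 2) / exp;
    }, 0);
}

// With 1 degree of freedom, a statistic above 3.841 means p < 0.05
var stat = chiSquare2x2(120, 1000, 150, 1000);
console.log(stat > 3.841 ? 'significant' : 'not significant');
```

A pluggable distribution would replace the statistic and critical value with ones appropriate to the metric's assumed distribution.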
This project is divided into three components:
- SDK: hosts the npm module sieve-js, to be integrated into any app or website that wants to run a/b tests.
- API Server: provides endpoints to create and manage the a/b tests (aka experiments). Also handles experiment allocation, tracking and reporting.
- Dashboard: is the admin UI for (1) managing experiments and (2) tracking their performance as the experiment receives traffic.
The flow:

1. Create an experiment using the dashboard, set its exposure, add variations, and set their splits.

2. Integrate the SDK into your client, and request the server to allocate experiments and their variations. Trigger the required changes in the client based on the response.

```js
var client = new Sieve({ base_url: 'https://sieve.server' });

client.allocate()
    .then(function (experiments) {
        // Change client based on allocated experiments+variations
    })
    .catch(function (err) {
        console.error(err);
    });
```

3. Track relevant user actions.

```js
client.track('pay_btn_click');
```
All tracked events are stored in files as JSON strings. These files need to be processed offline at regular intervals and then saved into the analytics database; a Spark Java application is bundled along with the server code for this. Using it requires a Spark or EMR cluster to be set up. If you prefer not to do this, you can write your own scripts to parse the log files and save the events into the database.
- node >= 4
- MySQL
- Redis
- Redshift cluster / PostgreSQL server
(Optional) For building the Java Data Processing app:
- JDK 8
- Maven 3
- Apache Spark 2+ / AWS EMR cluster
To make it easy for you to test out the framework, we've created a simple demo app. This section walks through getting it running.
Let's start off by cloning this repository:
```sh
git clone https://github.com/agaralabs/sieve.git
```

Change your current working directory to the demo:

```sh
cd sieve/demo
```

This directory contains the demo app, a sample config, and a script that installs the dependencies, builds all the components, creates the tables, and seeds them with the sample data.
But before running this script, make sure NodeJS, MySQL, and Redis are installed, and that the MySQL and Redis services are running.
Now you need to create the sieve_demo database and the sieveuser user, and grant privileges for the database:
```sql
CREATE SCHEMA `sieve_demo`;
CREATE USER sieveuser IDENTIFIED BY "password";
GRANT ALL PRIVILEGES ON sieve_demo.* TO sieveuser;
```

If you need to, modify the demo-config.ini. For more details, refer to this document.
Finally, run the demo:
```sh
./run-demo
```

If the script runs into any issue during this process, it will throw an error. Re-run the command after fixing it. If the error is database related, it is better to delete all the tables before re-running.
If everything goes right, the dashboard should be running on http://localhost:8000 and the demo app on http://localhost:8000/demo
To end the demo, press CTRL+C
The demo app's tracking logs are written to the path set by tracker.logfile in the config (default value: /tmp/sieve_tracker.log).
If you haven't already cloned the repo, do so as described at the beginning of #quickstart.
Install the dependencies:
If you've built the demo app, you can skip this step, as it has already installed the dependencies. Otherwise, run:
```sh
.bin/dependencies.sh
```

If you prefer not to build the data processor Java application, set the env variable WITHOUT_DATAPROC=1:

```sh
WITHOUT_DATAPROC=1 .bin/dependencies.sh
```

Configure the API:
The config file needs to be present at dashboard/server/src/config/config.ini.
This file does not exist and needs to be created. You can copy a sample config and modify it:
```sh
cd dashboard/server
cp src/config/sample-spark.config.ini src/config/config.ini
```

Modify the config according to the instructions given here and return to the project root:

```sh
cd ../..
```

Note: For security reasons, the config file is not saved in the repo. If you're using a CI, this file needs to be provided during the build process by other means.
The config file is read during the build, and the public url of the API is injected into the dashboard UI build.
Build:
```sh
.bin/build.sh
```

As before, if you prefer not to build the data processor Java app, you can skip it:

```sh
WITHOUT_DATAPROC=1 .bin/build.sh
```

This creates build artifacts in the dist directory. It contains the server and dashboard folders, which correspond to the API app and its frontend.
Setup Database:
MySQL is used for storing experiments. To set it up:
Create the sieve database and the sieveuser user, and grant permissions on the database. If you've followed the process listed in Quickstart, the user already exists, so skip the CREATE USER statement; you still need to create the database and grant privileges.
```sql
CREATE SCHEMA `sieve`;
CREATE USER sieveuser IDENTIFIED BY "password";
GRANT ALL PRIVILEGES ON sieve.* TO sieveuser;
```

Then set up the tables. This is done automatically by the demo script, but needs to be done manually for general installations:

```sh
mysql -u sieveuser -p sieve < dashboard/server/db/mysql_schema.sql
```

You can skip the rest of the database setup if you are not setting up analytics now.
For analytics database:
We've kept the experiments store separate from the tracker store. PostgreSQL is used for the tracker store and can be replaced by Amazon Redshift when you need more performance.
Instructions for PostgreSQL:
After installing Postgres, run these commands in the shell to create the database and the user. Depending on your OS, you might have to log in as the postgres user for these commands to work.
```sh
createdb sieve
createuser sieveuser -l -P   # enter the password at the prompt
```

Enter the psql prompt and grant all privileges on the sieve tables to sieveuser:
```sh
$ psql
psql (9.5.6)
Type "help" for help.

postgres=# GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO sieveuser;
```

Edit pg_hba.conf to allow sieveuser to log in to the sieve db from host 127.0.0.1 using a password. Restart the postgres service after editing:
```sh
# Your data directory may vary
# Check your installation for details
# Running `SHOW data_directory` at the psql prompt might help
sudo vi /etc/postgresql/9.5/main/pg_hba.conf

# TYPE  DATABASE  USER       ADDRESS        METHOD
host    sieve     sieveuser  127.0.0.1/32   md5
# Add the above line to the end of the file, save, and quit

# Restart postgres
```
Test by logging into the database:
```sh
psql -U sieveuser -W -h 127.0.0.1 sieve   # Enter password at prompt
psql (9.5.6)
Type "help" for help.

sieve=>
# quit
sieve=> \q
```

Load the schema into the database:

```sh
psql -U sieveuser -W -h 127.0.0.1 -d sieve -a -f dashboard/server/db/pg_schema.sql
```

And you're done!
Instructions for Redshift:
Create the cluster on AWS Console and load the schema in dashboard/server/db/redshift_schema.sql
Run:
For the API Server:
```sh
cd dist/server
npm install --production
node app.js   # use the "--harmony" flag for node v4
```

For the frontend dashboard, serve the contents via a generic server like nginx/apache, or use the command below for a temporary server. Access it in the browser at http://localhost:8000.
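If you go the nginx route, a minimal server block along these lines can serve the built dashboard. The root path and port here are assumptions; point them at wherever dist/dashboard lives on your server:

```nginx
server {
    listen 8000;

    # Assumed install location; adjust to your deployment
    root /opt/sieve/dist/dashboard;
    index index.html;

    location / {
        try_files $uri $uri/ /index.html;
    }
}
```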
```sh
cd dist/dashboard
python -m SimpleHTTPServer 8000
```

For bugs and support, please raise an issue on this repository.
To contribute, please fork the project and raise a pull request. Make sure to run tests before pushing.
```sh
.bin/test.sh
.bin/build.sh
```

Apache 2.0

