google · jonathanmetzman · Feb 28, 2020 · Feb 26, 2020 · Feb 27, 2020 · Feb 27, 2020
diff --git a/docs/advanced-topics/running_an_experiment.md b/docs/advanced-topics/running_an_experiment.md
@@ -2,97 +2,126 @@
 layout: default
 title: Running an experiment
 parent: Advanced topics
-nav_order: 1
+nav_order: 3
 permalink: /advanced-topics/running-an-experiment/
 ---
 
 # Running an experiment
 
-This page explains how to run an experiment. It requires using Google Cloud.
-Because most "users" of FuzzBench will be using it as a service and not running
-it themselves, we consider this an advanced topic.
+**NOTE**: Most users of FuzzBench should simply [add a fuzzer]({{ site.baseurl
+}}/getting-started/adding-a-new-fuzzer/) and use the FuzzBench service. This
+document isn't needed for using the FuzzBench service. This document explains
+how to run an [experiment]({{ site.baseurl }}/reference/glossary/#experiment) on
+your own. We don't recommend running experiments on your own for most users.
+Validating results from the FuzzBench service is a good reason to run an
+experiment on your own.
+
+This document assumes a certain level of knowledge about Google Cloud and
+FuzzBench, and that you have already set up a Google Cloud Project, since
+running an experiment requires Google Cloud. If you haven't, please follow the
+[guide on setting up a Google Cloud Project]({{ site.baseurl }}/advanced-topics/setting-up-a-google-cloud-project/)
+first.
 
 - TOC
 {:toc}
 
-Experiments are started by the `run_experiment.py` script. This will create a
-dispatcher instance on Google Compute Engine which:
-1. Builds desired fuzzer-benchmark combinations.
-1. Starts instances to run fuzzing trials with the fuzzer-benchmark
-   builds and stops them when they are done.
-1. Measures the coverage from these trials.
-1. Generates reports based on these measurements.
+This page will walk you through how to use `run_experiment.py`.
+Experiments are started by the `run_experiment.py` script. The script will
+create a dispatcher instance on Google Compute Engine which runs the experiment,
+including:
+1. Building desired fuzzer-benchmark combinations.
+1. Starting instances to run fuzzing trials with the fuzzer-benchmark
+   builds and stopping them when they are done.
+1. Measuring the coverage from these trials.
+1. Generating reports based on these measurements.
 
-This page will walkthrough on how to use `run_experiment.py`.
+The rest of this document will assume all commands are run from the root of
+FuzzBench.
 
 # run_experiment.py
 
-This page assumes a certain level of knowledge about Google Cloud and FuzzBench.
-If you haven't already, please check out the guide on setting up a Google Cloud
-Project to run FuzzBench.
-{% comment %}
-TODO(metzman): Write this doc.
-{% endcomment %}
-
 ## Experiment configuration file
 
 You need to create an experiment configuration yaml file.
-This will contain the configuration parameters for experiments that do not
+This file contains the configuration parameters for experiments that do not
 change very often.
-Below is an example configuation file with explanations of each required
+Below is an example configuration file with explanations of each required
 parameter.
 
 ```yaml
 # The number of trials of a fuzzer-benchmark pair to do.
 trials: 5
 
 # The amount of time in seconds that each trial is run for.
+# 1 day = 24 * 60 * 60 = 86400
 max_total_time: 86400
 
 # The name of your Google Cloud project.
-cloud_project: fuzzbench
+cloud_project: $PROJECT_NAME
 
 # The Google Compute Engine zone to run the experiment in.
-cloud_compute_zone: us-central1-a
+cloud_compute_zone: $PROJECT_REGION
 
 # The Google Cloud Storage bucket that will store most of the experiment data.
-cloud_experiment_bucket: gs://fuzzbench-data
+cloud_experiment_bucket: gs://$DATA_BUCKET_NAME
 
 # The bucket where HTML reports and summary data will be stored.
-cloud_web_bucket: gs://fuzzbench-reports
+cloud_web_bucket: gs://$REPORT_BUCKET_NAME
 
 # The connection to use to connect to the Google Cloud SQL instance.
-cloud_sql_instance_connection_name: "fuzzbench:us-central1:postgres-experiment-db=tcp:5432"
+cloud_sql_instance_connection_name: "$PROJECT_NAME:$PROJECT_REGION:$POSTGRES_INSTANCE=tcp:5432"
 ```
+
+**NOTE:** The values `$PROJECT_NAME`, `$PROJECT_REGION`, `$DATA_BUCKET_NAME`,
+`$REPORT_BUCKET_NAME`, and `$POSTGRES_INSTANCE` refer to the values of the
+environment variables that were set in the [guide on setting up a Google Cloud
+Project]({{ site.baseurl }}/advanced-topics/setting-up-a-google-cloud-project/).
+The configuration file must contain the literal values, not the variable names.
+For example, if `$PROJECT_NAME` is `my-fuzzbench-project`, use
+`my-fuzzbench-project` and not `$PROJECT_NAME`.
+
 ## Setting the database password
 
 Find the password for the PostgreSQL instance you are using in your
 experiment config.
 Set it using the environment variable `POSTGRES_PASSWORD` like so:
 
 ```bash
-export POSTGRESS_PASSWORD="my-super-secret-password"
+export POSTGRES_PASSWORD="my-super-secret-password"
 ```
 
 ## Benchmarks
+
 Pick the benchmarks you want to use from the `benchmarks/` directory.
+
 For example: `freetype2-2017` and `bloaty_fuzz_target`.
 
 ## Fuzzers
+
 Pick the fuzzers you want to use from the `fuzzers/` directory.
 For example: `libfuzzer` and `afl`.
 
 ## Executing run_experiment.py
+
 Now that everything is ready, execute `run_experiment.py`:
 
 ```bash
 PYTHONPATH=. python3 experiment/run_experiment.py \
 --experiment-config experiment-config.yaml \
 --benchmarks freetype2-2017 bloaty_fuzz_target \
---experiment-name experiment-name \
+--experiment-name $EXPERIMENT_NAME \
 --fuzzers afl libfuzzer
 ```
 
+where `$EXPERIMENT_NAME` is the name you want to give the experiment.
+
+## Viewing reports
+
+Reports from your experiment are regenerated periodically while the experiment
+is running. However, you may have to wait a while until they first appear, since
+a lot must happen before there is enough data to generate a report. Once they
+are available, you should be able to view them at:
+`https://storage.googleapis.com/$REPORT_BUCKET_NAME/$EXPERIMENT_NAME/index.html`
+
 # Advanced usage
 
 ## Fuzzer configuration files

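The NOTE in `running_an_experiment.md` asks readers to copy the literal values into the configuration file by hand. As a minimal sketch of how the shell could do that substitution instead, assuming the five variables from the setup guide are exported: the helper name `write_experiment_config` is hypothetical and not part of FuzzBench, and the key names are copied from the example configuration above.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of FuzzBench: writes an experiment config file
# from the environment variables set in the Google Cloud setup guide.
# The body runs in a subshell, so a missing variable aborts only this call.
write_experiment_config() (
  out="${1:-experiment-config.yaml}"

  # Abort with an error instead of writing empty values into the config.
  : "${PROJECT_NAME:?PROJECT_NAME is not set}"
  : "${PROJECT_REGION:?PROJECT_REGION is not set}"
  : "${DATA_BUCKET_NAME:?DATA_BUCKET_NAME is not set}"
  : "${REPORT_BUCKET_NAME:?REPORT_BUCKET_NAME is not set}"
  : "${POSTGRES_INSTANCE:?POSTGRES_INSTANCE is not set}"

  # The heredoc delimiter is unquoted, so the shell expands each variable.
  cat > "$out" <<EOF
trials: 5
max_total_time: 86400
cloud_project: $PROJECT_NAME
cloud_compute_zone: $PROJECT_REGION
cloud_experiment_bucket: gs://$DATA_BUCKET_NAME
cloud_web_bucket: gs://$REPORT_BUCKET_NAME
cloud_sql_instance_connection_name: "$PROJECT_NAME:$PROJECT_REGION:$POSTGRES_INSTANCE=tcp:5432"
EOF
)
```

Calling `write_experiment_config experiment-config.yaml` from the root of FuzzBench would produce a file that can be passed to `--experiment-config`. The `trials` and `max_total_time` values are the ones from the example and should be adjusted as needed.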
diff --git a/docs/advanced-topics/setting_up_a_google_cloud_project.md b/docs/advanced-topics/setting_up_a_google_cloud_project.md
@@ -0,0 +1,192 @@
+---
+layout: default
+title: Setting up a Google Cloud Project
+parent: Advanced topics
+nav_order: 2
+permalink: /advanced-topics/setting-up-a-google-cloud-project/
+---
+
+# Setting up a Google Cloud Project
+
+**NOTE**: Most users of FuzzBench should simply [add a fuzzer]({{ site.baseurl
+}}/getting-started/adding-a-new-fuzzer/) and use the FuzzBench service. This
+document isn't needed for using the FuzzBench service. This document explains
+how to set up a Google Cloud project for running an [experiment]({{ site.baseurl
+}}/reference/glossary/#experiment) for the first time. We don't recommend
+running experiments on your own for most users. Validating results from the
+FuzzBench service is a good reason to run an experiment on your own.
+
+Currently, FuzzBench requires Google Cloud to run experiments (though this may
+change, see
+[FAQ]({{ site.baseurl }}/faq/#how-can-i-reproduce-the-results-or-run-fuzzbench-myself)).
+
+The rest of this document will assume all commands are run from the root of
+FuzzBench.
+
+## Create the Project
+
+* [Create a new Google Cloud Project](https://console.cloud.google.com/projectcreate).
+
+* Enable billing when prompted on the Google Cloud website.
+
+* Set `$PROJECT_NAME` in the environment:
+
+```bash
+export PROJECT_NAME=<your-project-name>
+```
+
+For the rest of this document, replace `$PROJECT_NAME` with the name of the
+project you created.
+
+* [Install Google Cloud SDK](https://console.cloud.google.com/sdk/install).
+
+* Set your default project using gcloud:
+
+```bash
+gcloud config set project $PROJECT_NAME
+```
+
+## Set up the database
+
+* [Enable the Compute Engine API](https://console.cloud.google.com/apis/library/compute.googleapis.com?q=compute%20engine)
+
+* Create a PostgreSQL (we use PostgreSQL 11) instance using
+[Google Cloud SQL](https://console.cloud.google.com/sql/create-instance-postgres).
+This will take a few minutes.
+We recommend using "us-central1" as the region and "a" as the zone.
+Certain links provided in this document assume "us-central1".
+Note that the region you choose should be the region you use later for running
+experiments.
+
+* For the rest of this document, we will use `$PROJECT_REGION`,
+`$POSTGRES_INSTANCE`, and `$POSTGRES_PASSWORD` to refer to the region of the
+PostgreSQL instance you created, its name, and its password. Set them in your
+environment:
+
+```bash
+export PROJECT_REGION=<your-postgres-region>
+export POSTGRES_INSTANCE=<your-postgres-instance-name>
+export POSTGRES_PASSWORD=<your-postgres-password>
+```
+
+* [Download and install cloud_sql_proxy](https://cloud.google.com/sql/docs/postgres/sql-proxy)
+
+```bash
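+wget https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 -O cloud_sql_proxy
+# Make the downloaded binary executable so it can be run as ./cloud_sql_proxy.
+chmod +x cloud_sql_proxy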
+```
+
+* Connect to your postgres instance using cloud_sql_proxy:
+
+```bash
+./cloud_sql_proxy -instances=$PROJECT_NAME:$PROJECT_REGION:$POSTGRES_INSTANCE=tcp:5432
+```
+
+* (optional, but recommended) While `cloud_sql_proxy` is still running, connect
+   to your instance from another terminal to ensure you have all of the details
+   right:
+
+```bash
+psql "host=127.0.0.1 sslmode=disable user=postgres"
+```
+
+Use `$POSTGRES_PASSWORD` when prompted.
+
+* Initialize the postgres database:
+
+```bash
+PYTHONPATH=. alembic upgrade head
+```
+
+If this command fails, double check you set `POSTGRES_PASSWORD` correctly.
+At this point you can kill the `cloud_sql_proxy` process.
+
+## Google Cloud Storage buckets
+
+* Set up Google Cloud Storage Buckets by running the commands below:
+
+```bash
+# Bucket for storing experiment artifacts such as corpora, coverage binaries,
+# crashes etc.
+gsutil mb gs://$DATA_BUCKET_NAME
+
+# Bucket for storing HTML reports.
+gsutil mb gs://$REPORT_BUCKET_NAME
+```
+
+You can pick any (globally unique) names you'd like for `$DATA_BUCKET_NAME` and
+`$REPORT_BUCKET_NAME`. Set both in your environment before running the commands
+above.
+
+* Make the report bucket public so it can be viewed from your browser:
+
+```bash
+gsutil iam ch allUsers:objectViewer gs://$REPORT_BUCKET_NAME
+```
+
+## Dispatcher image and container registry setup
+
+* Build the dispatcher image:
+
+```bash
+docker build -f docker/dispatcher-image/Dockerfile \
+    -t gcr.io/$PROJECT_NAME/dispatcher-image docker/dispatcher-image/
+```
+
+FuzzBench uses an instance running this image to manage most of the experiment.
+
+* [Enable Google Container Registry API](https://console.cloud.google.com/apis/api/containerregistry.googleapis.com/overview)
+to use the container registry.
+
+* Push `dispatcher-image` to the docker registry:
+
+```bash
+docker push gcr.io/$PROJECT_NAME/dispatcher-image
+```
+
+* [Switch the registry's visibility to public](https://console.cloud.google.com/gcr/settings).
+
+## Enable required APIs
+
+* [Enable the IAM API](https://console.cloud.google.com/apis/api/iam.googleapis.com/landing)
+so that FuzzBench can authenticate to Google Cloud APIs and services.
+
+* [Enable the error reporting API](https://console.cloud.google.com/apis/library/clouderrorreporting.googleapis.com)
+so that FuzzBench can report errors to the
+[Google Cloud error reporting dashboard](https://console.cloud.google.com/errors).
+
+* [Enable Cloud Build API](https://console.cloud.google.com/apis/library/cloudbuild.googleapis.com)
+so that FuzzBench can build docker images using Google Cloud Build, a platform
+optimized for doing so.
+
+* [Enable Cloud SQL Admin API](https://console.cloud.google.com/apis/library/sqladmin.googleapis.com)
+so that FuzzBench can connect to the database.
+
+## Configure networking
+
+* Go to the networking page for the network you want to run your experiment in.
+[This](https://console.cloud.google.com/networking/subnetworks/details/us-central1/default)
+is the networking page for the default network in "us-central1". It is best to
+use the network in `$PROJECT_REGION` for this.
+
+* Click the edit icon. Turn "Private Google access" to "On". Press "Save".
+
+* This allows the trial runner instances to use Google Cloud APIs since they do
+  not have external IP addresses.
+
+## Request CPU quota increase
+
+* FuzzBench uses a 96-core Google Compute Engine instance for measuring trials
+and a single-core instance for each trial in your experiment.
+
+* Go to the quotas page for the region you will use for experiments.
+[This](https://console.cloud.google.com/iam-admin/quotas?location=us-central1)
+is the quotas page for the "us-central1" region.
+
+* Select the "Compute Engine API" "CPUs" quota, fill out contact details and
+request a quota increase. We recommend requesting a quota limit of "1000", as
+it will probably be approved and is large enough for running experiments in a
+reasonable amount of time.
+
+* Wait until you receive an email confirming the quota increase.
+
+## Run an experiment
+
+* Follow the [guide on running an experiment]({{ site.baseurl }}/advanced-topics/running-an-experiment/)
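The quota section in `setting_up_a_google_cloud_project.md` says that FuzzBench uses one 96-core instance for measuring and one single-core instance per trial. A rough sketch of the resulting CPU requirement follows, under the assumption that every trial runs at the same time and that nothing else in the project consumes quota. The function name `required_cpus` is made up for illustration.

```bash
#!/usr/bin/env bash
# Rough lower bound on the CPU quota an experiment needs, assuming every trial
# runs concurrently on its own single-core instance.
required_cpus() {
  fuzzers=$1
  benchmarks=$2
  trials=$3          # trials per fuzzer-benchmark pair
  measurer_cores=96  # the 96-core instance that measures trials
  echo $(( fuzzers * benchmarks * trials + measurer_cores ))
}

# 2 fuzzers x 2 benchmarks x 5 trials, as in the example experiment.
required_cpus 2 2 5   # prints 116
```

Under these assumptions the example experiment fits well within the recommended quota limit of 1000, while an experiment with many fuzzers and benchmarks can approach it quickly.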
diff --git a/docs/advanced-topics/statistical_analysis.md b/docs/advanced-topics/statistical_analysis.md
@@ -2,7 +2,7 @@
 layout: default
 title: Statistical Analysis
 parent: Advanced topics
-nav_order: 2
+nav_order: 1
 permalink: /getting-started/statistical-analysis/
 ---
 

diff --git a/docs/reference/glossary.md b/docs/reference/glossary.md
@@ -38,6 +38,22 @@ or a custom one where you explicitly define the steps to checkout code and build
 the fuzz target
 ([example integration](https://github.com/google/fuzzbench/blob/master/benchmarks/vorbis-2017-12-11/build.sh)).
 
+### Trial
+
+A single fuzzing run on a particular benchmark. For example, we might compare
+AFL and honggfuzz by running 20 trials of each fuzzer on the libxml2-v2.9.2
+benchmark.
+
+### Experiment
+
+A group of [trials](#trial) that are run together to compare fuzzer performance.
+This usually includes trials from multiple benchmarks and multiple fuzzers. For
+example, to compare libFuzzer, AFL and honggfuzz, we might run an experiment
+where each of them fuzzes every benchmark. Experiments use the same number of
+trials for each fuzzer-benchmark pair and a specific amount of time for each
+trial (typically, 24 hours) so that results are comparable. FuzzBench generates
+reports for experiments while they are running and after they complete.
+
 [fuzzing]: https://en.wikipedia.org/wiki/Fuzzing
 [fuzz target]: https://github.com/google/fuzzing/blob/master/docs/glossary.md#fuzz-target
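To make the relationship between trials and experiments in the new glossary entries concrete, here is a small arithmetic sketch using the numbers from the Trial example (AFL and honggfuzz, 20 trials each, one benchmark) and the typical 24-hour trial length. The variable names are illustrative only.

```bash
#!/usr/bin/env bash
# Size of a small experiment: every fuzzer-benchmark pair gets the same number
# of trials, and every trial runs for the same amount of time.
fuzzers=2            # AFL and honggfuzz
benchmarks=1         # libxml2-v2.9.2
trials_per_pair=20
hours_per_trial=24

total_trials=$(( fuzzers * benchmarks * trials_per_pair ))
total_hours=$(( total_trials * hours_per_trial ))

echo "$total_trials trials, $total_hours trial-hours"   # 40 trials, 960 trial-hours
```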