Spanner Latency Troubleshooting

This application combines a variety of read and write workloads on Spanner to simulate the problems that a real-world application can encounter at scale, and then demonstrates how to debug them. Running the application for a few hundred thousand iterations in 'simulation' mode is sufficient to surface latency problems. The causes of the latency problems include large payloads, complex transactions, and queries with full table scans. The application is instrumented with OpenCensus for metrics and trace collection, with export to Stackdriver. On each iteration, the application randomly executes a read and a write transaction using one of a variety of query and transaction strategies. The related paper Troubleshooting app latency with Cloud Spanner and OpenCensus provides detailed steps and describes how to interpret the results.

After running the application for a period, view the data collected in the Stackdriver Monitoring and Stackdriver Trace user interfaces to check request latency and find the reasons for the differences in performance.

The example application assumes that you are familiar with Go programming, Google Cloud Platform, Spanner basics, OpenCensus, and Stackdriver.


The steps described here can be run on a Linux or Mac OS command line or the GCP Cloud Shell.

Project Setup

In the Cloud Shell, clone the GitHub project

git clone
cd opencensus-spanner-demo

Edit the variables in setup.env and import them into your development environment:

source ./setup.env

Enable the Stackdriver and Spanner APIs:

gcloud services enable \ \ \ \

Set up Spanner

Create a Spanner instance

gcloud spanner instances create $SPANNER_INSTANCE \
  --config=regional-us-central1 \
  --description="Test Instance" \

Create a database

gcloud spanner databases create $DATABASE --instance=$SPANNER_INSTANCE
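A Go client refers to this database by its fully qualified path, which is built from the project, instance, and database IDs. The sketch below assembles that path from the environment variables used in these steps. The helper name databasePath is an assumption for illustration and is not taken from the demo code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// databasePath builds the fully qualified database name that the
// Spanner Go client expects when opening a connection.
func databasePath(project, instance, database string) (string, error) {
	if project == "" || instance == "" || database == "" {
		return "", errors.New("project, instance and database must all be set")
	}
	return fmt.Sprintf("projects/%s/instances/%s/databases/%s", project, instance, database), nil
}

func main() {
	// These variables are expected to be set in the shell before running.
	path, err := databasePath(
		os.Getenv("GOOGLE_CLOUD_PROJECT"),
		os.Getenv("SPANNER_INSTANCE"),
		os.Getenv("DATABASE"),
	)
	if err != nil {
		fmt.Println("configuration incomplete:", err)
		return
	}
	fmt.Println(path)
}
```

Failing early on an empty ID gives a clearer message than the error a client would report for a malformed path.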

Create some tables for the test application with the same schema as Getting started with Cloud Spanner in Go. In the Cloud Console, navigate to the Spanner database and click the Create table link. Check Edit as text and, following the Data Definition Language syntax, enter the following text into the text area

CREATE TABLE Singers (
  SingerId   INT64 NOT NULL,
  FirstName  STRING(1024),
  LastName   STRING(1024),
  BirthDate  DATE,
  LastUpdated TIMESTAMP,
) PRIMARY KEY(SingerId);

and click the Create button to create the table Singers.

Click on the Create index link and check Edit as text. Enter the following text into the text area.

CREATE INDEX SingersByLastName ON Singers(LastName)

and click the Create button to add an index for last name.

Go back to Database details and click the Create table link. Check Edit as text and enter the following text into the text area.

CREATE TABLE Albums (
  SingerId        INT64 NOT NULL,
  AlbumId         INT64 NOT NULL,
  AlbumTitle      STRING(MAX),
  MarketingBudget INT64,
) PRIMARY KEY(SingerId, AlbumId),

Click Create to create the table Albums.

Set up a GCE Instance

Back in the Cloud Shell, create a GCE instance to run the test application from

gcloud compute instances create $CLIENT_INSTANCE \
  --zone=$ZONE \
  --scopes= \

Grant the GCE instance service account the predefined role roles/spanner.databaseUser following these steps. First, find the name of the service account associated with the instance:

gcloud compute instances describe $CLIENT_INSTANCE \
 --zone=$ZONE \

Make a note of the service account ID, then grant the role roles/spanner.databaseUser to the instance service account:

SA_ACCOUNT=[service account id from command above]
gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT \
  --member serviceAccount:$SA_ACCOUNT \
  --role roles/spanner.databaseUser

SSH to the instance

gcloud compute ssh --zone $ZONE $CLIENT_INSTANCE

Install git

sudo apt-get update
sudo apt-get install -y git

Install Go and get the dependent libraries.

Run the test app

Clone the code from the git repo.

git clone
cd opencensus-spanner-demo

Edit the file setup.env and initialize the environment

source ./setup.env

Build the code

go build

If you have trouble building the application make sure that you have Go modules enabled by setting the GO111MODULE environment variable:

export GO111MODULE=on

Set the project

export GOOGLE_CLOUD_PROJECT=[your project]

Run the test application:

nohup ./oc-spannerlab --project=$GOOGLE_CLOUD_PROJECT \
  --instance=$SPANNER_INSTANCE \
  --database=$DATABASE \
  --command=simulation \
  --iterations=100000 &

This runs 100,000 iterations of the test application in simulation mode, which will execute a random combination of queries and updates. It will take several minutes to run. Check that there are no errors in the command output:

tail -f nohup.out
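For readers extending the test application, the command-line flags shown above could be parsed with Go's standard flag package roughly as follows. This is a sketch under stated assumptions, not the application's actual code: the config struct, the parseArgs helper, the default values, and the validation rules are all hypothetical. Only the flag names come from the command above.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// config holds the values of the flags used in the run command.
type config struct {
	project    string
	instance   string
	database   string
	command    string
	iterations int
}

// parseArgs parses the flags and applies basic validation.
// Defaults and validation rules here are assumptions for illustration.
func parseArgs(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("oc-spannerlab", flag.ContinueOnError)
	fs.StringVar(&c.project, "project", "", "GCP project ID")
	fs.StringVar(&c.instance, "instance", "", "Spanner instance ID")
	fs.StringVar(&c.database, "database", "", "Spanner database ID")
	fs.StringVar(&c.command, "command", "simulation", "command to run")
	fs.IntVar(&c.iterations, "iterations", 1000, "number of iterations")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	if c.project == "" || c.instance == "" || c.database == "" {
		return c, errors.New("project, instance and database are required")
	}
	if c.iterations <= 0 {
		return c, errors.New("iterations must be positive")
	}
	return c, nil
}

func main() {
	// Sample arguments mirroring the run command; a real program
	// would pass os.Args[1:] instead. The IDs are placeholders.
	c, err := parseArgs([]string{
		"--project=my-project",
		"--instance=test-instance",
		"--database=test-db",
		"--command=simulation",
		"--iterations=100000",
	})
	if err != nil {
		fmt.Println("invalid arguments:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

The flag package accepts both the single-dash and double-dash forms, so the double-dash style used in the run command parses without extra handling.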

View the data

You can view the application logs in the Google Cloud Logging Log Viewer under GCE VM instances.

Go to the trace list to see the trace data. Notice the payload size in the Trace timeline and how higher latency tends to be correlated with larger payload size. To view the payload size, click on a trace in the Trace list and in the Trace timeline click Show events. Notice the bytes received in the timeline.

To view log-trace correlation, click on a trace in the Trace list and in the Trace timeline click on Show logs. Notice the log entry in the trace timeline and in the trace detail.

Also, check the aggregate metrics in Stackdriver Monitoring. In the Resource menu, click Metrics Explorer. In the Metric text field, type the prefix 'spanner-oc-test' and select from the metrics displayed. The metric 'completed_rpcs' is a good one for viewing the overall status of the test. From the Metrics Explorer, click Save chart to save the chart into a new dashboard.

