Skip to content

Commit

Permalink
[cloudspanner] Add binding for Google's Cloud Spanner. (brianfrankcoo…
Browse files Browse the repository at this point in the history
  • Loading branch information
siamaktz authored and risdenk committed Mar 17, 2017
1 parent 1c4c310 commit 78c3cfa
Show file tree
Hide file tree
Showing 9 changed files with 620 additions and 2 deletions.
1 change: 1 addition & 0 deletions bin/bindings.properties
Expand Up @@ -34,6 +34,7 @@ azuretablestorage:com.yahoo.ycsb.db.azuretablestorage.AzureClient
basic:com.yahoo.ycsb.BasicDB
cassandra-cql:com.yahoo.ycsb.db.CassandraCQLClient
cassandra2-cql:com.yahoo.ycsb.db.CassandraCQLClient
cloudspanner:com.yahoo.ycsb.db.cloudspanner.CloudSpannerClient
couchbase:com.yahoo.ycsb.db.CouchbaseClient
couchbase2:com.yahoo.ycsb.db.couchbase2.Couchbase2Client
azuredocumentdb:com.yahoo.ycsb.db.azuredocumentdb.AzureDocumentDBClient
Expand Down
5 changes: 3 additions & 2 deletions bin/ycsb
Expand Up @@ -54,14 +54,15 @@ DATABASES = {
"accumulo" : "com.yahoo.ycsb.db.accumulo.AccumuloClient",
"aerospike" : "com.yahoo.ycsb.db.AerospikeClient",
"arangodb" : "com.yahoo.ycsb.db.ArangoDBClient",
"arangodb3" : "com.yahoo.ycsb.db.arangodb.ArangoDB3Client",
"arangodb3" : "com.yahoo.ycsb.db.arangodb.ArangoDB3Client",
"asynchbase" : "com.yahoo.ycsb.db.AsyncHBaseClient",
"basic" : "com.yahoo.ycsb.BasicDB",
"cassandra-cql": "com.yahoo.ycsb.db.CassandraCQLClient",
"cassandra2-cql": "com.yahoo.ycsb.db.CassandraCQLClient",
"cloudspanner" : "com.yahoo.ycsb.db.cloudspanner.CloudSpannerClient",
"couchbase" : "com.yahoo.ycsb.db.CouchbaseClient",
"couchbase2" : "com.yahoo.ycsb.db.couchbase2.Couchbase2Client",
"azuredocumentdb" : "com.yahoo.ycsb.db.azuredocumentdb.AzureDocumentDBClient",
"azuredocumentdb" : "com.yahoo.ycsb.db.azuredocumentdb.AzureDocumentDBClient",
"dynamodb" : "com.yahoo.ycsb.db.DynamoDBClient",
"elasticsearch": "com.yahoo.ycsb.db.ElasticsearchClient",
"geode" : "com.yahoo.ycsb.db.GeodeClient",
Expand Down
111 changes: 111 additions & 0 deletions cloudspanner/README.md
@@ -0,0 +1,111 @@
<!--
Copyright (c) 2017 YCSB contributors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You
may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the License for the specific language governing
permissions and limitations under the License. See accompanying
LICENSE file.
-->

# Cloud Spanner Driver for YCSB

This driver provides a YCSB workload binding for Google's Cloud Spanner database, the first relational database service that is both strongly consistent and horizontally scalable. This binding is implemented using the official Java client library for Cloud Spanner which uses GRPC for making calls.

For best results, we strongly recommend running the benchmark from a Google Compute Engine (GCE) VM.

## Running a Workload

We recommend reading the [general guidelines](https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload) in the YCSB documentation, and following the Cloud Spanner specific steps below.

### 1. Set up Cloud Spanner with the Expected Schema

Follow the [Quickstart instructions](https://cloud.google.com/spanner/docs/quickstart-console) in the Cloud Spanner documentation to set up a Cloud Spanner instance, and create a database with the following schema:

```
CREATE TABLE usertable (
id STRING(MAX),
field0 STRING(MAX),
field1 STRING(MAX),
field2 STRING(MAX),
field3 STRING(MAX),
field4 STRING(MAX),
field5 STRING(MAX),
field6 STRING(MAX),
field7 STRING(MAX),
field8 STRING(MAX),
field9 STRING(MAX),
) PRIMARY KEY(id);
```
Make note of your project ID, instance ID, and database name.

### 2. Set Up Your Environment and Auth

Follow the [set up instructions](https://cloud.google.com/spanner/docs/getting-started/set-up) in the Cloud Spanner documentation to set up your environment and authentication. When not running on a GCE VM, make sure you run `gcloud auth application-default login`.

### 3. Edit Properties

In your YCSB root directory, edit `cloudspanner/conf/cloudspanner.properties` and specify your project ID, instance ID, and database name.

### 4. Run the YCSB Shell

Start the YCBS shell connected to Cloud Spanner using the following command:

```
./bin/ycsb shell cloudspanner -P cloudspanner/conf/cloudspanner.properties
```

You can use the `insert`, `read`, `update`, `scan`, and `delete` commands in the shell to experiment with your database and make sure the connection works. For example, try the following:

```
insert name field0=adam
read name field0
delete name
```

### 5. Load the Data

You can load, say, 10 GB of data into your YCSB database using the following command:

```
./bin/ycsb load cloudspanner -P cloudspanner/conf/cloudspanner.properties -P workloads/workloada -p recordcount=10000000 -p cloudspanner.batchinserts=1000 -threads 10 -s
```

We recommend batching insertions so as to reach ~1 MB of data per commit request; this is controlled via the `cloudspanner.batchinserts` parameter which we recommend setting to `1000` during data load.

If you wish to load a large database, you can run YCSB on multiple client VMs in parallel and use the `insertstart` and `insertcount` parameters to distribute the load as described [here](https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload-in-Parallel). In this case, we recommend the following:

* Use ordered inserts via specifying the YCSB parameter `insertorder=ordered`;
* Use zero-padding so that ordered inserts are actually lexicographically ordered; the option `zeropadding = 12` is set in the default `cloudspanner.properties` file;
* Split the key range evenly between client VMs;
* Use few threads on each client VM, so that each individual commit request contains keys which are (close to) consecutive, and would thus likely address a single split; this also helps avoid overloading the servers.

The idea is that we have a number of 'write heads' which are all writing to different parts of the database (and thus talking to different servers), but each individual head is writing its own data (more or less) in order. See the [best practices page](https://cloud.google.com/spanner/docs/best-practices#loading_data) for further details.

### 6. Run a Workload

After data load, you can a run a workload, say, workload B, using the following command:

```
./bin/ycsb run cloudspanner -P cloudspanner/conf/cloudspanner.properties -P workloads/workloadb -p recordcount=10000000 -p operationcount=1000000 -threads 10 -s
```

Make sure that you use the same `insertorder` (i.e. `ordered` or `hashed`) and `zeropadding` as specified during the data load. Further details about running workloads are given in the [YCSB wiki pages](https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload).

## Configuration Options

In addition to the standard YCSB parameters, the following Cloud Spanner specific options can be configured using the `-p` parameter or in `cloudspanner/conf/cloudspanner.properties`.

* `cloudspanner.database`: (Required) The name of the database created in the instance, e.g. `ycsb-database`.
* `cloudspanner.instance`: (Required) The ID of the Cloud Spanner instance, e.g. `ycsb-instance`.
* `cloudspanner.project`: The ID of the project containing the Cloud Spanner instance, e.g. `myproject`. This is not strictly required and can often be automatically inferred from the environment.
* `cloudspanner.readmode`: Allows choosing between the `read` and `query` interface of Cloud Spanner. The default is `query`.
* `cloudspanner.batchinserts`: The number of inserts to batch into a single commit request. The default value is 1 which means no batching is done. Recommended value during data load is 1000.
* `cloudspanner.boundedstaleness`: Number of seconds we allow reads to be stale for. Set to 0 for strong reads (default). For performance gains, this should be set to 10 seconds.
26 changes: 26 additions & 0 deletions cloudspanner/conf/cloudspanner.properties
@@ -0,0 +1,26 @@
# Copyright (c) 2017 YCSB contributors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License. See accompanying
# LICENSE file.

# Core YCSB properties.
table = usertable
zeropadding = 12

# Cloud Spanner properties
cloudspanner.instance = ycsb-instance
cloudspanner.database = ycsb-database

cloudspanner.readmode = query
cloudspanner.boundedstaleness = 0
cloudspanner.batchinserts = 1
53 changes: 53 additions & 0 deletions cloudspanner/pom.xml
@@ -0,0 +1,53 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright (c) 2017 YCSB contributors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You
may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the License for the specific language governing
permissions and limitations under the License. See accompanying
LICENSE file.
-->

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>binding-parent</artifactId>
<version>0.13.0-SNAPSHOT</version>
<relativePath>../binding-parent/</relativePath>
</parent>

<artifactId>cloudspanner-binding</artifactId>
<name>Cloud Spanner DB Binding</name>
<packaging>jar</packaging>

<dependencies>
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-spanner</artifactId>
<version>${cloudspanner.version}</version>
<exclusions>
<exclusion> <!-- exclude an old version of Guava -->
<groupId>com.google.guava</groupId>
<artifactId>guava-jdk5</artifactId>
</exclusion>
</exclusions>
</dependency>

<dependency>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>core</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>

</dependencies>
</project>

0 comments on commit 78c3cfa

Please sign in to comment.