Java implementation of the GCS Benchmark.
This benchmark measures Google Cloud Storage (GCS) performance using different client libraries and configurations. It serves as a Java port of the official C++ benchmark, enabling performance comparisons across languages and client implementations.
C++ Reference: https://github.com/GoogleCloudPlatform/grpc-gcp-cpp/tree/master/e2e-examples/gcs/benchmark
- ✅ All 3 operations: `read`, `random-read`, `write`
- ✅ All parameter names match C++ exactly
- ✅ Multi-threading support
- ✅ Warmup runs
- ✅ Object name resolution with templates
- ✅ Configurable timeouts
- ✅ Retry logic (`--trying`)
- ✅ gRPC direct (`--client=grpc`)
- ✅ GCS Java Client library (`--client=http`/`gcs-json`, `gcs-grpc`)
- ✅ `perthread` - One channel per thread (default)
- ✅ `const` - Single shared channel
- ✅ `percall` - New channel per operation
- ✅ `pool` - Round-robin pool with configurable size
- ✅ Channel eviction on errors (`CANCELLED`, `DEADLINE_EXCEEDED`)
- ✅ On-demand stub creation
- ✅ Full C++ metrics: threadId, channelId, peer, object, errors, chunks
- ✅ CSV export (`--report_file`, `--data_file`)
- ✅ Percentile latencies (p50, p95, p99)
- ✅ Per-operation detailed tracking
- ✅ Throughput calculation
- ✅ Bazel support
- ✅ Maven support
- ⏭️ CRC32C validation (`--crc32c` flag exists, logic pending)
- ⏭️ Resumable writes (`--resumable` flag exists, logic pending)
- ⏭️ Work stealing (`--steal_work`)
- ⏭️ TD mode (`--td` flag exists, logic pending)
- ⏭️ Advanced channel policies (`bpool`, `spool`)
- ⏭️ Custom host/network configuration
- ⏭️ OpenTelemetry/Prometheus exports
- ⏭️ gRPC admin interface

Note: All unimplemented features show clear warnings when used.
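The `pool` policy above rotates operations across a fixed set of channels. The real `RoundRobinChannelPool` hands out gRPC `ManagedChannel`s; the sketch below is a simplified, self-contained illustration of the same round-robin selection, with a `String` standing in for a channel.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of the round-robin selection behind --cpolicy=pool.
// Illustrative only: the benchmark's RoundRobinChannelPool manages real
// gRPC channels and handles eviction, which this sketch omits.
final class RoundRobinPool<T> {
    private final List<T> channels;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinPool(List<T> channels) {
        this.channels = channels;
    }

    // Each call rotates to the next channel; floorMod keeps the index
    // valid even after the int counter wraps around.
    T acquire() {
        return channels.get(Math.floorMod(next.getAndIncrement(), channels.size()));
    }

    public static void main(String[] args) {
        // --carg=3 would build a pool of three channels.
        RoundRobinPool<String> pool = new RoundRobinPool<>(List.of("ch0", "ch1", "ch2"));
        for (int i = 0; i < 4; i++) {
            System.out.println(pool.acquire());
        }
    }
}
```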
- Java 11+
- Bazel (recommended) or Maven
- Google Cloud credentials (for authenticated tests)
```sh
# Build
bazel build :gcs-java-bench

# Clean build
bazel clean
bazel build :gcs-java-bench
```

```sh
# Build JAR
mvn clean package

# Build without tests
mvn clean package -DskipTests
```

```sh
# Use default credentials
gcloud auth application-default login

# Or set a service account
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```

Basic read test:

```sh
bazel run :gcs-java-bench -- \
  --bucket=my-test-bucket \
  --object=10MB.dat \
  --operation=read \
  --runs=100
```

Multi-threaded read with warmups:

```sh
bazel run :gcs-java-bench -- \
  --bucket=my-test-bucket \
  --object=100MB.dat \
  --operation=read \
  --threads=8 \
  --runs=1000 \
  --warmups=50
```

Direct gRPC client with a channel pool:

```sh
bazel run :gcs-java-bench -- \
  --bucket=my-test-bucket \
  --object=file.dat \
  --client=grpc \
  --cpolicy=pool \
  --carg=8 \
  --threads=16 \
  --runs=1000
```

Write test with retries:

```sh
bazel run :gcs-java-bench -- \
  --bucket=my-test-bucket \
  --object=output.dat \
  --operation=write \
  --write_size=10485760 \
  --trying \
  --runs=100
```

Random reads in 1 MiB chunks:

```sh
bazel run :gcs-java-bench -- \
  --bucket=my-test-bucket \
  --object=large-file.dat \
  --operation=random-read \
  --chunk_size=1048576 \
  --runs=500
```

CSV export:

```sh
bazel run :gcs-java-bench -- \
  --bucket=my-test-bucket \
  --object=file.dat \
  --runs=1000 \
  --report_file=results.csv \
  --data_file=raw_data.csv \
  --report_tag=baseline_test
```

Running the Maven-built JAR:

```sh
java -jar target/gcs-java-bench-1.0-SNAPSHOT.jar \
  --bucket=my-test-bucket \
  --object=file.dat \
  --runs=100
```

All parameters match the C++ benchmark exactly. See the full list:
```sh
bazel run :gcs-java-bench -- --help
```

| Parameter | Type | Description | Default |
|---|---|---|---|
| `--bucket` | string | GCS bucket name | required |
| `--object` | string | Object name | required |
| `--client` | string | Client type: grpc, http, gcs-json, gcs-grpc | `grpc` |
| `--operation` | string | Operation: read, random-read, write | `read` |
| `--runs` | int | Number of operations | 1 |
| `--warmups` | int | Warmup runs (excluded from results) | 0 |
| `--threads` | int | Number of threads | 1 |
| `--cpolicy` | string | Channel policy: perthread, const, pool, percall | auto |
| `--carg` | int | Policy parameter (e.g. pool size) | 0 |
| `--trying` | bool | Retry on failures | false |
| `--read_limit` | long | Bytes to read (-1 = all) | -1 |
| `--write_size` | long | Bytes to write | 0 |
| `--chunk_size` | long | Chunk size for random-read/write | -1 |
| `--report_file` | string | CSV summary output | "" |
| `--data_file` | string | CSV detailed data output | "" |
| `--verbose` | bool | Show debug output | false |
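For `random-read`, `--chunk_size` determines how much is read per operation from a randomly chosen position. The sketch below is a hypothetical illustration of one way to pick a chunk-aligned offset; it is not the benchmark's actual logic, and the names are invented for the example.

```java
import java.util.Random;

// Hypothetical sketch: a random-read run reads --chunk_size bytes
// starting at a random chunk-aligned offset inside the object.
final class RandomReadSketch {
    static long randomChunkOffset(long objectSize, long chunkSize, Random rng) {
        long fullChunks = objectSize / chunkSize;              // candidate start positions
        long index = rng.nextInt((int) Math.max(1, fullChunks));
        return index * chunkSize;                              // byte offset for the read
    }

    public static void main(String[] args) {
        long objectSize = 100L * 1048576L;  // a 100 MiB object
        long chunkSize = 1048576L;          // --chunk_size=1048576
        long offset = randomChunkOffset(objectSize, chunkSize, new Random(7));
        // The chosen offset is chunk-aligned and the read stays in bounds.
        System.out.println(offset % chunkSize == 0 && offset + chunkSize <= objectSize);
    }
}
```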
Example output:

```text
Running benchmark with direct gRPC client, operation: read...
Using pool channel policy with 8 channels
Running actual benchmark...

=== Results ===
Operations: 1000
Total bytes: 10485760000
Duration: 45234 ms
Throughput: 220.45 MB/s

Latency percentiles (ms):
  p50: 42.3
  p75: 58.1
  p95: 89.7
  p99: 124.5
```
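The summary numbers are derived from raw per-operation latencies: sort, index by rank for percentiles, and divide total bytes by wall time for throughput. The sketch below uses the nearest-rank percentile method; the benchmark's `StatWatcher` may interpolate differently.

```java
import java.util.Arrays;
import java.util.Locale;

// Sketch of percentile and throughput math over per-operation latencies.
// Nearest-rank percentiles; not necessarily StatWatcher's exact method.
final class StatsSketch {
    static double percentile(double[] sortedMs, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedMs.length); // nearest-rank index
        return sortedMs[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        double[] latencies = {40, 42, 45, 50, 58, 61, 75, 90, 110, 125};
        Arrays.sort(latencies);
        System.out.printf(Locale.ROOT, "p50: %.1f%n", percentile(latencies, 50));
        System.out.printf(Locale.ROOT, "p95: %.1f%n", percentile(latencies, 95));

        long totalBytes = 10_485_760L * 10;  // 10 ops x 10 MiB each
        long durationMs = 1_000;
        double mbps = (totalBytes / 1048576.0) / (durationMs / 1000.0);
        System.out.printf(Locale.ROOT, "throughput: %.2f MB/s%n", mbps);
    }
}
```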
CSV format with summary statistics (`--report_file`):

```csv
tag,operation,client,cpolicy,threads,runs,total_bytes,duration_ms,throughput_mbps,p50_ms,p95_ms,p99_ms,success_rate
test1,read,grpc,pool,8,1000,10485760000,45234,220.45,42.3,89.7,124.5,100.00
```

CSV format with per-operation details (`--data_file`):

```csv
tag,operation,timestamp_ms,latency_ms,bytes,success
test1,read,1700000001234,42,10485760,true
test1,read,1700000001289,45,10485760,true
```

To run equivalent tests in C++ and Java:
C++:

```sh
bazel run //e2e-examples/gcs/benchmark:benchmark -- \
  --bucket=test --object=file.dat --runs=1000 --cpolicy=pool --carg=8
```

Java:

```sh
bazel run :gcs-java-bench -- \
  --bucket=test --object=file.dat --runs=1000 --cpolicy=pool --carg=8
```

The parameters are identical, so you can copy-paste command lines between implementations.
```text
gcs-java-bench/
├── BUILD                         # Bazel build config
├── WORKSPACE                     # Bazel workspace
├── pom.xml                       # Maven build config
├── README.md                     # This file
└── src/main/java/com/google/cloud/benchmark/
    ├── Main.java                 # Entry point
    ├── BenchmarkParameters.java  # CLI parameters
    ├── BenchmarkRunner.java      # Runner interface
    ├── GrpcRunner.java           # gRPC implementation
    ├── GcsRunner.java            # GCS client implementation
    ├── StorageStubProvider.java  # Channel pool interface
    ├── ConstChannelPool.java     # Const policy
    ├── PerThreadChannelPool.java # Per-thread policy
    ├── PerCallChannelPool.java   # Per-call policy
    ├── RoundRobinChannelPool.java # Pool policy
    ├── ChannelFactory.java       # Channel creation
    ├── RunnerWatcher.java        # Metrics interface
    ├── StatWatcher.java          # Metrics implementation
    ├── ReportWriter.java         # CSV export
    ├── ResultPrinter.java        # Console output
    ├── ObjectResolver.java       # Name templating
    └── RandomData.java           # Data generation
```
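`ObjectResolver.java` handles name templating so each thread can target its own object. The placeholder syntax below (`{t}` for thread index, `{o}` for object index) is assumed purely for illustration; check `ObjectResolver.java` for the syntax the benchmark actually accepts.

```java
// Hypothetical sketch of object-name templating. The {t}/{o}
// placeholders are invented for this example and may not match the
// real ObjectResolver's syntax.
final class ObjectResolverSketch {
    static String resolve(String template, int threadId, int objectId) {
        return template
            .replace("{t}", Integer.toString(threadId))
            .replace("{o}", Integer.toString(objectId));
    }

    public static void main(String[] args) {
        // Thread 3, object 7 gets a distinct name, so threads do not
        // all hammer a single object.
        System.out.println(resolve("file-{t}-{o}.dat", 3, 7));
    }
}
```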
To add a missing feature:

- Check the C++ implementation for reference
- Add the parameter to `BenchmarkParameters.java`
- Implement the logic in the appropriate runner
- Add tests
- Update this README
The benchmark includes async-profiler integration for finding performance bottlenecks in the GCS SDK.
```sh
# Profile a read operation (wall-clock mode - recommended for I/O)
./scripts/profile.sh wall read_test -- --bucket=my-bucket --object=10MB.dat --runs=100

# Profile CPU-only (ignores I/O wait time)
./scripts/profile.sh cpu cpu_analysis -- --bucket=my-bucket --object=file.dat --runs=100

# Profile lock contention (good for high-concurrency tests)
./scripts/profile.sh lock contention -- --bucket=my-bucket --object=file.dat --threads=32 --runs=500

# Profile memory allocations (GC pressure analysis)
./scripts/profile.sh alloc memory_test -- --bucket=my-bucket --object=file.dat --runs=100

# Run all profiling modes
./scripts/profile.sh all comprehensive -- --bucket=my-bucket --object=file.dat --runs=200
```

| Mode | Best For | What It Shows |
|---|---|---|
| `wall` | I/O-heavy workloads (recommended) | Where time is actually spent, including I/O waits, locks, network |
| `cpu` | CPU-bound analysis | Computation hotspots only; ignores I/O time |
| `lock` | High-concurrency tests | Where threads block waiting for locks |
| `alloc` | Memory/GC analysis | Where objects are allocated; GC pressure sources |
The profiler writes interactive HTML flame graphs to `profiles/`. Key areas to examine for GCS SDK optimization:

| Package | What It Represents |
|---|---|
| `com.google.cloud.storage.*` | GCS Java SDK methods |
| `io.grpc.*` | gRPC transport layer |
| `io.netty.*` | Network I/O operations |
| `com.google.protobuf.*` | Protocol buffer serialization |
| `com.google.auth.*` | Authentication/credential handling |
Tip: The script auto-downloads async-profiler on first run. Profiles are saved with timestamps for easy comparison.
```sh
# Clean and rebuild
bazel clean --expunge
bazel build :gcs-java-bench

# Maven clean
mvn clean package
```

```sh
# Verify credentials
gcloud auth application-default login
gcloud config list

# Test with insecure mode (no auth)
bazel run :gcs-java-bench -- --bucket=test --object=file --cred=insecure
```

- Use `--cpolicy=pool --carg=N` for better throughput
- Increase `--threads` to match CPU cores
- Use `--warmups` to exclude JIT warmup from results
- Enable `--verbose` to debug issues
This project aims for 100% parity with the C++ benchmark. Priority areas:

- High Priority: CRC32C validation, resumable writes
- Medium Priority: Advanced channel policies (`bpool`, `spool`)
- Low Priority: OpenTelemetry integration, custom network paths
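For the pending CRC32C validation, the JDK already ships the needed primitive: `java.util.zip.CRC32C` (available since Java 9, so within this project's Java 11+ requirement). A sketch of what the validation could look like, assuming the expected checksum comes from the GCS object metadata:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32C;

// Sketch of --crc32c validation: hash the downloaded bytes with the
// JDK's CRC32C and compare against the checksum GCS reports. How the
// expected value is fetched is up to the real implementation.
final class Crc32cSketch {
    static long crc32c(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        // "123456789" is the standard CRC-32C check input; its
        // well-known checksum is 0xE3069283.
        byte[] payload = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(crc32c(payload)));
    }
}
```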
Copyright 2025 Google LLC
Licensed under the Apache License, Version 2.0