Switch to use Cassandra-3.11.3's DelimiterAnalyzer and PREFIX mode SASI for the annotations index in the Zipkin2 Cassandra storage #1948

@michaelsembwever michaelsembwever commented Mar 16, 2018

This PR is waiting for Cassandra-3.11.3 to first be released.

It switches the SASI index used for annotation_query from the expensive (CPU, latency, and disk space) CONTAINS-mode NonTokenizingAnalyzer to the fast, efficient DelimiterAnalyzer, which was provided by @zuochangan and made available via CASSANDRA-14247.
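
As a rough illustration of the difference (a sketch only, not code from this PR): assuming annotation_query stores its entries separated by ░, which is what the ░ bracing in the old LIKE argument suggests, a delimiter-based analyzer can index each ░-separated term on its own, so a lookup matches a whole term rather than scanning for a substring. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: models term matching in plain Java, not Cassandra's analyzer code.
final class DelimiterTokenSketch {
  // Assumption: entries in annotation_query are separated by this character.
  static final String DELIMITER = "░";

  // Split the stored value into its non-empty delimited terms.
  static List<String> tokens(String annotationQuery) {
    List<String> result = new ArrayList<>();
    for (String term : annotationQuery.split(DELIMITER)) {
      if (!term.isEmpty()) result.add(term);
    }
    return result;
  }

  // New style: the query is compared against whole terms.
  static boolean matchesTerm(String annotationQuery, String key) {
    return tokens(annotationQuery).contains(key);
  }

  // Old style: substring search for the key braced by delimiters ("%░key░%").
  static boolean matchesContains(String annotationQuery, String key) {
    return annotationQuery.contains(DELIMITER + key + DELIMITER);
  }

  public static void main(String[] args) {
    String stored = "░error░http.path=/api░";
    System.out.println(matchesTerm(stored, "error"));     // whole term
    System.out.println(matchesTerm(stored, "err"));       // partial term
    System.out.println(matchesContains(stored, "error")); // braced substring
  }
}
```

Both styles give the same answer for a given key; the difference is in what the index has to store and scan to produce it.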

ref: #1861

To test Zipkin with the new Delimiter Tokenizer, which will be available in Cassandra-3.11.3, follow these steps:

# clone and build Cassandra (3.11.2-SNAPSHOT)
git clone https://github.com/apache/cassandra.git
cd cassandra
git checkout cassandra-3.11
ant artifacts
# start Cassandra
build/dist/bin/cassandra -f

# clone Zipkin
git clone https://github.com/openzipkin/zipkin.git
cd zipkin
git checkout mck/zipkin2-cassandra-delimiter-indexer-for-annotations 
# Build the server and also make its dependencies
./mvnw -DskipTests --also-make -pl zipkin-server clean install
# start Zipkin
cd <ZIPKIN_SRC>
java -jar ./zipkin-server/target/zipkin-server-*exec.jar

# open http://localhost:8080/

To stress-test the Zipkin schema that uses the new Delimiter Tokenizer:

# start Cassandra (as above)
…
# create stress friendly zipkin keyspace and schema
cd zipkin-storage/zipkin2_cassandra/src/test/resources/
cqlsh -f zipkin2-test-schema.cql

# warmup, and create initial writes
cassandra-stress  user profile=span-stress.yaml ops\(insert=1\)  duration=1m  -rate threads=4 throttle=50/s
# stress
cassandra-stress  user profile=span-stress.yaml ops\(insert=10,by_trace=1,by_trace_ts_id=1,by_annotation=1\)  duration=1m  -rate threads=4 throttle=50/s  -errors retries=10 ignore
# repeat, increasing throttle, threads, and duration; graph to html if desired
cassandra-stress  user profile=span-stress.yaml ops\(insert=10,by_trace=1,by_trace_ts_id=1,by_annotation=1\)  duration=1h  -rate threads=16 throttle=10000/s -errors retries=10 ignore -graph file=zipkin_1948.html title=Zipkin revision=with_delimiter_idx

@@ -33,12 +33,14 @@ CREATE TABLE IF NOT EXISTS zipkin2.span_by_service (
AND speculative_retry = '95percentile'
AND comment = 'Secondary table for looking up span names by a service name.';

DROP INDEX IF EXISTS zipkin2.span_annotation_query_idx;
Member

was this there before?

Member Author

@michaelsembwever michaelsembwever Mar 17, 2018

Only if the user has already used zipkin-storage/zipkin2_cassandra and created the schema.
(This is the default name given to the index, i.e. from CREATE … INDEX … ON zipkin2.span (annotation_query) …)

And if that is the case then the code won't actually be calling this cql file.

So this line only has any purpose if the cql file is going to be manually run.

This comes back to this question…

upgrade schema approach, or require users to drop keyspace if already using this storage backend

If we're happy to say anyone already using the new schema is expected to either:

  • manually run cqlsh -f zipkin2-schema-indexes.cql, or
  • drop keyspace and let it be recreated from scratch.

Then this approach is ok.

If not then we need something more along the lines of zipkin-storage/cassandra/src/main/resources/cassandra-schema-cql3-upgrade-1.txt

@codefromthecrypt
Member

looks sweet. should ping @openzipkin/cassandra about those running "cassandra3" if they can upgrade. I like this.

Will be nice to see the storage impact before and after, and some throughput difference.

Preconditions.checkState(
0 < VersionNumber.parse("3.11.3").compareTo(host.getCassandraVersion()),
"All Cassandra nodes must be running 3.11.3+");
});
Member Author

I was a bit confused when writing this…
Isn't this type of check done somewhere else in the codebase?
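
One thing worth double-checking in the condition as quoted: if compareTo follows the usual Comparable contract, `0 < VersionNumber.parse("3.11.3").compareTo(hostVersion)` holds when 3.11.3 is greater than the host's version, i.e. when the host is older than 3.11.3, which reads as the opposite of the message. Below is a minimal, driver-free sketch of a minimum-version check. The class and its simplified dotted-number parsing are hypothetical and do not handle suffixes such as -SNAPSHOT.

```java
// Hypothetical sketch: stands in for the driver's VersionNumber with plain dotted integers.
final class MinVersionSketch {
  static final String MINIMUM = "3.11.3";

  // Compare two dotted numeric versions; missing components count as 0.
  static int compare(String a, String b) {
    String[] x = a.split("\\.");
    String[] y = b.split("\\.");
    int n = Math.max(x.length, y.length);
    for (int i = 0; i < n; i++) {
      int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
      int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
      if (xi != yi) return Integer.compare(xi, yi);
    }
    return 0;
  }

  // True when the host version is the minimum or newer,
  // i.e. MINIMUM.compareTo(host) is zero or negative.
  static boolean meetsMinimum(String hostVersion) {
    return compare(MINIMUM, hostVersion) <= 0;
  }

  public static void main(String[] args) {
    for (String v : new String[] {"3.11.2", "3.11.3", "4.0"}) {
      System.out.println(v + " meets minimum: " + meetsMinimum(v));
    }
  }
}
```

With this direction, a node still on 3.11.2 fails the check while 3.11.3 and later pass.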

@codefromthecrypt
Member

codefromthecrypt commented Mar 17, 2018 via email

@michaelsembwever michaelsembwever changed the title WIP – Switch to use a DelimiterAnalyzer and PREFIX mode SASI for the annotations index in the Zipkin2 Cassandra storage Switch to use Cassandra-3.11.3's DelimiterAnalyzer and PREFIX mode SASI for the annotations index in the Zipkin2 Cassandra storage May 14, 2018
@michaelsembwever michaelsembwever force-pushed the mck/zipkin2-cassandra-delimiter-indexer-for-annotations branch from c1da341 to 33e32d8 Compare May 16, 2018 04:00
@michaelsembwever
Member Author

michaelsembwever commented May 24, 2018

Will be nice to see the storage impact before and after, and some throughput difference.

Here you go @adriancole
These are the numbers for three separate benchmarks, each running for an hour with as much throughput as possible:

  • Cassandra-3.11.2 without the annotations index
  • Cassandra-3.11.3 with the new delimiter SASI annotations index
  • Cassandra-3.11.2 with the previous CONTAINS SASI annotations index

These show that the new delimiter SASI annotations index is almost as fast as having no annotations index (7,536 vs 8,997 op/s), and both are roughly 17 to 20 times as fast as the previous CONTAINS SASI annotations index (442 op/s).
Both Cassandra and the stress client ran on the same Lenovo X1 Carbon gen5 (Intel Core i7-5600U Processor, 8 GB RAM) with Ubuntu 17.10.
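
For reference, the speed-up figures can be reproduced from the overall op rates reported in the three runs below. This is only a sketch of the arithmetic with a hypothetical class name. Note that the run without the index has no by_annotation operation in its mix, so the comparison is approximate.

```java
// Hypothetical sketch: ratios of the overall "op rate" lines from the three runs.
final class ThroughputRatioSketch {
  static final double NO_INDEX_OPS = 8997;        // Cassandra-3.11.2, no annotations index
  static final double DELIMITER_INDEX_OPS = 7536; // Cassandra-3.11.3, delimiter SASI index
  static final double CONTAINS_INDEX_OPS = 442;   // Cassandra-3.11.2, CONTAINS SASI index

  static double ratio(double numerator, double denominator) {
    return numerator / denominator;
  }

  public static void main(String[] args) {
    System.out.printf("no index vs CONTAINS:  %.1fx%n", ratio(NO_INDEX_OPS, CONTAINS_INDEX_OPS));
    System.out.printf("delimiter vs CONTAINS: %.1fx%n", ratio(DELIMITER_INDEX_OPS, CONTAINS_INDEX_OPS));
    System.out.printf("delimiter vs no index: %.2f%n", ratio(DELIMITER_INDEX_OPS, NO_INDEX_OPS));
  }
}
```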

Cassandra-3.11.2 without the annotations index
---------------------------------------
op rate                    : 8,997 op/s [by_trace: 692 op/s, by_trace_ts_id: 1,387 op/s, insert: 6,918 op/s]
partition rate             : 8,997 pk/s [by_trace: 692 pk/s, by_trace_ts_id: 1,387 pk/s, insert: 6,918 pk/s]
row rate                   : 8,997 row/s [by_trace: 692 row/s, by_trace_ts_id: 1,387 row/s, insert: 6,918 row/s]
latency mean               : 1.4 ms [by_trace: 1.5 ms, by_trace_ts_id: 1.5 ms, insert: 1.4 ms]
latency median             : 1.0 ms [by_trace: 1.1 ms, by_trace_ts_id: 1.1 ms, insert: 1.0 ms]
latency 95th percentile    : 3.7 ms [by_trace: 3.8 ms, by_trace_ts_id: 3.8 ms, insert: 3.6 ms]
latency 99th percentile    : 6.1 ms [by_trace: 6.2 ms, by_trace_ts_id: 6.2 ms, insert: 6.0 ms]
latency 99.9th percentile  : 16.3 ms [by_trace: 17.7 ms, by_trace_ts_id: 18.6 ms, insert: 15.9 ms]
latency max                : 140.4 ms [by_trace: 140.4 ms, by_trace_ts_id: 121.6 ms, insert: 136.8 ms]
total gc count             : 1,532
total gc memory            : 471.324 GiB
total gc time              : 32.5 seconds
avg gc time                : 21.2 ms
stddev gc time             : 9.2 ms
Total operation time       : 01:00:00


Cassandra-3.11.3 with the new delimiter SASI annotations index
---------------------------------------
op rate                    : 7,536 op/s [by_annotation: 580 op/s, by_trace: 577 op/s, by_trace_ts_id: 579 op/s, insert: 5,799 op/s]
partition rate             : 7,531 pk/s [by_annotation: 575 pk/s, by_trace: 577 pk/s, by_trace_ts_id: 579 pk/s, insert: 5,799 pk/s]
row rate                   : 7,586 row/s [by_annotation: 630 row/s, by_trace: 577 row/s, by_trace_ts_id: 579 row/s, insert: 5,799 row/s]
latency mean               : 2.1 ms [by_annotation: 3.4 ms, by_trace: 2.1 ms, by_trace_ts_id: 2.2 ms, insert: 2.0 ms]
latency median             : 1.4 ms [by_annotation: 2.6 ms, by_trace: 1.4 ms, by_trace_ts_id: 1.5 ms, insert: 1.3 ms]
latency 95th percentile    : 5.9 ms [by_annotation: 8.1 ms, by_trace: 5.9 ms, by_trace_ts_id: 5.9 ms, insert: 5.7 ms]
latency 99th percentile    : 10.3 ms [by_annotation: 14.6 ms, by_trace: 10.3 ms, by_trace_ts_id: 10.3 ms, insert: 9.8 ms]
latency 99.9th percentile  : 24.3 ms [by_annotation: 29.1 ms, by_trace: 24.1 ms, by_trace_ts_id: 24.4 ms, insert: 23.6 ms]
latency max                : 368.8 ms [by_annotation: 368.8 ms, by_trace: 157.0 ms, by_trace_ts_id: 116.3 ms, insert: 148.9 ms]
total gc count             : 3,760
total gc memory            : 1165.746 GiB
total gc time              : 70.8 seconds
avg gc time                : 18.8 ms
stddev gc time             : 7.6 ms
Total operation time       : 01:00:00


Cassandra-3.11.2 with the previous CONTAINS SASI annotations index
---------------------------------------
op rate                    : 442 op/s [by_annotation: 34 op/s, by_trace: 34 op/s, by_trace_ts_id: 34 op/s, insert: 340 op/s]
partition rate             : 442 pk/s [by_annotation: 34 pk/s, by_trace: 34 pk/s, by_trace_ts_id: 34 pk/s, insert: 340 pk/s]
row rate                   : 445 row/s [by_annotation: 37 row/s, by_trace: 34 row/s, by_trace_ts_id: 34 row/s, insert: 340 row/s]
latency mean               : 36.2 ms [by_annotation: 6.8 ms, by_trace: 4.2 ms, by_trace_ts_id: 4.1 ms, insert: 45.5 ms]
latency median             : 10.6 ms [by_annotation: 2.4 ms, by_trace: 1.1 ms, by_trace_ts_id: 1.1 ms, insert: 34.9 ms]
latency 95th percentile    : 118.0 ms [by_annotation: 31.1 ms, by_trace: 20.6 ms, by_trace_ts_id: 20.2 ms, insert: 125.1 ms]
latency 99th percentile    : 167.5 ms [by_annotation: 61.8 ms, by_trace: 47.3 ms, by_trace_ts_id: 47.1 ms, insert: 177.5 ms]
latency 99.9th percentile  : 423.6 ms [by_annotation: 114.8 ms, by_trace: 98.3 ms, by_trace_ts_id: 90.4 ms, insert: 470.5 ms]
latency max                : 1117.8 ms [by_annotation: 242.6 ms, by_trace: 192.5 ms, by_trace_ts_id: 243.4 ms, insert: 1,117.8 ms]
total gc count             : 23,728
total gc memory            : 7321.556 GiB
total gc time              : 517.7 seconds
avg gc time                : 21.8 ms
stddev gc time             : 8.5 ms
Total operation time       : 01:00:00

@michaelsembwever
Member Author

michaelsembwever commented May 24, 2018

And screenshots of the benchmarking graphs.

Operations per second.
[screenshot]

Mean latencies.
[screenshot]

99th percentile latencies.
[screenshot]

michaelsembwever added a commit to michaelsembwever/michaelsembwever.github.io that referenced this pull request May 24, 2018
@michaelsembwever
Member Author

The original benchmarking graph is available here: http://michaelsembwever.github.io/zipkin_1948.html

@michaelsembwever michaelsembwever force-pushed the mck/zipkin2-cassandra-delimiter-indexer-for-annotations branch from 33e32d8 to 9cca0c1 Compare May 24, 2018 11:26
@codefromthecrypt
Member

@drolando fyi cassandra 3.11.3 is coming fast and once this merges should affect indexing a lot

@drolando
Contributor

@adriancole @michaelsembwever I'll wait until C* 3.11.3 is out before upgrading to Cassandra 3 then.

@drolando
Contributor

@michaelsembwever Any idea about when they'll release the new version? I saw a blog post from 2 years ago about Cassandra doing monthly releases but that doesn't seem to be the case at all.

@codefromthecrypt codefromthecrypt force-pushed the mck/zipkin2-cassandra-delimiter-indexer-for-annotations branch from 9cca0c1 to bfaddff Compare June 12, 2018 01:59
@codefromthecrypt
Member

rebased in preparation for the pending release of cassandra 3.11.3

@@ -104,8 +104,7 @@
Input input =
new AutoValue_SelectTraceIdsFromSpan_Input(
serviceName,
// % for like, bracing with ░ to ensure no accidental substring match
"%░" + annotationKey + "░%",
annotationKey,
Member

Is there a reason we wouldn't want the trailing %? I'm guessing that without the trailing % it will just do a strict match vs a partial prefix right?

Member

yeah this one is the exact match thing which our api uses

@codefromthecrypt
Member

codefromthecrypt commented Jun 13, 2018 via email

@llinder
Member

llinder commented Jun 13, 2018

I didn't realize we had negative tests for this. I was looking forward to the ability to do prefix-based searches :/ It makes sense not to change API contracts for this right now though.

P.S. sorry for closing this. Comment and Close is way too close to the Comment button :/

@llinder llinder closed this Jun 13, 2018
@llinder llinder reopened this Jun 13, 2018
@@ -33,13 +33,14 @@ CREATE TABLE IF NOT EXISTS zipkin2.span_by_service (
AND speculative_retry = '95percentile'
AND comment = 'Secondary table for looking up span names by a service name.';

DROP INDEX IF EXISTS zipkin2.span_annotation_query_idx;
Member

question.. I guess this means that zipkin2.span (annotation_query) results in an index named zipkin2.span_annotation_query_idx? so essentially we are re-creating that?

Member

second question: does dropping this clean up the associated data? and third: if we create an index below, is there a command we can use to rebuild the index based on old data?

Member Author

first question: yes, re-creating it with a different mode and analyzer_class.
second question: yes, it drops the data. Creating the new index will go over the base data and rebuild it into the new index.

@codefromthecrypt codefromthecrypt force-pushed the mck/zipkin2-cassandra-delimiter-indexer-for-annotations branch from bfaddff to 22f1445 Compare July 4, 2018 03:19
@codefromthecrypt
Member

fyi rebased this

@@ -97,6 +99,11 @@ static KeyspaceMetadata getKeyspaceMetadata(Session session) {
}

static KeyspaceMetadata ensureExists(String keyspace, boolean searchEnabled, Session session) {
Member

so this means when you ask to automatically install the schema, we crash if Cassandra is an older version. Sounds fine as long as ENSURE_SCHEMA=false doesn't kill others

Member Author

If any of the nodes in the Cassandra cluster are not 3.11.3 then the Zipkin server will crash.

For example, someone accidentally starts Zipkin up when the Cassandra cluster is mid-way through a rolling upgrade to 3.11.3.

@@ -104,8 +104,7 @@
Input input =
new AutoValue_SelectTraceIdsFromSpan_Input(
serviceName,
// % for like, bracing with ░ to ensure no accidental substring match
Member

we could do something like detect if the indexing type is updated. Ex if the mode=CONTAINS then box annotationKey like before?
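
A minimal sketch of that idea, assuming the index mode has already been read from the schema metadata by some means not shown here. The class, method, and the way the mode is passed in are hypothetical; only the two argument shapes come from the diff above.

```java
// Hypothetical sketch: choose the LIKE argument based on the SASI index mode.
final class AnnotationQueryArgSketch {
  // Assumption: indexMode is the 'mode' option of the annotation_query index,
  // e.g. "CONTAINS" for the old index or "PREFIX" for the new one.
  static String likeArgument(String indexMode, String annotationKey) {
    if ("CONTAINS".equalsIgnoreCase(indexMode)) {
      // Old index: % for like, bracing with ░ to avoid accidental substring matches.
      return "%░" + annotationKey + "░%";
    }
    // New delimiter-analyzed index: the key is passed as-is.
    return annotationKey;
  }

  public static void main(String[] args) {
    System.out.println(likeArgument("CONTAINS", "error"));
    System.out.println(likeArgument("PREFIX", "error"));
  }
}
```

This would let a server keep working against a keyspace that still has the old index, at the cost of carrying both code paths.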

@michaelsembwever michaelsembwever force-pushed the mck/zipkin2-cassandra-delimiter-indexer-for-annotations branch from 22f1445 to ec26344 Compare July 28, 2018 10:49
@michaelsembwever
Member Author

rebased.

@michaelsembwever
Member Author

@adriancole Cassandra-3.11.3 is out.

…tions index in the Zipkin2 Cassandra storage

ref: #1861
@michaelsembwever michaelsembwever force-pushed the mck/zipkin2-cassandra-delimiter-indexer-for-annotations branch from ec26344 to 2f304ad Compare August 3, 2018 06:17
@codefromthecrypt codefromthecrypt merged commit 5df523e into master Aug 3, 2018
@codefromthecrypt codefromthecrypt deleted the mck/zipkin2-cassandra-delimiter-indexer-for-annotations branch August 3, 2018 07:08
@codefromthecrypt
Member

Thanks much! we can release shortly

@codefromthecrypt
Member

So instructions for folks is to basically recreate their keyspace right? (or use a different name), correct?

@michaelsembwever
Member Author

hot dog!

@michaelsembwever
Member Author

michaelsembwever commented Aug 3, 2018

So instructions for folks is to basically recreate their keyspace right? (or use a different name), correct?

The recommended approach would be to drop and create the keyspace.

Advanced users could just run zipkin-storage/cassandra/src/main/resources/zipkin2-schema-indexes.cql
as it drops the index and re-creates it (which will re-index existing data).

For example:

cqlsh -f zipkin-storage/cassandra/src/main/resources/zipkin2-schema-indexes.cql

@codefromthecrypt
Member

codefromthecrypt commented Aug 3, 2018 via email

@codefromthecrypt
Member

out in zipkin 2.11
