Adds Cassandra support for Autocomplete tags #2309

zeagord · 2018-12-03T10:09:56Z

No description provided.

codefromthecrypt

good start... next in-memory I think as then we can add some tests

zipkin-storage/cassandra/src/main/resources/zipkin2-schema-indexes.cql

codefromthecrypt · 2018-12-03T10:31:36Z

zipkin-storage/cassandra/src/main/java/zipkin2/storage/cassandra/SelectTagKeys.java

+    return new SelectTagKeys(factory);
+  }
+
+  static class AccumulateTagsAllResults extends AccumulateAllResults<List<String>> {


possible refactor, as I recognise this class :P

zipkin/src/main/java/zipkin2/storage/TagStore.java

michaelsembwever · 2018-12-04T08:41:31Z

zipkin-storage/cassandra/src/main/java/zipkin2/storage/cassandra/SelectTagKeys.java

+    Factory(Session session) {
+      this.session = session;
+      this.preparedStatement =
+        session.prepare(QueryBuilder.select("key").distinct().from(TABLE_TAGS));


I don't know what the cardinality of tags will be? But this won't scale.

But, for example, if there are 1 million tags in a Zipkin storage, this full-table scan is going to be painful (if it doesn't time out).

Right. I would do a progressive query (pagination, but only a simple one: no ordering, and definitely not with an offset but with "after").
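To make the "after, not offset" suggestion concrete, here is a minimal sketch of cursor-style paging over distinct keys. The `KeyPager` class and its in-memory set are hypothetical stand-ins, not code from this PR; the stand-in sorts keys only so the cursor is well defined, whereas Cassandra would page in its own (token) order via the driver's paging support.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical sketch: pages through distinct keys with an "after" cursor instead of an offset. */
final class KeyPager {
  // Stand-in for the storage backend. Sorted only so "after" has a stable meaning.
  private final NavigableSet<String> keys = new TreeSet<>();

  KeyPager(Iterable<String> allKeys) {
    for (String key : allKeys) keys.add(key);
  }

  /** Returns up to {@code limit} keys that come strictly after {@code after}; null starts at the beginning. */
  List<String> page(String after, int limit) {
    Iterable<String> tail = after == null ? keys : keys.tailSet(after, false);
    List<String> result = new ArrayList<>();
    for (String key : tail) {
      if (result.size() == limit) break;
      result.add(key);
    }
    return result;
  }
}
```

The caller passes the last key of the previous page as `after`, so the backend never has to skip over rows it already returned, which is what makes offset-based paging expensive.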

If we want to have a whitelist of tags to index and not just index all tags you might get away without doing this at all. And just return the list of whitelisted tags.

Added the whitelist implementation. Let me know your thoughts.
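For readers following the thread, the whitelist idea can be sketched roughly as below. `WhitelistedTags` and its method names are hypothetical and are not the classes added in this PR; the point is only that listing keys becomes a config lookup rather than a `SELECT DISTINCT` over the tags table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a whitelist: only configured tag keys are indexed or listed. */
final class WhitelistedTags {
  private final Set<String> whitelist;

  WhitelistedTags(List<String> configuredKeys) {
    this.whitelist = Collections.unmodifiableSet(new LinkedHashSet<>(configuredKeys));
  }

  /** Listing keys returns the configured list, so no table scan is needed. */
  List<String> keys() {
    return new ArrayList<>(whitelist);
  }

  /** Keeps only the tags whose key is whitelisted; these are the ones worth writing to the index. */
  Map<String, String> tagsToIndex(Map<String, String> spanTags) {
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, String> tag : spanTags.entrySet()) {
      if (whitelist.contains(tag.getKey())) result.put(tag.getKey(), tag.getValue());
    }
    return result;
  }
}
```

With this shape, the number of distinct keys is bounded by configuration, though the number of values per key is still whatever the instrumented services send.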

zeagord · 2018-12-04T11:16:38Z

I have added the in-memory api. I will revisit the cassandra tomorrow and address the above concerns.

codefromthecrypt · 2018-12-04T23:44:35Z

made a comment about fixed cardinality.. we definitely need to document this as it is indeed inappropriate for unbounded. #2236 (comment)

One thing @zeagord and I discussed is initially inheriting the config for the other names (service/span). This is for simplicity. In the future we could add a timestamp/lookback parameter to only fetch the values for a range. However, the same problem would apply to service/span, so we can think about that later.

zeagord · 2018-12-07T10:38:52Z

Need to add tests and revisit the Elastic search design.

zipkin-server/src/main/java/zipkin2/server/internal/ZipkinQueryApiV2.java

tacigar · 2018-12-09T16:49:51Z

In elasticsearch, how about using _q instead of changing mapping?

codefromthecrypt · 2018-12-10T00:27:53Z

reuse _q field in elasticsearch

we can think about it but the performance might be bad. for example, getting the key names would require an expression I am not sure how to express unless we hard code the possible key names. if we hard code the possible key names yes it could work, but we have to check the performance of scanning all span documents to get the values
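A rough illustration of the concern, assuming (this is an assumption, not something confirmed in this thread) that each span document's `_q` field holds strings such as `key` and `key=value`. The `QueryFieldScan` helper is hypothetical and works on in-memory lists rather than an Elasticsearch query; it only shows why the key has to be known up front and why every document ends up being visited.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: derives tag values for one known key by scanning per-document "_q" entries. */
final class QueryFieldScan {
  /**
   * Collects distinct values for {@code key} from entries shaped like "key=value".
   * Assumes the entry format described in the lead-in; entries without that prefix are ignored.
   */
  static Set<String> valuesFor(String key, List<List<String>> qFieldPerDocument) {
    String prefix = key + "=";
    Set<String> values = new TreeSet<>();
    for (List<String> entries : qFieldPerDocument) { // every document is visited
      for (String entry : entries) {
        if (entry.startsWith(prefix)) values.add(entry.substring(prefix.length()));
      }
    }
    return values;
  }
}
```

Discovering the key names themselves has no equivalent prefix to match on, which is why hard-coding them comes up in the comment above.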

codefromthecrypt · 2018-12-10T00:29:06Z

good news is we can try it. actually I think we need to hard code key names anyway especially in Cassandra.


codefromthecrypt · 2018-12-10T06:12:28Z

so on cassandra (and elasticsearch) we'll need to ensure the "deduper" is in use to avoid thrashing writes.
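As a sketch of what a "deduper" does in this context: it remembers recently written entries and suppresses repeat writes of the same key/value within a time window. The `WriteDeduper` class below is a hypothetical illustration, not the deduper used by the Zipkin storage modules, and it takes the clock as a parameter purely to keep the example deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a write deduper: suppresses repeat writes of the same entry within a TTL. */
final class WriteDeduper<T> {
  private final long ttlMillis;
  // Unbounded map: acceptable only if the set of entries is itself bounded (e.g. whitelisted tags).
  private final Map<T, Long> lastWritten = new ConcurrentHashMap<>();

  WriteDeduper(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /** Returns true if the caller should write {@code entry} at time {@code nowMillis}. */
  boolean shouldWrite(T entry, long nowMillis) {
    boolean[] write = {false};
    // compute() is atomic per key, so concurrent callers agree on who performs the write.
    lastWritten.compute(entry, (key, previous) -> {
      if (previous == null || nowMillis - previous >= ttlMillis) {
        write[0] = true;
        return nowMillis;
      }
      return previous;
    });
    return write[0];
  }
}
```

Since the same tag key/value pair arrives on many spans, gating the insert behind a check like `shouldWrite("env=prod", now)` turns thousands of identical writes per window into one.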

In both cases, it might be helpful to reverse-engineer the service-span mapping to re-use the same table. ex PRIMARY KEY ((type, key), value) This could make data management in general easier long term.

Since there is a time bomb on elasticsearch #2219, we might want to solve that first before merging this (or at least before cutting a release with it).

Meanwhile, we can allow UI testing to work with a statically managed list of tags. (For example, there may be only several values associated with phase, so one way is to allow the UI to configure predefined values where the set is small.)

zeagord · 2018-12-10T16:49:23Z

is this tagkey because "key" is reserved? I suspect if not easier to just do key/value

Bodyconverters for ES look for "key" in the result of aggregations, which clashes with the key in the tag {k,v}. It could be the name of the aggregation. I will change the name of the aggs alone and see if it works.

drolando · 2018-12-11T03:35:09Z

zipkin-storage/cassandra/src/main/java/zipkin2/storage/cassandra/SelectTagKeys.java

+    Factory(Session session) {
+      this.session = session;
+      this.preparedStatement =
+        session.prepare(QueryBuilder.select("key").distinct().from(TABLE_TAGS));


If we want to have a whitelist of tags to index and not just index all tags you might get away without doing this at all. And just return the list of whitelisted tags.

zipkin-storage/cassandra-v1/src/main/resources/cassandra-schema-cql3-upgrade-1.txt

michaelsembwever · 2018-12-11T04:33:00Z

In both cases, it might be helpful to reverse-engineer the service-span mapping to re-use the same table. ex PRIMARY KEY ((type, key), value) This could make data management in general easier long term.

If service-span and tags are collapsed into one table, it does make full table scans (e.g. QueryBuilder.select("key").distinct().from(TABLE__)) more painful…
I suspect keeping two separate tables is in fact wiser; thanks to @drolando for raising this.

codefromthecrypt

made most of a pass!

zipkin-storage/cassandra-v1/src/main/java/zipkin2/storage/cassandra/v1/CassandraStorage.java

...n-storage/cassandra-v1/src/main/java/zipkin2/storage/cassandra/v1/CassandraSpanConsumer.java

zipkin-storage/cassandra-v1/src/main/java/zipkin2/storage/cassandra/v1/CassandraTagStore.java

zipkin-storage/cassandra/src/main/java/zipkin2/storage/cassandra/CassandraSpanConsumer.java

zipkin-storage/cassandra-v1/src/main/java/zipkin2/storage/cassandra/v1/InsertTags.java

zipkin-storage/cassandra-v1/src/main/java/zipkin2/storage/cassandra/v1/SelectTagValues.java

zipkin-storage/cassandra-v1/src/main/java/zipkin2/storage/cassandra/v1/CassandraTagStore.java

zipkin-storage/cassandra/src/main/java/zipkin2/storage/cassandra/CassandraSpanConsumer.java

zipkin-storage/elasticsearch/src/main/java/zipkin2/elasticsearch/ElasticsearchTagStore.java

codefromthecrypt · 2018-12-17T03:51:45Z

this chops off the basic functionality and will allow the UI work to start immediately when merged. I can help rework the other commits similarly #2332

codefromthecrypt · 2018-12-17T09:42:05Z

other storage impls pulled out in #2333 and #2334

codefromthecrypt · 2018-12-18T07:51:31Z

I think this is nearly ready. we need to test the auto-upgrade logic and also update the README files to talk about how autocomplete works

michaelsembwever · 2018-12-20T00:29:16Z

Code changes LGTM.

One comment though: most of the diff is whitespace/code-style changes. For the reviewer there's a huge waste of time reading diffs that have nothing to do with the actual PR. It would be great if those changes were separated out into a follow-up commit (still within the PR). That way, by reviewing just the first commit of the PR, a lot of time could be saved.

codefromthecrypt · 2018-12-31T02:22:33Z

@michaelsembwever sorry about the formatting thing. I was in a rush to get something stable before I turned off internet for the vacation, but that pushed extra effort onto others. Not sure what the better call was, but I apologize nevertheless. Thanks for reviewing despite this.

codefromthecrypt · 2018-12-31T02:23:18Z

FYI Travis is still failing on the same tests as the last push. This needs to be looked into prior to merge.

zeagord requested review from codefromthecrypt and michaelsembwever December 3, 2018 10:09

codefromthecrypt reviewed Dec 3, 2018

michaelsembwever reviewed Dec 4, 2018

codefromthecrypt mentioned this pull request Dec 4, 2018

UI should support generic and site-specific tags #2236

Closed

tacigar reviewed Dec 9, 2018

zipkin-server/src/main/java/zipkin2/server/internal/ZipkinQueryApiV2.java

drolando reviewed Dec 11, 2018

codefromthecrypt reviewed Dec 13, 2018

zipkin-storage/elasticsearch/src/main/java/zipkin2/elasticsearch/ElasticsearchTagStore.java

zeagord force-pushed the tags-api branch from cc88c51 to cc09c54 Compare December 17, 2018 03:20

codefromthecrypt force-pushed the tags-api branch from 10c728f to fcceb20 Compare December 17, 2018 09:40

codefromthecrypt changed the title from "[WIP] Api to query by tags" to "Adds Cassandra support for Autocomplete tags" Dec 17, 2018

codefromthecrypt force-pushed the tags-api branch from fcceb20 to 109c40f Compare December 18, 2018 07:44

codefromthecrypt force-pushed the tags-api branch from 5119fd7 to 3ea8062 Compare December 18, 2018 08:13

Adds Cassandra support for Autocomplete tags

b086167

Adrian Cole added 2 commits December 31, 2018 14:01

Adds missing file

55aecc9

Fixes cassandra test drift

cf61fdc

codefromthecrypt force-pushed the tags-api branch from 0745564 to cf61fdc Compare December 31, 2018 08:28

codefromthecrypt merged commit 93163b1 into openzipkin:master Jan 1, 2019

abesto pushed a commit to abesto/zipkin that referenced this pull request Sep 10, 2019

Adds Cassandra support for Autocomplete tags (openzipkin#2309)

1123ed5
