Elasticsearch http slow on getting span names when there's a lot of data #1462

Open
adriancole opened this Issue Dec 30, 2016 · 5 comments

@adriancole
Contributor

from @dragontree101

Because we have a lot of data, the service name and span name queries both time out. As I remember, when data is stored in Cassandra there is a table that stores only service_name and span_name. When data is stored in ES, service_name and span_name are fetched by querying ES itself, so it takes a very long time.

image

and the Chrome console shows

image

I found that I had set SELF_TRACING_ENABLED="true" in the environment

image

and yesterday's ES index is

health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   zipkin-2016-12-29               2_uXTThATbyoJ7khnXbo4g   5   1  609199726            0      140gb           70gb
@adriancole
Contributor

Currently, the query for service names looks at all indexes (it also sorts, which is redundant). So it probably affects more than just today's data.

@adriancole
Contributor

@openzipkin/elasticsearch I'm suspicious of using a catch-all for service and span names. On one hand we want to make sure data is readable, but on the other hand do you think we need to search all indexes for service and span names? What if we cut off at a particular date?
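
To make the cut-off idea concrete, here is a minimal sketch (not Zipkin's actual code). It assumes daily indices named like `zipkin-2016-12-29`, as in the listing above; the function name `indices_for_lookback` and the three-day window are hypothetical choices for illustration.

```python
from datetime import date, timedelta


def indices_for_lookback(end_day, lookback_days, prefix="zipkin"):
    """Return daily index names, newest first, covering lookback_days days.

    Assumes one index per day named <prefix>-YYYY-MM-DD.
    """
    if lookback_days < 1:
        raise ValueError("lookback_days must be at least 1")
    return [
        f"{prefix}-{(end_day - timedelta(days=offset)).isoformat()}"
        for offset in range(lookback_days)
    ]


# Restrict a names query to the three most recent daily indices
# instead of a catch-all pattern over every index.
recent = indices_for_lookback(date(2016, 12, 30), 3)
print(",".join(recent))
```

The comma-joined names could then be used in place of the catch-all index pattern. The trade-off is that services which only reported before the cut-off would no longer show up.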

@adriancole adriancole added a commit that referenced this issue Dec 30, 2016
@adriancole adriancole Don't sort span names in elasticsearch
While probably not a big contributor to performance, we don't need to
sort span names in elasticsearch as we already copy out into a sorted
list.

See #1462
0399f64
@mansu
Contributor
mansu commented Dec 30, 2016

@adriancole While cutting off after a certain date is a good idea, I think a better alternative is to add a mechanism to cache the service and span names in the persistent layer. That way we can run an expensive query periodically and cache accurate data instead of returning incomplete data every time. The current model of relying on HTTP caching is too expensive when a lot of users use the Zipkin UI, and is not scalable (for us) in the long run. Loading span and service names is the slowest part of our UI, and currently it takes 10-15 seconds before the UI is usable. We are currently sampling at 1/3 of the expected rate, but once we sample at the expected rate, the UI will take even longer to load.
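
A rough sketch of the "run the expensive query periodically and serve cached names" idea. This is an in-memory, time-based cache rather than the persistent-layer cache proposed above, and it is not thread-safe. The class name `CachedNames`, the `loader` callable, and the TTL value are all hypothetical.

```python
import time


class CachedNames:
    """Serve names from memory; re-run the expensive loader once the TTL expires."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader        # callable that runs the expensive query
        self._ttl = ttl_seconds      # how long a loaded result stays fresh
        self._clock = clock          # injectable so expiry can be tested
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # De-duplicate and sort on the client side after loading.
            self._value = sorted(set(self._loader()))
            self._expires_at = now + self._ttl
        return self._value


def slow_service_names():
    # Stand-in for the real storage query.
    return ["frontend", "backend", "frontend"]


service_names = CachedNames(slow_service_names, ttl_seconds=300)
print(service_names.get())
```

With this shape, only the first request after each expiry pays for the query; every other UI request reads the cached list.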

@adriancole
Contributor
@mansu
Contributor
mansu commented Dec 30, 2016

Thanks. Will look into this and let you know.

@adriancole adriancole added a commit that referenced this issue Jan 2, 2017
@adriancole adriancole Don't sort span names in elasticsearch (#1463)
While probably not a big contributor to performance, we don't need to
sort span names in elasticsearch as we already copy out into a sorted
list.

See #1462
56fbfab