Suggestions to improve the user experience for the ES database #3767

jblespiau · 2016-01-17T21:39:43Z

I have been trying to use the new Elasticsearch database instead of my previous one, and I have encountered several difficulties. I am willing to discuss them, suggest ways to fix these issues and contribute myself.

I first list the issues I had, and then suggest solutions, which are open for comment.

I have had difficulty finding proper documentation on the use of the ES database. For instance, there is no information about the ES mappings that the ES query editor expects. I tried using the following data:
{ 'metric_value': 42,
  'tags': { 'host': 'machine 1' } }

with bad results since, as far as I understand, a flat mapping is expected ({ 'metric_value': 42, 'host': 'machine 1' }) for auto-completion to work.
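To make the difference between the two shapes concrete, here is a minimal Python sketch that hoists the entries of a nested `tags` object to the top level of a document before indexing. The function name `hoist_tags` and the `tags_key` parameter are hypothetical names chosen for illustration; nothing here is part of Grafana or Elasticsearch.

```python
from typing import Any, Dict


def hoist_tags(doc: Dict[str, Any], tags_key: str = "tags") -> Dict[str, Any]:
    """Return a copy of doc with the nested tags object merged into the top level.

    Hypothetical helper: it assumes tags live under a single key (tags_key)
    and that no tag name collides with an existing top-level field.
    """
    flat = {key: value for key, value in doc.items() if key != tags_key}
    flat.update(doc.get(tags_key, {}))
    return flat


nested = {"metric_value": 42, "tags": {"host": "machine 1"}}
print(hoist_tags(nested))
```

The input document is left unchanged, so the nested form can still be kept elsewhere if needed.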

To understand the expected mapping and the way auto-completion works, I had to look at the Grafana source code and at the requests sent by Grafana from the web browser.

I expect the ES database to follow the workflow Grafana already offers for other databases, meaning:

- we choose the metric we want to display by its name, using an auto-completed list,
- we select tags among those associated with the given metric, and select their values (including the '*' operator I have been using in OpenTSDB, whose technical name is unknown to me).

This should structure the way we fill in the query editor. However, this is not the case for the ES editor, which is generic by design:

- the auto-completion of the metric field seems to draw from the numerical fields defined in the mappings,
- the auto-completion of tags seems to draw from the string fields defined in the mappings.

Thus, when we have selected a metric, the tags available through auto-completion are all the tags defined for all the metrics, while only a few of them make sense for that specific metric.
To give an example, we can have the following 2 metric values stored in the ES index:
{ 'cpu_usage': 0.8,
  'host': 'fqdn1',
  'timestamp': xxx }
{ 'ip_throughput': 1031,
  'host': 'fqdn1',
  'interface': 'eth0',
  'timestamp': xxx }

While selecting the 'cpu_usage' metric, only aggregation/filtering on the 'host' tag makes sense, yet the auto-complete will display both the 'host' and 'interface' tags. This becomes unmanageable with hundreds of metrics, displaying hundreds of tags, while only 3 or 4 are useful for the given metric.
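The gap between the two behaviours can be sketched in a few lines of Python over in-memory documents. This is only an illustration of the reasoning, not how Grafana builds its suggestions: the function names are hypothetical, the timestamps are placeholder integers, and "tag" is taken to mean "string-valued field", as described above.

```python
from typing import Any, Dict, List, Set


def string_fields(doc: Dict[str, Any]) -> Set[str]:
    """Names of the fields of doc whose value is a string (treated as tags)."""
    return {key for key, value in doc.items() if isinstance(value, str)}


def tags_across_index(docs: List[Dict[str, Any]]) -> Set[str]:
    """Generic behaviour: every string field seen anywhere in the index."""
    tags: Set[str] = set()
    for doc in docs:
        tags |= string_fields(doc)
    return tags


def tags_for_metric(docs: List[Dict[str, Any]], metric: str) -> Set[str]:
    """Desired behaviour: only string fields of documents carrying the metric."""
    tags: Set[str] = set()
    for doc in docs:
        if metric in doc:
            tags |= string_fields(doc)
    return tags


# Placeholder timestamps; the values are irrelevant to the comparison.
docs = [
    {"cpu_usage": 0.8, "host": "fqdn1", "timestamp": 0},
    {"ip_throughput": 1031, "host": "fqdn1", "interface": "eth0", "timestamp": 1},
]

print(sorted(tags_across_index(docs)))             # ['host', 'interface']
print(sorted(tags_for_metric(docs, "cpu_usage")))  # ['host']
```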

Here are my suggestions, open for comment, since my goal is to bring the ES database up to the standard of the other databases.

To be able to auto-complete the names of the metrics, and the tags of a given metric, I suggest that each metric be stored with a separate Elasticsearch type within the index. This way, we can auto-complete the names of the metrics by looking at the types defined in the index, and we can auto-complete the tags for a given metric by looking at the string fields of that metric's type.
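A rough Python sketch of this lookup, working on a hand-written dictionary rather than a live cluster. The dictionary is assumed to mirror the per-type section of an index mapping (type name, then `properties`, then each field with its `type`) as returned by Elasticsearch versions that still have mapping types; the function names and the sample fields are hypothetical.

```python
from typing import Any, Dict, List

# Assumed shape: one entry per Elasticsearch type, i.e. one entry per metric.
mappings: Dict[str, Any] = {
    "cpu_usage": {
        "properties": {
            "cpu_usage": {"type": "float"},
            "host": {"type": "string"},
            "timestamp": {"type": "date"},
        }
    },
    "ip_throughput": {
        "properties": {
            "ip_throughput": {"type": "long"},
            "host": {"type": "string"},
            "interface": {"type": "string"},
            "timestamp": {"type": "date"},
        }
    },
}


def metric_names(type_mappings: Dict[str, Any]) -> List[str]:
    """Metric suggestions: the names of the types defined in the index."""
    return sorted(type_mappings)


def tags_for_metric(type_mappings: Dict[str, Any], metric: str) -> List[str]:
    """Tag suggestions: the string fields declared for the metric's type."""
    properties = type_mappings[metric]["properties"]
    return sorted(
        name for name, spec in properties.items() if spec.get("type") == "string"
    )


print(metric_names(mappings))                  # ['cpu_usage', 'ip_throughput']
print(tags_for_metric(mappings, "cpu_usage"))  # ['host']
```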

Then we can modify the Elasticsearch queries of the query editor to make the auto-complete work, and add some documentation. I am willing to do so if the community thinks this is a good idea.

There is one technical issue due to Elasticsearch. Since all metrics are stored in a single index, and a field can only have one datatype within an ES index, we cannot store values of different types in a given field. For instance, we can't have:
{ '_type': 'cpu_usage',
  'value': 0.90 }
and
{ '_type': 'number_interfaces',
  'value': 3 }
and
{ '_type': 'is_something',
  'value': true }

If we are dealing only with numerical values, we could use a float field datatype, but I assume that metrics can also be boolean values, or strings. Is that correct? If so, we need to find a trick, such as creating value, value_int, value_string, etc. fields in the mapping, or storing the value in a specific field named after the metric, so that every 'value field' has a different name within an index (e.g. { '_type': 'cpu_usage', 'cpu_usage': 0.90 } does not conflict with { '_type': 'is_something', 'is_something': true }). I'm still thinking about it; if anyone has a better idea or some insight, I'm interested.
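Both tricks can be sketched in Python as plain document builders. This is only an illustration of the two naming schemes under discussion: the function names and the `value_float` / `value_bool` suffixes are assumptions added for the example, and `_type` appears inside the document only to follow the notation used in this thread.

```python
from typing import Any, Dict, Optional


def doc_named_after_metric(
    metric: str, value: Any, tags: Optional[Dict[str, str]] = None
) -> Dict[str, Any]:
    """Trick 1: store the value in a field named after the metric itself."""
    doc: Dict[str, Any] = {"_type": metric, metric: value}
    doc.update(tags or {})
    return doc


def typed_value_field(value: Any) -> str:
    """Trick 2: pick one value field per datatype (hypothetical field names)."""
    # bool must be tested before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "value_bool"
    if isinstance(value, int):
        return "value_int"
    if isinstance(value, float):
        return "value_float"
    if isinstance(value, str):
        return "value_string"
    raise TypeError(f"unsupported metric value type: {type(value).__name__}")


def doc_with_typed_field(
    metric: str, value: Any, tags: Optional[Dict[str, str]] = None
) -> Dict[str, Any]:
    """Build a document whose value field name depends on the value's type."""
    doc: Dict[str, Any] = {"_type": metric, typed_value_field(value): value}
    doc.update(tags or {})
    return doc


print(doc_named_after_metric("cpu_usage", 0.90, {"host": "fqdn1"}))
print(doc_with_typed_field("is_something", True))
```

With the first scheme the number of fields in the index grows with the number of metrics; with the second it stays fixed, but a query has to know which typed field a given metric uses.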

torkelo · 2016-01-18T09:37:29Z

Elasticsearch is not a time series database but a general-purpose document store with very powerful query features. It does not have a fixed metric schema that people use, and it does not have a metric metadata store (with info about metric keys and tags), so the query editor is limited to providing some general auto-completion (for example, doing a lookup on which fields exist in the mapping and their types).

Metric aggregations can only be done on numeric fields; if done on a string field, ES will throw an exception.

Nested properties work with the table panel and Raw Document mode, but currently the auto-complete does not show nested properties in the terms and value field suggestions (though it should still work if you enter the nested property manually, like tags.host). I will open an issue for this.
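For illustration, listing nested properties as dotted paths such as tags.host amounts to a recursive walk over the `properties` section of a mapping. The Python sketch below shows the idea on a hand-written dictionary; it is not Grafana's implementation, and the function name `field_paths` and the sample mapping are hypothetical.

```python
from typing import Any, Dict


def field_paths(properties: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Map each leaf field to its type, using dotted paths for nested objects."""
    paths: Dict[str, Any] = {}
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: recurse and extend the dotted prefix.
            paths.update(field_paths(spec["properties"], path + "."))
        else:
            paths[path] = spec.get("type")
    return paths


# Assumed mapping for the nested document discussed earlier in this thread.
properties = {
    "metric_value": {"type": "long"},
    "tags": {"properties": {"host": {"type": "string"}}},
}

print(field_paths(properties))
```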

torkelo · 2016-01-18T09:39:39Z

#3772

jblespiau · 2016-01-18T19:10:27Z

It was precisely in order to have a query editor that is not limited that I was suggesting a fixed metric schema. Currently, it is a good proof of concept, or a generic editor for displaying a few graphs, but it is not designed to replace other back-ends such as Graphite or OpenTSDB. I think my suggestions make that possible.

Since I really want to use ES for its reliability, scalability, and ease of maintenance, I will fork and simply add a database and modify the auto-complete so that the auto-completed lists are semantically correct. It will still be general (using the fields that exist in the mappings). The only specificity will be that each metric is stored in its own _type within the index.

I hope this will help people, and maybe it will be merged if it provides a feature that is interesting enough. I will post again when it is completed.

torkelo · 2016-01-18T19:15:01Z

sounds great :)

jblespiau changed the title ~~Suggestions to improve the user experience for ES database~~ Suggestions to improve the user experience for the ES database Jan 17, 2016

torkelo closed this as completed Jan 18, 2016

jblespiau mentioned this issue Feb 24, 2016

Fixed schema elasticsearch #4149

Closed
