Suggestions to improve the user experience for the ES database #3767

Closed
jblespiau opened this issue Jan 17, 2016 · 4 comments

Comments

@jblespiau
Contributor

I have been trying to use the new Elasticsearch database instead of my previous one, and I have encountered several difficulties. I am willing to discuss them, suggest ways to fix these issues and contribute myself.

I first list the issues I had, and then suggest solutions, which are open for comments.

  1. I had difficulty finding proper documentation on the use of the ES database. For instance, there is no information about the ES mappings that the ES Query Editor expects. I tried using the following data:
    { 'metric_value': 42,
    'tags': { 'host': 'machine 1' } }

with bad results since, from what I have understood, a flat mapping ({'metric_value': 42, 'host': 'machine 1'}) is expected for auto-completion to work.

To understand the expected mapping, and the way auto-completion works, I had to look at the Grafana source code and at the requests sent by Grafana from the web browser.
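
To illustrate the difference between the two document shapes above, here is a minimal Python sketch that flattens a nested document before indexing it. The helper name `flatten_document` is hypothetical and is not part of Grafana or Elasticsearch; it only hoists the keys of nested objects to the top level, assuming that no two tags share a name.

```python
def flatten_document(doc):
    """Hoist the keys of nested dicts to the top level (hypothetical helper)."""
    flat = {}
    for key, value in doc.items():
        if isinstance(value, dict):
            # Recurse so that deeper levels are hoisted as well.
            items = list(flatten_document(value).items())
        else:
            items = [(key, value)]
        for name, item in items:
            if name in flat:
                raise ValueError("duplicate field after flattening: %s" % name)
            flat[name] = item
    return flat


nested = {"metric_value": 42, "tags": {"host": "machine 1"}}
flat = flatten_document(nested)
print(flat)
```

The flattened document has `host` as a top-level string field, which is the shape the auto-completion appeared to expect.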

  2. I expect the ES DB to follow the workflow Grafana offers for other databases, meaning:
  • we choose the metric we want to display by its name, using an auto-completed list
  • we select tags among those associated with the given metric, and select their values (including the '*' operator I have been using in OpenTSDB, whose technical name is unknown to me)

This should structure the way we fill in the query editor. However, it is not the case for the ES editor, which is generic by design:

  • the auto-completion of the metric fields seems to be among the numerical fields defined in the mappings,
  • the auto-completion of tags seems to be among the string fields defined in the mappings.

Thus, once we have selected a metric, the tags available through auto-completion are all the tags defined for all the metrics, while only a few of them make sense for that specific metric.
To give an example, we can have the 2 following metric values stored in the ES index:
{ 'cpu_usage': 0.8,
'host': 'fqdn1', 'timestamp' : xxx}
{'ip_throughput': '1031',
'host': 'fqdn1',
'interface': 'eth0', 'timestamp' : xxx}

When the 'cpu_usage' metric is selected, only aggregation/filtering on the 'host' tag makes sense, yet the auto-complete will display both the 'host' and 'interface' tags. This becomes unmanageable with hundreds of metrics: hundreds of tags are displayed, while only 3 or 4 are useful for the given metric.

Here are my suggestions, open for comments, since my goal is to improve the ES database to meet the standards of the other databases.

To be able to auto-complete the names of the metrics, and the tags of a given metric, I suggest that each metric be stored as a separate Elasticsearch type within the index. This way, we can auto-complete the names of the metrics by looking at the types defined in the index, and we can auto-complete the tags for a given metric by looking at the string fields of that metric's type.
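
A minimal sketch of this lookup in Python, assuming an Elasticsearch version that still supports several mapping types per index and a get-mapping response shaped as `{index: {"mappings": {type: {"properties": {...}}}}}`. The function names, the index name `metrics`, and the sample mapping are hypothetical and only serve the illustration.

```python
def metric_names(mapping_response, index):
    """Metric names are the mapping types defined in the index."""
    return sorted(mapping_response[index]["mappings"])


def tags_for_metric(mapping_response, index, metric):
    """Tags of a metric are the string fields of that metric's type."""
    properties = mapping_response[index]["mappings"][metric].get("properties", {})
    return sorted(
        name for name, spec in properties.items() if spec.get("type") == "string"
    )


# Hypothetical mapping response for the two example metrics.
sample_mapping = {
    "metrics": {
        "mappings": {
            "cpu_usage": {
                "properties": {
                    "cpu_usage": {"type": "double"},
                    "host": {"type": "string"},
                    "timestamp": {"type": "date"},
                }
            },
            "ip_throughput": {
                "properties": {
                    "ip_throughput": {"type": "long"},
                    "host": {"type": "string"},
                    "interface": {"type": "string"},
                    "timestamp": {"type": "date"},
                }
            },
        }
    }
}

print(metric_names(sample_mapping, "metrics"))
print(tags_for_metric(sample_mapping, "metrics", "cpu_usage"))
```

With this layout, selecting 'cpu_usage' offers only 'host', while 'ip_throughput' offers 'host' and 'interface'.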

Then we can modify the Elasticsearch queries of the query editor to make the auto-complete work, and add some documentation. I'm willing to do so if the community thinks this is a good idea.

There is one technical issue due to Elasticsearch. Since all metrics are stored in a single index, and a field can only have one datatype within an ES index, we cannot store values of different types in a given field. For instance, we can't have:
{ '_type': 'cpu_usage',
'value': 0.90 }
and
{'_type': 'number_interfaces',
'value': 3 }
and { '_type': 'is_something', 'value': true }

If we are dealing only with numerical values, we could use a float field datatype, but I assume that metrics can also be boolean values, or strings. Is that correct? If so, we need to find a trick, such as creating value, value_int, value_string, etc. fields in the mapping, or storing the value in a field named after the metric, so that every 'value field' has a different name within an index (e.g. { '_type': 'cpu_usage', 'cpu_usage': 0.90 } does not conflict with { '_type': 'is_something', 'is_something': true }). I'm still thinking about it; if anyone has a better idea or some insight, I'm interested.
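
The two workarounds can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the `value_bool` and `value_float` field names are hypothetical (only `value_int` and `value_string` are mentioned above), and `_type` is written inside the document to mirror the notation used in this issue.

```python
def typed_value_field(value):
    """Pick a value field name from the Python type of the value."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "value_bool"
    if isinstance(value, int):
        return "value_int"
    if isinstance(value, float):
        return "value_float"
    if isinstance(value, str):
        return "value_string"
    raise TypeError("unsupported metric value: %r" % (value,))


def doc_with_typed_field(metric, value):
    """Option 1: one value field per datatype, shared by all metrics."""
    return {"_type": metric, typed_value_field(value): value}


def doc_with_named_field(metric, value):
    """Option 2: the value field is named after the metric itself."""
    return {"_type": metric, metric: value}


print(doc_with_typed_field("number_interfaces", 3))
print(doc_with_typed_field("is_something", True))
print(doc_with_named_field("cpu_usage", 0.90))
```

In both options no field name ever receives values of two different datatypes, which is the constraint described above.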

@jblespiau jblespiau changed the title from "Suggestions to improve the user experience for ES database" to "Suggestions to improve the user experience for the ES database" Jan 17, 2016
@torkelo
Member

torkelo commented Jan 18, 2016

Elasticsearch is not a time series database but a general-purpose document store with very powerful query features. It does not have a fixed metric schema that people use, and it does not have a metric metadata store (with info about metric keys and tags), so the query editor is limited to providing some general auto-completion (for example, doing a lookup on which fields exist in the mapping and their types).

Metric aggregations can only be done on numeric fields; if done on a string field, ES will throw an exception.
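
As a rough sketch of what such a request looks like, the Python function below builds the body of an average-over-time aggregation and refuses to build it for a non-numeric field. It assumes the `date_histogram` `interval` parameter as it existed in Elasticsearch at the time of this issue; the function name, the default `timestamp` field, and the list of numeric datatypes are assumptions made for this example and are not taken from Grafana's code.

```python
NUMERIC_TYPES = frozenset(["byte", "short", "integer", "long", "float", "double"])


def avg_over_time_body(field, field_type, time_field="timestamp", interval="1m"):
    """Build a search body averaging `field` per time bucket (hypothetical helper)."""
    if field_type not in NUMERIC_TYPES:
        # Guard on the client side instead of waiting for ES to raise.
        raise ValueError(
            "cannot run a metric aggregation on %s field %r" % (field_type, field)
        )
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "over_time": {
                "date_histogram": {"field": time_field, "interval": interval},
                "aggs": {"avg_value": {"avg": {"field": field}}},
            }
        },
    }


body = avg_over_time_body("cpu_usage", "double")
print(body)
```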

Nested properties work with the table panel and Raw Document mode, but currently the auto-complete does not show nested properties in the terms and value field suggestions (though it should still work if you enter the nested property manually, like tags.host). I will open an issue for this.
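
For reference, dotted names such as tags.host can be derived from a mapping by walking its `properties` recursively. The Python sketch below shows one way to do it; the function name `field_paths` and the sample mapping are hypothetical, and this is not how Grafana implements its suggestions.

```python
def field_paths(properties, prefix=""):
    """Map every leaf field of a mapping to its dotted path and datatype."""
    paths = {}
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: descend and prefix the children with "parent."
            paths.update(field_paths(spec["properties"], path + "."))
        else:
            paths[path] = spec.get("type")
    return paths


# Hypothetical mapping for the nested document from the first comment.
properties = {
    "metric_value": {"type": "long"},
    "tags": {"properties": {"host": {"type": "string"}}},
}
paths = field_paths(properties)
print(sorted(paths))
```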

@torkelo torkelo closed this as completed Jan 18, 2016
@torkelo
Member

torkelo commented Jan 18, 2016

#3772

@jblespiau
Contributor Author

It was exactly in order to have a query editor without these limits that I was suggesting a fixed metric schema. Currently, it is a good proof of concept, or a generic editor to display a few graphs, but it is not designed to replace other back-ends such as Graphite or OpenTSDB. I think my suggestions make that possible.

Since I really want to use ES for its reliability, scalability, and easy maintenance, I will fork and simply add a Database and modify the auto-complete so that the auto-completed lists are semantically correct. It will still be general (using the fields that exist in the mappings). The only specificity will be that each metric is stored in its own _type within the index.

I hope this will help people, and maybe it will be merged if it supplies a feature interesting enough. I will post again when completed.

@torkelo
Member

torkelo commented Jan 18, 2016

sounds great :)
