[ML] User-friendly experience for categorization of text fields #17997

elasticmachine · 2017-05-25T10:17:25Z

Original comment by @droberts195:

This came out of a Slack chat with @peteharverson. It was also something that was brought up on the IRC channel during the recent ML webinar.

We would expect categorization to be applied to log messages, and we would expect people to be storing log messages in text fields, because that's what you have to do to make use of Elasticsearch's text search.

Additionally, the reverse search terms we generate as an output of categorization can only be used to efficiently search text fields.

However, at present we make it very hard for people to use a text field as their categorization_field_name when feeding a job with a datafeed. They have to set the obscure "_source": true setting in the JSON.

I propose the following:

- If a field of type text is selected as the categorization_field_name, we automatically set "_source": true in the datafeed config.
- If a field that is not of type text is selected as the categorization_field_name, we warn people that it's unlikely to work well with categorization.
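
The two rules above could be sketched roughly as follows. This is only an illustration, not the actual UI or datafeed code: the function name `apply_categorization_defaults`, the `field_types` lookup, and the warning text are all hypothetical, and the only config key taken from this issue is `"_source"`.

```python
from copy import deepcopy


def apply_categorization_defaults(datafeed_config, field_types, categorization_field_name):
    """Apply the two proposed rules and return (new_config, warnings).

    datafeed_config: dict holding the datafeed JSON (left unmodified).
    field_types: hypothetical lookup of field name -> mapping type.
    categorization_field_name: the field chosen for categorization, or None.
    """
    config = deepcopy(datafeed_config)
    warnings = []

    # Nothing to do when the job does not use categorization.
    if categorization_field_name is None:
        return config, warnings

    field_type = field_types.get(categorization_field_name)
    if field_type == "text":
        # Rule 1: text fields need "_source": true, so set it automatically.
        config["_source"] = True
    else:
        # Rule 2: anything else gets a warning instead of a silent failure.
        warnings.append(
            "Field '%s' has type '%s', not 'text'; it is unlikely to work "
            "well with categorization." % (categorization_field_name, field_type)
        )
    return config, warnings
```

For example, calling it with `field_types={"message": "text"}` and `categorization_field_name="message"` returns a config containing `"_source": True` and an empty warning list, while a keyword field returns the config unchanged plus one warning.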

elasticmachine · 2017-05-25T13:46:36Z

Original comment by @skearns64:

++, this will help the vast majority of users.

elasticmachine · 2017-05-25T13:55:43Z

Original comment by @Harvey-Maddocks:

I have created a snapshot dataset called it_ops_new_raw_snapshot of an index called it_ops_new_raw, which will help with testing this behaviour. It contains a type called logs (which is just the old it_ops_app_logs dataset), whose mapping for the message field has both a text type and a keyword type.
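
When testing against a mapping like this, a helper along the following lines could list which field paths are text and which are keyword. This is a sketch under assumptions: the actual mapping of it_ops_new_raw is not shown in this thread, so `example_properties` assumes the common multi-field layout (a text field with a keyword sub-field named `keyword`), and `flatten_field_types` is a hypothetical helper name.

```python
def flatten_field_types(properties, prefix=""):
    """Flatten a mapping "properties" dict into {field_path: type}.

    Multi-fields (declared under "fields") and nested object fields
    (declared under "properties") are both included with dotted paths.
    """
    types = {}
    for name, spec in properties.items():
        path = prefix + name
        if "type" in spec:
            types[path] = spec["type"]
        # Multi-fields, e.g. a keyword sub-field on a text field.
        for sub_name, sub_spec in spec.get("fields", {}).items():
            types[path + "." + sub_name] = sub_spec["type"]
        # Object fields carry their own nested properties.
        if "properties" in spec:
            types.update(flatten_field_types(spec["properties"], path + "."))
    return types


# Assumed shape only; the real mapping may differ.
example_properties = {
    "message": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword"}},
    }
}
```

With this assumed mapping, `message` would resolve to text and `message.keyword` to keyword, which covers both branches of the proposed behaviour in one dataset.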

elasticmachine added :ml Feature:Anomaly Detection ML anomaly detection discuss release_note:enhancement labels Apr 24, 2018
