[Transforms] Graceful handling for _preview when source indices are missing #87074

dolaru · 2022-05-24T10:01:15Z

Context

As an example, the Endpoint Security integration creates a couple of transforms when it's set up, but the source indices for these transforms are not available until the integration starts collecting data from agents.

Until the indices are created and data starts coming in, the users will see this in the Preview tab:

These errors come from the _preview endpoint.

Enhancement

The _preview endpoint should either return 0 documents or return an error message that describes the situation in a way that's more helpful to the user.

The text was updated successfully, but these errors were encountered:

elasticmachine · 2022-05-24T10:01:18Z

Pinging @elastic/ml-core (Team:ML)

benwtrent · 2022-05-24T12:05:18Z

Honestly, I am not 100% sure about this. We should totally indicate VERY clearly that there is no data in the _preview.

przemekwitek · 2022-05-25T12:04:55Z

or return an error message that describes the situation in a way that's more helpful to the user.

Honestly, I am not 100% sure about this. We should totally indicate VERY clearly that there is no data in the _preview.

IIUC these 2 do not contradict. I agree we should return an error and make the error message understandable.

sophiec20 · 2022-05-25T13:08:54Z

One approach would be to match what _search does. For example, if a wildcard is used in the source index then zero docs are returned without errors. If a concrete index name is used, and it does not exist then an error is shown. Whilst it might not be the most intuitive in all cases, at least following the behaviour of _search introduces a consistency within the stack.

benwtrent · 2022-05-25T13:11:47Z

One approach would be to match what _search does.

The problem I have with this, is that it gives no indication as to why no docs were matched. In one case it could be that there are no indices (e.g. the wrong pattern could have been chosen, or the user knows about this and its fine), or the user who is creating the transform has no permissions to read the data. One of those the user should be able to act on. If we do not distinguish the "why" behind no docs, we make the experience worse for the user.

sophiec20 · 2022-05-25T13:22:20Z

I agree and perhaps that should be fixed in _search.

It would be good to also consider the consistency between the behaviour of _preview for datafeeds and transforms as the same team supports both.

droberts195 · 2022-05-25T13:52:39Z

or the user who is creating the transform has no permissions to read the data

One of the principles behind the way index security works is that it shouldn't be possible for a user who doesn't have permission to read the cluster state to determine which indices exist by getting "permission denied" errors.

So for example if an index top-secret-index exists and a user who doesn't have permission to read top-secret-index searches it then they should get the same response as if top-secret-index didn't exist. Similarly if they search top-secret*.

It would be good to also consider the consistency between the behaviour of _preview for datafeeds and transforms as the same team supports both.

In both transforms and datafeeds we're trying to check up-front what a later search will be able to do. The only tool we have for this is the _has_privileges API. But actually this behaves in a subtly different way to the actual _search API.

If you ask _has_privileges whether a user can search .fleet-agents* then it will respond with false unless the user has sufficiently wildcarded permissions to search any possible future index that might match the pattern .fleet-agents*. If they have permission to search .fleet-agents-2022* then it will respond with false because they can't search .fleet-agents-does-not-exist-and-never-will. In contrast the _search API will happily return results from .fleet-agents-2022* and silently ignore .fleet-agents-2021* that the user doesn't have permissions for.

This discrepancy between _search and _has_privileges is where we're getting ourselves in a mess.

At the moment our rule is that if you want to configure a transform or datafeed to search .fleet-agents* then you have to have permissions over that whole wildcarded namespace. Maybe that rule is no longer fit for purpose. But then, given the way _has_privileges works, all we could do is let the transform or datafeed start and retrieve data from the subset of that namespace that the user is actually allowed to search (which could be anywhere from the entire subset that actually exists to nothing).

hendrikmuhs · 2022-05-25T14:10:14Z

Regarding the "no feedback" issue:

Search provides hits and it returns _shards in the result object and therefore gives an indication if e.g. no indices could be resolved due to a misspelling in the config and if you made a mistake in the query, you can see 0 hits but e.g. that 1 shard has matched.

We lack those information in _preview, so one way to think about this is exposing more data from the search in _preview, we have it, we just ignore it right now. Indicating the "size of a transform" seems like a good addition, too.

(However _shards seems a bit too technical (is it legacy?), as a user I rather prefer to get the number of resolved indices, if we get that information)

Regarding privileges: I agree we can't disclose the existence of an index, therefore it seems to me, we can't do anything if wildcards are involved. All the indices we talk about seem to be data streams, it would be better if they switch to data streams instead of using the old school syntax.

LBNL the error is about a system transform, longer term we should hide that transform in the default view. It should at least require ticking a checkbox to see it.

dolaru added >enhancement :ml/Transform Transform needs:triage Requires assignment of a team area label labels May 24, 2022

elasticmachine added the Team:ML Meta label for the ML team label May 24, 2022

stu-elastic removed the needs:triage Requires assignment of a team area label label May 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Transforms] Graceful handling for _preview when source indices are missing #87074

[Transforms] Graceful handling for _preview when source indices are missing #87074

dolaru commented May 24, 2022 •

edited

elasticmachine commented May 24, 2022

benwtrent commented May 24, 2022

przemekwitek commented May 25, 2022

sophiec20 commented May 25, 2022

benwtrent commented May 25, 2022

sophiec20 commented May 25, 2022

droberts195 commented May 25, 2022

hendrikmuhs commented May 25, 2022

[Transforms] Graceful handling for _preview when source indices are missing #87074

[Transforms] Graceful handling for _preview when source indices are missing #87074

Comments

dolaru commented May 24, 2022 • edited

Context

Enhancement

elasticmachine commented May 24, 2022

benwtrent commented May 24, 2022

przemekwitek commented May 25, 2022

sophiec20 commented May 25, 2022

benwtrent commented May 25, 2022

sophiec20 commented May 25, 2022

droberts195 commented May 25, 2022

hendrikmuhs commented May 25, 2022

dolaru commented May 24, 2022 •

edited