Introduce authorization for enrich in ESQL #99646
Pinging @elastic/es-ql (Team:QL)
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)
Pinging @elastic/es-security (Team:Security)
@dnhatn I am no longer on the security team. So I'll ping @jakelandis for a formal review. Briefly looking through the changes, I think the general approach makes sense to me. We might need to tweak exactly what privilege is required for resolving the enrich policy. Right now it is |
@ywangd No problem. Thank you for taking a look. I've requested @jakelandis to take a look at the PR. |
Do I understand this correctly?

Before: any user could use the

With this PR: we no longer use the ClientHelper.ENRICH_ORIGIN, which means the user must be explicitly granted permissions. Specifically, the user now needs index:read (or explicitly named: indices:data/read/esql/lookup) for the enrich policy name and cluster:monitor (or explicitly named: cluster:monitor/esql/resolve_enrich) in order to use the 'enrich' keyword. The new cluster:monitor is needed since the enrich indices are system (or at least hidden) indices and the concrete index names are implementation details. So this is needed to map the friendly name of the enrich policy to the underlying index name.

Assuming I understand the change correctly (I may not!) ... it feels odd that we are applying index level privileges on the enrich policy name. I worry that it violates our expectations w.r.t. applying privileges for indices/aliases/data streams. I am also kinda surprised this actually works, since for index level permissions we iterate through the known indices, data streams, and aliases, compare them against what is in the role, and then rewrite the indices that are requested, omitting the ones you don't have access to. I am failing to see how that works since that security logic does not take the enrich resolution into account. However, I am also not familiar with how ESQL works and it feels like I am missing a key piece of information to understand how this works (is that what the messageReceived override is doing? ...then how is the comparison against the friendly name working?). A better understanding here might help with these technical concerns.

However, there is still a non-technical concern w.r.t. placing non-index abstractions (alias, data stream, index name) in the indices name of the role. For example, I cannot query "test-enrich" with the normal _search API and am concerned this could lead to confusion about what 'indices.names' is... how does this work with Kibana UIs? How do users know to put the policy name in there? And the eventual questions about why it does not also work with enrich via ingest node pipelines or policy execution.

I also have concerns w.r.t. normal users that just want to do some queries needing the cluster:monitor privilege. That cluster privilege is intended to be a (lower) admin level set of privileges. There probably isn't too much of a pure security concern with giving normal users that access, in the sense that it does not leak credentials or the like... but doing so IMO violates least privilege. It doesn't seem right that to use enrich queries I also have to give users permission to see things like autoscaling stats or ml jobs or ccr stats, etc. Being overly restrictive can have the opposite effect, in that some admins will just give wider permissions (lowering security) to users that don't need them, just to accomplish the goal without fully understanding the implications. Also, I don't believe that any normal users in serverless will ever have this permission.

Ideally we would have a permission model for enrich that is applicable to ESQL, ingest nodes, and policy execution, and that does not require users to know how to configure indices permissions for things that are not index abstractions (as defined by being able to _search on those names). This probably looks like a new field in the role, 'enrich.names', where the enrich names are treated like abstractions such as date math or aliases that get resolved to the allowed list of concrete index names in the security (and non-security) index resolution. That model would conform to the existing expectations, be accessible to ingest node / policy execution workflows, and IMO better progress enrich as a concept.

Ultimately you are hitting on our lack of resource specific permissions, and there might be some less involved short term solutions. However, before proposing any alternatives I would want to double check my understanding and better understand the requirements. Those conversations probably extend beyond this PR.
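To make the index-resolution concern above concrete, here is a minimal Python sketch of the filtering step as the comment describes it: requested names are compared against the known index abstractions and the role's patterns, and anything that does not pass is dropped. This is a toy model, not Elasticsearch's authorization code; the function name, the sample index names, and the use of glob matching are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def authorized_names(requested, known_abstractions, role_patterns):
    """Return the requested names that are known abstractions AND match a role pattern.

    Hypothetical helper: models the "iterate, compare, rewrite" step
    described in the comment, nothing more.
    """
    allowed = []
    for name in requested:
        if name not in known_abstractions:
            # Not an index, alias, or data stream: nothing to authorize against.
            continue
        if any(fnmatchcase(name, pattern) for pattern in role_patterns):
            allowed.append(name)
    return allowed


# Illustrative data only. "test-enrich" is a policy name, not an index abstraction.
known = {"logs-2023", "logs-alias", "metrics"}
role_patterns = ["logs-*", "test-enrich"]

result = authorized_names(["logs-2023", "metrics", "test-enrich"], known, role_patterns)
print(result)
```

In this model, listing a policy name in the role has no effect on its own, because the name never appears among the known abstractions. That is the mismatch the comment is asking about.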
@jakelandis If I understand your comment correctly, I think you might have been misled by the name of the index in the tests.
So the ES|QL command is
Which (more or less) translates to
The security change is in how step 2 is applied.
The change to the security setup in
None of those users have access to anything named Interestingly, despite the PR description, the actual enrichment still happens using the enrich origin which explains why it works
As @dnhatn has indicated, the ideal outcome would be to have a way to limit which enrich policies a user may use within ES|QL. It looks like we could probably do that with a
@jakelandis @tvernum Thank you so much for looking. Tim has summarized this perfectly. In the enrich lookup in ESQL, there are two actions:
I believe we don't require any extra privilege for ingest-enrich. |
...no I did not. Thanks Tim and Nhat for the clarifications. One last clarification:
You mean the index being enriched... right? The enrich index is defined here as "A special system index tied to a specific enrich policy". Sorry to be pedantic, but confusing the security requirements of the two was the source of my original confusion. Assuming we are all on the same page... The only relevant parts of the original comment are:
While ideal that would be quite involved. An implementation with
While it is possible to configure a role with the string
@jakelandis Thanks so much for your suggestion. It makes sense to me. I will update this PR to introduce a new privilege. |
@jakelandis I've introduced the |
LGTM - however, can you update the doc here either as part of this PR or as a followup.
Thanks again for clarifications and apologies for the noise.
@jakelandis Thank you for the review + suggestion :).
Sure, I will do this in a follow-up. |
This change introduces a new privilege monitor_enrich. Users are required to have this privilege in order to use the enrich functionality in ESQL. Additionally, it eliminates the need to use the enrich_origin when executing enrich lookups. The enrich_origin will only be used when resolving enrich policies to prevent warnings when accessing system indices directly. Closes elastic#98482
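As a rough illustration of what this means for role configuration, the sketch below builds a role body in the general shape accepted by the Elasticsearch create-role API (`cluster` plus `indices` entries with `names` and `privileges`). Only the `monitor_enrich` privilege name comes from this PR. The helper name, the `logs-*` pattern, and the choice of `read` on the queried indices are assumptions for the example.

```python
import json


def build_esql_enrich_role(index_patterns):
    """Build a role body for a user who runs ES|QL queries that use enrich.

    Hypothetical helper. The index patterns cover the indices being
    queried, not the enrich policy or its backing index.
    """
    return {
        "cluster": ["monitor_enrich"],  # privilege introduced by this change
        "indices": [
            {
                "names": list(index_patterns),
                "privileges": ["read"],  # assumed: read access to the queried data
            }
        ],
    }


role_body = build_esql_enrich_role(["logs-*"])
print(json.dumps(role_body, indent=2))
```

The point of the narrower privilege is that the role no longer needs the broad `monitor` cluster privilege just to resolve enrich policies.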
manage_enrich is a cluster privilege, not a built-in role. manage_enrich is already documented as a cluster privilege. This commit removes manage_enrich from the role documentation. This commit also makes mention of the monitor_enrich introduced in elastic#99646. related: elastic#85877 (cherry picked from commit 1eaa907)
Today, we have a hierarchy of tasks in ESQL designed to leverage the task framework for reporting status and cancellation.

```mermaid
flowchart
RESTLayer -->| EsqlQueryRequest indices:data/read/esql | ComputeService
ComputeService -->| DriverRequest indices:data/read/esql/compute | Driver
ComputeService -->| DataNodeRequest indices:data/read/esql/data | DataNode
DataNode -->| DriverRequest indices:data/read/esql/compute | Driver
Driver -->| LookupRequest indices:data/read/esql/lookup | EnrichLookupService
```

The primary issue here is that `DriverRequest` is neither an `IndicesRequest` nor a `CompositeIndicesRequest`. Consequently, the Driver is executed within the context of the system user, so indices are accessed as the system user. To address this issue, this PR makes `DriverRequest` a `CompositeIndicesRequest` and ensures that the Driver executes within the user's context. With this fix we can now properly capture the response headers when a Driver is yielded and rescheduled.

Relates #100707
Relates #99646
Relates #99926
Closes #100164
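The effect of the request type can be sketched with a small Python toy model. It assumes, as the description above states, that a request which is neither an `IndicesRequest` nor a `CompositeIndicesRequest` ends up running as the system user. The class hierarchy, the `execution_context` function, and the `_system` marker are hypothetical stand-ins, not Elasticsearch's actual types or logic.

```python
class TransportRequest:
    """Base type for all requests in this toy model."""


class IndicesRequest(TransportRequest):
    """A request that names the indices it touches."""


class CompositeIndicesRequest(TransportRequest):
    """A request that wraps several index-level sub-requests."""


class DriverRequestBefore(TransportRequest):
    """DriverRequest as described before the fix: not tied to indices."""


class DriverRequestAfter(CompositeIndicesRequest):
    """DriverRequest as described after the fix."""


SYSTEM_USER = "_system"  # hypothetical marker for the system context


def execution_context(request, user):
    """Pick the identity a request runs under (toy rule, assumed for illustration)."""
    if isinstance(request, (IndicesRequest, CompositeIndicesRequest)):
        return user
    return SYSTEM_USER


before = execution_context(DriverRequestBefore(), "alice")
after = execution_context(DriverRequestAfter(), "alice")
print(before, after)
```

Under this model the same driver work runs as the system user before the change and as the calling user after it, which is why the downstream lookup is then subject to the user's own privileges.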