Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding stack_monitoring_agent role #34369

Merged
merged 11 commits into from
Oct 24, 2018

Conversation

ycombinator
Copy link
Contributor

@ycombinator ycombinator commented Oct 9, 2018

Motivation

Starting with 6.4.0 users have been able to monitor (in the x-pack sense) Kibana using Metricbeat. And starting with 6.5.0 users will be able to monitor Elasticsearch using Metricbeat as well.

If users are using more than a Basic license with security enabled in Elasticsearch (xpack.security.enabled: true), then Metricbeat will need to provide a username and password to connect to Elasticsearch. The username must belong to a user with a security role that has sufficient privileges for Metricbeat to collect monitoring data for the stack and index it into monitoring indices, .monitoring-*. For discussion purposes, let's call this role stack_monitoring_agent.

Metricbeat needs to call APIs of each Elastic stack product that it's monitoring to collect metrics for these products. So the stack_monitoring_agent role must have sufficient privileges to call these APIs.

Additionally, Metricbeat needs to connect to the Elasticsearch monitoring cluster in order to index monitoring data collected from various Elastic stack products into .monitoring-* indices. So the stack_monitoring_agent role must have sufficient privileges to perform this indexing operation, including creating new time-based indices when necessary.

Access Granularity

The above section suggests that we have a single stack_monitoring_agent role to cover all privileges necessary for collection as well as indexing of monitoring data. This makes the stack_monitoring_agent role a very coarsely-grained role.

The benefit of a coarse level of granularity is that it is easier for users to get started with Metricbeat stack monitoring and security: at minimum, they need only create a single user having this single role and they are all set to use that user in their Metricbeat module configurations for Elastic stack modules (for collection purposes) as well as their Metricbeat elasticsearch output configuration (for indexing purposes). The other benefit of a single, coarse role is that it also provides abstraction: we may increase or reduce privileges behind this single role over time without requiring any changes from the user. The downside of this coarse level of granularity, of course, is lessened security due to broad access regardless of which specific product a single Metricbeat instance might be monitoring.

At the other end of the granularity spectrum is having a plethora of fine-grained roles for collecting monitoring data from specific Elastic stack products as well as for indexing monitoring data into specific monitoring indices (.monitoring-es-*, .monitoring-kibana-*, etc.). The benefit of this fine level of granularity is increased security because of the ability to provide fine-tuned access. The downside is the potential for a user to screw up their configuration in defining security users and associating the right roles with these users and further with setting up Metricbeat configuration to use the right security users in the right configuration files.

And, of course, there are variations of access granularity in between the two ends of the spectrum mentioned above.

After discussion with @ruflin, we decided to go with the coarse granularity approach in order to favor ease-of-use of of the box. Concretely, that means we will provide a built-in stack_monitoring_agent security role.

Of course, users who want to have finer-grained access may choose to create their own custom roles with the desired access granularity. In order to do this, users may need to know about some lower-level details about stack monitoring, including details that might change over time (such as index names). We may consider documenting these details for such users, either publicly or as a support knowledge base article, probably based on demand.


This PR implements this stack_monitoring_agent built-in role.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@ycombinator
Copy link
Contributor Author

ycombinator commented Oct 10, 2018

We'll handle documentation for this new built-in stack_monitoring_agent role in a separate PR(s), as part of the broader effort for documenting Stack monitoring with Metricbeat that @lcawl is working on.

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like CI failure is related? Overall LGTM.

"stack_monitoring_agent",
new String[] {
"monitor",
"manage_index_templates"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ycombinator can you please explain why manage_index_templates is required ?
Isn't .indices(".monitoring-*", "metricbeat-*").privileges("index", "create_index") enough?

Copy link
Contributor Author

@ycombinator ycombinator Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please explain why manage_index_templates is required ?

When Metricbeat starts up today, it checks if the Metricbeat index template exists and, if not, tries to create it.

Isn't .indices(".monitoring-", "metricbeat-").privileges("index", "create_index") enough?

Creating templates is a cluster-level privilege (AFAIK) so having these index-level privileges is not enough.

Copy link
Contributor Author

@ycombinator ycombinator Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I could do here to tighten things up a bit is to specifically grant the indices:admin/template/get and indices:admin/template/put cluster-level privileges instead of the broader manage_index_templates cluster-level privilege.

[EDIT] I've done this in a33449a. Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I was thinking about, was, do you need both to put template and create index. Cannot Metricbeat use one or the other. Preferably create index because we have tighter authz for it (the privilege has the index name pattern), although in this case, when for example there is a centralized monitoring cluster, a template would be more suitable.

[EDIT] I've done this in a33449a. Let me know what you think.

So, you've removed the delete action but put is an overwrite (our fault not yours 😃). It's fine either way.

},
new RoleDescriptor.IndicesPrivileges[] {
RoleDescriptor.IndicesPrivileges.builder().indices("*").privileges("monitor").build(),
RoleDescriptor.IndicesPrivileges.builder().indices(".kibana*").privileges("read").build(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could you please explain why .indices(".kibana*").privileges("read") is required?

Copy link
Contributor Author

@ycombinator ycombinator Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please explain why .indices(".kibana*").privileges("read") is required?

The kibana module in metricbeat calls the Kibana Settings API to retrieve a specific setting, xPack:defaultAdminEmail and index it into .monitoring-kibana-*. This value is then used for Monitoring's Cluster Alerts feature today. This feature uses Watcher today; every Cluster Alerts watch queries .monitoring-kibana-* for the value of this setting in order to decide which email address to send Cluster Alerts notifications to.

In the future, when Alerting is a Kibana feature, we will no longer need to index this setting's value in .monitoring-kibana-*. At that time we can remove this particular API call from metricbeat and this privilege as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sorry, but I understand that Metricbeat calls a Kibana API and does not require to query the .kibana index (that API might do it on its account, but Metricbeat does not handle it itself). Please correct me here if I'm wrong.
The .kibana index is a treasure trove, we should be wary when reading it outside of through Kibana itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's correct that Beats does not query the index directly but through the API. As far as I understood it so far still the user needs to have these access rights so the API call is executed correctly. @ycombinator @chrisronline can perhaps confirm or correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user which metricbeat is using does need .kibana privileges as the user's role is used to determine permissions within the Kibana api. This particular api needs to query against the .kibana index so the user who makes the api request needs the appropriate permissions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .kibana index is a treasure trove, we should be wary when reading it outside of through Kibana itself.

Adding to what @ruflin and @chrisronline said, the GET /api/settings Kibana API that Metricbeat calls reads the .kibana index, but only exposes a very narrow view of what's in that index. Here's the result of a call I just made locally while using the elastic superuser:

{
   "cluster_uuid":"zLXBfdzPSrO12OBIz3ybyg",
   "settings":{
      "xpack":{
         "default_admin_email":"foo@bar.com"
      },
      "kibana":{
         "uuid":"5b2de169-2785-441b-ae8c-186a1936b17d",
         "name":"Shaunaks-MacBook-Pro-2.local",
         "index":".kibana",
         "host":"localhost",
         "transport_address":"localhost:5601",
         "version":"7.0.0-alpha1",
         "snapshot":false,
         "status":"green"
      }
   }
}

It may not be super obvious from the code in Kibana for this API, but it only returns the xPack:defaultAdminEmail setting and other metadata about the kibana instance:

https://github.com/elastic/kibana/blob/master/x-pack/plugins/monitoring/server/kibana_monitoring/collectors/get_settings_collector.js

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay understood! Thank you @ycombinator and @chrisronline .

@ycombinator
Copy link
Contributor Author

jenkins, test this

@tvernum
Copy link
Contributor

tvernum commented Oct 15, 2018

My feeling is that the choice to use a single role is the wrong trade-off. but it may not matter too much, as long as the following 2 things are true:

  1. remote_monitoring_agent provides all the access needed to perform the indexing of monitoring data
  2. metricbeat will support using different users for collection & indexing.

The main use case I'm worried about here is customers with a central monitoring cluster. In that case, the admin for the monitoring cluster will want to issue credentials that allow indexing the data, but will not want to grant read/monitor access to that cluster.
I would like us to have an easy to setup answer to that problem. I think remote_monitoring_agent is the answer, provided that this new stack-monitoring doesn't require anything outside of that role.

I think the counter-side, of the monitored cluster not wanting to grant monitoring-indexing privileges is also real, but far less likely to be a concern.

they need only create a single user having this single role

Why are we proposing that the customer needs to create the user? Shouldn't we ship a reserved user with the necessary role(s)?

@ycombinator
Copy link
Contributor Author

@tvernum Thanks for your comment. I think I'm following but just a quick clarification: when you said remote_monitoring_agent, did you mean the stack_monitoring_agent role introduced in this PR? Asking because there is already an existing remote_monitoring_agent role that is used currently and has slightly different privileges.

If I'm following you correctly, at minimum what we'd want to provide (in terms of roles) would be:

  • a role to index data into the monitoring cluster.
  • a role to collect data from a production (aka monitored) cluster.

In essence, that'd mean splitting up the stack_monitoring_agent role introduced in this PR into the above two roles.

Further, we'd want to provide built-in users associated with each of these roles, for ease of setup.

Am I understanding this correctly?

@albertzaharovits
Copy link
Contributor

I am with @tvernum on splitting the role in two. And also, I think he suggests that the remote_monitoring_agent role we have now, can be used as the index half of the proposed stack_monitoring_agent.

We don't need to create two users out of the box. The Metricbeat user (that we ship with? - no creation during setup?) can have these two roles. But when the customer asks how to deal with the scenarios Tim's describing we have an easy answer, that being to create a user with only one of these roles.

@tvernum
Copy link
Contributor

tvernum commented Oct 16, 2018

@albertzaharovits has understood me correctly.

For background, I think the original proposal falls into a mistake we have made a few times before where we make it easy to get started with something that meets the security needs of the most common case, but when a customer has tighter security needs, we don't have a lot to offer them out-of-the-box and we put the work on to them.

I would ike us to being looking for ways that we can give customers a variety of options along that ease-of-use/tightness-of-security balance, that is, as you need to tighten security you don't get a disproportionate drop in ease-of-use.

For this, I think 1 builtin user with 2 roles is a good way to offer that.

  • The easiest option is to use the builtin user
  • Customers who don't want to use the builtin user (e.g. they want to avoid "known generic users", or want to use PKI auth, or an LDAP user) can create a new user with those 2 roles
  • Customers who want to split collection and indexing can create new users with whichever role is appropriate.

I'm open to other approaches, but I don't think we should take a position of "if you want to separate collection and indexing then you need to define your own roles and then test & update them with every release". We want to support customers who need tighter security than our default setup.

@tvernum
Copy link
Contributor

tvernum commented Oct 16, 2018

My earlier comment was intentionally refering to remote_monitoring_agent because at the time I felt like it would be a reasonable approximation of the indexing role that we need, and maybe that would be good enough (because I think the "indexing only" requirement is likely to be the higher priority).

So at the very least we could then document "if you are collecting your stack monitoring data into a remote cluster, you can do it with any user that has the remote_monitoring_agent role".

I'd still be OK with that option, but having a builtin user and 2 completely distinct roles would be better.

@tvernum
Copy link
Contributor

tvernum commented Oct 19, 2018

@ycombinator I hope you don't mind, I pushed to your branch to fix the broken test from the previous CI run.

@ycombinator
Copy link
Contributor Author

Not at all, @tvernum. Appreciate the help!

@ycombinator
Copy link
Contributor Author

@tvernum CI has gone green after your latest commit. Mind giving this PR another review? Thanks.

@ycombinator
Copy link
Contributor Author

ycombinator commented Oct 23, 2018

Hey @tvernum, GitHub seems to be back to normal now. Mind reviewing this PR again? Thanks.

Copy link
Contributor

@tvernum tvernum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I'd like @jaymode to comment on the changes to the agent role.

We'll also need to do docs updates, but that can be a separate PR.

RoleDescriptor.IndicesPrivileges.builder().indices(".monitoring-*").privileges("all").build() },
RoleDescriptor.IndicesPrivileges.builder().indices(".monitoring-*").privileges("all").build(),
RoleDescriptor.IndicesPrivileges.builder()
.indices("metricbeat-*").privileges("index", "create_index").build() },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaymode Any thoughts here?
This is grant write access to the metricbeat-* indices for an existing role.
On the one hand I worry about expanding the access for roles that are in use by customers, on the other hand creating an ever increasing set of new roles is a usability nightmare for customers.

Are you happy to run with this as is, and handle it in the release notes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to go with this; reserved roles are controlled by us and if we feel this makes sense, which I think it does, then expanding access is ok.

@ycombinator
Copy link
Contributor Author

Thanks @tvernum. I'll wait for @jaymode's comments. Re: docs updates we are handling that separately over here: elastic/stack-docs#127.

@tvernum
Copy link
Contributor

tvernum commented Oct 23, 2018

@ycombinator Unfortunately, we currently have references to the builtin users & roles across a bunch of docs in multiple repos, but I imagine @lcawl has it under control.

stack-docs/docs/en/stack/security/authentication/built-in-users.asciidoc
stack-docs/docs/en/stack/security/authorization/built-in-roles.asciidoc
stack-docs/docs/en/stack/security/get-started-builtin-users.asciidoc
stack-docs/docs/en/stack/security/get-started-security.asciidoc
stack-docs/docs/upgrading-stack.asciidoc
elasticsearch/x-pack/docs/en/security/configuring-es.asciidoc
elasticsearch/docs/reference/commands/setup-passwords.asciidoc
elasticsearch/docs/reference/setup/install/windows.asciidoc

@tvernum
Copy link
Contributor

tvernum commented Oct 23, 2018

@elastic/microsoft Does the MSI installer need to get updated when we add a new user to setup-passwords? If so, this PR will introduce a new remote_monitoring_user.

Copy link
Member

@pickypg pickypg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. At a certain point do we expect to deprecate and remove the old users, thereby forcing the user to choose to create new, specific users or live with the coarse version?

@ycombinator
Copy link
Contributor Author

At a certain point do we expect to deprecate and remove the old users, thereby forcing the user to choose to create new, specific users or live with the coarse version?

@pickypg Just so I'm understanding clearly, specifically which old users are you referring to?

@jaymode
Copy link
Member

jaymode commented Oct 24, 2018

@joegallo fyi, for 6.5 there will be some more role updates

@ycombinator
Copy link
Contributor Author

Thanks @jaymode. @pickypg and others, we can continue further conversation on some of the other questions posted here, but I think this PR is good to merge now. Merging...

@ycombinator ycombinator merged commit 0f1a5ec into elastic:master Oct 24, 2018
@ycombinator ycombinator deleted the security-role-stack-monitoring branch October 24, 2018 14:20
ycombinator added a commit to ycombinator/elasticsearch that referenced this pull request Oct 24, 2018
* Adding stack_monitoring_agent role

* Fixing checkstyle issues

* Adding tests for new role

* Tighten up privileges around index templates

* s/stack_monitoring_user/remote_monitoring_collector/ + remote_monitoring_user

* Fixing checkstyle violation

* Fix test

* Removing unused field

* Adding missed code

* Fixing data type

* Update Integration Test for new builtin user
@ycombinator
Copy link
Contributor Author

Backport to 6.x PR: #34805

ycombinator added a commit that referenced this pull request Oct 24, 2018
* Adding stack_monitoring_agent role

* Fixing checkstyle issues

* Adding tests for new role

* Tighten up privileges around index templates

* s/stack_monitoring_user/remote_monitoring_collector/ + remote_monitoring_user

* Fixing checkstyle violation

* Fix test

* Removing unused field

* Adding missed code

* Fixing data type

* Update Integration Test for new builtin user
@colings86 colings86 added >enhancement and removed :Security/Authorization Roles, Privileges, DLS/FLS, RBAC/ABAC labels Oct 25, 2018
@russcam
Copy link
Contributor

russcam commented Oct 26, 2018

Thanks for the ping @tvernum.

The MSI installer does not use setup-passwords.bat file, so a new user added to the batch file will not affect the installer. Is remote_monitoring_user required to be configured? If so, we will need to update the installer

@tvernum
Copy link
Contributor

tvernum commented Oct 26, 2018

Thanks @russcam - I was aware that you don't actually call the passwords tool, it was more a case of whether you/we want to keep the MSI in sync with users handled by setup-passwords.
I don't have a strong opinion on that, other than that the stack monitoring docs will be will probably all assume that the user has a password and the admin knows it. However, I suspect that is often not true, and the docs should probably start with a message of "If you don't know the password, or never set a password, then ..."

Given that the person doing the initial install of the stack will often not know which users they will need, or even what their purposes is, I wonder whether we ought to move to a model where we configure the minimum set of passwords (elastic & kibana probably) and then document the process for manually setting passwords for other users when they're needed.

@tvernum
Copy link
Contributor

tvernum commented Oct 26, 2018

@colings86 Is there a reason why we don't want the :Security/Authorization label?

@colings86
Copy link
Contributor

@tvernum The current script that generates the release notes does not cope well with PRs that have multiple area labels because it doesn't know where in the release notes to put them. Whilst this is not ideal and needs to be fixed in the new script we are planning for release up until we have this new script the release manager for ES (me for 6.5.0) will need to make sure all PRs only have 1 area label. On this PR monitoring seemed like the more appropriate of the two but I'm happy to switch it the other way if you think thats better?

@russcam
Copy link
Contributor

russcam commented Oct 29, 2018

@tvernum the installer only configures elastic, kibana and logstash_system as of now, we do not configure beats_system either.

I wonder whether we ought to move to a model where we configure the minimum set of passwords (elastic & kibana probably) and then document the process for manually setting passwords for other users when they're needed.

Agreed - keep the "from download to up and running" configuration as simple as possible.

kcm pushed a commit that referenced this pull request Oct 30, 2018
* Adding stack_monitoring_agent role

* Fixing checkstyle issues

* Adding tests for new role

* Tighten up privileges around index templates

* s/stack_monitoring_user/remote_monitoring_collector/ + remote_monitoring_user

* Fixing checkstyle violation

* Fix test

* Removing unused field

* Adding missed code

* Fixing data type

* Update Integration Test for new builtin user
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet