Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to override cluster_uuid to be used in monitoring data #13182

Conversation

@ycombinator
Copy link
Contributor

ycombinator commented Aug 6, 2019

Background and Problem

Starting 7.2.0, the xpack.monitoring.* settings in Beats are deprecated in favor of monitoring.* settings. The former were used to define the production cluster to which the Beat should send its monitoring data. The latter are used to define the monitoring cluster to which the Beat should directly send its monitoring data.

When the monitoring.* settings are used with an output other than elasticsearch, there's no way (currently) to know if the Beat's "regular" (i.e. non-monitoring) data is going to end up in an Elasticsearch cluster. As such, we cannot associate the Beat with an Elasticsearch cluster in the Stack Monitoring UI. Instead we show it under a "Standalone Cluster".

Prior to 7.2.0, if users were using an output other than elasticsearch, they would still send monitoring data via the production cluster (by using the xpack.monitoring.* settings). As such, the production cluster could enrich the monitoring data such that the Stack Monitoring UI would show the Beat associated with that production Elasticsearch cluster.

Starting 7.2.0 instead, the same Beat has now "moved" to a Standalone Cluster in the UI and users are not happy about it, specifically users who know that their Beats are eventually sending their regular data into Elasticsearch. Such users would like to see the Beat associated with their production Elasticsearch cluster in the UI.

Solution

This PR aims to solve the above problem by providing said class of users with a new setting in their Beat's configuration:

monitoring.override_cluster_uuid:

By default, this setting is not set, that is, it's value is empty. In this case the value of the cluster_uuid field in the Beat's monitoring documents will be determined as follows:

  • If the Beat is using the elasticsearch output, the cluster_uuid will be that of the Elasticsearch cluster referenced by the elasticsearch output.
  • Else, the cluster_uuid will be blank, causing the Beat to show up under Standalone Cluster in the Stack Monitoring UI.

If the monitoring.override_cluster_uuid setting is given a value (i.e. the Cluster UUID of an Elasticsearch cluster), this value will be used as the value of the cluster_uuid field in the Beat's monitoring documents, regardless of the output being used by the Beat.

Testing this PR

For all test cases below, the same Elasticsearch query is to be run against your Monitoring Elasticsearch cluster. This is the query:

POST .monitoring-beats-*/_search
{
  "size": 2, 
  "_source": [
    "type",
    "cluster_uuid"
    ],
  "collapse": {
    "field": "type"
  },
  "sort": [
    {
      "timestamp": {
        "order": "desc"
      }
    }
  ]
}

When running this query, note that it may take up to 30 seconds for type:beats_state documents to show up.

When monitoring.override_cluster_uuid is set

Verify that the value of the cluster_uuid field in .monitoring-beats-* documents of type:beats_stats as well as type:beats_state is the same as that specified for the monitoring.override_cluster_uuid setting.

When monitoring.override_cluster_uuid is not set

When output.elasticsearch is enabled

Verify that the value of the cluster_uuid field in .monitoring-beats-* documents of type:beats_stats as well as type:beats_state is the same as the cluster UUID of the Elasticsearch cluster referenced by the output.elasticsearch setting. To deteremine this value, call the GET / API against the output.elasticsearch Elasticsearch cluster.

When an output other than output.elasticsearch is enabled

Verify that the value of the cluster_uuid field in .monitoring-beats-* documents of type:beats_stats as well as type:beats_state is null.

For testing, you can enable output.console. Make sure to disable output.elasticsearch and to point monitoring.elasticsearch.hosts to your Monitoring Elasticsearch cluster.

@elasticmachine

This comment has been minimized.

Copy link
Collaborator

elasticmachine commented Aug 6, 2019

# the Stack Monitoring UI. However, if a different output is enabled, the monitoring data will
# show up under a Standalone Cluster in the UI. If you want the monitoring data to be associated
# with a specific Elasticsearch cluster in the UI, specify that cluster's UUID here.
#monitoring.override_cluster_uuid:

This comment has been minimized.

Copy link
@ycombinator

ycombinator Aug 6, 2019

Author Contributor

@dedemorton WDYT of this language?

This comment has been minimized.

Copy link
@dedemorton

dedemorton Aug 6, 2019

Contributor

I wouldn't start by talking saying, "If output.elasticsearch is enabled" because skim readers might think the section doesn't apply to them. Instead, I would start with a description of what the setting contains, then I'd describe the reasons for setting it. How about something like this:

Sets the UUID of the Elasticsearch cluster under which monitoring data for this
{{.BeatName | title}} instance will appear in the Stack Monitoring UI. If output.elasticsearch
is enabled, the UUID is set by default. If a different output is enabled, you must specify this
setting, or monitoring data about this {{.BeatName | title}} instance will appear under a
standalone cluster.

Maybe a bit wordy, tho. WDYT?

This comment has been minimized.

Copy link
@cachedout

cachedout Aug 8, 2019

Contributor

I can't recall a place in the public documentation right now where we discuss what a cluster UUID is and what its function is. I'm a bit on the fence over whether this case is common enough that we should document it more thoroughly but generally I learn toward the more documentation the better. :)

Either way, I think that we should also provide clear instructions to a user on retrieving a the cluster UUID for the cluster that they wish to associate a monitored component with.

This comment has been minimized.

Copy link
@ycombinator

ycombinator Aug 9, 2019

Author Contributor

@cachedout In addition to @dedemorton's suggested changes, I can add a sentence about how to determine the cluster UUID.

Given that this is an inline comment in a configuration file, I'm not sure there's space to talk about what a cluster UUID is and what it's function is — that seems like something that belongs in our online ES docs. If you agree then perhaps file a docs issue (or PR) in the ES repo for adding the information you'd like to see?

This comment has been minimized.

Copy link
@dedemorton

dedemorton Aug 9, 2019

Contributor

We can add more detail to the Beats docs where we talk about monitoring. IMO it's important to keep the description in the config file as brief as possible, or the file will become unreadable over time.

This comment has been minimized.

Copy link
@ycombinator

ycombinator Aug 13, 2019

Author Contributor

I wanted to add the override_ prefix to indicate more strongly that this will be the cluster UUID that is used when it is set, regardless of any other configuration. But I'm good with your suggestion as well, @urso. I think perhaps the name is not as critical as the inline comment and web site documentation about it.

This comment has been minimized.

Copy link
@urso

urso Aug 13, 2019

Collaborator

:) Reason I ask for cluster_uuid is, cause I'd prefer it to be configured most of the time. It's the one to use for users not using the ES output always, meaning there is nothing to override. That is: we only 'derive' it from the ES output if available and not configured (ES output route as a fallback). In the end I don't mind as much about the exact naming, but agree the doc should be clear.
Checking the code how we get the UUID from the ES output I wouldn't be surprised if we publish a few events without a cluster uuid at all.

This comment has been minimized.

Copy link
@ycombinator

ycombinator Aug 14, 2019

Author Contributor

Checking the code how we get the UUID from the ES output I wouldn't be surprised if we publish a few events without a cluster uuid at all.

Good point. I will put up a separate PR to prevent this when output.elasticsearch is being used. I already did something like this for the new beat Metricbeat module: #13020.

This comment has been minimized.

Copy link
@ycombinator

ycombinator Aug 14, 2019

Author Contributor

Based on all the above feedback, I've updated the setting name and inline comment in 78b25d4.

I tried to keep the comment short and convey that we expect users to set this setting unless they are using output.elasticsearch, in which case we automatically derive the cluster UUID.

Please let me know what you think.

This comment has been minimized.

Copy link
@ycombinator

ycombinator Aug 15, 2019

Author Contributor

I will put up a separate PR to prevent this when output.elasticsearch is being used.

PR is up: #13251

@ycombinator ycombinator requested review from elastic/stack-monitoring and removed request for elastic/uptime Aug 6, 2019
@ycombinator

This comment has been minimized.

Copy link
Contributor Author

ycombinator commented Aug 9, 2019

Hi @elastic/beats developers, anyone available to review this PR?

@abraxxa

This comment has been minimized.

Copy link

abraxxa commented Aug 13, 2019

Why not set the UUID automatically like now when the cluster hosts are the same or autodiscover it?

Copy link
Contributor

cachedout left a comment

LGTM pending the additional sentence for the docs discussed in a review comment. Should we also file a docs issue on enhancing the documentation to describe the role of a UUID, per the suggestion of @dedemorton ?

@ycombinator

This comment has been minimized.

Copy link
Contributor Author

ycombinator commented Aug 13, 2019

@abraxxa said:

Why not set the UUID automatically like now when the cluster hosts are the same or autodiscover it?

Indeed, if output.elasticsearch is being used there will be no need to set the monitoring.override_cluster_uuid setting. The Beat will discover the cluster UUID from the Elasticsearch cluster referenced by output.elasticsearch.

We expect users to use monitoring.override_cluster_uuid only if they are using an output other than output.elasticsearch but still want to associated the Beat in the Stack Monitoring UI with a specific Elasticsearch cluster.

@ycombinator

This comment has been minimized.

Copy link
Contributor Author

ycombinator commented Aug 13, 2019

@cachedout I'm waiting for someone from @elastic/beats to review this PR before taking any further action. I've noted @dedemorton's feedback and I'll take the necessary next steps once this PR has been reviewed by someone from @elastic/beats.

@abraxxa

This comment has been minimized.

Copy link

abraxxa commented Aug 13, 2019

@ycombinator: we send filebeat logs to logstash for processing and have xpack.monitoring.elasticsearch configured.
Why can‘t monitoring.elasticsearch not behave exactly like xpack. now?

@ycombinator

This comment has been minimized.

Copy link
Contributor Author

ycombinator commented Aug 13, 2019

Why can‘t monitoring.elasticsearch not behave exactly like xpack. now?

xpack.monitoring.* would send monitoring data to an Elasticsearch "production" cluster, which would then export it to an Elasticsearch "monitoring" cluster. This monitoring cluster could be the same as the production cluster (so the exporter used would be a local one, which is the default) or the monitoring cluster could be a separate, dedicated one (so the exporter used on the production cluster would be a http one).

Routing monitoring data through the production cluster has disadvantages: it's unnecessarily complicated (the extra hop), it adds an unnecessary burden on the production cluster, and it requires maintenance of code in the production cluster just for supporting this routing. Instead, if we allow Beats to ship data directly to the monitoring cluster, those disadvantages go away. So this is the direction we are heading in.

However, when monitoring data is shipped through the production cluster (by using the xpack.monitoring.elasticsearch.* settings), the production cluster would inject its own cluster_uuid into the monitoring data. That would cause the Beat to be associated with that Elasticsearch production cluster in the Stack Monitoring UI in Kibana.

In the new way, when monitoring data is shipped directly to the monitoring cluster (by using the monitoring.elasticsearch.* settings), the Beat has no way of knowing the production cluster's ID or if there is even a production cluster in the picture! To allow for this possibility, this PR proposes adding a monitoring.override_cluster_uuid setting — users can optionally choose to set it to their production Elasticsearch cluster's UUID if they want the Beat's monitoring data to be associated with this cluster in the Stack Monitoring UI in Kibana.

If a user doesn't set this new setting, the Beat will use the cluster UUID of the Elasticsearch cluster pointed to by the Beat's output.elasticsearch.* settings (assuming, of course, that the user is using this output). If the user is not using the Elasticsearch output, then no Cluster UUID will be injected into the monitoring data for the Beat and it will be associated with a "Standalone Cluster" in the Stack Monitoring UI in Kibana.

we send filebeat logs to logstash for processing and have xpack.monitoring.elasticsearch configured.

So in your case you will most likely want to set monitoring.override_cluster_uuid to the cluster UUID of the Elasticsearch cluster currently referenced by xpack.monitoring.elasticsearch. This will ensure that you see the same association in the Stack Monitoring UI in Kibana.

@abraxxa

This comment has been minimized.

Copy link

abraxxa commented Aug 14, 2019

@ycombinator thanks for the detailed explanation, very much appreciated!

I wasn‘t aware that there is any exporter functionality in Elasticsearch towards another cluster.

Maybe there is a way for the Elasticsearch module of Filebeat to detect the UUID, by API call or by including it in the JSON logfiles.

@ycombinator

This comment has been minimized.

Copy link
Contributor Author

ycombinator commented Aug 15, 2019

@urso I've addressed all your feedback on this PR. Please take a look when you get a chance. Thanks!

@urso urso self-assigned this Aug 20, 2019
@urso
urso approved these changes Aug 20, 2019
@ycombinator ycombinator force-pushed the ycombinator:libbeat-monitoring-override-cluster-uuid branch from 740059e to 80a610c Aug 20, 2019
@ycombinator ycombinator requested a review from elastic/siem as a code owner Aug 20, 2019
@@ -1222,6 +1222,17 @@ logging.files:
# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# If output.elasticsearch is enabled, monitoring data for this Auditbeat instance

This comment has been minimized.

Copy link
@dedemorton

dedemorton Aug 20, 2019

Contributor

Did you mean to include both statements? I think there is too much redundancy, and I'm still not happy with the wording (even the bits that I've contributed). I know you're eager to get this merged. I'd suggest removing the redundant wording (maybe even go with your original statement), and I can improve the wording when I add more info to the docs.

This comment has been minimized.

Copy link
@ycombinator

ycombinator Aug 21, 2019

Author Contributor

I did not, this might be the result of a bad rebase. Fixing...

This comment has been minimized.

Copy link
@ycombinator

ycombinator Aug 21, 2019

Author Contributor

@dedemorton I've removed the duplicate comment in 1e2585f. Thanks for catching this!

Copy link
Contributor

dedemorton left a comment

Doc changes LGTM

@ycombinator ycombinator force-pushed the ycombinator:libbeat-monitoring-override-cluster-uuid branch from 9ab945a to 6376f42 Sep 3, 2019
@ycombinator ycombinator merged commit 0bafb43 into elastic:master Sep 3, 2019
4 checks passed
4 checks passed
CLA All commits in pull request signed
Details
Hound No violations found. Woof!
beats-ci Build finished.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@ycombinator ycombinator deleted the ycombinator:libbeat-monitoring-override-cluster-uuid branch Sep 3, 2019
@ycombinator ycombinator added v7.3.2 and removed v7.3.1 labels Sep 3, 2019
ycombinator added a commit to ycombinator/beats that referenced this pull request Sep 3, 2019
…lastic#13182)

* Add ability to override cluster_uuid to be used in monitoring data

* Removing changes committed by accident

* Updating setting name and inline comments in reference config file

* Updating setting name in code

* Use struct tags pattern instead of repeating magic string

* Fixing logic after variable rename

* Remove unnecessary check and override

* Fixing duplicate comment

* Fixing comment

* Updating comment
@ycombinator ycombinator added the libbeat label Sep 3, 2019
This was referenced Sep 4, 2019
ycombinator added a commit that referenced this pull request Sep 4, 2019
…13182) (#13480)

* Add ability to override cluster_uuid to be used in monitoring data

* Removing changes committed by accident

* Updating setting name and inline comments in reference config file

* Updating setting name in code

* Use struct tags pattern instead of repeating magic string

* Fixing logic after variable rename

* Remove unnecessary check and override

* Fixing duplicate comment

* Fixing comment

* Updating comment
@ruflin

This comment has been minimized.

Copy link
Collaborator

ruflin commented Sep 16, 2019

@ycombinator I can see this in the 7.3 but not 7.4 branch. Is this expected?

ycombinator added a commit to ycombinator/beats that referenced this pull request Sep 16, 2019
…lastic#13182)

* Add ability to override cluster_uuid to be used in monitoring data

* Removing changes committed by accident

* Updating setting name and inline comments in reference config file

* Updating setting name in code

* Use struct tags pattern instead of repeating magic string

* Fixing logic after variable rename

* Remove unnecessary check and override

* Fixing duplicate comment

* Fixing comment

* Updating comment
@ycombinator

This comment has been minimized.

Copy link
Contributor Author

ycombinator commented Sep 16, 2019

@ruflin Nope, not expected. It should be in 7.4 as well. I was expecting that to happen when changes from master get mass-backported to 7.x, but I guess 7.x was already pointing to 7.5 by the time this PR was merged? Anyway, backport PR to 7.4 is here now: #13694

ycombinator added a commit that referenced this pull request Sep 17, 2019
…13182) (#13694)

* Add ability to override cluster_uuid to be used in monitoring data

* Removing changes committed by accident

* Updating setting name and inline comments in reference config file

* Updating setting name in code

* Use struct tags pattern instead of repeating magic string

* Fixing logic after variable rename

* Remove unnecessary check and override

* Fixing duplicate comment

* Fixing comment

* Updating comment
@urso urso added the v7.5.0 label Oct 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.