Metricbeat elasticsearch module when output is Kafka #11519

Closed
ppf2 opened this issue Mar 28, 2019 · 20 comments
Labels
enhancement, Feature:Stack Monitoring, Metricbeat, needs_team, Stalled

Comments

@ppf2
Member

ppf2 commented Mar 28, 2019

I am linking this to one of the items on the main issue on monitoring ES via metricbeat (#7035).

One aspect we haven't talked about much (or documented) is what happens when the user's metricbeat is configured to route all events through Kafka (using output.kafka).

Per our guidelines today (https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-metricbeat.html), the configuration of the Elasticsearch module requires the output to be output.elasticsearch.
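In other words, the documented pattern boils down to roughly the following (a minimal sketch, not copied from the docs verbatim; the monitoring cluster host and period are placeholders):

# elasticsearch module configuration (modules.d file or metricbeat.modules section)
- module: elasticsearch
  period: 10s
  hosts: ["http://localhost:9200"]   # the ES node(s) being monitored
  xpack.enabled: true                # collect data in the stack monitoring format

# metricbeat.yml: ship directly to the (remote) monitoring cluster
output.elasticsearch:
  hosts: ["https://monitoring-cluster:9200"]   # placeholder host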

What is the recommended setup here for output.kafka users?

  1. Will they send everything through output.kafka, and have separate Logstash Elasticsearch outputs downstream: one for regular events going to the production cluster, and one for routing Metricbeat ES stack module events to the .monitoring-es* indices on the remote monitoring cluster? (A sketch of the Metricbeat side of this option follows this list.)

  2. Or is there a way to reuse Logstash's xpack.monitoring.elasticsearch.hosts for the connection to route the metricbeat ES stack module events to the remote monitoring cluster?

  3. Or will they have to set up a second Metricbeat instance (with output.elasticsearch just for the ES stack modules) to route events directly to the remote monitoring cluster, while the original Metricbeat instance continues to send other events through Kafka?
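To make option 1 concrete on the Metricbeat side, the configuration would look roughly like this (a sketch only; the broker addresses and the "beats" topic name are made-up placeholders):

# elasticsearch module config stays the same as in the documented setup
- module: elasticsearch
  period: 10s
  hosts: ["http://localhost:9200"]
  xpack.enabled: true

# metricbeat.yml: route everything through Kafka instead of Elasticsearch
output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092"]   # placeholder brokers
  topic: "beats"                          # placeholder topic

The downstream Logstash pipeline would then be responsible for splitting the monitoring events from the regular events and sending each to the right cluster.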

Until we figure out our story on this, it would be helpful to update https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-metricbeat.html with some information on our current guidelines for when metricbeat does not have an output.elasticsearch (or maybe simply a statement that this is not currently supported, etc.). Thx!

@elasticmachine
Collaborator

Pinging @elastic/stack-monitoring

@cachedout
Contributor

There has been some discussion around this question and from what I have seen thus far, we lean toward option three in your list. @ycombinator, do you concur?

@cachedout cachedout self-assigned this Mar 29, 2019
@ycombinator
Contributor

Option 3 has been tested and is known to work, so I'd start by documenting that right now.

However, in theory, option 1 could also work, so I think it's worth testing it out and coming up with docs around that too.

@ppf2
Member Author

ppf2 commented Mar 29, 2019

Tested option 1 briefly, metricbeat (ES module) -> Kafka -> LS -> ES seems to work as well :)
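On the Logstash side, the pipeline consumes from Kafka with something along these lines (a sketch; the broker and topic are placeholders, and codec => json is needed because Beats writes JSON-encoded events to Kafka):

input {
  kafka {
    bootstrap_servers => "kafka1:9092"   # placeholder broker
    topics => ["beats"]                  # placeholder topic
    codec => json                        # decode the JSON events produced by Beats
  }
}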

For example, a conditional can be added to the output section of the LS config to route metricbeat ES module metrics to a separate monitoring cluster (for 6.5+ to 6.latest):

# route ES monitoring metrics collected by metricbeat elasticsearch module
# to ES monitoring cluster
# https example
if [metricset][module] == "elasticsearch" {
  elasticsearch {
    index => ".monitoring-es-6-mb-%{+YYYY.MM.dd}"
    hosts => ["https://node1:9200"]
    cacert => "/path_to/ca.crt"
    user => "elastic"
    password => "password"
  }
} else {
  ... <where your non-monitoring events will go>
}

Instead of having a conditional statement with two ES outputs, the alternative would be to build out hosts, index, etc. as variables upstream in the pipeline and substitute them into a single elasticsearch output.
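As a rough sketch of the index part of that idea (the [@metadata][target_index] field name is made up for illustration; the hosts can't easily be parameterized the same way, so in practice that part would still need a conditional or separate pipelines):

filter {
  # decide the target index upstream and stash it in @metadata
  if [metricset][module] == "elasticsearch" {
    mutate { add_field => { "[@metadata][target_index]" => ".monitoring-es-6-mb" } }
  } else {
    mutate { add_field => { "[@metadata][target_index]" => "metricbeat-%{[beat][version]}" } }
  }
}

output {
  # single elasticsearch output; the index name comes from the metadata set above
  elasticsearch {
    index => "%{[@metadata][target_index]}-%{+YYYY.MM.dd}"
    hosts => ["https://node1:9200"]
  }
}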

@ycombinator
Contributor

ycombinator commented Mar 29, 2019

Good stuff, @ppf2, thanks so much for testing this out!

I wonder if it's safe for the conditional to just test for [metricset][module] == "elasticsearch". After all, this could be true even if xpack.enabled is false in the corresponding metricbeat module config.

Imagine a case where the user has configured the elasticsearch module in the same Metricbeat instance twice for some reason, once with xpack.enabled: true and once without. Or that there are two Metricbeat instances feeding data to the same LS instance, one configured with xpack.enabled: true in the elasticsearch module and one without.

I wonder if there's another piece of data/metadata in the event that Logstash receives from Metricbeat that we could use to make this check more robust. If there isn't, I wonder if we should inject something to this effect from the Elasticsearch Metricbeat module when xpack.enabled is set to true.

@ppf2
Member Author

ppf2 commented Mar 29, 2019

How about this if clause instead?

# route ES monitoring metrics collected by metricbeat elasticsearch module
# to ES monitoring cluster
# https example
if [@metadata][index] =~ /^.monitoring-es*/ {
  elasticsearch {
    index => ".monitoring-es-6-mb-%{+YYYY.MM.dd}"
    hosts => ["https://node1:9200"]
    cacert => "/path_to/ca.crt"
    user => "elastic"
    password => "password"
  }
} else {
  ... <where your non-monitoring events will go>
}

@ycombinator
Contributor

ycombinator commented Mar 29, 2019

Perhaps we could generalize this a bit to work not just for ES stack monitoring data collected by Metricbeat but also other stack products' monitoring data? So something like:

# route monitoring metrics collected by metricbeat Elastic stack product module
# to ES monitoring cluster
# https example
if [@metadata][index] =~ /^.monitoring-*/ {
  if [@metadata][id] {
    elasticsearch {
      index => "%{[@metadata][index]}-%{+YYYY.MM.dd}"
      document_id => "%{[@metadata][id]}"
      hosts => ["https://node1:9200"]
      cacert => "/path_to/ca.crt"
      user => "elastic"
      password => "password"
    }
  } else {
    elasticsearch {
      index => "%{[@metadata][index]}-%{+YYYY.MM.dd}"
      hosts => ["https://node1:9200"]
      cacert => "/path_to/ca.crt"
      user => "elastic"
      password => "password"
    }
  }
} else {
  ... <where your non-monitoring events will go>
}

@cachedout
Contributor

If we indeed have a solution that works and that we agree on, then in order to satisfy the original request we need to document the recommended setup.

@lcawl Any suggestions on a home for this sort of information in the docs? I'm happy to discuss over slack/zoom as well to give more context.

@cachedout
Contributor

Polite bump, @lcawl . Thanks!

@kkh-security-distractions

I implemented option 1 in a 3-node test cluster today. I set Metricbeat to output to Logstash, which then outputs to Elasticsearch. At first glance it seemed to work fine, but then I noticed that the shard count on the nodes page is way off. It keeps incrementing, going from the correct number to several thousand over time: I may start with 50, then every 10 seconds it climbs to 100, 150, 200 and so on. This was tested using 7.5.1 on RHEL.

My motivation for this is to come up with a fix so that I don't have to disable the system module in Metricbeat, as it provides very valuable insights into other performance characteristics of a given node.

@cachedout cachedout removed their assignment Jan 8, 2020
@cachedout
Contributor

@lcawl and @ycombinator I've removed myself as owner of this issue after switching teams. Would one of you like to pick it up?

@ycombinator
Contributor

@cachedout Sure.

@lcawl Can you take up the bit about documenting Option 1? I can answer @kkh-security-distractions's question.

@ycombinator
Contributor

ycombinator commented Jan 8, 2020

@kkh-security-distractions I assume you were using the Logstash fragment I had posted in my comment above:

# route monitoring metrics collected by metricbeat Elastic stack product module
# to ES monitoring cluster
# https example
if [@metadata][index] =~ /^.monitoring-*/ {
  elasticsearch {
    index => "%{[@metadata][index]}-%{+YYYY.MM.dd}"
    hosts => ["https://node1:9200"]
    cacert => "/path_to/ca.crt"
    user => "elastic"
    password => "password"
  }
} else {
  ... <where your non-monitoring events will go>
}

Unfortunately, this fragment is not quite right. It works for most stack monitoring data except data about shards, as you obviously found out the hard way 😞. Sorry about that.

I've now updated the comment with a better fragment; please try that out. Note that you will need to clear out your existing monitoring data (DELETE .monitoring-es-*-mb*) first.

@kkh-security-distractions

@ycombinator, yesterday when I was comparing the working versus the non-working setup, I did notice the rather odd "id" in the working setup, but I was not able to determine whether it was important or not ;)

I changed my pipeline according to your suggestions and it seems to have fixed the problem. The shard count now stays steady, as it should. I also added some stuff in the filter section to get rid of the ECS fields from Metricbeat, as they are not needed as far as I can see.
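For reference, that kind of cleanup can be done with a mutate filter along these lines (the field names here are just examples, not an exact list):

filter {
  mutate {
    # example only: drop event fields that are not needed downstream
    remove_field => [ "ecs", "[agent][ephemeral_id]" ]
  }
}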

I will leave it running and see tomorrow how it ends up. I suggest that someone improves the documentation on the main Metricbeat page to incorporate this setup. I think many people whose ingest goes through e.g. Kafka will appreciate this setup while still being able to keep the system module, which was my goal :)

Thanks for the input.

@lcawl
Contributor

lcawl commented Jan 9, 2020

@lcawl Can you take up the bit about documenting Option 1?

Sorry, I somehow missed the earlier notifications on this one.

From what I understand in this issue, this is a less common configuration option. Since
the basic setup steps (https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-metricbeat.html) are already quite complex, I don't think it would be ideal to try to squeeze it in there. Instead, I think this would be appropriate for a separate piece of content describing a more advanced configuration scenario.

I had a chat with @dedemorton and I think our suggestion would be to put this in a blog, at least initially. If it becomes a common enough use case that we want to actively maintain it in the docs, we can revisit incorporating it.

@ycombinator
Contributor

ycombinator commented Jan 9, 2020

Sounds good and makes sense, @lcawl and @dedemorton. I'll start working on a blog post soon.

@ppf2
Member Author

ppf2 commented Jan 10, 2020

Blog post sounds good. Let's cross-link from the https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-metricbeat.html page to the public blog post location (once it is published). Thx!

@ycombinator
Contributor

@ppf2 Blog post is live: https://www.elastic.co/blog/elastic-stack-monitoring-with-metricbeat-via-logstash-or-kafka.

@lcawl WDYT about the linking idea that @ppf2 mentioned in the previous comment?

@botelastic

botelastic bot commented Jan 5, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added the Stalled and needs_team labels Jan 5, 2021
@botelastic

botelastic bot commented Jan 5, 2021

This issue doesn't have a Team:<team> label.

@botelastic botelastic bot closed this as completed Feb 4, 2021