Make elasticsearch/index_summary metricset work for Stack Monitoring without xpack.enabled flag #20615

sayden
Contributor

@sayden sayden commented Aug 14, 2020

Code should be working 😅

Missing steps:

  • Add description to new fields (marked as TODO)
  • Fix some comments

@sayden sayden added the Team:Services (Deprecated) label Aug 14, 2020
@sayden sayden self-assigned this Aug 14, 2020
@botelastic botelastic bot added and removed the needs_team label Aug 14, 2020
},
},
// following field is not included in the Stack Monitoring UI mapping
"is_throttled": c.Bool("is_throttled"),
Contributor

If the field is not needed by the Stack Monitoring UI (@chrisronline may want to confirm) and it wasn't a field we were already collecting in data.go, then it's safe to not collect it at all. So I would remove this line and others like it.

Contributor

This is correct. We need to not only collect the mappings, but also continue to collect and index any other fields we used to because we often read values from the source document that weren't considered when the finalized mappings were produced.

Contributor

@ycombinator ycombinator Aug 14, 2020

@chrisronline So just to clarify, you're saying we should collect this field, is_throttled, but we should not include it in the fields.yml so it doesn't show up in the mappings, right?

Contributor

Correct

Contributor Author

Understood 🙂

Contributor

We don't really want to index fields that aren't used, and we could take another pass on which fields we read from source directly. But we're also currently in talks about potentially indexing as much data as possible (even if the Stack Monitoring UI doesn't use it) to enable users to build custom dashboards. Nothing is decided, but I'd rather leave the functionality as is in case we do go down that path.

Contributor Author

Wait a sec 😄 As far as I understood, we should map this field in the elasticsearch fields.yml mapping. But should we still map it in the metricset's fields.yml file? By Metricbeat convention it must be mapped; AFAIK 100% of Metricbeat fields are mapped in the fields.yml of their respective metricsets. Metricbeat is not going to fail, though. We do have a commonly used Python test (like this one in the MySQL module: https://github.com/elastic/beats/blob/master/metricbeat/module/mysql/test_mysql.py#L47) that checks whether all fields are mapped in fields.yml, but fortunately it's not included in the Python test code for the elasticsearch module.

Contributor

.monitoring-* indices have dynamic mapping set to false, but perhaps metricbeat-* handles this differently? In that case, maybe we should add every field we are indexing to fields.yml but if the field isn't contained in the used mappings list, we set the mapping to enabled: false?

Contributor

@ycombinator ycombinator Aug 14, 2020

@sayden I can try to help clarify about when and where to map fields for stack monitoring metricsets. 😅

Consider a field, say elasticsearch.<foo>.bar.baz that is newly (i.e. not already in master) being collected by the elasticsearch/<foo> metricset in data.go. For this field:

  1. Go ahead and define the field in the module/elasticsearch/<foo>/_meta/fields.yml file.

  2. Now, check if the field has a corresponding field in the .monitoring-es-* index mappings.

    1. If yes: let's say the corresponding field in .monitoring-es-* is some_thing.some_thing_else.qux.boom. Go ahead and define a field alias from some_thing.some_thing_else.qux.boom => elasticsearch.<foo>.bar.baz in the module/elasticsearch/_meta/fields.yml file.
    2. If no: edit the field mapping you created in step 1 in module/elasticsearch/<foo>/_meta/fields.yml and add enabled: false to it. This will ensure that the field is mapped (so the assert_fields_are_documented test assertion you are talking about in your comment will pass) but not be unnecessarily indexed.

Hopefully this is now clear as mud 😃.
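
The two-step rule above can be sketched as a small decision function. This is only an illustration, not code from the Beats repository: `FieldPlan`, `planField`, and the `legacy` lookup table are hypothetical names, and the `is_throttled` path is assumed for the example. The alias pair is the one mentioned later in this conversation.

```go
package main

import "fmt"

// FieldPlan describes how a newly collected metricset field could be mapped.
// Hypothetical type for illustration; it does not exist in Beats.
type FieldPlan struct {
	Field     string // field defined in module/elasticsearch/<foo>/_meta/fields.yml
	AliasFrom string // corresponding .monitoring-es-* path, empty if there is none
	Enabled   bool   // false stands for "enabled: false" (mapped but not indexed)
}

// planField applies step 2 of the procedure. Step 1 (defining the field in the
// metricset's fields.yml) always happens, so every field gets a plan.
// legacy maps a metricset field to its .monitoring-es-* counterpart.
func planField(field string, legacy map[string]string) FieldPlan {
	if from, ok := legacy[field]; ok {
		// 2.1: a legacy counterpart exists, so an alias is defined for it.
		return FieldPlan{Field: field, AliasFrom: from, Enabled: true}
	}
	// 2.2: no counterpart, so the field stays documented but is not indexed.
	return FieldPlan{Field: field, Enabled: false}
}

func main() {
	legacy := map[string]string{
		"elasticsearch.index.summary.total.search.query.count": "indices_stats._all.total.search.query_total",
	}
	fields := []string{
		"elasticsearch.index.summary.total.search.query.count",
		// Assumed full path for the is_throttled field, for illustration only.
		"elasticsearch.index.summary.total.indexing.is_throttled",
	}
	for _, f := range fields {
		fmt.Printf("%+v\n", planField(f, legacy))
	}
}
```

In practice the outcome is written into the two fields.yml files by hand; the function only makes the branching explicit.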

Contributor

Good idea, @chrisronline re: enabled: false. I will edit my comment above to reflect this step.

@ycombinator
Contributor

@sayden Just a heads up that I made a mistake in the node_stats PR, which I'm trying to fix in #20613. As you can see, tests are failing on that PR right now which I need to address. I'm not sure yet what those fixes are going to look like. But just wanted to let you know so you are aware that you might need similar fixes in this PR (and future PRs for other stack monitoring metricsets) as well.

@elasticmachine
Collaborator

elasticmachine commented Aug 14, 2020

💚 Build Succeeded

Build stats

  • Build Cause: [Pull request #20615 updated]

  • Start Time: 2020-11-06T16:43:32.091+0000

  • Duration: 69 min 37 sec

Test stats 🧪

Test Results
Failed 0
Passed 2250
Skipped 500
Total 2750

- Rename index_summary to index.summary in elasticsearch fields.yml
- Remove call to eventsMappingXPack
- Add a missing comment about a non used field in data.go
- Run make update
@ycombinator
Contributor

@sayden As part of this PR you will also need to make this metricset one of the metricsets that's enabled by default when the user runs metricbeat modules enable elasticsearch. This way, it's minimal work for a user to use the elasticsearch module for Stack Monitoring.

@sayden
Contributor Author

sayden commented Aug 20, 2020

I'm trying to make CI work, but it's complaining that the field elasticsearch.index.summary.primaries.store.size.bytes is not documented, even though it's here: https://github.com/elastic/beats/pull/20615/files#diff-3c4851b3e8837e18ba25dc267cf10492R18

@ycombinator
Contributor

@sayden Can you check whether the field shows up if you run the following?

mage update build
./metricbeat export template | jq '.mappings.properties.elasticsearch.properties.index.properties.summary'

If it doesn't, then there's something wrong in the fields.yml for the metricset. Since it's YAML, even a slight mis-formatting could cause this problem. One way to track it down would be to comment out most of your fields in fields.yml, repeat the above commands to make sure what you have left uncommented shows up in the template, then uncomment some more fields, and so on, until you've isolated the problem area and can fix it.
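
When repeating that check many times, the jq lookup could be replaced by a small program that tests one dotted field name against the exported template. The following is a minimal sketch, assuming the template JSON has the `mappings.properties` nesting implied by the jq path above. `hasMappedField` and the sample template are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// sampleTemplate is a made-up fragment shaped like the jq path used above.
const sampleTemplate = `{
  "mappings": {"properties": {
    "elasticsearch": {"properties": {
      "index": {"properties": {
        "summary": {"properties": {
          "primaries": {"properties": {
            "store": {"properties": {
              "size": {"properties": {
                "bytes": {"type": "long"}
              }}
            }}
          }}
        }}
      }}
    }}
  }}
}`

// hasMappedField reports whether the dotted field name can be reached by
// walking mappings.properties.<a>.properties.<b>... in the template.
func hasMappedField(template []byte, field string) (bool, error) {
	var doc struct {
		Mappings struct {
			Properties map[string]any `json:"properties"`
		} `json:"mappings"`
	}
	if err := json.Unmarshal(template, &doc); err != nil {
		return false, err
	}
	props := doc.Mappings.Properties
	parts := strings.Split(field, ".")
	for i, part := range parts {
		node, ok := props[part].(map[string]any)
		if !ok {
			return false, nil
		}
		if i == len(parts)-1 {
			return true, nil
		}
		// Descend one level; a leaf has no "properties", which leaves props
		// nil and makes the next lookup fail.
		props, _ = node["properties"].(map[string]any)
	}
	return false, nil
}

func main() {
	for _, f := range []string{
		"elasticsearch.index.summary.primaries.store.size.bytes",
		"elasticsearch.index.summary.total.search.query.count",
	} {
		ok, err := hasMappedField([]byte(sampleTemplate), f)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s mapped: %v\n", f, ok)
	}
}
```

Feeding it the output of `./metricbeat export template` instead of the sample constant would give the same yes/no answer after each round of uncommenting fields.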

@sayden
Contributor Author

sayden commented Aug 24, 2020

Okay, I fixed the missing field thing thanks to @ycombinator's comment. I just need to fix the Go integration test now.

@sayden sayden marked this pull request as ready for review August 25, 2020 17:18
@elasticmachine
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@sayden
Contributor Author

sayden commented Aug 25, 2020

Integration test finished and current CI error is not related! 🎉

Contributor

@ycombinator ycombinator left a comment

Code looks good, just left a few minor comments.

@chrisronline
Contributor

I'm seeing an issue here.

After running this PR, this query:

POST metricbeat-*/_search?filter_path=hits.hits._source.elasticsearch.index.summary.total.search
{
  "size": 1,
  "sort": [
    {
      "timestamp": {
        "order": "desc"
      }
    }
  ],
  "query": {
    "term": {
      "metricset.name": {
        "value": "index_summary"
      }
    }
  }
}

Returns an empty block:

{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "elasticsearch" : {
            "index" : {
              "summary" : {
                "total" : {
                  "search" : {
                    "query" : {
                      "time" : { }
                    }
                  }
                }
              }
            }
          }
        }
      }
    ]
  }
}

However, indices_stats._all.total.search.query_total aliases into elasticsearch.index.summary.total.search.query.count, but no data seems to be indexed at that location.

Maybe I'm missing something?
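
For repeating this check against other metricsets, the request body above could be generated rather than typed by hand. This is a minimal sketch that only builds and prints the JSON body; `latestEventQuery` is a hypothetical helper, and the sort field is passed in as a parameter because the query above uses `timestamp` as written.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// latestEventQuery builds a search body that returns the most recent
// document for a single metricset, mirroring the query shown above.
func latestEventQuery(metricset, sortField string) map[string]any {
	return map[string]any{
		"size": 1,
		"sort": []any{
			map[string]any{sortField: map[string]any{"order": "desc"}},
		},
		"query": map[string]any{
			"term": map[string]any{
				"metricset.name": map[string]any{"value": metricset},
			},
		},
	}
}

func main() {
	body, err := json.MarshalIndent(latestEventQuery("index_summary", "timestamp"), "", "  ")
	if err != nil {
		panic(err)
	}
	// The printed body would be sent as the payload of the POST shown above.
	fmt.Println(string(body))
}
```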

@sayden
Contributor Author

sayden commented Aug 25, 2020

@chrisronline you are right. After taking a look, I have just realized that there's something missing here: https://github.com/elastic/beats/pull/20615/files#diff-ecd66d9e3da80b956adc12f104330b4cR90. I'll look into it.

@sayden
Contributor Author

sayden commented Aug 25, 2020

Okay! Thanks for the help @chrisronline ! Now it should be ok.

{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "elasticsearch" : {
            "index" : {
              "summary" : {
                "total" : {
                  "search" : {
                    "query" : {
                      "count" : 56,
                      "time" : {
                        "ms" : 585
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    ]
  }
}

Contributor

@chrisronline chrisronline left a comment

LGTM! All the necessary fields have data populated and the ES cluster overview charts look right!

@ycombinator
Contributor

It looks like we are no longer using elasticsearch.BulkStatsDict. Remove it?

@sayden
Contributor Author

sayden commented Aug 27, 2020

I have moved elasticsearch.BulkStatsDict to data.go file :)

Contributor

@ycombinator ycombinator left a comment

LGTM. Thanks for taking this on and iterating through it, @sayden!

@chrisronline
Contributor

Hey folks, any update on this? It looks ready to go but hasn't been merged yet.

@ycombinator
Contributor

@chrisronline @sayden was out all last week so he's probably catching up on backlog.

@sayden CI failures look unrelated but Beats master branch CI has been passing recently. So maybe rebase this on master and see if that gets CI on this PR to green before merging?

@elasticmachine
Collaborator

Pinging @elastic/stack-monitoring (Stack monitoring)

@sayden
Contributor Author

sayden commented Nov 4, 2020

@ycombinator do you mean to rebase the feature branch to master I guess?

@ycombinator
Contributor

@ycombinator do you mean to rebase the feature branch to master I guess?

Yeah, I guess you'll need to do both:

  1. Rebase the elastic:feature-stack-monitoring-mb-ecs on master so it's up to date, then
  2. Rebase this PR's branch on elastic:feature-stack-monitoring-mb-ecs so it's up to date as well.

@sayden sayden force-pushed the feature/mb/elasticsearch/index_summary-xpack-flag branch from c2ba106 to c4934af Compare November 5, 2020 11:32
@elasticmachine
Collaborator

💚 Flaky test report

Tests succeeded.

Test stats 🧪

Test Results
Failed 0
Passed 2250
Skipped 500
Total 2750

@sayden sayden merged commit 58e2a7d into elastic:feature-stack-monitoring-mb-ecs Nov 10, 2020
sayden added a commit that referenced this pull request Nov 12, 2020
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
Labels
Feature:Stack Monitoring, Team:Services (Deprecated)

4 participants