Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add azure module/monitor metricset #13196

Merged
merged 37 commits into from
Sep 23, 2019
Merged

Add azure module/monitor metricset #13196

merged 37 commits into from
Sep 23, 2019

Conversation

narph
Copy link
Contributor

@narph narph commented Aug 8, 2019

Azure Monitor collects and aggregates logs and metrics from a variety of sources into a common data platform where it can be used for analysis, visualization, and alerting.

The metrics are numerical values that describe some aspect of a system at a particular point in time. They are collected at regular intervals and are identified with a timestamp, a name, a value, and one or more defining labels. Metrics can be aggregated using a variety of algorithms, compared to other metrics, and analyzed for trends over time .

We can access the azure monitor metrics:

  • from a command line using PowerShell cmdlets
  • from custom application using REST API.
  • from a command line using CLI.
  • if metrics are forwarded to an event hub then a custom solution to read data from the event hub

PR is in progress and subject to changes, initial approach using the rest API to retrieve metric definitions and metrics, currently using the azure sdk for go.

Current azure module/monitor metricset configuration:

- module: azure
  metricsets:
  - monitor
  enabled: true
  period: 60s
  client_id: ''
  client_secret: ''
  tenant_id: ''
  subscription_id: ''
  refresh_list_interval: 120s
  resources:
  - resource_id: "subscriptions/xxxxx/resourceGroups/obs-infrastructure/providers/Microsoft.ApiManagement/service/obsapimanagement"
    metrics:
    - name: "Requests"
      namespace: "Microsoft.ApiManagement/service"
      aggregations: ["Maximum"]
      dimensions:
      - name: "Hostname"
        value: "obsapimanagement.azure-api.net"
    - name: ["Capacity", "Requests"]
      namespace: "Microsoft.ApiManagement/service"
      aggregations: ["Average"]
      dimensions:
      - name: "Location"
        value: "West Europe"
  - resource_group: "obs-infrastructure"
    resource_type: "Microsoft.Compute/virtualMachines"
    metrics:
    - name: "Percentage CPU"
      namespace: "Microsoft.Compute/virtualMachines"
      aggregations: ["Average", "Minimum", "Maximum"]
    - name: ["Disk Read Bytes", "Data Disk Write Operations/Sec", "Network Out"]
      namespace: "Microsoft.Compute/virtualMachines"
      aggregations: ["Average", "Maximum"]
  - resource_query: "resourceType eq 'Microsoft.Storage/storageAccounts' and resourceGroup eq 'obs-infrastructure'"
    metrics:
    - name: ["*"]
      namespace: "Microsoft.Storage/storageAccounts"
      aggregations:  ["Total", "Average"]

Example output:

        "_index" : "metricbeat-8.0.0-2019.08.07-000001",
        "_type" : "_doc",
        "_id" : "Tj85cGwBFNllmdJdQHXi",
        "_score" : null,
        "_source" : {
          "@timestamp" : "2019-08-08T07:54:50.905Z",
          "azure" : {
            "monitor" : {
              "resource" : {
                "name" : "obsinfrastructurediag",
                "location" : "westeurope",
                "type" : "Microsoft.Storage/storageAccounts"
              },
              "namespace" : "Microsoft.Storage/storageAccounts",
              "subscriptionId" : "xxxxxx",
              "metrics" : {
                "successe2elatency.avg" : 34.54545454545455,
                "availability.avg" : 100,
                "transactions.avg" : 1,
                "ingress.avg" : 35056.5,
                "egress.avg" : 8167.5,
                "successserverlatency.avg" : 22.181818181818183
              }
            }
          },
          "event" : {
            "module" : "azure",
            "duration" : 3039980500,
            "dataset" : "azure.monitor"
          },
          "metricset" : {
            "name" : "monitor"
          },
          "service" : {
            "type" : "azure"
          },
          "ecs" : {
            "version" : "1.0.1"
          .....
        },

STILL TO DO

  • Clean up/improve code
  • Implement validation on resources, metric names/ aggregations configured
  • Create tests for the client and helper methods
  • Move location value to cloud.region

x-pack/metricbeat/module/azure/monitor/client.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/monitor/client.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/monitor/client.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/monitor/client.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/monitor/client.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/azure.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/azure.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/azure.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/azure.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/azure.go Outdated Show resolved Hide resolved
@narph narph self-assigned this Aug 8, 2019
@narph narph added the Team:Integrations Label for the Integrations team label Aug 8, 2019
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skimmed through the code and left some comments.

One thing I'm especially curious about is how exactly the calculation and gathering of metrics happens. What happens if there are 2 metrics in the queried timeframe? What will the API return? If someone selects 1h for the selection and we get a metric, do we know when it happened or is it calculated on demand? We had quite a few discussions around this also for the AWS side. Maybe @kaiyan-sheng would be good to have a look at this PR.

For some of the long methods it would be great to get a bit more details in the method description on what it does which would make reviewing easier.

One part I didn't fully get yet: Will azure.monitors contain logs and metrics? Could you provide an example on how a log even looks like?

metricbeat/docs/modules_list.asciidoc Outdated Show resolved Hide resolved
vendor/vendor.json Show resolved Hide resolved
x-pack/metricbeat/module/azure/monitor/_meta/data.json Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/monitor/data.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/monitor/monitor.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/monitor/monitor.go Outdated Show resolved Hide resolved
Copy link
Contributor

@kaiyan-sheng kaiyan-sheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if https://github.com/Azure-Samples/azure-sdk-for-go-samples/blob/master/insights/insights.go will be helpful. I see ListMetricDefinitions and GetMetricsData here in the code might be handy.

x-pack/metricbeat/metricbeat.reference.yml Outdated Show resolved Hide resolved
ClientSecret string `config:"client_secret" validate:"required"`
TenantID string `config:"tenant_id" validate:"required"`
SubscriptionID string `config:"subscription_id" validate:"required"`
Period time.Duration `config:"period"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add validate:"nonzero,required" here? Same for other required params.

x-pack/metricbeat/modules.d/azure.yml Outdated Show resolved Hide resolved
@kaiyan-sheng
Copy link
Contributor

Please ignore this comment if I missed it, I think you also need to add TestFetch and TestData functions: https://github.com/elastic/beats/blob/master/x-pack/metricbeat/module/aws/cloudwatch/cloudwatch_integration_test.go

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job here! I have added some extra comments.

And one more comment I didn't know how to add: metricbeat/modules.d/system.yml shouldn't be renamed.

vendor/vendor.json Show resolved Hide resolved
type: keyword
description: >
The subscription ID
- name: metrics.*.*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cloudwatch metrics are stored in aws.metrics.*.*, should we have Azure metrics in azure.metrics.*.*?

Copy link
Contributor Author

@narph narph Aug 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be prefixed by aws.cloudwatch.*. fields.yml fields are concatenated together. Best double check in the resulting template to confirm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsoriano Sorry I missed this comment. cloudwatch metricset stores data in aws.metrics.* now. Because with using light module to create other metricsets, we don't want aws.cloudwatch being the prefix for other services, such as elb or ebs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think we should use the same pattern here for consistency. @narph what do you think? It'd mean to store azure metrics in azure.metrics.* instead of azure.monitor.metrics.*.
I think that other common fields like the namespace or the subscription ID could be placed also at the module level.

}
if query != "" {
resourceQuery = query
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If query is set, group, rtype and ID will be ignored. And also if group is set, ID will be ignored. This can be counterintuitive, should we check for these cases and return errors?

Or maybe better we should check for these cases implementing a Validate() method for ResourceConfig.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@narph should we explicitly validate that a valid combination of options is provided (e.g. if ID and query are provided, return an error)?

x-pack/metricbeat/metricbeat.reference.yml Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/azure.go Outdated Show resolved Hide resolved
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package compute_vm

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use an underscore in package name

// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package compute_vm

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use an underscore in package name

// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package compute_vm_scaleset

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use an underscore in package name

// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package compute_vm_scaleset

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use an underscore in package name

@narph narph marked this pull request as ready for review September 18, 2019 17:28
@narph narph requested a review from a team as a code owner September 18, 2019 17:28
@narph narph requested a review from jsoriano September 19, 2019 08:08
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@narph huge effort here, let's see if we can merge it soon.

I have added some additional comments, but nothing specially serious. I think that we should merge it soon because it is starting to be quite big and difficult to review. Further fixes and refactors can be addressed in future smaller PRs.

Also as all the code is located in a new specific module there shouldn't be any risk of regressions, and it is beta, what would help to set expectations.

There is only one thing that I think we should clarify before merging this. I think that field paths should be coherent with the cloudwatch metricset. In my opinion metrics should be under azure.metrics. And other fields coming from information that is not specific of the monitor service should be placed at the module level.

We should also generate proper data.json files, but we need to review docs in any case, so this can be left for a future change.

=== Example configuration

The azure module supports the standard configuration options that are described
in <<configuration-metricbeat>>. Here is an example configuration:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More than a big example it may be good to have smaller examples with more explanations in the docs. We may revisit docs in follow-ups in any case.

"time"

"github.com/Azure/azure-sdk-for-go/services/preview/monitor/mgmt/2019-06-01/insights"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. Empty line.

Suggested change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually make the changes but every time I run mage fmt update I get a different order, any other commands that should be ran here?

Config Config
Resources ResourceConfiguration
Log *logp.Logger
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a follow up, move common code to configure and initialize clients to some common place so it can be also used in other beats.

x-pack/metricbeat/module/azure/client.go Show resolved Hide resolved
x-pack/metricbeat/module/azure/client.go Outdated Show resolved Hide resolved
x-pack/metricbeat/module/azure/data.go Outdated Show resolved Hide resolved
type: keyword
description: >
The subscription ID
- name: metrics.*.*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think we should use the same pattern here for consistency. @narph what do you think? It'd mean to store azure metrics in azure.metrics.* instead of azure.monitor.metrics.*.
I think that other common fields like the namespace or the subscription ID could be placed also at the module level.

x-pack/metricbeat/module/azure/monitor/client_helper.go Outdated Show resolved Hide resolved
func (m *MetricSet) Fetch(report mb.ReporterV2) error {
err := m.client.InitResources(report)
if err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@narph please review these return err to see if more context can be provided with errors.Wrap.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say to merge this as is, and continue with refactors, documentation and dashboards in follow ups.

Great job here!

@narph
Copy link
Contributor Author

narph commented Sep 23, 2019

jenkins test this

1 similar comment
@narph
Copy link
Contributor Author

narph commented Sep 23, 2019

jenkins test this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Metricbeat Metricbeat Team:Integrations Label for the Integrations team v7.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants