Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide labels for Azure Tags per metric #599

Open
4 tasks
tomkerkhove opened this issue Jun 26, 2019 · 27 comments
Open
4 tasks

Provide labels for Azure Tags per metric #599

tomkerkhove opened this issue Jun 26, 2019 · 27 comments
Labels
agents:scraper All issues related to the scraping agent feature-request New feature requests Hacktoberfest All issues available for contribution during Hacktoberfest help-wanted All issues where people can contribute to the project scraping All issues related to scraping

Comments

@tomkerkhove
Copy link
Owner

tomkerkhove commented Jun 26, 2019

We should determine if it's feasible and makes sense to provide labels for Azure Tags per metric.

Original suggestion was made in #462 by @brusMX .

Checklist

@tomkerkhove tomkerkhove added feature-request New feature requests scraping All issues related to scraping awaiting-customer-demand All issues which can be relevant but requires effective customers first labels Jun 26, 2019
@jcorioland
Copy link
Contributor

jcorioland commented Jun 26, 2019

Having a look at the code I think this is something that might be doable easily in the QueryMetricAsync method of AzureMonitorClient class. Like that:

// get tags
string resourceGroupName = "EXTRACT NAME FROM resourceId";
var resourceGroup = await _authenticatedAzureSubscription.ResourceGroups.GetByNameAsync(resourceGroupName);
 var tags = resourceGroup.Tags;

I am just wondering how we can add Tags property on the metric as you seem to use the object from the monitoring SDK, right?

Any thoughts on that @tomkerkhove ?

@tomkerkhove
Copy link
Owner Author

tomkerkhove commented Jun 26, 2019

I've had a look and I can access the data very easily, however, you need to know what property to call so ever scraper has to do it.

Will be a good addition, but it will be opt-in and only for post v1.0.

Does that sound fair?

@tomkerkhove tomkerkhove modified the milestones: v1.1.0, v1.0.0 Jun 26, 2019
@tomkerkhove tomkerkhove removed the awaiting-customer-demand All issues which can be relevant but requires effective customers first label Jun 27, 2019
@tomkerkhove
Copy link
Owner Author

From a Promitor perspective, I think we should only get tags from the resource itself rather than the resource group.

That way we keep the required permissions scoped to what is essential and we get aimed tags.

@jcorioland
Copy link
Contributor

Yep makes sense. And I checked yesterday, seems like the Monitoring Reader role is enough to get the tags, which is great in the case of Promitor :)

@tomkerkhove
Copy link
Owner Author

That is correct! Will add this tomorrow!

@tomkerkhove
Copy link
Owner Author

Figuring out how we can easily do this in a generic way via Azure/azure-libraries-for-net#719

Worst case this is moved to a later version.

@tomkerkhove tomkerkhove modified the milestones: v1.0.0, v1.1.0 Jul 1, 2019
@adamconnelly
Copy link
Contributor

I'd be up for taking a look at this, but I just wanted to explain my use case first, and clarify a bit more about how it could work.

Use Case

At work, we have multiple applications made up of different components (web server, database, redis cache, etc). We also have multiple environments (production, staging, etc). I'd like to be able to make sure that our metrics had a consistent set of labels on them:

  • environment
  • application
  • region
  • component

This will allow us to do a few things:

  • Create Grafana dashboards showing different parts of the same application.
  • Group alerts together, so that a problem in an SQL Server database that causes a knock-on error in the web application results in a single alert for the application, making it easier to diagnose problems.

With this labeling scheme, we might end up with a set of metrics like the following:

azure_sql_dtu_percent{environment="production", application="diary", region="euw", component="sql-server"} 54
azure_sql_dtu_percent{environment="production", application="diary", region="usnc", component="sql-server"} 43
azure_sql_dtu_percent{environment="production", application="login", region="euw", component="sql-server"} 49
azure_sql_dtu_percent{environment="staging", application="diary", region="euw", component="sql-server"} 23
azure_sql_dtu_percent{environment="staging", application="login", region="euw", component="sql-server"} 12
azure_redis_cache_hits{environment="production", application="diary", region="euw", component="redis"} 10000
azure_redis_cache_hits{environment="production", application="diary", region="usnc", component="redis"} 9500
azure_redis_cache_hits{environment="production", application="login", region="euw", component="redis"} 5609
azure_redis_cache_hits{environment="staging", application="diary", region="euw", component="redis"} 500
azure_redis_cache_hits{environment="staging", application="login", region="euw", component="redis"} 240

At the moment we can configure the component label in Promitor using the following:

name: azure_redis_cache_hits
description: "The number of successful key lookups ..."
resourceType: RedisCache
labels:
  component: redis
...
resources:
- cacheName: Promitor-1
- cacheName: Promitor-2

But the other labels would have to be set per resource, which would make the configuration difficult.

Possible Implementation

Configuration

We could alter the metrics configuration to allow users to opt-in to adding Azure tags as labels. I think it would make sense to either add this to the existing azureMetadata section, or to create a new azure section. Personally I'd put it into azureMetadata to keep the config as simple as possible unless you see a problem with that.

I think this feature should be opt-in, and we should be able to specify which labels we want to include, for example:

azureMetadata:
  tenantId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  subscriptionId: yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy
  resourceGroupName: promitor
  addLabelsForTags:
    tags:
    - environment
    - region
    - application
    - component

I can imagine that some people might have simple setups where they just want to add all the tags, in which case we could do something like this:

azureMetadata:
  tenantId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  subscriptionId: yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy
  resourceGroupName: promitor
  addLabelsForTags:
    includeAllTags: true

Implementation

I've got one main question about the implementation of this: should we get the tags once at startup, or during every scrape. The advantage of doing it once at startup would be that we'd avoid running an extra API request per scrape, but the disadvantage is that any changes to tags wouldn't be picked up until Promitor restarted. I don't think this would matter for my use case since our tags will be very static, but maybe that's a problem for some people?

Any thoughts on this?

@tomkerkhove
Copy link
Owner Author

Thanks for the suggestion and your usecase @adamconnelly!

I've been thinking about this as well and there are a couple of things:

  1. This should be opt-in indeed and we should check what the required ARM role is to read them so we can document it.
  2. People should be able to list the tags they want to use, similar to your case
  3. People should be able to use all tags but list the ones to exclude
  4. People should be able to configure a value to be used in case one resource of a list does not have the tag

This makes me thing in these terms:

  • Promitor runtime allows you to opt-in and configure a default value
  • Metric declaration allows you to define default include/exclude of tags, with override per metric

In terms of getting the tags, I would get them every time and see how things go. Over time we could still cache them or only fetch them once but then it becomes tricky as we might report stale data.

Now I would not implement everything as part of this feature, for example I would create a dedicated issue for excluding tags and one for annotating the ones to include rather than to add everything.

Or would annotating those that you want be the best approach to avoid having too many labels🤔 Now that I think of this I'm actually leaning toward whitelist only.

Runtime Configuration

server:
  httpPort: 80 # Optional. Default: 80
azure:
  tagEnrichment:
    notFoundValue: N/A # Optional. Default: N/A
    enableEnrichment: true # Optional. Default: false

Metric Declaration

So I'm more thinking along these lines, but I'm open for feedback:

version: v1
azureMetadata:
  tenantId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  subscriptionId: yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy
  resourceGroupName: promitor
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    # Every minute
    schedule: "0 * * ? * *"
  azureTags:
    include:
    - foo
    - bar
    exclude:
    - foo
    - bar
metrics:
  - name: azure_service_bus_active_messages
    description: "The number of active messages on a service bus queue."
    resourceType: ServiceBusQueue
    labels:
      app: promitor
      tier: messaging
    azureTags:
      include:
      - environment
      exclude:
      - deployedBy
      - version

@adamconnelly
Copy link
Contributor

That makes sense to me. Couple of questions:

  • I take it the include and exclude lists are mutually exclusive?
  • I think it makes sense to allow this to be specified at the global, and per-metric level like you've suggested. Should we support only specifying it at the metric level without specifying a default (i.e. if you only wanted to pull in the tags for a single metric or a certain subset)? I don't think this would be a use-case for me, but I just want to get your thoughts.

@tomkerkhove
Copy link
Owner Author

I think having both would be ideal to avoid duplication, for example the environment Azure Tag would be on all resources rather than a handful.

The list would be mutually exclusive indeed.

I'll need to give the include/exclude some thought but for the this issue I'd say let's just add all of them.

@adamconnelly
Copy link
Contributor

@tomkerkhove have you had any thoughts yet about how to get around Azure/azure-libraries-for-net#719?

@adamconnelly
Copy link
Contributor

In case this helps, I've done a bit of playing about, and found that we can get the resource using IGenericResources.Get(). For example:

_resourceManager.GenericResources.Get(ResourceUtils.GroupFromResourceId(resourceId),
                ResourceUtils.ResourceProviderFromResourceId(resourceId),
                ResourceUtils.ParentRelativePathFromResourceId(resourceId),
                ResourceUtils.ResourceTypeFromResourceId(resourceId),
                ResourceUtils.NameFromResourceId(resourceId),
                "2018-06-01-preview");

There's (at least!) a couple of problems / annoyances with that:

  • The method isn't async.
  • We'd need to specify an appropriate API version for each resource provider.

@tomkerkhove
Copy link
Owner Author

@adamconnelly I did some planning and would like to move this to v1.2, is that an issue for you? Is this a requirement on your end?

I want to ship v1.1 fairly soon and it feels like this is not well-supported yet and we might need to ask for better support

@adamconnelly
Copy link
Contributor

That's no problem for me.

When I started to look at it I hadn't spotted the problem you'd found with the Azure client library. At the moment it looks like we either wait for better support, or we'd have to put a lot of effort in to work around it. Either way it's not gonna be a quick fix.

@tomkerkhove
Copy link
Owner Author

tomkerkhove commented Oct 28, 2019

In that case I'd sit back and wait until there is better support, no pressure on this feature!

@tomkerkhove tomkerkhove modified the milestones: v1.2.0, v1.3.0 Jan 6, 2020
@taruna-verma
Copy link

Is this feature implemented? I am also in need of adding the azure tags in the metrics dimension. Please help.

@tomkerkhove
Copy link
Owner Author

This is not supported yet, unfortunately.

Would you prefer to have this for all metrics or only when using resource discovery?

@taruna-verma
Copy link

All the metrics is preferred but good to have for both.

@taruna-verma
Copy link

taruna-verma commented Dec 28, 2020

Can you please provide some tips, if I want to implement this for all the metrics in the scraper?

@tomkerkhove
Copy link
Owner Author

Can you please provide some tips, if I want to implement this for all the metrics in the scraper?

Do you mean to have it up and running or contribute the feature itself?

@taruna-verma
Copy link

Can you please provide some tips, if I want to implement this for all the metrics in the scraper?

Do you mean to have it up and running or contribute the feature itself?

I would love to contribute on this feature, please provide me details.

@tomkerkhove
Copy link
Owner Author

That would be great, thanks! Let's focus on resource discovery first now, and then we can add it to declarative metrics later on.

I'll circle back on this later on

@tomkerkhove
Copy link
Owner Author

The resource discovery group should have a new field called enrichments which allows you to opt-in for tag enrichment when enabled: true:

version: v1
azureLandscape:
  tenantId: c8819874-9e56-4e3f-b1a8-1c0325138f27
  subscriptions:
  - SUBSCRIPTON-ID-ABC
  cloud: China
resourceDiscoveryGroups:
- name: container-registry-landscape
  type: ContainerRegistry
- name: filtered-logic-apps-landscape
  type: LogicApp
  enrichments:
    tags:
      enabled: true
      scope: [resource, resourceGroup]
      exclude:
      - app
  criteria:
    include:
      resourceGroups:
      - promitor-resource-group-1
      - promitor-resource-group-2
      tags:
        app: promitor
        region: europe

Bonus points if we allow exclusion by defining the tag names through exclude and/or define the scope to look for the tags on either resource and/or resourceGroup.

The API endpoint should introduce a new field called Dimensions:

[
    {
        "$type": "Promitor.Core.Contracts.ResourceTypes.ServiceBusNamespaceResourceDefinition, Promitor.Core.Contracts",
        "Namespace": "promitor-messaging-other-subscription",
        "QueueName": "",
        "TopicName": "",
        "ResourceType": "ServiceBusNamespace",
        "SubscriptionId": "0329dd2a-59dc-4493-aa54-cb01cb027dc2",
        "ResourceGroupName": "promitor-sources",
        "ResourceName": "promitor-messaging-other-subscription",
        "UniqueName": "promitor-messaging-other-subscription--",
        "Dimensions":{
            "region": "europe"
        }
    }
]

@tomkerkhove
Copy link
Owner Author

tomkerkhove commented Sep 29, 2021

Extending the example above, we should:

  • Support resourceDiscoveryGroups.[].enrichments.tags.include
  • Remove resourceDiscoveryGroups.[].enrichments.scope as we will only support resources for now
  • Provide capability to configure the default value if it was not found (default: N/A)

However, next to the approach above we will provide system metrics on the discovery agent for every single resource so that you can use a more centralized view as well.

@KeyanatGiggso
Copy link

What's the status on this @tomkerkhove ? Any good News for this Improvement ?

@tomkerkhove
Copy link
Owner Author

Not at the moment but I'm open to contributions.

@tomkerkhove tomkerkhove added help-wanted All issues where people can contribute to the project Hacktoberfest All issues available for contribution during Hacktoberfest labels Sep 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agents:scraper All issues related to the scraping agent feature-request New feature requests Hacktoberfest All issues available for contribution during Hacktoberfest help-wanted All issues where people can contribute to the project scraping All issues related to scraping
Projects
Status: Todo
Development

No branches or pull requests

5 participants