Add Azure VM scale sets to azure_sd_config #2369

Closed
gambrose opened this Issue Jan 26, 2017 · 19 comments

6 participants

Contributor

gambrose commented Jan 26, 2017

What did you do?

Configured azure_sd_config to point to my Azure subscription.

What did you expect to see?

All my Resource Manager VMs.

What did you see instead? Under which circumstances?

I only saw standalone VMs. VMs managed by an Azure scale set were not visible.

Environment

Windows, Azure

  • System information:
'uname' is not recognized as an internal or external command,
operable program or batch file.
  • Prometheus version:
prometheus, version 1.5.0 (branch: master, revision: d840f2c400629a846b210cf58d65b9fbae0f1d5c)
  build user:       root@a04ed5b536e3
  build date:       20170123-14:03:28
  go version:       go1.7.4
@beorn7

Member

beorn7 commented Jan 26, 2017

We have very little Azure knowledge in the core team, so it will be hard for any of us to work on this. @rjtsdl, does the above make any sense to you?

rjtsdl commented Jan 26, 2017

What did you see instead? Under which circumstances?
I only saw standalone VMs. VMs managed by an Azure scale set were not visible.

@gambrose
Can you elaborate on this? Can you give a pseudo example?
And what do you mean by VMs managed by an Azure scale set? Do you mean VMSS or Availability Set?

Contributor Author

gambrose commented Jan 26, 2017

I mean VMSS.
The config does not return any VM instances that are created as part of a scale set.

So if I created a VMSS and set the number of instances to 5, the VMSS would create 5 VMs. But none of those VMs are found by Prometheus.

rjtsdl commented Jan 26, 2017

While I try to repro your scenario: how did you create the VMSS? Through the CLI or the portal?

Contributor Author

gambrose commented Jan 26, 2017

Using an ARM template deployment from PowerShell.

Contributor Author

gambrose commented Jan 26, 2017

Sorry, I may have misunderstood your question. I use ARM templates to create my VMSS, but you can also create them through the portal. I am not sure what support the CLI has for Scale Sets.

rjtsdl commented Jan 26, 2017

The CLI tool I mentioned is this one: https://github.com/Azure/azure-cli
It is a general-purpose CLI for Azure and is in active development.

rjtsdl commented Jan 26, 2017

I just took a look at the Azure API here: https://github.com/Azure/azure-sdk-for-go/tree/master/arm/compute

Apparently virtualMachines and virtualMachineScaleSets are treated differently. They are not the same resource type at all.

For virtualMachines, the corresponding REST API looks like
/subscriptions/{subscriptionId}/providers/Microsoft.Compute/virtualMachines
For virtualMachineScaleSets, the corresponding REST API looks like
/subscriptions/{subscriptionId}/providers/Microsoft.Compute/virtualMachineScaleSets

I just checked the Prometheus code, and we only query virtualMachines. That means it will only discover VMs of the virtualMachines resource type.
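
To make the difference between the two resource types concrete, here is a minimal sketch that builds the two subscription-wide list paths quoted above. The helper `listPath` and the placeholder subscription ID are illustrative only; they are not part of Prometheus or the Azure SDK, and the API version query parameter that a real request needs is deliberately left out.

```go
package main

import "fmt"

// listPath returns the subscription-wide list path for a Microsoft.Compute
// resource type such as "virtualMachines" or "virtualMachineScaleSets".
// Hypothetical helper: the api-version query parameter is intentionally omitted.
func listPath(subscriptionID, resourceType string) string {
	return fmt.Sprintf("/subscriptions/%s/providers/Microsoft.Compute/%s",
		subscriptionID, resourceType)
}

func main() {
	// Placeholder subscription ID, not a real one.
	sub := "00000000-0000-0000-0000-000000000000"

	// Standalone VMs and scale sets live under different resource types,
	// so listing one does not return the other.
	fmt.Println(listPath(sub, "virtualMachines"))
	fmt.Println(listPath(sub, "virtualMachineScaleSets"))
}
```

A discovery that only requests the first path never sees the second, which matches the behaviour reported in this issue.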

VMSS is a relatively new concept. There is AvailabilitySet as well.
I think some work is needed to support VMSS and AvailabilitySet in the azure_sd code.

@gambrose, interested? I can point you to the code you would need to change and test. Unfortunately, I don't have enough cycles for this right now.

Contributor Author

gambrose commented Jan 26, 2017

I'm not sure why there would be a need to support Availability Sets. They are more of a parallel resource type that VMs are linked to, and they don't really own a VM in the same way a Scale Set does.

Unless there is a requirement to expose the AV set in a tag for relabelling purposes.

I don't have much experience with Go, but I could take a stab at adding VMSS support.

Would the idea be to iterate over the scale sets and produce a flattened list of VMs that includes both normal VMs and scale-set-managed ones?

If so, would we expose any scale-set-specific labels? I'm not really sure what the guidelines are for labels that only relate to certain VMs and not others.
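
The flattening idea asked about above could look roughly like the following sketch. Everything in it is an assumption made for illustration: the `vm` and `scaleSet` types are simplified stand-ins rather than Azure SDK types, and the `ScaleSet` field is just one possible way to carry scale-set-specific information on each entry.

```go
package main

import "fmt"

// vm is a hypothetical, simplified stand-in for a discovered Azure VM.
type vm struct {
	Name     string
	ScaleSet string // empty for standalone VMs
}

// scaleSet is a hypothetical stand-in for a scale set and its instances.
type scaleSet struct {
	Name      string
	Instances []string
}

// flatten returns the standalone VMs followed by every scale set instance,
// tagging each instance with the name of the scale set that owns it.
func flatten(standalone []string, sets []scaleSet) []vm {
	out := make([]vm, 0, len(standalone))
	for _, name := range standalone {
		out = append(out, vm{Name: name})
	}
	for _, s := range sets {
		for _, inst := range s.Instances {
			out = append(out, vm{Name: inst, ScaleSet: s.Name})
		}
	}
	return out
}

func main() {
	all := flatten(
		[]string{"db-1"},
		[]scaleSet{{Name: "web", Instances: []string{"web_0", "web_1"}}},
	)
	for _, v := range all {
		fmt.Printf("%s scale_set=%q\n", v.Name, v.ScaleSet)
	}
}
```

With this shape, downstream code handles one list of VMs, and whether to expose the scale set name as a label becomes a separate decision.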

rjtsdl commented Jan 26, 2017

@beorn7 can you answer the questions @gambrose listed here?

I am not sure whether an AV set's VMs get hidden like VMSS VMs or not.
I think the approach you described is what I had in mind, but check with beorn7 or other members to be sure.

You can take a look at the code here: https://github.com/prometheus/prometheus/blob/master/discovery/azure/azure.go
The azureClient definition is here:

type azureClient struct {
	nic network.InterfacesClient
	vm  compute.VirtualMachinesClient
}

You want to add

	vmss compute.VirtualMachineScaleSetsClient

and track the places where vm is used.

Member

beorn7 commented Jan 27, 2017

I have no deeper understanding of what labels these would be. The general rule is that you want to have the same dimensionality on all targets, i.e. the set of label names should be the same, while the label values might differ. I'm sure if you come up with an attempt, we can make sure target labels are treated properly. @brian-brazil is our guardian of target labels.

Member

brian-brazil commented Jan 27, 2017

The rules for meta labels for SD are different: we take any metadata we can get, and it's up to the user to make sense of it in the context of their environment.

The more pertinent questions here are what this will do in terms of API call volume, as Azure has very low limits, and whether we should be splitting this with roles like we do for k8s.

Contributor Author

gambrose commented Jan 27, 2017

@brian-brazil I had a quick look at the k8s config, and as far as I can see the roles relate to getting different resource types. In this case we are talking about getting the same resource type (VMs). It just happens that VMSS try to abstract some of the management details of the VMs, but ultimately they are still the same Azure VMs and would share all the same labels.
From a monitoring point of view I want a list of the VMs in my subscription; I don't really care how they were created.

That said, I understand the need to limit calls.
As far as I can tell from looking at the code linked, the current implementation makes one call to a paged endpoint that retrieves all the VMs for a subscription, then makes another call for each VM returned to get its network adapters.
To also get the scale set VMs, we would need a call to list all the scale sets, then for each scale set a call to list the VMs in it, then another call for each of those VMs to get its network interfaces.

As we already have a select N+1 to get the VM info, I would have thought that if we want to limit the number of API calls we should be trying to filter the number of VMs we need to get info for.
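
The call breakdown described above can be turned into a rough estimate. The sketch below is only an approximation under stated assumptions: every list call fits in a single page, and every VM needs exactly one network interface lookup. The function `estimateCalls` is a hypothetical helper, not code from Prometheus.

```go
package main

import "fmt"

// estimateCalls returns a rough count of API calls for one discovery pass.
// Assumptions: each list call fits in a single page, and each VM needs
// exactly one network interface lookup.
func estimateCalls(standaloneVMs int, scaleSetSizes []int) int {
	// One call to list standalone VMs, plus one NIC call per VM.
	calls := 1 + standaloneVMs

	// One call to list the scale sets themselves.
	calls++

	// Per scale set: one call to list its VMs, plus one NIC call per VM.
	for _, size := range scaleSetSizes {
		calls += 1 + size
	}
	return calls
}

func main() {
	// 10 standalone VMs and two scale sets with 5 instances each.
	fmt.Println(estimateCalls(10, []int{5, 5}))
}
```

Under these assumptions the scale set part adds one call per scale set plus one fixed list call on top of the existing per-VM cost, so the total still grows mainly with the number of VMs.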

Member

brian-brazil commented Jan 27, 2017

It doesn't sound like this will cause a crazy increase in call counts then.

samirageb commented Sep 11, 2017

Any update/ETA on this? In any Azure auto-scale environment, not supporting VMs within a VMSS is a huge gap. I was looking to leverage this functionality, and seeing this thread makes azure_sd_config a non-starter.

I don't know enough to contribute code, but I am willing to answer any questions I can around Azure.

I CAN say to ignore Availability Sets, as they won't impact VM discovery. Only VMSS have a different API namespace for VMs.

Contributor Author

gambrose commented Sep 15, 2017

@samirageb I got something working in my pull request, but this was, I think, my second attempt at Go code.

What the experience highlighted for me was that, although Go is a nice language, I had real trouble understanding the ecosystem for dependencies and debugging my code. Unfortunately I don't have the time to invest in getting myself to a stage where I could write Go code that I had enough confidence in to run in a production scenario.

samirageb commented Sep 16, 2017

@gambrose Thanks for the update. I'm currently deploying Consul which (hopefully) will get me where I want to go re: SD.

Member

simonpasquier commented Aug 1, 2018

Fixed by #4202

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 22, 2019
