Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch metadata from GCE VM Metadata Server #837

Merged
merged 3 commits into from Sep 19, 2022

Conversation

LujieDuan
Copy link
Contributor

@LujieDuan LujieDuan commented Sep 8, 2022

Description

This change introduces a new package in confgenerator called resourcedetector to detect metadata of the environment Ops Agent is running in. The detected metadata can then be used by receivers/processors. Right now we only have a detector for GCE environment; the detector is not currently used by any components; this will be used by the Prometheus receivers.

Related issue

b/244434005

How has this been tested?

This has been tested with unit tests and manual tests on GCE.
Integration tests for this is TBD.

Checklist:

  • Unit tests
    • Unit tests do not apply.
    • Unit tests have been added/modified and passed for this PR.
  • Integration tests
    • Integration tests do not apply.
    • Integration tests have been added/modified and passed for this PR.
  • Documentation
    • This PR introduces no user visible changes.
    • This PR introduces user visible changes and the corresponding documentation change has been made.
  • Minor version bump
    • This PR introduces no new features.
    • This PR introduces new features, and there is a separate PR to bump the minor version since the last release already.
    • This PR bumps the version.

Copy link
Contributor

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks pretty solid. There are a few transformations and nuances here around differences from how the gce detector in prometheus collects things (Compute engine API) and how we do (metadata server). We should add an integration test in addition to the unit tests in this PR to make sure the metadata attributes look like what we expect for the prometheus use case.

)

// Keep synced with the Prometheus repo
const (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest keeping this detector generic and leave all the translation to the prometheus style labels to the prometheus receiver. This detector is useful for other components too. One such example is: b/219991181 | P3 | Allow LogName to be used in processors

then func(GCEDataProvider, string) (string, string, error)
}

// List of single value labels (non-nested)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels feels like a prometheus specific thing to me. It can get confusing especially since the metadata has its own labels too. Naming it instead to something like metadata or attributes might make it clearer.

gceLabelInstanceID: GCEDataProvider.getInstanceID,
gceLabelInstanceName: GCEDataProvider.getInstanceName,
gceLabelTags: GCEDataProvider.getTags,
gceLabelLabel: GCEDataProvider.getLabel,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a single label or nested? I thought it was a nested one but could be wrong.

// The data provider interface for GCE environment
// Implementation of this provider can use either the metadata server on VM,
// or the cloud API
type GCEDataProvider interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add some comments here around what is available in the detector now and what is not


type LabelValue string

type Metalabels map[LabelName]LabelValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The contract we expose should probably be more clear - instead of having map[string]string it could be a struct that has fields for all the attributes. This way it is part of the public interface what fields are available as part of the detector, instead of maintaining another set of key labels.

// Also add `nic` to the name following the Prometheus's SD convention
func (gmp *GCEMetadataProvider) getInterfaceIPv4(name string) (string, string, error) {
interface_name := strings.TrimSuffix(name, "/")
ipv4, err := gmp.client.Get(fmt.Sprintf("instance/network-interfaces/%s/ip", interface_name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are very interesting mutations we have to do. I'd leave it out of the detector package and add it when the prometheus receiver uses it. Only it cares about the Prometheus SD's conventions.


// EmptyDetector that returns an empty detector without any attributes
// for non-recognized environments
type EmptyDetector struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it would be better to call this UnrecognizedPlatformDetector or something similar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for UnrecognizedPlatformDetector.

@LujieDuan LujieDuan marked this pull request as ready for review September 12, 2022 16:53
Copy link
Contributor

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking very good! Left a couple comments


// EmptyDetector that returns an empty detector without any attributes
// for non-recognized environments
type EmptyDetector struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for UnrecognizedPlatformDetector.


// Nested attribute takes two steps: first to get the list of keys,
// and then use the key to get individual values.
type nestedAttribute struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struct and interface seems pretty confusing to me. It feels simpler to have the provider do both the fetching of the keys and then aggregate the values rather than having the detector work with this interface.

In other words, if the provider provided getMetadata, getLabels and getLabels methods that each returned a map[string]string, the GCE detector wouldn't need to do any of this nesting and unnesting logic right? That would be contained in the implementation of the getMetadata, getLabels and getLabels methods. Let me know what you think of this approach.

}
}

func TestGettingDetectorOnGCE(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it belongs in the integration_test folder right?

Also you will be able to test a lot more of the fields there since we have the VM information when the integration test sets it up:

type VM struct {
Name string
Project string
Network string
Platform string
Zone string
MachineType string
ID int64
// The IP address to ssh/WinRM to. This is the external IP address, unless
// USE_INTERNAL_IP is set to 'true'. See comment on extractIPAddress() for
// rationale.
IPAddress string
// WindowsCredentials is only populated for Windows VMs.
WindowsCredentials *WindowsCredentials
AlreadyDeleted bool
}

See this as an example of what is used when setting up the environment for an integration test:

ctx, logger, vm := agents.CommonSetup(t, platform)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what the best way is to integration test this. Maybe the easiest way is to have a small go command line runner that hits your interface and dumps the metadata into a json file. The integration test can then read that from the VM and compare it to what it expects. Just a thought, figured its worth sharing. Looking forward to whatever you end up going with.

@LujieDuan LujieDuan added the kokoro:force-run Forces kokoro to run integration tests on a CL label Sep 15, 2022
@stackdriver-instrumentation-release stackdriver-instrumentation-release removed the kokoro:force-run Forces kokoro to run integration tests on a CL label Sep 15, 2022
@LujieDuan LujieDuan changed the title WIP: Fetch meta labels for Prometheus from GCE VM Metadata Server Fetch meta labels for Prometheus from GCE VM Metadata Server Sep 15, 2022
@LujieDuan LujieDuan requested review from a team and ridwanmsharif September 15, 2022 21:24
Copy link
Contributor

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'd get another reviewer however since we were pretty involved in this

}

// GCEDetector implements the Detector interface and provide attributes of the VM when on GCE
type GCEDetector struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] naming feels a bit off to me. GCEDetector.Project sounds like the project of the detector and not the metadata of the VM that is being detected. I'm struggling to come up with a better name/interface that works for both GCE and non-GCE resource detection however. Maybe we can just do the simple thing and rename this and Detector to Resource and the method GetDetector to GetResource. And we can add a comment that in GCE VM land the resource is the VM metadata.

It is a nit pick tho. Feel free to ignore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good idea - I have changed the name to Resource.

gcp_metadata "cloud.google.com/go/compute/metadata"
)

type Detector interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can we add some comments documenting how to use the interface. Its hard to tell from the interface, how it is meant to be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments to what the interface means and how to cast to get an instance of the underlying type such as GCEResource.

}

// Get a Detector for the current environment, which can be used
// to get all the attributes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also add what it needs to be casted to (for example GCEDetector)

Copy link
Contributor

@ridwanmsharif ridwanmsharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wooo LGTM!

@LujieDuan LujieDuan added the kokoro:force-run Forces kokoro to run integration tests on a CL label Sep 16, 2022
@stackdriver-instrumentation-release stackdriver-instrumentation-release removed the kokoro:force-run Forces kokoro to run integration tests on a CL label Sep 16, 2022
@LujieDuan LujieDuan changed the title Fetch meta labels for Prometheus from GCE VM Metadata Server Fetch metadata from GCE VM Metadata Server Sep 16, 2022
@LujieDuan LujieDuan merged commit 6272ea2 into master Sep 19, 2022
@igorpeshansky igorpeshansky deleted the lujieduan-prometheus-metadata branch July 10, 2023 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants