New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fetch metadata from GCE VM Metadata Server #837
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks pretty solid. There are a few transformations and nuances here around differences from how the gce detector in prometheus collects things (Compute engine API) and how we do (metadata server). We should add an integration test in addition to the unit tests in this PR to make sure the metadata attributes look like what we expect for the prometheus
use case.
) | ||
|
||
// Keep synced with the Prometheus repo | ||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest keeping this detector generic and leave all the translation to the prometheus style labels to the prometheus
receiver. This detector is useful for other components too. One such example is: b/219991181 | P3 | Allow LogName to be used in processors
then func(GCEDataProvider, string) (string, string, error) | ||
} | ||
|
||
// List of single value labels (non-nested) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
labels
feels like a prometheus specific thing to me. It can get confusing especially since the metadata has its own labels
too. Naming it instead to something like metadata
or attributes
might make it clearer.
gceLabelInstanceID: GCEDataProvider.getInstanceID, | ||
gceLabelInstanceName: GCEDataProvider.getInstanceName, | ||
gceLabelTags: GCEDataProvider.getTags, | ||
gceLabelLabel: GCEDataProvider.getLabel, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a single label or nested? I thought it was a nested one but could be wrong.
// The data provider interface for GCE environment | ||
// Implementation of this provider can use either the metadata server on VM, | ||
// or the cloud API | ||
type GCEDataProvider interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add some comments here around what is available in the detector now and what is not
|
||
type LabelValue string | ||
|
||
type Metalabels map[LabelName]LabelValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The contract we expose should probably be more clear - instead of having map[string]string
it could be a struct that has fields for all the attributes. This way it is part of the public interface what fields are available as part of the detector, instead of maintaining another set of key labels.
// Also add `nic` to the name following the Prometheus's SD convention | ||
func (gmp *GCEMetadataProvider) getInterfaceIPv4(name string) (string, string, error) { | ||
interface_name := strings.TrimSuffix(name, "/") | ||
ipv4, err := gmp.client.Get(fmt.Sprintf("instance/network-interfaces/%s/ip", interface_name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are very interesting mutations we have to do. I'd leave it out of the detector package and add it when the prometheus
receiver uses it. Only it cares about the Prometheus SD's conventions.
5482025
to
3a79286
Compare
|
||
// EmptyDetector that returns an empty detector without any attributes | ||
// for non-recognized environments | ||
type EmptyDetector struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it would be better to call this UnrecognizedPlatformDetector
or something similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for UnrecognizedPlatformDetector
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking very good! Left a couple comments
|
||
// EmptyDetector that returns an empty detector without any attributes | ||
// for non-recognized environments | ||
type EmptyDetector struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for UnrecognizedPlatformDetector
.
|
||
// Nested attribute takes two steps: first to get the list of keys, | ||
// and then use the key to get individual values. | ||
type nestedAttribute struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This struct and interface seems pretty confusing to me. It feels simpler to have the provider do both the fetching of the keys and then aggregate the values rather than having the detector work with this interface.
In other words, if the provider provided getMetadata
, getLabels and getLabels
methods that each returned a map[string]string
, the GCE detector wouldn't need to do any of this nesting and unnesting logic right? That would be contained in the implementation of the getMetadata
, getLabels and getLabels
methods. Let me know what you think of this approach.
} | ||
} | ||
|
||
func TestGettingDetectorOnGCE(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it belongs in the integration_test
folder right?
Also you will be able to test a lot more of the fields there since we have the VM information when the integration test sets it up:
ops-agent/integration_test/gce/gce_testing.go
Lines 267 to 282 in e7aa72d
type VM struct { | |
Name string | |
Project string | |
Network string | |
Platform string | |
Zone string | |
MachineType string | |
ID int64 | |
// The IP address to ssh/WinRM to. This is the external IP address, unless | |
// USE_INTERNAL_IP is set to 'true'. See comment on extractIPAddress() for | |
// rationale. | |
IPAddress string | |
// WindowsCredentials is only populated for Windows VMs. | |
WindowsCredentials *WindowsCredentials | |
AlreadyDeleted bool | |
} |
See this as an example of what is used when setting up the environment for an integration test:
ops-agent/integration_test/ops_agent_test.go
Line 346 in e7aa72d
ctx, logger, vm := agents.CommonSetup(t, platform) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what the best way is to integration test this. Maybe the easiest way is to have a small go command line runner that hits your interface and dumps the metadata into a json file. The integration test can then read that from the VM and compare it to what it expects. Just a thought, figured its worth sharing. Looking forward to whatever you end up going with.
1696c5b
to
939a738
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I'd get another reviewer however since we were pretty involved in this
} | ||
|
||
// GCEDetector implements the Detector interface and provide attributes of the VM when on GCE | ||
type GCEDetector struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] naming feels a bit off to me. GCEDetector.Project
sounds like the project of the detector and not the metadata of the VM that is being detected. I'm struggling to come up with a better name/interface that works for both GCE and non-GCE resource detection however. Maybe we can just do the simple thing and rename this and Detector
to Resource
and the method GetDetector
to GetResource
. And we can add a comment that in GCE VM land the resource is the VM metadata.
It is a nit pick tho. Feel free to ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good idea - I have changed the name to Resource
.
gcp_metadata "cloud.google.com/go/compute/metadata" | ||
) | ||
|
||
type Detector interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also can we add some comments documenting how to use the interface. Its hard to tell from the interface, how it is meant to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments to what the interface means and how to cast to get an instance of the underlying type such as GCEResource.
} | ||
|
||
// Get a Detector for the current environment, which can be used | ||
// to get all the attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also add what it needs to be casted to (for example GCEDetector
)
bd09687
to
b179df4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wooo LGTM!
b179df4
to
c38b059
Compare
Description
This change introduces a new package in confgenerator called
resourcedetector
to detect metadata of the environment Ops Agent is running in. The detected metadata can then be used by receivers/processors. Right now we only have a detector for GCE environment; the detector is not currently used by any components; this will be used by the Prometheus receivers.Related issue
b/244434005
How has this been tested?
This has been tested with unit tests and manual tests on GCE.
Integration tests for this is TBD.
Checklist: