-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allowed ingestors to be disabled and improved resolving of listener address of those #72
Conversation
@itzg I'm confused by that comment. The code makes it look like 127.0.0.1 is always used no matter what host is specified. is that correct? if so, why allow a host at all? how is 0.0.0.0 special? |
Yeah, this was an awkward scenario to explain since there's the ingestor and then the agent perspective. I'll try to further explain with a hypothetical scenario: Let's say a user wants to send application metrics from another server (or logs in the case of filebeat/lumberjack) over to Envoy because they're already in the right format and the user wants to leverage our Envoy->Ambassador->Kafka infrastructure, but with custom metrics. (This is a use case that RBA will probably exercise for their apps.) In that case, the ingestor would need to be told to bind to At the same time, the Envoy is launching its managed instance of telegraf and it is always on the same host. Therefore Unfortunately there's still an edge case that cancels the "consistent address" statement. If the user configures the ingestor to bind to a public IP address of the host, then |
config/listeners.go
Outdated
|
||
func RegisterListenerAddress(name string, address string) { | ||
// The given address is relative to the binding, so it might contain the special "all interfaces" | ||
// address of 0.0.0.0. As such, we need to pick apart and rebuild an address suitable for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably add a comment here, something like:
This will allow a server without an envoy to send it's metrics to a
different servers envoy/ingestor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i added a suggested comment, but it looks good to me.
actually, i do have one other question about the shared ingestor scenario. since they are coming through the an ingestor whose envoy has a different resource id, is it a problem that we don't have the correct resource id associated with those metrics? |
Good question. I was going to say the other system could pre-populate resourceId, but currently the Ambassador removes the resourceId tag here A small tweak in the latter could fall back to retaining the originally provided resourceId rather than an empty labels map. |
This is a similar concern I have (prior to reading the PR) and is why I was going to say I prefer requiring them to install and envoy (with a proxy if required) might be a better solution... but maybe a "small tweak" does solve that. |
I spoke too soon. As I was experimenting with the code change I realized further down it promotes the resourceId label that was extracted from the metric's tags to the device field of the external metric: The warning log and metric are misleading, but the code all works as-is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure about allowing binding to "all interfaces" by specifying 0.0.0.0 as the ingestor bind address, yet converting to 127.0.0.1 when the agent runner is looking up what to use for agent configs
but the rest looks good.
If this is blocking you I'm ok for a merge to be discussed later, otherwise it might be worth us chatting about this more.
@@ -86,7 +82,7 @@ func (fbr *FilebeatRunner) PostInstall() error { | |||
return nil | |||
} | |||
|
|||
func (fbr *FilebeatRunner) EnsureRunningState(ctx context.Context, applyConfigs bool) { | |||
func (fbr *FilebeatRunner) EnsureRunningState(ctx context.Context, _ bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the normal way to deprecate a field vs. updating everything that calls it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The telegraf runner actually makes use of that parameter
salus-telemetry-envoy/agents/telegraf.go
Line 203 in 7f74772
func (tr *TelegrafRunner) EnsureRunningState(ctx context.Context, applyConfigs bool) { |
and yes, that's the standard way IntelliJ recommends fixing the warning
@@ -241,6 +232,6 @@ func (fbr *FilebeatRunner) exePath() string { | |||
return filepath.Join(currentVerLink, binSubpath, "filebeat") | |||
} | |||
|
|||
func (fbr *FilebeatRunner) ProcessTestMonitor(correlationId string, content string, timeout time.Duration) (*telemetry_edge.TestMonitorResults, error) { | |||
func (fbr *FilebeatRunner) ProcessTestMonitor(string, string, time.Duration) (*telemetry_edge.TestMonitorResults, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just a standardization thing since the fields weren't being used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I could have also done ProcessTestMonitor(_ string, _ string, _ time.Duration)
but Go allows the special case of all unnamed parameters. I can change to underscores for consistency. I just wanted to fix warnings while I was in this file.
In f7146c9 I realized I could both clarify the binding-to-dial-address conversion and solve external IP address binding by only swapping out |
Should that be updated? |
Looking at that block of code it seems good. What's misleading? |
config/listeners.go
Outdated
// The given address is relative to the binding, so if an ingestor has been configured to | ||
// listen on all interfaces with 0.0.0.0, such as for accepting telemetry from | ||
// non-envoy-running hosts, then the agent needs to be told the loopback address to dial. | ||
listenerAddresses.Store(name, strings.Replace(address, "0.0.0.0", "127.0.0.1", 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be changed to a more specific regex using ^
& $
? I'd imagine at worst right now all it's doing it converting an already bad address to an equally as bad one e.g. 10.0.0.0 -> 1127.0.0.1, so it maybe makes no different, but if it's easy to do a closer match I think I'd prefer that.
Happy to go with whatever you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matching on ^0.0.0.0:
could be good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'll use the net.SplitHostPort
and only replace a host that exactly matches "0.0.0.0".
Nothing :) , now that I realized users could create resources via the API to supply the label processing there in the metrics routing. |
Resolves
https://jira.rax.io/browse/SALUS-748 and prepares for new line protocol ingestor to be added for https://jira.rax.io/browse/SALUS-594
What
We found on some systems where logstash was running that the lumberjack binding would fail since the equivalent logstash plugin was already bound at the standard port. That made me realize that we should ensure a few variations of ingest config to be allowed:
0.0.0.0
as the ingestor bind address, yet converting to127.0.0.1
when the agent runner is looking up what to use for agent configsAll of this also gets ready for me to add a line protocol ingest type to be used in conjunction with racker/salus-packages-agent#1
How
Most of the code changes are code cleanup to ensure the Ingestor Bind and Start method are consistently doing just what they were intended to be doing. With that the agent lookup of registered listener address could be consistently handled.
How to test
Existing unit tests