New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LOG-4589: vector fails to start on a node with IPv6 disabled #2200
Conversation
@syedriko: This pull request references LOG-4589 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.8.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/test e2e-ocp-target-minus-one |
1 similar comment
/test e2e-ocp-target-minus-one |
Please rebase this to 5.7 and we'll pick it forward |
Here you go: #2201 |
var PrometheusExporterAddress string | ||
|
||
func init() { | ||
if fd, err := unix.Socket(unix.AF_INET6, unix.SOCK_STREAM, unix.IPPROTO_IP); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having this run during init
of the package bothers me a bit, because it makes it intransparent that this code is running and host-configuration is affecting the results. I think we're also testing for the output generation, which would now fail on non-IPv6 enabled hosts (because the output would have the IPv4 address in it instead of the IPv6 one we expect).
I have two ideas for this:
- Does vector maybe support leaving out the address completely (just provide
:24231
)? Then we could avoid this whole code completely. - Can we put this code somewhere "more explicit" and pass the result to the generator? That way we could reliably set "IPv6-enabled/disabled in our tests" and an explicit function call is probably also less surprising when debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Nope
$ grep address ~/syslog/vector_ip.toml | grep -v '^#'
address = ":24231"
$ target/debug/vector -c ~/syslog/vector_ip.toml
...
2023-10-09T13:59:02.169084Z INFO vector::app: Loading configs. paths=["/home/syedriko/syslog/vector_ip.toml"]
2023-10-09T13:59:02.170997Z ERROR vector::cli: Configuration error. error=invalid socket address syntax
in `sinks.prometheus_output`
$
- ...would now fail on non-IPv6 enabled hosts
We don't have any, not even in QA, otherwise we would've caught this issue earlier. All the CI tests pass.
I like this approach exactly because it's unavoidable and unconditional. We can tell an IPv4-only host if we ever have it automated, in the tests and expect the correct config gen output accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@syedriko Yes, this is nothing that "will happen immediately". Having the code in init()
that changes the value of the variable before it is used the first might just be surprising if you look on the code and don't know about this yet (or have forgotten it in a few months), so this is a construct I'd try to avoid.
But it's not wrong, so consider my feedback on this optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PrometheusExporterAddress used to be a const, what I'm doing is the best approximation to a const I could come up with.
@syedriko: This pull request references LOG-4589 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.8.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/hold |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jcantrill, syedriko The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
||
func init() { | ||
if fd, err := unix.Socket(unix.AF_INET6, unix.SOCK_STREAM, unix.IPPROTO_IP); err != nil { | ||
PrometheusExporterAddress = `0.0.0.0` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tried using an empty hostname `":12345" rather than an IP literal? The empty hostname is supposed to listen on all local addresses for ipv6 or ipv4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep ^^^ #2200 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh. I have long hated getaddrinfo in C, and now I hate the equivalent in Rust as well. Is a simple "listen to everything" so much to ask?
Have you seen https://issues.redhat.com/browse/LOG-4608 ? I think we have to solve and test both together. QE may have test clusters now, and I guess we can force the ipv settings on a CRC or SNO cluster to test all scenarios.
/hold cancel |
/lgtm |
/hold |
/lgtm |
|
||
func init() { | ||
if fd, err := unix.Socket(unix.AF_INET6, unix.SOCK_STREAM, unix.IPPROTO_IP); err != nil { | ||
PrometheusExporterAddress = `0.0.0.0` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh. I have long hated getaddrinfo in C, and now I hate the equivalent in Rust as well. Is a simple "listen to everything" so much to ask?
Have you seen https://issues.redhat.com/browse/LOG-4608 ? I think we have to solve and test both together. QE may have test clusters now, and I guess we can force the ipv settings on a CRC or SNO cluster to test all scenarios.
/unhold |
@syedriko: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Description
On an IPv4-only cluster, vector fails to start because its metrics endpoint attempts to create an IPv6 socket. This PR adds a an IPv6 socket creation check and selects the right listen-to address.
/cc @cahartma
/assign @jcantrill
Links