test: fix udp-multicast-join tests #2185

Closed
santigimeno wants to merge 2 commits from fix_udp_multicast


santigimeno
Member

The messages must be actually sent to the multicast address.
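
For illustration only (this is not the PR's code), a minimal sketch of how a test could verify that its destination really is a multicast address before sending. The helper name `is_multicast_destination` and the sample addresses mentioned below are assumptions; the thread does not say which addresses the tests use.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if `text` parses as an IPv4 or IPv6
 * multicast address, 0 otherwise (including when it does not parse). */
int is_multicast_destination(const char *text) {
    struct in_addr a4;
    struct in6_addr a6;

    /* IPv4 multicast is 224.0.0.0/4; IN_MULTICAST expects host byte order. */
    if (inet_pton(AF_INET, text, &a4) == 1)
        return IN_MULTICAST(ntohl(a4.s_addr)) ? 1 : 0;

    /* IPv6 multicast is ff00::/8. */
    if (inet_pton(AF_INET6, text, &a6) == 1)
        return IN6_IS_ADDR_MULTICAST(&a6) ? 1 : 0;

    return 0;
}
```

A datagram sent to a unicast address such as 127.0.0.1 reaches a bound socket whether or not the group join worked, so a check like this (for example on 239.255.0.1 or ff02::1) guards against a test that passes without exercising the join.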

@santigimeno
Member Author

Unfortunately I haven't been able to get the IPv6 tests to pass on smartos, aix, zos and s390x. I need some help with those.

CI: https://ci.nodejs.org/job/libuv-test-commit-linux/1270/

@bnoordhuis
Member

cc @libuv/smartos and @mhdawson for the other three platforms.

@santigimeno santigimeno force-pushed the fix_udp_multicast branch 3 times, most recently from 82d8dae to e6d7aff Compare February 22, 2019 05:04
@santigimeno
Member Author

Just an update: zos is Ok.
The ipv6 test is failing on smartos and aix.
The ipv4 test is failing on linux rhel72-s390x.

CI: https://ci.nodejs.org/view/libuv/job/libuv-test-commit/1242/

cc @libuv/aix

@mhdawson
Contributor

@gireeshpunathil would you be able to take a look at the aix issue?

@mhdawson
Contributor

@miladfarca could you look at the rhel 390 issue? If you need access to the specific machine just let me know.

@santigimeno
Member Author

Thanks @mhdawson

@miladfarca

miladfarca commented Feb 26, 2019

@mhdawson @santigimeno On rhel72-s390x the problem had to do with iptables (the firewall). Multicast packets were being received by eth0 but the OS was rejecting them. Adding the following to /etc/sysconfig/iptables and restarting iptables fixed the issue, and the test now runs fine. This rule can be further customized to narrow the accepted list.

-A INPUT -m pkttype --pkt-type multicast -j ACCEPT
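
To check a host for this kind of drop outside the libuv test suite, something like the following plain POSIX sockets sketch could be used. It is an illustration under stated assumptions, not the libuv test: the function name `multicast_roundtrip`, the port and the group address are hypothetical, and it relies on the default `IP_MULTICAST_LOOP` behaviour so that the sender receives its own datagram.

```c
#define _DEFAULT_SOURCE /* exposes struct ip_mreq on glibc in strict -std modes */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: joins `group`, sends one datagram to it and waits
 * for the copy to come back.
 * Returns 1 if the datagram was received, 0 on timeout, -1 on setup error. */
int multicast_roundtrip(const char *group, unsigned short port, int timeout_ms) {
    static const char msg[] = "PING";
    struct sockaddr_in local, dest;
    struct ip_mreq mreq;
    struct timeval tv;
    char buf[16];
    ssize_t n;
    int on = 1;
    int result = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    /* Bind to the wildcard address so group traffic on `port` is delivered. */
    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&local, sizeof local) != 0)
        goto out;

    /* Join the group on the default interface chosen by the kernel. */
    memset(&mreq, 0, sizeof mreq);
    if (inet_pton(AF_INET, group, &mreq.imr_multiaddr) != 1)
        goto out;
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof mreq) != 0)
        goto out;

    /* Bound the wait so a dropped datagram shows up as a timeout. */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) != 0)
        goto out;

    /* Send to the multicast address itself, not to a unicast address. */
    dest = local;
    dest.sin_addr = mreq.imr_multiaddr;
    if (sendto(fd, msg, 4, 0, (struct sockaddr *)&dest, sizeof dest) != 4)
        goto out;

    n = recv(fd, buf, sizeof buf, 0);
    result = (n == 4 && memcmp(buf, msg, 4) == 0) ? 1 : 0;

out:
    close(fd);
    return result;
}
```

Calling `multicast_roundtrip("239.255.0.1", 45678, 500)` before and after adding the rule gives a rough signal: a return of 0 (timeout) is consistent with, though not proof of, the firewall rejecting the looped-back multicast packet.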

@santigimeno
Member Author

@miladfarca thanks for looking into it!
@mhdawson, @refack would it be possible to add that rule (or a similar one) to the rhel72-s390x box in the CI?

@cjihrig
Contributor

cjihrig commented May 20, 2019

Ping @mhdawson and @refack regarding the last comment ^

@refack
Contributor

refack commented May 20, 2019

We could do that... The question is: is it better to patch the system or the test? AFAIK we make minimal changes to iptables, so our setup should be a good canary for typical LinuxONE setups...

I don't have a good answer. On the one hand we want to cover the code; on the other, we want to be aware of possible regressions/issues with typical systems.

I defer to the IBM team.

@cjihrig
Contributor

cjihrig commented May 20, 2019

> The question is: is it better to patch the system or the test?

IMO, it would be better to change the test if that's possible. Otherwise, we'll probably end up with users opening issues about the test.

@mhdawson
Contributor

@refack I think the ansible config for linuxOne may already be making some IPTable additions in which case this would be a natural extension. @sam-github can you look to see if I'm remembering correctly and if so create a PR to add the additional configuration?

@sam-github
Contributor

See nodejs/build#1808

sam-github added a commit to nodejs/build that referenced this pull request Jun 4, 2019
PR-URL: #1808
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>

See: libuv/libuv#2185 (comment)
@santigimeno
Member Author

I just ran the CI: https://ci.nodejs.org/view/libuv/job/libuv-test-commit/1452/ and it does not look good.

  • rhel72-s390x is still failing the ipv4 test after multicast was apparently allowed on the CI bots.
  • smartos and aix are still failing the ipv6 test. I spent some time trying to make it pass on smartos but could not figure it out.
  • fedora-last-latest-x64: both tests fail. This one is new, so I don't really know 🤷‍♂️. I'll try to look into it.

/cc @libuv/aix @libuv/smartos @mhdawson @miladfarca

> IMO, it would be better to change the test if that's possible

The tests themselves are quite simple. TBH I don't know what I could change. Maybe disabling them on the platforms where we can't make them pass is an option 🤷‍♂️.

@bnoordhuis
Member

bnoordhuis commented Jul 7, 2019

I'll take a look at AIX later this week. @cjihrig Perhaps you can do smartos?

@miladfarca

I just ran the test manually on rhel72-s390x-1 and it passed; iptables is configured correctly there. I don't seem to have access to rhel72-s390x-2. Can my key be added to that machine, or could you check whether #2185 (comment) is applied on it? My public key, if needed:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQZihjEXcY52UZo09CEb29HwOWwXcmwbwZFR4rsACQQyGUexL4fkVrFfwuG8eV1vg5KECsO8LiGY/MHkNIpABaJJoip0Qxgv0pAFtAukIDjLXXOV/VNJjfIto16vOAehRZkmI+BtQP8TjoT2CSyJgvVQcay8BhH52in1LQQsyCi2crHLYzDrrCgY/rAmuVb1MzMnT8mFOdJ8E5RBhjnmc1K4YBKmNTf6yefgbOJssI0lLzp7Q2uytzp3pipg7AO/VqmRn8953UTJS/cOQeBi3nCYGpz4I7kOKHgwbdW1IP/XFfm0KO5daulHQeToRGIE85ntxF314wsYE3ZyeJwKgH

@mhdawson
Contributor

mhdawson commented Jul 8, 2019

@sam-github can you add @miladfarca's key?

@sam-github
Contributor

Key added to test-linuxonecc-rhel72-s390x-1

@miladfarca

@sam-github I need access to rhel72-s390x-2, if it's ok; I already had access to -1. Thanks again.

@sam-github
Contributor

@miladfarca test-linuxonecc-rhel72-s390x-2 has your keys now

@miladfarca

Thanks @sam-github, the issue is solved and the tests are passing on both of our rhel72-s390x machines now.

@sam-github
Contributor

Great, thanks. I can't close this, maybe someone else here can.

@mhdawson
Contributor

mhdawson commented Jul 9, 2019

@miladfarca what needed to be updated? Just want to confirm that it is covered by the current Ansible scripts.

@miladfarca

miladfarca commented Jul 9, 2019

@mhdawson /etc/sysconfig/iptables had to be updated and this rule was added:

-A INPUT -m pkttype --pkt-type multicast -j ACCEPT

Seems like Sam had opened a card for this: nodejs/build#1808

@cjihrig
Contributor

cjihrig commented Jul 9, 2019

@santigimeno I think the IPv6 test might need some additional skipping logic. At least on SmartOS, I think the test requires at least one interface that is UP, MULTICAST, IPv6, not LOOPBACK, and not POINTOPOINT. By passing the can_ipv6() check in the test, we have established UP and IPv6. We could add a check for is_internal == 0 as well to cover the not LOOPBACK case and that would probably be adequate. I don't think the libuv public API exposes the MULTICAST or POINTOPOINT flags. We could work around that in libuv by calling getifaddrs() directly, but it could present an issue in projects like Node if the same logic is necessary.

TL;DR - I think if we can get the IPv6 test passing by just adding an is_internal check, then let's do that. If it still fails, I'd be OK with just skipping it completely on SmartOS.
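
As a rough sketch of the stricter check described above, here is what calling getifaddrs() directly could look like. The function names are hypothetical and this is not libuv code; it only shows the flag test (UP, MULTICAST, not LOOPBACK, not POINTOPOINT) applied to IPv6 entries.

```c
#define _DEFAULT_SOURCE /* exposes getifaddrs() and IFF_* on glibc in strict -std modes */
#include <ifaddrs.h>
#include <net/if.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical predicate: 1 if the interface flags describe an interface
 * that is UP and MULTICAST and is neither LOOPBACK nor POINTOPOINT. */
int flags_ok_for_external_mcast(unsigned int flags) {
    unsigned int required = IFF_UP | IFF_MULTICAST;
    unsigned int forbidden = IFF_LOOPBACK | IFF_POINTOPOINT;

    return (flags & required) == required && (flags & forbidden) == 0;
}

/* Hypothetical helper: 1 if at least one IPv6 address is configured on an
 * interface that satisfies the predicate above, 0 otherwise. */
int has_external_mcast_ipv6_iface(void) {
    struct ifaddrs *list = NULL;
    struct ifaddrs *it;
    int found = 0;

    if (getifaddrs(&list) != 0)
        return 0;

    for (it = list; it != NULL; it = it->ifa_next) {
        if (it->ifa_addr == NULL)
            continue; /* some entries carry no address */
        if (it->ifa_addr->sa_family != AF_INET6)
            continue;
        if (flags_ok_for_external_mcast(it->ifa_flags)) {
            found = 1;
            break;
        }
    }

    freeifaddrs(list);
    return found;
}
```

A test could skip itself when `has_external_mcast_ipv6_iface()` returns 0. The lighter is_internal check only covers the not-LOOPBACK part of this, which is the trade-off discussed above.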

@santigimeno
Member Author

santigimeno commented Jul 11, 2019

@cjihrig Thanks for the info! I've been able to confirm your assessment by adding a MAC-derived IPv6 address to my instance as described here, and now the test passes. I'll see about adding the is_internal check, but I think it would also be nice to try adding a non-LOOPBACK IPv6 address to the smartos CI machines.

@santigimeno
Member Author

I've been looking into the failures on the fedora-last-latest-x64 bot, and the problem seems to be that firewalld is installed and running with a configuration that doesn't allow any external incoming traffic apart from ssh, mdns and dhcp. The multicast tests fail because, although they run locally, they use the 'external' network interface and not the loopback.

To make it work I've had to make the following changes to the firewalld configuration:

firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 0 -m pkttype --pkt-type multicast -j ACCEPT
firewall-cmd --permanent --direct --add-rule ipv6 filter INPUT 0 -m pkttype --pkt-type multicast -j ACCEPT

Also we should disable the IPv6_rpfilter configuration in /etc/firewalld/firewalld.conf as it's dropping the ipv6 datagrams.

Another option might be not using firewalld at all; the other bots don't seem to use it.

Any thoughts on whether these changes could be applied? /cc @mhdawson @sam-github @refack

@mhdawson
Contributor

@santigimeno the firewall-cmd commands, or the commands to disable firewalld altogether, should be added to the ansible templates so that they run when fedora machines are configured. Look for "- name: Firewall" in https://github.com/nodejs/build/blob/master/ansible/roles/jenkins-worker/tasks/main.yml as an example of what was done on rhel72-s390x. It might even be that simply expanding the condition so that those tasks run on fedora as well solves the problem, but you'd have to review them to see.

The messages must be actually sent to the multicast address.
@santigimeno
Member Author

Added can_ipv6_external() helper method as suggested in #2185 (comment).

CI: https://ci.nodejs.org/view/libuv/job/libuv-test-commit/1529/

@santigimeno
Member Author

The CI looks pretty good now. There's only the issue with centos7-64 that should be solved once nodejs/build#1879 lands. So I'd say the PR is finally ready.

@santigimeno
Member Author

@bnoordhuis after this patch, which tried to fix the smartos failures as suggested by Colin, the aix bots are also happy. Is this ok from the aix pov?

santigimeno added a commit that referenced this pull request Aug 19, 2019
The messages must be actually sent to the multicast address.

PR-URL: #2185
Reviewed-By: Saúl Ibarra Corretgé <saghul@gmail.com>
@santigimeno
Member Author

Landed in b571851. Thanks!
