Missing pim-join-prune in 3 router setup #13

Closed
T-X opened this issue May 15, 2019 · 5 comments
@T-X
Contributor

T-X commented May 15, 2019

Advancing to more interesting setups now, here is an issue with a three-router setup, with a client subnet attached to each router:

[Image: pim6sd-3-router]

Setup:

In this setup router0, router1 and router2 are running pim6sd. router1 is configured as the bootstrap router and router2 as the rendezvous point (RP) for ff13:23:42:ffff::/64. There is a multicast listener for ff13:23:42:ffff::123 on client0 and on client1, and client2 continuously sends ICMPv6 messages to ff13:23:42:ffff::123.

Issue:

While router1 sends pim-join-prune messages to router2, our RP, just fine, router0 does not.

router0 receives seemingly correct bootstrap messages with the RP information:

router0$ tcpdump -i wan0 -n -v -l
tcpdump: listening on wan0, link-type EN10MB (Ethernet), capture size 262144 bytes
18:09:32.921513 IP6 (flowlabel 0x5b502, hlim 1, next-header PIM (103) payload length: 72) fe80::11:22ff:fe00:2 > ff02::d: PIMv2, length 72
        Bootstrap, cksum 0x0f2a (correct) tag=cea5 hashmlen=126 BSRprio=0 BSR=fd5c:725:2841::2 (group0: ff13:23:42:ffff::/64 RPcnt=1 FRPcnt=1 RP0=fd5c:725:2841:1::3,holdtime=2m30s,prio=0)

But it does not seem to be able to set the "upstream" in its RP-Set:

router0$ pim6stat
[...]
---------------------------RP-Set----------------------------
Current BSR address: fd5c:725:2841::2 Prio: 0 Timeout: 140
RP-address(Upstream)/Group prefix             Prio Hold Age
fd5c:725:2841:1::3(none)
     ff13:23:42:ffff::/64                     0    150  130

router0 is able to ping both the bootstrap router and the listed RP just fine (so this does not seem to be an issue with unicast routes):

router0$ ping6 -c3 fd5c:725:2841::2 
PING fd5c:725:2841::2(fd5c:725:2841::2) 56 data bytes
64 bytes from fd5c:725:2841::2: icmp_seq=1 ttl=64 time=0.078 ms
64 bytes from fd5c:725:2841::2: icmp_seq=2 ttl=64 time=0.099 ms
64 bytes from fd5c:725:2841::2: icmp_seq=3 ttl=64 time=0.030 ms

--- fd5c:725:2841::2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 48ms
rtt min/avg/max/mdev = 0.030/0.069/0.099/0.028 ms
router0$ ping6 -c3 fd5c:725:2841:1::3
PING fd5c:725:2841:1::3(fd5c:725:2841:1::3) 56 data bytes
64 bytes from fd5c:725:2841:1::3: icmp_seq=1 ttl=63 time=0.149 ms
64 bytes from fd5c:725:2841:1::3: icmp_seq=2 ttl=63 time=0.050 ms
64 bytes from fd5c:725:2841:1::3: icmp_seq=3 ttl=63 time=0.137 ms

--- fd5c:725:2841:1::3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 54ms
rtt min/avg/max/mdev = 0.050/0.112/0.149/0.044 ms

Reproducer and pim6stat:

https://gist.github.com/T-X/b585768892bda7d652406c71365a2b27

@troglobit
Owner

Awesome bug report, very much appreciate your attention to detail! I'll have a look at it on Friday evening at the latest, I'm a bit swamped right now with personal matters.

@T-X
Contributor Author

T-X commented May 15, 2019

Hm, I think this debug output hints at the issue:

[...]
23:19:07.988 NETLINK: ask path to fd5c:725:2841:1::3
23:19:07.988 NETLINK: vif 0, ifindex=5
23:19:07.988 NETLINK: gateway is fd5c:725:2841::2
23:19:07.988 For src fd5c:725:2841:1::3, iif is wan0, next hop router is fd5c:725:2841::2%5: NOT A PIM ROUTER
---------------------------RP-Set----------------------------
Current BSR address: fd5c:725:2841::2 Prio: 0 Timeout: 145
RP-address(Upstream)/Group prefix             Prio Hold Age
fd5c:725:2841:1::3(none)
     ff13:23:42:ffff::/64                     0    150  135
[...]

The route towards the RP on router0 is this one, with the BSR as the gateway:

fd5c:725:2841:1::/64 via fd5c:725:2841::2 dev wan0 metric 1024 pref medium

Therefore the kernel reports fd5c:725:2841::2%5 as the router and not the link-local one used in the IPv6 source address in PIM messages.

Since that is not the neighbor's primary PIM address (the link-local one), set_incoming() in src/route.c falls back to checking the neighbor's auxiliary addresses in a loop (n->aux_addrs).

The issue here is that the kernel reported a router address with an interface index (neighbor_addr->sin6_scope_id) of 5. However the addresses in n->aux_addrs are all stored with a scope id of 0. Therefore the inet6_equal(&neighbor_addr, &pa->pa_addr) in set_incoming() does not match.
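To make the mismatch concrete, here is a minimal, self-contained sketch. It is not taken from the pim6sd sources: `make_addr()` and `strict_equal()` are hypothetical names, and `strict_equal()` only assumes that inet6_equal() compares both the address bits and sin6_scope_id, which is what the behaviour described above suggests.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: build a sockaddr_in6 from a text address and a scope id. */
static struct sockaddr_in6 make_addr(const char *text, unsigned int scope_id)
{
    struct sockaddr_in6 sa;
    int rc;

    memset(&sa, 0, sizeof(sa));
    sa.sin6_family = AF_INET6;
    sa.sin6_scope_id = scope_id;
    rc = inet_pton(AF_INET6, text, &sa.sin6_addr);
    assert(rc == 1); /* the literal must be a valid IPv6 address */
    (void)rc;
    return sa;
}

/*
 * Assumed stand-in for inet6_equal(): two addresses match only if
 * both the 128 address bits and the scope id are identical.
 */
static int strict_equal(const struct sockaddr_in6 *a,
                        const struct sockaddr_in6 *b)
{
    return a->sin6_scope_id == b->sin6_scope_id &&
           memcmp(&a->sin6_addr, &b->sin6_addr, sizeof(a->sin6_addr)) == 0;
}
```

With this comparison, `make_addr("fd5c:725:2841::2", 5)` (what the kernel route lookup reports) and `make_addr("fd5c:725:2841::2", 0)` (how the aux address is stored) are not equal, even though the address bits are the same.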


I'm not quite sure how to fix this. There could be several ways:

A) Set the scope id of the neighbor_addr returned by the kernel route request to 0 before calling inet6_equal() on the aux_addr (but only before the aux_addr check, as the primary one, n->address, is stored with a sin6_scope_id...).

B) Store the sin6_scope_id in the addresses of aux_addr, too (but check what other places, especially the ones calling inet6_equal(), expect).

C) Avoid checking the sin6_scope_id in inet6_equal() if it is 0 for one of the provided addresses.

D) Provide the (link-local) IPv6 source address of the PIM bootstrap message, including the sin6_scope_id, to set_incoming() instead of the BSR address.
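For illustration, option C could look roughly like the sketch below. `lenient_equal()` is a hypothetical name, not a pim6sd function, and treating a scope id of 0 as "unspecified" is an assumption made for this example only.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical variant of an address comparison for option C:
 * a scope id of 0 on either side is treated as "unspecified" and
 * matches any scope; otherwise the scope ids must be identical.
 */
static int lenient_equal(const struct sockaddr_in6 *a,
                         const struct sockaddr_in6 *b)
{
    assert(a != NULL && b != NULL);

    /* The 128 address bits must always match. */
    if (memcmp(&a->sin6_addr, &b->sin6_addr, sizeof(a->sin6_addr)) != 0)
        return 0;

    /* Scope 0 on either side acts as a wildcard. */
    if (a->sin6_scope_id == 0 || b->sin6_scope_id == 0)
        return 1;

    return a->sin6_scope_id == b->sin6_scope_id;
}
```

The trade-off is that such a wildcard would apply to every caller of the comparison, so two link-local addresses on different interfaces could compare equal whenever one of them lacks a scope id.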

@T-X
Contributor Author

T-X commented May 15, 2019

Ok, as far as I can tell, aux_addr is only ever used in set_incoming() for precisely this check. So it should be safe to just set the correct sin6_scope_id in parse_pim6_hello(), which is where the objects for aux_addr are malloc'd and initialized.
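The idea, sketched under assumptions: `struct aux_entry` and `aux_entry_init()` below are simplified, hypothetical stand-ins and not the actual pim6sd structures or the eventual patch. The point is only that the interface index the hello arrived on gets stored as the scope id of each auxiliary address.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one auxiliary-address list entry. */
struct aux_entry {
    struct sockaddr_in6 addr;
};

/*
 * Initialize an entry from a raw address carried in a hello message,
 * tagging it with the index of the interface the hello was received on
 * so that it can later match a scoped address from a route lookup.
 */
static void aux_entry_init(struct aux_entry *e,
                           const struct in6_addr *raw,
                           unsigned int ifindex)
{
    assert(e != NULL && raw != NULL);

    memset(e, 0, sizeof(*e));
    e->addr.sin6_family = AF_INET6;
    e->addr.sin6_addr = *raw;
    e->addr.sin6_scope_id = ifindex; /* previously left at 0 */
}
```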

I'm going to try a patch.

@troglobit
Owner

Just to double check, PR #15 fixes this issue, right?

@T-X
Contributor Author

T-X commented May 17, 2019

Yes, it does, thanks :-).

T-X closed this as completed May 17, 2019