Add Support for IPv6 Multicast #364

Merged
merged 23 commits into from Mar 18, 2014

Conversation

Projects
None yet
5 participants
@g-oikonomou
Contributor

g-oikonomou commented Sep 30, 2013

This Pull Request adds multicast support, with an example and some regression tests. It introduces a NETSTACK-style multicast engine driver, with two such engines already implemented.

In a nutshell, this pull does the following:

  • Adds support for RPL in MOP3 (basically, multicast group membership advertisements in DAOs)
  • Implements the Multicast Forwarding using Trickle (TM) internet draft (now called MPL)
  • Implements an alternative multicast forwarding engine called SMRF (links to its documentation in the README)
  • Adds hooks to the uIP core and build system to bring it all together

I must start with the big gotcha. Currently, we only support multicast traffic originating and destined within the same 6LoWPAN. In other words, traffic can not cross 6LoWPAN boundaries in either direction. In order to support this, we'd need the following additions to border routers or other gateway devices:

  • Capability to add / remove the TM/MPL HBHO for datagrams entering/exiting the 6LoWPAN. This is probably not horribly complicated
  • Support for MLD, so that a BR can communicate group membership to the wider internet. This may take a while...

These are in the ToDo list

I am intentionally calling the Trickle Multicast engine TM and not MPL, since it was implemented when the draft was still at version 1 (i.e. before the name MPL came along). As I've discussed in the mailing list in the past, my personal take on things is that to implement MPL one would benefit by starting from scratch, since the differences are quite significant. During discussions it was agreed that it's better to have support for the old draft soon and implement the new version in due course, rather than have nothing while working on implementing MPL.

SMRF was originally written when we had support for neither the RPL HBHO nor multiple instances. I'm currently working on SMRFv2, which will take advantage of both techniques.

As some of you already know, this implementation has been around for circa 2 years now. I recently modified it to use the new neighbour tables and uip-ds6-route-style forwarding tables.

Looking forward to feedback
Enjoy

@adamdunkels

This comment has been minimized.

Show comment
Hide comment
@adamdunkels

adamdunkels Oct 5, 2013

Member

Wow! Very, very cool and something that will be very useful. I quick question: from looking at the examples, joining a multi-hop multicast group is done in the same way as joining a local group (uip_ds6_maddr_add()), is this correct?

Member

adamdunkels commented Oct 5, 2013

Wow! Very, very cool and something that will be very useful. I quick question: from looking at the examples, joining a multi-hop multicast group is done in the same way as joining a local group (uip_ds6_maddr_add()), is this correct?

@g-oikonomou

This comment has been minimized.

Show comment
Hide comment
@g-oikonomou

g-oikonomou Oct 5, 2013

Contributor

That's right

Contributor

g-oikonomou commented Oct 5, 2013

That's right

@aguirrem

This comment has been minimized.

Show comment
Hide comment

👍

@aguirrem

This comment has been minimized.

Show comment
Hide comment
@aguirrem

aguirrem Oct 25, 2013

@g-oikonomou
George I have been playing around with the TM implementation and I am not sure if I have found a problem with the software or with my understanding of the RFC [1].

With suppression enabled (K=1) and Tactive < consistent timer < Tdwell for all the members of the domain. Should the addition of a new node cause the forwarding of all the messages in the window? or should the fact that Tactive was reached cause the multicast messages to be suppressed?

The behavior I am currently observing is such that the multicast messages get re-broadcast as long as the consistency timer < Tdwell. In my interpretation this is incorrect. Could you confirm?

[1] http://tools.ietf.org/html/draft-ietf-roll-trickle-mcast-01

@g-oikonomou
George I have been playing around with the TM implementation and I am not sure if I have found a problem with the software or with my understanding of the RFC [1].

With suppression enabled (K=1) and Tactive < consistent timer < Tdwell for all the members of the domain. Should the addition of a new node cause the forwarding of all the messages in the window? or should the fact that Tactive was reached cause the multicast messages to be suppressed?

The behavior I am currently observing is such that the multicast messages get re-broadcast as long as the consistency timer < Tdwell. In my interpretation this is incorrect. Could you confirm?

[1] http://tools.ietf.org/html/draft-ietf-roll-trickle-mcast-01

core/net/uip-mcast6/roll-tm.c
+ payload_len += sizeof(struct sequence_list_header);
+ for(locmpptr = &buffered_msgs[ROLL_TM_BUFF_NUM - 1];
+ locmpptr >= buffered_msgs; locmpptr--) {
+ if(MCAST_PACKET_IS_USED(locmpptr)) {

This comment has been minimized.

@g-oikonomou

g-oikonomou Oct 29, 2013

Contributor

This conditional must also check if the cached datagram's age is < Tactive before including it in the sequence list.

Note to self: This is likely as simple as replacing the if block with something like:

if(MCAST_PACKET_IS_USED(locmpptr) &&
  (locmpptr->active < TRICKLE_ACTIVE(&t[(sl->flags & SEQUENCE_LIST_M_BIT) != 0]))) {
@g-oikonomou

g-oikonomou Oct 29, 2013

Contributor

This conditional must also check if the cached datagram's age is < Tactive before including it in the sequence list.

Note to self: This is likely as simple as replacing the if block with something like:

if(MCAST_PACKET_IS_USED(locmpptr) &&
  (locmpptr->active < TRICKLE_ACTIVE(&t[(sl->flags & SEQUENCE_LIST_M_BIT) != 0]))) {
core/net/uip-mcast6/roll-tm.c
+ /* Handle multicast transmissions */
+ if((SUPPRESSION_ENABLED(param) && MCAST_PACKET_MUST_SEND(locmpptr)) ||
+ (SUPPRESSION_DISABLED(param) &&
+ locmpptr->active < TRICKLE_ACTIVE(param))) {

This comment has been minimized.

@g-oikonomou

g-oikonomou Oct 29, 2013

Contributor

This is a conditional written by reading the draft and translating it directly to code.

Section 6.2, has two paragraphs: one starting with "When suppression is enabled..." and one with "When suppression is disabled...".

Tactive is only mentioned in the latter. However, after the discussion below, it makes more sense to check Tactive in both cases, even though the draft isn't explicit about this.

@g-oikonomou

g-oikonomou Oct 29, 2013

Contributor

This is a conditional written by reading the draft and translating it directly to code.

Section 6.2, has two paragraphs: one starting with "When suppression is enabled..." and one with "When suppression is disabled...".

Tactive is only mentioned in the latter. However, after the discussion below, it makes more sense to check Tactive in both cases, even though the draft isn't explicit about this.

core/net/uip-mcast6/roll-tm.c
+ PRINTF("ROLL TM: ICMPv6 In, Check our buffer\n");
+ for(locmpptr = &buffered_msgs[ROLL_TM_BUFF_NUM - 1];
+ locmpptr >= buffered_msgs; locmpptr--) {
+ if(MCAST_PACKET_IS_USED(locmpptr)) {

This comment has been minimized.

@g-oikonomou

g-oikonomou Oct 29, 2013

Contributor

By not comparing this cached datagram's age with Tactive here, we risk flagging as missing from the ICMP message a Seed ID which is only associated with expired datagrams. This results in a situation where expired datagrams alone can raise an inconsistency.

As a side note: loctpptr (used below) needs updated to point to a (possibly different) trickle param at every iteration.

(edit / strike-through: That was nonsense, the variable doesn't need updated)

@g-oikonomou

g-oikonomou Oct 29, 2013

Contributor

By not comparing this cached datagram's age with Tactive here, we risk flagging as missing from the ICMP message a Seed ID which is only associated with expired datagrams. This results in a situation where expired datagrams alone can raise an inconsistency.

As a side note: loctpptr (used below) needs updated to point to a (possibly different) trickle param at every iteration.

(edit / strike-through: That was nonsense, the variable doesn't need updated)

This comment has been minimized.

@aguirrem

aguirrem Nov 20, 2013

I think you were right. loctpptr needs to be updated to point to the right trickle param specifically in the case where an advertisement without a payload is sent received.(which I think it's possible) might not get an old trickle param value or even worse loctpptr could be unitialized no?

@aguirrem

aguirrem Nov 20, 2013

I think you were right. loctpptr needs to be updated to point to the right trickle param specifically in the case where an advertisement without a payload is sent received.(which I think it's possible) might not get an old trickle param value or even worse loctpptr could be unitialized no?

@g-oikonomou

This comment has been minimized.

Show comment
Hide comment
@g-oikonomou

g-oikonomou Oct 29, 2013

Contributor

@aguirrem

Melvin, that's a very very nice catch. This is a combination of what happens in three different locations of the code (see various inline comments).

You made me read the draft again: Even though it 'suppresses', it is unclear to me whether Tactive is part of the mechanism referred to as 'suppression'. My understanding/interpretation of the draft is that 'suppression' only refers to K and that a forwarder may only transmit a datagram with age < Tactive regardless of whether K is infinity or not. See my comment under L623.

So here is what happens in your situation:

  • A forwarder will include 'expired' messages in its Trickle ICMP message sequence list. This is wrong, it should not and the draft is quite explicit about this in section 5.2. (Comment on L841).
  • New node joins the network
  • At some point the new node will send a Trickle ICMP message for one of two reasons:
    • Either: The new node received an ICMP from the old one. This ICMP lists the expired datagram and triggers an inconsistency on the new node.
    • Or: Simply the new node's trickle fired and it sent its scheduled ICMP.
  • In both cases, the new node's Trickle ICMP message will have an empty sequence list. This is essentially an empty message meaning "I have nothing cached".

Upon reception of the new node's Trickle ICMP message, the older node will flag an inconsistency on the expired datagram. This is because its Seed ID will not be listed in the received ICMP (which was empty). This happens under the checks for "we have new" (the block starting in L1256).

In section 6.4. Trickle ICMP Processing, the draft says:

"The receiver has a new multicast message to offer if any buffered messages does not have an associated SeedID entry in the Trickle ICMP message."

In the next round of transmissions, because of what happens near L623, the expired message gets sent. Bingo

Contributor

g-oikonomou commented Oct 29, 2013

@aguirrem

Melvin, that's a very very nice catch. This is a combination of what happens in three different locations of the code (see various inline comments).

You made me read the draft again: Even though it 'suppresses', it is unclear to me whether Tactive is part of the mechanism referred to as 'suppression'. My understanding/interpretation of the draft is that 'suppression' only refers to K and that a forwarder may only transmit a datagram with age < Tactive regardless of whether K is infinity or not. See my comment under L623.

So here is what happens in your situation:

  • A forwarder will include 'expired' messages in its Trickle ICMP message sequence list. This is wrong, it should not and the draft is quite explicit about this in section 5.2. (Comment on L841).
  • New node joins the network
  • At some point the new node will send a Trickle ICMP message for one of two reasons:
    • Either: The new node received an ICMP from the old one. This ICMP lists the expired datagram and triggers an inconsistency on the new node.
    • Or: Simply the new node's trickle fired and it sent its scheduled ICMP.
  • In both cases, the new node's Trickle ICMP message will have an empty sequence list. This is essentially an empty message meaning "I have nothing cached".

Upon reception of the new node's Trickle ICMP message, the older node will flag an inconsistency on the expired datagram. This is because its Seed ID will not be listed in the received ICMP (which was empty). This happens under the checks for "we have new" (the block starting in L1256).

In section 6.4. Trickle ICMP Processing, the draft says:

"The receiver has a new multicast message to offer if any buffered messages does not have an associated SeedID entry in the Trickle ICMP message."

In the next round of transmissions, because of what happens near L623, the expired message gets sent. Bingo

@aguirrem

This comment has been minimized.

Show comment
Hide comment
@aguirrem

aguirrem Nov 4, 2013

@g-oikonomou

George,
I am sorry about having to re-read the standard. I know how entertaining they can be :).

I will try to tackle the changes, test them out and report back.

Thanks for your thorough comments above

Melvin

aguirrem commented Nov 4, 2013

@g-oikonomou

George,
I am sorry about having to re-read the standard. I know how entertaining they can be :).

I will try to tackle the changes, test them out and report back.

Thanks for your thorough comments above

Melvin

core/net/uip-mcast6/roll-tm.c
+ if(SLIDING_WINDOW_GET_M(iterswptr)) {
+ sl->flags |= SEQUENCE_LIST_M_BIT;
+ }
+ sl->seq_len = iterswptr->count;

This comment has been minimized.

@aguirrem

aguirrem Nov 6, 2013

taking into account the fact we now need to take note of the current Tactive timer, we cannot simply assign the count to sl->seq_len. Instead we need to count only the stored messages whose lifetime < Tactive

@aguirrem

aguirrem Nov 6, 2013

taking into account the fact we now need to take note of the current Tactive timer, we cannot simply assign the count to sl->seq_len. Instead we need to count only the stored messages whose lifetime < Tactive

core/net/uip-mcast6/roll-tm.c
+ }
+ PRINTF("\n");
+ payload_len += sl->seq_len * 2;
+ sl = (struct sequence_list_header *)buffer;

This comment has been minimized.

@aguirrem

aguirrem Nov 6, 2013

If the number of buffered messages whose lifetime is less than Tactive:

 payload_len = 0

If this is not done, we end up sending the ICMPv6 message with a header but no payload

@aguirrem

aguirrem Nov 6, 2013

If the number of buffered messages whose lifetime is less than Tactive:

 payload_len = 0

If this is not done, we end up sending the ICMPv6 message with a header but no payload

@aguirrem

This comment has been minimized.

Show comment
Hide comment
@aguirrem

aguirrem Nov 6, 2013

@g-oikonomou
George
I think I got it working. I am going to clean up and then if you are interested I can either send you the code or do a pull request. Your choice.

aguirrem commented Nov 6, 2013

@g-oikonomou
George
I think I got it working. I am going to clean up and then if you are interested I can either send you the code or do a pull request. Your choice.

@g-oikonomou

This comment has been minimized.

Show comment
Hide comment
@g-oikonomou

g-oikonomou Nov 6, 2013

Contributor

Yeah I was playing around too today. Open a pull on top of g-oikonomou/multicast-push if you have made good progress

Contributor

g-oikonomou commented Nov 6, 2013

Yeah I was playing around too today. Open a pull on top of g-oikonomou/multicast-push if you have made good progress

@adamdunkels

This comment has been minimized.

Show comment
Hide comment
@adamdunkels

adamdunkels Dec 2, 2013

Member

What's the status of this patch? Is it ready enough so that we can merge it? Personally, I'd really like to have it in, but I'm not sure what the status is (considering the discussion above).

Member

adamdunkels commented Dec 2, 2013

What's the status of this patch? Is it ready enough so that we can merge it? Personally, I'd really like to have it in, but I'm not sure what the status is (considering the discussion above).

@g-oikonomou

This comment has been minimized.

Show comment
Hide comment
@g-oikonomou

g-oikonomou Dec 2, 2013

Contributor

TM works but it has a couple of non-critical bugs (those discussed above).

It would be great if we could start looking at the integration into the core (uIP hooks, the code that puts RPL in MOP3 and sends multicast DAO advertisements) while I'm fixing those. Now that we have cooja PCAPs back it won't take long at all.

I'll also need to rebase and make sure it didn't break by #454 or #460 (and make necessary adjustments, if any).

Contributor

g-oikonomou commented Dec 2, 2013

TM works but it has a couple of non-critical bugs (those discussed above).

It would be great if we could start looking at the integration into the core (uIP hooks, the code that puts RPL in MOP3 and sends multicast DAO advertisements) while I'm fixing those. Now that we have cooja PCAPs back it won't take long at all.

I'll also need to rebase and make sure it didn't break by #454 or #460 (and make necessary adjustments, if any).

@g-oikonomou

This comment has been minimized.

Show comment
Hide comment
@g-oikonomou

g-oikonomou Dec 3, 2013

Contributor

On second thoughts, this really needs a little rebase before we can consider merging it. Closing and will reopen in the short term.

Contributor

g-oikonomou commented Dec 3, 2013

On second thoughts, this really needs a little rebase before we can consider merging it. Closing and will reopen in the short term.

@g-oikonomou g-oikonomou closed this Dec 3, 2013

@g-oikonomou g-oikonomou referenced this pull request Jan 21, 2014

Open

Contiki 3.x Roadmap #422

@g-oikonomou g-oikonomou referenced this pull request Feb 24, 2014

Closed

implement MLDv1 #572

@g-oikonomou

This comment has been minimized.

Show comment
Hide comment
@g-oikonomou

g-oikonomou Feb 24, 2014

Contributor

I'm re-opening this:

  • Code is now updated to play nicely with the modular build system. I've done some directory restructuring in the process, so the core is now hosted under core/net/ipv6/multicast.
  • Example and Regression simulations have been updated to org.contikios and to reflect the new directory structure.
  • The bugs discussed above with @aguirrem have been fixed. Fixes are in a separate commit each. Thus, we no longer advertise (nor forward) datagrams older than Tactive, we don't send out empty sliding windows.
Contributor

g-oikonomou commented Feb 24, 2014

I'm re-opening this:

  • Code is now updated to play nicely with the modular build system. I've done some directory restructuring in the process, so the core is now hosted under core/net/ipv6/multicast.
  • Example and Regression simulations have been updated to org.contikios and to reflect the new directory structure.
  • The bugs discussed above with @aguirrem have been fixed. Fixes are in a separate commit each. Thus, we no longer advertise (nor forward) datagrams older than Tactive, we don't send out empty sliding windows.

@g-oikonomou g-oikonomou reopened this Feb 24, 2014

@cmorty

This comment has been minimized.

Show comment
Hide comment
@cmorty

cmorty Feb 24, 2014

Contributor

Looks like this needs a rebase against master, or is there a problem with gitub?
Edit: Might as well be, that I can't use the the new github-design yet. :/

Contributor

cmorty commented Feb 24, 2014

Looks like this needs a rebase against master, or is there a problem with gitub?
Edit: Might as well be, that I can't use the the new github-design yet. :/

@g-oikonomou

This comment has been minimized.

Show comment
Hide comment
@g-oikonomou

g-oikonomou Feb 24, 2014

Contributor

Yeah, it's rebased but I don't know why the history display is all messed up, it shows up correctly on g-oikonomou/contiki

Contributor

g-oikonomou commented Feb 24, 2014

Yeah, it's rebased but I don't know why the history display is all messed up, it shows up correctly on g-oikonomou/contiki

@vinithmundhra vinithmundhra referenced this pull request in xcore/sc_xtcp Feb 25, 2014

Merged

add Christian's work on IPv6 #28

@g-oikonomou

This comment has been minimized.

Show comment
Hide comment
@g-oikonomou

g-oikonomou Feb 27, 2014

Contributor

The history cleaned itself up with the last push.

Overall, I'm very happy with this pull's current state and I would therefore kindly request it got brought to the foreground again (shameless ping @nvt , @adamdunkels )

Contributor

g-oikonomou commented Feb 27, 2014

The history cleaned itself up with the last push.

Overall, I'm very happy with this pull's current state and I would therefore kindly request it got brought to the foreground again (shameless ping @nvt , @adamdunkels )

@aguirrem

This comment has been minimized.

Show comment
Hide comment
@aguirrem

aguirrem Feb 27, 2014

👍
I should have some time to play with this next week on a node and on a border router (I think I observed some issues with the pointer math on the border router)
Thanks for the hard work @g-oikonomou !

👍
I should have some time to play with this next week on a node and on a border router (I think I observed some issues with the pointer math on the border router)
Thanks for the hard work @g-oikonomou !

@nvt

This comment has been minimized.

Show comment
Hide comment
@nvt

nvt Mar 3, 2014

Member

@g-oikonomou OK, I will try to find some time to test it and go through the source this week.

Member

nvt commented Mar 3, 2014

@g-oikonomou OK, I will try to find some time to test it and go through the source this week.

@g-oikonomou g-oikonomou referenced this pull request Mar 5, 2014

Closed

implement MLDv1 #574

@nvt

This comment has been minimized.

Show comment
Hide comment
@nvt

nvt Mar 18, 2014

Member

The source code looks good, and is properly #ifdef'ed. The Cooja test seems to work correctly as well, so I think this one is ready to merge after a too long wait. 👍

Member

nvt commented Mar 18, 2014

The source code looks good, and is properly #ifdef'ed. The Cooja test seems to work correctly as well, so I think this one is ready to merge after a too long wait. 👍

@adamdunkels

This comment has been minimized.

Show comment
Hide comment
Member

adamdunkels commented Mar 18, 2014

👍

adamdunkels added a commit that referenced this pull request Mar 18, 2014

@adamdunkels adamdunkels merged commit ce4bb53 into contiki-os:master Mar 18, 2014

1 check passed

default The Travis CI build passed
Details

@g-oikonomou g-oikonomou deleted the g-oikonomou:multicast-push branch Apr 3, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment