Skip to content

Commit

Permalink
Fixes/Workarounds for fragmentation support
Browse files Browse the repository at this point in the history
Fixes for #13

- modify txTimeout value
- detect 0 payload length due to corrupt dynamic payload
- detect and clear un-finished fragmented payloads
- detect and clear frames that exceed MAX_PAYLOAD_SIZE (due to duplicate
payloads, etc)
- add auto-adjusted delay after writes to account for timing issues
- adjust includes in examples to prevent compile errors
  • Loading branch information
TMRh20 committed Sep 10, 2014
1 parent 68aee19 commit e34a394
Show file tree
Hide file tree
Showing 5 changed files with 46 additions and 31 deletions.
50 changes: 33 additions & 17 deletions RPi/RF24Network/RF24Network.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -58,8 +58,8 @@ void RF24Network::begin(uint8_t _channel, uint16_t _node_address ) {
//uint8_t retryVar = (node_address % 7) + 5;
uint8_t retryVar = (((node_address % 6)+1) *2) + 3;
radio.setRetries(retryVar, 5);
txTimeout = 20;
routeTimeout = txTimeout+35;
txTimeout = 25;
routeTimeout = txTimeout*9;

// Setup our address helper cache
setup_address();
Expand Down Expand Up @@ -97,8 +97,9 @@ uint8_t RF24Network::update(void)

//while (radio.available())
//{
// Fetch the payload, and see if this was the last one.
// Fetch the payload, and see if this was the last one.
size_t len = radio.getDynamicPayloadSize();
if(len == 0){ /*printf("bad payload dropped\n");*/continue; }

This comment has been minimized.

Copy link
@reixd

reixd Sep 10, 2014

Contributor

I think having payload of size 0 is not a bad thing.
This could be used e.g. to send RF24NetworkFrame beacons: node 01 broadcast a heartbeat beacon to the network.
Useful for dynamic routing and addressing ;)
Having the possibility of a 0 length payload may save also bandwidth while during RF24Network management operations.
I would say, that the application on top of RF24Network should check if a 0 size payload is an error or not.

This comment has been minimized.

Copy link
@TMRh20

TMRh20 Sep 10, 2014

Author Member

The return value of 0 is related to a bug with the RF24 radios when using dynamic payloads. If the radio reports a payload size > 32 , then the RF24 layer returns 0, and flushes the corrupt payload. I believe it is caused by RF transmission errors.
In any case, it is not possible to send a payload with 0 bytes, so if it returns 0, a corrupt payload was detected and flushed by the core RF24 radio driver.
It introduces a bit of an odd issue when using dynamic payloads, in that the radio may think it has received a payload via available(), but it is flushed automatically when checking using getDynamicPayloadSize() if corrupted. This workaround likely will not fully or best address the issue.

*Edit: It looks like a delay(2); is needed here also, with the current functionality

radio.read( frame_buffer, len );

//Do we have a valid length for a frame?
Expand All @@ -112,8 +113,8 @@ uint8_t RF24Network::update(void)

IF_SERIAL_DEBUG(printf_P("%u: MAC Received on %u %s\n\r",millis(),pipe_num,header.toString()));
if (len) {
IF_SERIAL_DEBUG_FRAGMENTATION(printf("%u: MAC frame size %i",millis(),len););
IF_SERIAL_DEBUG_FRAGMENTATION(printf("%u: MAC frame ",millis()); const char* charPtr = reinterpret_cast<const char*>(frame_buffer); for (size_t i = 0; i < len; i++) { printf("%02X ", charPtr[i]); }; printf("\n\r"));
IF_SERIAL_DEBUG_FRAGMENTATION(printf("%u: Rcv MAC frame size %i\n",millis(),len););
IF_SERIAL_DEBUG_FRAGMENTATION(printf("%u: Rcv MAC frame ",millis()); const char* charPtr = reinterpret_cast<const char*>(frame_buffer); for (size_t i = 0; i < len; i++) { printf("%02X ", charPtr[i]); }; printf("\n\r"));
}

// Throw it away if it's not a valid address
Expand Down Expand Up @@ -199,16 +200,16 @@ bool RF24Network::enqueue(RF24NetworkFrame frame) {
appendFragmentToFrame(frame);
result = true;

} else if (frame.header.fragment_id == 1 && frame.header.type == NETWORK_LAST_FRAGMENT) {
} else if (frame.header.fragment_id == 1 && frame.header.type == NETWORK_LAST_FRAGMENT) {
//Set the last fragment flag to indicate the last fragment
IF_SERIAL_DEBUG_FRAGMENTATION(printf("%u: MAC Last fragment with size %i Bytes and fragmentID '%i' received.\n\r",millis(),frame.message_size,frame.header.fragment_id););

//Append payload
appendFragmentToFrame(frame);

appendFragmentToFrame(frame);
IF_SERIAL_DEBUG(printf_P(PSTR("%u: NET Enqueue assembled frame @%x "),millis(),frame_queue.size()));
//Push the assembled frame in the frame_queue and remove it from cache
frame_queue.push( frameFragmentsCache[ std::make_pair(frame.header.from_node,frame.header.id) ] );
frame_queue.push( frameFragmentsCache[ std::make_pair(frame.header.from_node,frame.header.id) ] );
frameFragmentsCache.erase( std::make_pair(frame.header.from_node,frame.header.id) );

result = true;
Expand All @@ -235,19 +236,34 @@ bool RF24Network::enqueue(RF24NetworkFrame frame) {
void RF24Network::appendFragmentToFrame(RF24NetworkFrame frame) {

if (frameFragmentsCache.count(std::make_pair(frame.header.from_node,frame.header.id)) == 0 ) {

//If there is an un-finished fragment:
for (std::map<std::pair<uint16_t, uint16_t>, RF24NetworkFrame>::iterator it=frameFragmentsCache.begin(); it!=frameFragmentsCache.end(); ++it){
if( it->first.first == frame.header.from_node){

This comment has been minimized.

Copy link
@reixd

reixd Sep 10, 2014

Contributor

This code would erase all collected fragments sended for example by 0010 even if they are from different data streams (different frame.header.id).
It would be better to check for (frame.header.from_node,frame.header.id).
Having lost fragments in the frameFragmentsCache could be solved by using an LRUCache, this cache would self clean after a timeout, avoiding orphaned fragments in the cache.
This point is in my TODO list.

This comment has been minimized.

Copy link
@TMRh20

TMRh20 Sep 10, 2014

Author Member

Agreed, this is a crappy workaround, and checking for (frame.header.from_node,frame.header.id-1) works fairly well, but still results in the cache filling up. Drawbacks either way, but my testing is just using SCP to transfer a file, and monitoring the transfer rates, so did not notice that.

frameFragmentsCache.erase( it );
//printf("Map Size: %d\n",frameFragmentsCache.size());
break;
}
}

//This is the first of many fragments
frameFragmentsCache[ std::make_pair(frame.header.from_node,frame.header.id) ] = frame;

} else {
//We have at least received one fragments.

//Append payload
RF24NetworkFrame *f = &(frameFragmentsCache[ std::make_pair(frame.header.from_node,frame.header.id) ]);

if(frame.message_size + f->message_size > MAX_PAYLOAD_SIZE){
frameFragmentsCache.erase( std::make_pair(frame.header.from_node,frame.header.id) );
printf("cleared corrupt frame\n");

This comment has been minimized.

Copy link
@reixd

reixd Sep 10, 2014

Contributor

In this case it is not a corrupt frame, but a large payload received. This could be caused by a miss-configured tx Node (application).
I think a better error message would be: "Frame total payload exceeded MAX_PAYLOAD_SIZE %i Bytes. Frame dropped."

This comment has been minimized.

Copy link
@TMRh20

TMRh20 Sep 10, 2014

Author Member

Yeah, it would be. I'm still not sure exactly what causes this, but it doesn't happen very often, so still trying to figure it out.

}else{
memcpy(f->message_buffer+f->message_size, frame.message_buffer, frame.message_size);

//Increment message size
f->message_size += frame.message_size;
//Update header
f->header = frame.header;
}
}
}

Expand Down Expand Up @@ -406,7 +422,7 @@ bool RF24Network::write(RF24NetworkHeader& header,const void* message, size_t le
fragment_id--;
msgCount++;
}

int frag_delay = int(len/16); delay( std::min(frag_delay,15) );

This comment has been minimized.

Copy link
@reixd

reixd Sep 10, 2014

Contributor

-- int frag_delay = int(len/16); delay( std::min(frag_delay,15) );
++ int frag_delay = int(len/16);
++ delay( std::min(frag_delay,15) );

//Return true if all the chuncks where sent successfuly
//else return false
IF_SERIAL_DEBUG(printf("%u: NET total message fragments sent %i. txSuccess ",millis(),msgCount); printf("%s\n\r", txSuccess ? "YES" : "NO"););
Expand All @@ -431,7 +447,7 @@ bool RF24Network::_write(RF24NetworkHeader& header,const void* message, size_t l
if (frame_size)
{
// IF_SERIAL_DEBUG(const uint16_t* i = reinterpret_cast<const uint16_t*>(message);printf_P(PSTR("%u: NET message %04x\n\r"),millis(),*i));
IF_SERIAL_DEBUG_FRAGMENTATION(printf("%u: MAC frame size %i",millis(),frame_size););
IF_SERIAL_DEBUG_FRAGMENTATION(printf("%u: MAC frame size %i\n",millis(),frame_size););
IF_SERIAL_DEBUG_FRAGMENTATION(printf("%u: MAC frame ",millis()); const char* charPtr = reinterpret_cast<const char*>(frame_buffer); for (size_t i = 0; i < frame_size; i++) { printf("%02X ", charPtr[i]); }; printf("\n\r"));
}

Expand Down Expand Up @@ -504,9 +520,9 @@ bool RF24Network::write(uint16_t to_node, uint8_t directTo)
}


//if( ( send_node != to_node) || frame_buffer[6] == NETWORK_ACK || directTo == 3 || directTo == 4){
// multicast = 1;
// }
// if( ( send_node != to_node) || frame_buffer[6] == NETWORK_ACK ){
// multicast = 1;
//}

IF_SERIAL_DEBUG(printf_P(PSTR("%u: MAC Sending to 0%o via 0%o on pipe %x\n\r"),millis(),logicalAddress,send_node,send_pipe));

Expand Down Expand Up @@ -543,7 +559,7 @@ bool RF24Network::write(uint16_t to_node, uint8_t directTo)
// Now, continue listening
radio.startListening();

if( (send_node != logicalAddress) && (directTo==0 || directTo == 3 )){
if( ok && (send_node != logicalAddress) && (directTo==0 || directTo == 3 )){
uint32_t reply_time = millis();
while( update() != NETWORK_ACK){
if(millis() - reply_time > routeTimeout){
Expand Down
4 changes: 2 additions & 2 deletions RPi/RF24Network/RF24Network_config.h
Original file line number Diff line number Diff line change
Expand Up @@ -10,12 +10,12 @@
#ifndef __RF24_CONFIG_H__
#define __RF24_CONFIG_H__

#include <stddef.h>
//#include <stddef.h>

/********** USER CONFIG **************/

//#define RF24NetworkMulticast
#define SERIAL_DEBUG //Change #undef to #define for debug
//#define SERIAL_DEBUG //Change #undef to #define for debug
#define SERIAL_DEBUG_ROUTING
//#define SERIAL_DEBUG_FRAGMENTATION

Expand Down
9 changes: 4 additions & 5 deletions RPi/RF24Network/examples/helloworld_rx.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -9,16 +9,14 @@
* Listens for messages from the transmitter and prints them out.
*/



#include <cstdlib>
#include <iostream>
#include <RF24/RF24.h>
#include <RF24Network/RF24Network.h>
#include <iostream>
#include <ctime>
#include <stdio.h>
#include <time.h>


/**
* g++ -L/usr/lib main.cc -I/usr/include -o main -lrrd
**/
Expand Down Expand Up @@ -76,7 +74,8 @@ int main(int argc, char** argv)

printf("Received payload # %lu at %lu \n",payload.counter,payload.ms);
}
sleep(2);
//sleep(2);
delay(2000);
//fclose(pFile);
}

Expand Down
8 changes: 3 additions & 5 deletions RPi/RF24Network/examples/helloworld_tx.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -9,12 +9,10 @@
* Listens for messages from the transmitter and prints them out.
*/



#include <cstdlib>
#include <iostream>
//#include <cstdlib>
#include <RF24/RF24.h>
#include <RF24Network/RF24Network.h>
#include <iostream>
#include <ctime>
#include <stdio.h>
#include <time.h>
Expand All @@ -33,7 +31,7 @@ using namespace std;
//RF24 radio(RPI_V2_GPIO_P1_15, BCM2835_SPI_CS0, BCM2835_SPI_SPEED_4MHZ);

// Setup for GPIO 22 CE and CE1 CSN with SPI Speed @ 8Mhz
RF24 radio(RPI_V2_GPIO_P1_15, BCM2835_SPI_CS1, BCM2835_SPI_SPEED_8MHZ);
RF24 radio(RPI_V2_GPIO_P1_15, BCM2835_SPI_CS0, BCM2835_SPI_SPEED_8MHZ);

RF24Network network(radio);

Expand Down
6 changes: 4 additions & 2 deletions RPi/RF24Network/examples/rx-test.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
#include <cstdlib>
#include <iostream>



#include <RF24/RF24.h>
#include <RF24Network/RF24Network.h>
#include <iostream>
#include <ctime>
#include <stdio.h>
/*#include <rrd.h>*/
Expand Down

5 comments on commit e34a394

@reixd
Copy link
Contributor

@reixd reixd commented on e34a394 Sep 10, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! :)
I am just curious: which timing issues appear so do you need to auto-adjust delay after writes?

While implementing the fragmentation feature I thought about adding a payload_size (uint16_t) field to the RF24NetworkHeader. This field would simplify some stuff while checking fragments but would reduce the throughput and frame_payload_size.
Therefore I used two things to identify fragmented payloads: header.type and header.fragment_id.
We could add a check in the code to check for the >>expected fragment_id<<, like TCP checks for the next ACK. If the received fragment_id does not match the expected id, then drop (assembled) frame from cache or if we use a LRUCache, do not append received frame (the missing payload may arrive next).

Ps: I wrote some comments on the code.

@TMRh20
Copy link
Member Author

@TMRh20 TMRh20 commented on e34a394 Sep 10, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my above comment, my testing is simply using SCP to transfer a file between the two RPi nodes, via the RF24toTUN link, and it directly impacts the number of failures and transfer rates. I'm not sure exactly what the underlying issue is, but the delay seems to minimize errors and improve transfer rates...

Well, since the first 32 bytes of the frame contains the frame_id which represents the size of the fragment, it could be used for error checking, but if the first 32bytes is lost etc., then the size is not reported accurately. This is why I think it needs to include a checksum, etc. when fragmented payloads are transmitted.

@reixd
Copy link
Contributor

@reixd reixd commented on e34a394 Sep 10, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This handles only the cases where the message payload from the application greater is than the max_frame_payload. Therefore only the fragmentation/reassembly cases).

What do you think of using header.type with the following values

  • #define NETWORK_FIRST_FRAGMENT 130
  • #define NETWORK_MORE_FRAGMENTS 131
  • #define NETWORK_LAST_FRAGMENT 132

The TX node will set the header with the needed information to start, manage and end a fragmentation/reassembly procedure.

  • Set the header as the first fragment, set the number of expected fragments, transmit frame.
  • The next frames become the more_fragments flag and a decreased fragments_id counter to signalize that other fragments have been already sent.
  • The last fragment is marked as the last_fragment flag with a fragment_id of 1. Signalizing this is the last of all fragments, other fragments should have been received.

On the RX side, the node will use a new attribute in the RF24NetworkFrame Struct:
"frame.total_fragments". In this variable the total expected fragments is saved.

  • This means, if the RX node get a first_frame marked frame and a fragment_id > 1 and in the cache there a no frames with the same key (header.from_node,header.id), then save this frame in the cache.
    • How to handle lost frames and duplicates?
  • The next frames marked with the more_fragments flag are checked if the header.fragment_id equals the (frame.total_fragments-1) from the cache.
    • If the received frame/payload was in the correct order then update the frame.header.total_fragments and append the payload.
  • When the last_fragment marked frame is received, the RX checks if header.fragment_id equals 1 and if (frame.total_fragments-1) also equals 1.

A second attribute could also be added to the RF24NetworkFrame: frame.expected_fragment_id.
This counter will be decreased until it reaches 1. But I think this is overhead, the total_fragments attribute fulfills the task.


Another possibility is to use StartOfTransmission and EndOfTransmission messages to mark the beginning and ending of the transmission. This would mean 2 extra frames (+ ACKs) need to be exchanged. This messages would be in the payload of the frame, not in the header. Inside of this messages all needed information is packed, e.g. total_size of payload in bytes, total number of fragments, and more.

But this approach makes the "fragmentation-protocol" a bit harder to understand and also reduces the throughput of the network. On the other side if opens the possibility to send other type of network management frames for example for dynamic routing, addressing protocol, network status and so on.

I hope I made me clear ^^.

@TMRh20
Copy link
Member Author

@TMRh20 TMRh20 commented on e34a394 Sep 10, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah the first option seems like it should address most or all of the issues. The one thing I don't seem to understand is why other types of network management frames would not work with the first option?

@reixd
Copy link
Contributor

@reixd reixd commented on e34a394 Sep 10, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also prefer the first option.

Other types of network management frames would also work.

I was thinking on using some IPv6 network management ideas. IPv6 has some interesting stuff implemented using the IPv6 headers and ICMPv6. We could copy some of this ideas, like Router Advertisement and Neighbor Advertisement (https://en.wikipedia.org/wiki/Neighbor_Discovery_Protocol).
Second could be used to deploy a spanning tree. IPv6 also support automatic addressing ;)

Please sign in to comment.