Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Receiving data on a closed pipe. #913

Closed
msobarzo opened this issue Aug 13, 2023 · 15 comments · Fixed by #914
Closed

Receiving data on a closed pipe. #913

msobarzo opened this issue Aug 13, 2023 · 15 comments · Fixed by #914
Labels

Comments

@msobarzo
Copy link

msobarzo commented Aug 13, 2023

I have a problem where I receive a packet on a closed pipe. Initially I noticed this problem in the context of transmitting a large array but currently I have the same problem even though I'm sending an increasing integer.
Right now I'm using a very basic program because I wanted to reproduce the problem and try to identify the cause.

I'm currently using the nRF24L01+PA+LNA connected to 3v3 pin of the ESP32. The ESP32 is being powered from the USB port of my laptop and I have also tested with different modules.

The problem is I see the output in my receiver code, but sometimes I see the the correct value but is being received on pipe 5, this happens randomly. Also the settings on both modules are the same, I checked using the radio.printPrettyDetails();

This is my transmitter code

#include <SPI.h>
#include <Arduino.h>
#include <RF24.h>
// #include <RF24Network.h>
#include <printf.h>

// create an RF24 object
RF24 radio(5, 17); // CE, CSN  

// address through which two modules communicate.
const uint64_t address = 0x7878787878LL;

void setup()
{
  Serial.begin(115200);

  radio.begin();
  if (!radio.isChipConnected())
  {
    Serial.print("Chip not connected.");
  }
  radio.setPALevel(RF24_PA_MIN);
  radio.setChannel(80);
  radio.stopListening();
  radio.openWritingPipe(address);
  radio.printPrettyDetails();
}

int msgNumber = 0;

void loop()
{
  bool ok = radio.write(&msgNumber, sizeof(msgNumber));
  if (ok)
  {
    Serial.println("Message sent");
  }
  else
  {
    Serial.println("Message not sent");
  }
  msgNumber++;
  delay(100);
}

This is my receiver code

#include <Arduino.h>
#include <SPI.h>
#include <RF24.h>

RF24 radio(17, 2, 2000000); // CE, CSN  (For the S3 version)

// address through which two modules communicate.
const uint64_t address = 0x7878787878LL;

RF24 radio(5, 17); // CE, CSN

void setup()
{
  Serial.begin(115200);
  radio.begin();
  if (!radio.begin())
  {
    Serial.println("radio hardware is not responding!!!");
    while (1)
    {
    } // hold in infinite loop
  }
  if (!radio.isChipConnected())
  {
    Serial.print("Chip not connected.");
  }
  radio.setPALevel(RF24_PA_MIN);
  radio.openReadingPipe(1, address);
  radio.startListening();
  radio.setChannel(80);
  radio.printPrettyDetails();
}


uint8_t pipe_rx;
int packet;

void loop()
{
  if (radio.available(&pipe_rx))
  {
    radio.read(&packet, sizeof(packet));

    if (pipe_rx > 1)
    {
      Serial.print("PACKET ON PIPE: ");
      Serial.println(pipe_rx);
      Serial.println(packet);
    }

    if (pipe_rx == 1)
    {
      Serial.print("CORRECT PIPE: ");
      Serial.println(packet);
    }
  }
}

Finally, this is the output of radio.printPrettyDetails() on my receiver code.

SPI Frequency = 10 Mhz
Channel = 80 (~ 2480 MHz)
Model = nRF24L01+
RF Data Rate = 1 MBPS
RF Power Amplifier = PA_MIN
RF Low Noise Amplifier = Enabled
CRC Length = 16 bits
Address Length = 5 bytes
Static Payload Length = 32 bytes
Auto Retry Delay = 2750 microseconds
Auto Retry Attempts = 15 maximum
Packets lost on
current channel = 0
Retry attempts made for
last transmission = 0
Multicast = Disabled
Custom ACK Payload = Disabled
Dynamic Payloads = Disabled
Auto Acknowledgment = Enabled
Primary Mode = RX
TX address = 0xe7e7e7e7e7
pipe 0 (closed) bound = 0xe7e7e7e7e7
pipe 1 ( open ) bound = 0x7878787878
pipe 2 (closed) bound = 0xc3
pipe 3 (closed) bound = 0xc4
pipe 4 (closed) bound = 0xc5
pipe 5 (closed) bound = 0xc6

@TMRh20
Copy link
Member

TMRh20 commented Aug 13, 2023

I recreated this issue on Arduino Due and Nano using your code.

I believe it is due to the following note from the datasheet:

Note: The 3 bit pipe information in the STATUS register is updated during the IRQ pin high to low transition. The pipe information is unreliable if the STATUS register is read during an IRQ pin high to low transition.

In looking at the datasheet, one might be able to work around this issue by getting the pipe number and checking the corresponding RX_PW_P0, P1 register to see if data is actually available on that pipe. I'm not 100% sure but I think you've found an ugly bug when using polling of available() with no easy fix.

@TMRh20 TMRh20 added the bug label Aug 13, 2023
@TMRh20
Copy link
Member

TMRh20 commented Aug 13, 2023

This might fix the problem:

bool RF24::available(uint8_t* pipe_num)
{
    // get implied RX FIFO empty flag from status byte
    uint8_t pipe = (get_status() >> RX_P_NO) & 0x07;
    uint8_t pipe2 = (get_status() >> RX_P_NO) & 0x07;
    
    if( pipe != pipe2){
      pipe = (get_status() >> RX_P_NO) & 0x07;
    }
    
    if (pipe > 5)
        return 0;
    // If the caller wants the pipe number, include that
    if (pipe_num)
        *pipe_num = pipe;

    return 1;
}

Essentially, check the pipe twice and if it doesn't match, read it a third time.
Are you able to edit RF24.cpp and test out these changes?

What I don't like about this fix is that it may affect performance negatively, but I'm not sure how else to address this issue.

@2bndy5
Copy link
Member

2bndy5 commented Aug 13, 2023

This is a bit perplexing. I have a hunch, but I would like to know what exact radio module you are using (manufacturer, vendor, etc). I suspect a clone isn't adhering to the original datasheet's info about the status byte returned for every SPI transaction.

the suggestion from TMRh20 doesn't actually bypass my suspicion

Reading the status byte directly from the radio's STATUS register should be absolutely accurate:

// using inheritance instead of altering RF24.cpp directly
class RF24_Custom : public RF24
{
    bool available(uint8_t *pipe_num) {
        uint8_t pipe = (read_register(STATUS) >> RX_P_NO) & 0x07;
        if (pipe > 5)
             return 0;

        // If the caller wants the pipe number, include that
        if (pipe_num)
             *pipe_num = pipe;

        return 1;
    }
};

// then use the derived class instead
RF24_Custom radio(5, 17);

I would like to know if the behavior changes when the pipe_rx variable is initialized to an invalid pipe number (7 or higher). I don't think this would fix it, but it is better practice in general.

Lastly, there could be a slight corruption in data over the SPI bus. To diagnose this, I would request using a lower SPI speed. I see your code does this for the ESP32-S3... Is this also the case on the device you're seeing the wrong pipe number on?

@msobarzo
Copy link
Author

@2bndy5 I checked the module and it says it's the E01-ML01DP5 (when I use the isPvariant I get true)

I matched the SPI speed for both modules now and the problem still persist, I also tried initializing to an invalid pipe number, especifically 7 and as you suspected that didn't work.

Lastly I tried what @TMRh20 wrote and that solved the problem. I received all the data in the correct pipe. I let the program run for a few minutes to be sure and it worked. Right now it worked fine when I use one transmitter I still need to check if it works when I use 2 transmitters.

This might fix the problem:

bool RF24::available(uint8_t* pipe_num)
{
    // get implied RX FIFO empty flag from status byte
    uint8_t pipe = (get_status() >> RX_P_NO) & 0x07;
    uint8_t pipe2 = (get_status() >> RX_P_NO) & 0x07;
    
    if( pipe != pipe2){
      pipe = (get_status() >> RX_P_NO) & 0x07;
    }
    
    if (pipe > 5)
        return 0;
    // If the caller wants the pipe number, include that
    if (pipe_num)
        *pipe_num = pipe;

    return 1;
}

Essentially, check the pipe twice and if it doesn't match, read it a third time. Are you able to edit RF24.cpp and test out these changes?

What I don't like about this fix is that it may affect performance negatively, but I'm not sure how else to address this issue.

@msobarzo msobarzo reopened this Aug 14, 2023
@2bndy5
Copy link
Member

2bndy5 commented Aug 14, 2023

Good to hear.

As for addressing this problem, I can't think of a better way (using a delayMicroseconds() could be worse). I think the the main reason available() has been problematic is due to the speed-ups I suggested/implemented a few years back (#650 and 1e9bcb8).

Since then, I've been more concerned with the same problem when available() is called directly after whatHappened(). The one thing I noticed is that it takes a moderately fast MCU to actually encounter this problem, so maybe we could take that into consideration if/when mainlining the suggested fix.

@msobarzo
Copy link
Author

I'm not really sure if this is the place to comment this. But I originally found this problem when I was using the nrf2401 connected to a Raspberry Pi 4 (4GB), I implemented the same fix and it also worked when I used the Raspberry with the pyRF24 wrapper.
At least right now it works for 1 transmitters. I'll try again later with 2 transmitters

@2bndy5
Copy link
Member

2bndy5 commented Aug 14, 2023

I think the FIFO_STATUS register could be used to avoid unnecessarily reading the STATUS byte multiple times:

bool RF24::available(uint8_t* pipe_num)
{
    if (!isFifo(false, true)) { // if RX FIFO is not empty
        // If the caller wants the pipe number
        if (pipe_num) {
            // isFifo() updates the internal `status` member
            uint8_t pipe = (status >> RX_P_NO) & 0x07;
    
            if (pipe != ((get_status() >> RX_P_NO) & 0x07)) {
                pipe = (get_status() >> RX_P_NO) & 0x07;
            }
            *pipe_num = pipe;
        }
        return 1;
    }

    return 0;
}

EDITED: using the FIFO_STATUS register is more ideal than using the RX_DR flag (mainly used for IRQ).

@2bndy5
Copy link
Member

2bndy5 commented Aug 14, 2023

when I used the Raspberry with the pyRF24 wrapper.

I'm assuming you're using the SPIDEV driver and the individual python wrapper included in this repo (not the pyrf24 pypi pkg). Other drivers (pigpio, MRAA & RPi) seem to be too slow to encounter this.

@2bndy5 2bndy5 removed the question label Aug 14, 2023
@2bndy5
Copy link
Member

2bndy5 commented Aug 14, 2023

Remembering a footnote from the datasheet's register map section:

The RX_DR IRQ is asserted by a new packet arrival event. The procedure for handling this interrupt should be:

  1. read payload through SPI,
  2. clear RX_DR IRQ,
  3. read FIFO_STATUS to check if there are more payloads available in RX FIFO,
  4. if there are more data in RX FIFO, repeat from step 1.

So, if not using IRQ, then I guess the datasheet implies using the FIFO_STATUS register to check for RX'd payloads. Essentially, this means reverting the idea in #650. Now (after addressing #650) that we are caching the STATUS byte returned on every SPI transaction, we still have 1 less transaction to perform (in comparison to v1.3.9).

I also think using the FIFO_STATUS register may allow reading the STATUS byte only 1 additional time because a few microseconds should have elapsed during the read_register(FIFO_STATUS) (a 2-byte transfer). Simply put: return the second read attempt.

bool RF24::available(uint8_t* pipe_num)
{
    if (!isFifo(false, true)) { // if RX FIFO is not empty
        // If the caller wants the pipe number
        if (pipe_num) {
            *pipe_num = (get_status() >> RX_P_NO) & 0x07;
        }
        return 1;
    }

    return 0;
}

@TMRh20
Copy link
Member

TMRh20 commented Aug 14, 2023

Nice! I think that does correct the problem although in theory I think it would be possible to still get an incorrect reading if the timing of reception were such that the second reading was taken during the IRQ transition of a second packet. This should be a very rare occurrence I think though.

Also, I think the functions should be modified as such to only grab the pipe number if requested, because the pointer is now initialized to a non-null value either way:

bool RF24::available(void)
{
    uint8_t pipe = 0xFF;
    return available(&pipe);
}

/****************************************************************************/

bool RF24::available(uint8_t* pipe_num)
{
    if (!isFifo(false, true)) { // if RX FIFO is not empty
        // If the caller wants the pipe number
        if (*pipe_num != 0xFF) {
            *pipe_num = (get_status() >> RX_P_NO) & 0x07;
        }
        return 1;
    }

    return 0;
}

@2bndy5
Copy link
Member

2bndy5 commented Aug 14, 2023

I think it would be possible to still get an incorrect reading if the timing of reception were such that the second reading was taken during the IRQ transition of a second packet. This should be a very rare occurrence I think though.

I hadn't thought of that. In a perfect world, the status byte inaccuracy would only apply to an empty RX FIFO.

@2bndy5
Copy link
Member

2bndy5 commented Aug 14, 2023

I would prefer to use a macro definition instead of a magic number (for readability)

#define RF24_NO_FETCH_PIPE 0xFF

2bndy5 added a commit that referenced this issue Aug 14, 2023
Essentially, this reverts the algorithm used to satisfy #650 as it has proved erroneous on fast MCUs.
@2bndy5
Copy link
Member

2bndy5 commented Aug 14, 2023

created a branch and drafted #914 to see compile size implications. I still need to HW test this solution on a setup that can reproduce this issue.

@2bndy5 2bndy5 linked a pull request Aug 14, 2023 that will close this issue
@msobarzo
Copy link
Author

msobarzo commented Aug 15, 2023

when I used the Raspberry with the pyRF24 wrapper.

I'm assuming you're using the SPIDEV driver and the individual python wrapper included in this repo (not the pyrf24 pypi pkg). Other drivers (pigpio, MRAA & RPi) seem to be too slow to encounter this.

Yes, I'm using the SPIDEV driver and the wrapper in this repo. I also managed to make it work with two transmitters and all the data was received correctly.
I don't know if it would be worth to add a condition to check if the returned pipe is a valid open pipe in the MulticeiverDemo example?.

    if (radio.available(&pipe)) {              // is there a payload? get the pipe number that recieved it
      uint8_t bytes = radio.getPayloadSize();  // get the size of the payload
      radio.read(&payload, bytes);             // fetch payload from FIFO
      Serial.print(F("Received "));
      Serial.print(bytes);  // print the size of the payload
      Serial.print(F(" bytes on pipe "));
      Serial.print(pipe);  // print the pipe number
      Serial.print(F(" from node "));
      Serial.print(payload.nodeID);  // print the payload's origin
      Serial.print(F(". PayloadID: "));
      Serial.println(payload.payloadID);  // print the payload's number
    }```

@2bndy5
Copy link
Member

2bndy5 commented Aug 15, 2023

I don't know if it would be worth to add a condition to check if the returned pipe is a valid open pipe in the MulticeiverDemo example?

It shouldn't be necessary if the radio and library are working as they should. The state of the RX FIFO should additionally prevent using the stored pipe value from being used to display an invalid value.

@msobarzo If you get a chance to test the revert-650 branch on your RPi, that would be helpful feedback because it has the proposed solution for this issue.

2bndy5 added a commit that referenced this issue Aug 28, 2023
* fix #913

Essentially, this reverts the algorithm used to satisfy #650 as it has proved erroneous on fast MCUs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants