Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ringbuffer timeouts in certain cases (IDFGH-5624) #7344

Closed
bugadani opened this issue Jul 30, 2021 · 18 comments
Closed

Ringbuffer timeouts in certain cases (IDFGH-5624) #7344

bugadani opened this issue Jul 30, 2021 · 18 comments
Assignees
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@bugadani
Copy link
Contributor

bugadani commented Jul 30, 2021

I was experiencing some weird issues with regards to the Ringbuffer api, so I wrote a few tests to reproduce my problem. Both of the following tests fail on the "Timeout" assert. The issue is sensitive to the buffer size, the data size and I also had to insert data with different sizes.

Original, more complex examples - Click to expand!
#include "unity.h"

#include "freertos/FreeRTOS.h"
#include "freertos/ringbuf.h"

#include <stdio.h>
#include <string.h>

TEST_CASE("ringbuf xRingbufferSend problem case", "[ringbuf]") {
    RingbufHandle_t ringbuffer = xRingbufferCreate(200, RINGBUF_TYPE_NOSPLIT);
    uint8_t data[9] = {3, 4, 5, 6, 7, 8, 9, 10, 11};

    for (size_t size = 8; size <= sizeof(data); size++) {
        for (size_t i = 0; i < 40; i++) {
            // Make space by removing records
            while (xRingbufferGetCurFreeSize(ringbuffer) < size) {
                void* item = xRingbufferReceive(ringbuffer, NULL, 0);
                if (item == NULL) {
                    TEST_ASSERT(xRingbufferGetCurFreeSize(ringbuffer) ==
                                xRingbufferGetMaxItemSize(ringbuffer));
                    break;
                }
                // Removed record has the expected data
                TEST_ASSERT_EQUAL_MEMORY(item, data, size);
                vRingbufferReturnItem(ringbuffer, item);
            }
            TEST_ASSERT(xRingbufferGetCurFreeSize(ringbuffer) >= size);

            if (xRingbufferSend(ringbuffer, data, size, 0) != pdTRUE) {
                printf("Failed to acquire memory in event buffer. Required: %u Max: %u Free: %u\n",
                       size, xRingbufferGetMaxItemSize(ringbuffer),
                       xRingbufferGetCurFreeSize(ringbuffer));
                // If there is enough space, xRingbufferSend timed out.
                TEST_ASSERT_MESSAGE(xRingbufferGetCurFreeSize(ringbuffer) < size,
                                    "xRingbufferSend Timeout");
            }
        }

        // Flush buffer
        void* item;
        do {
            item = xRingbufferReceive(ringbuffer, NULL, 0);
            if (item) {
                // Record has expected data
                TEST_ASSERT_EQUAL_MEMORY(item, data, size);
                vRingbufferReturnItem(ringbuffer, item);
            }
        } while (item != NULL);
    }

    vRingbufferDelete(ringbuffer);
}

TEST_CASE("ringbuf xRingbufferSendAcquire problem case", "[ringbuf]") {
    RingbufHandle_t ringbuffer = xRingbufferCreate(200, RINGBUF_TYPE_NOSPLIT);
    uint8_t data[9] = {3, 4, 5, 6, 7, 8, 9, 10, 11};

    for (size_t size = 8; size <= sizeof(data); size++) {
        for (size_t i = 0; i < 40; i++) {
            // Make space by removing records
            while (xRingbufferGetCurFreeSize(ringbuffer) < size) {
                void* item = xRingbufferReceive(ringbuffer, NULL, 0);
                if (item == NULL) {
                    TEST_ASSERT(xRingbufferGetCurFreeSize(ringbuffer) ==
                                xRingbufferGetMaxItemSize(ringbuffer));
                    break;
                }
                // Removed record has the expected data
                TEST_ASSERT_EQUAL_MEMORY(item, data, size);
                vRingbufferReturnItem(ringbuffer, item);
            }
            TEST_ASSERT(xRingbufferGetCurFreeSize(ringbuffer) >= size);

            void* pItem;
            if (xRingbufferSendAcquire(ringbuffer, &pItem, size, 0) == pdTRUE) {
                memcpy(pItem, data, size);

                xRingbufferSendComplete(ringbuffer, pItem);
            } else {
                printf("Failed to acquire memory in event buffer. Required: %u Max: %u Free: %u\n",
                       size, xRingbufferGetMaxItemSize(ringbuffer),
                       xRingbufferGetCurFreeSize(ringbuffer));
                // If there is enough space, xRingbufferSendAcquire timed out.
                TEST_ASSERT_MESSAGE(xRingbufferGetCurFreeSize(ringbuffer) < size,
                                    "xRingbufferSendAcquire Timeout");
            }
        }

        // Flush buffer
        void* item;
        do {
            item = xRingbufferReceive(ringbuffer, NULL, 0);
            if (item) {
                // Record has expected data
                TEST_ASSERT_EQUAL_MEMORY(item, data, size);
                vRingbufferReturnItem(ringbuffer, item);
            }
        } while (item != NULL);
    }

    vRingbufferDelete(ringbuffer);
}

Edit: I've made a simpler test case:

TEST_CASE("simple case", "[ringbuf]") {
    uint8_t data[9] = {3, 4, 5, 6, 7, 8, 9, 10, 11};

    RingbufHandle_t ringbuffer = xRingbufferCreate(80, RINGBUF_TYPE_NOSPLIT);

    TEST_ASSERT(xRingbufferSend(ringbuffer, data, 8, 0));
    TEST_ASSERT(xRingbufferSend(ringbuffer, data, 8, 0));
    TEST_ASSERT(xRingbufferSend(ringbuffer, data, 8, 0));
    TEST_ASSERT(xRingbufferSend(ringbuffer, data, 8, 0));

    void* item;
    item = xRingbufferReceive(ringbuffer, NULL, 0);
    TEST_ASSERT(item != NULL);
    vRingbufferReturnItem(ringbuffer, item);

    item = xRingbufferReceive(ringbuffer, NULL, 0);
    TEST_ASSERT(item != NULL);
    vRingbufferReturnItem(ringbuffer, item);

    item = xRingbufferReceive(ringbuffer, NULL, 0);
    TEST_ASSERT(item != NULL);
    vRingbufferReturnItem(ringbuffer, item);

    TEST_ASSERT(xRingbufferGetCurFreeSize(ringbuffer) >= 9);
    TEST_ASSERT(xRingbufferSend(ringbuffer, data, 10, 0));

    item = xRingbufferReceive(ringbuffer, NULL, 0);
    TEST_ASSERT(item != NULL);
    vRingbufferReturnItem(ringbuffer, item);

    TEST_ASSERT(xRingbufferGetCurFreeSize(ringbuffer) >= 9);
    TEST_ASSERT(xRingbufferSend(ringbuffer, data, 9, 0));

    TEST_ASSERT(xRingbufferGetCurFreeSize(ringbuffer) >= 9);
    TEST_ASSERT(xRingbufferSend(ringbuffer, data, 9, 0));

    TEST_ASSERT(xRingbufferGetCurFreeSize(ringbuffer) >= 9);
    TEST_ASSERT(xRingbufferSend(ringbuffer, data, 9, 0));

    vRingbufferDelete(ringbuffer);
}
@espressif-bot espressif-bot added the Status: Opened Issue is new label Jul 30, 2021
@github-actions github-actions bot changed the title Ringbuffer timeouts in certain cases Ringbuffer timeouts in certain cases (IDFGH-5624) Jul 30, 2021
@AxelLin
Copy link
Contributor

AxelLin commented Jul 30, 2021

What is your esp-idf version? (git describe --tags)

@bugadani
Copy link
Contributor Author

It's dirty due to some accidental saves, but v4.2

bugadani added a commit to bugadani/esp-idf that referenced this issue Aug 5, 2021
Fix tracking free space. Fix a number of potential underflow issues, as well as a few incorrect comments. Don't round up data size for the last chunk.

Fixes espressif#7344
bugadani added a commit to bugadani/esp-idf that referenced this issue Aug 5, 2021
Fix tracking free space. Fix a number of potential underflow issues, as well as a few incorrect comments. Don't round up data size for the last chunk.

Fixes espressif#7344
bugadani added a commit to bugadani/esp-idf that referenced this issue Aug 5, 2021
Fix tracking free space. Fix a number of potential underflow issues, as well as a few incorrect comments. Don't round up data size for the last chunk.

Fixes espressif#7344
@AxelLin
Copy link
Contributor

AxelLin commented Aug 6, 2021

TEST_ASSERT(xRingbufferSend(ringbuffer, data, 10, 0));

You have uint8_t data[9], but pass 10 in above code.
Isn't this overflow?

@bugadani
Copy link
Contributor Author

bugadani commented Aug 6, 2021

It's a mistake on my part but doesn't matter. I'll fix the comment

@Dazza0 Dazza0 self-assigned this Sep 23, 2021
@Dazza0
Copy link
Collaborator

Dazza0 commented Sep 28, 2021

@bugadani I've confirmed that this indeed a bug. The source of the issue is in xRingbufferReceive(). The internal implementation (i.e., prvGetItemDefault()) forgets to wrap the read pointer around a dummy data block. This results in the free pointer also being corrupted when the item is returned.

I'll push a fix for this issue shortly. For now, to work around this issue you can replace the following lines of prvGetItemDefault() in ringbuf.c as such:

From...

     //Check if pucRead requires wrap around
     if ((pxRingbuffer->pucTail - pxRingbuffer->pucRead) < rbHEADER_SIZE) {
         pxRingbuffer->pucRead = pxRingbuffer->pucHead;
     }

To

     //Check if pucRead requires wrap around
     if (((ItemHeader_t *)pxRingbuffer->pucRead)->uxItemFlags & rbITEM_DUMMY_DATA_FLAG ||    //Next item is a dummy
         ((pxRingbuffer->pucTail - pxRingbuffer->pucRead) < rbHEADER_SIZE)) {                //Remaining space cannot fit a header
             pxRingbuffer->pucRead = pxRingbuffer->pucHead;  //Wrap pucREAD around
         }

@bugadani
Copy link
Contributor Author

Good catch, thanks!

@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Opened Issue is new labels Sep 28, 2021
@AxelLin
Copy link
Contributor

AxelLin commented Sep 29, 2021

     //Check if pucRead requires wrap around
     if (((ItemHeader_t *)pxRingbuffer->pucRead)->uxItemFlags & rbITEM_DUMMY_DATA_FLAG ||    //Next item is a dummy
         ((pxRingbuffer->pucTail - pxRingbuffer->pucRead) < rbHEADER_SIZE)) {                //Remaining space cannot fit a header
             pxRingbuffer->pucRead = pxRingbuffer->pucHead;  //Wrap pucREAD around
         }

@Dazza0

Test on v4.4-dev-3235-g3e370c429624
If I made above code change, I got below assert:
assert failed: prvReturnItemDefault ringbuf.c:604 ((pxCurHeader->uxItemFlags & ( ( UBaseType_t ) 1 )) == 0)

Backtrace:0x40081bc6:0x3ffb09900x40089361:0x3ffb09b0 0x4008ff35:0x3ffb09d0 0x400886c7:0x3ffb0af0 0x400891bd:0x3ffb0b10 0x4008341a:0x3ffb0b30 0x40082a11:0x3ffb0ba0 0x4000bfed:0x3ffbbf70 0x4008cb7a:0x3ffbbf80 0x400fa16e:0x3ffbbfa0 0x400fa2a1:0x3ffbbfd0 0x400fa825:0x3ffbc020 0x400ebc78:0x3ffbc050 0x400eda3c:0x3ffbc080 0x400e4e23:0x3ffbc0b0 0x400e5691:0x3ffbc160 0x400d9322:0x3ffbc210 0x40199803:0x3ffbc5b0
0x40081bc6: panic_abort at /home/axel/esp/esp-idf/components/esp_system/panic.c:402

0x40089361: esp_system_abort at /home/axel/esp/esp-idf/components/esp_system/esp_system.c:129

0x4008ff35: __assert_func at /home/axel/esp/esp-idf/components/newlib/assert.c:85

0x400886c7: prvReturnItemDefault at /home/axel/esp/esp-idf/components/esp_ringbuf/ringbuf.c:597 (discriminator 1)

0x400891bd: vRingbufferReturnItemFromISR at /home/axel/esp/esp-idf/components/esp_ringbuf/ringbuf.c:1281

0x4008341a: uart_rx_intr_handler_default at /home/axel/esp/esp-idf/components/driver/uart.c:826

0x40082a11: _xt_lowint1 at /home/axel/esp/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1111

0x4008cb7a: vPortExitCritical at /home/axel/esp/esp-idf/components/freertos/port/xtensa/port.c:447

0x400fa16e: uart_enable_tx_intr at /home/axel/esp/esp-idf/components/driver/uart.c:580 (discriminator 2)

0x400fa2a1: uart_tx_all at /home/axel/esp/esp-idf/components/driver/uart.c:1169

0x400fa825: uart_tx_all at /home/axel/esp/esp-idf/components/driver/uart.c:1144
(inlined by) uart_write_bytes at /home/axel/esp/esp-idf/components/driver/uart.c:1210

@AxelLin
Copy link
Contributor

AxelLin commented Oct 4, 2021

@Dazza0
Do you have fix for above assert issue?

@AxelLin
Copy link
Contributor

AxelLin commented Oct 19, 2021

@Dazza0

I test it using pppos client, the uart driver uses the ringbuffer and I always hit below assertion if I change the code as you suggested.
assert failed: prvReturnItemDefault ringbuf.c:604 ((pxCurHeader->uxItemFlags & ( ( UBaseType_t ) 1 )) == 0)

ps. It works if I don't change the code as you suggested.

@0xjakob
Copy link
Collaborator

0xjakob commented Oct 20, 2021

Hi @AxelLin, thanks for the feedback. We are still working on this issue at the moment and trying to figure out what the problem with the workaround is.

@sudeep-mohanty sudeep-mohanty self-assigned this Oct 20, 2021
@sudeep-mohanty
Copy link
Collaborator

Hi @AxelLin , @bugadani ,
Thanks for reporting this issue and for your feedback.
We've narrowed down the bug to an incorrect use of the comparison operation used in the function prvGetCurMaxSizeNoSplit() to get the free size available in a no-spilt buffer. The comparison happens between a signed (xFreeSize) and an unsigned (pxRingbuffer->xMaxItemSize) operand. When the free size available is less than the no-spilt item header size, xFreeSize is a negative value. But the comparison operation treats it as a large unsigned value as is the nature of binary operators, therefore resulting in a free size equal to xMaxItemSize when in reality it should be 0.

I'll make a fix for this. In the interim, you could test your code by making the following changes in prvGetCurMaxSizeNoSplit() in components/esp_ringbuf/ringbuf.c file:

From:

    //No-split ring buffer items need space for a header
    xFreeSize -= rbHEADER_SIZE;
    //Limit free size to be within bounds
    if (xFreeSize > pxRingbuffer->xMaxItemSize) {
        xFreeSize = pxRingbuffer->xMaxItemSize;
    } else if (xFreeSize < 0) {
        //Occurs when free space is less than header size
        xFreeSize = 0;
    }

To

    //No-split ring buffer items need space for a header
    xFreeSize -= rbHEADER_SIZE;
    if (xFreeSize < 0) {
        //Occurs when free space is less than header size
        xFreeSize = 0;
    } else if (xFreeSize > pxRingbuffer->xMaxItemSize) {
        //Limit free size to be within bounds
        xFreeSize = pxRingbuffer->xMaxItemSize;
    }

@AxelLin
Copy link
Contributor

AxelLin commented Oct 21, 2021

@sudeep-mohanty

I still hit the same assertion with below modified code:

diff --git a/components/esp_ringbuf/ringbuf.c b/components/esp_ringbuf/ringbuf.c
index f8c62d03c7b5..e4414dad668c 100644
--- a/components/esp_ringbuf/ringbuf.c
+++ b/components/esp_ringbuf/ringbuf.c
@@ -542,7 +542,8 @@ static void *prvGetItemDefault(Ringbuffer_t *pxRingbuffer,
 
     pxRingbuffer->pucRead += rbHEADER_SIZE + rbALIGN_SIZE(pxHeader->xItemLen);   //Update pucRead
     //Check if pucRead requires wrap around
-    if ((pxRingbuffer->pucTail - pxRingbuffer->pucRead) < rbHEADER_SIZE) {
+    if (((ItemHeader_t *)pxRingbuffer->pucRead)->uxItemFlags & rbITEM_DUMMY_DATA_FLAG ||    //Next item is a dummy
+        ((pxRingbuffer->pucTail - pxRingbuffer->pucRead) < rbHEADER_SIZE)) {                //Remaining space cannot fit a header
         pxRingbuffer->pucRead = pxRingbuffer->pucHead;
     }
     return (void *)pcReturn;
@@ -674,12 +675,12 @@ static size_t prvGetCurMaxSizeNoSplit(Ringbuffer_t *pxRingbuffer)
 
     //No-split ring buffer items need space for a header
     xFreeSize -= rbHEADER_SIZE;
-    //Limit free size to be within bounds
-    if (xFreeSize > pxRingbuffer->xMaxItemSize) {
-        xFreeSize = pxRingbuffer->xMaxItemSize;
-    } else if (xFreeSize < 0) {
+    if (xFreeSize < 0) {
         //Occurs when free space is less than header size
         xFreeSize = 0;
+    } else if (xFreeSize > pxRingbuffer->xMaxItemSize) {
+        //Limit free size to be within bounds
+        xFreeSize = pxRingbuffer->xMaxItemSize;
     }
     return xFreeSize;
 }

@sudeep-mohanty
Copy link
Collaborator

Hi @AxelLin,
My bad. I missed mentioning that you do not need the below code change -

     //Check if pucRead requires wrap around
-    if ((pxRingbuffer->pucTail - pxRingbuffer->pucRead) < rbHEADER_SIZE) {
+    if (((ItemHeader_t *)pxRingbuffer->pucRead)->uxItemFlags & rbITEM_DUMMY_DATA_FLAG ||    //Next item is a dummy
+        ((pxRingbuffer->pucTail - pxRingbuffer->pucRead) < rbHEADER_SIZE)) {                //Remaining space cannot fit a header

Please keep the pucRead update as before and only update the prvGetCurMaxSizeNoSplit() function as mentioned in an earlier comment.
Would you help try this update and let us know if it works for you?

@AxelLin
Copy link
Contributor

AxelLin commented Oct 21, 2021

Hi @sudeep-mohanty

It works if only update the prvGetCurMaxSizeNoSplit() function.

I have some suggestions:

  1. The change is not trivial, please add a comment to the code.
  2. I use pppos_client and the uart driver uses esp_ringbuf heavily.
    However, it seems I didn't hit the timeout issue before changing any code.
    So I cannot verify if the fix actually works or not (because it was working in my use case).
    I suggest add a TEST_CASE for this issue to encure it's correct.

@sudeep-mohanty
Copy link
Collaborator

Thanks for confirming @AxelLin and also for your suggestions.
I will add appropriate comments to the code to explain this change.

This scenario is only hit when the free space left in the buffer is less than the size of a no-split header (i.e, rbHEADER_SIZE which is 8 bytes). With the buggy code, the free size is returned as xMaxItemSize instead of 0. This would make the app/driver believe that there is space left in the buffer and may attempt to send a new item. The new item would fail to get added to the buffer, leading to a timeout.
Hope this help explain the scenario.

@espressif-bot espressif-bot added Resolution: Done Issue is done internally Status: Done Issue is done internally and removed Status: In Progress Work is in progress labels Oct 29, 2021
@AxelLin
Copy link
Contributor

AxelLin commented Nov 8, 2021

@sudeep-mohanty
Need to backport this fix to stable branches.

espressif-bot pushed a commit that referenced this issue Dec 9, 2021
…and resulted in incorrect free size for no-split/allow-split buffers

This commit fixes a bug in no-split and allow-split ring buffers free buffer size calculation.
When the free size available in the buffers less than the size of one item header,
the function prvGetCurMaxSizeNoSplit/AllowSplit() incorrectly returned the maxItemSize instead of 0.
This is due to the comparision between a negative and a positive value
where both operands are treated as unsigned during the comparision operation,
thereby treating the negative operand as a large integer.

Also added new unit tests to test buffer-full and almost-full conditions
where this scenario is likely to be hit.

Closes #7344
Closes #7371
espressif-bot pushed a commit that referenced this issue Dec 10, 2021
…and resulted in incorrect free size for no-split/allow-split buffers

This commit fixes a bug in no-split and allow-split ring buffers free buffer size calculation.
When the free size available in the buffers less than the size of one item header,
the function prvGetCurMaxSizeNoSplit/AllowSplit() incorrectly returned the maxItemSize instead of 0.
This is due to the comparision between a negative and a positive value
where both operands are treated as unsigned during the comparision operation,
thereby treating the negative operand as a large integer.

Also added new unit tests to test buffer-full and almost-full conditions
where this scenario is likely to be hit.

Closes #7344
Closes #7371
@AxelLin
Copy link
Contributor

AxelLin commented Dec 24, 2021

Just remind that v4.3 branch still does not include this fix.

@Alvin1Zhang
Copy link
Collaborator

@AxelLin Thanks for the attention, we have back ported this fix to 4.3, but still under internal reviewing, will sync up here once the fix is available on GitHub.

espressif-bot pushed a commit that referenced this issue Dec 31, 2021
…and resulted in incorrect free size for no-split/allow-split buffers

This commit fixes a bug in no-split and allow-split ring buffers free buffer size calculation.
When the free size available in the buffers less than the size of one item header,
the function prvGetCurMaxSizeNoSplit/AllowSplit() incorrectly returned the maxItemSize instead of 0.
This is due to the comparision between a negative and a positive value
where both operands are treated as unsigned during the comparision operation,
thereby treating the negative operand as a large integer.

Also added new unit tests to test buffer-full and almost-full conditions
where this scenario is likely to be hit.

Closes #7344
Closes #7371
espressif-bot pushed a commit that referenced this issue Jan 1, 2022
…and resulted in incorrect free size for no-split/allow-split buffers

This commit fixes a bug in no-split and allow-split ring buffers free buffer size calculation.
When the free size available in the buffers less than the size of one item header,
the function prvGetCurMaxSizeNoSplit/AllowSplit() incorrectly returned the maxItemSize instead of 0.
This is due to the comparision between a negative and a positive value
where both operands are treated as unsigned during the comparision operation,
thereby treating the negative operand as a large integer.

Also added new unit tests to test buffer-full and almost-full conditions
where this scenario is likely to be hit.

Closes #7344
Closes #7371
espressif-bot pushed a commit that referenced this issue Jan 6, 2022
…and resulted in incorrect free size for no-split/allow-split buffers

This commit fixes a bug in no-split and allow-split ring buffers free buffer size calculation.
When the free size available in the buffers less than the size of one item header,
the function prvGetCurMaxSizeNoSplit/AllowSplit() incorrectly returned the maxItemSize instead of 0.
This is due to the comparision between a negative and a positive value
where both operands are treated as unsigned during the comparision operation,
thereby treating the negative operand as a large integer.

Also added new unit tests to test buffer-full and almost-full conditions
where this scenario is likely to be hit.

Closes #7344
Closes #7371
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants