New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hardware flow control on NRF52 series #8292

Merged
merged 2 commits into from Oct 26, 2018

Conversation

Projects
None yet
@marcuschangarm
Contributor

marcuschangarm commented Oct 2, 2018

Description

Due to buggy flow control logic in the UARTE, the stop signal is not being set as it is supposed to when the the module is not ready to receive data.

This commit signals the sender to halt transmitting when a DMA buffer is full and only continue again when the atomic FIFO buffer has been emptied. This allows platforms with hardware flow control to minimize all buffers and rely on flow control instead.

Pull request type

[x] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Breaking change
@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 2, 2018

@0xc0170 0xc0170 requested a review from TacoGrandeTX Oct 2, 2018

@@ -331,6 +331,20 @@ static void nordic_nrf5_uart_timeout_handler(uint32_t instance)
/* No activity detected, no timeout set. */
nordic_nrf5_uart_state[instance].ticker_is_running = false;
/* Check if hardware flow control is set and signal sender to stop.

This comment has been minimized.

@naveenkaje

naveenkaje Oct 2, 2018

Contributor

This block of code is repeated 3 times. Would it make sense to wrap this up in it's own function?

This comment has been minimized.

@marcuschangarm

marcuschangarm Oct 2, 2018

Contributor

It's being called twice. It interrupt handler is convoluted as it is, I don't think adding another function will make it easier to read.

@naveenkaje

This comment has been minimized.

Contributor

naveenkaje commented Oct 2, 2018

Please provide reference to the bug in UARTE (such as errata from Nordic) / a Github issue? It will be useful.

@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 2, 2018

The bug is an artifact of the UARTE API, because there is no safe way to flush the DMA buffer and swap it for a new one. When you follow the documentation and flush the DMA you'll get a ~6 ms period where the UARTE is not ready to receive. During this period the hardware flow control is inactive because the hardware flow control is only ready when the UARTE is ready to receive.

This PR manually toggles the flow control pin during this timeout window.

@TacoGrandeTX

This comment has been minimized.

Contributor

TacoGrandeTX commented Oct 2, 2018

Sounds like a dicey problem! Do you have a test that showcases the problem and fix? DMA buffers are just 1 byte - right?

@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 2, 2018

DMA buffers are at least 5 bytes. You can connect a logic analyzer to any application. You'll notice the RTS line is never set even though it should be.

@deepikabhavnani

This comment has been minimized.

Contributor

deepikabhavnani commented Oct 4, 2018

Build requested by @marcuschangarm
Query : Is it for 5.10.1 or 5.101.2?

/morph build

@mbed-ci

This comment has been minimized.

mbed-ci commented Oct 4, 2018

Build : SUCCESS

Build number : 3241
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/8292/

Triggering tests

/morph test
/morph mbed2-build

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Oct 5, 2018

/morph export-build

@0xc0170 0xc0170 added needs: CI and removed needs: review labels Oct 5, 2018

@0xc0170

0xc0170 approved these changes Oct 5, 2018

@mbed-ci

This comment has been minimized.

@mbed-ci

This comment has been minimized.

@theotherjimmy

This comment has been minimized.

Contributor

theotherjimmy commented Oct 5, 2018

/morph test

NRF52_DK timed out.

@theotherjimmy

This comment has been minimized.

Contributor

theotherjimmy commented Oct 5, 2018

@marcuschangarm Is there a chance that the NRF52 timeout could have been caused by this change?

@mbed-ci

This comment has been minimized.

@NirSonnenschein

This comment has been minimized.

Contributor

NirSonnenschein commented Oct 6, 2018

failures in tests for NRF52_DK timeouts and sync issues. this doesn't seem incidental. @marcuschangarm please take a look

@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 8, 2018

/morph test

@mbed-ci

This comment has been minimized.

@cmonr

This comment has been minimized.

Contributor

cmonr commented Oct 9, 2018

/morph build

New commit > new SHA > new Build step needed.

@marcuschangarm marcuschangarm force-pushed the marcuschangarm:fix_flow branch from b335151 to d366cf5 Oct 16, 2018

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Oct 16, 2018

@studavekar @NirSonnenschein
The failures occur in a test with heavy UART traffic. I suspect the CI infrastructure not following the flow control signal causes the Tx buffer to overflow.

@marcuschangarm not sure if Daplink supports flow control from host <-->daplink<---- hw fw ctrl ---> device.

We also need to make review the test, we should avoid heavy uart traffic in tests. If we want to stress uart it should part of a separate test.

Failure :

  1. [1539363423.79][CONN][RXD] :144::FAIL: Expected 'passed' Was 'pased' - passed --> pased
@cmonr

This comment has been minimized.

Contributor

cmonr commented Oct 16, 2018

@c1728p9

not sure if Daplink supports flow control from host <-->daplink<---- hw fw ctrl ---> device.

Thoughts?

@c1728p9

This comment has been minimized.

Contributor

c1728p9 commented Oct 16, 2018

not sure if Daplink supports flow control from host <-->daplink<---- hw fw ctrl ---> device.

I took a look at the DAPLink code and flow control is enabled for the NRF52 boards.

@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 17, 2018

@c1728p9
Does DAPLink support flow control on the host side as well?

@c1728p9

This comment has been minimized.

Contributor

c1728p9 commented Oct 17, 2018

@marcuschangarm yep, it is built into the USB CDC protocol and can't be turned off.

@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 17, 2018

@c1728p9
So if the CI enables flow control then it should work all the way down?

@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 23, 2018

The failing test is for a component that is not supported by the NRF52_DK target.

Both the rtc_time and rtc_time_conv HAL tests are missing this header:

#if !DEVICE_RTC
#error [NOT_SUPPORTED] RTC API not supported for this target
#endif
@0xc0170

This comment has been minimized.

Member

0xc0170 commented Oct 24, 2018

@mprse Can you review ^^ ?

Fix hardware flow control on NRF52 series
Due to buggy flow control logic in the UARTE, the stop signal
is not being set as it is supposed to when the the module is
not ready to receive data.

This commit signals the sender to halt transmitting when a DMA
buffer is full and only continue again when the atomic FIFO
buffer has been emptied. This allows platforms with hardware
flow control to minimize all buffers and rely on flow control
instead.

@marcuschangarm marcuschangarm force-pushed the marcuschangarm:fix_flow branch from d366cf5 to 1ad3b49 Oct 24, 2018

@c1728p9

This comment has been minimized.

Contributor

c1728p9 commented Oct 24, 2018

So if the CI enables flow control then it should work all the way down?

Yep

@mprse

This comment has been minimized.

Member

mprse commented Oct 25, 2018

Both the rtc_time and rtc_time_conv HAL tests are missing this header:

#if !DEVICE_RTC
#error [NOT_SUPPORTED] RTC API not supported for this target
#endif

These tests verifies time conversion functions from the following library which is available for all targets and does not require RTC. Also even when RTC is not supported on the specific platform it is then replaced by LPTICKER if available (which is NRF52_DK case I think):

#if DEVICE_RTC
static void (*_rtc_init)(void) = rtc_init;
static int (*_rtc_isenabled)(void) = rtc_isenabled;
static time_t (*_rtc_read)(void) = rtc_read;
static void (*_rtc_write)(time_t t) = rtc_write;
#elif DEVICE_LPTICKER
#include "drivers/LowPowerTimer.h"
#define US_PER_SEC 1000000
static SingletonPtr<mbed::LowPowerTimer> _rtc_lp_timer;
static uint64_t _rtc_lp_base;
static bool _rtc_enabled;
static void _rtc_lpticker_init(void)
{
_rtc_lp_timer->start();
_rtc_enabled = true;
}
static int _rtc_lpticker_isenabled(void)
{
return (_rtc_enabled == true);
}
static time_t _rtc_lpticker_read(void)
{
return _rtc_lp_timer->read_high_resolution_us() / US_PER_SEC + _rtc_lp_base;
}
static void _rtc_lpticker_write(time_t t)
{
_rtc_lp_base = t;
}
static void (*_rtc_init)(void) = _rtc_lpticker_init;
static int (*_rtc_isenabled)(void) = _rtc_lpticker_isenabled;
static time_t (*_rtc_read)(void) = _rtc_lpticker_read;
static void (*_rtc_write)(time_t t) = _rtc_lpticker_write;
#else /* DEVICE_LPTICKER */
static void (*_rtc_init)(void) = NULL;
static int (*_rtc_isenabled)(void) = NULL;
static time_t (*_rtc_read)(void) = NULL;
static void (*_rtc_write)(time_t t) = NULL;
#endif /* DEVICE_LPTICKER */

I can confirm that the failing test sends lots of data between host and device using uart. So this is a great test for this PR 😅 . Looking at the last results we have failure on IAR and ARM compilers.
For IAR there is a regular test timeout case, so we could consider increasing the test execution timeout.
For ARM we got some kind of communication error:
:144::FAIL: Expected 'passed' Was 'pased'
Looks like one byte has been lost during the transmission which is bad and may cause some unexpected issues in the future, so I think this needs to be investigated.

@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 25, 2018

So, what's the plan?

  • Characters are missing, but the CI doesn't support flow control.
  • Tests are failing for components we don't claim support for.
Remove debug flow control from NRF52_DK and NRF52840_DK
Currently flow control is not supported by the CI and enabling it
causes unwanted side effects.
@marcuschangarm

This comment has been minimized.

Contributor

marcuschangarm commented Oct 25, 2018

/morph build

@mbed-ci

This comment has been minimized.

mbed-ci commented Oct 25, 2018

Build : SUCCESS

Build number : 3470
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/8292/

Triggering tests

/morph test
/morph export-build
/morph mbed2-build

@cmonr cmonr added needs: CI and removed needs: work labels Oct 25, 2018

@mbed-ci

This comment has been minimized.

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Oct 26, 2018

Eclipse error, but all good there

/morph export-build

@mbed-ci

This comment has been minimized.

@mbed-ci

This comment has been minimized.

@cmonr cmonr merged commit a5b7573 into ARMmbed:master Oct 26, 2018

15 checks passed

ci-morph-build build completed
Details
ci-morph-exporter build completed
Details
ci-morph-mbed2-build build completed
Details
ci-morph-test test completed , RTOS ROM(+0 bytes) RAM(+0 bytes)
Details
continuous-integration/jenkins/pr-head This commit looks good
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
jenkins-ci/cloud-client-test Success
Details
jenkins-ci/unittests Success
Details
travis-ci/astyle Passed, 536 files (+0 files)
Details
travis-ci/docs Local docs testing has passed
Details
travis-ci/events Passed, runtime is 10076 cycles (+70 cycles)
Details
travis-ci/gitattributestest Local gitattributestest testing has passed
Details
travis-ci/licence_check Local licence_check testing has passed
Details
travis-ci/littlefs Passed, code size is 8372B (+0.00%)
Details
travis-ci/tools-py2.7 Local tools-py2.7 testing has passed
Details

@cmonr cmonr removed the needs: CI label Oct 26, 2018

@marcuschangarm marcuschangarm deleted the marcuschangarm:fix_flow branch Nov 14, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment