Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use us ticker for tickless on devs with wrapper #9785

Merged
merged 4 commits into from Mar 1, 2019

Conversation

c1728p9
Copy link
Contributor

@c1728p9 c1728p9 commented Feb 21, 2019

Description

The low power ticker wrapper layer adds a large amount of interrupt latency. This can cause dropped bytes at a baud of 115200 on some devices. To prevent this by default use the microsecond ticker for tickless on devices which make use of the low power ticker wrapper.

Pull request type

[x] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

Reviewers

@jeromecoutant
@ithinuel

@c1728p9
Copy link
Contributor Author

c1728p9 commented Feb 21, 2019

The targets updated by this PR will by default no longer be at risk of dropping bytes at higher baudrates. This comes at the expense of higher power consumption since they cannot enter deep sleep when tickless is running off of the microsecond ticker.

Some issues related to this problem:
#9577
#8714

CC @screamerbg

@c1728p9
Copy link
Contributor Author

c1728p9 commented Feb 21, 2019

#9786 will also help with interrupt latency

@ciarmcom ciarmcom requested review from ithinuel and a team February 21, 2019 04:00
@ciarmcom
Copy link
Member

@c1728p9, thank you for your changes.
@ithinuel @ARMmbed/mbed-os-maintainers @ARMmbed/mbed-os-core please review.

@kjbracey
Copy link
Contributor

Does this mean that #9786 is not sufficient for the platforms having problems?

rtos/TARGET_CORTEX/mbed_rtx_idle.cpp Outdated Show resolved Hide resolved
@@ -47,7 +48,11 @@ extern "C" {
{
// Do not use SingletonPtr since this relies on the RTOS
if (NULL == os_timer) {
os_timer = new (os_timer_data) rtos::internal::SysTimer();
#if MBED_CONF_TARGET_TICKLESS_FROM_US_TICKER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also be automatically inferred if DEVICE_LPTICKER is unset, rather than needing to be set manually?

#if !DEVICE_LPTICKER
#  define USE_US_TICKER true
#else 
#  define USE_US_TICKER MBED_CONF_TARGET_TICKLESS_FROM_US_TICKER
#endif

maybe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is automatically inferred then MBED_CONF_TARGET_TICKLESS_FROM_US_TICKER may be false (default) yet the microsecond ticker is used. This could lead to problems if MBED_CONF_TARGET_TICKLESS_FROM_US_TICKER is used elsewhere. I added compiler asserts so you get a clear error message if the required DEVICE_*TICKER is missing.

@jeromecoutant
Copy link
Collaborator

@mattbrown015
Maybe you can also have a look on this ?
Thx

@LMESTM

The low power ticker wrapper layer adds a large amount of interrupt
latency. This can cause dropped bytes at a baud of 115200 on some
devices. To prevent this by default use the microsecond ticker for
tickless on devices which make use of the low power ticker wrapper.
@c1728p9
Copy link
Contributor Author

c1728p9 commented Feb 21, 2019

Does this mean that #9786 is not sufficient for the platforms having problems?

On the NUCLEO_L073RZ I was still seeing overflows with #9786 if tickless was using the low power ticker, but this device has a core clock of 32MHz according to the platform page. Faster devices may not have this problem.

@mattbrown015
Copy link
Contributor

Maybe you can also have a look on this ?

@jeromecoutant Sorry, I've been moved onto another project that doesn't use Mbed OS. My Mbed OS project code has been frozen while we do the final testing before sign off.

I'm using the RTC in tickless rather than the LPTIM timer.

Apply the fixes found from astyle.
@jeromecoutant
Copy link
Collaborator

Hi

Here is my tests:

  • go on master branch
mbed test -m NUCLEO_L073RZ -t ARM -v -n tests-*tick*
mbedgt: test case results: 86 OK
mbed test -m NUCLEO_L073RZ -t ARM -v -n tests-*tick*
mbedgt: test case results: 1 FAIL / 5 SKIPPED / 80 OK

==> issue with tests-mbed_hal-common_tickers / lp ticker past interrupt test

mbed test -m NUCLEO_L073RZ -t ARM -v -n tests-*tick*
mbedgt: test case results: 1 FAIL / 5 SKIPPED / 80 OK

==> issue with tests-mbed_hal-common_tickers / lp ticker past interrupt test

@kjbracey
Copy link
Contributor

@jeromecoutant - so was that third result with both applied?

@jeromecoutant
Copy link
Collaborator

was that third result with both applied?

Yes.
Here is test in the other order:

  • go on master branch
mbed test -m NUCLEO_L073RZ -t ARM -v -n tests-*tick*
mbedgt: test case results: 86 OK
mbed test -m NUCLEO_L073RZ -t ARM -v -n tests-*tick*
mbedgt: test case results: 86 OK
mbed test -m NUCLEO_L073RZ -t ARM -v -n tests-*tick*
mbedgt: test case results: 1 FAIL / 5 SKIPPED / 80 OK

==> issue with tests-mbed_hal-common_tickers / lp ticker past interrupt test

@kjbracey
Copy link
Contributor

On the NUCLEO_L073RZ I was still seeing overflows with #9786 if tickless was using the low power ticker,

Has the situation changed, now that the critical section is reduced?

@c1728p9
Copy link
Contributor Author

c1728p9 commented Feb 25, 2019

Has the situation changed, now that the critical section is reduced?

Nope, the NUCLEO_L073RZ still has overflows.

@c1728p9
Copy link
Contributor Author

c1728p9 commented Feb 26, 2019

Hi @jeromecoutant, I was able to reproduce the failure you mention. I have been able to narrow down the failure, but I haven't been able to root cause it yet. It appears that on the NUCLEO_L073RZ after the first call to lp_ticker_init but before the time has overflowed at 0xFFFF, calling lp_ticker_set_interrupt with a value less than the current, LPTIM_CMP, causes the ticker ISR to fire sooner than it should.

Below is a test program which shows the problem. Use this program with this PR to show the problem. If the low power ticker is working correctly this program should cause an ISR to fire every ~2 seconds. Instead for the first two seconds interrupts fire rapidly, but after that return to the expected rate.

Program showing low power ticker interrupt firing too quickly:

#include "mbed.h"
#include "mbed_lp_ticker_wrapper.h"

#define MASK 0xFFFF
void wait_lp_cycles(uint8_t cycles)
{
    uint32_t start = lp_ticker_read();
    while (((lp_ticker_read() - start) & MASK) < cycles);
}

volatile uint32_t isr_count = 0;
void ticker_handler(const ticker_data_t *const ticker)
{
    lp_ticker_clear_interrupt();

    wait_lp_cycles(10);

    // Schedule next ISR for ~1 second
    lp_ticker_set_interrupt((lp_ticker_read() + 32768) & MASK);

    wait_lp_cycles(10);

    // Update ISR to fire 10 tick in the past (65526 ticks or ~2 seconds in the future)
    lp_ticker_set_interrupt((lp_ticker_read() - 10) & MASK);
    isr_count++;
}

int main()
{
    core_util_critical_section_enter();
    lp_ticker_wrapper_suspend();
    set_lp_ticker_irq_handler(ticker_handler);
    lp_ticker_init();
    lp_ticker_set_interrupt(lp_ticker_read() + 50);
    core_util_critical_section_exit();

    while (1) {
        isr_count = 0;
        wait(1);
        printf("ISRs per second %lu\r\n", isr_count);
    }
}

Output from this program on the NUCLEO_L073RZ:

ISRs per second 1366
ISRs per second 0
ISRs per second 1
ISRs per second 0
ISRs per second 1
ISRs per second 0
...

@c1728p9
Copy link
Contributor Author

c1728p9 commented Feb 27, 2019

@jeromecoutant are you only seeing failing tests on the NUCLEO_L073RZ? If so maybe the low power ticker should be disabled on this device until a fix can be made for it. What do you think?

@c1728p9
Copy link
Contributor Author

c1728p9 commented Feb 28, 2019

I think I was able to determine the root cause. Contrary to my previous comment, this problem can occur at any time and not just the first cycle after lp_ticker_init. The following sequence always causes the low power ticker to fire early:

  1. Call lp_ticker_set_interrupt with a match time that is greater than the current time but then less than or equal to 0xFFFF. In other words a value that satisfies lp_ticker_read() < match_time <= 0xFFFF
  2. Wait for this value to take effect - wait for LPTIM_FLAG_CMPOK to be set
  3. Before the previous match time expires make another call to lp_ticker_set_interrupt with a value less than the current time. In other words 0 <= new_match_time < lp_ticker_read()
  4. Interrupt fires early

Below is a modified version of the test program above. This causes the bug to manifest 100% of the time. Note - the bug is corrected by either removing the call to lp_ticker_set_interrupt(0xFFFF); in the ISR or by adding a 2nd call to lp_ticker_set_interrupt(0); in the ISR.

#include "mbed.h"
#include "mbed_lp_ticker_wrapper.h"

#define MASK 0xFFFF
void wait_lp_cycles(uint8_t cycles)
{
    uint32_t start = lp_ticker_read();
    while (((lp_ticker_read() - start) & MASK) < cycles);
}

volatile uint32_t isr_count = 0;
void ticker_handler(const ticker_data_t *const ticker)
{
    lp_ticker_clear_interrupt();

    wait_lp_cycles(10);

    // Schedule next ISR highest possible value
    // NOTE - Removing this line fixes the problem
    lp_ticker_set_interrupt(0xFFFF);

    /*
    // Alternatively, adding another write fixes this problem
    wait_lp_cycles(10);
    lp_ticker_set_interrupt(0);
    */

    wait_lp_cycles(10);

    // Update ISR to fire on the first tick
    // if working correctly this ISR should fire every 2 seconds.
    lp_ticker_set_interrupt(0);
    isr_count++;
}

int main()
{
    core_util_critical_section_enter();
    lp_ticker_wrapper_suspend();
    set_lp_ticker_irq_handler(ticker_handler);
    lp_ticker_init();
    lp_ticker_set_interrupt(lp_ticker_read() + 50);
    core_util_critical_section_exit();

    while (1) {
        isr_count = 0;
        wait(1);
        printf("ISRs per second %lu\r\n", isr_count);
    }
}

Output from this program on the NUCLEO_L073RZ:

ISRs per second 1367
ISRs per second 1366
ISRs per second 1368
ISRs per second 1367
ISRs per second 1367
ISRs per second 1367

Correct output (after applying either fix):

ISRs per second 0
ISRs per second 1
ISRs per second 0
ISRs per second 1
ISRs per second 0
ISRs per second 1
ISRs per second 0

Unfortunately I don't see a good fix for the L073 in mbed-os. A double write inside lp_ticker_set_interrupt requires blocking, which is problematic. @jeromecoutant @LMESTM do you have any ideas on this? Could you investigate this failure on your end?

@cmonr do you have any recommendations on how to proceed with this PR? This is an important fix to get in. Some options

  • Disable the lp ticker past interrupt test until the NUCLEO_L073RZ is corrected
  • Turn off low power ticker for the NUCLEO_L073R

@0xc0170
Copy link
Contributor

0xc0170 commented Feb 28, 2019

Turn off low power ticker for the NUCLEO_L073R

This would be my choice (test passes for majority of targets and it's valid use case ?).

@jeromecoutant
Copy link
Collaborator

Turn off low power ticker for the NUCLEO_L073R

No, there are many user using LP TICKER for DISCO L0 LORA boards

Due to a bug the LPTIM peripheral can cause interrupts to fire at the
wrong time on the NUCLEO_L073RZ. This patch switches the backend
of the low power ticker to the RTC until this bug is addressed.
Skip the systimer deep sleep test when running from the microsecond
ticker, since the microsecond ticker doesn't support operation in deep
sleep mode.
@c1728p9
Copy link
Contributor Author

c1728p9 commented Feb 28, 2019

I switched the Low power ticker backend of the NUCLEO_L073RZ to the RTC which does not have the previously investigated problem. Thanks @jeromecoutant for this suggestion.

I also added a commit turn off the deep sleep systimer tests when using the microsecond timer, since deep sleep isn't supported in this configuration.

@cmonr
Copy link
Contributor

cmonr commented Feb 28, 2019

CI started

@mbed-ci
Copy link

mbed-ci commented Feb 28, 2019

Test run: SUCCESS

Summary: 13 of 13 test jobs passed
Build number : 1
Build artifacts

@cmonr cmonr merged commit e393c2d into ARMmbed:master Mar 1, 2019
@c1728p9 c1728p9 deleted the default_to_us_ticker branch March 1, 2019 00:40
ABOSTM added a commit to ABOSTM/mbed-os that referenced this pull request Oct 18, 2019
Ticless on STM32 F4 boards causes SPI issue with following PR:
# 11682 Make FPGA tests to pass on CI targets (SPI, analogIn, PWM)

In asynch mode, using interrupts, SPI hardware detect an RX overrun.
Our understanding is that lpticker wrapper latency
causes issue similar to ARMmbed#8714 and ARMmbed#9785,
specially with SPI asynch which use interrupts.
ABOSTM added a commit to ABOSTM/mbed-os that referenced this pull request Oct 18, 2019
Ticless on STM32 F4 boards causes SPI issue with following PR:
# 11682 Make FPGA tests to pass on CI targets (SPI, analogIn, PWM)

In asynch mode, using interrupts, SPI hardware detect an RX overrun.
Our understanding is that lpticker wrapper latency
causes issue similar to ARMmbed#8714 and ARMmbed#9785,
specially with SPI asynch which use interrupts.
ABOSTM added a commit to ABOSTM/mbed-os that referenced this pull request Nov 14, 2019
Ticless on STM32 F4 boards causes SPI issue with following PR:
# 11682 Make FPGA tests to pass on CI targets (SPI, analogIn, PWM)

In asynch mode, using interrupts, SPI hardware detect an RX overrun.
Our understanding is that lpticker wrapper latency
causes issue similar to ARMmbed#8714 and ARMmbed#9785,
specially with SPI asynch which use interrupts.
yarbcy pushed a commit to yarbcy/mbed-os that referenced this pull request Nov 20, 2019
Ticless on STM32 F4 boards causes SPI issue with following PR:
# 11682 Make FPGA tests to pass on CI targets (SPI, analogIn, PWM)

In asynch mode, using interrupts, SPI hardware detect an RX overrun.
Our understanding is that lpticker wrapper latency
causes issue similar to ARMmbed#8714 and ARMmbed#9785,
specially with SPI asynch which use interrupts.
mprse added a commit to mprse/mbed-os that referenced this pull request Feb 3, 2020
FPGA SPI test fails while testing asynchronous mode.
It looks like lpticker wrapper latency causes issue similar to ARMmbed#8714 and ARMmbed#9785, especially with SPI async which use interrupts.
mprse added a commit to mprse/mbed-os that referenced this pull request Feb 3, 2020
FPGA SPI test fails while testing asynchronous mode.
It looks like  low-level-lpticker-wrapper latency causes issue similar to ARMmbed#8714 and ARMmbed#9785, especially with SPI async which use interrupts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants