TimerEvent tests #5046

Merged
merged 4 commits into from
Feb 26, 2018

fkjagodzinski
Member

@fkjagodzinski fkjagodzinski commented Sep 7, 2017

Description

TimerEvent:

Status

READY

Member

@bulislaw bulislaw left a comment

In multiple tests we are checking the internal state of other objects (ones not under test). I think that will lead to problems when the internals change. I would say we should stick more closely to the APIs.

virtual ~TestTimerEvent() {
}

void insert(timestamp_t ts) {
Member

Why are we overriding the calls without changing anything? Can't we use them directly from TimerEvent?

Member Author

I needed them to be publicly accessible for test purposes. I've found a more elegant way to achieve that, so I'll update this code anyway.
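
One common way to get public access without forwarding wrappers is a using-declaration in a test-only subclass. The sketch below is an assumption about what such a fix could look like, not the code that landed in this PR; `EventBase` and `TestEvent` are hypothetical stand-ins, not the real `TimerEvent` classes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a driver class whose scheduling calls are protected.
class EventBase {
public:
    virtual ~EventBase() {}
    uint64_t last_timestamp() const { return _ts; }
    bool armed() const { return _armed; }

protected:
    void insert_absolute(uint64_t ts) { _ts = ts; _armed = true; }
    void remove() { _armed = false; }

private:
    uint64_t _ts = 0;
    bool _armed = false;
};

// Test-only subclass: the using-declarations re-export the protected members
// as public, so no wrapper that merely forwards the call is needed.
class TestEvent : public EventBase {
public:
    using EventBase::insert_absolute;
    using EventBase::remove;
};

// Usage: the test calls the base-class functions directly through the subclass.
inline void check_test_event() {
    TestEvent ev;
    ev.insert_absolute(42);
    assert(ev.armed());
    assert(ev.last_timestamp() == 42);
    ev.remove();
    assert(!ev.armed());
}
```

The access level of a using-declaration is set by the section it appears in, which is why placing it under `public:` is enough.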

* Then @a ticker_queue is set properly
*/
template<typename T>
void test_ticker_queue(void) {
Member

I would say this test goes too deep into the internals and tests other components. The problem with this approach is that internals can change, whereas the API should be stable. TimerEvent only calls the ticker API, so I would only test it to this depth.

Member Author

Fixed -- after the update, the tests don't check any internal structures.

T tte;
tte.sem_wait(0);

tte.set_future_timestamp(TEST_DELAY_US);
Member

Maybe we could check that, just after the call, the semaphore is not incremented.

template<typename T>
void test_insert(void) {
T tte;
tte.sem_wait(0);
Member

Why are we decrementing semaphore here? You can just initialise the semaphore with count of 0.
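
A minimal host-side sketch of that suggestion, assuming a non-blocking wait that reports the token count seen before taking one (which is what the `TEST_ASSERT_EQUAL(1, sem_slots)` pattern in this PR relies on). `CountingSem` and `FakeEvent` are hypothetical, single-threaded stand-ins, not the RTOS classes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-threaded counting semaphore used only for illustration.
class CountingSem {
public:
    explicit CountingSem(int32_t initial) : _count(initial) {}

    void release() { ++_count; }

    // Non-blocking take: returns the count seen before taking a token
    // (0 means nothing was available and nothing was taken).
    int32_t try_wait() {
        int32_t seen = _count;
        if (_count > 0) {
            --_count;
        }
        return seen;
    }

private:
    int32_t _count;
};

// Hypothetical event whose handler signals the test through the semaphore.
struct FakeEvent {
    CountingSem sem{0};  // starts empty, so no priming wait is needed
    void handler() { sem.release(); }
};

inline void check_fake_event() {
    FakeEvent ev;
    assert(ev.sem.try_wait() == 0);  // nothing signalled before the handler runs
    ev.handler();
    assert(ev.sem.try_wait() == 1);  // exactly one token after one handler call
    assert(ev.sem.try_wait() == 0);  // and it has been consumed
}
```

Starting from a count of 0 also makes the "not incremented yet" check suggested earlier in the review a one-line assertion.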


tte.insert_absolute(tte.ticker_read_us() - 1ULL);
int32_t sem_slots = tte.sem_wait(0);
TEST_ASSERT_EQUAL(1, sem_slots);
Member Author

Until #5051 is solved this assertion will fail on NUCLEO_F070RB & EFM32GG_STK3700.

Case("Test remove after insert_absolute", test_remove<TestTimerEventAbsolute>),

Case("Test insert zero", test_insert_zero<TestTimerEventRelative>),
Case("Test insert_absolute zero", test_insert_zero<TestTimerEventAbsolute>),
Member Author

This test will fail on NUCLEO_F070RB and EFM32GG_STK3700 boards until #5051 is fixed.

Case("Test insert_absolute zero", test_insert_zero<TestTimerEventAbsolute>),

Case("Test insert timestamp from the past", test_insert_past<TestTimerEventRelative>),
Case("Test insert_absolute timestamp from the past", test_insert_past<TestTimerEventAbsolute>),
Member Author

This test will fail on NUCLEO_F070RB and EFM32GG_STK3700 boards until #5051 is fixed.

@theotherjimmy
Contributor

@fkjagodzinski @bulislaw What's the status of this?

@0xc0170
Contributor

0xc0170 commented Sep 28, 2017

bump

@fkjagodzinski
Member Author

[...] What's the status of this?

@theotherjimmy @0xc0170 @bulislaw
This PR has to wait for PR #5176, which fixes issue #5051.

@fkjagodzinski
Member Author

Ready for CI -- preceding PR #5176 merged.

@bulislaw
Member

bulislaw commented Oct 2, 2017

/morph test

@mbed-bot

mbed-bot commented Oct 2, 2017

Result: SUCCESS

Your command has finished executing! Here's what you wrote!

/morph test

Output

mbed Build Number: 1491

All builds and test passed!

@0xc0170
Contributor

0xc0170 commented Oct 2, 2017

/morph test-nightly

@mbed-bot

mbed-bot commented Oct 3, 2017

Result: ABORTED

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 1497

Test failed!

@theotherjimmy
Contributor

/morph test-nightly

@mbed-bot

mbed-bot commented Oct 6, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 1533

Test failed!

@0xc0170
Contributor

0xc0170 commented Oct 6, 2017

Please look at the failures; some of the new test cases fail.

@fkjagodzinski
Member Author

Test case named "Test insert_absolute zero" fails on several devices with an assertion:

[1507236309.66][CONN][RXD] >>> Running case #6: 'Test insert_absolute zero'...
[1507236309.71][CONN][INF] found KV pair in stream: {{__testcase_start;Test insert_absolute zero}}, queued...
[1507236309.74][CONN][RXD] :183::FAIL: Expected 1 Was 0

which means the event handler was not called instantly after event insertion.

@0xc0170
Looks like the issue #5051 affects more boards than initially listed. Shall I reopen it or report a new one?

I could only check NRF51_DK and that seems to be a different case -- the handler is called eventually, but with a delay (~91 us) which looks more like #5159.

To sum up, all failing boards don't comply with expected behaviour, which is:

Test insert_absolute zero
Given an instance of @a TimerEvent subclass
When a timestamp of 0 us is set with @a insert_absolute()
Then an event handler is called instantly
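
The expected behaviour can be modelled on a host with a toy ticker. This is only a sketch under stated assumptions: `FakeTicker` is hypothetical and is not the mbed ticker API; it encodes one rule, namely that an absolute timestamp that is not in the future is treated as already due and dispatched immediately.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical host-side model of a ticker with a single pending event.
class FakeTicker {
public:
    void set_now(uint64_t us) { _now = us; }

    // Schedule a callback at an absolute time. A timestamp that is not in
    // the future (including 0) is already due, so it is dispatched at once.
    void insert_absolute(uint64_t ts, std::function<void()> cb) {
        if (ts <= _now) {
            cb();
            return;
        }
        _pending_ts = ts;
        _pending = cb;
    }

    // Move time forward and fire the pending event if it has become due.
    void advance_to(uint64_t us) {
        _now = us;
        if (_pending && _pending_ts <= _now) {
            std::function<void()> cb = _pending;
            _pending = nullptr;
            cb();
        }
    }

private:
    uint64_t _now = 0;
    uint64_t _pending_ts = 0;
    std::function<void()> _pending;
};

inline void check_insert_absolute_zero() {
    FakeTicker ticker;
    ticker.set_now(1000);
    int calls = 0;
    ticker.insert_absolute(0, [&calls] { ++calls; });
    assert(calls == 1);  // handler ran instantly, as the test case expects
}
```

On real hardware the "dispatch at once" step is where `fire_interrupt` comes in, which is the part the failing boards got wrong according to this thread.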

@0xc0170
Contributor

0xc0170 commented Oct 6, 2017

I looked at two devices and noticed some bugs in there. I can send a patch, but would like you to test it. I'll send a new PR shortly, or I can send you the patch via email and you can amend it here to make the tests pass. What do you think?

@0xc0170
Contributor

0xc0170 commented Oct 6, 2017

@fkjagodzinski this commit fixes all failures in this PR. I don't have all devices with me now to retest; can you cherry-pick my commit here so we can retest?

0xc0170@e514c9d

These failures emphasize how important tests are!

@fkjagodzinski
Member Author

Update:

@0xc0170
Contributor

0xc0170 commented Oct 6, 2017

Please amend the last commit; NRF5 should have

void us_ticker_fire_interrupt(void)
{
    uint32_t closest_safe_compare = common_rtc_32bit_ticks_get() + 2;

    nrf_rtc_cc_set(COMMON_RTC_INSTANCE, US_TICKER_CC_CHANNEL, RTC_WRAP(closest_safe_compare));
    nrf_rtc_event_enable(COMMON_RTC_INSTANCE, US_TICKER_INT_MASK);
}

I removed the timestamp from the equation; it should not be there.
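
The arithmetic behind the compare value can be checked on a host. This sketch assumes that `RTC_WRAP` masks a tick count to the width of the nRF5 RTC counter, which is 24 bits; the names below are hypothetical and are not taken from the target code. The offset of 2 ticks is copied from the snippet above.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the RTC counter is 24 bits wide, so compare values wrap at 2^24.
constexpr uint32_t kCounterBits = 24;
constexpr uint32_t kCounterMask = (1u << kCounterBits) - 1u;

// Hypothetical equivalent of RTC_WRAP: keep only the bits the counter has.
constexpr uint32_t rtc_wrap(uint32_t ticks) {
    return ticks & kCounterMask;
}

// Compare value two ticks ahead of the current counter, wrapped to 24 bits.
// It depends only on "now", not on any event timestamp.
constexpr uint32_t closest_safe_compare(uint32_t now_ticks) {
    return rtc_wrap(now_ticks + 2u);
}

inline void check_compare_wrap() {
    assert(closest_safe_compare(100u) == 102u);        // no wrap
    assert(closest_safe_compare(0x00FFFFFEu) == 0u);   // wraps exactly to zero
    assert(closest_safe_compare(0x00FFFFFFu) == 1u);   // wraps past zero
}
```

Because the value is derived from the current counter alone, the interrupt is requested as soon as the hardware allows, regardless of what is queued.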

@fkjagodzinski
Member Author

@cmonr From what I know, @maciejbocianski is about to look into this issue.

@maciejbocianski
Contributor

#6162 contains a fix for "tests-mbed_hal-sleep_manager_racecondition", which started failing after the introduction of another fix, for #5284 (Fire interrupt function broken for nrf5x targets).

maciejbocianski added a commit to maciejbocianski/mbed-os that referenced this pull request Feb 22, 2018
sleep_manager_racecondition test fix for devices with low CPU clock

This PR contains a fix for the sleep_manager_racecondition test
for very slow devices (like NRF51). It fixes the test itself
as well as side effects of the fix introduced in
 ARMmbed#5046 (us ticker: fix fire interrupt handling).

The idea of the test was to test a race condition between the main thread
and an interrupt handler calling the same function.
To test this efficiently, each handler call should interrupt
the main thread to make the race more likely.
On very slow devices (like NRF51), when we set a very low ticker period
(e.g. less than 1000 us for NRF51), there is not much time for thread scheduling.
On such slow devices, setting the period to 500 us means that
the main thread is scheduled very rarely and only the handler is
constantly called, making the test unreliable.
The fix introduced in ARMmbed#5046 (us ticker: fix fire interrupt handling)
changed the fire_interrupt function implementation, causing more
interrupt tailing and thus even less time for main thread scheduling.
After the introduction of ARMmbed#5046 (us ticker: fix fire interrupt handling),
running the sleep_manager_racecondition test on NRF51
(with ticker1.attach_us(&sleep_manager_locking_irq_test, 500);)
fails with a timeout because the interrupt
handler is constantly called and the main thread is never scheduled.
@@ -74,11 +77,19 @@ void COMMON_RTC_IRQ_HANDLER(void)

rtc_ovf_event_check();

if (m_common_sw_irq_flag & US_TICKER_SW_IRQ_MASK) {
m_common_sw_irq_flag &= ~US_TICKER_SW_IRQ_MASK;
Contributor

@0xc0170 @mprse
What about moving m_common_sw_irq_flag &= ~US_TICKER_SW_IRQ_MASK; to the us_ticker_clear_interrupt function
and flattening it to a single if, as it was before?

Contributor

Looking at the code flow, clear_interrupt is called right in the ticker_irq_handler, so this could be moved there.
I do not follow this part, though: "flatten it to single if as it was before"?

Contributor

I meant a single if like the one below. A single call of us_ticker_irq_handler will execute all outdated events:

    if ((m_common_sw_irq_flag & US_TICKER_SW_IRQ_MASK) || nrf_rtc_event_pending(COMMON_RTC_INSTANCE, US_TICKER_EVENT)) {
        us_ticker_irq_handler();
    }
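
A host-side model of that proposal, assuming the software flag is cleared inside the clear-interrupt step as suggested earlier in this thread. Everything here (`FakeRtc`, the mask value, the function names) is hypothetical and only mirrors the control flow, not the nRF5 target code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit position for the software-triggered ticker interrupt flag.
constexpr uint32_t kUsTickerSwIrqMask = 1u << 0;

// Hypothetical model of the state the shared RTC interrupt handler looks at.
struct FakeRtc {
    uint32_t sw_irq_flag = 0;       // set by a software "fire interrupt" request
    bool hw_event_pending = false;  // set by the hardware compare event
    int handler_calls = 0;
};

// Clearing the interrupt drops both sources, so the caller needs no extra
// bookkeeping for the software flag.
inline void clear_interrupt(FakeRtc &rtc) {
    rtc.sw_irq_flag &= ~kUsTickerSwIrqMask;
    rtc.hw_event_pending = false;
}

inline void ticker_irq_handler(FakeRtc &rtc) {
    clear_interrupt(rtc);
    ++rtc.handler_calls;  // one call is assumed to process all outdated events
}

// Single condition covering both the software and the hardware trigger.
inline void common_rtc_irq(FakeRtc &rtc) {
    if ((rtc.sw_irq_flag & kUsTickerSwIrqMask) || rtc.hw_event_pending) {
        ticker_irq_handler(rtc);
    }
}

inline void check_single_if() {
    FakeRtc rtc;
    rtc.sw_irq_flag |= kUsTickerSwIrqMask;
    rtc.hw_event_pending = true;
    common_rtc_irq(rtc);
    assert(rtc.handler_calls == 1);  // both sources pending, handler runs once
    common_rtc_irq(rtc);
    assert(rtc.handler_calls == 1);  // nothing pending any more
}
```

With two separate `if` blocks, the same situation could call the handler twice in one interrupt, which is the behaviour the single condition avoids.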

Contributor

+1

fkjagodzinski and others added 4 commits February 23, 2018 09:13
A few targets need more than just the pending IRQ to be set: their IRQ
handlers also check flags that must be set. This is the case for the targets
in this commit.
@fkjagodzinski
Member Author

Update:

Local tests show all is ok now; fingers crossed for the CI tests to pass this time. :)

@0xc0170
Contributor

0xc0170 commented Feb 23, 2018

/morph build

@mbed-ci

mbed-ci commented Feb 23, 2018

Build : SUCCESS

Build number : 1230
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/5046/

Triggering tests

/morph test
/morph uvisor-test
/morph export-build
/morph mbed2-build

@mbed-ci

mbed-ci commented Feb 23, 2018

@mbed-ci

mbed-ci commented Feb 23, 2018

Test : SUCCESS

Build number : 1031
Test logs : http://mbed-os-logs.s3-website-us-west-1.amazonaws.com/?prefix=logs/5046/1031

@fkjagodzinski
Member Author

Looks like this PR is finally ready for merge! Unless anyone has comments of course, @0xc0170 @cmonr?

adbridge pushed a commit that referenced this pull request Feb 23, 2018
sleep_manager_racecondition test fix for devices with low CPU clock

cmonr pushed a commit that referenced this pull request Feb 23, 2018
sleep_manager_racecondition test fix for devices with low CPU clock

@cmonr cmonr merged commit 3d37d81 into ARMmbed:master Feb 26, 2018
@fkjagodzinski fkjagodzinski deleted the timerevent_tests branch February 28, 2018 13:30
Successfully merging this pull request may close these issues.

Fire interrupt function broken for nrf5x targets