New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update timing tests to be robust #4599

Merged
merged 5 commits into from Jul 7, 2017

Conversation

Projects
None yet
6 participants
@studavekar
Collaborator

studavekar commented Jun 21, 2017

In the new approach the host controls the device activity when the test
should start, finish and pass/fail status is send to device. Also deprecate
the test cases which can't accurately test.

Status

READY

Migrations

If this PR changes any APIs or behaviors, give a short description of what API users should do when this PR is merged.

NO

Related PRs

List related PRs against other branches: None

Todos

  • Tests
@0xc0170

Minor things that I believe could help us to maintain/improve these tests

TESTS/mbedmicro-rtos-mbed/basic/main.cpp Outdated
DigitalOut led1(LED1);
void update_tick_thread() {

This comment has been minimized.

@0xc0170

0xc0170 Jun 21, 2017

Member

Please fix the style, if we are changing we should align it. Should be separate commit

TESTS/mbed_drivers/ticker/main.cpp Outdated
utest::v1::status_t one_ticker_case_setup_handler_t(const Case *const source, const size_t index_of_case) {
ticker1 = new Ticker();
return greentea_case_setup_handler(source, index_of_case);
utest::v1::status_t one_ticker_case_setup_handler_t(const Case * const source, const size_t index_of_case) {

This comment has been minimized.

@0xc0170

0xc0170 Jun 21, 2017

Member

I believe this was good as it was *const source ?, same as int *pointer ?

TESTS/host_tests/timing_drift_auto.py Outdated
def compute_parameter(self, failure_criteria=0.05):
nominal_time = (self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0

This comment has been minimized.

@0xc0170

0xc0170 Jun 21, 2017

Member

Can you add any comments to this host files, there are lto of magic numbers, computations. It would explain what it does, for future would help to edit (at least me :-) ).

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Little math nitpick:

This:

(self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0

Can be simplified to this:

((self.round_trip_final_start + self.round_trip_final_end) - (self.round_trip_base_start + self.round_trip_base_end)) / 2.0

^ that can technically be simplified further, but I think its a bit more obvious what you're doing if you leave it that way.

Not necessary to change though if you think its more readable as is.

TESTS/mbedmicro-rtos-mbed/basic/main.cpp Outdated
#define SIGNAL_PRINT_TICK 0x01
volatile uint32_t callback_trigger_count = 0;

This comment has been minimized.

@0xc0170

0xc0170 Jun 21, 2017

Member

I might be asking too much but I never had an idea what test does, can we add something here to the code file to explain?

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jun 21, 2017

@bridadan can you please review, As Martin suggested will add few comments to explain the magic numbers :)

@bridadan

This comment has been minimized.

Contributor

bridadan commented Jun 21, 2017

@studavekar will do. Would you also mind posting a log from running one of these tests?

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jun 21, 2017

Here are the logs for the run

test_run.txt

@bridadan

Most of the logic looks good! Mostly just commenting on cleaning up some dead code and some print statements.

TESTS/host_tests/timing_drift_auto.py Outdated
# this value is obtained for measurements when there is 0 transport delay and we want accurancy of 5%
time_measurement_multiplier = 80
def _callback_result(self, key, value, timestamp):
# We should not see result data in this test
self.__result = False
def _callback_end(self, key, value, timestamp):

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Since you're not registering an end callback in setup, I'd remove this function.

TESTS/host_tests/timing_drift_auto.py Outdated
# this value is obtained for measurements when there is 0 transport delay and we want accurancy of 5%
time_measurement_multiplier = 80
def _callback_result(self, key, value, timestamp):

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Doesn't look like this is being used, I'd remove this.

TESTS/host_tests/timing_drift_auto.py Outdated
def compute_parameter(self, failure_criteria=0.05):
nominal_time = (self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Little math nitpick:

This:

(self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0

Can be simplified to this:

((self.round_trip_final_start + self.round_trip_final_end) - (self.round_trip_base_start + self.round_trip_base_end)) / 2.0

^ that can technically be simplified further, but I think its a bit more obvious what you're doing if you leave it that way.

Not necessary to change though if you think its more readable as is.

TESTS/host_tests/timing_drift_auto.py Outdated
def compute_parameter(self, failure_criteria=0.05):
nominal_time = (self.round_trip_final_start + self.round_trip_final_end) / 2.0 - (self.round_trip_base_start + self.round_trip_base_end) / 2.0
transport_error = self.round_trip_final_time / 2.0 + self.round_trip_base_time / 2.0

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Same thing here about simplifying. Not necessary though.

TESTS/host_tests/timing_drift_auto.py Outdated
self.log("min_range : {} , max_range : {}".format(min_range, max_range))
if (min_pass_value >= min_range) and (max_pass_value <= max_range):
self.log("Test passed !!!")

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Can we change this print to something a bit more formal: "Time within passing range"

TESTS/host_tests/timing_drift_auto.py Outdated
self.log("Test passed !!!")
self.__result = True
elif (min_pass_value > max_range) or (max_pass_value < min_pass_value):
self.log("Test Failed !!!")

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Can we change this print to something a bit more formal: "Time outside of passing range. Timing drift seems to be present."

TESTS/host_tests/timing_drift_auto.py Outdated
self.__result = False
else:
self.__result = True
self.log("Test Failed/Retry !!!")

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Can we change this print to something a bit more formal: "Test inconclusive due to transport delay, retrying"

TESTS/host_tests/timing_drift_auto.py Outdated
self.log("Time required to measure drift accurately : {} ".format(measurement_stretch))
if measurement_stretch > self.max_measurement_time:
self.log("Measurement time is high to accurately qualify device timer")

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

The log message is a bit vague here. Could it be: "Time required to determine device timer is too high due to transport delay, skipping"

TESTS/host_tests/timing_drift_auto.py Outdated
if measurement_stretch > self.max_measurement_time:
self.log("Measurement time is high to accurately qualify device timer")
else:
time.sleep(measurement_stretch)

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Could you print here that you'll be waiting for x seconds? Just so the user knows how long to wait.

TESTS/host_tests/timing_drift_auto.py Outdated
self.log("Actual Time : {}".format(device_time))
self.log("Nominal Time : {}".format(nominal_time))
self.log("min_pass : {} , max_pass_value : {}".format(min_pass_value, max_pass_value))

This comment has been minimized.

@bridadan

bridadan Jun 21, 2017

Contributor

Could you print min_pass_value instead of min_pass for consistency?

@studavekar studavekar force-pushed the studavekar:timing_tests_update branch Jun 21, 2017

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jun 22, 2017

triggering a request to check CI if it effects other platform, have tested locally on couple of devices.

/morph test

@mbed-bot

This comment has been minimized.

mbed-bot commented Jun 22, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test

Output

mbed Build Number: 608

Build failed!

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jun 22, 2017

/morph test

@mbed-bot

This comment has been minimized.

mbed-bot commented Jun 22, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test

Output

mbed Build Number: 610

Test failed!

@studavekar studavekar force-pushed the studavekar:timing_tests_update branch Jun 22, 2017

TESTS/mbed_drivers/ticker/main.cpp Outdated
Case("Timers: 2x tickers", two_ticker_case_setup_handler_t, test_case_2x_ticker, two_ticker_case_teardown_handler_t),
Case("Timers: 2x callbacks", two_ticker_case_setup_handler_t, test_case_2x_callbacks,two_ticker_case_teardown_handler_t),
};
Case("Timers: 1x ticker", one_ticker_case_setup_handler_t,test_case_1x_ticker, one_ticker_case_teardown_handler_t),

This comment has been minimized.

@bridadan

bridadan Jun 22, 2017

Contributor

Looks like the indent got really wacky here!

Thread gt_conn_thread(osPriorityNormal, TEST_STACK_SIZE);
tick_thread.start(update_tick_thread);
gt_conn_thread.start(gt_comm_wait_thread);

This comment has been minimized.

@bridadan

bridadan Jun 22, 2017

Contributor

I would recommend not creating another thread for this and just to run it on the main stack to reduce memory requirements.

Oops, just realized you need control over the priorities, so disregard my comment

studavekar added some commits Jun 21, 2017

Update timing tests to be robust.
In the new approach the host controls the device activity when the test
should start, finish and pass/fail status is send to device. Also deprecate
the test cases which can't accurately test.
Deprecate testcases related to time drift.
1) RtosTimer is been Deprecated by EventQueue.
2) wait_us testing can't be reliably tested.
Increased stack to 1024 from 768 and corrected assert.
1) mbed-os-tests-mbedmicro-rtos-mbed-basic tests use to fail when run in loop
after 20-25, This was because the stack used by test application was in range of
744-840 bytes. So bumped the stack size to 1024 bytes.

2) Corrected the assert and taken out the assert which was not required.

@studavekar studavekar force-pushed the studavekar:timing_tests_update branch to 7390725 Jun 22, 2017

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jun 22, 2017

have taken care of most of the review comments, will trigger nightly to check if this patch helps with the results

/morph test-nightly

@mbed-bot

This comment has been minimized.

mbed-bot commented Jun 23, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 618

Test failed!

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jun 23, 2017

NCS36510 ticker test failed as per logs looks like timer is ticking faster for this device. Need to investigate more on it.

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Jun 29, 2017

@studavekar Any update for this patch?

@c1728p9

This comment has been minimized.

Contributor

c1728p9 commented Jun 29, 2017

You could probably get rid of most of these tests since they are testing the same functionality of the us ticker. A more concise test could test the us ticker in the hal directly by reading the time rather than incrementing a variable on an interrupt.

In addition, if you test against a ticker_data_t object then you could use the same test to for both the us ticker (via get_us_ticker_data) and the lp ticker (via get_lp_ticker_data).

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jun 29, 2017

We have seen with current implementation NCS36510 timer is off by 1 sec which caused the failure, That is not the case with other devices. However as suggested we can read us ticker values directly.

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Jun 30, 2017

@c1728p9 Valid points. I reviewed the ticker test, and came to the conclusion is not only testing timing via interrupts, but also how fast can you remove/add a new callback - I suspect this is why on semi board fails, because detach() - removing from queue, is under critical section, and onsemi has 2x 16bit tickers using ISR. I suspect this is the main problem (it accumulates by the time), and if you remove detach/attach, and have just one callback , it ticks as it should - no time drift. I tried to improve ticker in on semi target, there's some code in HAL that can be removed, still was failing though.
Even the first test case (1x callback) is for onsemi on the limit (I get failures from time to time):

[1498820977.89][HTST][INF] min_pass : 8.73335007429 , max_pass : 9.54240006208 for 5.0%%
[1498820977.89][HTST][INF] min_inconclusive : 8.63360005617 , max_inconclusive : 9.65265008211
[1498820977.89][SERI][TXD] {{pass;0}}
[1498820977.89][HTST][INF] Time reported by device: 9.458

very very close for this target :) We can squeeze it by few more milliseconds

We shall rethink entire ticker test. As above, we can test us ticker in HAL tests, and having this attach/detach might require more time than this test currently sets. Reading the tickers directly would be much more accurate.

We should move forward with this, timing failures the last couple of days are blocking patches. I can help, let me know your thoughts and lets make this in.

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Jun 30, 2017

This is my proposal:

Remove attach/detach from this test, use 2 tickers, one firing at 0,2,4 intervals, the other one at 1,3,5 . Same result as this one, should not have any timedrift caused by attach/detach and basically the functionality we care about in this test. Detach/attach might be another test case, here is my commit 0xc0170@31c3734

the timing is still close to the upper limit but passing on ARM and GCC ARM . Shall we make this change?

I got also ncs changes but it hangs from time to time, exploring what is going on., My goal is that in ticker ISR, we hsould not read(), but rather use a reminder, here is a snippet 0xc0170@05fe987

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Jun 30, 2017

the timing is still close to the upper limit but passing on ARM and GCC ARM . Shall we make this change?
I got also ncs changes but it hangs from time to time, exploring what is going on., My goal is that in ticker ISR, we hsould not read(), but rather use reminder, here is a snippet 0xc0170/mbed-os@05fe987

An update: This commit now 0xc0170@b3d3e3c is passing for GCC and ARM. We are not at the upper limit anymore. Somewhere in the middle . compare this to the above I posted.

[1498824908.79][HTST][INF] min_pass : 8.80935000181 , max_pass : 9.62219978571 for 5.0%%
[1498824908.79][HTST][INF] min_inconclusive : 8.70579980612 , max_inconclusive : 9.736650002
[1498824908.79][HTST][INF] Time reported by device: 9.211
@0xc0170

This comment has been minimized.

Member

0xc0170 commented Jun 30, 2017

I accidentally closed this, sorry ! Reopened. Please review my two commits, tests are now passing with some margin ! (using heap/stack statistics, previously I was testing before as it was failing even without them).

This is for 1x callback, GCC ARM, on semi target, we can see that from times close to the upper limit we are much much down.

[1498825219.20][HTST][INF] min_pass : 8.88820017576 , max_pass : 9.70304993391 for 5.0%%
[1498825219.20][HTST][INF] min_inconclusive : 8.7789499402 , max_inconclusive : 9.82380019426
[1498825219.20][HTST][INF] Time reported by device: 9.292

Based on this, we might not even need to update the ticker test, just this on semi ticker ! tested on this PR, all passing.

@studavekar studavekar changed the title from Update timing tests to be robust. to Update timing tests to be robust Jun 30, 2017

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Jul 3, 2017

@studavekar What is the status of this PR? Besides the ncs fix, that should land soon, what else is missing from this patch? Just CI and ready for review/integration?

It is important to fix ticker/timer tests and their implementations. As I noticed, the one device left is that maxim, that might also have some ticker problems. Anyone investigated this already? I don't have this target available at the moment to investigate.

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jul 3, 2017

@0xc0170 once ncs fix in in, we should be good with this PR. Have not investigated maxim device.

@theotherjimmy

This comment has been minimized.

Contributor

theotherjimmy commented Jul 3, 2017

/morph test-nightly

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jul 3, 2017

Will also trigger export-build in parallel to this PR to check if it can handle load and actually fixes the issue.

@mbed-bot

This comment has been minimized.

mbed-bot commented Jul 3, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 723

Test failed!

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jul 3, 2017

features_netsocket_cellular_generic_modem_driver_tests_unit_tests_default tests failed on all the platform , #4672 should fix it.

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jul 4, 2017

/morph test-nightly

@0xc0170 0xc0170 added needs: CI and removed needs: work labels Jul 4, 2017

@mbed-bot

This comment has been minimized.

mbed-bot commented Jul 4, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 726

Test failed!

@0xc0170

This comment has been minimized.

Member

0xc0170 commented Jul 4, 2017

MAX32625MBED-IAR.tests-mbedmicro-rtos-mbed-basic.tests-mbedmicro-rtos-mbed-basic failure :( @studavekar Do you have this target to test and reproduce this?

@bridadan

@0xc0170 @studavekar I can see why this one failed! And its very fixable 😄 Please see my comments on where you need to make the changes. I'll explain below.

This test failed because of a "feature" of greentea. If a greentea doesn't see a response to the first __sync message, it will send another one after a second. This is done in the off case that the device hasn't fully booted when the first one is sent. However this can be caused by high transport delay as well.

From the log:

[1499167387.28][HTST][INF] host test executor ver. 1.1.8
[1499167387.28][HTST][INF] copy image onto target...
[1499167387.28][COPY][INF] Waiting up to 60 sec for '04150000b08042ed00000000000000000000000097969904' mount point (current is 'H:')...
        1 file(s) copied.
[1499167393.71][HTST][INF] starting host test process...
[1499167394.07][CONN][INF] starting connection process...
[1499167394.07][CONN][INF] notify event queue about extra 60 sec timeout for serial port pooling
[1499167394.07][CONN][INF] initializing serial port listener... 
[1499167394.07][PLGN][INF] Waiting up to 60 sec for '04150000b08042ed00000000000000000000000097969904' serial port (current is 'COM8')...
[1499167394.08][HTST][INF] setting timeout to: 60 sec
[1499167394.43][SERI][INF] serial(port=COM8, baudrate=9600, read_timeout=0.01, write_timeout=5)
[1499167394.43][SERI][INF] reset device using 'default' plugin...
[1499167394.68][SERI][INF] waiting 1.00 sec after reset
[1499167395.68][SERI][INF] wait for it...
[1499167395.68][SERI][TXD] mbedmbedmbedmbedmbedmbedmbedmbedmbedmbed
[1499167395.68][CONN][INF] sending up to 2 __sync packets (specified with --sync=2)
[1499167395.68][CONN][INF] sending preamble '69235bae-b7bb-435c-a1fd-f8370feebe91'
[1499167395.68][SERI][TXD] {{__sync;69235bae-b7bb-435c-a1fd-f8370feebe91}}
[1499167396.70][CONN][INF] resending new preamble 'cd7132ca-d89f-4f91-ac13-b36caa40763f' after 1.02 sec
[1499167396.70][SERI][TXD] {{__sync;cd7132ca-d89f-4f91-ac13-b36caa40763f}}

The key line there being resending new preamble 'cd7132ca-d89f-4f91-ac13-b36caa40763f' after 1.02 sec. So at this point, the device should have two __sync messages in the serial buffer.

However, the greentea will only remove the first __sync message from the buffer. The test will then continue. For these tests, you send the initial request from the device with the key timing_drift_check_start. Then a call to greentea_parse_kv is made. This will actually pull the second __sync message from the serial buffer, NOT the responds from the host test for timing_drift_check_start. This will cause the test to use the UUID sent as the start time, causing the test to fail.

So TL;DR, in the places where I've specified in this review, just call greentea_parse_kv in a loop and check to make sure the key received matches the key you're expecting. It's only necessary to do this at the start of each test case.

ticker1->attach_us(&ticker_callback_0, ONE_MILLI_SEC);
// wait for 1st signal from host
greentea_parse_kv(_key, _value, sizeof(_key), sizeof(_value));

This comment has been minimized.

@bridadan

bridadan Jul 5, 2017

Contributor

Please call this in a loop, double checking that the key is what you expect it to be.

ticker1->attach_us(ticker_callback_1_switch_to_2, ONE_MILLI_SEC);
// wait for 1st signal from host
greentea_parse_kv(_key, _value, sizeof(_key), sizeof(_value));

This comment has been minimized.

@bridadan

bridadan Jul 5, 2017

Contributor

Please call this in a loop, double checking that the key is what you expect it to be.

timeout.attach_us(set_incremeant_count, ONE_MILLI_SEC);
// wait for 1st signal from host
greentea_parse_kv(_key, _value, sizeof(_key), sizeof(_value));

This comment has been minimized.

@bridadan

bridadan Jul 5, 2017

Contributor

Please call this in a loop, double checking that the key is what you expect it to be.

Thread tick_thread(osPriorityNormal, TEST_STACK_SIZE);
tick_thread.start(print_tick_thread);
// wait for 1st signal from host
greentea_parse_kv(_key, _value, sizeof(_key), sizeof(_value));

This comment has been minimized.

@bridadan

bridadan Jul 5, 2017

Contributor

Please call this in a loop, double checking that the key is what you expect it to be.

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jul 5, 2017

Yeah i thought the uart stream is corrupted with below string received on host.

 found KV pair in stream: {{__sync;842000}}, queued... 

Can loop forever till we get the proper key.

@bridadan

This comment has been minimized.

Contributor

bridadan commented Jul 5, 2017

Yeah not a corruption, really just greentea attempting to ensure a handshake happens when the system is under heavy load. When tests don't have a heavy reliance (for messages containing test information) on the host PC, you never notice it doing this and it just works. So in this case you just have to handle this behavior in the test.

validate base_time key received
This is required as greentea can send multiple __sync packets and cause the
to be in unexpected state.
@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jul 5, 2017

Have added a check to validate the key received from greentea. Also have triggered morph test and export build in parallel to load the systems. We shouldn't have any timing tests failure under such load.

/morph test-nightly

@mbed-bot

This comment has been minimized.

mbed-bot commented Jul 5, 2017

Result: FAILURE

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 735

Test failed!

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jul 5, 2017

Changes are good there are no timing tests failure :). tests-mbed_functional-functionpointer test failed but doesn't seem to be related.

@studavekar

This comment has been minimized.

Collaborator

studavekar commented Jul 5, 2017

/morph test-nightly

@mbed-bot

This comment has been minimized.

mbed-bot commented Jul 6, 2017

Result: SUCCESS

Your command has finished executing! Here's what you wrote!

/morph test-nightly

Output

mbed Build Number: 737

All builds and test passed!

@theotherjimmy theotherjimmy merged commit 50fdca8 into ARMmbed:master Jul 7, 2017

4 checks passed

Cam-CI uvisor Build & Test Success
Details
ci/morph-test-nightly Job has completed
Details
continuous-integration/jenkins/pr-head This commit looks good
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment