Release 2017.10 - RC1 #47

Closed
43 tasks done
haukepetersen opened this issue Oct 16, 2017 · 56 comments

Comments

@haukepetersen
Contributor

haukepetersen commented Oct 16, 2017

@miri64
Member

miri64 commented Oct 17, 2017

Task 1 successful

@smlng
Member

smlng commented Oct 17, 2017

Task 5 semi successful, ping6 on ff02::1

  • from samr21-xpro ✅, but 30% packet loss
  • from remote 🚫, gets stuck after first ping

see gist for details

@miri64
Member

miri64 commented Oct 17, 2017

Task 5 semi successful, ping6 on ff02::1

  • from samr21-xpro ✅, but 30% packet loss
  • from remote 🚫, gets stuck after first ping

see gist for details

Since they are experimental I wouldn't say it is blocking for this release (however @haukepetersen has the last word on this), but could you maybe debug what the problem is? For iotlab-m3 and samr21-xpro it works like a charm.

@smlng
Member

smlng commented Oct 17, 2017

Task 06 semi successful, ping6 on fe80:: of the other node

  • from samr21-xpro to remote ✅ , but nearly 40% packet loss
  • from remote to samr21-xpro 🚫, gets stuck even before first ping

results are in gist as above.

@miri64 I think the remote gets stuck when sending ping, because answering pings works.

@kYc0o
Contributor

kYc0o commented Oct 17, 2017

[edit]
Interop 08 Task 3 successful!! 😃
[/edit]

@miri64
Member

miri64 commented Oct 17, 2017

@kYc0o can you write down somewhere what you did, so that people other than you can test this in the future as well? ;-)

@smlng
Member

smlng commented Oct 17, 2017

Since they are experimental I wouldn't say it is blocking for this release

well if ping is not working on Zolertia remote (maybe no send at all) this would IMO be blocking. Will investigate ...

@miri64
Member

miri64 commented Oct 18, 2017

Task 1.1 failing / not completable. pic32-wifire fails randomly for some applications, and even with 170 GB of free disk space I run into a full disk about 20 applications in (I tried to avoid it by manually cleaning up afterwards, but that just takes too long, so the disk ran full tonight as of gnrc_udp). A potential solution would be to clean the build dirs directly after each build. I don't want to skip this task with the excuse "well Murdock builds it anyway", since we shouldn't trust automation too blindly (see RIOT-OS/RIOT#7741). This is nothing against Murdock in particular, just my personal stance regarding automation in general ;-).
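
The "clean build dirs directly after build" idea could be sketched roughly as follows. This is only an illustration, not what the release scripts actually do: the function name `build_and_clean`, the injectable `run` callable and the exact `make` arguments are assumptions.

```python
import subprocess


def build_and_clean(app_dirs, board, run=subprocess.call):
    """Build each application for `board`, then clean it right away.

    Returns the list of application directories whose build failed.
    `run` is injectable so the loop can be exercised without invoking make.
    """
    failed = []
    for app in app_dirs:
        # Assumed invocation: `make -C <app> BOARD=<board> all`
        if run(["make", "-C", app, "BOARD=" + board, "all"]) != 0:
            failed.append(app)
        # Clean immediately so only one build directory exists at a time.
        run(["make", "-C", app, "BOARD=" + board, "clean"])
    return failed
```

Cleaning after every single build trades some rebuild time for a bounded disk footprint, which seems to be the constraint described above.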

@miri64
Member

miri64 commented Oct 18, 2017

well if ping is not working on Zolertia remote (maybe no send at all) this would IMO be blocking. Will investigate ...

Could you at least provide an issue (if not already existent) to the mainline repository ;-)?

@miri64
Member

miri64 commented Oct 18, 2017

conn_can fails for mysterious reasons. When building separately it works, but with buildtest many boards fail (and since RIOT-OS/RIOT#7589 there is no error output anymore). Will investigate.

@miri64
Member

miri64 commented Oct 18, 2017

conn_can fails for mysterious reasons. When building separately it works, but with buildtest many boards fail (and since RIOT-OS/RIOT#7589 there is no error output anymore). Will investigate.

buildtest changes the environment so a certain lib (though I don't think it is required for non-native boards) can't be found. I will see if I can fix it.

@vincent-d
Member

@miri64 after I saw your message, I tested it quickly. It can't find libsocketcan, which is indeed required only for native. But I don't understand why buildtest adds it to the LINKFLAGS on non-native boards.

@miri64
Member

miri64 commented Oct 18, 2017

Analysed that: it is set in the environment when starting to build (reasons unclear) and the buildtests just take that environment.
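
If a build is started from a script, one way to keep an inherited variable such as LINKFLAGS from leaking into it is to hand the child process a filtered copy of the environment. The helper below is a hypothetical sketch (the name `sanitized_env` and the default `drop` list are assumptions); it does not reflect the fix that was eventually merged.

```python
import os


def sanitized_env(environ=None, drop=("LINKFLAGS",)):
    """Return a copy of the environment without the variables in `drop`."""
    source = os.environ if environ is None else environ
    return {key: value for key, value in source.items() if key not in drop}
```

The result could be passed as the `env=` argument of `subprocess.call` when starting the build, leaving the calling shell's environment untouched.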

@miri64
Member

miri64 commented Oct 18, 2017

The pic32-clicker/pic32-wifire failure (now, after analyzing the problem, I think both are affected, not just pic32-wifire) is caused by the following bug:

/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/shell_commands.a(sc_random.o): In function `_random_init':
sc_random.c:(.text._random_init+0xc): undefined reference to `timer_read'
/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/shell_commands.a(sc_icmpv6_echo.o): In function `.L61':
sc_icmpv6_echo.c:(.text._icmpv6_ping+0x1f8): undefined reference to `timer_read'
/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/shell_commands.a(sc_icmpv6_echo.o): In function `.L63':
sc_icmpv6_echo.c:(.text._icmpv6_ping+0x250): undefined reference to `timer_read'
/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/xtimer.a(xtimer_core.o): In function `_xtimer_lltimer_now':
../include/xtimer/implementation.h:(.text._xtimer_lltimer_now+0x0): undefined reference to `timer_read'
/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/xtimer.a(xtimer_core.o): In function `_lltimer_set':
../include/xtimer/implementation.h:(.text._lltimer_set+0xa): undefined reference to `timer_set_absolute'
/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/xtimer.a(xtimer_core.o): In function `xtimer_init':
../include/xtimer/implementation.h:(.text.xtimer_init+0x16): undefined reference to `timer_init'
/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/xtimer.a(xtimer.o): In function `_xtimer_spin':
xtimer.c:(.text._xtimer_spin+0x6): undefined reference to `timer_read'
xtimer.c:(.text._xtimer_spin+0xe): undefined reference to `timer_read'
/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/core.a(kernel_init.o): In function `.L4':
kernel_init.c:(.text.idle_thread+0x4): undefined reference to `pm_set_lowest'
/home/mlenders/Repositories/RIOT-OS/RIOT/examples/gnrc_networking/bin/pic32-wifire/gnrc_rpl.a(gnrc_rpl_dodag.o): In function `.L45':
gnrc_rpl_dodag.c:(.text.gnrc_rpl_parent_remove+0x3e): undefined reference to `timer_read'
collect2: error: ld returned 1 exit status
make[1]: *** [link] Error 1
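
To see at a glance which symbols are missing in a log like the one above (it mentions timer_read, timer_set_absolute, timer_init and pm_set_lowest), the linker output can be summarised with a small script. This is a sketch that assumes the GNU ld message format shown in the log; `undefined_symbols` is a made-up helper name.

```python
import re
from collections import Counter

# Matches GNU ld messages of the form: undefined reference to `symbol'
UNDEFINED_REF = re.compile(r"undefined reference to `([^']+)'")


def undefined_symbols(linker_output):
    """Count how often each symbol is reported as undefined."""
    return Counter(UNDEFINED_REF.findall(linker_output))
```

A short list of distinct symbols usually points more directly at the missing module (here, timer and power management functions) than the full log does.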

@miri64
Member

miri64 commented Oct 18, 2017

I will check if that is a regression of the various pm fixes in this release.

@haukepetersen
Contributor Author

Have been running all the tests that are automatically runnable for 02-Tests. So far, the following findings need to be investigated (as they failed right away):

[native]
lwip                           building [OK] flashing [OK] testing [FAILED]

[samr21-xpro]
cbor                           building [OK] flashing [OK] testing [FAILED]
driver_ds1307                  building [OK] flashing [OK] testing [FAILED]
evtimer_msg                    building [OK] flashing [OK] testing [FAILED]
evtimer_underflow              building [OK] flashing [OK] testing [FAILED]
gnrc_ipv6_ext                  building [OK] flashing [OK] testing [FAILED]
gnrc_ipv6_nib                  building [OK] flashing [OK] testing [FAILED]
gnrc_ipv6_nib_6ln              building [OK] flashing [OK] testing [FAILED]
gnrc_ndp2                      building [OK] flashing [OK] testing [FAILED]
gnrc_netif2                    building [OK] flashing [OK] testing [FAILED]
gnrc_sixlowpan                 building [OK] flashing [OK] testing [FAILED]
lwip                           building [OK] flashing [OK] testing [FAILED]
mutex_order                    building [OK] flashing [OK] testing [FAILED]
od                             building [OK] flashing [OK] testing [FAILED]
posix_semaphore                building [OK] flashing [OK] testing [FAILED]
rmutex                         building [OK] flashing [OK] testing [FAILED]
thread_flood                   building [OK] flashing [OK] testing [FAILED]
xtimer_usleep                  building [OK] flashing [OK] testing [FAILED]

[nucleo144-f746]
cbor                           building [OK] flashing [OK] testing [FAILED]
driver_hd44780                 building [OK] flashing [OK] testing [FAILED]
evtimer_msg                    building [OK] flashing [OK] testing [FAILED]
evtimer_underflow              building [OK] flashing [OK] testing [FAILED]
gnrc_ipv6_ext                  building [OK] flashing [OK] testing [FAILED]
gnrc_ipv6_nib                  building [OK] flashing [OK] testing [FAILED]
gnrc_ipv6_nib_6ln              building [OK] flashing [OK] testing [FAILED]
gnrc_ndp2                      building [OK] flashing [OK] testing [FAILED]
gnrc_netif2                    building [OK] flashing [OK] testing [FAILED]
gnrc_sixlowpan                 building [OK] flashing [OK] testing [FAILED]
lwip                           building [OK] flashing [OK] testing [FAILED]
mutex_order                    building [OK] flashing [OK] testing [FAILED]
od                             building [OK] flashing [OK] testing [FAILED]
posix_semaphore                building [OK] flashing [OK] testing [FAILED]
rmutex                         building [OK] flashing [OK] testing [FAILED]
thread_flood                   building [OK] flashing [OK] testing [FAILED]
unittests                      building [OK] flashing [OK] testing [FAILED]

See https://gist.github.com/haukepetersen/39f77ce8db55959ff270e6fb2223b9f4 for more detailed results.
INFO: for the test run, I rebased the RC1 branch on RIOT-OS/RIOT#7758 and RIOT-OS/RIOT#7756
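
Result listings in the format above are easy to post-process, for example to find failures shared by several boards. The following is a rough sketch that assumes exactly the line layout shown in this comment; the function name `failed_tests` is hypothetical.

```python
import re

RESULT_LINE = re.compile(
    r"^(\S+)\s+building \[(\w+)\] flashing \[(\w+)\] testing \[(\w+)\]$"
)


def failed_tests(report):
    """Map each `[board]` section to the set of tests whose testing step failed."""
    failures, board = {}, None
    for line in report.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new board section starts, e.g. "[samr21-xpro]".
            board = line[1:-1]
            failures[board] = set()
            continue
        match = RESULT_LINE.match(line)
        if match and board is not None and match.group(4) == "FAILED":
            failures[board].add(match.group(1))
    return failures
```

Intersecting the per-board sets (for instance with `set.intersection`) separates failures that are common to all boards from board-specific ones.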

@miri64
Member

miri64 commented Oct 19, 2017

I rebased the RC1 branch on RIOT-OS/RIOT#7758 and RIOT-OS/RIOT#7756

Wouldn't cherry-picking those commits on top of RC1 make more sense? Otherwise changes from master might get into your tests.

@haukepetersen
Contributor Author

Actually, that is what I did. I just used the wrong wording, sorry.

@haukepetersen
Contributor Author

Also, the run on the iotlab-m3 finished: https://gist.github.com/haukepetersen/dabb8a8e8e2220705c36af67b4119e70

They look pretty similar to the other two boards...

@haukepetersen
Contributor Author

Added a list of open issues to the original issue description above. Feel free to take on items (by putting your name next to them).

@miri64
Member

miri64 commented Oct 19, 2017

Current status of Task 1.1 (still running): pkg_fatfs is failing, but there is a fix: RIOT-OS/RIOT#7765. conn_can is also failing, but the fix was already merged.

@miri64
Member

miri64 commented Oct 19, 2017

@haukepetersen for tests/lwip: How did you manage the two boards (which are required for that test)? I never got the pexpect script properly to run for non-native platforms.

@kYc0o
Contributor

kYc0o commented Oct 19, 2017

Also, the run on the iotlab-m3 finished: https://gist.github.com/haukepetersen/dabb8a8e8e2220705c36af67b4119e70

There are some tests that shouldn't even be built, aren't there? Especially those for drivers we know are not attached to the device natively.

This raises another question: how do we test device drivers which are not attached? Because as I see in the results they're of course failing.

Then I also see some test (like driver_grove_ledbar) which build and the test succeeds... (knowing that there's no such device attached to the node).

@haukepetersen
Contributor Author

@haukepetersen for tests/lwip: How did you manage the two boards (which are required for that test)? I never got the pexpect script properly to run for non-native platforms.

I did not. All I did was go into the tests/lwip folder and call make all test. So if there is more to this test than just that, it should be documented and the test should behave nicely, e.g. print "sorry, can't run, you need two boards". As it does neither, I call the test failed :-)

@haukepetersen
Contributor Author

There are some tests that shouldn't even be built, aren't there? Especially those for drivers we know are not attached to the device natively.

Nope, there is no reason why they should not run. All we expect in that case is that the test will stop in a defined state, e.g. quitting while telling us no xxx device found or similar.

Because as I see in the results they're of course failing.

No, they are not necessarily failing. Failing IMHO means that they do something unexpected (or do nothing...). If a sensor is not present, the test should still run and end up in a defined state, which can be checked.

Then I also see some test (like driver_grove_ledbar) which build and the test succeeds...

These tests might well be successful from a software point of view: as some devices simply output stuff e.g. with their GPIO pins, they have no means of checking what actually happens on that pin. So from their view, the test runs just fine (as the driver initialization and further functions do not return any errors).

@kYc0o
Contributor

kYc0o commented Oct 19, 2017

Nope, there is no reason why they should not run. All we expect in that case is that the test will stop in a defined state, e.g. quitting while telling us no xxx device found or similar.

So what are we actually testing? The driver implementation? The interface with the board? The expected behaviour of the driver if everything is in place (e.g. device attached to the node)?

These tests might well be successful from a software point of view: as some devices simply output stuff e.g. with their GPIO pins, they have no means of checking what actually happens on that pin. So from their view, the test runs just fine (as the driver initialization and further functions do not return any errors).

That might be true for some drivers which have no means to know whether the device is actually there or whether it is able to return error codes. But in reality we have many more drivers which can (and actually do) return errors, which are then parsed either by the test or by the interface to which they're attached.

So, TL;DR: at what point does it make sense to perform this whole bunch of tests? What do we intend to test? Are all the tests written with this in mind?

@smlng
Member

smlng commented Oct 19, 2017

I re-tested tasks 5 and 6 of 04-Single Hop 6LoWPAN ICMP with the fix for cc2538 applied. The tests went through, but showed 30 to 50 % packet loss in both directions; the goal is 10 %, so the results are suboptimal.
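
For reference, the comparison against the 10 % goal comes down to simple arithmetic on the number of echo requests sent and replies received. The sketch below uses hypothetical helper names, and whether the goal is meant as an inclusive bound is an assumption.

```python
def packet_loss_percent(sent, received):
    """Packet loss in percent for `sent` echo requests and `received` replies."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def meets_goal(sent, received, goal_percent=10.0):
    """True if the measured loss does not exceed the goal (assumed inclusive)."""
    return packet_loss_percent(sent, received) <= goal_percent
```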

@cladmi
Contributor

cladmi commented Oct 19, 2017

@miri64 I am trying lwip/native, and messages do not go from one node to another, but I can manually send IP packets to the server using socat. With tcpdump I can see neighbor solicitation messages arrive on the tap interface, but nothing goes to the other node's interface. Do I need a specific configuration?

@cladmi
Contributor

cladmi commented Oct 19, 2017

Disabling tests for other boards in lwip RIOT-OS/RIOT#7766

@haukepetersen
Contributor Author

@kYc0o I see your point. But for now, my take is to go in small steps: first, let's try to make all tests terminate in a deterministic manner, e.g. a sensor test aborts saying the sensor is not present and the test script takes this as passing the test. Once we have all tests in order, we need to a) execute many of them also on setups where they are applicable, and b) go through all tests, see if they actually test what they are supposed to test, and fix them if needed...
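
The "defined end state" idea can be illustrated with a small classifier for captured test output. Everything here is hypothetical: the marker strings, the `classify` name and the three result labels are assumptions chosen for the sketch, not conventions used by the existing test scripts.

```python
def classify(output, device):
    """Classify captured test output as 'pass', 'not-present' or 'fail'.

    The marker strings are hypothetical; a real test would define its own.
    """
    if "no %s device found" % device in output:
        # Defined end state: the driver noticed the device is missing.
        return "not-present"
    if "[SUCCESS]" in output:
        return "pass"
    # Anything else (crash, hang, silence) is unexpected.
    return "fail"
```

Under the proposal above a test script would treat "not-present" as passing. Keeping it as a separate label simply preserves the information for setups where the device is expected to be attached.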

@cladmi
Contributor

cladmi commented Oct 19, 2017

Many of the tests are broken because the Makefiles do

tests:
# `testrunner` calls `make term` recursively, results in duplicated `TERMFLAGS`.
# So clears `TERMFLAGS` before run.
    TERMFLAGS= tests/01-run.py

And make/serial.inc.mk does not rewrite TERMFLAGS if it is empty, changed by: RIOT-OS/RIOT@2739354

Not all tests have this TERMFLAGS=.

Where do these multiple make term calls happen? If the TERMFLAGS= is not needed I could remove it, or replace it by unset TERMFLAGS; ./test/01.py and maybe add it to all tests?
I just need to know how it is used.
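
The difference between `TERMFLAGS= cmd` and `unset TERMFLAGS; cmd` is that the first hands the child process a variable that is defined but empty, while the second removes it altogether. The sketch below only demonstrates that distinction from the child's point of view; `probe_termflags` is a made-up helper, and how make/serial.inc.mk reacts to each case is not shown here.

```python
import os
import subprocess
import sys

# Child command that reports how it sees TERMFLAGS.
PROBE = [sys.executable, "-c",
         "import os; print(repr(os.environ.get('TERMFLAGS')))"]


def probe_termflags(mode):
    """Run the probe with TERMFLAGS set to '' ('empty') or removed ('unset')."""
    env = dict(os.environ)
    if mode == "empty":
        env["TERMFLAGS"] = ""        # like `TERMFLAGS= cmd`
    else:
        env.pop("TERMFLAGS", None)   # like `unset TERMFLAGS; cmd`
    result = subprocess.run(PROBE, env=env, stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout.strip()
```

In the "empty" case the child sees an empty string, in the "unset" case it sees no variable at all, so a build system that only fills in a default for undefined variables would behave differently in the two cases.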

@kYc0o
Contributor

kYc0o commented Oct 19, 2017

Oops, I performed a test and marked it as successful, but the main message got modified. Can we take it back or is it lost forever?

@cladmi
Contributor

cladmi commented Oct 19, 2017

cbor/samr21-xpro: cbor_stream_decode is completely broken: it makes the CPU reboot in the default case, and when re-ordering I get a HARD FAULT or it cannot decode multiple arrays.
I will try to compare the buffer with native to see whether the problem is only in the decoding or also in the serialized data.

Also for the test, printf cannot handle PRIu64, which is just printed as lu. Is there a solution for this?

@jnohlgard
Member

Regarding PRIu64, you need a newlib version built with long long support for printf. I think the arm gcc embedded toolchain is supposed to have this, not sure though.

@kYc0o
Contributor

kYc0o commented Oct 19, 2017

Spec 8 successfully completed.

@cladmi
Contributor

cladmi commented Oct 19, 2017

To run tests on samr21, testrunner needs to wait until make term has started before running make reset: RIOT-OS/RIOT#7769

Some more tests are working with this.

@miri64
Member

miri64 commented Oct 19, 2017

Task 1.1 finally finished. I noted down the tests that failed (apart from the pic32 issue) above. For most issues PRs already exist and some are even merged already. The only issue remaining is the mcuboot thing. @kYc0o can you have a look?

@kYc0o
Contributor

kYc0o commented Oct 20, 2017

@miri64 Can you copy/paste here or somewhere what fails?

@miri64
Member

miri64 commented Oct 20, 2017

@kYc0o

$ make -C tests/mcuboot/
make: Entering directory '/home/mlenders/Repositories/RIOT-OS/RIOT/tests/mcuboot'
Traceback (most recent call last):
  File "/home/mlenders/Repositories/RIOT-OS/RIOT/dist/tools/mcuboot/imgtool.py", line 4, in <module>
    from imgtool import keys
  File "/home/mlenders/Repositories/RIOT-OS/RIOT/dist/tools/mcuboot/imgtool/keys.py", line 5, in <module>
    from Crypto.Hash import SHA256
ImportError: No module named 'Crypto'
/home/mlenders/Repositories/RIOT-OS/RIOT/makefiles/multislot.mk:18: recipe for target '/home/mlenders/Repositories/RIOT-OS/RIOT/tests/mcuboot/bin/nrf52dk/key.pem' failed
make: *** [/home/mlenders/Repositories/RIOT-OS/RIOT/tests/mcuboot/bin/nrf52dk/key.pem] Error 1
make: Leaving directory '/home/mlenders/Repositories/RIOT-OS/RIOT/tests/mcuboot'

@miri64
Member

miri64 commented Oct 20, 2017

Found no doc on that dependency, installing python-crypto did not solve that.
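
Python packages are installed per interpreter, so one way to narrow this down is to ask the interpreter that actually runs imgtool.py whether it can find `Crypto`. The quoted module name in the error message looks like Python 3 output, in which case a package installed for Python 2 would not help, but that is a guess. The helper name `missing_modules` below is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `missing_modules(["Crypto"])` with the same interpreter as the build would return `["Crypto"]` if the dependency is missing for that interpreter, which allows a clearer error message before the traceback occurs.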

@miri64
Member

miri64 commented Oct 20, 2017

[native]
lwip                           building [OK] flashing [OK] testing [FAILED]

I was unable to confirm that. Were your TAP interfaces set up properly (I think it is fair to assume a test server would do that once at start-up and would not have to do it for every test)? [edit]You need two TAP interfaces connected by a bridge (i.e. what ./dist/tools/tapsetup/tapsetup provides you with)[/edit]

@miri64
Member

miri64 commented Oct 20, 2017

$ make -C tests/lwip test
make: Entering directory '/home/mlenders/Repositories/RIOT-OS/RIOT/tests/lwip'
./tests/01-run.py
Testing for (<Board 'native',port='tap0',serial=None>, <Board 'native',port='tap1',serial=None>): 
....
make: Leaving directory '/home/mlenders/Repositories/RIOT-OS/RIOT/tests/lwip'

@cladmi
Contributor

cladmi commented Oct 20, 2017

For samr21-xpro, I tested the 'failed' tests. I needed to add the testrunner commit to wait for make term to be started. Then more of them passed.

  • unittests: does not compile because not everything fits into ROM/RAM.

Broken but easy fixes

  • lwip: test only works on native: tests/lwip: only enable test for native board RIOT#7766

  • thread_flood: python test is not upgraded to testrunner so does not do 'make reset'. However the test is working

  • posix_semaphore: the test code is broken: it uses 3 timestamps instead of 2, does weird comparisons and takes the stop timestamp after doing printf. I will provide a fix for it.
    For native it is also broken sometimes, because we expect to return within 100 usec, and Linux is not real-time, so it can fail. tests/posix semaphore: fix test4 RIOT#7782

  • gnrc_ipv6_ext: requires ENABLE_DEBUG in gnrc_ipv6.c, would it make sense to check that it is done using grep ? It cannot be configured from the test because of recursive makefiles. tests/posix semaphore: fix test4 RIOT#7782

Work to do

Broken tests because of taking current timestamp

The tests read the current timestamp between expect calls and do some verification with it, but this is invalid. Expect does some buffering and reads data in advance, so it does not return from expect as soon as the characters arrive.

Test execution with make term is OK.

  • gnrc_sock_ip
  • gnrc_sock_udp
  • xtimer_usleep
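
One way around the host-side timestamp problem is to let the device report its own measurements and have the script compare only those values. The sketch below assumes a hypothetical output line of the form "Slept for 1005 us (expected: 1000 us)"; the real tests may print something different, and `timing_errors` is a made-up name.

```python
import re

# Hypothetical device output: "Slept for <measured> us (expected: <expected> us)"
SLEPT = re.compile(r"Slept for (\d+) us \(expected: (\d+) us\)")


def timing_errors(output, tolerance_us=100):
    """Return (measured, expected) pairs that deviate by more than the tolerance.

    Both values come from the device output, so host-side buffering in
    expect does not affect the check.
    """
    errors = []
    for measured, expected in SLEPT.findall(output):
        measured, expected = int(measured), int(expected)
        if abs(measured - expected) > tolerance_us:
            errors.append((measured, expected))
    return errors
```

With this approach the script no longer needs to call a clock on the host at all, so it does not matter when expect returns.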

@cladmi
Contributor

cladmi commented Oct 20, 2017

I was unable to confirm that. Were your TAP interfaces set up properly (I think it is fair to assume a test server would do that once at start-up and would not have to do it for every test)?

If it could be simply detected, it would be good to check it is configured before running test or else print an error.

@miri64
Member

miri64 commented Oct 20, 2017

If it could be simply detected, it would be good to check it is configured before running test or else print an error.

See haukepetersen/riotsandbox#1

@miri64
Member

miri64 commented Oct 20, 2017

Also: it is. If you don't have a TAP, make term will error on execution (which will also be an error with make test). Whether the TAP is connected via a bridge is not easily (portably) testable.
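
On Linux specifically, both questions (does the TAP exist, and is it attached to a bridge) can be answered from sysfs, which matches the point that such a check is not portable. The sketch below relies on an enslaved interface having a `master` symlink under /sys/class/net; the function name `tap_status` and the `sysfs_root` parameter are assumptions added for illustration and testing.

```python
import os


def tap_status(name, sysfs_root="/sys/class/net"):
    """Report whether interface `name` exists and which bridge it belongs to.

    Linux-only: relies on sysfs, where an enslaved interface has a
    `master` symlink pointing at its bridge.
    """
    path = os.path.join(sysfs_root, name)
    if not os.path.isdir(path):
        return {"exists": False, "bridge": None}
    master = os.path.join(path, "master")
    if os.path.islink(master):
        bridge = os.path.basename(os.path.realpath(master))
    else:
        bridge = None
    return {"exists": True, "bridge": bridge}
```

A test runner could call this for tap0 and tap1 and print a readable error if either is missing or if they do not share the same bridge.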

@miri64
Member

miri64 commented Oct 20, 2017

driver_ds1307: not working

They are failing because the device is not connected. (How) can one check for this in code?

@kYc0o
Contributor

kYc0o commented Oct 20, 2017

See comment.

@miri64
Member

miri64 commented Oct 20, 2017

This does not really answer my question... I know that the device needs to be there, and from what I know of the future test system you can actually say that it is provided, but there needs to be some safe state if it is not, so that "uninformed" testers running the test locally don't start complaining "hey, this test isn't working so it must be broken" (which IMHO it is if there is no safe state, but that is just my 2ct).

So, TL;DR: at what point does it make sense to perform this whole bunch of tests? What do we intend to test? Are all the tests written with this in mind?

At the minimum: that the driver compiles for that particular board, and if you have the device you can also check whether it works

@A-Paul
Member

A-Paul commented Oct 23, 2017

I did the tests on native in the old-fashioned way. Almost all succeeded. I made some comments in the issue I created.

@miri64
Member

miri64 commented Oct 23, 2017

So, TL;DR: at what point does it make sense to perform this whole bunch of tests? What do we intend to test? Are all the tests written with this in mind?

At the minimum: that the driver compiles for that particular board, and if you have the device you can also check whether it works

Also: testing if the initialization is working properly in an error case is also a valid test ;-)

@miri64
Member

miri64 commented Oct 23, 2017

driver_ds1307                  building [OK] flashing [OK] testing [FAILED]

Fixed in RIOT-OS/RIOT#7788

@miri64
Member

miri64 commented Oct 24, 2017

Any reason why my multihop results were reset?

@cgundogan
Member

cgundogan commented Oct 24, 2017

Any reason why my multihop results were reset?

damn these GitHub checkmarks. I started doing them, because they were not checked :/ I will skip them now.

@haukepetersen
Contributor Author

To prevent any further hiccups while editing the issue above, please use the following Google document for tracking from now on:
https://docs.google.com/spreadsheets/d/1CT21QQCwU-zsjmic1E19BnkchnmAYE0neIdL0AYJq9Y/edit#gid=0

@haukepetersen
Contributor Author

Thanks everyone, seems to me like everything is covered. Moving on to RC2!
