Reset database and servers for each test #568
Explicitly set attribute's min_value to 0. Until now, min_value = 0 was carried from previous testcase (event::att_conf_event).
Write Short_attr_w value so that device is in ALARM state. Until now, the value = 10 was carried from previous testcase (CXX::cxx_attr_write, test_write_some_scalar_attributes).
Set abs_change property for attribute event_change_tst. Until now, property was carried from previous testcase (event::multi_event).
|
When looking at the first log of Travis, it seems like one test failed (37 - CXX::cxx_reconnection_zmq (Failed)) but Travis check status is green as if everything was fine. |
This is by design. We have problems with our testcases as they rely on timers and sleeps.
if ! docker exec -w "${build_dir}" cpp_tango ctest --output-on-failure -j8
then
  if ! docker exec -w "${build_dir}" cpp_tango ctest --output-on-failure --rerun-failed --repeat-until-fail 2
  then
    exit 1
  fi
fi
The test failed in the first run and then passed two times in a row on the second run, so I assumed that the test is unstable and the commit should be accepted. Is this behavior ok from your pov? |
|
Indeed, you already explained that in the long description of the Pull Request... Could we infer the number of available cores on the Travis machine and use that number (or a relevant number for this number of available cores) in the -j option? |
Do you think we need to determine the number of available cores, i.e. can it change? Per the Travis docs, it is fixed to 2: https://docs.travis-ci.com/user/reference/overview/
Also, there is $TRAVIS_NUMCORES: travis-ci/travis-build#1079
Anyway, I did a dummy commit. According to lscpu output and /proc/cpuinfo, there are two CPUs in the machine used by Travis. Judging by the clock rate and L3 cache, those may be Xeon E5-2699 v3 processors. There is some virtualization involved (KVM) and we have just one hyper-threaded core in each CPU. This means that we have four logical cores in total.
The problem is to determine the relation between the CPU cores and the number of parallel tests. Some tests are sleeping for a long time, some are waiting for I/O, etc., and in such cases another process may have a chance to run on the CPU.
From original post:
If we really want flexibility here, I propose to use: (num of jobs) = 2 * (num of cores). |
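To make the proposed rule concrete, here is a minimal sketch of deriving the ctest job count from the detected core count. The helper names parallel_jobs and detect_cores are hypothetical and not part of this PR; the factor of 2 is simply the proposal from the comment above, and the fallback to 1 core is an assumption for machines where neither nproc nor getconf is available.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): job count = 2 * number of cores.
parallel_jobs() {
    cores="$1"
    echo $((cores * 2))
}

# Detect logical cores: try nproc (GNU coreutils), then getconf,
# and fall back to 1 if neither is available (assumption).
detect_cores() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

njobs="$(parallel_jobs "$(detect_cores)")"

# Print the command instead of running it, so the sketch has no side effects.
echo "ctest --output-on-failure -j${njobs}"
```

With the four logical cores reported for the Travis machine, this rule would give 8 jobs, which matches the -j8 value already used in the CI script.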
|
@mliszcz I would be in favour of having this here. |
Ingvord
left a comment
Fine with me. Go ahead and merge!
|
@t-b thanks! Have you had a chance to read through the changes? @t-b, @Ingvord, @bourtemb
Currently I've made it mandatory, so that you can't start the database or device servers manually. All you can (and need to) do is to run |
|
I believe you can always run DevTest and co. manually using e.g. CLion, since these are just executables. Obviously one will need to run conf_devtest on a fresh database before running any test server. It would be nice to have some convenience script for that. Anyway, for me this seems to be enough to, let's say, debug server/client. I cannot envision any other complex case right away. Once anyone bumps into one, we will handle it. |
I agree with Igor. We can change the behaviour if needed in the future. |
|
@mliszcz I'll make the review after the rebase. |
|
@t-b thanks! I won't be able to rebase the PR for at least the next couple of days. Also, let's first merge #608, which will bring more conflicts and will require some extra changes to start the second database just for the group test. If you have any general comments regarding the proposed solution, please tell me. Otherwise I'll ask you for a review once I have the PR rebased. |
|
@mliszcz The general direction is very nice! |
t-b
left a comment
@mliszcz This is a really cool PR!!
The title only mentions resetting the database and servers, but in fact running the tests locally now just works! It took 240s here with 12 parallel jobs. I had to cherry-pick 4b5128f (zmqeventconsumer/zmqeventsupplier: Make it compatible with all stock cppzmq versions, 2019-06-05) though. So we need to get that merged :))
|
Thanks @t-b It will be hard to bring this whole PR up to date. |
|
@mliszcz I've merged the mentioned PRs. |
|
Obsoleted by #640. Let's continue the review in the other PR. |
This PR resets the database and device servers for each test.
There is a wrapper script:
The script:
<test-binary> (forwards all parameters)
Note: resetting the environment for each CxxTest's test_something() function is not in scope of this PR.
Changes
Testcase output is collected in the test_results directory:
Results
I've compared ctest output with and without changes in this PR. It is the same (timestamps, pids, etc. ... differ of course), except CXX::cxx_old_poll, which seems to produce different output on every run.
There is ~10s overhead for each test (starting and stopping the Docker containers and servers). This gives a ~15m increase (~90 × 10s = 900s) in sequential test execution of ~90 tests.
Before any changes (8b69890) - 1000s:
https://travis-ci.org/tango-controls/cppTango/jobs/546186693
With script (ae2dd81) - 1000 + 1000 = 2000s:
https://travis-ci.org/tango-controls/cppTango/jobs/546296376
With 8 parallel tests (d86ffeb) - 2000s / 8 = 250s:
https://travis-ci.org/tango-controls/cppTango/jobs/546399263
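The per-test overhead measured above comes from the start/stop cycle that wraps every test binary. The following is only a sketch of that pattern, not the script from this PR: start_environment, stop_environment and run_isolated_test are hypothetical names, and the two environment functions are stubs standing in for whatever actually starts and stops the database and device servers.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real setup/teardown steps (not from the PR).
# They log to stderr so the test's own stdout stays untouched.
start_environment() { echo "starting database and device servers" >&2; }
stop_environment()  { echo "stopping database and device servers" >&2; }

# Run one test command in a fresh environment, forwarding all arguments,
# and return the test's own exit status after teardown.
run_isolated_test() {
    start_environment || return 1
    "$@"
    rc=$?
    stop_environment
    return "$rc"
}

# Example invocation with a placeholder command instead of a real test binary.
run_isolated_test echo "test body ran"
```

Because the teardown always runs and the original exit status is returned, ctest would still see the real pass/fail result of the wrapped binary.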
If we run multiple tests in parallel, some may fail (due to timer expiration under high load).
If there are any failures in CI, I re-run failed tests sequentially and require them to pass two times in a row.
8 parallel tests is a reasonable value for Travis (0 - 4 tests usually fail). I tried with 16 parallel tests but on first run 21 out of 93 tests failed and there was no improvement in execution time.
Remarks
apt like: apt install package=x.y.z. I don't think it will do any good.
CMAKE_CTEST_COMMAND for coveralls plugin, but coveralls seems to be no longer in use. @Ingvord could you comment on this? Is it possible to see the coverage report for this PR on sonar?