Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testsuite: revert travis-ci to vm builds and fix many broken &&-chains in tests #1167

Merged
merged 22 commits into from Aug 29, 2017

Conversation

Projects
None yet
4 participants
@grondo
Copy link
Contributor

grondo commented Aug 29, 2017

This PR reverts Travis-CI to use the vm builder instead of container builders by replacing sudo: false with sudo: true in .travis.yml. In my testing, this change completely resolves the intermittent (but high frequency) failures under travis such as hangs the write error problem, and really slow builds (#1145, #1141, etc.)

WIth that change in place, I was able to rebase many (minor) testsuite fixes I had on a branch, which fix up all the broken &&-chains in the testsuite, as well as add timeouts to some tests which have the potential to hang.

I think because many people have moved to the container build infrastructure, the VM build queue is not quite as long as it was getting before, so I don't think this change will slow down the turnaround time on Travis reporting.

This PR also uses one of the builders (currently the gcc-4.9 build) to run the suite with --chain-lint so we catch any broken &&-chains that might creep back in. (Many of these are harmless, however some broken && chains could cause a false positive in a test, so best to keep the testsuite suite in this regard)

grondo added some commits Jul 21, 2017

t0001-basic.t: fix broken &&-chain in tests
Fix broken `&&`-chains in some tests
t0003-module.t: fix broken &&-chain in tests
Fix some broken &&-chains in some sharness tests.

In some cases, the `&&` chain was broken presumably because a call
was expected to fail, e.g. in the valid/invalid rank flux-module
load/unload tests. These cases were fixed by either adding `!` or using
`test_must_fail`.
travis-ci: unshallow git clone so git-describe works
For flux version to be properly detected, we need to at least fetch
tags and possibly also unshallow the git clone. Do this before running
configure in travis-ci.
t0005-exec.t: fix broken &&-chains in tests
Where appropriate, fix broken &&-chain in tests.

The chain-lint check kludgery in sharness doesn't like when processes
are sent into background within the test, so add a flux_exec_bg()
shell function to wrap sending flux-exec into that background.
t0007-ping.t: fix broken &&-chain in tests
Fix broken &&-chain in tests so that --chain-lint passes on this test.
t0009-dmesg.t: fix broken &&-chain in tests
Add missing `&&`-chain in tests to make --chain-lint happy.
t0010-generic-utils.t: fix flux-version check
The check for flux-version output wasn't doing anything becuase of
leftover debugging in the test. In fact, the test itself had been
failing because of the use of extended regex pattern.

Remove the debugging from the test and fix the regex pattern.
t0010-generic-utils.t: fix broken &&-chain in test
Fix the broken &&-chain in the heaptrace tests by pulling out
the ||-based output checks into their own shell function.
t0015-cron.t: fix broken &&-chain in test
Fix broken &&-chains in test so that --chain-lint doesn't complain.
t0016-cron-faketime.t: fix broken &&-chain in tests
Fix broken &&-chain in tests so --chain-lint won't complain
t1001-barrier-basic.t: fix broken &&-chains in tests
Fix broken &&-chains in tests so that --chain-lint stops complaining
t1008-proxy.t: fix broken &&-chains in test
Fix broken &&-chain in tests so --chain-lint doesn't complain.

Some tests were checking for failed commands with

 `command; test $? -ne 0 && ...`

these were replaced with

 `test_must_fail command && ...`
t2000-wreck.t: fix broken &&-chains in tests
Fix broken &&-chains in tests so --chain-lint doesn't complain.
t2001-jsc.t: fix broken &&-chains in tests
Fix broken &&-chains in tests so that --chain-lint doesn't complain.
t2005-hwloc-basic.t: fix broken &&-chain in test
Fix a broken &&-chain in a test to make --chain-lint happy.
t1000-kvs.t: fix broken &&-chains in tests
Fix broken &&-chains in tests so that --chain-lint doesn't complain.
t1002-kvs-extra.t: fix broken &&-chain in tests
Fix a few broken &&-chains in tests so that --chain-lint doesn't
complain.
t0001-basic.t: run test_under_flux test under timeout
Timeout the test_under_flux test after 5s in case the test hangs
for unknown reason.
t0001-basic.t: add timeouts to all flux-start invocations
Since t0001-basic.t doesn't run under test_under_flux, it doesn't have
a test timeout. To ensure invidual tests using flux-start do not hang
the testsuite, add -o -Sinit.rc2_timeout option to each invocation
of flux-start.
travis-ci: revert to VM builds
Set `sudo: true` in .travis.yml to revert to VM based builds instead
of container/docker builds on Travis-CI. This is done because our
build appears to require more resources than are given to the
container-based Travis builds.
travis-ci: run one build with --chain-lint
Add chain-lint check to the gcc-4.9 build in travis.
This way, broken &&-chains will be detected before merging.
t3000-mpi-basic.t: fix broken &&-chain in test
Fix a broken &&-chain in a test so that --chain-lint doesn't complain.
@coveralls

This comment has been minimized.

Copy link

coveralls commented Aug 29, 2017

Coverage Status

Coverage decreased (-0.03%) to 78.025% when pulling 3341f7c on grondo:chain-lint into 1c99b02 on flux-framework:master.

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Aug 29, 2017

failures under travis such as hangs the write error problem

Of course after I stated that one test failed with write error so I'm not sure this is a fix for that problem

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Aug 29, 2017

Codecov Report

Merging #1167 into master will decrease coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1167      +/-   ##
==========================================
- Coverage    77.7%   77.65%   -0.05%     
==========================================
  Files         158      158              
  Lines       29075    29041      -34     
==========================================
- Hits        22592    22552      -40     
- Misses       6483     6489       +6
Impacted Files Coverage Δ
src/common/libflux/ev_flux.c 97.56% <0%> (-2.44%) ⬇️
src/connectors/ssh/ssh.c 83.25% <0%> (-1.92%) ⬇️
src/common/libflux/request.c 88.46% <0%> (-1.29%) ⬇️
src/cmd/flux-event.c 67.74% <0%> (-1.08%) ⬇️
src/common/libflux/dispatch.c 84.81% <0%> (-0.75%) ⬇️
src/connectors/local/local.c 87.4% <0%> (-0.75%) ⬇️
src/common/libkvs/kvs_txn.c 76.47% <0%> (-0.66%) ⬇️
src/common/libflux/message.c 80.94% <0%> (-0.36%) ⬇️
src/broker/overlay.c 71.67% <0%> (-0.35%) ⬇️
src/common/libflux/reactor.c 93.18% <0%> (-0.29%) ⬇️
... and 11 more
@garlick

This comment has been minimized.

Copy link
Member

garlick commented Aug 29, 2017

These are good fixes, even if we don't have a 100 pct fix for the spurious travis errors. I'm for merging this if you're ready.

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Aug 29, 2017

Yes, I think it is ready. Pretty straightforward really.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Aug 29, 2017

Thanks!

@garlick garlick merged commit 820301e into flux-framework:master Aug 29, 2017

3 of 4 checks passed

codecov/project 77.65% (-0.05%) compared to 1c99b02
Details
codecov/patch Coverage not affected when comparing 1c99b02...3341f7c
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.03%) to 78.025%
Details

@grondo grondo deleted the grondo:chain-lint branch Oct 25, 2017

@grondo grondo referenced this pull request May 10, 2018

Closed

0.9.0 Release #1479

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.