Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

travis-ci: run tests using docker #1670

Merged
merged 18 commits into from Sep 27, 2018

Conversation

Projects
None yet
5 participants
@grondo
Copy link
Contributor

commented Sep 20, 2018

Ok, finally have a working docker integration with travis, built off @trws Dockerfiles and initial work in this area. The changes of note from Tom's work include:

  • Combine gcc (default), gcc 8, and clang 6.0 into a single Ubuntu 18.04 (bionic) image (it was just easier for me to work with one "base" ubuntu image)
  • Add missing deps to Dockerfiles to satisfy all tests
  • Since flux-security will soon be a moving target, remove its build from the "base" image(s), install a "travis" Dockerfile that builds a specified version of flux-security and optionally installs a user matching the userid running docker build. This is done so that we don't have to chown the working directory, which then becomes unreadable by the testing user after the build.
  • Added a src/test/docker/docker-run-checks.sh which abstracts the build of the ephemeral travis container and execution of the underlying travis_run.sh script inside the container. This makes it easy to run the same build as travis from your docker-capable workstation/laptop with a single command. E.g. you can run the default checks with make -j 8 in the centos7-base image with:
    src/test/docker/docker-run-checks.sh -j 8 -i centos7-base -- --with-flux-security.
Usage: docker-run-checks.sh [OPTIONS] -- [CONFIGURE_ARGS...]
Build docker image for travis builds, then run tests inside the new
container as the current user and group.

Uses the current git repo for the build.

Options:
 -h, --help                    Display this message
     --no-cache                Disable docker caching
 -i, --image=NAME              Use base docker image NAME (default=bionic-base)
 -S, --flux-security-version=N Install flux-security vers N (default=0.2.0)

 -j, --jobs=N                  Value for make -j (default=2)

 -d, --distcheck               Run 'make distcheck' instead of 'make check'
 -I, --interactive             Instead of running travis build, run docker
                                image with interactive shell.
  • Added a config.site to the centos7-base to give a hint to configure (for flux-security) that libdir=/usr/lib64 not /usr/lib.
  • redo and simplify .travis.yml to reduce number of builds and add CentOS7 build, etc.
  • a few other fixes found along the way.

Before this can go in, unfortunately, we'll need a strategy for flux-sched. We could copy a similar scheme and place a "travis" Dockerfile that builds and installs flux-security and flux-core and uses a similar scheme as here (maybe the same script)

A better idea might be to have flux-core travis builds automatically upload a docker image with flux-core installed to a fluxrm/flux-core:latest tag on Docker Hub, which would simplify not only builds against flux-core but would allow people to easily try out flux-core master by pulling from Docker Hub. We could also use this created tagged Docker images like fluxrm/flux-core:0.11.0.

However, I haven't implemented that yet.

Fixes #1648.

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2018

Oh, forgot to mention that currently both cppcheck and pylint are disabled here because the newer versions in Ubuntu 18.04 generated a lot of errors.

I found that if I let cppcheck parse the system include files, I got a lot less false positives (i.e. with -I /usr/include), however, the run took about 10x longer.

@codecov-io

This comment has been minimized.

Copy link

commented Sep 20, 2018

Codecov Report

Merging #1670 into master will decrease coverage by 0.37%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1670      +/-   ##
==========================================
- Coverage   79.68%   79.31%   -0.38%     
==========================================
  Files         186      186              
  Lines       34335    35059     +724     
==========================================
+ Hits        27361    27806     +445     
- Misses       6974     7253     +279
Impacted Files Coverage Δ
src/common/libpmi/pmi_strerror.c 66.66% <0%> (-13.34%) ⬇️
src/common/libflux/keepalive.c 82.35% <0%> (-10.99%) ⬇️
src/bindings/lua/lutil.c 78.12% <0%> (-7.25%) ⬇️
src/cmd/flux-logger.c 64.7% <0%> (-5.3%) ⬇️
src/common/libflux/response.c 79.62% <0%> (-3.93%) ⬇️
src/broker/rusage.c 61.11% <0%> (-3.6%) ⬇️
src/broker/exec.c 65% <0%> (-3.43%) ⬇️
src/common/libflux/content.c 86.66% <0%> (-3.34%) ⬇️
src/common/libutil/log.c 83.33% <0%> (-3.21%) ⬇️
src/common/libflux/reduce.c 75.15% <0%> (-3.08%) ⬇️
... and 149 more
@trws

This comment has been minimized.

Copy link
Member

commented Sep 21, 2018

This sounds great @grondo! As to your last point, I was thinking we could split the base images (unfortunately, it would work not to but the result would be huge) into a version with run dependencies and a version with build deps, then use the multi-stage build method from the docs to pull the built flux out of the build container into the run container and have the travis deploy target (or the script after success, doesn't really matter) push it up to dockerhub with a tag matching the release.

Doing a version of this that just builds off of the build image should be pretty straightforward, it's the optimizing that would likely be annoying.

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 21, 2018

Great suggestions @trws. I might try to punt on the optimization for now and try to get something that flux-sched builder could use, which would be built from the base image. It may be large, but a lot of the stuff needed by the flux-core build is also required for flux-sched, so we could consider this just a "build" image, and optimize the "run" image later...

@trws

This comment has been minimized.

Copy link
Member

commented Sep 21, 2018

@grondo grondo force-pushed the grondo:travis-docker branch 4 times, most recently from c7a591f to 80db79a Sep 27, 2018

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2018

Ok, latest changes here should be very close to ready for merging, including:

  • Added fixes for issues causing failures in docker builds, including #1679 and #1674
  • Optionally "tag" the resulting docker image after make install supon successful checks (docker-run-check.sh --tag=NAME) option, and push the result to docker hub. The docker hub image will be named fluxrm/flux-core:${IMG}-${tag} where tag is one of latest (if built as a result of a merge to master) vx.y.z (if built as a result of a tag) and IMG is the name of the image, e.g. bionic-base. For bionic-base images, a convenience copy of the tag is created without the bionic-base- prefix. E.g. that means there will be a fluxrm/flux-core:latest that points to the latest build from master.

This means that flux-sched's build could be modified to pull from fluxrm/flux-core and it will get an image with the latest flux-core build installed. If software is tied to a specific version of flux-core, this verison could be pulled down via fluxrm/flux-core:v0.10.0 (though that doesn't exist yet since it was built before these changes)

The work to take advantage of docker builds in flux-sched has not been done yet.

Finally, you can rerun builds similar to the Travis-CI environment via the docker-run-checks.sh script on any docker-enabled machine. For example, to check the centos7 build on an Ubuntu machine:

$ src/test/docker/docker-run-checks.sh -j 12 -i centos7-base -- --with-flux-security --enable-caliper

Note that the build occurs in your current working directory (i.e. the current git repo is mounted as a volume in the container), and the script will re-run autogen and make clean on you. Note also the -- used to separate the options for the docker-run-checks.sh script and the options that are passed to the underlying ./configure.

To instead run interactively in the image, use the -I, --interactive option to the script.

 src/test/docker/docker-run-checks.sh -q -I -i centos7-base 

@grondo grondo force-pushed the grondo:travis-docker branch from 80db79a to 7b74b3a Sep 27, 2018

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 27, 2018

This is pretty cool!

Small nit: The "update dockerfiles" commit drops pmix from the bionic Dockerfile but not the centos7 one.

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 27, 2018

Suggestion: split cmd/flux commits out to their own PR?

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2018

Small nit: The "update dockerfiles" commit drops pmix from the bionic Dockerfile but not the centos7 one.

Good catch! Thanks.

Suggestion: split cmd/flux commits out to their own PR?

Yeah, that seems to make logical sense now that I think about it. I'll put those 3 commits into their own PR.

trws and others added some commits Sep 12, 2018

test: libflux: skip intermittently failing test
Skip a rector stat watcher test that intermittently fails in
Travis-CI pending a fix.
t0001-basic: improve tests efficiency
Use <1s timeouts for timeout tests.
Switch to minimal flux instance for gracetime shutdown test.
testsuite: switch run_timeout to use ualarm
For fractional second tests, use perl's ualarm function.
ualarm uses a library, but the library is standard in perl 5.8
which is standard as far back as RHEL6.
t/valgrind: add suppression of leaked zactor stack frame
Add a suppression for detected leak of zactor stack frame.
src/test: add Dockerfiles for Ubuntu and CentOS
Add initial Docker files for Ubuntu 18.04  and CentOS 7 with build
dependencies pre-installed for building flux-core.

CentOS 7 includes a config.site file which is installed so that
autotools projects automatically set libdir to /usr/lib64 and
sysconfdir to /etc when configured with prefix=/usr.

Co-authored-by: Mark A. Grondona <mgrondona@llnl.gov>
src/test/travis_run.sh: add script Travis-CI run
Move commands from .travis.yml into common travis_run.sh
script under src/test for easy transition to Docker builds.

Co-authored-by: Mark A. Grondona <mgrondona@llnl.gov>
test: add Dockerfile for use in travis-ci
Add a simple dockerfile for building ephemeral containers for testing
purposes which allows

 - A version of flux-security to selected at build time
 - The user and group used for the build to match the user and group
   of the user running the tests

There's probably a better way to do this, but this works for now.
test/cppcheck: ignore all test code
Do not bother emitting cppcheck errors for code which is only
used in testing.
build: move -Wno-* flags from CPPFLAGS to CFLAGS
The -Wno-* flags belong on AM_CFLAGS not AM_CPPFLAGS so move
them there to ensure these compiler flags are used durig compile.
test: add script to run travis checks in docker
Add a helper script for creating docker images for testing,
mainly meant to be used with travis (but will work outside of
travis)

This simplifies the travis use case, and allows testing of
docker image builds from the command line during development.

This change updates the "travis_run.sh" script to take configure
arguments on the command line, as well as other changes and
improvements.
test/subprocess: fix overflow warning in test_echo
Fix warning from -Werror=format-overflow in subprocess test
util test_echo.c. Add 1 byte to output buffer to silence the
warning.
Makefile.am: be more specific about coverage ignore
Don't ignore everything under /usr/* in case we are building
under /usr/src.
travis-ci: rewrite travis.yml to use docker builds
Rewrite the .travis.yml to utilize the new docker infrastructure
(Dockerfiles, docker-run-checks.sh, and travis_run.sh).

Other changes to the Travis builds include:

 - builds run on Ubuntu 18.04 (bionic) GCC 8 and Clang 6.0 as
   well as CentOS 7 (default gcc)
 - cppcheck and pylint are disabled for now becuase newer versions
   of these commands on Ubuntu 18.04 cause unadressed errors
 - re-titled builders for a clearer description on the travis
   website
 - If DOCKER_DEPLOY=t, then deploy the resulting docker image for
   docker hub

Co-authored-by: Tom Scogland <scogland1@llnl.gov>
t2000-wreck-epilog.t: fix bug in test script
A couple of tests in the wreckrun epilog test script
were using the wrong job kvs path (they were using LWJ
value from previous tests).

grondo and others added some commits Sep 20, 2018

codecov: update ignore pattern
Update Codecov ignore patterns to match the patterns used by
`make check-code-coverage`. Also, be sure we tolerate builds
in /usr/src as is done in Travis now.
t/lua: ensure non-installed lalarm module can be found
Problem: With TEST_INSTALL=t tests, the installed fluxometer
cannot find the non-installed lalarm.so Lua module, used by
some of the tests.

Make sure that the build directory LUA_CPATH is in the LUA_CPATH
used by `make check`, so that lalarm is found at runtime during
use of the "installed" flux-core.
testsuite: add timeout to check for lua posix
The check in sharness.d/flux-sharness.sh for existing of the Lua
posix module has a tendency to hang. Add a timeout so that sharness
tests will fail instead of hanging indefinitely.
t1006-kvs-getroot.t: fix race in getroot --watch test
Fix race in 'flux kvs getroot --watch yields monotonic sequence' by
use of waitfile to ensure getroot has written at least one line to
seq.out before starting kvs puts to avoid a race condition in the test.

Fixes #1682

@grondo grondo force-pushed the grondo:travis-docker branch from 7a7b511 to e6121d5 Sep 27, 2018

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2018

Ok, I've rebased on master, squashed down the incremental Dockerfile development, and removed pmix from the centos7 image (and updated the centos7-base image on Docker Hub)

I'm still not sure why the coverage dropped, but I can continue to look at that in the background.

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2018

Oh, also the travis-dep-builder.sh script isn't removed yet since it will still be used by flux-sched builds until we can transition to using the fluxrm/flux-core docker image for builds.

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2018

Oops forgot to have the centos7 image issue a deploy to Docker Hub. This could wait, as it remains to be seen if the deploy will even work...

@garlick

This comment has been minimized.

Copy link
Member

commented Sep 27, 2018

OK, want to merge then?

@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 27, 2018

Sure. Then we can see if the docker deploy works for Ubuntu

@garlick garlick merged commit af8487d into flux-framework:master Sep 27, 2018

2 of 3 checks passed

codecov/project 79.31% (-0.38%) compared to 3225d9f
Details
codecov/patch Coverage not affected when comparing 3225d9f...e6121d5
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@grondo

This comment has been minimized.

Copy link
Contributor Author

commented Sep 28, 2018

Eh, darnit. Left the wrong repo slug in the .travis.yml for pushing to docker hub :-(

I'll have to make another PR to fix that issue and will add DOCKER_TAG to the CentOS image so a version that gets deployed (hopefully) as well.

@grondo grondo deleted the grondo:travis-docker branch Sep 28, 2018

@SteVwonder

This comment has been minimized.

Copy link
Member

commented Sep 28, 2018

Wooo! Great work @grondo and @trws!

&& yum clean all

# The cmake from yum is incredibly ancient, download a less ancient one
RUN wget -q --no-check-certificate https://cmake.org/files/v3.10/cmake-3.10.1-Linux-x86_64.tar.gz\

This comment has been minimized.

Copy link
@SteVwonder

SteVwonder Sep 28, 2018

Member

Why no checking of the certificate? I presume this is more for compatibility when building at LLNL than when building on Travis.

This comment has been minimized.

Copy link
@grondo

grondo Sep 28, 2018

Author Contributor

I'm not sure, but your supposition sounds correct.

@trws

This comment has been minimized.

Copy link
Member

commented Sep 28, 2018

Looks great @grondo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.