Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GCC 8 output truncation warnings #1642

Merged
merged 4 commits into from Sep 4, 2018

Conversation

Projects
None yet
5 participants
@grondo
Copy link
Contributor

grondo commented Sep 1, 2018

This PR fixes the GCC 8 warning mentioned in #1633 by sizing the sockfile buffer in connector_init to the size of the addr.sun_path instead of PATH_MAX+1, since Unix domain socket pathname length would seem to be constrained to the sun_path array size, not PATH_MAX.

Other GCC 8 warnings came from the use of snprintf() in tests where source and destination buffers are the same length. Checking the return code from snprintf() for truncation suppressed these warnings as suggested by @trws.

I also tried to update the "newer" GCC in travis-ci from 4.9 to 8.1, but ran into an error linking against libjobspec, which I do not pretend to understand:

  CXXLD    job-ingest.la
libtool: warning: '/lib64/libmunge.la' seems to be moved
../../../src/common/libjobspec/.libs/libjobspec.a(libjobspec_la-jobspec.o): In function `YAML::Node::Scalar[abi:cxx11]() const [clone .isra.151]':
/usr/include/yaml-cpp/node/impl.h:158: undefined reference to `YAML::detail::node_data::empty_scalar[abi:cxx11]'
../../../src/common/libjobspec/.libs/libjobspec.a(libjobspec_la-jobspec.o): In function `Flux::Jobspec::Jobspec::Jobspec(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)':
/g/g0/grondo/git/flux-core.git/src/common/libjobspec/jobspec.cpp:400: undefined reference to `YAML::Load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
collect2: error: ld returned 1 exit status

So for now, the GCC8 build adds --disable-jobspec

Along the way some other issues popped up and were fixed:

  • In travis-ci the ARGS meant to be passed through to configure in make distcheck were not, so all our builds were using default ./configure. DISTCHECK_CONFIGURE_FLAGS=$ARGS is now used to pass these flags to the inner configure. (oops!)
  • For built dependencies in travis-dep-builder.sh, the system compiler was forced by overriding CC=gcc, but this wasn't done for CXX. Fixed.
  • I started hitting #1638 in travis, so that fix was applied to this PR.
@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 1, 2018

This shouldn't be merged until we figure out the libjobspec issue. If it is a problem with incompatibility between the system C++ compiler and GCC 8 used for the travis build, then we could back out the changes to test with GCC 8, but keep the other fixes.

@grondo grondo force-pushed the grondo:gcc-8 branch 2 times, most recently from 45d87bc to 8113d33 Sep 1, 2018

@coveralls

This comment has been minimized.

Copy link

coveralls commented Sep 1, 2018

Coverage Status

Coverage decreased (-0.07%) to 79.522% when pulling 379f071 on grondo:gcc-8 into b4c64c0 on flux-framework:master.

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Sep 2, 2018

I don't think we have any tests that behave differently if munge is started in the environment, do we? Looks like the security unit tests use the FLUX_SEC_FAKEMUNGE flag, and the job ingest tests call submitbench --sign-type=none. However, ci coverage jumped up when you did that, so maybe I am confused.

The flux-security tests start a munge daemon for the duration of the test (borrowed from @dun's munge tests) - see the SIDEMUNGE prereq in t1002-sign-munge.t. If necessary we could do something like that here to conditionally enable some munge testing?

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 3, 2018

I don't think we have any tests that behave differently if munge is started in the environment, do we? Looks like the security unit tests use the FLUX_SEC_FAKEMUNGE flag, and the job ingest tests call submitbench --sign-type=none. However, ci coverage jumped up when you did that, so maybe I am confused.

Sorry I did not look into this in depth yet, just noticed that when I fixed the issue where configure flags were not passed into make distcheck errors about "failed to open munge socket" appeared, and I was just quickly trying to fix that. I first thought it was perhaps because the munge package wasn't installed, but that alone did not fix it (and perhaps munge is pulled in automatically with libmunge)

I don't think the make check-code-coverage tests run under make distcheck though, so the coverage bump is something to look into (I hadn't noticed)

I'll try to look into this more in depth, but I first wanted to figure out the link errors with libjobspec to determine if we even want/need to test with GCC8 in travis.

@trws

This comment has been minimized.

Copy link
Member

trws commented Sep 3, 2018

@trws

This comment has been minimized.

Copy link
Member

trws commented Sep 3, 2018

@trws

This comment has been minimized.

Copy link
Member

trws commented Sep 3, 2018

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 3, 2018

Actually, try changing CXX=gcc-8 to CXX=g++-8. Gcc doesn’t link the c++ standard library when linking, g++ does, that may be all it is.

Oh geez, whoops! That was a typo. Sorry and good catch!

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 3, 2018

Trying to build again without munge to investigate any errors (which may have misled me). Also testing @trws fix for GCC 8 compile. Sorry for any noise in this PR.

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 3, 2018

Thanks @trws, CXX=g++-8 instead of gcc-8 fixed the linking issue with libjobspec.

There were a couple problems causing issues with munged not running in Travis, now that I've taken a look.

  1. A typo in t2200-job-ingest.t caused failure for the test to detect that flux job submitbench was built with flux-security, and thus the --sign-type=none was not getting passed to the tests.
diff --git a/t/t2200-job-ingest.t b/t/t2200-job-ingest.t
index bd61254..799fb5a 100755
--- a/t/t2200-job-ingest.t
+++ b/t/t2200-job-ingest.t
@@ -10,7 +10,7 @@ fi
if test -x ${FLUX_BUILD_DIR}/src/cmd/flux-jobspec-validate; then
    test_set_prereq ENABLE_JOBSPEC
fi
-if flux job --help 2>&1 | grep -q sign-type; then
+if flux job submitbench --help 2>&1 | grep -q sign-type; then
    test_set_prereq HAVE_FLUX_SECURITY
    SUBMITBENCH_OPT_R="--reuse-signature"
    SUBMITBENCH_OPT_NONE="--sign-type=none"
  1. With the patch above applied, the last 2 tests still fail. Seems like there's an attempt to verify that stderr contains the word permitted but this word doesn't appear in the error output (at least when built with flux-security)
not ok 8 - job-ingest: submit user != signed user fails
#
#               ! FLUX_HANDLE_USERID=9999 ${SUBMITBENCH} \
#                    ${JOBSPEC}/valid/basic.yaml 2>baduser.out &&
#               grep -q permitted baduser.out
#
not ok 9 - job-ingest: non-owner mech=none fails
#
#               ! FLUX_HANDLE_ROLEMASK=0x2 ${SUBMITBENCH} \
#                    ${JOBSPEC}/valid/basic.yaml 2>badrole.out &&
#               grep -q permitted badrole.out
#
# failed 2 among 9 test(s)
$ grep . trash-directory.t2200-job-ingest/bad*.out
trash-directory.t2200-job-ingest/badrole.out:flux-job: submit: only instance owner can use sign-type=none
trash-directory.t2200-job-ingest/baduser.out:flux-job: submit: signer=1000 != requestor=9999
  1. There is a problem with the coverage builder. Either make check-code-coverage doesn't exit with non-zero status when a test fails, or Travis is somehow failing to detect it. t2200-job-ingest.t has been failing in the COVERAGE=t builder, but we never noticed. 👎 I'm looking into this one.

@grondo grondo force-pushed the grondo:gcc-8 branch from 4ddc1d6 to 65973ae Sep 3, 2018

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Sep 3, 2018

A typo in t2200-job-ingest.t caused failure for the test to detect that flux job submitbench was built with flux-security, and thus the --sign-type=none was not getting passed to the tests.

Bleah. Sorry about that!

With the patch above applied, the last 2 tests still fail. Seems like there's an attempt to verify that stderr contains the word permitted but this word doesn't appear in the error output (at least when built with flux-security)

Having a look at this...

@grondo grondo force-pushed the grondo:gcc-8 branch from 65973ae to 92fad9c Sep 3, 2018

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Sep 3, 2018

Thanks, sorry for the sloppiness there. Was going to suggest dropping the last line of each test, but this is better.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Sep 4, 2018

Codecov Report

Merging #1642 into master will increase coverage by 0.38%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1642      +/-   ##
==========================================
+ Coverage   78.98%   79.37%   +0.38%     
==========================================
  Files         179      179              
  Lines       32535    32535              
==========================================
+ Hits        25699    25825     +126     
+ Misses       6836     6710     -126
Impacted Files Coverage Δ
src/connectors/local/local.c 88.14% <ø> (ø) ⬆️
src/modules/connector-local/local.c 72.76% <0%> (-1.63%) ⬇️
src/common/libflux/request.c 87.17% <0%> (-1.29%) ⬇️
src/common/libutil/base64.c 95.07% <0%> (-0.71%) ⬇️
src/common/libflux/message.c 80.78% <0%> (+0.11%) ⬆️
src/modules/kvs/kvs.c 66.03% <0%> (+0.16%) ⬆️
src/common/libkvs/kvs_txn.c 75.28% <0%> (+0.56%) ⬆️
src/common/libutil/dirwalk.c 95% <0%> (+1.42%) ⬆️
src/common/libjobspec/jobspec.cpp 84.21% <0%> (+1.61%) ⬆️
src/cmd/flux-job.c 87.24% <0%> (+11.4%) ⬆️
... and 2 more
@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 4, 2018

There is a problem with the coverage builder.

I guess make check-code-coverage explicitly ignores errors in make check:

# Use recursive makes in order to ignore errors during check
check-code-coverage:
        -$(AM_V_at)$(MAKE) $(AM_MAKEFLAGS) -k check
        $(AM_V_at)$(MAKE) $(AM_MAKEFLAGS) code-coverage-capture

I guess it makes sense in some situations to ignore make check errors when you are interested in checking code coverage, but for us it might be smart to add a step after make check-code-coverage that would ensure all tests succeeded (not sure exactly how to do that at this point). Otherwise, we'd have to pay close(r) attention to coverage changes in PRs and search logs for errors when a suspicious change occurs.

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 4, 2018

should I separate the Travis-CI fixes and GCC8 warning fixes into different PRs?

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Sep 4, 2018

should I separate the Travis-CI fixes and GCC8 warning fixes into different PRs?

Sounds good to me if it's not much trouble.

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 4, 2018

@grondo grondo force-pushed the grondo:gcc-8 branch from 92fad9c to 6025118 Sep 4, 2018

@grondo grondo changed the title Fix GCC 8 output truncation warnings (and other fixes) Fix GCC 8 output truncation warnings Sep 4, 2018

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 4, 2018

Ok, I've pulled the "other fixes" out into two separate PRs

  • #1645 : use _exit(2) instead of exit(3) in subprocess_child
  • #1644: all the other travis-ci fixes unrelated to GCC 8 update
@trws

This comment has been minimized.

Copy link
Member

trws commented Sep 4, 2018

Looks pretty good. No worries on the typo BTW, that actually would work perfectly well with some older GCCs (maybe even 4.x), they used to do a ton of inference to make that kind of thing work, makes debugging builds "interesting." 😄

grondo added some commits Aug 31, 2018

connector/local: fix truncated output warning from GCC 8
Problem: compilation of connector/local.c was failing under GCC 8
due to a warning (fatal, due to -Werror) about output truncation when
the variable sockpath (an array of size PATH_MAX + 1) is copied to
addr.sun_path (107 bytes).

Since Unix domain socket pathnames are actually constrained to the length
of  struct sockaddr_un.sun_path array, use this length instead of PATH_MAX
in sizing the array in which the socket path is constructed, so that
truncation error is caught in connector_init() instead of actually causing
truncation when copying in connect_sock_with_retry().

Fixes #1633
tests: fixes for GCC 8 truncation warnings
Problem: many tests fail to compile on GCC 8 due to a truncation
warning (fatal due to use of -Werror), when using snprintf() to
construct a string where a source buffer of the string is the same
size or larger than the destination string.

Technically, this isn't such a big deal, and because of the tests
the truncation would actually never occur, but it is easiest for
now to just resize some of the source buffers to be smaller than the
destination and suppress the warnings.
travis-ci: compile with gcc-8
Update travis-ci "new" GCC build from gcc-4.9 to gcc-8
travis-ci: build Caliper with GCC 8
Caliper build in travis-dep-builder requires "newer" GCC than
the default version provided in Travis. Now that we're using
GCC 8 as the newer compiler, update Caliper build as well.

@grondo grondo force-pushed the grondo:gcc-8 branch from 6025118 to 379f071 Sep 4, 2018

@grondo

This comment has been minimized.

Copy link
Contributor Author

grondo commented Sep 4, 2018

rebased on master

@garlick

This comment has been minimized.

Copy link
Member

garlick commented Sep 4, 2018

Looks like this is ready to go?

@garlick garlick merged commit 6091d38 into flux-framework:master Sep 4, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.07%) to 79.522%
Details

@grondo grondo deleted the grondo:gcc-8 branch Feb 8, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.