Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

travis: war on leaks #4558

Merged
merged 11 commits into from Jun 7, 2018
Merged

travis: war on leaks #4558

merged 11 commits into from Jun 7, 2018

Conversation

tiennou
Copy link
Contributor

@tiennou tiennou commented Feb 28, 2018

This should make Travis fail the build when leaks are detected.

There's a prototype implementation on osx, because I just noticed leaks gained an -atExit argument, which I can't test locally for update-laziness reasons. Removed for the time being because it depends on undocumentation.

Supersedes #4416.

script/citest.sh Outdated
ctest -V -R libgit2_clar-proxy_credentials || exit $?
else
echo "Cannot find java in path, skipping proxy test"
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you making this change? Are there instances where travis doesn't have java installed? Or is this for running locally? If the latter, then let's make it some explicit opt-out.

Otherwise I fear that travis will change its images without us noticing, we'll stop running these tests silently, and then we'll break something without knowing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is slightly conflated with being able to run those scripts in a VM, hence this change (I didn't have java at first).

I understand the concern though, but on the other hand, changing that test made the proxy test run on the osx machines too. If I find a way to tell if the script is Travis run, I'll be able to make it hard error.

@@ -9,5 +9,5 @@ then
fi

if [ -n "$VALGRIND" -a -e "$(which valgrind)" ]; then
valgrind --leak-check=full --show-reachable=yes --error-exitcode=125 --suppressions=./libgit2_clar.supp $@ _build/libgit2_clar -ionline
valgrind --leak-check=full --show-reachable=yes --error-exitcode=125 --suppressions=./libgit2_clar.supp $@ _build/libgit2_clar -ionline -xbuf::oom
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this test still useful at all? Maybe we should remove it entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally planning to try and make leaks works too, but I can remove it in the mean time.

script/citest.sh Outdated
@@ -10,7 +10,7 @@ fi

if [ ! -d _build ]; then
echo "no _build dir found; you should run cibuild.sh first"
exit 1
exit 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need some more explanation here - why does this fail the coverity build? Again, this seems like something that might could use an explicit flag that the coverity build can pass in so that we don't stop running our build+tests because of some subtle change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the three new "build" steps (cibuild, citest & cileaks) will be run in order now, hence the need to "fail" successfully those 2 new steps so we don't. Alternatively, I could make the coverity.sh explicit instead of calling that from cibuild.sh…

@tiennou
Copy link
Contributor Author

tiennou commented Mar 2, 2018

I've removed all changes caused by my trying to reuse the scripts inside VMs. I've also run travis lint locally (hence the suppression of dist).

I've noticed that Travis has build stages, which can accept conditions, so maybe there's a way to make the Coverity/Valgrind scans be less "hacky": right now COVERITY completely shortcuts the build, and VALGRIND will run the test suite twice (once as part of cibuild.sh, and again as part of cileaks.sh).

As this is likely to end up in a mess of pushes, and I can't control AppVeyor builds, I favored being cautious.

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR. It helps readability quite a lot to split out those separate scripts

exit $?;
fi

if [ -n "$VALGRIND" -a -e "$(which valgrind)" ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say we should error out if we request a Valgrind build without Valgrind being installed. Otherwise, we'll get a green CI even if the job doesn't do what it's supposed to do

# If this platform doesn't support test execution, bail out now
if [ -n "$SKIP_TESTS" ];
then
exit $?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you pass back the value of the previous test? Shouldn't this just exit success?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a vanilla copy-paste of the block just above, which appeared in 8f426d7. I honestly have no idea why it's written that way, but it seems to predate AppVeyor, so there's that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Let's leave it as-is, then. But I'd be grateful if you at least remove these useless semicolons ;)

@@ -9,5 +9,5 @@ then
fi

if [ -n "$VALGRIND" -a -e "$(which valgrind)" ]; then
valgrind --leak-check=full --show-reachable=yes --error-exitcode=125 --suppressions=./libgit2_clar.supp _build/libgit2_clar -ionline
valgrind --leak-check=full --show-reachable=yes --error-exitcode=125 --suppressions=./libgit2_clar.supp $@ _build/libgit2_clar -ionline
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks unintentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's intentional, the purpose is that you can VALGRIND=1 ./script/cileaks.sh -stestcase in a VM and get just that valgrinded (note that I actually changed the location of the %@, this one is outdated).

Memcheck:Leak
...
fun:curl_global_init
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, cool that we can do this now with our global curl initialization.

.travis.yml Outdated
@@ -48,13 +48,11 @@ matrix:
- compiler: gcc
env: COVERITY=1
os: linux
dist: trusty
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather prefer if we'd delete the global "dist" key. It's nice to have all that information directly in the build matrix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dist is a global key though, so I can't really use the other. Else I'd have kept the other one 😉.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is true. For some time, our matrix contained jobs for both Trusty and Precise, and it worked perfectly fine

@tiennou
Copy link
Contributor Author

tiennou commented Apr 18, 2018

Rebased. I yanked the dist: key change out.

@tiennou tiennou force-pushed the travis/war-on-leaks branch 2 times, most recently from 2dd99c0 to 4d37949 Compare April 18, 2018 23:16
@tiennou
Copy link
Contributor Author

tiennou commented Apr 18, 2018

🎶 Please stand by while the leaks are getting plugged, and the coverity scans appeased 🎶

@tiennou tiennou force-pushed the travis/war-on-leaks branch 2 times, most recently from 2cba134 to a88db36 Compare April 19, 2018 19:36
@tiennou
Copy link
Contributor Author

tiennou commented Apr 19, 2018

This is ready !

@tiennou
Copy link
Contributor Author

tiennou commented Apr 19, 2018

I gave a spin to Travis build stages, and it looks like this. Note that the Coverity build doesn't show up since it's not coming from libgit2/libgit2.

I think this could be used to do the pipelining @pks-t envisions in libgit2/discussions#1.

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good to me. I'd love if you'd split out the "real" fixes into a separate PR such that we can fast-track them in case this PR here doesn't get merged fast, as I'd like to include them for #4632. Would also make handling on the backport-label a bit easier for me.

return 0;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you care to put this into an extra PR? It's worthy to be included in v0.27.1, as I've just introduced that leak with v0.27.0

src/worktree.c Outdated
@@ -131,7 +131,7 @@ static int open_worktree_dir(git_worktree **out, const char *parent, const char
goto out;
}

if ((wt = git__calloc(1, sizeof(struct git_repository))) == NULL) {
if ((wt = git__calloc(1, sizeof(*wt))) == NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one, as well

@@ -300,6 +300,7 @@ void test_worktree_worktree__add_with_explicit_branch(void)
git_reference_free(branch);
git_reference_free(wthead);
git_repository_free(wtrepo);
git_worktree_free(wt);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this

@tiennou
Copy link
Contributor Author

tiennou commented Apr 20, 2018

Moved the reported leaks to #4635 & #4636.

@pks-t
Copy link
Member

pks-t commented Apr 30, 2018

The PR work as advertised ;) Can't merge it due to that, unfortunately, as we got new leaks in our submodule code

@tiennou
Copy link
Contributor Author

tiennou commented May 9, 2018

I've locally tested that #4641 fixes the leaks (at least from Valgrind's POV), so this depends on that.
The crux of the issue is that because of the duplicated submodule, we're not freeing the first name the second time around.

@pks-t
Copy link
Member

pks-t commented Jun 6, 2018

#4641 is merged now. Could you please rebase on latest master?

==18109== 664 bytes in 1 blocks are still reachable in loss record 279 of 339
==18109==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==18109==    by 0x675B120: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==18109==    by 0x675C13C: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==18109==    by 0x675C296: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==18109==    by 0x679BD14: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==18109==    by 0x679CC64: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==18109==    by 0x6A64946: ??? (in /usr/lib/x86_64-linux-gnu/libgnutls.so.26.22.6)
==18109==    by 0x6A116E8: ??? (in /usr/lib/x86_64-linux-gnu/libgnutls.so.26.22.6)
==18109==    by 0x6A01114: gnutls_global_init (in /usr/lib/x86_64-linux-gnu/libgnutls.so.26.22.6)
==18109==    by 0x52A6C78: ??? (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.3.0)
==18109==    by 0x5285ADC: curl_global_init (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.3.0)
==18109==    by 0x663524: git_curl_stream_global_init (curl.c:44)
==2957== 912 bytes in 19 blocks are still reachable in loss record 323 of 369
==2957==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2957==    by 0x675B120: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==2957==    by 0x675BDF8: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==2957==    by 0x675FE0D: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==2957==    by 0x6761DC4: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==2957==    by 0x676477E: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==2957==    by 0x675B071: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==2957==    by 0x675B544: ??? (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==2957==    by 0x675914B: gcry_control (in /lib/x86_64-linux-gnu/libgcrypt.so.11.8.2)
==2957==    by 0x5D30EC9: libssh2_init (in /usr/lib/x86_64-linux-gnu/libssh2.so.1.0.1)
==2957==    by 0x66BCCD: git_transport_ssh_global_init (ssh.c:910)
==2957==    by 0x616443: init_common (global.c:65)
==17851== Invalid free() / delete / delete[] / realloc()
==17851==    at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==17851==    by 0x60BBE2B: __libc_freeres (in /lib/x86_64-linux-gnu/libc-2.19.so)
==17851==    by 0x4A256BC: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so)
==17851==    by 0x5F8F16A: __run_exit_handlers (exit.c:97)
==17851==    by 0x5F8F1F4: exit (exit.c:104)
==17851==    by 0x5F74F4B: (below main) (libc-start.c:321)
==17851==  Address 0x63153c0 is 0 bytes inside data symbol "noai6ai_cached"
The goal is to let cmake manage the parallelism
@tiennou
Copy link
Contributor Author

tiennou commented Jun 6, 2018

Rebased!

@pks-t pks-t merged commit 534b70a into libgit2:master Jun 7, 2018
@pks-t
Copy link
Member

pks-t commented Jun 7, 2018

Cool, we can finally merge this! Thanks for your awesome work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants