Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passing test_pt_ls (at least on RHEL7) #286

Merged
merged 13 commits into from Dec 19, 2016
Merged

Conversation

mxz297
Copy link
Member

@mxz297 mxz297 commented Dec 2, 2016

No description provided.

@wrwilliams
Copy link
Member

retest this please

@jdetter
Copy link
Contributor

jdetter commented Dec 5, 2016

This does not seem to be passing for me on ubuntu. I'm getting an assert running parseThat.

Here is the stack trace: https://gist.github.com/jdetter/849e04efb0051be61532ccaad3b41f3a

@mxz297
Copy link
Member Author

mxz297 commented Dec 5, 2016

@jdetter Do you first pull from the github master to get the stack analysis fix from Matt? It seems like you are missing those fixes.

@jdetter
Copy link
Contributor

jdetter commented Dec 5, 2016

@mxz297 I pulled the latest master and merged that into my local release9.3/fixes/test_pt_ls and it looks like I'm still getting the assert on line 1241 of stackanalysis.C.

@mxz297
Copy link
Member Author

mxz297 commented Dec 5, 2016

@jdetter You are right. This seems to be a different assertion failure in stack analysis. Could you help find out we are asserting when we are analyzing which function?

@morehouse We seem to find a different assertion problem in stack analysis. This time, we fail when we are analyzing libc-2.19 from Ubuntu 14, which is a different version of libc from the libc on RHEL6 or REHEL7.

@mxz297
Copy link
Member Author

mxz297 commented Dec 5, 2016

@jdetter Which libc version is on your system? I run stack analysis on all functions in libc-2.19, which is used by Ubuntu 14. Stack analysis did not assert.

@jdetter
Copy link
Contributor

jdetter commented Dec 5, 2016

@mxz297 When we're in stack frame 10 (function getCurrentStackHeight) it looks like were looking at targ40f3e0. The version of libc on the system I ran the test on is 2.23.

@jdetter
Copy link
Contributor

jdetter commented Dec 5, 2016

@mxz297 if you want to look at the libc I'm using, you can get it here: http://cs.wisc.edu/~detter/libc-2.23.so

@mxz297
Copy link
Member Author

mxz297 commented Dec 5, 2016

@morehouse This should be a stack analysis problem. By using libc-2.23.so provided by John, I can reproduce this assertion failure. It seems like you are expecting that the operand of a push instruction can only be either a register or an immediate. However, in this case, the push instruction will push a stack location to the top of the stack.

If you re-copy my stack analysis mutator at /p/paradyn/development/xmeng/workspaces/DataflowAPIExamples/StackAnalysis.cpp and run it with

./StackAnalysis libc-2.23.so targ25270

You should be able to reproduce the failure and the offending instruction is

2589c: ff 74 24 40 pushq 0x40(%rsp)

@jdetter Thanks for you input!

@morehouse
Copy link
Contributor

Added StackAnalysis support for the problematic push instruction.

@jdetter Could you retest?

@jdetter
Copy link
Contributor

jdetter commented Dec 6, 2016

@morehouse Everything is passing on my end now!

@wrwilliams
Copy link
Member

wrwilliams commented Dec 6, 2016

@morehouse Do you mind running through the StackAnalysis code that's on master and ensuring that it's got asserts properly replaced with exceptions (either bottoming values or failing the analysis in catch blocks, and logging the failure)? I doubt this is going to be the only binary that catches us with instructions we don't handle correctly. Something like:

#define STACKANALYSIS_ASSERT(X) \
if(!(X)) { \
  throw(stackanalysis_exception("Stack analysis assertion failed: " ## X); \
}
struct stackanalysis_exception : public std::exception {
  stackanalysis_exception(std::string what) : std::exception(what) {}
};

should let you get started with a global replace, and then some minimal try-catch code should make things well-behaved.

ETA: this is really, really dirty on Jenkins, with a bunch of new failures and crashes:
http://jenkins.dyninst.org/jenkins/job/dyninst-test/92/testReport/

@wrwilliams
Copy link
Member

retest this please

@wrwilliams
Copy link
Member

Now that line info fixes are merged: retest this please

@wrwilliams
Copy link
Member

Statically linked/g++/pic binaries are failing here (and for some reason they haven't run on all PRs, so this may or may not be a legitimate regression). Jenkins is on an ubuntu box; @jdetter, can you sanity check and see if you can reproduce locally? You'll probably need to install static libc/libc++ etc if you haven't already.

Results of interest are here:
http://jenkins.dyninst.org/jenkins/job/dyninst-test/96/testReport/ (overview)
http://jenkins.dyninst.org/jenkins/job/dyninst-test/96/testReport/dyninst.dyninst_group_test/ (drilling down on the mutatee variants)

@jdetter
Copy link
Contributor

jdetter commented Dec 8, 2016

@wrwilliams It looks like I'm also having issues with static+pic binaries.

@wrwilliams
Copy link
Member

@jdetter Can you get a call stack for the crash?

@jdetter
Copy link
Contributor

jdetter commented Dec 8, 2016

I'm running parseThat on a simple binary compiled with -fPIC and I'm getting this:

--SERIOUS-- #0: Dyninst was unable to load the dyninst runtime library into the application.  This may be caused by statically linked executables, or by having dyninst linked against a different version of libelf than it was built with.
--FATAL-- #68: Dyninst was unable to create the specified process
--FATAL-- #68: create process failed bootstrap

I'm guessing I have to recompiled dyninst with -fPIC?

@wrwilliams
Copy link
Member

Nope. Don't worry about it, I got what I needed from the Jenkins box.

@wrwilliams
Copy link
Member

retest this please

@jdetter: What I'm seeing is failures/crashes in test1_1 -rewriter -staticlink -pic that I haven't gotten to the bottom of. Those are worth attempting to reproduce with a stacktrace if you can.

@wrwilliams
Copy link
Member

retest this please

@jdetter
Copy link
Contributor

jdetter commented Dec 13, 2016

On Vanilla Ubuntu 14.04 I am passing this test:

detter@enderman ~/prefix/bin/testsuite $ ./test_driver -test test1_1 -rewriter -staticlink -pic -verbose -log
Commencing test(s) ...
Tue Dec 13 13:13:04 CST 2016
Linux enderman 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
TESTDIR=/home/d/detter/prefix/bin/testsuite
[Tests with dyninst_group_test.stat_g++_64_pic_none]
-rwxr-xr-x 1 detter upl-user 1159376 Nov 30 18:51 dyninst_group_test.stat_g++_64_pic_none*
dyninst_group_test.stat_g++_64_pic_none: statically linked

Snippet cost=0
Inserted snippet
test_lib_mutateeStart.C[268]:  starting rewritten process './binaries/rewritten_dyninst_group_test.stat_g++_64_pic_none Mutatee ./binaries/rewritten_dyninst_group_test.stat_g++_64_pic_none [C++]:""
Address of func1_1 is 0x401819, calling func1_1
call1() called - setting globalVariable1_1 = 11
Value of globalVariable1_1 is 11.
Address of func1_2 is 0x4017ed, calling now
func1_2 () called, address is 0x4017ed
Value of globalVariable1_1 is now 11.

Passed test #1 (zero arg function call)
All tests passed.
test_lib_mutateeStart.C[283]:  after waitForCompletion: no crash, result = 0
test_lib_mutateeStart.C[77]:  unlinking ./binaries/rewritten_dyninst_group_test.stat_g++_64_pic_none
TEST                       COMP   OPT  ABI MODE     THREAD  LINK    PIC     RESULT
test1_1                    g++    none 64  rewriter NA      static  PIC     PASSED

I have to rebuild boost on my office machine but I will test there as well and see what I get. I should have something for you today.

@wrwilliams
Copy link
Member

wrwilliams commented Dec 13, 2016

Okay, here's the reproducer that triggers the crash under jenkins, should be fixed on the autodetect RT lib PR:
runTests -dyninst -rewriter -all -junit

I'm also seeing failures on that PR that only show up with that broad a scope on the tests, This is being a fun one for certain values of fun...

ETA: @jdetter is there a good time for the two of us to compare notes tomorrow (12/14)?

@jdetter
Copy link
Contributor

jdetter commented Dec 14, 2016

@wrwilliams I have class at 2:30p today but besides that I'm free all day. Do you want me to meet you in your office around 11am?

@jdetter
Copy link
Contributor

jdetter commented Dec 14, 2016

@wrwilliams Yeah I get a bunch of crashing tests that complain about not being able to find functions. It also looks like the test only fails when running with your reproducer. When I run the tests individually they all pass.

@wrwilliams
Copy link
Member

My spider sense says there's some memory corruption and or bad handling of out of memory, judging from the fact that I can still take down Jenkins with a sufficiently large test run. I'll grab you when I get in.

@jdetter
Copy link
Contributor

jdetter commented Dec 14, 2016

I ran the test again by itself using valgrind and it looks like there wasn't that much memory leaked. However parseThat was taking up a significant amount of memory, about 2GB, when it ended, I'm not sure if that is normal or not. There also aren't any memory corruption issues reported by valgrind:

detter@pepperjack:testsuite$ valgrind ./test_driver -S -create -64 -linkdynamic -test test_pt_ls
==18601== Memcheck, a memory error detector
==18601== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==18601== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==18601== Command: ./test_driver -S -create -64 -linkdynamic -test test_pt_ls
==18601==
TEST                       COMP   OPT  ABI MODE     THREAD  LINK    PIC     RESULT
test_pt_ls                        none 64  create   NA      dynamic nonPIC  PASSED
==18601==
==18601== HEAP SUMMARY:
==18601==     in use at exit: 8,086,345 bytes in 112,453 blocks
==18601==   total heap usage: 143,297 allocs, 30,844 frees, 9,676,287 bytes allocated
==18601==
==18601== LEAK SUMMARY:
==18601==    definitely lost: 1,124,667 bytes in 8,809 blocks
==18601==    indirectly lost: 6,917,876 bytes in 103,322 blocks
==18601==      possibly lost: 3,006 bytes in 62 blocks
==18601==    still reachable: 40,796 bytes in 260 blocks
==18601==         suppressed: 0 bytes in 0 blocks
==18601== Rerun with --leak-check=full to see details of leaked memory
==18601==
==18601== For counts of detected and suppressed errors, rerun with: -v
==18601== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@jdetter
Copy link
Contributor

jdetter commented Dec 14, 2016

With the autodetect commits merged into my local branch I am no longer getting the crashes from Bill's reproducer. We decided to merge the autodetect commits into this PR to see if it will go through the Jenkins.

@jdetter
Copy link
Contributor

jdetter commented Dec 14, 2016

@wrwilliams I'm doing a clean build locally and I'm going to run runTests through valgrind just to see if there is a memory corruption issue.

@wrwilliams
Copy link
Member

Sounds good; it's also worth looking at the leak question at this point if no corruption is evident.

@wrwilliams
Copy link
Member

Reunified Dyninst tests; running 2x Dyninst test sets in parallel is worse for memory than we can handle, so we'll serialize and see if that helps. Retest this please.

@wrwilliams
Copy link
Member

Once more with feeling and with test1_35 removed (was skipped everywhere, can't see how it wasn't causing rewriter mode failures). Retest this please.

…ng it by default when performing any absloc/absregion analysis) and to ensure stack analysis performed during relocation gets cleaned up afterward.
@wrwilliams
Copy link
Member

Merging this because it's clearly improvement.

@wrwilliams wrwilliams merged commit e889b1d into master Dec 19, 2016
@mxz297 mxz297 deleted the release9.3/fixes/test_pt_ls branch August 30, 2017 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants