ctfmerge gets a segmentation fault when merging large set of objects on Ubuntu 12.04 X86_64 machine #92

Open
shrikanth07 opened this issue Dec 10, 2014 · 12 comments

Comments

@shrikanth07

I am building on an Ubuntu 12.04 x86_64 machine, and the CTF tools are built 64-bit. A large set of object files (200+) is compiled and linked to create a kernel module. ctfconvert builds the .SUNW_ctf section for each object file, but ctfmerge crashes with a segmentation fault when merging them. The command is ctfmerge -L VERSION -g -o module.ko.debug <list of object .o's>. I enabled debugging through the environment (CTFMERGE_DEBUG_LEVEL 7 and CTFMERGE_DEBUG_PARSE 1) and captured the following output before the crash:
Average: 0.35
DEBUG: 354588416: entering first barrier
DEBUG: 346195712: entering second barrier
DEBUG: 346195712: phase 1 complete
DEBUG: 371373824: entering second barrier
DEBUG: 354588416: doing work in first barrier
DEBUG: clearing slot 0 (0) (saving 5)
DEBUG: clearing slot 1 (1) (saving 5)
DEBUG: clearing slot 2 (2) (saving 5)
DEBUG: clearing slot 3 (3) (saving 5)
DEBUG: clearing slot 4 (4) (saving 5)
DEBUG: 362981120: entering second barrier
DEBUG: phase one done: donequeue has 75 items
DEBUG: 354588416: ninqueue is 149, 75 on queue
DEBUG: 354588416: entering second barrier

The crash backtrace
(gdb) bt
#0 sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:45
#1 0x00000000004029a5 in barrier_wait (bar=0x6333e0) at barrier.c:107
#2 0x0000000000403388 in worker_thread (wq=0x633280) at ctfmerge.c:547
#3 0x00007ffff7498e9a in start_thread (arg=0x7ffff5eca700) at pthread_create.c:308
#4 0x00007ffff71c53fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5 0x0000000000000000 in ?? ()

(gdb) frame 1
#1 0x00000000004029a5 in barrier_wait (bar=0x6333e0) at barrier.c:107

107
(gdb) l
102 pthread_mutex_lock(&bar->bar_lock);
103
104 if (++bar->bar_numin < bar->bar_nthr) {
105 pthread_mutex_unlock(&bar->bar_lock);
106 sem_wait(bar->bar_sem);
107
108 return (0);
109
110 } else {
111 int i;
(gdb) p bar[0]
$1 = {bar_lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, bar_numin = 1, bar_sem = 0x0,
bar_nthr = 5}

Further, I tried an experimental change in barrier.c:

diff --git a/cmd/ctfconvert/barrier.c b/cmd/ctfconvert/barrier.c
index 94bb78d..8c37bff 100644
--- a/cmd/ctfconvert/barrier.c
+++ b/cmd/ctfconvert/barrier.c
@@ -90,8 +90,9 @@ void
 barrier_init(barrier_t *bar, int nthreads)
 {
 	pthread_mutex_init(&bar->bar_lock, NULL);
-	bar->bar_sem = sem_open("ctfmerge_barrier", O_CREAT | O_EXCL);
+	bar->bar_sem = sem_open("ctfmerge_barrier", O_CREAT);
+	if (bar->bar_sem == SEM_FAILED)
+		perror("sem_open failed");
 	bar->bar_numin = 0;
 	bar->bar_nthr = nthreads;
 }

The perror reports EEXIST.
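
For anyone following along, the EEXIST failure can be reproduced outside ctfmerge. The sketch below is not ctfmerge code: the helper name open_exclusive is made up for illustration, and it assumes a Linux/glibc system with POSIX named semaphores. It creates a named semaphore with O_CREAT | O_EXCL and returns the errno value on failure, so a second open of the same name (from a parallel build, or after an earlier run left the semaphore behind) comes back as SEM_FAILED with EEXIST. When O_CREAT is given, sem_open() also expects a mode and an initial value.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stddef.h>

/*
 * Hypothetical helper: try to create a brand-new named semaphore.
 * Returns 0 and stores the handle in *out on success; otherwise returns
 * the errno value reported by sem_open() and leaves *out as SEM_FAILED.
 */
static int
open_exclusive(const char *name, sem_t **out)
{
	assert(name != NULL && out != NULL);

	/* With O_CREAT, sem_open() also needs a mode and an initial value. */
	*out = sem_open(name, O_CREAT | O_EXCL, 0600, 0);
	if (*out == SEM_FAILED)
		return (errno);
	return (0);
}
```

Calling open_exclusive() twice with the same name returns EEXIST the second time, until the name is removed with sem_unlink(). If the result is not checked, the failed handle ends up being passed to sem_wait(), which would fit the bar_sem = 0x0 seen in the gdb output above on platforms where SEM_FAILED is a null pointer.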

@qwkslvr

qwkslvr commented Dec 16, 2014

I have a patch for this issue. It creates a unique name for the named semaphore instead of using a fixed name. How do I submit the patch for review/commit?

@dtrace4linux
Owner

Hi

Thanks for this. You can send me the patch directly.

@qwkslvr

qwkslvr commented Dec 16, 2014

Hi,
Sorry, I haven't submitted patches to GitHub before. What's the recommended way? Do I send it to an email address, or just attach it to this thread?

@qwkslvr

qwkslvr commented Jan 5, 2015

Trying again. Can anyone tell me how to submit patches?

@dtrace4linux
Owner

You can just do a diff -c on your tree vs the original and send it to me or
this mail thread, or do a "git push" from your tree.

tx


@qwkslvr

qwkslvr commented Jan 5, 2015

Just to summarize the changes I have made:

  • Create dynamic semaphore names based on gettimeofday() so that parallel builds don't fail. Saved the name of the semaphore created in struct barrier
  • Create a new function called barrier_destroy() which cleans up the semaphores created.
  • Call barrier_destroy() when the program exits.
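
The three steps above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the patch itself: demo_barrier_t, demo_barrier_init and demo_barrier_destroy are made-up names, and the unique name is built from the process ID plus a counter rather than from gettimeofday() as described above. On an EEXIST collision it retries with the next counter value. It assumes Linux/glibc named semaphores.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the semaphore-related part of struct barrier. */
typedef struct demo_barrier {
	sem_t *sem;		/* where everyone waits */
	char name[64];		/* remembered so it can be unlinked later */
} demo_barrier_t;

/*
 * Create a semaphore under a name unique to this process and call.
 * Returns 0 on success, -1 on failure (errno is set by sem_open()).
 * The static counter is not thread-safe; this assumes barriers are
 * initialised from a single thread.
 */
static int
demo_barrier_init(demo_barrier_t *b)
{
	static unsigned int counter;
	int attempt;

	assert(b != NULL);
	for (attempt = 0; attempt < 16; attempt++) {
		snprintf(b->name, sizeof (b->name), "/ctfmerge-demo-%ld-%u",
		    (long)getpid(), counter++);
		b->sem = sem_open(b->name, O_CREAT | O_EXCL, 0600, 0);
		if (b->sem != SEM_FAILED)
			return (0);
		if (errno != EEXIST)
			break;	/* a real error; retrying will not help */
	}
	return (-1);
}

/* Close the handle and remove the name so nothing is left behind. */
static void
demo_barrier_destroy(demo_barrier_t *b)
{
	assert(b != NULL);
	if (b->sem == SEM_FAILED)
		return;		/* init failed; nothing of ours to remove */
	sem_close(b->sem);
	sem_unlink(b->name);
	b->sem = SEM_FAILED;
}
```

Because every process gets its own name, two ctfmerge runs in a parallel build no longer compete for the same semaphore, and unlinking on exit keeps stale names from accumulating.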

I tried doing a git push, but I am not allowed to. I get:
"error: The requested URL returned error: 403 while accessing https://github.com/dtrace4linux/linux.git/info/refs
fatal: HTTP request failed"

I also can't attach the patches here.
What email address should I send this to?

@qwkslvr

qwkslvr commented Jan 6, 2015

Hope this works. I am trying to attach the patch to this email.

Let me know if you got it.


@dtrace4linux
Owner

Hello qwkslvr, apologies for not replying. I haven't merged your
contribution into my tree yet. I was working on some nuisance reliability
issues in the 3.16 kernel, but haven't finished. The last step was to
review and understand your changes (they look very reasonable given the
research you have done).

Just backlogged on non-dtrace stuff.

On 15 January 2015 at 21:45, qwkslvr notifications@github.com wrote:

You didn't respond to my queries. So, pasting the patch here.
diff --git a/cmd/ctfconvert/barrier.c b/cmd/ctfconvert/barrier.c
index 94bb78d..6b8a423 100644
--- a/cmd/ctfconvert/barrier.c
+++ b/cmd/ctfconvert/barrier.c
@@ -53,6 +53,11 @@ barrier_init(barrier_t *bar, int nthreads)
 	bar->bar_nthr = nthreads;
 }

+barrier_destroy(barrier_t *bar)
+{
+	/*XXX */
+
+}
 int
 barrier_wait(barrier_t *bar)
 {
@@ -83,19 +88,52 @@ barrier_wait(barrier_t *bar)
 #include
 #include
 #include
-
+#include
+#include
+#include
 #include "barrier.h"

+#define SEM_STRING "ctfmerge"
+
 void
 barrier_init(barrier_t *bar, int nthreads)
 {
+	int num;
+	struct timeval t;
+
 	pthread_mutex_init(&bar->bar_lock, NULL);
-	bar->bar_sem = sem_open("ctfmerge_barrier", O_CREAT | O_EXCL);
+	/* Get the timeof day as the seed to the random */
+	gettimeofday(&t, NULL);
+	srandom((unsigned int)t.tv_usec);
+	num = random();
+
+	snprintf(bar->sem_name, 20, "%s-%d", SEM_STRING, num);
+	bar->bar_sem = sem_open(bar->sem_name, O_CREAT|O_EXCL);
+
+	if(bar->bar_sem == SEM_FAILED) {
+		perror("sem_open failed");
+		exit(0);
+	}
+
+	/*
+	 * Save the random number generated, so that we can destroy on
+	 * the exit of program.
+	 */
+	printf("sem_open for %s\n", bar->sem_name);
 	bar->bar_numin = 0;
 	bar->bar_nthr = nthreads;
 }

+void
+barrier_destroy(barrier_t *bar)
+{
+	printf("Unlinking semaphore %s\n", bar->sem_name);
+	sem_unlink(bar->sem_name);
+}
+
 int
 barrier_wait(barrier_t *bar)
 {
diff --git a/cmd/ctfconvert/barrier.h b/cmd/ctfconvert/barrier.h
index af9aea7..1487fb4 100644
--- a/cmd/ctfconvert/barrier.h
+++ b/cmd/ctfconvert/barrier.h
@@ -49,6 +49,7 @@ typedef struct barrier {
 } barrier_t;

 extern void barrier_init(barrier_t *, int);
+extern void barrier_destroy(barrier_t *);
 extern int barrier_wait(barrier_t *);

 #ifdef __cplusplus
@@ -63,6 +64,7 @@ typedef struct barrier {
 	sem_t *bar_sem;		/* where everyone waits */
 	int bar_nthr;		/* # of waiters to trigger release */
+	char sem_name[20];
 } barrier_t;

 extern void barrier_init(barrier_t *, int);
diff --git a/cmd/ctfconvert/ctfmerge.c b/cmd/ctfconvert/ctfmerge.c
index fffb92a..8766ed5 100644
--- a/cmd/ctfconvert/ctfmerge.c
+++ b/cmd/ctfconvert/ctfmerge.c
@@ -609,10 +609,14 @@ merge_ctf_cb(tdata_t *td, char *name, void *arg)
  * completion. The run time of ctfmerge can, however, be measured in minutes
  * in some cases, so this is not a valid option.
  */
+
+static workqueue_t wq;
+
 static void
 handle_sig(int sig)
 {

 static void
@@ -630,6 +634,8 @@ terminate_cleanup(void)
 	fprintf(stderr, "Removing %s\n", outfile);
 	unlink(outfile);
 }

+	barrier_destroy(&(wq.wq_bar1));
+	barrier_destroy(&(wq.wq_bar2));
 }

 static void
@@ -763,7 +769,6 @@ strcompare(const void *p1, const void *p2)
  * into your stack to another thread is fragile at best and leads to some
  * hard-to-debug failure modes.
  */
-static workqueue_t wq;

 int
 main(int argc, char **argv)
@@ -1043,5 +1048,7 @@ main(int argc, char **argv)
 	terminate("Couldn't rename output temp file %s", tmpname);

 	free(tmpname);

+	barrier_destroy(&(wq.wq_bar1));
+	barrier_destroy(&(wq.wq_bar2));
 	return (0);
 }



@qwkslvr

qwkslvr commented Jan 28, 2015

Ok, thanks. Let me know if you run into issues with the patch or have any questions for me. Would be interested in getting the patch committed soon.

@dtrace4linux
Owner

gmail seems to have corrupted the diff output - can you send me the
relevant files you changed, and I will merge them in.

thanks


@qwkslvr

qwkslvr commented Feb 23, 2015

I would love to, just tell me how. How do I attach the patch?

@dtrace4linux
Owner

you can mail me at crispeditor/at/gmail.com

thanks

