Skip to content

Commit

Permalink
fix: segmation fault occurs when many groups are used (Close #24)
Browse files Browse the repository at this point in the history
Allocate Onigmo's stack from a heap instead of a stack when many groups
are used.

see: https://bugs.ruby-lang.org/issues/8716
  • Loading branch information
k-takata committed Aug 16, 2013
1 parent 754f936 commit b9fba1d
Showing 1 changed file with 23 additions and 4 deletions.
27 changes: 23 additions & 4 deletions regexec.c
Original file line number Diff line number Diff line change
Expand Up @@ -445,9 +445,22 @@ onig_region_copy(OnigRegion* to, OnigRegion* from)



#define STACK_INIT(alloc_addr, ptr_num, stack_num) do {\
if (msa->stack_p) {\
#define MAX_PTR_NUM 100

#define STACK_INIT(alloc_addr, heap_addr, ptr_num, stack_num) do {\
if (ptr_num > MAX_PTR_NUM) {\
xfree(msa->stack_p);\
alloc_addr = (char* )xmalloc(sizeof(OnigStackIndex) * (ptr_num));\
heap_addr = alloc_addr;\
stk_alloc = (OnigStackType* )xmalloc(sizeof(OnigStackIndex) * (stack_num));\
stk_base = stk_alloc;\
stk = stk_base;\
stk_end = stk_base + (stack_num);\
msa->stack_p = stk_alloc;\
msa->stack_n = stk_end - stk_base;\
} else if (msa->stack_p) {\
alloc_addr = (char* )xalloca(sizeof(OnigStackIndex) * (ptr_num));\
heap_addr = NULL;\
stk_alloc = (OnigStackType* )(msa->stack_p);\
stk_base = stk_alloc;\
stk = stk_base;\
Expand All @@ -456,6 +469,7 @@ onig_region_copy(OnigRegion* to, OnigRegion* from)
else {\
alloc_addr = (char* )xalloca(sizeof(OnigStackIndex) * (ptr_num)\
+ sizeof(OnigStackType) * (stack_num));\
heap_addr = NULL;\
stk_alloc = (OnigStackType* )(alloc_addr + sizeof(OnigStackIndex) * (ptr_num));\
stk_base = stk_alloc;\
stk = stk_base;\
Expand Down Expand Up @@ -1324,6 +1338,7 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
UChar *p = reg->p;
UChar *pkeep;
char *alloca_base;
char *xmalloc_base;
OnigStackType *stk_alloc, *stk_base, *stk, *stk_end;
OnigStackType *stkp; /* used as any purpose. */
OnigStackIndex si;
Expand All @@ -1339,7 +1354,7 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
/* Stack #0 is used to store the pattern itself and used for (?R), \g<0>, etc. */
n = reg->num_repeat + (reg->num_mem + 1) * 2;

STACK_INIT(alloca_base, n, INIT_MATCH_STACK_SIZE);
STACK_INIT(alloca_base, xmalloc_base, n, INIT_MATCH_STACK_SIZE);
pop_level = reg->stack_pop_level;
num_mem = reg->num_mem;
repeat_stk = (OnigStackIndex* )alloca_base;
Expand All @@ -1353,7 +1368,7 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,
/* Stack #0 not is used. */
n = reg->num_repeat + reg->num_mem * 2;

STACK_INIT(alloca_base, n, INIT_MATCH_STACK_SIZE);
STACK_INIT(alloca_base, xmalloc_base, n, INIT_MATCH_STACK_SIZE);
pop_level = reg->stack_pop_level;
num_mem = reg->num_mem;
repeat_stk = (OnigStackIndex* )alloca_base;
Expand Down Expand Up @@ -2915,20 +2930,24 @@ match_at(regex_t* reg, const UChar* str, const UChar* end,

finish:
STACK_SAVE;
xfree(xmalloc_base);
return best_len;

#ifdef ONIG_DEBUG
stack_error:
STACK_SAVE;
xfree(xmalloc_base);
return ONIGERR_STACK_BUG;
#endif

bytecode_error:
STACK_SAVE;
xfree(xmalloc_base);
return ONIGERR_UNDEFINED_BYTECODE;

unexpected_bytecode_error:
STACK_SAVE;
xfree(xmalloc_base);
return ONIGERR_UNEXPECTED_BYTECODE;
}

Expand Down

8 comments on commit b9fba1d

@sorbits
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this commit usable with threaded onigmo?

After upgrading to HEAD I got random crashes (in free). Reverting this commit seems to have fixed the issue.

@k-takata
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this commit with a customized testc.c on Windows and Linux, but I couldn't reproduce the problem.
Actually, I found some deadlock problems. I committed the fixes into the topic/multithread branch.

@sorbits
Copy link
Contributor

@sorbits sorbits commented on b9fba1d Mar 8, 2014

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the commit I (often) get the following error:

abort() called
*** error for object 0x7fad4b16ac08: incorrect checksum for freed object - object was probably modified after being freed.

The stack for that message is the following:

Onigmo.dylib              onig_region_free + 79 (regexec.c:320)
libsystem_malloc.dylib    szone_free_definite_size + 3429
libsystem_malloc.dylib    small_free_list_remove_ptr + 158
libsystem_malloc.dylib    szone_error + 587
libsystem_c.dylib         abort + 125
libsystem_pthread.dylib   pthread_kill + 92
libsystem_kernel.dylib    __pthread_kill + 10

To make onigmo thread safe I define THREAD_ATOMIC_START and THREAD_ATOMIC_END to lock/unlock a pthread_mutex_t, the code for that can be seen here: setup.c and setup.h.

@k-takata
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found another problem with this commit, and updated the topic/multithread branch.
But I still can't reproduce the problem. I have no idea why the error occurs in onig_region_free().

@k-takata
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sorbits Sorry, I didn't notice your comment in the gist. I added a comment.

@sorbits
Copy link
Contributor

@sorbits sorbits commented on b9fba1d Apr 4, 2014 via email

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@k-takata
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sorbits Can you try the latest commit in the mutlithread branch (a0efc0a)?

@sorbits
Copy link
Contributor

@sorbits sorbits commented on b9fba1d Apr 11, 2014 via email

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please sign in to comment.