Memory corruption in onig_error_code_to_str() #132

Closed
RKX1209 opened this issue Jul 27, 2019 · 1 comment

RKX1209 commented Jul 27, 2019

When onig_new() with ONIG_SYNTAX_PERL fails with error code -215 (ONIGERR_INVALID_GROUP_NAME), onig_error_code_to_str() crashes because of an invalid memory access.

Here is a PoC based on sample/syntax.c:

/*
 * perl_syntax.c
 */
#include <stdio.h>
#include <string.h>
#include "onigmo.h"

extern int exec(const OnigSyntaxType* syntax,
		char* apattern, char* astr)
{
  int r;
  unsigned char *start, *range, *end;
  regex_t* reg;
  OnigErrorInfo einfo;
  OnigRegion *region;
  UChar* pattern = (UChar* )apattern;
  UChar* str     = (UChar* )astr;

  fprintf(stderr, "pattern %s (%lu)\n", pattern, strlen((char*)pattern));
  r = onig_new(&reg, pattern, pattern + strlen((char* )pattern),
	       ONIG_OPTION_DEFAULT, ONIG_ENCODING_ASCII, syntax, &einfo);
  fprintf(stderr, "RES: %d EINFO.ENC: %p\n", r, (void *)einfo.enc);
  if (r != ONIG_NORMAL) {
    OnigUChar s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str(s, r, &einfo);
    fprintf(stderr, "ERROR: %s\n", s);
    return -1;
  }

  region = onig_region_new();

  end   = str + strlen((char* )str);
  start = str;
  range = end;
  r = onig_search(reg, str, end, start, range, region, ONIG_OPTION_NONE);
  if (r >= 0) {
    int i;

    fprintf(stderr, "match at %d\n", r);
    for (i = 0; i < region->num_regs; i++) {
      fprintf(stderr, "%d: (%ld-%ld)\n", i, region->beg[i], region->end[i]);
    }
    r = 0;
  }
  else if (r == ONIG_MISMATCH) {
    fprintf(stderr, "search fail\n");
    r = -1;
  }
  else { /* error */
    OnigUChar s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str(s, r);
    fprintf(stderr, "ERROR: %s\n", s);
    return -1;
  }

  onig_region_free(region, 1 /* 1:free self, 0:free contents only */);
  onig_free(reg);
  onig_end();
  return r;
}

//#define CRASH_WHP

extern int main(int argc, char* argv[])
{
  int r = 0;

  #ifdef CRASH_WHP
  /* Full raw bytes of a regex that is highly likely to cause a NULL pointer dereference */
  char regex[] = {0x28, 0x3f, 0x30, 0x64, 0x1f, 0x03, 0x14, 0x7f};
  #else
  /* Short text version of a regex that causes a NULL pointer dereference */
  char regex[] = "(?0d";
  #endif
  r |= exec(ONIG_SYNTAX_PERL,
  	    regex,
  	   "bgh8x");    
  onig_end();
  return r;
}

$ gcc -o perl_syntax perl_syntax.c -lonigmo
$ ./perl_syntax
pattern (?0d (4)
RES: -215 EINFO.ENC: 0x7ffc07848cd0
Segmentation fault (Core dumped)

I've confirmed that after onig_new(ONIG_SYNTAX_PERL ....) fails in exec(), einfo.enc points to an invalid address.
onig_error_code_to_str() then uses members of the invalid einfo.enc, which causes the memory corruption.
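
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern. It does not use Onigmo at all: `MockEncoding`, `MockErrorInfo`, `ascii_mbc_to_code` and `format_group_name_error` are hypothetical stand-ins invented for illustration, and this is not how the library or its eventual fix is written. It only shows the idea that an error formatter should not call through `enc` unless the error info was actually filled in. Note that such a check can only catch NULL fields, so it assumes the structure was zero-initialized before the failing call; it cannot detect the stack garbage seen in this report.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; these are NOT Onigmo's real types. */
typedef unsigned int (*mbc_to_code_fn)(const unsigned char *p,
                                       const unsigned char *end);

typedef struct {
  mbc_to_code_fn mbc_to_code;    /* decodes one character */
} MockEncoding;

typedef struct {
  const MockEncoding  *enc;      /* may be left unset on some failure paths */
  const unsigned char *par;      /* start of the offending text, or NULL */
  const unsigned char *par_end;  /* one past the end of the offending text */
} MockErrorInfo;

/* Single-byte decoder used as the mock encoding's callback. */
static unsigned int ascii_mbc_to_code(const unsigned char *p,
                                      const unsigned char *end)
{
  (void)end;
  return *p;
}

static const MockEncoding mock_ascii = { ascii_mbc_to_code };

/*
 * Guarded formatter: it calls through info->enc only when every field it
 * needs is non-NULL. Returns 1 if it took the path that appends the
 * parameter text, 0 if it wrote only the generic message.
 */
static int format_group_name_error(char *buf, size_t size,
                                   const MockErrorInfo *info)
{
  size_t n;
  const unsigned char *p;

  if (size == 0) return 0;

  if (info == NULL || info->par == NULL || info->par_end == NULL ||
      info->enc == NULL || info->enc->mbc_to_code == NULL) {
    snprintf(buf, size, "invalid group name");
    return 0;
  }

  n = (size_t)snprintf(buf, size, "invalid group name <");
  if (n + 2 > size) return 1;  /* no room for '>' and NUL; keep truncated text */

  /* Keep room for one character, the closing '>' and the terminating NUL. */
  for (p = info->par; p < info->par_end && n + 2 < size; p++) {
    unsigned int code = info->enc->mbc_to_code(p, info->par_end);
    buf[n++] = (code < 0x80) ? (char)code : '?';
  }
  buf[n++] = '>';
  buf[n] = '\0';
  return 1;
}
```

With a zero-initialized `MockErrorInfo info = {0};` the formatter falls back to the generic message instead of jumping to address 0. With `enc` set to `&mock_ascii` and `par`/`par_end` delimiting the two bytes of "0d", it writes "invalid group name <0d>".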

Here is the crash log.
ONIGENC_MBC_TO_CODE(enc, p, end) in to_ascii() tries to call address 0 (einfo.enc->mbc_to_code).

$ gdb -q ./perl_syntax -c core
Reading symbols from ./perl_syntax...done.
[New LWP 26801]
Core was generated by `./perl_syntax'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000000000 in ?? ()

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f8f4b15f7fc in to_ascii (enc=0x7fff38b8e890, s=0x0,
    end=0x7f8f4b637a98 "\310ycK\217\177", buf=0x7fff38b8e700 "", buf_size=47,
    is_over=0x7fff38b8e6c4) at regerror.c:219
#2  0x00007f8f4b15fb35 in onig_error_code_to_str (s=0x7fff38b8e880 "", code=-215)
    at regerror.c:281
#3  0x000055b7122bdc40 in exec (syntax=0x55b7124bed40 <OnigSyntaxPerl>,
    apattern=0x7fff38b8e923 "(?0d", astr=0x55b7122bdfd1 "bgh8x") at perl_syntax.c:25
#4  0x000055b7122bdec1 in main (argc=1, argv=0x7fff38b8ea18) at perl_syntax.c:75

(gdb) up
#1  0x00007f8f4b15f7fc in to_ascii (enc=0x7fff38b8e890, s=0x0,
    end=0x7f8f4b637a98 "\310ycK\217\177", buf=0x7fff38b8e700 "", buf_size=47,
    is_over=0x7fff38b8e6c4) at regerror.c:219
219           code = ONIGENC_MBC_TO_CODE(enc, p, end);

(gdb) p *enc
$1 = {precise_mbc_enc_len = 0xffffffff, name = 0x0, max_enc_len = 951968360,
  min_enc_len = 32767, is_mbc_newline = 0x7f8f4b637710, mbc_to_code = 0x0,
  code_to_mbclen = 0x0, code_to_mbc = 0x0, mbc_case_fold = 0x756e6547,
  apply_all_case_fold = 0x9, get_case_fold_codes_by_str = 0x7f8f4b410660 <dl_main>,
  property_name_to_ctype = 0x7fff38b8e948, is_code_ctype = 0x2fba3e78dd29d900,
  get_ctype_code_range = 0x7fff38b8e930,
  left_adjust_char_head = 0x55b7122bdec1 <main+74>,
  is_allowed_reverse_match = 0x7fff38b8ea18, case_map = 0x100000000,
  ruby_encoding_index = 304865008, flags = 21943}

(gdb) p enc->mbc_to_code
$2 = (OnigCodePoint (*)(const OnigUChar *, const OnigUChar *,
    const struct OnigEncodingTypeST *)) 0x0

Thanks
Ren

k-takata added the bug label Jul 29, 2019
k-takata added a commit that referenced this issue Jul 29, 2019
Fix SEGV in onig_error_code_to_str() (Fix #132)
k-takata (Owner) commented

Thank you for reporting.
