Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

App crashes with "A heap has been corrupted" #249

Closed
danipen opened this issue Feb 17, 2022 · 11 comments
Closed

App crashes with "A heap has been corrupted" #249

danipen opened this issue Feb 17, 2022 · 11 comments

Comments

@danipen
Copy link

danipen commented Feb 17, 2022

Hi dear Onuguruma creator,

What I'm writing here is maybe a little bit out-of-topic, so sorry for that. I'd just like to kindly ask for some tip/help/advise ...

I'm creating a wrapper around your fantastic library to build TextMateSharp, a grammar interpreter for C#.

I'm asking you because you may help me to find or debug the issue. The problem is that we only reproduce it really very sporadically and in production environments, and that is killing us since we don't have a controlled environment to reproduce and fix.

The exception we get

What we're seeing only sometimes, is the following crash:

Unhandled exception at 0x00007FFE1ABCBE99 (ntdll.dll) in winplasticx.exe.31276.dmp: 0xC0000374: A heap has been corrupted (parameters: 0x00007FFE1AC36780).

This is the mixed stack trace:

 	ntdll.dll!RtlReportFatalFailure()
 	ntdll.dll!RtlReportCriticalFailure()
 	ntdll.dll!RtlpHeapHandleError()
 	ntdll.dll!RtlpHpHeapHandleError()
 	ntdll.dll!RtlpLogHeapFailure()
 	ntdll.dll!RtlpFreeHeapInternal()
 	ntdll.dll!RtlFreeHeap()
 	ntdll.dll!RtlpReAllocateHeap()
 	ntdll.dll!RtlpReAllocateHeapInternal()
 	ntdll.dll!RtlReAllocateHeap()
	onigwrap.dll!00007ffde0ebd03f()
 	onigwrap.dll!00007ffde0eadf62()
 	onigwrap.dll!00007ffde0ea4919()
 	onigwrap.dll!00007ffde0eb35d1()
 	onigwrap.dll!00007ffde0e9744c()
 	onigwrap.dll!00007ffde0e971da()
 	onigwrap.dll!00007ffde0e910af()
 	[Managed to Native Transition]	
 	TextMateSharp!TextMateSharp.Internal.Oniguruma.ORegex.ORegex(string pattern, bool ignoreCase, bool multiline)

Some useful information

We reproduce the issue in both Windows 10 x64 and macOS platforms. Not sure if it's reproduced in Linux, but probably.

The onig-wrapper is compiled including the static version of the oniguruma library.

We have a minidump file for the issue.

Can you take a look at the wrapper to see if you see everything correct? I think we're getting that "heap has been corrupted when doing onig_new: https://github.com/danipen/TextMateSharp/blob/49631c692556169a833cfccfc46c96576655d4f6/onigwrap/src/onigwrap.c#L20

And this is how we use the wrapper from C# code:
https://github.com/danipen/TextMateSharp/blob/5315ab80d6f1233ec28057d2788a5d32b7408435/src/TextMateSharp/Internal/Oniguruma/ORegex.cs

Thanks in advance!

@kkos
Copy link
Owner

kkos commented Feb 18, 2022

No problems were found in onigwrap.c.
I think it's useless to generate region every time in onigwrap_search(), but it has nothing to do with the current problem.
This library has been tested by OSS-Fuzz for one year and nine months, so I don't think the problem will happen so easily.
I think you need to narrow down the conditions under which the problem occurs from various perspectives, such as whether the problem occurs in single-threaded mode or only in multi-threaded mode.

@danipen
Copy link
Author

danipen commented Feb 18, 2022

Thank you very much for pointing it out, I'll try to rework the region part in onigwrap_search().

Yes! the library works perfectly. We have been extensively using it for a long time and 99.9% of the time we run without issues, but under some machine load or other circumstances, it crashes the program.

I noticed that the crash is much easier to reproduce when I launch many instances of my program at the same time (for example 6 instances). When I do that I reproduce the issue 100% of the time.

I compiled oniguruma and onigwrap libs in debug mode, and it unveiled that the last call is executed before the heap reallocation is onig_set_callout_of_name(). Then, the heap reallocation happens, the heap corruption is detected and the program crashes.

image

@danipen
Copy link
Author

danipen commented Feb 18, 2022

This is the log from the application verifier:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<avrf:logfile xmlns:avrf="Application Verifier">
	<avrf:logSession TimeStarted="2022-02-18 : 09:08:39" PID="26512" Version="2">
		<avrf:logEntry Time="2022-02-18 : 09:08:56" LayerName="Heaps" StopCode="0x10" Severity="Error">
			<avrf:message>Corrupted start stamp for heap block.</avrf:message>
			<avrf:parameter1>1fc5cce1000 - Heap handle used in the call.</avrf:parameter1>
			<avrf:parameter2>1fc2f968b50 - Heap block involved in the operation.</avrf:parameter2>
			<avrf:parameter3>4b0 - Size of the heap block.</avrf:parameter3>
			<avrf:parameter4>abcdbbbb - Corrupted stamp value.</avrf:parameter4>
			<avrf:stackTrace>
				<avrf:trace>vrfcore!VerifierDisableVerifier+7a7 ( @ 0)</avrf:trace>
				<avrf:trace>verifier!VerifierStopMessage+b9 ( @ 0)</avrf:trace>
				<avrf:trace>verifier!VerifierDisableFaultInjectionExclusionRange+3730 ( @ 0)</avrf:trace>
				<avrf:trace>verifier!VerifierDisableFaultInjectionExclusionRange+3a9a ( @ 0)</avrf:trace>
				<avrf:trace>verifier!VerifierCheckPageHeapAllocation+77 ( @ 0)</avrf:trace>
				<avrf:trace>vfbasics!+7ffa89d6397e ( @ 0)</avrf:trace>
				<avrf:trace>onigwrap!_realloc_base+73 (minkernel\crts\ucrt\src\appcrt\heap\realloc_base.cpp @ 46)</avrf:trace>
				<avrf:trace>onigwrap!onig_set_callout_of_name+2a2 ( @ 0)</avrf:trace>
				<avrf:trace>onigwrap!sprintf_s+859 ( @ 0)</avrf:trace>
				<avrf:trace>onigwrap!onig_initialize_encoding+b1 ( @ 0)</avrf:trace>
				<avrf:trace>onigwrap!onig_reg_init+5e ( @ 0)</avrf:trace>
				<avrf:trace>onigwrap!onig_new+5a ( @ 0)</avrf:trace>
				<avrf:trace>onigwrap!onigwrap_create+af ( @ 0)</avrf:trace>
				<avrf:trace>????????+00007FF9A824BE85</avrf:trace>
			</avrf:stackTrace>
		</avrf:logEntry>
	</avrf:logSession>
</avrf:logfile>

@danipen
Copy link
Author

danipen commented Feb 18, 2022

More findings: It seems to be a multi-threading issue. I'm able to avoid the issue if I statically lock every access to onig_new so only one thread at a time calls to it.

@danipen
Copy link
Author

danipen commented Feb 18, 2022

For the moment I will workaround the issue in my code it by synchronizing the threads when creating new oniguruma regexs:
https://github.com/danipen/TextMateSharp/blob/bd074b790ec58beda39a96a8b04908a0b861a578/src/TextMateSharp/Internal/Oniguruma/ORegex.cs#L33

But please, let me if you want to take a deeper dive into the issue, and/or if you need my help from my side.

@jkotas
Copy link

jkotas commented Feb 18, 2022

It seems to be a multi-threading issue.

The problem seems to be that onig_set_callout_of_name is growing GlobalCalloutNameList without taking any locks. Two threads trying to grow the list at the same time will lead to this heap corruption.

@kkos
Copy link
Owner

kkos commented Feb 19, 2022

It is true that onig_set_callout_of_name() does not support multi-threading. However, it is intended to affect all threads, so if you are in a multi-threaded environment, it is assumed that you will call it only as needed from one thread first and not use it after that, so there is no exclusive control.
And onig_set_callout_of_name() is not used in onigwrap.c, so it cannot be called directly.
However, I remembered that there is a place where it is called indirectly.

If the onig_initialize() function has never been called by the application, we have no choice but to call onig_initialize() in onig_new(). (I would like to stop this if possible).
In onig_initialize(), for a character encoding specified by an argument, the initialization function for that encoding is called, and onig_set_callout_of_name() may be executed from within it.

I'm not sure if this is the cause yet, but if it is, adding the following initialization function to onigwrap.c and calling it only once from the application first (not from multiple threads at the same time) may help.

extern int
onigwrap_initialize()
{
  OnigEncoding use_encs[1];

  use_encs[0] = ONIG_ENCODING_UTF16_LE;
  return onig_initialize(use_encs, sizeof(use_encs)/sizeof(use_encs[0]));
}

Even if this is not the cause of the problem, it should be done that way. (We also call onig_initialize() in sample/simple.c etc..)

@danipen
Copy link
Author

danipen commented Feb 19, 2022

@kkos thank you for the detailed information. I'll call onig_initialize() one at app level as you suggest.

Additionally, I would like to improve this ...

I think it's useless to generate region every time in onigwrap_search(), but it has nothing to do with the current problem.

Unfortunately I didn't catch you. Could you please clarify what you exactly mean and elaborate a little bit more?

@kkos
Copy link
Owner

kkos commented Feb 19, 2022

You don't have to create a region for each search, you can use one region for all searches.
In a multi-threaded environment, one region for each thread that performs the search is sufficient.
I don't know about C#, but there are probably thread-local variables/storage.
But it just means that you can reduce the number of calls to heap allocation by at most two in each search call, so if you don't mind that, you can leave it as it is.

#include "onigwrap.h"

extern int
onigwrap_initialize()
{
  OnigEncoding use_encs[1];

  use_encs[0] = ONIG_ENCODING_UTF16_LE;
  return onig_initialize(use_encs, sizeof(use_encs)/sizeof(use_encs[0]));
}

regex_t *onigwrap_create(char *pattern, int len, int ignoreCase, int multiline)
{
	regex_t *reg;

	OnigErrorInfo einfo;

	OnigOptionType onigOptions = ONIG_OPTION_NONE | ONIG_OPTION_CAPTURE_GROUP;

	if (ignoreCase == 1)
		onigOptions |= ONIG_OPTION_IGNORECASE;

	if (multiline == 1)
		onigOptions |= ONIG_OPTION_MULTILINE;

	OnigUChar *stringStart = (OnigUChar*) pattern;
	OnigUChar *stringEnd   = (OnigUChar*) pattern + len;
	
    int res = onig_new(
        &reg,
        stringStart,
        stringEnd,
        onigOptions,
        ONIG_ENCODING_UTF16_LE,
        ONIG_SYNTAX_DEFAULT,
        &einfo);

	return reg;
}

void onigwrap_region_free(OnigRegion *region)	
{
	onig_region_free(region, 1);
}

void onigwrap_free(regex_t *reg)
{
	onig_free(reg);
}

int onigwrap_index_in(regex_t *reg, char *charPtr, int offset, int length, OnigRegion *region)
{
	OnigUChar *stringStart  = (OnigUChar*) charPtr;
	OnigUChar *stringEnd    = (OnigUChar*) (charPtr + length);
	OnigUChar *stringOffset = (OnigUChar*) (charPtr + offset);
	OnigUChar *stringRange  = (OnigUChar*) stringEnd;

	int result = onig_search(
        reg,
        stringStart,
        stringEnd,
        stringOffset,
        stringRange,
        region,
        ONIG_OPTION_NONE);

	if (result >= 0)
		return result >> 1;
	if (result == ONIG_MISMATCH)
		return -1;
	return -2;
}

int onigwrap_search(regex_t *reg, char *charPtr, int offset, int length, OnigRegion *region)
{
	OnigUChar *stringStart  = (OnigUChar*) charPtr;
	OnigUChar *stringEnd    = (OnigUChar*) (charPtr + length);
	OnigUChar *stringOffset = (OnigUChar*) (charPtr + offset);
	OnigUChar *stringRange  = (OnigUChar*) stringEnd;

	int result = onig_search(reg, stringStart, stringEnd, stringOffset, stringRange, region, ONIG_OPTION_NONE);
	return result;
}

OnigRegion* onigwrap_region_new()
{
	return onig_region_new();
}

int onigwrap_num_regs(OnigRegion *region)
{
	return region->num_regs;
}

int onigwrap_pos(OnigRegion *region, int nth)
{
	if (nth < region->num_regs)
	{
		int result = region->beg[nth];
		if (result < 0)
			return result;
		return result >> 1;
	}
	return -3;
}

int onigwrap_len(OnigRegion *region, int nth)
{
	if (nth < region->num_regs)
	{
		int result = region->end[nth] - region->beg[nth];
		return result >> 1;
	}
	return -4;
}

@danipen
Copy link
Author

danipen commented Feb 21, 2022

@kkos are any other library parts that are not thread-safe?

In other words... If I call onigwrap_initialize() at program level, do you consider safe then to remove this lock?

https://github.com/danipen/TextMateSharp/blob/bd074b790ec58beda39a96a8b04908a0b861a578/src/TextMateSharp/Internal/Oniguruma/ORegex.cs#L33

@kkos
Copy link
Owner

kkos commented Feb 21, 2022

I think so.
Except for some initialization/setting functions and functions that manipulate regset elements, I think I've made it thread-safe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants