Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

match: avoid crash if subject NULL and PCRE2_ZERO_TERMINATED #53

Merged
merged 3 commits into from
Nov 27, 2021

Conversation

carenas
Copy link
Contributor

@carenas carenas commented Nov 20, 2021

Probably not that critical, as it has been there since the very beginning but seems like a better behaviour to have when using PCRE2_ZERO_TERMINATED and matches better the documentation.

Problem also exist in pcre2_dfa_match and POSIX (but there crashing is the expected behaviour), as well as in pcre2_substitute where it could crash even when not using PCRE2_ZERO_TERMINATED, so split onto multiple patches for easy of reviewing.

pcre2_substitute needs a lot more checks and will have to keep an exception as the replacement string was not checked before and it just happens to work fine regardless when the length was given as 0.

Tests are still missing and I am not sure of the design required to implement them yet, but if that is deemed not critical (after all that is the current state, as there is no way to test NULL subjects, patterns or replacement strings as of now) then could be implemented as a follow up.

@ltrzesniewski
Copy link
Contributor

I also noticed there's currently an error when subject is NULL and length is 0, which would amount to matching against an empty string. Do you think the library should allow that case?

@carenas
Copy link
Contributor Author

carenas commented Nov 21, 2021

I also noticed there's currently an error when subject is NULL and length is 0

that is by design, subject can never be NULL and if it does pcre2_match() returns and error with equivalent message "NULL argument passed". it is a little subtle, but in C, NULL is the address of a variable outside your memory space (hence invalid), while "" is the address of a variable that contains the NUL('\0') character in your memory space (therefore valid).

which would amount to matching against an empty string. Do you think the library should allow that case?

when the subject is a valid address, then length of 0 implies it is an "empty string" and even matches a pattern like "^$", NULL cannot be accessed and therefore can't match anything, hence why the API make it clear by returning an error.

hopefully the following hacky example code could help clarify:

#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
#include <stdio.h>

int main(void)
{
	PCRE2_UCHAR pattern[] = "^$";
	PCRE2_SIZE o;
	PCRE2_UCHAR err[256];
	PCRE2_UCHAR *subjects[] = { "nonempty", "", NULL };
	int r;
	pcre2_match_data *m;
	pcre2_code *re = pcre2_compile(pattern, 2, 0, &r, &o, NULL);
	if (!re) {
		pcre2_get_error_message(r, err, sizeof err);
		printf("%s at %zu\n", err, o);
		return 1;
	}
	m = pcre2_match_data_create_from_pattern(re, NULL);
	for (o = 0; o < 3; o++) {
		printf("matching: '%s' ", subjects[o]);
		r = pcre2_match(re, subjects[o], PCRE2_ZERO_TERMINATED, 0, 0, m, NULL);
		if (r < PCRE2_ERROR_NOMATCH) {
			pcre2_get_error_message(r, err, sizeof err);
			printf("%s\n", err);
		} else if (r == PCRE2_ERROR_NOMATCH)
			printf("NOMATCH\n");
		else
			printf("MATCH: %d\n", r);
	}
	return 0;
}

@zherczeg
Copy link
Collaborator

I usually prefer to support ptr:NULL, length:0 for string manipulations (all string.h functions support it) but it looks like this is not the case for pcre2. I wouldn't mind supporting it though.

@carenas carenas marked this pull request as draft November 21, 2021 10:28
carenas added a commit to carenas/pcre2 that referenced this pull request Nov 21, 2021
As the first step towards allowing a NULL string to represent an
empty one, and as an alternative fix for the failures addressed
by PCRE2Project#53
@ltrzesniewski
Copy link
Contributor

@carenas I think there's a misunderstanding.

What I meant is that when length is 0 (and only in that case), the value of subject becomes irrelevant, as it should not be dereferenced anyway. NULL should not represent an empty string in the general case, it should only be allowed when the length is 0.

From a quick glance at your change, it looks like you handle a NULL subject with a PCRE2_ZERO_TERMINATED length like an empty string, which is incorrect. PCRE2 should return PCRE2_ERROR_NULL in that case. #55 is also incorrect since NULL should not always be treated as an empty string.

If you prefer, go with your original change, and I'll submit a separate PR later on. I didn't want to hijack your PR, sorry for that.
Also, there's no need to explain basic C principles to me :)

When length of subject is PCRE2_ZERO_TERMINATED strlen is used
to calculate its size, which will trigger a crash if subject is
also NULL.

Move the NULL check before strlen on it would be used, and make
sure or dependent variables are set after the NULL validation
as well.

While at it, fix a typo in a debug flag in the same file, which
is otherwise unrelated and make sure the full section of constrain
checks can be identified clearly using the leading comment alone.
When length of subject is PCRE2_ZERO_TERMINATED strlen is used
to calculate its size, which will trigger a crash if subject is
also NULL.

Move the NULL check before the detection for subject sizes to
avoid this issue.
The underlying pcre2_match() function will validate the subject if
needed, but will crash when length is PCRE2_ZERO_TERMINATED or if
subject == NULL and pcre2_match() is not being called because
match_data was provided.

The replacement parameter is missing NULL checks, and so currently
allows for an equivalent response to "" if rlength == 0.

Restrict all other cases to avoid strlen(NULL) crashes in the same
way that is done for subject, but also make sure to reject invalid
length values as early as possible.
@carenas carenas changed the title pcre2_match: avoid crash if subject NULL and PCRE2_ZERO_TERMINATED match: avoid crash if subject NULL and PCRE2_ZERO_TERMINATED Nov 22, 2021
@carenas carenas marked this pull request as ready for review November 26, 2021 17:57
@PhilipHazel PhilipHazel merged commit ae4e626 into PCRE2Project:master Nov 27, 2021
@carenas carenas deleted the matchcrash branch November 27, 2021 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants