New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bugfixes #616
Bugfixes #616
Conversation
If the allocation in parserInit fails:
parser->m_protocolEncodingName = copyString(encodingName, &(parser->m_mem));
Then we'll call XML_ParserFree on the parser inside parserCreate:
if (encodingName && ! parser->m_protocolEncodingName) {
XML_ParserFree(parser);
return NULL;
}
If we're inside XML_ExternalEntityParserCreate, this can mean that
XML_ParserFree is called on a parser whose m_dtd pointer is shared with the
document's root parser. Since the flag m_isParamEntity is only set in
XML_ExternalEntityParserCreate after parserCreate returns, this call to
XML_ParserFree will incorrectly destroy the shared dtd.
This fix moves the setting of m_isParamEntity into parserCreate, since the
dtd parameter is only non-NULL in this case.
In the handling of XML_TOK_END_TAG, we push tag onto parser->m_freeTagList. If we then hit the error case, we'll return without pushing the tag's bindings onto parser->m_freeBindingList, so the binding allocations will be leaked.
| moveToFreeBindingList(parser, tag->bindings); | ||
| tag->bindings = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far I cannot confirm a leak. This line of code is touched by the test suite and the test suite is executed under LeakSan. I would expect a call to XML_ParserFree to "plug" this leak, and missing to call XML_ParserFree would be a bug in the code using Expat. Please help me understand why we have a true leak here if so, thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we move tag from m_tagStack to m_freeTagList:
3017:
TAG *tag = parser->m_tagStack;
parser->m_tagStack = tag->parent;
tag->parent = parser->m_freeTagList;
parser->m_freeTagList = tag;
We don't do anything with the tag bindings until later:
3057:
while (tag->bindings) {
BINDING *b = tag->bindings;
if (parser->m_endNamespaceDeclHandler)
parser->m_endNamespaceDeclHandler(parser->m_handlerArg,
b->prefix->name);
tag->bindings = tag->bindings->nextTagBinding;
b->nextTagBinding = parser->m_freeBindingList;
parser->m_freeBindingList = b;
b->prefix->binding = b->prevPrefixBinding;
}
This means that we can have a TAG object on m_freeTagList with a non-NULL
bindings pointer.
If the user then calls XML_ParserFree, this will be handled correctly:
1439:
/* free m_tagStack and m_freeTagList */
tagList = parser->m_tagStack;
for (;;) {
TAG *p;
if (tagList == NULL) {
if (parser->m_freeTagList == NULL)
break;
tagList = parser->m_freeTagList;
parser->m_freeTagList = NULL;
}
p = tagList;
tagList = tagList->parent;
FREE(parser, p->buf);
destroyBindings(p->bindings, parser);
FREE(parser, p);
}
However, if the user instead calls XML_ParserReset, which the API documentation
suggests to me is an intended use-case, then this is not handled:
1186:
XML_Bool XMLCALL
XML_ParserReset(XML_Parser parser, const XML_Char *encodingName) {
TAG *tStk;
OPEN_INTERNAL_ENTITY *openEntityList;
if (parser == NULL)
return XML_FALSE;
if (parser->m_parentParser)
return XML_FALSE;
/* move m_tagStack to m_freeTagList */
tStk = parser->m_tagStack;
while (tStk) {
TAG *tag = tStk;
tStk = tStk->parent;
tag->parent = parser->m_freeTagList;
moveToFreeBindingList(parser, tag->bindings);
tag->bindings = NULL;
parser->m_freeTagList = tag;
}
/* move m_openInternalEntities to m_freeInternalEntities */
openEntityList = parser->m_openInternalEntities;
while (openEntityList) {
OPEN_INTERNAL_ENTITY *openEntity = openEntityList;
openEntityList = openEntity->next;
openEntity->next = parser->m_freeInternalEntities;
parser->m_freeInternalEntities = openEntity;
}
moveToFreeBindingList(parser, parser->m_inheritedBindings);
FREE(parser, parser->m_unknownEncodingMem);
if (parser->m_unknownEncodingRelease)
parser->m_unknownEncodingRelease(parser->m_unknownEncodingData);
poolClear(&parser->m_tempPool);
poolClear(&parser->m_temp2Pool);
FREE(parser, (void *)parser->m_protocolEncodingName);
parser->m_protocolEncodingName = NULL;
parserInit(parser, encodingName);
dtdReset(parser->m_dtd, &parser->m_mem);
return XML_TRUE;
}
When we then later re-use the tag, the bindings pointer is just cleared:
2904:
TAG *tag;
enum XML_Error result;
XML_Char *toPtr;
if (parser->m_freeTagList) {
tag = parser->m_freeTagList;
parser->m_freeTagList = parser->m_freeTagList->parent;
} else {
tag = (TAG *)MALLOC(parser, sizeof(TAG));
if (! tag)
return XML_ERROR_NO_MEMORY;
tag->buf = (char *)MALLOC(parser, INIT_TAG_BUF_SIZE);
if (! tag->buf) {
FREE(parser, tag);
return XML_ERROR_NO_MEMORY;
}
tag->bufEnd = tag->buf + INIT_TAG_BUF_SIZE;
}
tag->bindings = NULL;
AFAIU this won't happen without a call to XML_ParserReset, since we
need to go through the error path in XML_TOK_END_TAG, and at that point
we'll be in an error state and shouldn't continue using the parser.
(Another fix would be to have XML_ParserReset add the bindings from all
of the free tags to m_freeBindingsList, but given that the path which
allocates from m_freeTagList clearly doesn't expect this pointer to be
valid, I thought it was better to ensure that invariant is always true
instead.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@c01db33f excellent explanation, thank you!
I believe I have found an even simpler fix: We could postpone the move to the free tag list until after the closing tag name match check. I have created a pull request #653 now that contains that approach of a patch and a reproducer test based on your explanation. You review would be appreciated.
(I have also created a dedicated ticket for this issue at #652.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related: #653 (comment)
| parser->m_dtd = dtd; | ||
| else { | ||
| #ifdef XML_DTD | ||
| parser->m_isParamEntity = XML_TRUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@c01db33f I have had a chance at a closer look at the DTD destruction topic now. I confirm that you have found a bug, many thanks for the detailed report!
I found that function XML_ExternalEntityParserCreate can be called with either a NULL context or a non-NULL context. Previously on master, only in case of a NULL context parser->m_isParamEntity = XML_TRUE would be applied (in the else branch below) …
Lines 1392 to 1412 in b4eecc1
| if (context) { | |
| #endif /* XML_DTD */ | |
| if (! dtdCopy(oldParser, parser->m_dtd, oldDtd, &parser->m_mem) | |
| || ! setContext(parser, context)) { | |
| XML_ParserFree(parser); | |
| return NULL; | |
| } | |
| parser->m_processor = externalEntityInitProcessor; | |
| #ifdef XML_DTD | |
| } else { | |
| /* The DTD instance referenced by parser->m_dtd is shared between the | |
| document's root parser and external PE parsers, therefore one does not | |
| need to call setContext. In addition, one also *must* not call | |
| setContext, because this would overwrite existing prefix->binding | |
| pointers in parser->m_dtd with ones that get destroyed with the external | |
| PE parser. This would leave those prefixes with dangling pointers. | |
| */ | |
| parser->m_isParamEntity = XML_TRUE; | |
| XmlPrologStateInitExternalEntity(&parser->m_prologState); | |
| parser->m_processor = externalParEntInitProcessor; | |
| } |
…but the patch in #616 here seems to now apply
parser->m_isParamEntity = XML_TRUE irrespective of context. Is that intended?
I have created a dedicated pull request #650 with an alternative candidate for a patch and a reproducer test case. Your review would be greatly appreciated.
(I have also created a dedicated ticket #649 from your report and patch candidate.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS: @c01db33f how much security impact (if any) do you expect for this bug? I'm unsure and asking due to the need for OOM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the late reply.
If I remember correctly my intention was to default to m_isParamEntity = XML_TRUE, and then later instead of setting it to XML_TRUE too late, it would instead set XML_FALSE for the opposite case, but I think that you're right that I have missed the case of a NULL context not also setting XML_TRUE.
Your fix in the alternative patch should also resolve the issue, and is a smaller change, so should probably be preferred.
I don't think there's a significant security impact here, although I could be wrong. The size of the allocation that needs to fail is very small, so it would need a lot of control over the context in which it's running to get that allocation specifically to fail, and it would be even more difficult to then get things to continue far enough to make use of the bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@c01db33f thanks for the review and elaboration!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@c01db33f for the record, we have received CVE-2022-43680 for this issue today.
|
Closing as superseded by the issues…
…and the related pull requests…
…for the purpose of clarity. |
Two fixes for minor issues, see the commit messages for explanations of the issues being fixed.
An alternative fix for the memory leak would be to delay moving the tag from m_tagStack to m_freeTagList until after the check has passed; I opted for this approach as it only changes code in the error path.