Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfixes #616

Closed
wants to merge 3 commits into from
Closed

Bugfixes #616

wants to merge 3 commits into from

Conversation

c01db33f
Copy link

Two fixes for minor issues, see the commit messages for explanations of the issues being fixed.

An alternative fix for the memory leak would be to delay moving the tag from m_tagStack to m_freeTagList until after the check has passed; I opted for this approach as it only changes code in the error path.

If the allocation in parserInit fails:
parser->m_protocolEncodingName = copyString(encodingName, &(parser->m_mem));

Then we'll call XML_ParserFree on the parser inside parserCreate:
if (encodingName && ! parser->m_protocolEncodingName) {
  XML_ParserFree(parser);
  return NULL;
}

If we're inside XML_ExternalEntityParserCreate, this can mean that
XML_ParserFree is called on a parser whose m_dtd pointer is shared with the
document's root parser. Since the flag m_isParamEntity is only set in
XML_ExternalEntityParserCreate after parserCreate returns, this call to
XML_ParserFree will incorrectly destroy the shared dtd.

This fix moves the setting of m_isParamEntity into parserCreate, since the
dtd parameter is only non-NULL in this case.
In the handling of XML_TOK_END_TAG, we push tag onto parser->m_freeTagList.
If we then hit the error case, we'll return without pushing the tag's bindings
onto parser->m_freeBindingList, so the binding allocations will be leaked.
moveToFreeBindingList(parser, tag->bindings);
tag->bindings = NULL;
Copy link
Member

@hartwork hartwork Jul 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far I cannot confirm a leak. This line of code is touched by the test suite and the test suite is executed under LeakSan. I would expect a call to XML_ParserFree to "plug" this leak, and missing to call XML_ParserFree would be a bug in the code using Expat. Please help me understand why we have a true leak here if so, thank you.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we move tag from m_tagStack to m_freeTagList:

3017:

  TAG *tag = parser->m_tagStack;
  parser->m_tagStack = tag->parent;
  tag->parent = parser->m_freeTagList;
  parser->m_freeTagList = tag;

We don't do anything with the tag bindings until later:

3057:

  while (tag->bindings) {
    BINDING *b = tag->bindings;
    if (parser->m_endNamespaceDeclHandler)
      parser->m_endNamespaceDeclHandler(parser->m_handlerArg,
                                        b->prefix->name);
    tag->bindings = tag->bindings->nextTagBinding;
    b->nextTagBinding = parser->m_freeBindingList;
    parser->m_freeBindingList = b;
    b->prefix->binding = b->prevPrefixBinding;
  }

This means that we can have a TAG object on m_freeTagList with a non-NULL
bindings pointer.

If the user then calls XML_ParserFree, this will be handled correctly:

1439:

  /* free m_tagStack and m_freeTagList */
  tagList = parser->m_tagStack;
  for (;;) {
    TAG *p;
    if (tagList == NULL) {
      if (parser->m_freeTagList == NULL)
        break;
      tagList = parser->m_freeTagList;
      parser->m_freeTagList = NULL;
    }
    p = tagList;
    tagList = tagList->parent;
    FREE(parser, p->buf);
    destroyBindings(p->bindings, parser);
    FREE(parser, p);
  }

However, if the user instead calls XML_ParserReset, which the API documentation
suggests to me is an intended use-case, then this is not handled:

1186:

  XML_Bool XMLCALL
  XML_ParserReset(XML_Parser parser, const XML_Char *encodingName) {
    TAG *tStk;
    OPEN_INTERNAL_ENTITY *openEntityList;

    if (parser == NULL)
      return XML_FALSE;

    if (parser->m_parentParser)
      return XML_FALSE;
    /* move m_tagStack to m_freeTagList */
    tStk = parser->m_tagStack;
    while (tStk) {
      TAG *tag = tStk;
      tStk = tStk->parent;
      tag->parent = parser->m_freeTagList;
      moveToFreeBindingList(parser, tag->bindings);
      tag->bindings = NULL;
      parser->m_freeTagList = tag;
    }
    /* move m_openInternalEntities to m_freeInternalEntities */
    openEntityList = parser->m_openInternalEntities;
    while (openEntityList) {
      OPEN_INTERNAL_ENTITY *openEntity = openEntityList;
      openEntityList = openEntity->next;
      openEntity->next = parser->m_freeInternalEntities;
      parser->m_freeInternalEntities = openEntity;
    }
    moveToFreeBindingList(parser, parser->m_inheritedBindings);
    FREE(parser, parser->m_unknownEncodingMem);
    if (parser->m_unknownEncodingRelease)
      parser->m_unknownEncodingRelease(parser->m_unknownEncodingData);
    poolClear(&parser->m_tempPool);
    poolClear(&parser->m_temp2Pool);
    FREE(parser, (void *)parser->m_protocolEncodingName);
    parser->m_protocolEncodingName = NULL;
    parserInit(parser, encodingName);
    dtdReset(parser->m_dtd, &parser->m_mem);
    return XML_TRUE;
  }

When we then later re-use the tag, the bindings pointer is just cleared:

2904:

  TAG *tag;
  enum XML_Error result;
  XML_Char *toPtr;
  if (parser->m_freeTagList) {
    tag = parser->m_freeTagList;
    parser->m_freeTagList = parser->m_freeTagList->parent;
  } else {
    tag = (TAG *)MALLOC(parser, sizeof(TAG));
    if (! tag)
      return XML_ERROR_NO_MEMORY;
    tag->buf = (char *)MALLOC(parser, INIT_TAG_BUF_SIZE);
    if (! tag->buf) {
      FREE(parser, tag);
      return XML_ERROR_NO_MEMORY;
    }
    tag->bufEnd = tag->buf + INIT_TAG_BUF_SIZE;
  }
  tag->bindings = NULL;

AFAIU this won't happen without a call to XML_ParserReset, since we
need to go through the error path in XML_TOK_END_TAG, and at that point
we'll be in an error state and shouldn't continue using the parser.

(Another fix would be to have XML_ParserReset add the bindings from all
of the free tags to m_freeBindingsList, but given that the path which
allocates from m_freeTagList clearly doesn't expect this pointer to be
valid, I thought it was better to ensure that invariant is always true
instead.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@c01db33f excellent explanation, thank you! 👍 I confirm that this is a true memory leak bug.

I believe I have found an even simpler fix: We could postpone the move to the free tag list until after the closing tag name match check. I have created a pull request #653 now that contains that approach of a patch and a reproducer test based on your explanation. You review would be appreciated.

(I have also created a dedicated ticket for this issue at #652.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related: #653 (comment)

parser->m_dtd = dtd;
else {
#ifdef XML_DTD
parser->m_isParamEntity = XML_TRUE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@c01db33f I have had a chance at a closer look at the DTD destruction topic now. I confirm that you have found a bug, many thanks for the detailed report! 👍

I found that function XML_ExternalEntityParserCreate can be called with either a NULL context or a non-NULL context. Previously on master, only in case of a NULL context parser->m_isParamEntity = XML_TRUE would be applied (in the else branch below) …

libexpat/expat/lib/xmlparse.c

Lines 1392 to 1412 in b4eecc1

if (context) {
#endif /* XML_DTD */
if (! dtdCopy(oldParser, parser->m_dtd, oldDtd, &parser->m_mem)
|| ! setContext(parser, context)) {
XML_ParserFree(parser);
return NULL;
}
parser->m_processor = externalEntityInitProcessor;
#ifdef XML_DTD
} else {
/* The DTD instance referenced by parser->m_dtd is shared between the
document's root parser and external PE parsers, therefore one does not
need to call setContext. In addition, one also *must* not call
setContext, because this would overwrite existing prefix->binding
pointers in parser->m_dtd with ones that get destroyed with the external
PE parser. This would leave those prefixes with dangling pointers.
*/
parser->m_isParamEntity = XML_TRUE;
XmlPrologStateInitExternalEntity(&parser->m_prologState);
parser->m_processor = externalParEntInitProcessor;
}

…but the patch in #616 here seems to now apply parser->m_isParamEntity = XML_TRUE irrespective of context. Is that intended?

I have created a dedicated pull request #650 with an alternative candidate for a patch and a reproducer test case. Your review would be greatly appreciated.

(I have also created a dedicated ticket #649 from your report and patch candidate.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: @c01db33f how much security impact (if any) do you expect for this bug? I'm unsure and asking due to the need for OOM.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the late reply.

If I remember correctly my intention was to default to m_isParamEntity = XML_TRUE, and then later instead of setting it to XML_TRUE too late, it would instead set XML_FALSE for the opposite case, but I think that you're right that I have missed the case of a NULL context not also setting XML_TRUE.

Your fix in the alternative patch should also resolve the issue, and is a smaller change, so should probably be preferred.

I don't think there's a significant security impact here, although I could be wrong. The size of the allocation that needs to fail is very small, so it would need a lot of control over the context in which it's running to get that allocation specifically to fail, and it would be even more difficult to then get things to continue far enough to make use of the bug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@c01db33f thanks for the review and elaboration! 🙏 From what you wrote I'm hearing that there is some greater zero chance that this could be exploited. I'm tending towards treating it as security relevant then and to file a CVE so that everyone is safe, eventually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@c01db33f for the record, we have received CVE-2022-43680 for this issue today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants