Use different namespace_separator with new libexpat #77

Merged on Feb 28, 2022 (1 commit)

sebageek (Collaborator)

Newer versions of libexpat include a mitigation for CVE-2022-25236 that
disallows certain characters as namespace separators (to my
understanding, this is the character that separates the namespace from
the tag name in the parsed XML output we receive from the library). We
use libexpat implicitly via xmltodict.parse(), and xmltodict defaults to
':' as the separator, which is now invalid. Using ':' as the separator
results in the following exception:

xml.parsers.expat.ExpatError: out of memory: line 1, column 0

This can also be reproduced with this Python snippet:

xmltodict.parse("<foo></foo>", process_namespaces=True)

To mitigate this, we need to use a different separator.
xmltodict.parse() exposes it as an argument, so passing
namespace_separator=' ' (recommended by libexpat as a character that
cannot be part of a URL, see the bug reports below or the CVE) solves
the problem for us. From what I can see, this doesn't require any other
changes on our side.
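
To illustrate what the separator does, here is a minimal sketch that
talks to expat directly through Python's standard-library
xml.parsers.expat module rather than through xmltodict. The helper name
element_names and the example namespace URI are made up for
illustration and are not part of this change.

```python
from xml.parsers import expat


def element_names(xml_text, separator=" "):
    """Return the element names expat reports when namespaces are processed.

    Hypothetical helper for illustration only. With a namespace separator
    set, expat reports each element as "<namespace URI><separator><tag>".
    """
    names = []
    parser = expat.ParserCreate(namespace_separator=separator)
    # Record the name of every start tag as expat hands it to us.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


# The namespace URI below is an arbitrary example value.
names = element_names('<a:foo xmlns:a="http://example.com/ns"><a:bar/></a:foo>')
print(names)
```

A space cannot occur inside the namespace URI, so the reported name can
be split unambiguously on the separator. With xmltodict, the
corresponding call is xmltodict.parse(xml, process_namespaces=True,
namespace_separator=' ').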

Relevant change in libexpat:
 * libexpat/libexpat#561

Relevant bug reports:
 * libexpat/libexpat#572
 * martinblech/xmltodict#289
@sebageek sebageek merged commit 90ee7ac into stable/ussuri-m3 Feb 28, 2022
@sebageek sebageek deleted the xmltodict-specify-namespace-separator branch February 28, 2022 10:12