New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xmlparse.c: Fix extraction of namespace prefix from XML name (#186) #262
Conversation
I feel a bit uneasy about the "fix" and even more so about the explanation. It is right, that in "bare" (namespace agnostic) XML documents colons (:) are allowed in element (and also) attribute names. That is: parser created with XML_ParserCreate() should accept documents with such names (and they did and do, as far as I'm aware). Now, if you want a parser, that accepts documents which are compliant not only to the w3c XML recommendation but also to the w3c XML namespaces recommendation the rules are a little bit adjusted. For expat this are the parsers created with XML_ParserCreateNS(). For such documents at max one colon per element (or attribute) name is possible. The productions only a few lines below the place (https://www.w3.org/TR/xml-names/#NT-QName) cited by hartwork are cristal clear in this regard (if one need a confirming look into the spec for such a basic fact about XML). More than colon is a well-formedness error in this case. So, either the parser is a "bare" XML rec parser then any number of colons (not as start char, but otherwise) in an element or attribute name is OK and there is no special meaning to them. So there is no point in looking them up with SetElementTypePrefix(). Or the parser is an XML and XML namespaces aware parser. Then SetElementTypePrefix() should simply return parsing error at the second colon in a name. |
I have not been following this issue, just noticed it now. But it seems
Rolf Ade's comment is correct.
Can't go really look at the code change as I am on vacation and only have
my phone at hand.
…On Mon., Jul. 8, 2019, 1:04 a.m. Rolf Ade, ***@***.***> wrote:
I feel a bit uneasy about the "fix" and even more so about the
explanation. It is right, that in "bare" (namespace agnostic) XML documents
colons (:) are allowed in element (and also) attribute names. That is:
parser created with XML_ParserCreate() should accept documents with such
names (and they did and do, as far as I'm aware).
Now, if you want a parser, that accepts documents which are compliant not
only to the w3c XML recommendation but also to the w3c XML namespaces
recommendation the rules are a little bit adjusted. For expat this are the
parsers created with XML_ParserCreateNS().
For such documents at max *one* colon per element (or attribute) name is
possible. The productions only a few lines below the place (
https://www.w3.org/TR/xml-names/#NT-QName) cited by hartwork are cristal
clear in this regard (if one need a confirming look into the spec for such
a basic fact about XML). More than colon is a well-formedness error in this
case.
So, either the parser is a "bare" XML rec parser then any number of colons
(not as start char, but otherwise) in an element or attribute name is OK
and there is no special meaning to them. So there is no point in looking
them up with SetElementTypePrefix().
Or the parser is an XML and XML namespaces aware parser. Then
SetElementTypePrefix() should simply return parsing error at the second
colon in a name.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#262?email_source=notifications&email_token=AAFABKUYA3J7PD4C7RTPPETP6JY65A5CNFSM4HX5H6QKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZLUYSI#issuecomment-509037641>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAFABKX6EK2KSEWJMXNF6ALP6JY65ANCNFSM4HX5H6QA>
.
|
Hi @pointsman and @kwaclaw , I appreciate the review and feedback. I would like to clearly separate the two topics that we might be about to mix here: (1) Whether the fix in this pull request made things better and/or worse and (2) if we can and should reject multiple colons and under what circumstances. For (1) the security issue is fixed — that's better — and the name prefix extracted eventually now contains the the text up to but excluding the first colon rather than the last one. Given that a name prefix cannot contain a colon by the spec, that seems also better than before to me. Is if there is anything off in that evaluation for (1) so far? For (2) I am happy to team up on improving the situation further. The way I read …
… at 7 Conformance of Documents supports @pointsman's argument. If enabling namespace support flips Expat from reporting about XML well-formedness to reporting about namespace-well-formed I guess we need to error out with a parse error in future releases indeed. Let me know what you think! |
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [#186](libexpat/libexpat#186) [#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
- exp-run by antoine PR: 238864 Submitted by: Sergei Vyshenski <svysh.fbsd@gmail.com> (maintainer) Reviewed by: koobs Relnotes: https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes Security: libexpat/libexpat#186 libexpat/libexpat#262 git-svn-id: svn+ssh://svn.freebsd.org/ports/head@512162 35697150-7ecd-e111-bb59-0022644237b5
- exp-run by antoine PR: 238864 Submitted by: Sergei Vyshenski <svysh.fbsd@gmail.com> (maintainer) Reviewed by: koobs Relnotes: https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes Security: libexpat/libexpat#186 libexpat/libexpat#262 git-svn-id: svn+ssh://svn.freebsd.org/ports/head@512162 35697150-7ecd-e111-bb59-0022644237b5
- exp-run by antoine PR: 238864 Submitted by: Sergei Vyshenski <svysh.fbsd@gmail.com> (maintainer) Reviewed by: koobs Relnotes: https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes Security: libexpat/libexpat#186 libexpat/libexpat#262
textproc/expat2: upgrade 2.2.6 -> 2.2.7 - exp-run by antoine PR: 238864 Submitted by: Sergei Vyshenski <svysh.fbsd@gmail.com> (maintainer) Reviewed by: koobs Relnotes: https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes Security: libexpat/libexpat#186 libexpat/libexpat#262 textproc/expat2: upgrade 2.2.7 -> 2.2.8 PR: 240613 Submitted by: Sergei Vyshenski <svysh.fbsd@gmail.com> (maintainer) Exp-Run by: antoine Relnotes: https://github.com/libexpat/libexpat/blob/R_2_2_8/expat/Changes Security: CVE-2019-15903 Approved by: ports-secteam
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [#186](libexpat/libexpat#186) [#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
textproc/expat2: upgrade 2.2.6 -> 2.2.7 - exp-run by antoine PR: 238864 Submitted by: Sergei Vyshenski <svysh.fbsd@gmail.com> (maintainer) Reviewed by: koobs Relnotes: https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes Security: libexpat/libexpat#186 libexpat/libexpat#262 textproc/expat2: upgrade 2.2.7 -> 2.2.8 PR: 240613 Submitted by: Sergei Vyshenski <svysh.fbsd@gmail.com> (maintainer) Exp-Run by: antoine Relnotes: https://github.com/libexpat/libexpat/blob/R_2_2_8/expat/Changes Security: CVE-2019-15903 Approved by: ports-secteam
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
0512f05 depends: expat 2.2.7 (fanquake) Pull request description: Major changes in expat 2.2.7: * [dashpay#186](libexpat/libexpat#186) [dashpay#262](libexpat/libexpat#262) Fix extraction of namespace prefixes from XML names; XML names with multiple colons could end up in the wrong namespace, and take a high amount of RAM and CPU resources while processing, opening the door to use for denial-of-service attacks * [dashpay#227](libexpat/libexpat#227) Autotools: Add --without-examples and --without-tests Full changelog is available [here](https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes#L5). ACKs for top commit: laanwj: ACK 0512f05 Tree-SHA512: 45162a9b0011107fd59a97dae7b5eb61989dafbec26b1ee497d1b11bf5c6a119971096899caa2998648b82a62db57c629a1560453557146c2496b39a7f3f8de9
- exp-run by antoine PR: 238864 Submitted by: Sergei Vyshenski <svysh.fbsd@gmail.com> (maintainer) Reviewed by: koobs Relnotes: https://github.com/libexpat/libexpat/blob/R_2_2_7/expat/Changes Security: libexpat/libexpat#186 libexpat/libexpat#262
My impression is that function
setElementTypePrefix
extracts the XML namespace prefix from an XML name and puts it intoelementType->prefix
if it was able find a registered prefix for that namespace:libexpat/expat/lib/xmlparse.c
Lines 6050 to 6078 in 7f32910
Now according to this quote …
… from Namespaces in XML 1.0 (Third Edition) namespace prefixes cannot contain a colon.
It also seems that colons should be used in XML names for namespaces only, but do not render the name malformed if used otherwise they way I read this:
To conclude from those two, my understanding is that we should treat all text before the first colon (if any) as a namespace prefix and treat the remainder as a name from (or relative to) that namespace, containing further colons or not.
To me, the "right" fix seems to be be to just add a
break
statement to the end if the innerif
-block to stop after the first time we handled a colon. In my test, this patch does defuse the case of the file attached to issue #186 while keeping the test suite happy.Review welcome and appreciated!
Related: