-
Notifications
You must be signed in to change notification settings - Fork 434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[W.I.P.] Limit entity expansion to prevent DoS attacks #220
Conversation
e5a40e9
to
f6c14dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Christian,
I have added some implementation related comment below. On bird view level, one thing I wonder about is how important you consider 100% future API compatibility to defusedexpat. What do you think?
Best, Sebastian
expat/lib/expat.h
Outdated
@@ -124,9 +124,32 @@ enum XML_Error { | |||
XML_ERROR_RESERVED_PREFIX_XMLNS, | |||
XML_ERROR_RESERVED_NAMESPACE_URI, | |||
/* Added in 2.2.1. */ | |||
XML_ERROR_INVALID_ARGUMENT | |||
XML_ERROR_INVALID_ARGUMENT, | |||
/* Added in XXX */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the feature merged (and then only) I'd want to go for version 2.3.0.
expat/lib/xmlparse.c
Outdated
@@ -2418,6 +2455,10 @@ XML_ErrorString(enum XML_Error code) | |||
/* Added in 2.2.5. */ | |||
case XML_ERROR_INVALID_ARGUMENT: /* Constant added in 2.2.1, already */ | |||
return XML_L("invalid argument"); | |||
case XML_ERROR_ENTITY_EXPANSION_LIMIT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add marker /* Added in 2.3.0 */
here as well
expat/lib/xmlparse.c
Outdated
@@ -5391,6 +5432,13 @@ processInternalEntity(XML_Parser parser, ENTITY *entity, | |||
enum XML_Error result; | |||
OPEN_INTERNAL_ENTITY *openEntity; | |||
|
|||
if (!(parser->m_options & XML_OPTION_HUGE_ENTITES)) { | |||
parser->m_entityRecursionLevel++; | |||
if (parser->m_entityRecursionLevel >= XML_ENTITY_RECURSION_LIMIT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that XML_ENTITY_RECURSION_LIMIT
is compared with "greater than or equal to" while XML_ENTITY_EXPANSION_SIZE
is compared with "greater than". Is that difference intentional?
expat/lib/xmlparse.c
Outdated
(parser->m_entityExpansionLen > XML_ENTITY_EXPANSION_SIZE) && | ||
(parser->m_entityExpansionLen > index * XML_ENTITY_EXPANSION_RATIO)) { | ||
/* Ratio between processed XML bytes and all expanded entities is off */ | ||
return XML_ERROR_ENTITY_EXPANSION_LIMIT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code XML_ERROR_ENTITY_EXPANSION_LIMIT
seems to be used for two (slightly) different error scenarios now. Is there some security concern why we need to obscure those two cases into one? Else I imagine debugging is going to be more fun if those two cases can be told apart. What do you think?
expat/tests/xmltest.log.expected
Outdated
Passed: 1801 | ||
Failed: 8 | ||
Passed: 1800 | ||
Failed: 9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like we have a single failed test more now? Please share the story with this.
expat/lib/xmlparse.c
Outdated
parser->m_entityExpansionLen += entity->textLen; | ||
if ((index > 0) && | ||
(parser->m_entityExpansionLen > XML_ENTITY_EXPANSION_SIZE) && | ||
(parser->m_entityExpansionLen > index * XML_ENTITY_EXPANSION_RATIO)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That multiplication may need integer overflow detection/protection. What do you think?
expat/lib/xmlparse.c
Outdated
@@ -603,6 +603,9 @@ struct XML_ParserStruct { | |||
OPEN_INTERNAL_ENTITY *m_openInternalEntities; | |||
OPEN_INTERNAL_ENTITY *m_freeInternalEntities; | |||
XML_Bool m_defaultExpandInternalEntities; | |||
int m_entityRecursionLevel; | |||
long m_entityExpansionLen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That long
should probably be XML_Size
instead, especially for people with XML_LARGE_SIZE
defined.
6f2d630
to
afde728
Compare
I have updated the PR, added tests and addressed your comments. I wrote |
e061ac7
to
ef592df
Compare
expat/lib/xmlparse.c
Outdated
parser->m_entityExpansionLen += entity->textLen; | ||
index = XML_GetCurrentByteIndex(parser); | ||
limit = index * XML_ENTITY_EXPANSION_RATIO; | ||
if (limit < index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my best knowlegde signed integer overflow is undefined behavior, hence we'd need to prevent overflow before it even happened. Both index
and XML_ENTITY_EXPANSION_RATIO
are signed as far as I can see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expat.h
doesn't define MAX values for types. That makes it kinda complicated to check for overflows. What's your proposal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could try division instead to bypass that issue. On a chalkboard I'd write in order:
- Right now we have:
parser->m_entityExpansionLen > limit
. - Inlined, that's:
parser->m_entityExpansionLen > index * XML_ENTITY_EXPANSION_RATIO
. - If we devide both sides by
XML_ENTITY_EXPANSION_RATIO
we get:
parser->m_entityExpansionLen / XML_ENTITY_EXPANSION_RATIO > index
. - If we take the floor effect of integer division into account and correct for it to accept the same range we get:
(parser->m_entityExpansionLen + (XML_ENTITY_EXPANSION_RATIO - 1)) / XML_ENTITY_EXPANSION_RATIO > index
.
Does that look about right?
That could be one way. Another way could be leveraging (XML_Size)(-1)
since that type is unsigned. I'd need to think about that more.
Thanks! I haven't mentioned yet how important this feature will be for the whole community and how happy I am that you're working on this. Very nice! I'll comment on the new changes in detail shortly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests look pretty good overall, please find detailed comments below
expat/README.md
Outdated
@@ -2,7 +2,7 @@ | |||
[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/libexpat/libexpat?svg=true)](https://ci.appveyor.com/project/libexpat/libexpat) | |||
|
|||
|
|||
# Expat, Release 2.2.6 | |||
# Expat, Release 2.3.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be awesome to have all the version bump changes split off to a single dedicated commit following the others. Thanks!
expat/lib/xmlparse.c
Outdated
@@ -603,6 +603,9 @@ struct XML_ParserStruct { | |||
OPEN_INTERNAL_ENTITY *m_openInternalEntities; | |||
OPEN_INTERNAL_ENTITY *m_freeInternalEntities; | |||
XML_Bool m_defaultExpandInternalEntities; | |||
int m_entityRecursionLevel; | |||
int m_options; | |||
XML_Index m_entityExpansionLen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why prefer XML_Index
over XML_Size
here? Right now we have a length-index mix here. If the idea is to save some casts later, I'd want to be sure that the loss in clean semantics is worth the gain.
expat/tests/runtests.c
Outdated
@@ -10132,7 +10135,7 @@ START_TEST(test_alloc_realloc_param_entity_newline) | |||
if (i == 0) | |||
fail("Parse succeeded despite failing reallocator"); | |||
if (i == max_realloc_count) | |||
fail("Parse failed at maximum reallocation count"); | |||
fail("Parse failed at maximum reallocation count 8"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cool to avoid that hardcoding of 8
(and 9
a few lines down) one way or another (e.g. format string or doing things on preprocessor level).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 8
and 9
shouldn't have landed in the PR. They are remnants of a debugging session.
expat/tests/runtests.c
Outdated
@@ -11984,6 +12129,7 @@ make_suite(void) | |||
TCase *tc_misc = tcase_create("miscellaneous tests"); | |||
TCase *tc_alloc = tcase_create("allocation tests"); | |||
TCase *tc_nsalloc = tcase_create("namespace allocation tests"); | |||
TCase *tc_he = tcase_create("huge entities"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big problem, but huge_entities
would read a lot better than he
in my eyes (and it's less guessable that nsalloc
, for instance). Just my two cents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed he
to huge_entities
everywhere.
expat/tests/runtests.c
Outdated
" <!ENTITY e4 'entity'>\n" | ||
"]>\n" | ||
"<he>&e1;</he>\n"; | ||
expect_failure(text, XML_ERROR_ENTITY_RECURSION_LIMIT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed to lookup if expect_failure
does parsing inside or not because the name does not say.
If you feel like adding a dedicated commit upfront that renames it to something like parse_and_expect_failure
, my vote for it. Just an idea.
expat/tests/runtests.c
Outdated
"ABCDEFGHIJKLMNOPABCDEFGHIJKLMNOPABCDEFGHIJKLMNOPABCDEFGHIJKLMNOP" | ||
"'>]><he>" | ||
/* 50 + 64 + 8 + (36 * 3) == 230 | ||
* 33 expansions of 64 bytes == 2304 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to have that math, I tried to verify its details.
I get that 2304 is slightly bigger than 230 x 10 (=default ratio limit) so that extra 4 characters triggers the limit.
The first line is clear:
- 50 seems to be the length of
<!DOCTYPE he [<!ELEMENT he (#PCDATA)*><!ENTITY e '
. - Followed by 64 for entity value
A...P
. - 8 is length of
'>]><he>
- 36 x 3 is all those
&e;
.
For the second I now notice that 33 expansions of 64 bytes
should probably read 36
rather than 33
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I forgot to update that line when I added the <!ELEMENT>
line.
expat/lib/xmlparse.c
Outdated
return XML_L("entity is too large"); | ||
case XML_ERROR_ENTITY_EXPANSION_LIMIT: | ||
return XML_L("entity expansion limit reached"); | ||
case XML_ERROR_ENTITY_RECURSION_LIMIT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still thinking: If we don't need to be compatible with defusedexpat names for these constants, it would be nice to get them in good consistent shape. Just to give an example, I'd find something like
XML_ERROR_ENTITY_VIOLATION_SIZE
XML_ERROR_ENTITY_VIOLATION_RATIO
XML_ERROR_ENTITY_VIOLATION_DEPTH
more consistant. Maybe we can find some names nice that have that property. (The current _LIMIT
part is also not ideal in my eyes.) What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your names sound better than mine.
expat/lib/expat.h
Outdated
* | ||
* Mitigation against billion laugh and quadratic blowup attacks. | ||
* | ||
* XML_ENTITY_RECURSION_LIMIT confines nesting of entities within entities. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm starting to think if "nesting" would be a better term since "recursion" seems to imply that something can reference itself (which XML entities cannot).
*/ | ||
#define XML_ENTITY_RECURSION_LIMIT 3 | ||
#define XML_ENTITY_EXPANSION_SIZE 1023 | ||
#define XML_ENTITY_EXPANSION_RATIO 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Just to have said it somewhere: We will need to support adjustment of those defaults at runtime at some point, maybe even before merging to master. I'm all in with getting other things polished first, though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I wouldn't go there. I made the settings flexible in defusedexpat
. In retrospective it was unnecessary API bloat. Even libxml2
doens't allow to tune the limits. It has only an off-switch for limitation. If libxml2
doesn't need extra options, then expat
surely doesn't need them at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting thought. Let me think about it more.
expat/lib/expat.h
Outdated
XML_SetOptions(XML_Parser parser, int options); | ||
|
||
XMLPARSEAPI(int) | ||
XML_GetOptions(XML_Parser parser); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we could cover those two functions by some simple test(s) as well. Some idea for a test:
- Call to
XML_GetOptions
returns default options - Call to
XML_SetOptions
succeeds - Call to
XML_GetOptions
returns what was set byXML_SetOptions
just before - Parsing is started
- Call to
XML_SetOptions
fails (due to parser started) - Call to
XML_GetOptions
still returns unmodified previous options
Could also be split in half to make two tests where the second is dedicated to changes at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I added a new test scenario as you proposed.
ef592df
to
e05951d
Compare
expat/tests/runtests.c
Outdated
XML_FALSE) == XML_STATUS_ERROR) | ||
xml_failure(parser); | ||
if (XML_SetOptions(parser, XML_OPTION_NONE)) { | ||
fail("XML_SetOptions() should have failed."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe append " during/while parsing"
expat/tests/runtests.c
Outdated
fail("XML_SetOptions() should have failed."); | ||
} | ||
if (XML_GetOptions(parser) != XML_OPTION_HUGE_ENTITES) | ||
fail("Expected XML_OPTION_HUGE_ENTITES"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe append " (i.e. previously set options)"
dea41f1
to
9455160
Compare
expat/lib/xmlparse.c
Outdated
return XML_STATUS_ERROR; | ||
switch(option) { | ||
case XML_OPTION_HUGE_ENTITIES: | ||
parser->m_hugeEntities = *(XML_Bool*)(value) ? XML_TRUE : XML_FALSE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That dereference makes the usage of XML_SetOption
overly complicated. Compare current:
XML_Bool ENABLED = XML_TRUE;
XML_SetOption(parser, XML_OPTION_HUGE_ENTITIES, &ENABLED);
with possible
XML_SetOption(parser, XML_OPTION_HUGE_ENTITIES, (void *)XML_TRUE);
I believe each option should have a say if it needs a dereference or not and save that extra indirection for plain integer ones. What do you think?
I'm also wondering about use of uintptr_t
/intptr_t
rather than void *
, just an idea.
expat/doc/xmlwf.xml
Outdated
<listitem> | ||
<para> | ||
Enables huge entities option, disabling all mitigations against entity | ||
expansion attacks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote for mentioning "billion laughs" for an example and "DoS" to make things less abstract to people new to these topics.
expat/lib/xmlparse.c
Outdated
/* overflow safe comparison */ | ||
XML_Size limit = (parser->m_entityExpansionLen + | ||
(XML_ENTITY_EXPANSION_RATIO - 1)) / XML_ENTITY_EXPANSION_RATIO; | ||
if (limit > (XML_Size)index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clearer semantics, I'd consider turning this into:
const XML_Index minimumRequiredIndex = (XML_Index)
((parser->m_entityExpansionLen + (XML_ENTITY_EXPANSION_RATIO - 1))
/ XML_ENTITY_EXPANSION_RATIO);
if (index < minimumRequiredIndex) {
Please note the flip of comparision order and operator.
Regarding the cast: The largest positive value for XML_Index
and XML_Size
are factor 2 apart, so if we divide by XML_ENTITY_EXPANSION_RATIO
and that number is greater or equal 2, even m_entityExpansionLen
of (XML_Size)(-1)
would fit in XML_Index minimumRequiredIndex
without truncation.
expat/Changes
Outdated
relation to processed bytes as mitigation against billion | ||
laughs and quadratic blowup attacks. Countermeasure is | ||
enabled by default and can be disabled with | ||
XML_SetOption(parser, XML_OPTION_HUGE_ENTITIES, &bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is CVE-2013-0340. If so, let's mention it.
expat/lib/expat.h
Outdated
}; | ||
|
||
enum XML_Option { | ||
/* Added in 2.3.0 */ | ||
XML_OPTION_HUGE_ENTITIES = 1 /* XML_Bool, no limits for entity expansion */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also mention DoS and security here in case this ends up being the place where peokle look at the most.
expat/tests/xmltest.sh
Outdated
# reldir includes trailing slash | ||
RunXmlwfWF() { | ||
file="$1" | ||
reldir="$2" | ||
$XMLWF -p -N -d "$OUTPUT$reldir" "$file" > outfile || return $? | ||
options="$3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since options is plural, let's make it support multiple options. How about:
file="$1"
shift
reldir="$1"
shift
options=( "$@" )
$XMLWF -p -N -d "$OUTPUT$reldir" "${options[@]}" "$file" > outfile || return $?
Would that work?
expat/tests/xmltest.sh
Outdated
xmlfile=012.xml | ||
cd "$TS/xmlconf/$xmldir" | ||
mkdir -p "$OUTPUT$xmldir" | ||
if [[ -f "$xmlfile" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why that guard? Can that file ever be missing?
expat/tests/xmltest.log.expected
Outdated
@@ -7,4 +7,4 @@ Expected not well-formed: oasis/p06fail1.xml | |||
Expected not well-formed: oasis/p08fail1.xml | |||
Expected not well-formed: oasis/p08fail2.xml | |||
Passed: 1801 | |||
Failed: 8 | |||
Failed: 9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Failed
count goes up by one for tests/xmlconf/xmltest/valid/ext-sa/012.xml
without -H
why does Passed
count not go up by one for the new additional test with -H
?
e7a3551
to
2b4038a
Compare
libexpat is vulnerable to billion laughs and quadratic blowup DoS attacks. processInternalEntity() function now limits entity expansion and nesting in three ways: * Entity nesting is limited to three levels of recursion. * Total length of an entity is limited to 1023 bytes. * The ratio of entity expansion to processed bytes cannot exceed 1:10. The mitigation is enabled by default and can be disabled with XML_SetOptions(parser, XML_OPTION_HUGE_ENTITES). The xmlwf command has a new option -H to enable huge entities. The new entity expansion limits cause one test to fail. The XML test file tests/xmlconf/xmltest/valid/ext-sa/012.xml has five levels of entity expansion. Fixes: libexpat#34 Fixes: libexpat#46 Signed-off-by: Christian Heimes <christian@python.org>
Signed-off-by: Christian Heimes <christian@python.org>
2b4038a
to
f70e6a2
Compare
expat/lib/expat.h
Outdated
XML_OPTION_NESTING_LIMIT, /* int, limit entity nesting depth */ | ||
XML_OPTION_EXPANSION_RATIO, /* int, ratio between expansions and processed text */ | ||
XML_OPTION_MAX_EXPANSION_SIZE, /* XML_Size, limit max entity size for single and nested entities */ | ||
XML_OPTION_MAX_HASH_TABLE_ENTRIES /* XML_Size, max entries in DTD hash tables */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Now that we also have a hash table option in here, maybe the entity ones should include the word "entity". On a blank slate, I'd maybe write:
XML_OPTION_ENTITIES_HUGE
XML_OPTION_ENTITIES_MAX_DEPTH
XML_OPTION_ENTITIES_MAX_RATIO
XML_OPTION_ENTITIES_MAX_SIZE
XML_OPTION_HASH_TABLE_DTD_MAX_ENTRY_COUNT
- For size, we should mention
XML_Char
s to make clear it's not bytes. Maybe even append_CHARS
to that name? - After
= 1
I see no= 2
,= 4
and so on. That looks unintended - Why go for
int
rather thanunsigned int
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
XML_OPTION_ENTITIES_MAX_DEPTH
is misleading. It's not a limitation of the depths but how many expansions are nested. For example
<!DOCTYPE xmlbomb [
<!ENTITY a "1234567890" >
<!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;">
<!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;&b;&b;">
]>
<bomb>&c;</bomb>
has depths of 3 but nesting of 111 (1c + 10b + 1010c).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe depth and nesting are both mis-leading. Maybe path, steps, calls, ..? In a way it's an expansion tree and we limit regarding the number of total nodes of that tree then.
#endif | ||
|
||
#ifndef XML_ENTITY_EXPANSION_SIZE | ||
/* 1MB text */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be 2MB with -DXML_UNICODE
(or even 4MB if we get UTF32 support on day) since we check for number of XML_Chars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I'll update the comment.
expat/tests/runtests.c
Outdated
/* set and get */ | ||
he = XML_TRUE; | ||
if (XML_SetOption(parser, XML_OPTION_HUGE_XML, &he) != XML_STATUS_OK) { | ||
fail("XML_SetOptions() failed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
XML_SetOptions()
should probably be XML_SetOption
without s
and without braces for consistency. There are more (leftover) occurrences like that below.
expat/lib/expat.h
Outdated
XML_ERROR_ENTITY_VIOLATION_NESTED_SIZE, | ||
XML_ERROR_ENTITY_VIOLATION_RATIO, | ||
XML_ERROR_ENTITY_VIOLATION_DEPTH, | ||
XML_ERROR_HASH_TABLE_VIOLATION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe XML_ERROR_HASH_TABLE_VIOLATION_SIZE
with _SIZE
appended so that it's clear what the violation is about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
XML_ERROR_HASH_TABLE_SIZE_VIOLATION
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would sound good on its own but not match the *_VIOLATION_*
schema above. As you prefer.
expat/lib/expat.h
Outdated
#endif | ||
|
||
#ifndef XML_ENTITY_NESTING_LIMIT | ||
#define XML_ENTITY_NESTING_LIMIT 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Billion laughs demos use a nesting level of 10. Our default needs to be below 10 then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The settings limits the amount of nested expansions, not the maximum depths of an expansion stack. Perhaps we need a better name or explanation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need a better name then. Some ideas mentioned further up.
return XML_STATUS_OK; | ||
case XML_OPTION_MAX_HASH_TABLE_ENTRIES: | ||
*(XML_Size *)(value) = parser->m_limit->maxHashTableEntries; | ||
return XML_STATUS_OK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these cases are not covered by test yet, it seems.
expat/lib/xmlparse.c
Outdated
@@ -2478,6 +2594,8 @@ XML_GetFeatureList(void) | |||
#ifdef XML_ATTR_INFO | |||
{XML_FEATURE_ATTR_INFO, XML_L("XML_ATTR_INFO"), 0}, | |||
#endif | |||
{XML_FEATURE_HUGE_XML, XML_L("XML_OPTION_HUGE_XML"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That XML_L
seems to be off the indentation grid.
expat/lib/xmlparse.c
Outdated
#endif /* XML_DTD */ | ||
dtdDestroy(parser->m_dtd, (XML_Bool)!parser->m_parentParser, &parser->m_mem); | ||
} | ||
/* LIMIT structure parser-m_limit is shared with all external parsers */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a >
in parser-m_limit
missing here.
@@ -6322,6 +6480,117 @@ normalizePublicId(XML_Char *publicId) | |||
*p = XML_T('\0'); | |||
} | |||
|
|||
static LIMIT * | |||
limitCreate(const XML_Memory_Handling_Suite *ms) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may also need something like limitReset
to be used in XML_ParserReset
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I'll add it to XML_ParserReset
.
expat/lib/xmlparse.c
Outdated
} | ||
|
||
static void | ||
limitDestroy(LIMIT *limit,const XML_Memory_Handling_Suite *ms) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is limitDestroy
called at all needed places, yet? I expected a call from XML_ParserFree
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used in two places. I also need to free the structure in case another allocation in parserCreate
fails. But it's just a simple FREE
. I'll get rid of the function.
613dfb4
to
2841299
Compare
* rename various constants and enum members * change types to size_t and unsigned int * add tests for getters and setters * add limitReset() to XML_ParserReset() * limit size in bytes, not sizeof(XML_Char) * add ratio threshold * more comments Signed-off-by: Christian Heimes <christian@python.org>
2841299
to
24fc761
Compare
typedef struct { | ||
ENTITY *first_entity; | ||
unsigned int entitiesNestingLevel; | ||
size_t nestedEntitiesExpansionSize; /* in bytes, not XML_Cha) */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo XML_Cha)
XML_Bool hugeXML; | ||
unsigned int entitiesMaxNesting; | ||
unsigned int entitiesMaxRatio; | ||
unsigned int entitiesMaxRatioThreshold; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use XML_Size
or XML_Index
for entitiesMaxRatioThreshold
.
Is it possible to bring this up? |
I'm not sure I get the question. Please elaborate. Thank you! |
I was thinking, if it's possible to make these changes ready to be merged. |
At some point, but it will take more time I'm afraid. If you have someone at hand with some spare time to collaborate with me getting this "project" closer to the finishing line, I might find spare time a bit sooner. |
libexpat is vulnerable to billion laughs and quadratic blowup DoS attacks.
processInternalEntity() function now limits entity expansion and
nesting in three ways:
The mitigation is enabled by default and can be disabled with
XML_SetOptions(parser, XML_OPTION_HUGE_ENTITES).
The xmlwf command has a new option -H to enable huge entities.
The new entity expansion limits cause one test to fail. The XML test
file tests/xmlconf/xmltest/valid/ext-sa/012.xml has five levels of
entity expansion.
Fixes: #34
Fixes: #46
Signed-off-by: Christian Heimes christian@python.org