Enforce entity loading policy in raptor_libxml_resolveEntity
and raptor_libxml_getEntity by checking for file URIs and network URIs.

Add RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES / loadExternalEntities for
turning on loading of XML external entities, disabled by default.

This affects all the parsers that use SAX2: rdfxml, rss-tag-soup (and
aliases) and rdfa.
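
The policy this commit describes can be sketched in isolation as follows. This is a minimal model, not raptor's code: the `entity_policy` struct and `entity_policy_allows()` are hypothetical names, and treating every URI that lacks a `file:` prefix as a network URI is a simplifying assumption. The URI filter step mentioned in the release notes is left out.

```c
#include <string.h>

/* Hypothetical stand-in for the three options named in this commit;
 * these are not raptor's real structures. */
typedef struct {
  int load_external_entities; /* models RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES (0 = default) */
  int no_file;                /* models RAPTOR_OPTION_NO_FILE */
  int no_net;                 /* models RAPTOR_OPTION_NO_NET */
} entity_policy;

/* Returns 1 if an external entity at uri may be loaded, 0 if denied.
 * Assumption: any URI without a "file:" prefix counts as a network URI. */
static int
entity_policy_allows(const entity_policy *policy, const char *uri)
{
  if(!policy->load_external_entities)
    return 0;                    /* master switch is off: deny everything */

  if(!uri)
    return 0;                    /* nothing to check: deny */

  if(!strncmp(uri, "file:", 5))
    return !policy->no_file;     /* file URI: honour the no-file option */

  return !policy->no_net;        /* otherwise treat it as a network URI */
}
```

With all three fields at zero, the sketch denies every URI, which mirrors the "disabled by default" behaviour stated above. The file and network checks only come into play once entity loading has been switched on.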
dajobe committed Mar 16, 2012
1 parent ce893f4 commit a676f23
Showing 11 changed files with 169 additions and 16 deletions.
17 changes: 17 additions & 0 deletions ChangeLog
@@ -1,3 +1,20 @@
2012-01-29 Dave Beckett <dave@dajobe.org>

* librdfa/rdfa.c, src/raptor2.h.in, src/raptor_libxml.c,
src/raptor_option.c, src/raptor_rdfxml.c, src/raptor_rss.c,
src/raptor_turtle_writer.c: CVE-2012-0037

Enforce entity loading policy in raptor_libxml_resolveEntity and
raptor_libxml_getEntity by checking for file URIs and network
URIs.

Add RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES / loadExternalEntities
for turning on loading of XML external entities, disabled by
default.

This affects all the parsers that use SAX2: rdfxml, rdfa,
rss-tag-soup (and aliases).

2012-03-15 Dave Beckett <dave@dajobe.org>

* librdfa/rdfa.c: Pass on options NO_NET and NO_FILE to RDFA SAX2
13 changes: 7 additions & 6 deletions NEWS.html
@@ -8,20 +8,21 @@

<h1 style="text-align:center">Raptor RDF Syntax Library - News</h1>

<h2 id="D2012-XX-XX-V2.0.7">2012-XX-XX Raptor2 Version 2.0.7 Released</h2>
<h2 id="D2012-03-22-V2.0.7">2012-03-22 Raptor2 Version 2.0.7 Released</h2>

<p>Not yet released.
</p>

<p>
<p>CVE-2012-0037 fixed<br />
Removed Expat support<br />
Removed internal Unicode NFC code for better and optional <a href="http://www.icu-project.org/">ICU</a><br />
Added new options for denying file requests and SSL certificate verifying<br />
Added options for denying file requests and XML entity loading<br />
Added options for SSL certificate verifying<br />
Fixed reported <a href="http://bugs.librdf.org/">issues</a>:
<a href="http://bugs.librdf.org/mantis/view.php?id=448">0000448</a> and
<a href="http://bugs.librdf.org/mantis/view.php?id=469">0000469</a>
</p>

<p>See the <a href="RELEASE.html#rel2_0_7">Raptor2 2.0.7 Release Notes</a>
for the full details of the changes.</p>


<h2 id="D2011-11-27-V2.0.6">2011-11-27 Raptor2 Version 2.0.6 Released</h2>

24 changes: 19 additions & 5 deletions RELEASE.html
@@ -11,8 +11,7 @@ <h1 style="text-align:center">Raptor RDF Syntax Library - Release Notes</h1>

<h2 id="rel2_0_7"><a name="rel2_0_7">Raptor2 2.0.7 changes</a></h2>

<p>Not yet released.
</p>
<p>CVE-2012-0037 fixed</p>

<p>Issues Fixed:</p>
<ul>
@@ -39,6 +38,10 @@ <h3>Options changes</h3>
<dt><code>RAPTOR_OPTION_NO_FILE</code><br/></dt>
<dd>Deny file requests during parsing.</dd>

<dt><code>RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES</code><br/></dt>
<dd>Load XML external entities. Disabled by
default.</dd>

<dt><code>RAPTOR_OPTION_WWW_SSL_VERIFY_PEER</code><br/></dt>
<dd>Controls verifying an SSL peer during parsing / WWW. Takes an
integer value: non-0 to verify peer SSL certificate (default
@@ -53,6 +56,10 @@ <h3>Options changes</h3>

<h3>Parser class changes</h3>

<p>The RDF/XML, RSS Tag Soup and RDFa parsers now pass on network,
file and entity loading parser options to the internal SAX2 to enable
enforcing of network, file and entity loading policy.</p>

<p>RDF/JSON parser handles an API change between YAJL V1 and V2.
</p>

@@ -64,11 +71,18 @@ <h3>Parser class changes</h3>

<h3>SAX2 class changes</h3>

<p>
Added <code>raptor_sax2_set_uri_filter()</code> to set a URI filter
for any SAX2 calls that do internal lookups of URIs.
<p>Added <code>raptor_sax2_set_uri_filter()</code> to set a URI
filter for any SAX2 calls that do internal lookups of URIs.
</p>

<p>Control file and network loading inside SAX2. Option
<code>RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES</code> now controls
loading of external XML entities and is disabled by default. If enabled,
<code>RAPTOR_OPTION_NO_FILE</code> and
<code>RAPTOR_OPTION_NO_NET</code> are also checked. All URIs loaded
are also passed through any URI filter, if set by
<code>raptor_sax2_set_uri_filter()</code>.
</p>

<h3>URI class changes</h3>

1 change: 1 addition & 0 deletions docs/raptor-1-to-2-map.tsv
@@ -560,3 +560,4 @@
2.0.6 enum - - 2.0.7 enum RAPTOR_OPTION_NO_FILE - -
2.0.6 enum - - 2.0.7 enum RAPTOR_OPTION_WWW_SSL_VERIFY_PEER - -
2.0.6 enum - - 2.0.7 enum RAPTOR_OPTION_WWW_SSL_VERIFY_HOST - -
2.0.6 enum - - 2.0.7 enum RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES - -
3 changes: 3 additions & 0 deletions librdfa/rdfa.c
@@ -1230,6 +1230,9 @@ int rdfa_parse_start(rdfacontext* context)
raptor_sax2_set_option(context->sax2,
RAPTOR_OPTION_NO_FILE, NULL,
RAPTOR_OPTIONS_GET_NUMERIC(rdf_parser, RAPTOR_OPTION_NO_FILE));
raptor_sax2_set_option(context->sax2,
RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES, NULL,
RAPTOR_OPTIONS_GET_NUMERIC(rdf_parser, RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES));
if(rdf_parser->uri_filter)
raptor_sax2_set_uri_filter(context->sax2, rdf_parser->uri_filter,
rdf_parser->uri_filter_user_data);
4 changes: 3 additions & 1 deletion src/raptor2.h.in
@@ -528,6 +528,7 @@ typedef struct {
* @RAPTOR_OPTION_WWW_SSL_VERIFY_PEER: Integer. SSL verify peer - non-0 to verify peer SSL certificate (default)
* @RAPTOR_OPTION_WWW_SSL_VERIFY_HOST: Integer. SSL verify host - 0 none, 1 CN match, 2 host match (default). Other values are ignored.
* @RAPTOR_OPTION_NO_FILE: Deny file reading requests inside other requests.
* @RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES: When reading XML, load external entities.
* @RAPTOR_OPTION_LAST: Internal
*
* Raptor parser, serializer or XML writer options.
@@ -574,7 +575,8 @@ typedef enum {
RAPTOR_OPTION_NO_FILE,
RAPTOR_OPTION_WWW_SSL_VERIFY_PEER,
RAPTOR_OPTION_WWW_SSL_VERIFY_HOST,
RAPTOR_OPTION_LAST = RAPTOR_OPTION_WWW_SSL_VERIFY_HOST
RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES,
RAPTOR_OPTION_LAST = RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES
} raptor_option;


109 changes: 105 additions & 4 deletions src/raptor_libxml.c
@@ -145,16 +145,117 @@ raptor_libxml_hasExternalSubset (void* user_data)

static xmlParserInputPtr
raptor_libxml_resolveEntity(void* user_data,
const xmlChar *publicId, const xmlChar *systemId) {
const xmlChar *publicId, const xmlChar *systemId)
{
raptor_sax2* sax2 = (raptor_sax2*)user_data;
return libxml2_resolveEntity(sax2->xc, publicId, systemId);
xmlParserCtxtPtr ctxt = sax2->xc;
const unsigned char *uri_string = NULL;
xmlParserInputPtr entity_input = NULL; /* returned as-is when policy denies loading */
int load_entity = 0;

if(ctxt->input)
uri_string = RAPTOR_GOOD_CAST(const unsigned char *, ctxt->input->filename);

if(!uri_string)
uri_string = RAPTOR_GOOD_CAST(const unsigned char *, ctxt->directory);

load_entity = RAPTOR_OPTIONS_GET_NUMERIC(sax2, RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES);
if(load_entity)
load_entity = raptor_sax2_check_load_uri_string(sax2, uri_string);

if(load_entity) {
entity_input = xmlLoadExternalEntity(RAPTOR_GOOD_CAST(const char*, uri_string),
RAPTOR_GOOD_CAST(const char*, publicId),
ctxt);
} else {
RAPTOR_DEBUG4("Not loading entity URI %s by policy for publicId '%s' systemId '%s'\n", uri_string, publicId, systemId);
}

return entity_input;
}


static xmlEntityPtr
raptor_libxml_getEntity(void* user_data, const xmlChar *name) {
raptor_libxml_getEntity(void* user_data, const xmlChar *name)
{
raptor_sax2* sax2 = (raptor_sax2*)user_data;
return libxml2_getEntity(sax2->xc, name);
xmlParserCtxtPtr xc = sax2->xc;
xmlEntityPtr ret = NULL;

if(!xc)
return NULL;

if(!xc->inSubset) {
/* looks for hardcoded set of entity names - lt, gt etc. */
ret = xmlGetPredefinedEntity(name);
if(ret) {
RAPTOR_DEBUG2("Entity '%s' found in predefined set\n", name);
return ret;
}
}

/* This section uses xmlGetDocEntity which looks for entities in
* memory only, never from a file or URI
*/
if(xc->myDoc && (xc->myDoc->standalone == 1)) {
RAPTOR_DEBUG2("Entity '%s' document is standalone\n", name);
/* Document is standalone: no entities are required to interpret doc */
if(xc->inSubset == 2) {
xc->myDoc->standalone = 0;
ret = xmlGetDocEntity(xc->myDoc, name);
xc->myDoc->standalone = 1;
} else {
ret = xmlGetDocEntity(xc->myDoc, name);
if(!ret) {
xc->myDoc->standalone = 0;
ret = xmlGetDocEntity(xc->myDoc, name);
xc->myDoc->standalone = 1;
}
}
} else {
ret = xmlGetDocEntity(xc->myDoc, name);
}

if(ret && !ret->children &&
(ret->etype == XML_EXTERNAL_GENERAL_PARSED_ENTITY)) {
/* Entity is an external general parsed entity. It may be in a
* catalog file, user file or user URI
*/
int val = 0;
xmlNodePtr children;
int load_entity = 0;

load_entity = RAPTOR_OPTIONS_GET_NUMERIC(sax2, RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES);
if(load_entity)
load_entity = raptor_sax2_check_load_uri_string(sax2, ret->URI);

if(!load_entity) {
RAPTOR_DEBUG2("Not getting entity URI %s by policy\n", ret->URI);
children = xmlNewText((const xmlChar*)"");
} else {
/* Disable SAX2 handlers so that SAX2 events are not sent to
* callbacks while the entity is being parsed.
*/
sax2->enabled = 0;
val = xmlParseCtxtExternalEntity(xc, ret->URI, ret->ExternalID, &children);
sax2->enabled = 1;
}

if(!val) {
xmlAddChildList((xmlNodePtr)ret, children);
} else {
xc->validate = 0;
return NULL;
}

ret->owner = 1;

/* Mark this entity as having been checked - never do this again */
if(!ret->checked)
ret->checked = 1;
}

return ret;
}


6 changes: 6 additions & 0 deletions src/raptor_option.c
@@ -295,6 +295,12 @@ static const struct
RAPTOR_OPTION_VALUE_TYPE_INT,
"wwwSslVerifyHost",
"SSL verify host matching"
},
{ RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES,
(raptor_option_area)(RAPTOR_OPTION_AREA_PARSER | RAPTOR_OPTION_AREA_SAX2),
RAPTOR_OPTION_VALUE_TYPE_BOOL,
"loadExternalEntities",
"Parsers and SAX2 should load external entities."
}
};

3 changes: 3 additions & 0 deletions src/raptor_rdfxml.c
@@ -1004,6 +1004,9 @@ raptor_rdfxml_parse_start(raptor_parser* rdf_parser)
raptor_sax2_set_option(rdf_xml_parser->sax2,
RAPTOR_OPTION_NO_FILE, NULL,
RAPTOR_OPTIONS_GET_NUMERIC(rdf_parser, RAPTOR_OPTION_NO_FILE));
raptor_sax2_set_option(rdf_xml_parser->sax2,
RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES, NULL,
RAPTOR_OPTIONS_GET_NUMERIC(rdf_parser, RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES));
if(rdf_parser->uri_filter)
raptor_sax2_set_uri_filter(rdf_xml_parser->sax2, rdf_parser->uri_filter,
rdf_parser->uri_filter_user_data);
3 changes: 3 additions & 0 deletions src/raptor_rss.c
@@ -252,6 +252,9 @@ raptor_rss_parse_start(raptor_parser *rdf_parser)
raptor_sax2_set_option(rss_parser->sax2,
RAPTOR_OPTION_NO_FILE, NULL,
RAPTOR_OPTIONS_GET_NUMERIC(rdf_parser, RAPTOR_OPTION_NO_FILE));
raptor_sax2_set_option(rss_parser->sax2,
RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES, NULL,
RAPTOR_OPTIONS_GET_NUMERIC(rdf_parser, RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES));
if(rdf_parser->uri_filter)
raptor_sax2_set_uri_filter(rss_parser->sax2, rdf_parser->uri_filter,
rdf_parser->uri_filter_user_data);
2 changes: 2 additions & 0 deletions src/raptor_turtle_writer.c
@@ -705,6 +705,7 @@ raptor_turtle_writer_set_option(raptor_turtle_writer *turtle_writer,
/* Shared */
case RAPTOR_OPTION_NO_NET:
case RAPTOR_OPTION_NO_FILE:
case RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES:

/* XML writer options */
case RAPTOR_OPTION_RELATIVE_URIS:
@@ -829,6 +830,7 @@ raptor_turtle_writer_get_option(raptor_turtle_writer *turtle_writer,
/* Shared */
case RAPTOR_OPTION_NO_NET:
case RAPTOR_OPTION_NO_FILE:
case RAPTOR_OPTION_LOAD_EXTERNAL_ENTITIES:

/* XML writer options */
case RAPTOR_OPTION_RELATIVE_URIS: