Skip to content

Commit

Permalink
Activated defense against XML eXternal Entity (XXE) attacks
Browse files Browse the repository at this point in the history
  • Loading branch information
vruusmann committed Jul 8, 2018
1 parent 0499a97 commit 494f821
Show file tree
Hide file tree
Showing 2 changed files with 26 additions and 17 deletions.
16 changes: 13 additions & 3 deletions pmml-model/src/main/java/org/jpmml/model/SAXUtil.java
Original file line number Diff line number Diff line change
Expand Up @@ -31,13 +31,23 @@ private SAXUtil(){
static
public SAXSource createFilteredSource(InputStream is, XMLFilter... filters) throws SAXException {
XMLReader reader = XMLReaderFactory.createXMLReader();
reader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

reader = createFilteredReader(reader, filters);

return new SAXSource(reader, new InputSource(is));
}

static
public XMLReader createFilteredReader(XMLReader reader, XMLFilter... filters){
XMLReader result = reader;

for(XMLFilter filter : filters){
filter.setParent(reader);
filter.setParent(result);

reader = filter;
result = filter;
}

return new SAXSource(reader, new InputSource(is));
return result;
}
}
27 changes: 13 additions & 14 deletions pmml-model/src/test/java/org/jpmml/model/XXEAttackTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -8,17 +8,15 @@
import java.util.List;

import javax.xml.bind.UnmarshalException;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

import org.dmg.pmml.Extension;
import org.dmg.pmml.PMML;
import org.junit.Test;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.SAXParseException;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;

public class XXEAttackTest {
Expand All @@ -30,30 +28,31 @@ public void unmarshal() throws Exception {
System.setProperty("javax.xml.accessExternalDTD", "file");

try(InputStream is = ResourceUtil.getStream(XXEAttackTest.class);){
pmml = JAXBUtil.unmarshalPMML(new StreamSource(is));
Source source = new StreamSource(is);

pmml = JAXBUtil.unmarshalPMML(source);
} finally {
System.clearProperty("javax.xml.accessExternalDTD");
}

List<Extension> extensions = pmml.getExtensions();
assertEquals(1, extensions.size());
List<?> content = ExtensionUtil.getContent(pmml);

Extension extension = extensions.get(0);
assertEquals(Arrays.asList("lol"), extension.getContent());
assertEquals(Arrays.asList("lol"), content);
}

@Test
public void unmarshalSecurely() throws Exception {

try(InputStream is = ResourceUtil.getStream(XXEAttackTest.class)){
XMLReader reader = XMLReaderFactory.createXMLReader();
reader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
Source source = SAXUtil.createFilteredSource(is);

JAXBUtil.unmarshalPMML(new SAXSource(reader, new InputSource(is)));
JAXBUtil.unmarshalPMML(source);

fail();
} catch(UnmarshalException ue){
// Ignored
Throwable cause = ue.getCause();

assertTrue(cause instanceof SAXParseException);
}
}
}

3 comments on commit 494f821

@srowen
Copy link

@srowen srowen commented on 494f821 Sep 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vruusmann would you be willing to back-port this to branches 1.3 and 1.2 and consider a new release of those? AFAICT this might affect Spark, and there are versions still using the older JPMML. Or if I tried to back-port, would that help?

@vruusmann
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srowen Defense against XEE and XXE attacks is achieved simply by activating some XMLReader configuration options. It was previously mentioned in the README file, but I doubt that many end users actually noticed and cared about it all.

This commit simply activates those configuration options for most common JPMML-Model unmarshalling utility methods.

Don't know if backporting this commit alone is worth it. There were some API changes right before and after it, which aimed to make the SAX filtering more unified and intuitive (eg. a SAX filter for skipping whitespace characters - used to be stored in JAXB class model classes, and consume loads of memory).

AFAICT this might affect Spark

Apache Spark is PMML producer. This attack vector is more relevant for PMML consumers, which may be exposed to malevolent content (eg. XML upload by third parties).

@srowen
Copy link

@srowen srowen commented on 494f821 Sep 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah on a second look, I don't see that Spark would unmarshal XML, so I don't know if it's even affected. Thanks!

Please sign in to comment.