New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

saxon:parse() must disable external entities #3468

Closed
ebruchez opened this Issue Feb 8, 2018 · 3 comments

Comments

Projects
1 participant
@ebruchez
Collaborator

ebruchez commented Feb 8, 2018

As per the Saxon doc, "The XML parser that is used will be the one nominated to the Saxon Configuration as the parser for source documents." And this is not "our" parser but

This goes through Configuration.loadParser, via Configuration.getSourceParser, which does:

SAXParserFactory.newInstance().newSAXParser().getXMLReader();

This returns a com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.

Configuration supports setting a Java class name to determine which parser is to be used. This is not ideal for us. Maybe we can just instead override getSourceParser?

+1 from customer

@ebruchez ebruchez added the XForms label Feb 8, 2018

@ebruchez ebruchez self-assigned this Feb 8, 2018

@ebruchez

This comment has been minimized.

Collaborator

ebruchez commented Feb 9, 2018

So going for overriding getSourceParser and getStyleParser. Note that Saxon keeps a pool of parsers. We don't use a pool of parsers in our XMLParsing class, just a pool of factories. I think that the cost is related to the factories and not the parser creating itself.

@ebruchez

This comment has been minimized.

Collaborator

ebruchez commented Feb 9, 2018

Profiling this:

<xf:output value="(for $i in (1 to 10000) return saxon:parse(concat('&lt;foo>Hello', $i, '!&lt;/foo>')))[last()]"/>

Shows that creating the parser takes about 1% of the time. Not too catastrophic.

Time for a refresh with the 10,000 iterations:

  • JDK parser: 2,557 ms
  • Our parser: 543 ms

So it's a win-win, pool or not.

@ebruchez ebruchez added the Security label Feb 9, 2018

@ebruchez

This comment has been minimized.

Collaborator

ebruchez commented Feb 9, 2018

For the record, I verified that:

  1. With the previous situation, you can indeed read local files using external entities.
  2. With the new parser and settings, you can't.

@ebruchez ebruchez closed this in 59cdcbf Feb 9, 2018

@ebruchez ebruchez added this to Done in Orbeon Forms 2018.1 via automation Feb 9, 2018

@ebruchez ebruchez added this to To do in Orbeon Forms 2017.2.1 via automation Feb 9, 2018

@ebruchez ebruchez moved this from To do to Done in Orbeon Forms 2017.2.1 Feb 9, 2018

ebruchez added a commit that referenced this issue Feb 27, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment