Explicitly parse as XML and fix setting of Nokogiri options. #335
@Umofomia, thanks for reporting that. In order to get exactly the same behaviour, should we use?
@pitbulk: Would it instead be better to define
Yes, I think so
Actually, looking at the definition of
For this reason, I argue we shouldn't be including
@pitbulk: Also note that So I don't believe any additional changes need to be done to this pull request. Would you agree?
@pitbulk: Any thoughts? I mentioned above why it doesn't appear that any other changes are needed. I'd like to get this change in because it's currently breaking on our systems for the user that happens to have that ID.
@luisvm can you review this?
looks good 👍
Thanks!
Thank you for the contribution.
Issue
`Nokogiri.parse` can return either a `Nokogiri::HTML::Document` or a `Nokogiri::XML::Document`. This appears to be determined by a regular expression matcher: https://github.com/sparklemotion/nokogiri/blob/v1.5.11/lib/nokogiri.rb#L70

Because of this, if the characters `html` (case-insensitive) appear inside a tag in the document, Nokogiri will parse the document as HTML instead of XML, which breaks ruby-saml functionality. For instance, we hit this when one of our SAML responses happened to contain this in an ID attribute:

In addition, while investigating the fix for this issue I noticed that the configuration options for the document did not appear to be set properly. It appears that these configuration options were introduced in #247 to address a security issue; I'm not sure I have enough context to assess this, but it might be that the original security issue was not fully addressed.
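To illustrate how a regex-based guess of this kind can misfire, here is a minimal pure-Ruby sketch. The constant name `LOOKS_LIKE_HTML`, the helper `guessed_format`, the pattern itself, and both sample IDs are assumptions made up for illustration; the pattern only approximates the idea behind the linked line and is not guaranteed to match Nokogiri's source.

```ruby
# Hypothetical approximation of a "does this look like HTML?" heuristic:
# an opening "<", then any characters other than "h" or ">", then "html".
# The name and the pattern are illustrative assumptions.
LOOKS_LIKE_HTML = /^\s*<[^Hh>]*html/i

# Hypothetical helper: report which parser such a heuristic would pick.
def guessed_format(string)
  LOOKS_LIKE_HTML.match?(string) ? :html : :xml
end

# Made-up SAML-like fragments; only the ID attribute differs.
plain   = '<samlp:Response ID="_4f2a9c"/>'
unlucky = '<samlp:Response ID="_abchtml123"/>'

guessed_format(plain)    # => :xml
guessed_format(unlucky)  # => :html, although the document is XML
```

A randomly generated ID only needs to contain the four letters `html` in the right position for a heuristic like this to pick the wrong parser, which is why the failure can show up for a single user.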
Fix
Use `Nokogiri::XML` to ensure that the document is always parsed as XML. In addition, correctly set the `options` attribute in the XML document configuration.