Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly parse as XML and fix setting of Nokogiri options. #335

Merged
merged 1 commit into from Jul 19, 2016

Conversation

Umofomia
Copy link
Contributor

@Umofomia Umofomia commented Jun 28, 2016

Issue

Nokogiri.parse can return either a Nokogiri::HTML::Document or Nokogiri::XML::Document. This appears to be determined by a regular expression matcher: https://github.com/sparklemotion/nokogiri/blob/v1.5.11/lib/nokogiri.rb#L70

Because of this, if the characters html (case-insensitive) appears inside a tag in the document, Nokogiri will parse the document as HTML instead of XML, which breaks ruby-saml functionality. For instance, we hit this when one of our SAML responses happened to contain this in an ID attribute:

...5RwkhTMlTp1n...

In addition, while investigating the fix for this issue I noticed that the configuration options for the document did not appear to be set properly. It appears that these configuration options were introduced in #247 to address a security issue; I'm not sure I have enough context to assess this, but it might be that the original security issue was not fully addressed.

Fix

Use Nokogiri::XML to ensure that the document is always parsed as XML. In addition, correctly set the options attribute in the XML document configuration.

@pitbulk
Copy link
Collaborator

pitbulk commented Jun 29, 2016

@Umofomia , thanks for report that.

In order to do exactly the same behaviour, should we use?

config.options = XMLSecurity::BaseDocument::NOKOGIRI_OPTIONS || XML::ParseOptions::DEFAULT_XML

@Umofomia
Copy link
Contributor Author

@pitbulk: Would it instead be better to define XMLSecurity::BaseDocument::NOKOGIRI_OPTIONS to include XML::ParseOptions::DEFAULT_XML as well? This would address all of the places where it's used at once and also another existing location that appears to use the same pattern:
https://github.com/onelogin/ruby-saml/blob/5ebd8d1f1f41690fc80e2856adb46e586e596863/lib/onelogin/ruby-saml/saml_message.rb#L68

@pitbulk
Copy link
Collaborator

pitbulk commented Jun 29, 2016

Yes, I think so

@Umofomia
Copy link
Contributor Author

Actually, looking at the definition of XML::ParseOptions::DEFAULT_XML, it appears to be defined as RECOVER | NONET: https://github.com/sparklemotion/nokogiri/blob/v1.5.11/lib/nokogiri/xml/parse_options.rb#L50

RECOVER would basically end up nullifying the STRICT option that is currently in NOKOGIRI_OPTIONS: https://github.com/onelogin/ruby-saml/blob/8453f125146a081035bfe9275ffd04007c9c2e59/lib/xml_security.rb#L41

For this reason, I argue we shouldn't be including DEFAULT_XML if your intent is to use the STRICT option.

@Umofomia
Copy link
Contributor Author

Umofomia commented Jun 29, 2016

@pitbulk: Also note that Nokogiri.parse uses || instead of | to include the DEFAULT_XML option. This means it only uses DEFAULT_XML when options aren't specified at all. When any options are specified, then DEFAULT_XML is not included.

So I don't believe any additional changes need to be done to this pull request. Would you agree?

@Umofomia
Copy link
Contributor Author

@pitbulk: Any thoughts? I mentioned above why it doesn't appear that any other changes are needed. I'd like to get this change in because it's currently breaking on our systems for the user that happens to have that ID.

@pitbulk
Copy link
Collaborator

pitbulk commented Jul 19, 2016

@luisvm can you review this?

@luisvm
Copy link
Contributor

luisvm commented Jul 19, 2016

looks good 👍

@pitbulk pitbulk merged commit 7d48ca8 into SAML-Toolkits:master Jul 19, 2016
@Umofomia
Copy link
Contributor Author

Thanks!

@pitbulk
Copy link
Collaborator

pitbulk commented Jul 19, 2016

Thanks you for the contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants