Properly escape text to produce valid XML #315
The generation methods for metadata, AuthnRequest, LogoutRequest and LogoutResponse XML documents and messages have been changed so that any text that could be specified by the user is properly escaped, producing valid XML output: characters like &, ", < and > are now replaced with the corresponding XML entities (&amp;, &quot;, &lt;, &gt;). The commons-lang StringEscapeUtils class has been used, although it is deprecated: the deprecation warning suggests switching to the equivalent class in commons-text, which java-saml currently does not depend on. I leave the decision on whether to switch for future evaluation: the escaping logic is implemented in the Util class, so any change of implementation only needs to touch the Util.toXml(String) method. Closes SAML-Toolkits#309 and SAML-Toolkits#305.
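To illustrate the kind of escaping described above, here is a minimal, dependency-free sketch. It is not the code in this PR (which delegates to StringEscapeUtils); the class name `XmlEscapeSketch` is hypothetical, and the method only mirrors the role of `Util.toXml(String)` by replacing the five basic XML entities.

```java
// Hypothetical sketch: a hand-rolled stand-in for the role played by Util.toXml(String).
// The real change in this PR relies on commons-lang StringEscapeUtils instead.
final class XmlEscapeSketch {

    private XmlEscapeSketch() {
    }

    /** Replaces &, <, >, " and ' with their predefined XML entities; returns null for null input. */
    static String toXml(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&apos;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because every caller would go through a single method like this, swapping the implementation later (for example to a commons-text class) should only require changing that one method.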
ab7e4d7 to 3c79c8c
I don't think you need to escape objects like:
Related to the escape method, escapeXml is deprecated, but it only escapes the five basic XML entities (gt, lt, quot, amp, apos), which is what we need. I'm not sure whether using escapeXml10 or escapeXml11, which escape more characters, could have side effects.
Any chance you could do a release once this is merged?
I can't check the code right now, but are we really sure that those settings are fully under our control? I'm thinking about the dynamic settings use case. I went the safe way: wherever strings are involved, I escape them. This way we should protect ourselves against malicious settings.
Perhaps minimal escaping would be enough, but I can't think of how more comprehensive escaping could do harm. This is why I opted for escapeXml10. I can't remember the details right now, but I remember that I had problems in the past with more minimal escaping when non-printable characters were involved: that case was handled well by escapeXml10 instead.
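As a rough illustration of the non-printable character problem: XML 1.0 cannot represent most control characters even when escaped, and as far as I know escapeXml10 drops characters outside the XML 1.0 `Char` ranges. The sketch below shows only that filtering step, not the full behaviour of escapeXml10; the class and method names are hypothetical.

```java
// Hypothetical sketch of the character filtering that a stricter escaper applies.
// It keeps only code points allowed by the XML 1.0 "Char" production:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
final class Xml10CharFilterSketch {

    private Xml10CharFilterSketch() {
    }

    /** Returns true if the code point may appear in an XML 1.0 document. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Drops every code point (including unpaired surrogates) that XML 1.0 cannot represent. */
    static String stripInvalid(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints()
                .filter(Xml10CharFilterSketch::isXml10Char)
                .forEach(out::appendCodePoint);
        return out.toString();
    }
}
```

A purely entity-based escaper would pass a character such as U+0000 through untouched, which is the kind of input that can make a downstream XML parser fail.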
I just checked:
To sum it up, I think that the only place where escaping may be removed is in
Just to provide more details: I remember now that those non-printable characters were driving the JSF parser crazy and producing some fancy exceptions. As I said, I could solve the problem with
In the java-saml context, I think that basic escaping could help make legitimate use cases (which are currently broken) work, while more comprehensive escaping can help prevent malicious XML from being generated and causing all sorts of unexpected problems for consumers. IMHO,
Anyway, I think this issue is quite severe, so we may probably start with
@pitbulk I removed the useless XML escaping of certificate data.