Trim values obtained with getTextContent() on any XML node #340
Those changes could break current integrations, so they should be applied in a major release. That said, I don't see why the toolkit should "clean" the values. If the XML contains unexpected spaces, that is the IdP's fault... |
In some places the trim was already in place (see the audiences validation, for instance).
<saml:NameID
Format="urn:oasis:names:tc:SAML:2.0:nameid-format:transient"
NameQualifier= "http://spidIdp.spididpProvider.it">
_06e983facd7cd554cfe067e
</saml:NameID>
what do you think the
while the expected value in this case would be just |
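To make the behaviour under discussion concrete, here is a minimal sketch of what `getTextContent()` returns for a pretty-printed NameID like the one above. It assumes a plain JAXP DOM parse rather than the toolkit's own XML utilities; the class name `NameIdWhitespaceDemo` and the `xmlns:saml` declaration (added so the fragment parses on its own) are not part of the original thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class NameIdWhitespaceDemo {

    // Pretty-printed NameID shaped like the one quoted in the thread.
    // The xmlns:saml declaration is an addition so the fragment is well-formed on its own.
    static final String XML =
          "<saml:NameID xmlns:saml=\"urn:oasis:names:tc:SAML:2.0:assertion\"\n"
        + "    Format=\"urn:oasis:names:tc:SAML:2.0:nameid-format:transient\">\n"
        + "    _06e983facd7cd554cfe067e\n"
        + "</saml:NameID>";

    // Returns the element's text content exactly as the DOM exposes it.
    static String rawNameId() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        String raw = rawNameId();
        // Brackets make the surrounding line breaks and indentation visible.
        System.out.println("raw:     [" + raw + "]");
        System.out.println("trimmed: [" + raw.trim() + "]");
    }
}
```

The raw value keeps the line breaks and indentation introduced by the XML formatting, so a consumer comparing it against a stored identifier would see a mismatch unless one side trims.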
OK, let's make it configurable, disabled by default. |
I'll update this PR as soon as possible. I plan to add two properties (both defaulting to
The Meanwhile it's interesting to see that the SAML specification (in the core specification document, section 1.3.1) says:
However, the same specification document shows an example in which a
<NameID
Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress">
scott@example.org
</NameID>
We can argue that it might be just due to PDF pagination, but indeed it leads to confusion. |
Hi @pitbulk , here is the update to make trimming an opt-in feature. Please note that |
ab7e4d7
to
3c79c8c
Compare
Sorry for the delay, #333 merged |
Hi @pitbulk, thanks for merging! I will then adapt the other open pull requests (including this one), but please note that I'm leaving for a couple of weeks of holiday and will be back around 12th July, when I will be able to proceed. Thanks for your patience. |
We need to take care while applying this one... An example of a security issue related to the use of "trim": |
Thanks for the link, I will study it and let you know! |
If the subject name id is not trimmed, its value may contain surrounding whitespace characters depending on XML formatting. This change also avoids a double trim of audiences (which indeed were already trimmed).
This change extends the previous one, made for the SamlResponse name ID, so that surrounding whitespace is removed from any value obtained from an XML element where this is indeed the expected behaviour (as in issuers, audiences, status messages and name IDs).
Name IDs (including issuers) are left untouched by default, as are attribute values, just as before. However, two new settings have been introduced (both defaulting to false) that enable trimming for such values. This is probably the desired behaviour in practice, although the SAML specification says that no whitespace processing should be performed on strings. Another place where trimming may be desirable is SessionIndex extraction from LogoutRequests: trimming is not performed at any point of the LogoutRequest processing, but an overload has been provided so that the API consumer may still request it. AuthnResponseTest.testGetIssuersTrimming() is disabled for now because it fails due to a bug in SamlResponse.getIssuers(), which is addressed by another PR.
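The opt-in behaviour described in this commit message could be sketched roughly as follows. This is only an illustration of the idea, not the code in this PR: the class `OptInTrimmingSketch`, the `ParsingSettings` holder and its field names are hypothetical, and the real setting names are not shown in this thread.

```java
public class OptInTrimmingSketch {

    /** Hypothetical settings holder; both flags default to false, so behaviour is unchanged unless enabled. */
    static final class ParsingSettings {
        boolean trimNameIds = false;
        boolean trimAttributeValues = false;
    }

    // Name IDs (and issuers) are trimmed only when the caller opted in.
    static String nameId(String raw, ParsingSettings settings) {
        return applyTrim(raw, settings.trimNameIds);
    }

    // Attribute values follow their own, independent flag.
    static String attributeValue(String raw, ParsingSettings settings) {
        return applyTrim(raw, settings.trimAttributeValues);
    }

    // Null-safe helper: returns the value untouched unless trimming was requested.
    static String applyTrim(String value, boolean trim) {
        return (trim && value != null) ? value.trim() : value;
    }

    public static void main(String[] args) {
        String raw = "\n  scott@example.org\n";

        ParsingSettings defaults = new ParsingSettings();
        System.out.println("[" + nameId(raw, defaults) + "]"); // whitespace preserved

        ParsingSettings optIn = new ParsingSettings();
        optIn.trimNameIds = true;
        System.out.println("[" + nameId(raw, optIn) + "]");    // whitespace removed
    }
}
```

Keeping the two flags separate means an integration can normalise name IDs without touching attribute values, where leading or trailing whitespace might be significant.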
Now that the fixes to getIssuers() and the new getResponseIssuer() and getAssertionIssuer() are available on master, proper test cases can be provided with regard to trimming.
1ae96e3
to
1562784
Compare
Hi @pitbulk, I rebased the whole work done here against current master and completed the AuthnResponseTest changes. Let me know if you have more objections. I read the issue at https://hackerone.com/reports/976603. Although I don't know what Grammarly really is, if I understand it correctly the issue is related to the use of different trimming strategies in a scenario where a user can register with an IdP and a new IdP can join the federation exposing a malicious entity ID. |
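As a general illustration of why "different trimming strategies" can matter, and not as a reconstruction of the linked report, here is a small sketch comparing two standard Java methods that disagree on what counts as whitespace. The class name, the `idp.example.org` entity ID and the scenario are assumptions made up for the example; `String.strip()` requires Java 11 or later.

```java
public class TrimMismatchSketch {

    // Hypothetical entity ID already known to the federation.
    static final String REGISTERED = "https://idp.example.org";

    // Same text followed by U+2003 (EM SPACE), which is not an ASCII space.
    static final String LOOKALIKE = "https://idp.example.org\u2003";

    // String.trim() removes only leading/trailing characters with code point <= U+0020.
    static String trimAscii(String s) {
        return s.trim();
    }

    // String.strip() removes leading/trailing characters for which Character.isWhitespace is true.
    static String stripUnicode(String s) {
        return s.strip();
    }

    public static void main(String[] args) {
        // A component using trim() sees two distinct identifiers...
        System.out.println(trimAscii(LOOKALIKE).equals(REGISTERED));
        // ...while a component using strip() treats them as the same one.
        System.out.println(stripUnicode(LOOKALIKE).equals(REGISTERED));
    }
}
```

If one component accepts the lookalike as a new, distinct identifier while another normalises it to the registered one, the two can end up disagreeing about who an assertion belongs to. That is the kind of risk that argues for keeping trimming opt-in and applying a single strategy consistently.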
This change ensures that surrounding whitespace is removed from any value obtained from an XML element where this is indeed the expected behaviour (as in issuers, audiences, status messages and name IDs).