Parsing of message body of messages in the ML Switch, JWS implications #755
Canonical encoding of JSON is a known issue. This is why all JOSE specs require payloads to be base64url encoded and the signatures computed against the raw bytes of that encoding.
This is the approach taken by JOSE, but it also introduces a potential security foot-gun. Implementors need to ensure that they validate the signature against the binary version of the data but ALSO ensure that the binary data matches the decoded data used for processing. Just something to be aware of. The current Signature spec does 90% of JWT but then throws away the base64url encoded payload. It puts the following header into the request: {
"signature":
"dz2ntyS0_rDyA0pLeWluG--tBcYYrlvG99ffkXcEBuve5Qzvzyn0ZUi82J7h17RsdfHPuTnbEGvCeU9Y4Bg0nIZHGL4icswaaO09T5hPPYKBTzVQeHkokLmL4dXpHdr1ggSEpu3WEU3nfgOFGGAdOq355i1iGuDbhqm_lSfVHaqdVCEh
kJ2Y_r2glO2QpdZrcbvsBV39derj_PlfISBBGjdh0dIPxnFIVcZuPHiq9Ha2MslrBHfqwF
fNeU_xhErBd2PywkDQJbKOlfqdkmFC9bS8Ofx0O6Mg7qdFGwQkseJTfp0HMbH1d9e6H0cocY8xfuDNGaZpOJhxiYtiPLg",
"protectedHeader":
"eyJhbGciOiJSUzI1NiIsIkZTUElPUC1EZXN0aW5hdGlvbiI6IjU2NzgiLCJGU1BJT1AtVVJJIjoiL3F1b3RlcyIsIkZTUElPUC1IVFRQLU1ldGhvZCI6IlBPU1QiLCJEYXRlIjoiVHVlLCAyMyBNYXkgMjAxNyAyMToxMjozMSBHTVQiLCJGU1BJT1AtU291cmNlIjoiMTIzNCJ9"
} If it followed the JWT spec fully this header would simply be a JWT (i.e. the protected header, payload and signature all base64url encoded and separated by a `.` character). Something like:
See it on JWT.io. One thing to note is that we MAY hit limits on header sizes. The HTTP spec doesn't specify limits, but some servers do implement them (the lowest I am aware of is 8 KB), so we're still well within that limit.
Currently the ML-API-Adapter parses the incoming message into a JSON object via the Hapi.js library, which is then sent to the node-rdkafka streaming library; subsequently, each handler decodes or re-encodes the message from Kafka using the same node-rdkafka library. Messages are stored on Kafka based on the Lime protocol (https://github.com/mojaloop/central-services-stream/blob/master/src/kafka/protocol/readme.md). The headers & payload are stored in the `content` field: {
"id": "f2f038cc-b749-464d-a364-c24acad58ef0",
"to": "mockfsp02",
"from": "mockfsp01",
"type": "application/json",
"content": {
"headers": {
"accept": "application/vnd.interoperability.transfers+json;version=1",
"content-type": "application/vnd.interoperability.transfers+json;version=1",
"date": "2019-01-22T21:27:55.000Z",
"fspiop-source": "mockfsp01",
"fspiop-destination": "mockfsp02",
"content-length": 437
},
"payload": {
"transferId": "f2f038cc-b749-464d-a364-c24acad58ef0",
"payerFsp": "mockfsp01",
"payeeFsp": "mockfsp02",
"amount": {
"amount": "1",
"currency": "USD"
},
"ilpPacket": "eyJxdW90ZUlkIjoiZThmYzFlYzEtZTM5Yy00NDBjLWJjNmQtZjg1NjM2MzYwMTBiIiwidHJhbnNhY3Rpb25JZCI6ImU4ZmMxZWMxLWUzOWMtNDQwYy1iYzZkLWY4NTYzNjM2MDEwYiJ9",
"condition": "ta5inyYarm7thvY3DPHZgSDK0ml_kuGOPi-qhiF1SCs",
"expiration": "2019-12-10T12:07:47.349Z"
}
},
"metadata": {
"event": {
"id": "25240fa4-da6a-4f18-8b42-e391fde70817",
"type": "prepare",
"action": "prepare",
"createdAt": "2019-05-06T08:53:16.996Z",
"state": {
"status": "success",
"code": 0
}
}
}
} An additional field, `payloadType`, could be added to indicate how the payload is encoded.
Here is an example where the payload contains the above JSON which has been encoded as base64: {
"id": "f2f038cc-b749-464d-a364-c24acad58ef0",
"to": "mockfsp02",
"from": "mockfsp01",
"type": "application/json",
"content": {
"headers": {
"accept": "application/vnd.interoperability.transfers+json;version=1",
"content-type": "application/vnd.interoperability.transfers+json;version=1",
"date": "2019-01-22T21:27:55.000Z",
"fspiop-source": "mockfsp01",
"fspiop-destination": "mockfsp02",
"content-length": 437
},
"payload": "ewogICAgICAgICAgICAidHJhbnNmZXJJZCI6ICJmMmYwMzhjYy1iNzQ5LTQ2NGQtYTM2NC1jMjRhY2FkNThlZjAiLAogICAgICAgICAgICAicGF5ZXJGc3AiOiAibW9ja2ZzcDAxIiwKICAgICAgICAgICAgInBheWVlRnNwIjogIm1vY2tmc3AwMiIsCiAgICAgICAgICAgICJhbW91bnQiOiB7CiAgICAgICAgICAgICAgICAiYW1vdW50IjogIjEiLAogICAgICAgICAgICAgICAgImN1cnJlbmN5IjogIlVTRCIKICAgICAgICAgICAgfSwKICAgICAgICAgICAgImlscFBhY2tldCI6ICJleUp4ZFc5MFpVbGtJam9pWlRobVl6RmxZekV0WlRNNVl5MDBOREJqTFdKak5tUXRaamcxTmpNMk16WXdNVEJpSWl3aWRISmhibk5oWTNScGIyNUpaQ0k2SW1VNFptTXhaV014TFdVek9XTXRORFF3WXkxaVl6WmtMV1k0TlRZek5qTTJNREV3WWlKOSIsCiAgICAgICAgICAgICJjb25kaXRpb24iOiAidGE1aW55WWFybTd0aHZZM0RQSFpnU0RLMG1sX2t1R09QaS1xaGlGMVNDcyIsCiAgICAgICAgICAgICJleHBpcmF0aW9uIjogIjIwMTktMTItMTBUMTI6MDc6NDcuMzQ5WiIKfQ==",
"payloadType": "application/json;encode=base64"
},
"metadata": {
"event": {
"id": "25240fa4-da6a-4f18-8b42-e391fde70817",
"type": "prepare",
"action": "prepare",
"createdAt": "2019-05-06T08:53:16.996Z",
"state": {
"status": "success",
"code": 0
}
}
}
}
Note that the performance overhead of decoding/encoding should also be taken into consideration here. I.e. UTF-8 encode/decode is probably more efficient than base64, since we can take the string and directly parse it into a JSON object as required, whereas base64 requires an additional decode. This would ensure that the payload could then be validated against the JWS signature as normal. We could enhance the existing Lime protocol to support this. Let me know what you guys think.
Alternatively, we could encode the complete `content` object as base64: {
"id": "f2f038cc-b749-464d-a364-c24acad58ef0",
"to": "mockfsp02",
"from": "mockfsp01",
"type": "application/json;encode=base64",
"content": "ewogICAgICAgICJoZWFkZXJzIjogewogICAgICAgICAgICAiYWNjZXB0IjogImFwcGxpY2F0aW9uL3ZuZC5pbnRlcm9wZXJhYmlsaXR5LnRyYW5zZmVycytqc29uO3ZlcnNpb249MSIsCiAgICAgICAgICAgICJjb250ZW50LXR5cGUiOiAiYXBwbGljYXRpb24vdm5kLmludGVyb3BlcmFiaWxpdHkudHJhbnNmZXJzK2pzb247dmVyc2lvbj0xIiwKICAgICAgICAgICAgImRhdGUiOiAiMjAxOS0wMS0yMlQyMToyNzo1NS4wMDBaIiwKICAgICAgICAgICAgImZzcGlvcC1zb3VyY2UiOiAibW9ja2ZzcDAxIiwKICAgICAgICAgICAgImZzcGlvcC1kZXN0aW5hdGlvbiI6ICJtb2NrZnNwMDIiLAogICAgICAgICAgICAiY29udGVudC1sZW5ndGgiOiA0MzcKICAgICAgICB9LAogICAgICAgICJwYXlsb2FkIjogewogICAgICAgICAgICAidHJhbnNmZXJJZCI6ICJmMmYwMzhjYy1iNzQ5LTQ2NGQtYTM2NC1jMjRhY2FkNThlZjAiLAogICAgICAgICAgICAicGF5ZXJGc3AiOiAibW9ja2ZzcDAxIiwKICAgICAgICAgICAgInBheWVlRnNwIjogIm1vY2tmc3AwMiIsCiAgICAgICAgICAgICJhbW91bnQiOiB7CiAgICAgICAgICAgICAgICAiYW1vdW50IjogIjEiLAogICAgICAgICAgICAgICAgImN1cnJlbmN5IjogIlVTRCIKICAgICAgICAgICAgfSwKICAgICAgICAgICAgImlscFBhY2tldCI6ICJleUp4ZFc5MFpVbGtJam9pWlRobVl6RmxZekV0WlRNNVl5MDBOREJqTFdKak5tUXRaamcxTmpNMk16WXdNVEJpSWl3aWRISmhibk5oWTNScGIyNUpaQ0k2SW1VNFptTXhaV014TFdVek9XTXRORFF3WXkxaVl6WmtMV1k0TlRZek5qTTJNREV3WWlKOSIsCiAgICAgICAgICAgICJjb25kaXRpb24iOiAidGE1aW55WWFybTd0aHZZM0RQSFpnU0RLMG1sX2t1R09QaS1xaGlGMVNDcyIsCiAgICAgICAgICAgICJleHBpcmF0aW9uIjogIjIwMTktMTItMTBUMTI6MDc6NDcuMzQ5WiIKICAgICAgICB9CiAgICB9",
"metadata": {
"event": {
"id": "25240fa4-da6a-4f18-8b42-e391fde70817",
"type": "prepare",
"action": "prepare",
"createdAt": "2019-05-06T08:53:16.996Z",
"state": {
"status": "success",
"code": 0
}
}
}
} With the content field containing the following once it has been decoded: "content": {
"header": <header as received by hapijs>,
"payload": <raw payload as received by hapijs>
} I feel like my initial suggested approach is more explicit and would remove any confusion about how to handle the contents and how the internal payload field should be handled.
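For comparison, a sketch of the alternative (whole-content encoding) with made-up values: the entire headers-plus-payload object is stored as one base64 string, so a consumer must decode it before it can even read a routing header.

```javascript
// Hypothetical content object, modeled on the examples above:
const content = {
  headers: { 'fspiop-source': 'mockfsp01' },
  payload: '{"amount":"1"}'
};
const encoded = Buffer.from(JSON.stringify(content)).toString('base64');

// A consumer cannot inspect fspiop-source without decoding the whole blob first:
const decoded = JSON.parse(Buffer.from(encoded, 'base64').toString('utf8'));
console.log(decoded.headers['fspiop-source']); // mockfsp01
```

This opacity is the main trade-off against the first suggestion, where the headers remain readable in plain JSON.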
I doubt this will be an issue, but even if it is, I would suggest getting this working first before worrying about optimising the encoding/decoding. In terms of the suggestions: if we follow your second suggestion, the headers become opaque until the whole content blob is decoded. For that reason I prefer your first suggestion; however, it would be useful to have some explicit header normalisation rules (I didn't see any in the Signature spec) to ensure that the decoded headers are still used correctly when validating the signature. Then I would do the following:
Agreed.
That is correct. I do not know of any header normalization rules as per the spec other than what is validated against the API specification explicitly, nor do the headers form part of the calculated signature (other than to store the signature itself). Do you have an example of how we could bridge this world?
Just to clarify, do you mean encode the entire value of the `content` field?
Fair enough. If your 1st point above applies to both the headers and payload fields, then I would suggest giving this field a more appropriate name.
All, I think it would be a mistake to prematurely generalise. Let us not worry about building a solution for every possible incoming MIME type. The API specifies UTF-8 encoded JSON only, and that is highly unlikely to change any time soon. Let us worry about fixing the situation for the current spec and stack first.

We have a node.js front-end that can be configured to not parse incoming payloads with the default node JSON parser and simply give our code the raw bytes. So we could stop parsing the bodies in the HAPI framework and do it explicitly in our code; that gives us more control over the parsing. It also allows us to store the bytestream intact before parsing.

All that may be academic, however, as @adrianhopebailie raises a very serious security concern. Storing and re-transmitting the incoming bytestream opens the door to us passing on a message that is not the same as our parsed interpretation. If our parser has such a flaw then we are already very broken indeed, but at least the eventual receiver would know, given that our current approach of reconstructing the outgoing message as JSON from our js objects would invalidate the signature in this situation.

If my interpretation of @millerabel's recent observations is correct, he suggests a change to the spec to eliminate the cases that cause signature breaks during payload reconstruction, e.g. whitespace, ordering of lists and object properties, and unicode escaping. I suggest we give this suggestion a significant amount of credence and consideration, as I believe it is the only solution that prevents a possible parser defect being exploited.
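A sketch of what "give our code the raw bytes" could look like. The `payload` route options below follow the public hapi documentation (`parse: false` skips the framework's parsing, `output: 'data'` buffers the body); they are not taken from this codebase's actual configuration, and the handler logic is illustrative.

```javascript
// Assumed hapi route options (names per the hapi docs, not this repo's config):
const routeOptions = {
  payload: {
    parse: false,    // do NOT let the framework JSON-parse the body
    output: 'data'   // hand the handler a buffered Buffer of raw bytes
  }
};

// The handler would then keep the raw Buffer intact for forwarding and JWS
// validation, and parse explicitly only for its own processing:
const rawBytes = Buffer.from('{"errorCode":"5200"}'); // stand-in for request.payload
const body = JSON.parse(rawBytes.toString('utf8'));
console.log(body.errorCode); // 5200
```

The key design point is that parsing produces a working copy only; the original bytestream is what gets stored and re-transmitted.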
It looks like we could also build our own body-parser middleware if needed, or take this one: https://github.com/nasa8x/hapi-bodyparser and customise it as needed.
This is now implemented and deployed on AWS. To keep things manageable, I am closing this out; for any new issue or regression, please log a new bug.
Summary:
The body parsing in ml-api-adapter is breaking JWS! It is not respecting the body bytewise.
A string which contains an escaped unicode character is being unescaped. This breaks the JWS spec.
For example
Payer sends this message:
{"errorInformation":{"errorCode":"5200","errorDescription":"Generic limit error, amount \u0026 payments threshold."}}
Payee receives this message:
{"errorInformation":{"errorCode":"5200","errorDescription":"Generic limit error, amount & payments threshold."}}
See the escaped unicode sequence in the incoming message, this is allowed as per the JSON RFC.
The issue is that the JSON RFC allows escaped unicode characters, and these get unescaped by the default body parsing mechanism in ml-api-adapter (whatever body parser HAPI uses). The unescaping breaks JWS, as forwarded message bodies must be byte-for-byte identical to the incoming bodies. The only way mojaloop can be sure not to break JWS is to retain the incoming byte stream for forwarding at a later point. This is a mojaloop-wide issue, across the entire surface: ml-api-adapter, all the kafka handlers, etc. The switch is not allowed to alter the bodies of any message it forwards on, but it does under certain circumstances, so this will require a fix.
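The break can be reproduced in a few lines of node (illustrative, not the adapter's actual code path): parsing legally unescapes `\u0026`, and re-serialization does not restore it, so the outgoing bytes differ from the incoming bytes even though the JSON is semantically identical.

```javascript
// Incoming body with an escaped unicode sequence (valid per the JSON RFC):
const incoming = '{"errorDescription":"amount \\u0026 payments threshold."}';

// Parse then re-serialize, as the default body-parsing path effectively does:
const outgoing = JSON.stringify(JSON.parse(incoming));

console.log(outgoing === incoming);  // false — the bytes changed, so JWS breaks
console.log(outgoing.includes('&')); // true — the escape became a literal '&'
```

Note that `JSON.parse(incoming)` and `JSON.parse(outgoing)` are equal as objects; only the byte representation differs, which is exactly what a signature over the raw bytes detects.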
Severity:
High
Priority:
Critical
Expected Behavior
Steps to Reproduce
Incoming message :
{"errorInformation":{"errorCode":"5200","errorDescription":"Generic limit error, amount \u0026 payments threshold."}}
Outgoing message:
{"errorInformation":{"errorCode":"5200","errorDescription":"Generic limit error, amount & payments threshold."}}
See the escaped unicode sequence in the incoming message, this is allowed as per the JSON RFC.
Specifications
Notes:
That would require a coordinated change to the kafka message "protocol"/structure across all the handlers and ml-api-adapter.
Tasks:
Acceptance criteria:
Pull Requests: