Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rtc: Verify outermost XML signature of SII AECs #242

Merged
merged 9 commits into from
Oct 12, 2021

Conversation

jtrobles-cdd
Copy link
Member

@jtrobles-cdd jtrobles-cdd self-assigned this Sep 22, 2021
@jtrobles-cdd jtrobles-cdd force-pushed the feature/verify-aec-outermost-xml-signature branch from c3b78fe to 7fff179 Compare September 23, 2021 01:54
@codecov-commenter
Copy link

codecov-commenter commented Sep 23, 2021

Codecov Report

Merging #242 (38821b8) into develop (59c37c3) will increase coverage by 0.24%.
The diff coverage is 82.35%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #242      +/-   ##
===========================================
+ Coverage    83.00%   83.24%   +0.24%     
===========================================
  Files           32       33       +1     
  Lines         2500     2531      +31     
  Branches       349      355       +6     
===========================================
+ Hits          2075     2107      +32     
+ Misses         275      269       -6     
- Partials       150      155       +5     
Impacted Files Coverage Δ
cl_sii/rtc/xml_utils.py 66.66% <66.66%> (ø)
cl_sii/rtc/data_models_aec.py 89.90% <77.77%> (-0.55%) ⬇️
cl_sii/libs/xml_utils.py 80.19% <100.00%> (+1.25%) ⬆️
cl_sii/rtc/parse_aec.py 91.51% <100.00%> (+2.63%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 59c37c3...38821b8. Read the comment docs.

@jtrobles-cdd jtrobles-cdd force-pushed the feature/verify-aec-outermost-xml-signature branch 8 times, most recently from 4d769b9 to dd48f4e Compare September 23, 2021 21:02
@jtrobles-cdd jtrobles-cdd marked this pull request as ready for review September 23, 2021 21:04
@ycouce-cdd
Copy link
Member

@jtrobles-cdd @glarrain-cdd This is what I found out so far:
After running tests on a set of 48.709 AEC documents approved by the SII, the outcome was:

  • Signature verification was successful in a total of 46.951 AEC documents (≈ 96%)
  • In the AEC documents in which the signature verification failed 1.758 (<4%), it was for only two causes:
    • ValidationError: 1 validation error for _Aec signature -> key_info_x509_data_x509_cert Unable to load certificate (type=value_error) (1.388 AEC documents ≈ 79% of the documents that the signature verification failed)
    • XmlSignatureUnverified: Digest mismatch for reference 0 (370 AEC documents)

The second error (XmlSignatureUnverified) is the one that is directly related to the changes in this PR. When analyzing a subset of the AEC documents that fail due to this error, what stands out the most is that these documents contain empty-element tags (e.g. <ImagenDTE />1), which the canonicalization algorithm is expanding to the representation of start-tag followed by an end-tag (e.g. <ImagenDTE></ImagenDTE>). This obviously introduces changes to the content of the document and is the probable cause that the signature verification is failing.
If this is the cause, a possible solution would be to overwrite the _c14n method used in signxml.XMLVerifier for canonicalization, introducing a change that allows us to control what we do with the empty-element tags, similar to the behavior implemented in the method xml.etree.ElementTree.tostring via the short_empty_elements parameter (see the docs):

The keyword-only short_empty_elements parameter controls the formatting of elements that contain no content. If True (the default), they are emitted as a single self-closed tag, otherwise they are emitted as a pair of start/end tags.

Note: Both scripts/canonicalize_xml_file.py (added in this PR), and scripts/clean_dte_xml_file.py do the transformation of empty-element tags to start-tag followed by end-tag.

Footnotes

  1. DocumentoAEC//Cesiones//DTECedido//DocumentoDTECedido//ImagenDTE
    https://github.com/cl-sii-extraoficial/archivos-oficiales/blob/99b15aff252836e1ac311d243636aa3a9e6b89c6/src/code/rtc/2019-12-12-schema_cesion/schema_cesion/DTECedido_v10.xsd#L22-L26

@jtrobles-cdd
Copy link
Member Author

The second error (XmlSignatureUnverified) is the one that is directly related to the changes in this PR.

This PR doesn't contain any changes related to XML canonicalization during signature verification. The script scripts/canonicalize_xml_file.py was created to clean AEC XML files for testing, but is not used during signature verification.

SignXML does perform XML canonicalization, but only because it must do that if the XML signature specifies a canonicalization method, as the following example shows:

<Signature ...>
<SignedInfo>
<CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
...
</Signature>

When analyzing a subset of the AEC documents that fail due to this error, what stands out the most is that these documents contain empty-element tags (e.g. <ImagenDTE />1), which the canonicalization algorithm is expanding to the representation of start-tag followed by an end-tag (e.g. <ImagenDTE></ImagenDTE>). This obviously introduces changes to the content of the document and is the probable cause that the signature verification is failing.

XML canonicalization isn't supposed to break an XML digital signature if the signature itself specifies it through <CanonicalizationMethod>.

@ycouce-cdd
Copy link
Member

XML canonicalization isn't supposed to break an XML digital signature if the signature itself specifies it through <CanonicalizationMethod>.

You are absolutely right and in an ideal context these changes should be enough, but when it comes to SII, nothing works as expected.
The CanonicalizationMethod element is required, but unfortunately, there is no way to enforce the canonicalization algorithm to be applied performing signature calculations and apparently, the SII doesn't apply it in its signature verification process and that is why we have a large number of AEC and DTE XML documents that are not in their canonical form and applying the canonicalization algorithm to them usually does not introduce major changes, except in cases like the one I mentioned above, which are altering the signed information.
That is why a large part of the revision of this PR was dedicated to executing the tests on a wide set of XML AEC accepted by the SII.
e.g.
Fyntex FP doesn't apply canonicalization algorithm in the signature verification process and I think that thanks to this it has a high success rate in this process, unfortunately, it does not apply it either in the AEC signing process and still adds them as a reference in the SignedInfo element, because it is required. Am I wrong @yrios-cdd?

Copy link
Member

@ycouce-cdd ycouce-cdd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving these changes because the solution is great, it works as expected, and they also had a high success rate (+ 96%) in AEC document signature verification tests on a large set of SII-approved AECs.
But we should document that one of the reasons why the signature verification process will fail is because the XML is not in its canonical form, even when a canonicalization algorithm is referenced in the SignedInfo element.

Add a parameter to `verify_xml_signature` to enable support for XML
documents that contain multiple signatures, assuming that a suitable
custom XML signature verifier is used.
Test have not been implemented.

Based on `scripts/clean_dte_xml_file.py`.
The file is the output of command:

```sh
./scripts/canonicalize_xml_file.py file \
  'tests/test_data/sii-rtc/AEC--76354771-K--33--170--SEQ-2.xml' \
  'tests/test_data/sii-rtc/AEC--76354771-K--33--170--SEQ-2-canonicalized-c14n.xml'
```

The MD5 checksum of the added file is f41e807aba2a08a3d42f0848a6621f72.
- Extract the Base64+DER–encoded certificate and the Base64-encoded
  signature value from the signature over `<DocumentoAEC>` from the AEC
  XML files
  `tests/test_data/sii-rtc/AEC--76354771-K--33--170--SEQ-2.xml` and
  `tests/test_data/sii-rtc/AEC--76399752-9--33--25568--SEQ-1.xml`.
- Convert the DER-encoded certificates to PEM-encoded.

  Commands:

  ```sh
  cd 'tests/test_data/sii-crypto/'

  openssl x509 \
    -inform  DER -in  'AEC--76354771-K--33--170--SEQ-2-cert.der' \
    -outform PEM -out 'AEC--76354771-K--33--170--SEQ-2-cert.pem'

  openssl x509 \
    -inform  DER -in  'AEC--76399752-9--33--25568--SEQ-1-cert.der' \
    -outform PEM -out 'AEC--76399752-9--33--25568--SEQ-1-cert.pem'
  ```

(OpenSSL 1.1.1f (2020-03-31))
The three files are related to the canonicalized AEC XML file
`tests/test_data/sii-rtc/AEC--76354771-K--33--170--SEQ-2-canonicalized-c14n.xml`.
An AEC XML document has multiple XML digital signatures. This commit
implements the parsing of the signature over the AEC XML element
`<DocumentoAEC>` (XPath: `/AEC/DocumentoAEC`).
Only the outermost signature of the AEC XML document is verified (i.e.
the one over `/AEC/DocumentoAEC`).
@jtrobles-cdd jtrobles-cdd force-pushed the feature/verify-aec-outermost-xml-signature branch from dd48f4e to 38821b8 Compare October 12, 2021 22:41
@jtrobles-cdd jtrobles-cdd merged commit 477680f into develop Oct 12, 2021
@jtrobles-cdd jtrobles-cdd deleted the feature/verify-aec-outermost-xml-signature branch October 12, 2021 22:43
@jtrobles-cdd
Copy link
Member Author

But we should document that one of the reasons why the signature verification process will fail is because the XML is not in its canonical form, even when a canonicalization algorithm is referenced in the SignedInfo element.

@ycouce-cdd Can you create a GitHub issue for that?

@ycouce-cdd
Copy link
Member

But we should document that one of the reasons why the signature verification process will fail is because the XML is not in its canonical form, even when a canonicalization algorithm is referenced in the SignedInfo element.

@ycouce-cdd Can you create a GitHub issue for that?

Done #246

@ycouce-cdd ycouce-cdd mentioned this pull request Oct 20, 2021
ycouce-cdd added a commit that referenced this pull request Dec 23, 2021
There is a high number of AEC documents approved by the SII that
cannot be instantiated because the validation that the AEC signature
certificate is loadable fails. For the moment we will disable this
validation so we can instantiate the AEC documents while we look for
a solution to the issue loading the AEC signature certificate.

Ref: https://cordada.aha.io/features/COMPCLDATA-13
Ref: #242 (comment)
ycouce-cdd added a commit that referenced this pull request Dec 23, 2021
There is a high number of AEC documents approved by the SII that
cannot be instantiated because the validation that the AEC signature
certificate is loadable fails. For the moment we will disable this
validation so we can instantiate the AEC documents while we look for
a solution to the issue loading the AEC signature certificate.

Ref: https://cordada.aha.io/features/COMPCLDATA-13
Ref: #242 (comment)
@glarrain-cdd glarrain-cdd added the component: sii_crypto crypto particular to the SII (unlike 'crypto') label Dec 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: rtc component: sii_crypto crypto particular to the SII (unlike 'crypto') enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants