[XML] properly handle normalizedString & token #638

Closed
jkowalleck opened this issue Jun 24, 2024 · 2 comments · Fixed by #646
Labels
bug Something isn't working


@jkowalleck
Member

jkowalleck commented Jun 24, 2024

CycloneDX uses http://www.w3.org/2001/XMLSchema - which defines normalizedString as follows:

<xs:simpleType name="normalizedString" id="normalizedString">
  <xs:annotation>
    <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#normalizedString"/>
  </xs:annotation>
  <xs:restriction base="xs:string">
    <xs:whiteSpace value="replace" id="normalizedString.whiteSpace"/>
  </xs:restriction>
</xs:simpleType>

normalizedString represents white space normalized strings. The ·value space· of normalizedString is the set of strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters. The ·lexical space· of normalizedString is the set of strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters. The ·base type· of normalizedString is string.


CycloneDX uses http://www.w3.org/2001/XMLSchema - which defines token as follows:

<xs:simpleType name="token" id="token">
  <xs:annotation>
    <xs:documentation source="http://www.w3.org/TR/xmlschema-2/#token"/>
  </xs:annotation>
  <xs:restriction base="xs:normalizedString">
    <xs:whiteSpace value="collapse" id="token.whiteSpace"/>
  </xs:restriction>
</xs:simpleType>

token represents tokenized strings. The ·value space· of token is the set of strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces. The ·lexical space· of token is the set of strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces. The ·base type· of token is normalizedString.


Therefore, on XML (de)normalization for normalizedString, the following characters must each be replaced by a space (#x20):

  • carriage return: \r (#xD)
  • line feed: \n (#xA)
  • tab: \t (#x9)

Likewise, on XML (de)normalization for token, the following must apply:

  • all replacements listed above for normalizedString
  • consecutive spaces are collapsed to one space
  • leading and trailing spaces are removed

Only fields that are defined as normalizedString or token in the XML spec are affected!
Other fields MUST NOT be affected!
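
To make the two rule sets above concrete, here is a minimal Python sketch. The function names `to_normalized_string` and `to_token` are made up for illustration and are not part of any existing library; the sketch only assumes the whitespace rules quoted above.

```python
import re

# Characters that xs:normalizedString replaces with a space (#x20).
_REPLACE = re.compile(r"[\r\n\t]")
# Runs of two or more spaces, which xs:token collapses to one.
_COLLAPSE = re.compile(r" {2,}")


def to_normalized_string(value: str) -> str:
    """whiteSpace="replace": each CR, LF and tab becomes one space."""
    return _REPLACE.sub(" ", value)


def to_token(value: str) -> str:
    """whiteSpace="collapse": replace, then collapse runs, then strip spaces."""
    replaced = to_normalized_string(value)
    return _COLLAPSE.sub(" ", replaced).strip(" ")
```

Note that "replace" is per character, so `"a\r\nb"` becomes `"a  b"` with two spaces as a normalizedString, while the token form collapses it to `"a b"`.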

@jkowalleck
Member Author

jkowalleck commented Jun 24, 2024

possible solution:

  1. implement a capability (decorator) in the serialization library that does the needed transformation when loading and writing XML.
  2. apply the newly added capability (decorator) in the CycloneDX lib.

@jkowalleck jkowalleck added good first issue Good for newcomers help wanted Extra attention is needed enhancement New feature or request bug Something isn't working and removed enhancement New feature or request labels Jun 24, 2024
@jkowalleck jkowalleck changed the title [XML] properly handle normalizedString [XML] properly handle normalizedString & token Jun 24, 2024
@jkowalleck
Member Author

solution as done in JS/TS

@jkowalleck jkowalleck self-assigned this Jul 8, 2024
@jkowalleck jkowalleck removed good first issue Good for newcomers help wanted Extra attention is needed labels Jul 8, 2024
jkowalleck added a commit that referenced this issue Jul 8, 2024
fixes #638

---------

Signed-off-by: Jan Kowalleck <jan.kowalleck@gmail.com>