414: Attempt to implement expanding the allowed character repertoire #546

Merged
merged 2 commits into from Jul 25, 2023


ndw
Contributor

@ndw ndw commented Jun 12, 2023

Fix #414

This PR addresses ACTION QT4CG-036-01 on me.

@michaelhkay
Contributor

(a) I think it might be worth emphasizing (for example in codepoints-to-string()) that unpaired surrogates can never be permitted characters, because they are not Unicode characters.
(b) It might be worth saying something in the serialization spec, perhaps adding a serialization error condition?
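
To illustrate point (a), here is a minimal Python sketch of a codepoints-to-string analogue that rejects surrogate codepoints. The function names and the use of ValueError are assumptions made for this illustration; they are not taken from the spec text or from any implementation. The explicit check is needed in Python because chr() accepts surrogate codepoints without complaint.

```python
from typing import Sequence


def is_surrogate(cp: int) -> bool:
    """Return True for codepoints in the surrogate blocks U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def codepoints_to_string(codepoints: Sequence[int]) -> str:
    """Hypothetical analogue of fn:codepoints-to-string.

    Assumption for this sketch: every Unicode codepoint except a surrogate
    is accepted; a real processor may restrict the repertoire further.
    """
    for cp in codepoints:
        if not 0 <= cp <= 0x10FFFF:
            raise ValueError(f"not a Unicode codepoint: {cp:#x}")
        if is_surrogate(cp):
            # A surrogate is a codepoint but not a character, so it can
            # never be a permitted character in a string.
            raise ValueError(f"surrogate codepoint is not a character: U+{cp:04X}")
    return "".join(chr(cp) for cp in codepoints)
```

With this sketch, codepoints_to_string([0x48, 0x69]) yields "Hi", while any sequence containing 0xD800 raises an error rather than producing a string with an unpaired surrogate.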

@ndw
Contributor Author

ndw commented Jun 12, 2023

Yes, good suggestions.

@Arithmeticus
Contributor

For consistency, the kinds of edits applied in this commit also need to be applied to the following (my comments are based on the qt4cg diff HTML files for this PR):

  • 1.8.1 -- the data model edits need to be applied to their summary counterparts.
  • 1.8.1 first note: these substantive restrictions would be better placed in the data model document, particularly material referencing the surrogate blocks.
  • 1.8.1 second note: I think the revisions mean that the set of codepoints and the set of characters are coterminous, no? (With the proviso that an implementer may choose to not make them so.) Or maybe I misunderstand.
  • first note for fn:unparsed-text-available(): "no characters that are invalid in XML"

@ChristianGruen ChristianGruen changed the title from "Attempt to implement expanding the allowed character repertoire" to "414: Attempt to implement expanding the allowed character repertoire" Jun 19, 2023
@ndw ndw closed this Jun 19, 2023
@ndw ndw reopened this Jun 19, 2023
@ndw
Contributor Author

ndw commented Jun 19, 2023

I managed to lose all my commits and GitHub thoughtfully closed the issue for me. :-/

I've reapplied the changes and reopened the PR.

@michaelhkay I've added a note about unpaired surrogates. I don't believe any new errors are required in serialization: there's already an error, SERE0006, for "character not allowed in this version of XML", which I think applies.

@Arithmeticus
Contributor

@ndw You may also want to adjust the language in xpath-datamodel.xml at lines 932-934:

This means, for example, that the xs:string data type should (at the time of writing) support the set of characters defined by the Char production in XML 1.1 Second Edition.

The relative parenthetical clause gives me slight chills, as well.
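
For reference, the Char production mentioned in the quoted sentence can be written out as codepoint ranges. The Python sketch below encodes the Char productions of XML 1.1 and XML 1.0 as range tables. The constant and function names are hypothetical and chosen only for this illustration.

```python
# XML 1.1: Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML11_CHAR_RANGES = ((0x1, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF))

# XML 1.0: Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
#                 | [#x10000-#x10FFFF]
XML10_CHAR_RANGES = (
    (0x9, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def _in_ranges(cp: int, ranges) -> bool:
    """Return True if cp falls inside any inclusive (low, high) range."""
    return any(low <= cp <= high for low, high in ranges)


def is_xml11_char(cp: int) -> bool:
    """Return True if cp matches the XML 1.1 Char production."""
    return _in_ranges(cp, XML11_CHAR_RANGES)


def is_xml10_char(cp: int) -> bool:
    """Return True if cp matches the XML 1.0 Char production."""
    return _in_ranges(cp, XML10_CHAR_RANGES)
```

Both productions exclude the surrogate blocks, U+FFFE, and U+FFFF. They differ in the C0 controls: is_xml11_char(0x1) is true while is_xml10_char(0x1) is false, and neither admits U+0000.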

@ndw
Contributor Author

ndw commented Jul 25, 2023

The CG agreed to merge this issue at meeting 043.

@ndw ndw merged commit 920edcb into qt4cg:master Jul 25, 2023
2 checks passed
@ChristianGruen ChristianGruen added the "Tests Needed" label (tests need to be written or merged) Dec 12, 2023
Development

Successfully merging this pull request may close these issues.

Lift character set restriction of xs:string
4 participants