Improve documentation on p:escape-markup wrt unicode characters #14

Closed
dmj opened this issue Jul 11, 2018 · 4 comments

@dmj dmj commented Jul 11, 2018

The result of this step is an XML document that contains the Unicode characters that are the characters that result from escaping the input. It is not encoded characters in a serialized octet stream, therefore, the serialization options related to encoding characters (byte-order-mark, encoding, and normalization-form) do not apply. They are omitted from the standard serialization options on this step.

I stumbled on this and find it hard to understand. In practice, it means that for a 'ü' (U+00FC) I get an XML document containing two characters (U+00C3, U+00BC) after p:escape-markup.
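
For illustration only: one way such a pair of characters can arise is when the UTF-8 bytes of 'ü' are decoded as ISO-8859-1. That this is what happened in the pipeline is an assumption, not something confirmed in this thread. A minimal Python sketch of the effect:

```python
# Assumption: the observed pair comes from UTF-8 bytes being misread
# as ISO-8859-1. This only reproduces the code points, not the pipeline.
original = "\u00fc"  # 'ü'

utf8_bytes = original.encode("utf-8")       # two bytes: 0xC3 0xBC
misread = utf8_bytes.decode("iso-8859-1")   # each byte becomes one character

code_points = [f"U+{ord(ch):04X}" for ch in misread]
print(utf8_bytes.hex())  # c3bc
print(code_points)       # ['U+00C3', 'U+00BC']
```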

@dmj dmj commented Jul 11, 2018

...and p:escape-markup followed by p:unescape-markup is not an identity operation.

Does that mean that once you p:escape-markup you can't get the Unicode characters back?

@ndw ndw commented Jul 11, 2018

That actually sounds like a bug to me; I don't see why 'ü' (U+00FC) shouldn't be preserved. (A bug in the implementation, I mean. I don't see anything in the spec that says Unicode characters other than markup characters should be changed.)
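
As a rough sketch of the expected behaviour, here is an escape/unescape round trip in Python. The functions `escape_markup` and `unescape_markup` are hypothetical stand-ins built on `xml.sax.saxutils`; they are not the XProc steps and only mimic the idea that markup characters are escaped while other Unicode characters pass through unchanged.

```python
from xml.sax.saxutils import escape, unescape


def escape_markup(text: str) -> str:
    """Hypothetical stand-in for p:escape-markup: escapes only &, < and >."""
    return escape(text)


def unescape_markup(text: str) -> str:
    """Hypothetical stand-in for p:unescape-markup: reverses escape_markup."""
    return unescape(text)


source = "<p>Gr\u00fc\u00dfe & more</p>"
escaped = escape_markup(source)
restored = unescape_markup(escaped)

print(escaped)             # &lt;p&gt;Grüße &amp; more&lt;/p&gt;
print(restored == source)  # True
```

Under these assumptions the round trip is an identity, and 'ü' is still a single U+00FC character in the escaped string.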

@dmj dmj commented Jul 12, 2018

Yes. The 'ü' must be an error: I tried the same pipeline again and today it works oO. I'll investigate this.

@ndw ndw added the atomic-step label Sep 5, 2018
@ndw ndw transferred this issue from xproc/3.0-specification Nov 1, 2018
@ndw ndw added the editorial label Jun 10, 2019

@xatapult xatapult commented Sep 11, 2019

The intro for p:escape-markup is not only grammatically incorrect but also unclear and confusing (to me). Fix.
