U+E007F is reinstated to non-deprecated since Unicode 9.0 #2469

Closed
rdeltour opened this issue Oct 25, 2022 · 8 comments · Fixed by #2476
Labels
EPUB33 Issues addressed in the EPUB 3.3 revision i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. Spec-EPUB3 The issue affects the core EPUB 3.3 Recommendation Topic-OCF The issue affects the OCF section of the core EPUB 3 specification

Comments

@rdeltour
Member

EPUB OCF says U+E007F is disallowed as one of the two deprecated characters in the Tags and Variation Selectors Supplement.

But U+E007F CANCEL TAG was reinstated as non-deprecated in Unicode 9.0; see the change history for the Unicode Character Database:

The stateful tag terminator U+E007F CANCEL TAG, formerly deprecated, was reinstated to non-deprecated, for use in emoji contexts.

See also the up-to-date list of deprecated characters in the latest UCD PropList.txt file (search for "Deprecated").

@rdeltour
Member Author

Referencing previous discussion around this in #1885 #1899

@mattgarrish mattgarrish added Topic-OCF The issue affects the OCF section of the core EPUB 3 specification Spec-EPUB3 The issue affects the core EPUB 3.3 Recommendation labels Oct 31, 2022
@mattgarrish
Member

Ya, it seems we ended up with E007F deprecated despite wanting to allow emoji sequences...

I wonder if we can remove that bullet to avoid the redundancy of restricting each code point that Unicode already deprecates (and the future maintenance that entails). Maybe we can use the file you've referenced, @rdeltour, to create a new bullet at the end of the list, like:

Thoughts @iherman @xfq @r12a ?

@iherman
Member

iherman commented Nov 1, 2022

For someone who has never looked at a Unicode listing closely... @mattgarrish I presume you refer to these lines in the file you referred to:

0149          ; Deprecated # L&       LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
0673          ; Deprecated # Lo       ARABIC LETTER ALEF WITH WAVY HAMZA BELOW
0F77          ; Deprecated # Mn       TIBETAN VOWEL SIGN VOCALIC RR
0F79          ; Deprecated # Mn       TIBETAN VOWEL SIGN VOCALIC LL
17A3..17A4    ; Deprecated # Lo   [2] KHMER INDEPENDENT VOWEL QAQ..KHMER INDEPENDENT VOWEL QAA
206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
2329          ; Deprecated # Ps       LEFT-POINTING ANGLE BRACKET
232A          ; Deprecated # Pe       RIGHT-POINTING ANGLE BRACKET
E0001         ; Deprecated # Cf       LANGUAGE TAG

Maybe it is worth making a note on how to read that reference...
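
As a rough illustration of how to read that listing: each data line is a code point (or a `start..end` range) in hexadecimal, then `;`, then the property name, and everything after `#` is a comment giving the general category, the range size, and the character name(s). The sketch below parses the lines quoted above and checks individual code points against them. The names `PROPLIST_EXCERPT`, `parse_property`, and `is_deprecated` are made up for this example, and the excerpt is just the text pasted in this comment, not a live copy of PropList.txt.

```python
import re

# Excerpt copied from the listing quoted in this thread (PropList.txt format).
PROPLIST_EXCERPT = """\
0149          ; Deprecated # L&       LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
0673          ; Deprecated # Lo       ARABIC LETTER ALEF WITH WAVY HAMZA BELOW
0F77          ; Deprecated # Mn       TIBETAN VOWEL SIGN VOCALIC RR
0F79          ; Deprecated # Mn       TIBETAN VOWEL SIGN VOCALIC LL
17A3..17A4    ; Deprecated # Lo   [2] KHMER INDEPENDENT VOWEL QAQ..KHMER INDEPENDENT VOWEL QAA
206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
2329          ; Deprecated # Ps       LEFT-POINTING ANGLE BRACKET
232A          ; Deprecated # Pe       RIGHT-POINTING ANGLE BRACKET
E0001         ; Deprecated # Cf       LANGUAGE TAG
"""

# "XXXX" or "XXXX..YYYY", then ";", then the property name.
LINE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def parse_property(text, prop):
    """Return inclusive (start, end) code point ranges listed for `prop`."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        match = LINE_RE.match(data)
        if match and match.group(3) == prop:
            start = int(match.group(1), 16)
            end = int(match.group(2) or match.group(1), 16)
            ranges.append((start, end))
    return ranges


def is_deprecated(code_point, ranges):
    """True if `code_point` falls inside any of the parsed ranges."""
    return any(lo <= code_point <= hi for lo, hi in ranges)


ranges = parse_property(PROPLIST_EXCERPT, "Deprecated")
print(is_deprecated(0xE0001, ranges))  # LANGUAGE TAG
print(is_deprecated(0xE007F, ranges))  # CANCEL TAG
```

With the excerpt above, U+E0001 is reported as deprecated and U+E007F is not, which matches the point of this issue.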

@mattgarrish
Member

I presume you refer to these lines in the file you referred to:

Right, I presume that list is all of them. I checked some of the other files but it appears the deprecated ones have been consolidated there.

It would have been nice if there were an HTML equivalent with a direct link, but searching around I couldn't find one. If there's another reference we could use, though...

The alternative, of course, is we say nothing about deprecated code points and assume that epubcheck should be warning about them, because, well, they're already deprecated by the official standard. That would be even better.
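
For the sake of discussion, a checker-side warning of that kind could look roughly like the sketch below. This is only an assumption about the shape of such a check, not how epubcheck actually implements anything; `DEPRECATED_RANGES` is hard-coded from the lines quoted earlier in this thread, and `find_deprecated` is a hypothetical helper name.

```python
# Inclusive code point ranges, hard-coded from the "Deprecated" lines
# quoted earlier in this thread (assumed to be the complete list).
DEPRECATED_RANGES = [
    (0x0149, 0x0149),
    (0x0673, 0x0673),
    (0x0F77, 0x0F77),
    (0x0F79, 0x0F79),
    (0x17A3, 0x17A4),
    (0x206A, 0x206F),
    (0x2329, 0x2329),
    (0x232A, 0x232A),
    (0xE0001, 0xE0001),
]


def find_deprecated(name):
    """Return (index, "U+XXXX") for each deprecated character in `name`."""
    hits = []
    for index, char in enumerate(name):
        code_point = ord(char)
        if any(lo <= code_point <= hi for lo, hi in DEPRECATED_RANGES):
            hits.append((index, f"U+{code_point:04X}"))
    return hits


# A name containing U+E0001 LANGUAGE TAG would be flagged ...
print(find_deprecated("chapter\U000E0001.xhtml"))
# ... while one containing U+E007F CANCEL TAG would not.
print(find_deprecated("flag\U000E007F.png"))
```

The design point is that the list lives in one place and can be regenerated from the Unicode data, so the spec would not need to enumerate individual code points.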

@iherman
Member

iherman commented Nov 1, 2022

I think that, spec-wise, we should keep this in the spec. It would be strange if epubcheck defined the spec...

@mattgarrish
Member

Ya, but it's back to that basic question we've bumped into a couple of times now of whether we need to restrict people from using things that are already deprecated by their respective specifications. Epubcheck would only be reporting what Unicode defines.

But I'm fine either way.

@iherman
Member

iherman commented Nov 1, 2022

Ya, but it's back to that basic question we've bumped into a couple of times now of whether we need to restrict people from using things that are already deprecated by their respective specifications. Epubcheck would only be reporting what Unicode defines.

But I'm fine either way.

That is also correct...

We could also consider an approach whereby, instead of the bullet point in the normative text as above, we add a note whereby authors should also abide by any restrictions dictated by Unicode (who knows, they may come up, at some point, with a different notion than "deprecated"), and give deprecation as an example.

We can also toss a coin. :-)

@r12a r12a added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Nov 1, 2022
@mattgarrish
Member

a note whereby authors should also abide by any restrictions dictated by Unicode

Ya, I like this approach. I'll see what I can come up with.

@mattgarrish mattgarrish added the EPUB33 Issues addressed in the EPUB 3.3 revision label Dec 6, 2022