Review Segment Break Transformation Rules (CSS Text Level 3) #211

Open
kidayasuo opened this issue May 8, 2020 · 6 comments

Comments

@kidayasuo
Contributor

There are discussions in the CSS WG regarding the Segment Break Transformation Rules:

We would like to review the rules to see whether any issues remain or any areas need further discussion.

@kidayasuo
Contributor Author

kidayasuo commented May 9, 2020

[updated] Updated the data by removing entries that are actually fullwidth versions of a character, and by removing character classes that are inherently non-Japanese (cl-24 to cl-27). This makes the list easier to examine.

List of characters that appear in JLReq but are not space-discarding according to https://drafts.csswg.org/css-text-3/#space-discard-set

NOT_SpaceDiscarding_JLReq_char.txt

@xfq
Member

xfq commented May 9, 2020

There's also w3c/csswg-drafts#5017, which is the new CSS issue for "ambiguous" characters.

@kojiishi
Contributor

kojiishi commented May 9, 2020

The list is very helpful, thank you very much, @kidayasuo! It looks reasonable to me; i.e., the current set of space-discarding Unicode characters is reasonable from the JLREQ perspective. /cc @fantasai

@kidayasuo
Contributor Author

A basic but fundamental question: how much can we expect authors, or editor software that folds lines automatically, to cooperate? At one extreme, we could tell CJ authors to fold lines only between two Kanji. Then we would not need any rule other than "the segment break transformation rule will not insert a space between two Kanji". (Also, these expectations should probably be documented.)
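
To make that extreme concrete, here is a minimal sketch of what such a single rule could look like. It is not the CSS Text Level 3 algorithm; `is_kanji` and `transform_segment_breaks` are hypothetical names, and treating "Kanji" as two CJK Unified Ideographs blocks is an assumption made only for illustration.

```python
import re


def is_kanji(ch: str) -> bool:
    """Hypothetical approximation: only the main CJK Unified Ideographs
    block and Extension A count as Kanji here (an assumption)."""
    cp = ord(ch)
    return 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF


def transform_segment_breaks(text: str) -> str:
    """Drop a line break that sits between two Kanji; turn every other
    line break into a single space. Whitespace collapsing and all other
    processing are deliberately left out of this sketch."""

    def replace(match: re.Match) -> str:
        start, end = match.span()
        before = text[start - 1] if start > 0 else ""
        after = text[end] if end < len(text) else ""
        if before and after and is_kanji(before) and is_kanji(after):
            return ""  # no space inserted between two Kanji
        return " "  # default: the break becomes a space

    return re.sub(r"\r\n|\r|\n", replace, text)


print(transform_segment_breaks("日本\n語"))      # break between two Kanji
print(transform_segment_breaks("hello\nworld"))  # break between Latin letters
print(transform_segment_breaks("です\n。"))      # break between kana and punctuation
```

Under this sketch the first call joins the two Kanji with no space, while the other two insert a space. The third case is the kind of input the rule would leave to authors, who would have to avoid breaking lines there.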

@kojiishi
Contributor

A basic, but fundamental question...

I think that is exactly where this is controversial. I'm in favor of making the rules as simple as possible, because no matter what we do, authors must remember all the rules and adapt to them. @r12a seems to have a similar opinion, if I understand his comment correctly. I see some people arguing that more rules can make it smarter. I agree they help in some cases, but authors must then remember more.

@kidayasuo
Contributor Author

Thank you. I agree we should make the rule easier to remember; in other words, intuitive. It also needs to be reliable, and in that sense I am not very fond of the language-tagging idea, because it is more error-prone.

One small caution: in general, what looks simple and easy to remember to people does not necessarily match what is simple for rule makers. I think we should strive to devise "smart" rules that feel simple to people, that is, to our users.

3 participants