Use specialized line break strategy for Chinese and Japanese #1206
Comments
At first I thought it would be possible to patch the method in Prawn that scans for line break opportunities. However, that logic is very difficult to override (not to mention understand). A simpler approach, which @chloerei proposed, is to modify the string being typeset by inserting zero-width spaces at line break opportunities. This tells Prawn where it can break the line. While that may not be the most elegant solution, it gives us something we can use today and leaves room for better solutions to come along, including a fix in Prawn itself. Here's the crux of that logic:
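As a rough illustration of the zero-width-space idea (a minimal sketch, not the code that landed in Asciidoctor PDF), the string can be rewritten before it is handed to the typesetter. The names `ZERO_WIDTH_SPACE`, `CJK_CHAR`, and `insert_break_opportunities` are hypothetical, and treating "CJK character" as Han, Hiragana, or Katakana is an assumption made for this example.

```ruby
# Zero-width space (U+200B): invisible, but a layout engine can break a line at it.
ZERO_WIDTH_SPACE = "\u200b"

# Assumption: a "CJK character" is any Han ideograph, Hiragana, or Katakana character.
# Kept as a single-quoted string so it can be reused inside the regex below.
CJK_CHAR = '[\p{Han}\p{Hiragana}\p{Katakana}]'

# Zero-length match at every boundary that has a CJK character on both sides.
BETWEEN_CJK = /(?<=#{CJK_CHAR})(?=#{CJK_CHAR})/

# Hypothetical helper: returns a copy of text with a zero-width space inserted
# between every pair of adjacent CJK characters. Non-CJK text is left untouched.
def insert_break_opportunities(text)
  text.gsub(BETWEEN_CJK, ZERO_WIDTH_SPACE)
end
```

Because the inserted character has no width, the rendered text should look the same; only the set of places where a line may wrap changes. Latin words keep breaking at ordinary spaces.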
… lines between any two CJK characters
cc: @diguage
While we can proceed with this workaround for Asciidoctor PDF users, this issue really needs to be filed upstream in Prawn for a long-term, proper fix.
I have one question. Is it normal to put spaces between English and Chinese or Japanese characters when mixing the languages? I've seen it done both ways and I'm just curious whether it's a rule or a stylistic choice.
It's just style. FYI https://chinese.stackexchange.com/questions/31746/spacing-guidelines-for-modern-chinese-writing
Thanks!
Thank you for bringing the CJK line-break fixes upstream.
Fix chloerei/asciidoctor-pdf-cjk#4 See asciidoctor#1206 also for details
Fix chloerei/asciidoctor-pdf-cjk#4. Also see asciidoctor#1206 for details.
Require fileutils explicitly to fix the following error when running `rake spec`:
    An error occurred in a `before(:suite)` hook.
    Failure/Error: FileUtils.mkdir_p output_dir
    NameError: uninitialized constant FileUtils
    Did you mean? FileTest
    # ./spec/spec_helper.rb:187:in `block (2 levels) in <top (required)>'
…s `cjk` (#1355) break CJK characters in table when scripts attribute is `cjk` A follow-up to #1206. Also chloerei/asciidoctor-pdf-cjk#4.
Line break rules for Latin-based languages such as English and French are also being applied to Chinese and Japanese. However, Chinese and Japanese don't use spaces (at least not in the same way). Latin-based languages have spaces between words, and line breaks occur at those spaces. Chinese and Japanese are written without spaces, and a line break can occur between any two characters. Chinese and Japanese also use different punctuation for pause, full stop, and dash. These need to be taken into account.
While this isn't much of a problem when the text is written exclusively in a CJK language (since the line break will be forced once the line is full), it becomes a problem when the text is mixed with another language such as English. All of a sudden, huge gaps appear because each run of CJK characters gets treated as a single "word".
Here's an example:
When rendered with Asciidoctor PDF, huge gaps appear in the line. This can be partially mitigated by changing the text alignment from left to justify, but then the gaps are just shifted to the end of the line.
The correct fix is to allow a line break between any two CJK characters, as long as one of the characters is not punctuation.
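The rule can be sketched as a predicate over a pair of adjacent characters. This is an illustration under stated assumptions, not the Asciidoctor PDF implementation: it reads the rule as "both characters are CJK and neither is punctuation", it counts the CJK Symbols and Punctuation block (U+3000 to U+303F) and the fullwidth forms block (U+FF00 to U+FFEF) as CJK, and the names `cjk?`, `punctuation?`, `break_allowed?`, and `break_positions` are hypothetical.

```ruby
# Assumption: CJK covers Han, Hiragana, Katakana, the CJK Symbols and
# Punctuation block, and the halfwidth/fullwidth forms block.
CJK_RX = /[\p{Han}\p{Hiragana}\p{Katakana}\u3000-\u303f\uff00-\uffef]/

# Any character in a Unicode punctuation category (e.g. 。 、 「 」 ，).
PUNCTUATION_RX = /\p{P}/

def cjk?(char)
  char.match?(CJK_RX)
end

def punctuation?(char)
  char.match?(PUNCTUATION_RX)
end

# A break between left and right is allowed only when both characters are CJK
# and neither one is punctuation (this reading of the rule is an assumption).
def break_allowed?(left, right)
  cjk?(left) && cjk?(right) && !punctuation?(left) && !punctuation?(right)
end

# Character indexes at which a line break may be placed, i.e. a break is
# allowed immediately before the character at each returned index.
def break_positions(text)
  chars = text.chars
  (1...chars.length).select { |i| break_allowed?(chars[i - 1], chars[i]) }
end
```

For a string such as "日本語。Ruby", this sketch allows a break before 本 and before 語 but not before the full stop 。, so the punctuation stays attached to the character it follows.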
To activate this specialized logic, the author must set the `scripts` attribute in the document header to `cjk`.
Related issue: #82.