Improving @-font rotatable character choice #95

Open
GoogleCodeExporter opened this issue Dec 30, 2015 · 5 comments

Comments

@GoogleCodeExporter

@-fonts are used for vertical writing. They work by pre-rotating, by ninety
degrees, the characters that should always stay upright, in preparation for
the entire line being rotated by ninety degrees in the opposite direction.

Currently libass decides which characters to rotate by simply thresholding
their Unicode codepoints. Anything below U+02F1 (#define VERTICAL_LOWER_BOUND
in ass_font.h) is left alone, while everything else is rotated.
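
As a minimal sketch of the rule just described (not the actual libass code): only VERTICAL_LOWER_BOUND and its value come from this report; the helper name rotated_by_threshold is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold quoted above from ass_font.h. */
#define VERTICAL_LOWER_BOUND 0x02f1

/* Hypothetical helper: codepoints below the bound are left alone,
 * everything else is rotated. */
static bool rotated_by_threshold(uint32_t codepoint)
{
    return codepoint >= VERTICAL_LOWER_BOUND;
}
```

With this rule, U+0146 (ņ) is left alone but U+0432 (в) is rotated.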

Of course, what libass is trying to do is emulate the behaviour of VSFilter.
But VSFilter doesn't simply rotate all codepoints above a threshold!
I tried the following in xy-VSFilter on Windows 7:

    {\fn@MS Mincho\frz-90}よ‼ņfi«‹вα←⒜—ー

(note: א is rendered in a different font)
and got the following results:

    Original | Substituted | Rotated? | Codepoint | Category | Width
           よ |             |      yes |    U+3088 |       Lo |     W
           ‼ |             |      yes |    U+203C |       Po |     N
           ņ |             |       no |    U+0146 |       Ll |     N
           fi |             |       no |    U+FB01 |       Ll |     N
           « |             |       no |    U+00AB |       Pi |     N
           ‹ |             |      yes |    U+2039 |       Pi |     N
           в |             |       no |    U+0432 |       Ll |     A
           α |             |       no |    U+03B1 |       Ll |     A
           ← |     upward* |       no |    U+2190 |       Sm |     A
           ⒜ |             |      yes |    U+249C |       So |     A
           — |             |      yes |    U+2014 |       Pd |     A
           ー |    vertical |      yes |    U+30FC |       Lm |     W
    * and since it's then rotated by -90 degrees along with the baseline,
      ultimately it appears to point to the right

As I understand it, responsibility for VSFilter's behaviour here lies with GDI.
So I tried to search for a description of how GDI does this, and I failed to
find one. Then I tried to disassemble a GDI function that supposedly decides
whether to rotate glyphs based on an input parameter, but I couldn't find
anything relevant in the code I got (which is long and seemingly complicated).
Thanks to Google, I even looked at the implementation of the same function in
ReactOS... but it didn't seem to take @-fonts into account, and I suspect their
font system may not implement them at all.

It seems further investigation is needed. I think it may be useful to test many
codepoints one by one in Windows and try to extract rules from the gathered
statistics. For example, in the above (admittedly very small) sample,
all wide characters and none of the Ll characters were rotated.
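
To make that kind of rule extraction concrete, here is a small sketch that encodes the twelve observations from the table and checks the two patterns mentioned, plus how often the current threshold rule disagrees with them. The struct, array and function names are hypothetical; the data is copied from the table above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VERTICAL_LOWER_BOUND 0x02f1

struct observation {
    uint32_t codepoint;
    bool rotated;          /* observed in xy-VSFilter with @MS Mincho */
    const char *category;  /* Unicode general category */
    char width;            /* width column from the table: W, N or A */
};

static const struct observation SAMPLE[] = {
    { 0x3088, true,  "Lo", 'W' },
    { 0x203C, true,  "Po", 'N' },
    { 0x0146, false, "Ll", 'N' },
    { 0xFB01, false, "Ll", 'N' },
    { 0x00AB, false, "Pi", 'N' },
    { 0x2039, true,  "Pi", 'N' },
    { 0x0432, false, "Ll", 'A' },
    { 0x03B1, false, "Ll", 'A' },
    { 0x2190, false, "Sm", 'A' },
    { 0x249C, true,  "So", 'A' },
    { 0x2014, true,  "Pd", 'A' },
    { 0x30FC, true,  "Lm", 'W' },
};
#define SAMPLE_LEN (sizeof SAMPLE / sizeof SAMPLE[0])

/* True if every wide (W) character in the sample was rotated. */
static bool all_wide_rotated(void)
{
    for (size_t i = 0; i < SAMPLE_LEN; i++)
        if (SAMPLE[i].width == 'W' && !SAMPLE[i].rotated)
            return false;
    return true;
}

/* True if no Ll character in the sample was rotated. */
static bool no_ll_rotated(void)
{
    for (size_t i = 0; i < SAMPLE_LEN; i++)
        if (strcmp(SAMPLE[i].category, "Ll") == 0 && SAMPLE[i].rotated)
            return false;
    return true;
}

/* Number of sample characters where the threshold rule disagrees
 * with what was observed. */
static size_t threshold_mismatches(void)
{
    size_t n = 0;
    for (size_t i = 0; i < SAMPLE_LEN; i++)
        if ((SAMPLE[i].codepoint >= VERTICAL_LOWER_BOUND) != SAMPLE[i].rotated)
            n++;
    return n;
}
```

On this sample the threshold rule disagrees with the observed behaviour for four characters: U+FB01, U+0432, U+03B1 and U+2190.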

Original issue reported on code.google.com by chortos@inbox.lv on 25 Feb 2013 at 2:02

@GoogleCodeExporter
Author

Whoops, forgot to remove the note about aleph after removing aleph itself.

Original comment by chortos@inbox.lv on 25 Feb 2013 at 2:03

@GoogleCodeExporter
Author

While researching how the @font feature of GDI works, I noticed that Wine 
rotates all characters over a certain codepoint, so that's what I implemented 
as well.

What GDI probably does (I haven't tested) is check the Unicode script
property. It should be easy to query HarfBuzz's character database (which is
likely UCDN anyway :)) and only rotate CJK characters.

Original comment by g...@chown.ath.cx on 3 Mar 2013 at 10:08

@rcombs
Contributor

rcombs commented Aug 5, 2023

The check is, as far as I can tell:

  • If the character is in any of these blocks, it's rotated:
    • 0x4E00~0x9FFF (CJK UNIFIED IDEOGRAPHS)
    • 0x3040~0x309F (HIRAGANA)
    • 0x30A0~0x30FF (KATAKANA)
    • 0xAC00~0xD7A3 (HANGUL, obsolete)
  • If the character exists in the "font code page" and takes up 2 bytes there, it's rotated
    • For Type-1 conversions, this is based on ulCodePageRange1, with preference for:
      • JIS/Japan, CP932
      • Traditional Chinese, CP950
      • Simplified Chinese, CP936
      • Wansung, CP949
      • If none match, CP1252
    • Otherwise, it's based on what characters are present:
      • If halfwidth katakana アイウエオ all present, CP932
      • If U+61D4 and U+9EE2, CP936
      • If U+9F98 and U+9F79, CP950
      • If Hangul "ga" and "ha", CP949
      • If U+E000 (PUA) and the current ANSI code page is a DBCS, current ANSI code page
      • Otherwise, CP1252
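
A rough sketch of the checks listed above, under several assumptions that do not come from the comment itself: the ulCodePageRange1 bit numbers (17, 18, 19, 20) are taken from the OpenType OS/2 table definition; "halfwidth katakana アイウエオ" is read as U+FF71..U+FF75; Hangul "ga" and "ha" are read as U+AC00 and U+D558; and the presence checks are assumed to run in the listed order. All function names are hypothetical, font coverage is modelled as a plain array of codepoints, and the "takes up 2 bytes in the font code page" lookup is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Blocks that are always rotated, per the list above. */
static bool in_always_rotated_block(uint32_t cp)
{
    return (cp >= 0x4E00 && cp <= 0x9FFF) ||  /* CJK Unified Ideographs */
           (cp >= 0x3040 && cp <= 0x309F) ||  /* Hiragana */
           (cp >= 0x30A0 && cp <= 0x30FF) ||  /* Katakana */
           (cp >= 0xAC00 && cp <= 0xD7A3);    /* Hangul */
}

/* Pick a code page from OS/2 ulCodePageRange1, in the listed preference order. */
static int codepage_from_range1(uint32_t range1)
{
    if (range1 & (UINT32_C(1) << 17)) return 932;  /* JIS/Japan */
    if (range1 & (UINT32_C(1) << 20)) return 950;  /* Chinese: Traditional */
    if (range1 & (UINT32_C(1) << 18)) return 936;  /* Chinese: Simplified */
    if (range1 & (UINT32_C(1) << 19)) return 949;  /* Korean Wansung */
    return 1252;
}

/* Hypothetical coverage test: linear search in a list of covered codepoints. */
static bool font_has(const uint32_t *covered, size_t n, uint32_t cp)
{
    for (size_t i = 0; i < n; i++)
        if (covered[i] == cp)
            return true;
    return false;
}

/* Guess the code page from which characters are present (assumed order). */
static int codepage_from_coverage(const uint32_t *covered, size_t n,
                                  int ansi_cp, bool ansi_is_dbcs)
{
    bool kana = true;
    for (uint32_t cp = 0xFF71; cp <= 0xFF75; cp++)  /* assumed: halfwidth a..o */
        kana = kana && font_has(covered, n, cp);
    if (kana) return 932;
    if (font_has(covered, n, 0x61D4) && font_has(covered, n, 0x9EE2)) return 936;
    if (font_has(covered, n, 0x9F98) && font_has(covered, n, 0x9F79)) return 950;
    if (font_has(covered, n, 0xAC00) && font_has(covered, n, 0xD558)) return 949;
    if (font_has(covered, n, 0xE000) && ansi_is_dbcs) return ansi_cp;
    return 1252;
}

/* Final decision; the caller supplies the double-byte lookup result. */
static bool should_rotate(uint32_t cp, bool two_bytes_in_font_codepage)
{
    return in_always_rotated_block(cp) || two_bytes_in_font_codepage;
}
```

Passing the double-byte result in as a flag keeps the sketch free of code page tables; a real implementation would need a mapping for each of the code pages above.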

So, it's font- and potentially locale-specific! Fun. It's really just trying to determine "is this character fullwidth", poorly, and rotating anything that is.

When rotation is enabled (per above), alternates are selected via the mort or GSUB tables, using the vert feature from GSUB.

@astiob
Contributor

astiob commented Aug 5, 2023

Yeah, that’s starting to sound like what I’ve seen when I tried to dig in. (You probably went deeper than I did; thanks!) And why I’ve been worrying that HarfBuzz’s writing mode may not be the solution.

@khaledhosny

There is also the Unicode Vertical_Orientation property, but you probably want to be GDI-compatible, which might not exactly match this property.
