Kazuraki breaks assumptions about hanmen layout #138

Closed
simoncozens opened this issue Jul 23, 2015 · 9 comments

@simoncozens
Member

The Kazuraki font is a calligraphic Japanese font, which contains vertical ligatures. Currently we shape Japanese characters independently, so the ligatures are never triggered. Also some of the vertical spacing is totally wrong.

@alerque
Member

alerque commented Jul 23, 2015

I was just about to mention that.

@simoncozens
Member Author

...and it looks like Harfbuzz won't help me here.

> text = "として"
> SILE.shaper:shapeToken(text, SILE.font.loadDefaults({ font = "Kazuraki SPN", direction = "TTB", language = "ja" }))
{
  {
    codepoint = 1699,
    depth = 0,
    height = 8.81,
    name = "",
    width = 10,
  },
  {
    codepoint = 1682,
    depth = 0,
    height = 13.31,
    name = "",
    width = 10,
  },
  {
    codepoint = 1697,
    depth = 0,
    height = 7.97,
    name = "",
    width = 10,
  },
}

That's meant to be a single ligature, I think.

@simoncozens
Member Author

Ligged and non-ligged variants:
[Image: screen shot 2015-07-23 at 16 32 40]

@simoncozens
Member Author

OK, I just had lunch with Behdad and he explained the problem. To use the Kazuraki font, you also have to add \font[..,features="+liga"], because the font doesn't turn ligatures on by default.

However, because ja.lua shapes each character individually, Harfbuzz will never see enough text to form a ligature. If you don't tell SILE that the text is Japanese, you get ligatured output, but of course this has its own problems.
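
A minimal sketch of why per-character shaping can never produce the ligature, written in C rather than SILE's Lua. The helper names (`utf8_len`, `split_per_character`) are hypothetical and are not part of SILE or HarfBuzz; the sketch only assumes that the text is valid UTF-8 and that each character is handed to the shaper as its own chunk.

```c
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 sequence that starts with lead byte c.
   Malformed lead bytes are treated as one-byte sequences. */
static size_t utf8_len(unsigned char c) {
  if (c < 0x80) return 1;
  if ((c & 0xE0) == 0xC0) return 2;
  if ((c & 0xF0) == 0xE0) return 3;
  if ((c & 0xF8) == 0xF0) return 4;
  return 1;
}

/* Hypothetical helper: split text into one chunk per character, which is
   roughly what shaping each character individually amounts to.
   Stores the starting byte offset of each chunk in offsets (at most max
   entries) and returns the number of chunks. */
size_t split_per_character(const char *text, size_t *offsets, size_t max) {
  size_t n = 0, i = 0, len = strlen(text);
  while (i < len && n < max) {
    offsets[n++] = i;
    i += utf8_len((unsigned char)text[i]);
  }
  return n;
}
```

For "として" this yields three chunks starting at byte offsets 0, 3 and 6. Each chunk reaches the shaper alone, so a ligature substitution that needs all three characters as context has nothing to match against.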

@behdad

behdad commented Jul 25, 2015

> However, because ja.lua shapes each character individually, Harfbuzz will never see enough text to form a ligature. If you don't tell SILE that the text is Japanese, you get ligatured output, but of course this has its own problems.

Not sure what you mean...

Are you telling HarfBuzz that this is vertical layout or not? What "other" problems do you mean?

@simoncozens
Member Author

Hey Behdad! Yes, we're telling Harfbuzz it's vertical. There are two problems:

  1. The Japanese language support module languages/ja.lua applies the JIS X 4051 rules for inter-character spacing. To do this, the string is broken up into individual characters, each character is shaped on its own, and glue and penalties are (optionally) added between each pair of glyphs according to the rules. So the "として" ligature will never be formed, because ja.lua passes the three characters "と", "し" and "て" to the shaper independently. And if Kazuraki didn't exist, that would be fine, because you want to allow for line break opportunities between those characters. To make Kazuraki work perfectly, you would need to essentially implement Japanese hyphenation! i.e. you send "として" to the shaper in one go, but allow the line break algorithm to split it. I am not sure this is an intelligent thing to do for the sake of one font, even one as pretty as Kazuraki.
  2. If you turn off the special ja.lua intercharacter penalty/glue handling and use the default SILE approach of shipping text to the shaper chunked on UCD line breaking data, the ligatures work OK. But something is also wrong with our glyph metric code. In particular, I think the Y-advance of some characters is not big enough.

In this example, there is not enough space between 日 and 本 (本 should be positioned lower, i.e. the advance of 日 is not big enough), nor between か and ら.
[Image: screen shot 2015-07-26 at 21 10 36]

To be honest I think the metric code is suspect. (See this thread on the HB mailing list.) Here is the code from justenoughharfbuzz.c:

void calculate_extents(box* b, hb_glyph_info_t glyph_info, hb_glyph_position_t glyph_pos, FT_Face ft_face, double point_size, hb_direction_t direction) {
  FT_Error error = FT_Load_Glyph(ft_face, glyph_info.codepoint, FT_LOAD_NO_SCALE);
  if (error) return;
  FT_Glyph glyph;
  error = FT_Get_Glyph(ft_face->glyph, &glyph);
  if (error) return;
  FT_BBox ft_bbox;
  FT_Glyph_Get_CBox(glyph, FT_GLYPH_BBOX_UNSCALED, &ft_bbox);
  FT_Fixed advance;
  FT_Get_Advance(ft_face, glyph_info.codepoint, FT_LOAD_NO_SCALE, &advance);
  const FT_Glyph_Metrics *ftmetrics = &ft_face->glyph->metrics;
  b->width = advance * point_size / ft_face->units_per_EM;
  if (direction == HB_DIRECTION_TTB) {
    FT_Get_Advance(ft_face, glyph_info.codepoint, FT_LOAD_NO_SCALE | FT_LOAD_VERTICAL_LAYOUT, &advance);
    b->height = advance * point_size / ft_face->units_per_EM;
    b->depth = 0;
  } else {
    b->height = ft_bbox.yMax * point_size / ft_face->units_per_EM;
    b->depth = -ft_bbox.yMin * point_size / ft_face->units_per_EM;
  }
  FT_Done_Glyph(glyph);
}
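
Every metric in the function above goes through the same conversion from font design units to points, since the glyph is loaded with FT_LOAD_NO_SCALE. A small sketch of that arithmetic in isolation, where `font_units_to_points` is a hypothetical helper name and the 1000-unit em used in the note below is only an assumed example value, not a statement about Kazuraki:

```c
/* Convert a measurement in font design units to points:
   points = units * point_size / units_per_em.
   The casts keep the whole computation in floating point. */
double font_units_to_points(long units, double point_size,
                            unsigned units_per_em) {
  return (double)units * point_size / (double)units_per_em;
}
```

With an assumed 1000-unit em at 10pt, an advance of 1000 units comes out as 10pt. If a vertical advance looks too small, checking the raw unscaled value first helps tell a scaling problem apart from a problem with the advance FreeType reports.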

@khaledhosny
Contributor

I think one way to handle this is to apply the JIS X 4051 spacing rules after shaping, not before it. I think you will need an output-glyph to input-character mapping (which you will need for #110 anyway), as the JIS X 4051 rules are character based, but this can be done using the cluster values from hb_glyph_info_t.
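
A rough sketch of that mapping, under stated assumptions: `glyph_t` is a stand-in for the `cluster` field of hb_glyph_info_t, cluster values are taken to be ascending byte offsets into the UTF-8 input, and `glyph_spans` and `span_t` are hypothetical names. This is not SILE's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the one field of hb_glyph_info_t used here. */
typedef struct {
  uint32_t cluster; /* byte offset of the first input character covered */
} glyph_t;

/* Byte range [start, end) of input text covered by one cluster. */
typedef struct {
  uint32_t start, end;
} span_t;

/* Group shaped glyphs by cluster and record the input byte range of each
   group. Assumes clusters are in ascending order and out has room for n
   entries. Returns the number of spans written. */
size_t glyph_spans(const glyph_t *glyphs, size_t n, uint32_t text_len,
                   span_t *out) {
  size_t count = 0;
  for (size_t i = 0; i < n; i++) {
    /* Glyphs sharing a cluster belong to the span already open. */
    if (count > 0 && out[count - 1].start == glyphs[i].cluster) continue;
    /* A new cluster closes the previous span where this one begins. */
    if (count > 0) out[count - 1].end = glyphs[i].cluster;
    out[count].start = glyphs[i].cluster;
    out[count].end = text_len; /* provisional until the next cluster */
    count++;
  }
  return count;
}
```

With the ligature applied, "として" (9 bytes) comes back as one glyph with cluster 0, giving a single span 0..9 and no internal boundary. Without the ligature, clusters 0, 3 and 6 give three spans. Character-based spacing rules could then be applied only at span boundaries, looking at the last character of one span and the first character of the next.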

@behdad
Copy link

behdad commented Aug 12, 2015

What @khaledhosny said.

@simoncozens
Member Author

Implementing #179 has made this now work!
