Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make matcher aware of CGJ #554

Closed
behdad opened this issue Oct 4, 2017 · 11 comments
Closed

Make matcher aware of CGJ #554

behdad opened this issue Oct 4, 2017 · 11 comments

Comments

@behdad
Copy link
Member

behdad commented Oct 4, 2017

Many Arabic fonts have GSUB rules to reorder marks. Currently the matcher skips over CGJ. It shouldn't.

@KrasnayaPloshchad
Copy link

So CGJ should be treated the same as other combining characters, isn’t it?

@khaledhosny
Copy link
Collaborator

khaledhosny commented Oct 14, 2017

Do we really want to do that? If the font supports <markA><markB> but the user inserts a CGJ between them because normalization re-orders them, skipping CGJ means the font will just work, while not skipping it means the font will have to explicitly support this case or CGJ will be making things worse.

@behdad
Copy link
Member Author

behdad commented Oct 14, 2017

Right now the matches jumps over CGJ because it's a Default_Ignorable. This bug is exactly about not skipping over it, at least when matching marks.

@behdad
Copy link
Member Author

behdad commented Oct 15, 2017

The desired logic here is quite convoluted. We want matcher to NOT skip a CGJ if and only if removing it would have reordered marks...

I think I can do this by changing the CGJ type to a skippable vs non-skippable during normalization. Ie. turn it into a ZWNJ-like thing if it should not be skipped.

@behdad
Copy link
Member Author

behdad commented Oct 15, 2017

Fortunately we have a bit for that already; that's how Mongolian variation selectors are handled. So I just need to turn that bit on for all or certain CGJ's. A good start is to never skip CGJ. I'll put that in now.

behdad added a commit that referenced this issue Oct 15, 2017
We might want to tweak this some more.  For now, never skipping over
it is better behavior than always skipping.

Part of #554
@khaledhosny
Copy link
Collaborator

With the last commit in master, the third case in w3c/alreq#147 is now broken, but I guess the more refined logic would handle it?

hb-view amiri-regular.ttf --unicodes="0649 0655 034F 0650 062F"

q

@behdad
Copy link
Member Author

behdad commented Oct 15, 2017

With the last commit in master, the third case in w3c/alreq#147 is now broken, but I guess the more refined logic would handle it?

Yes. I'll work on it. Good to have a test.

BTW, why doesn't Amiri have mark2mark attachments?

@khaledhosny
Copy link
Collaborator

BTW, why doesn't Amiri have mark2mark attachments?

It has, but I substitute kasra following hamza with a smaller one and there is no mark to mark attachments between hamza and regular kasra.

@behdad
Copy link
Member Author

behdad commented Oct 22, 2017

I'm releasing 1.6.1. I'm aware that this bug is not fully fixed and 1.6.1 will have different behavior from previous versions, and probably following versions. Just documenting here for posterity.

clrpackages pushed a commit to clearlinux-pkgs/harfbuzz that referenced this issue Oct 25, 2017
…1.6.1

Behdad Esfahbod (39):
      Never skip over CGJ
      [util] Include hb-private.hh
      Add polyfill for static_assert and nullptr
      Use static_assert instead of custom ASSERT_STATIC
      Don't use NULL in public headers
      Use nullptr instead of NULL
      Use C++11
      Fix warnings
      Try fixing AppVeyor bots
      Move set-digests into their own header file
      Remove unused hb_frozen_set_t
      Remove unused hb_cache_t
      Add popcount for 64bit ints
      Inline another bsearch()
      Add bfind() to prealloaced_array_t
      Deprecate hb_set_invert()
      Faster hb_set_t
      [set] Add fallback implementation of int-vector type
      Properly detect vector_size attribute and use fallback otherwise
      Simplify hb_prealloced_array_t initialization
      [set] Speed up intersects()
      [set] Remove TODO items not worth pursuing
      Fix bots
      [set] Disable vectorization
      Ouch.
      [set] Fix merge logic
      [coretext] Fix build
      Try fixing build on VC
      Update docs symbols
      [bsearch] Micro-optimization
      [myanmar] Fix unsafe usage of FLAG_SAFE()
      Remove FLAG_SAFE()
      Another try at fixing build bots
      [docs] Deprecate hb_set_invert()
      Another try at fixing mingw32 build bot fail
      Merge commit 'bfe0faf1a2d39302129a7202994456afd96694ca'
      Another try at fixing mingw32 build bot fail
      Merge commit '3ee15a60358f4d894bbf2431d7a7df38b7acc4ce'
      1.6.1

Chun-wei Fan (3):
      CMake builds: Fix builds
      CMake: Support building HarfBuzz-GObject
      CMake builds: Support introspection builds

Fredrik Roubert (1):
      Switch from ICU deprecated unorm_normalize to unorm2_normalize. (#569)

ebraminio (1):
      Merge pull request #572 from fanc999/master.msvc

Overview of changes leading to 1.6.1
Sunday, October 22nd, 2017
====================================

- Don't skip over COMBINING GRAPHEME JOINER when ligating, etc.
  To be refined: harfbuzz/harfbuzz#554
- Faster hb_set_t implementation.
- Don't use deprecated ICU API.
- Fix undefined-behavior in Myanmar shaper, introduced in 1.6.0
- Deprecated API:
  hb_set_invert()

commit 740fdbcd0e6d25c1d6f12537ca2aa559650b9d52
Author: jfkthame <jfkthame@gmail.com>
Date:   Mon Apr 3 12:22:39 2017 +0100

    avoid UBSan warning in get_stage_lookups (#450)

    See https://bugzilla.mozilla.org/show_bug.cgi?id=1336600

 src/hb-ot-map-private.hh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

commit 8d256841ca7462fd596329abf6f71bafb56fd621
Author: Dominik Schloesser <dsc@dosc.net>
Date:   Sun Mar 26 09:22:34 2017 +0200

(NEWS truncated at 15 lines)
@KrasnayaPloshchad
Copy link

KrasnayaPloshchad commented Nov 1, 2017

@khaledhosny What happened if you try to workaround for that? For example, add a zero-width glyph and mark attachments for CGJ into Amiri.

@behdad
Copy link
Member Author

behdad commented Nov 1, 2017

We don't want workarounds.

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Nov 23, 2017
Overview of changes leading to 1.7.1
Tuesday, November 14, 2017
====================================

- Fix atexit object destruction regression.
- Fix minor integer-overflow.


Overview of changes leading to 1.7.0
Monday, November 13, 2017
====================================

- Minor Indic fixes.
- Implement kerning and glyph names in hb-ot-font.
- Various DSO optimization re .data and .bss sizes.
- Make C++11 optional; build fixes.
- Mark all other backends "unsafe-to-break".
- Graphite fix.


Overview of changes leading to 1.6.3
Thursday, October 26th, 2017
====================================

- Fix hb_set_t some more.  Should be solid now.
- Implement get_glyph_name() for hb-ot-font.
- Misc fixes.


Overview of changes leading to 1.6.2
Monday, October 23nd, 2017
====================================

- Yesterday's release had a bad crasher; don't use it.  That's what
  happens when one works on Sunday...
  harfbuzz/harfbuzz#578
- Build fixes for FreeBSD and Chrome Android.


Overview of changes leading to 1.6.1
Sunday, October 22nd, 2017
====================================

- Don't skip over COMBINING GRAPHEME JOINER when ligating, etc.
  To be refined: harfbuzz/harfbuzz#554
- Faster hb_set_t implementation.
- Don't use deprecated ICU API.
- Fix undefined-behavior in Myanmar shaper, introduced in 1.6.0
- Deprecated API:
  hb_set_invert()


Overview of changes leading to 1.6.0
Friday, October the 13th, 2017
====================================

- Update to Unicode 10.

- Various Indic and Universal Shaping Engine fixes as a result of
  HarfBuzz Hackfest with Jonathan Kew at Web Engines Hackfest at
  the Igalia offices in A Coruña, Spain.  Thanks Igalia for having
  us!

- Implement Unicode Arabic Mark Ordering Algorithm UTR#53.

- Implement optical sizing / tracking in CoreText backend, using
  new API hb_font_set_ptem().

- Allow notifying hb_font_t that underlying FT_Face changed sizing,
  using new API hb_ft_font_changed().

- More Graphite backend RTL fixes.

- Fix caching of variable font shaping plans.

- hb-view / hb-shape now accept following new arguments:

  o --unicodes: takes a list of hex numbers that represent Unicode
    codepoints.

New API:
+hb_face_get_table_tags()
+hb_font_set_ptem()
+hb_font_get_ptem()
+hb_ft_font_changed()
@behdad behdad closed this as completed Jan 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants