@gredler gredler commented Feb 12, 2025

When a string contains zero-width characters, LineBreakMeasurer calculates line breaks incorrectly.
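
A minimal reproduction sketch, not the test attached to this PR: it measures a plain string, interleaves a zero-width character (U+00AD here), and asks `LineBreakMeasurer` for the first break at the measured width plus a little slack. The class name, the helper names, the sample string, and the 2-pixel slack are all assumptions for illustration. If the zero-width characters contribute no advance, one would expect the whole interleaved string to fit on the first line.

```java
import java.awt.Font;
import java.awt.font.FontRenderContext;
import java.awt.font.LineBreakMeasurer;
import java.awt.font.TextAttribute;
import java.awt.font.TextLayout;
import java.text.AttributedString;

public class ZeroWidthBreakSketch {

    // Hypothetical helper: inserts zw between every pair of adjacent chars in base.
    public static String interleave(String base, char zw) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < base.length(); i++) {
            if (i > 0) {
                sb.append(zw);
            }
            sb.append(base.charAt(i));
        }
        return sb.toString();
    }

    // Returns the offset at which LineBreakMeasurer would end the first line.
    public static int firstBreak(String text, Font font, float wrappingWidth) {
        FontRenderContext frc = new FontRenderContext(null, true, true);
        AttributedString as = new AttributedString(text);
        as.addAttribute(TextAttribute.FONT, font);
        LineBreakMeasurer lbm = new LineBreakMeasurer(as.getIterator(), frc);
        return lbm.nextOffset(wrappingWidth);
    }

    public static void main(String[] args) {
        Font font = new Font(Font.DIALOG, Font.PLAIN, 20);
        FontRenderContext frc = new FontRenderContext(null, true, true);
        String base = "AAAA";
        // Width of the visible characters only, plus an assumed 2px of slack.
        float width = new TextLayout(base, font, frc).getAdvance() + 2f;
        String text = interleave(base, '\u00AD');
        // An offset smaller than text.length() suggests the zero-width
        // characters were given a non-zero advance.
        System.out.println("length=" + text.length()
                + " nextOffset=" + firstBreak(text, font, width));
    }
}
```

The result depends on the platform and on the font that Dialog resolves to, so no expected output is given here.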

The root cause appears to be that LineBreakMeasurer eventually calls into StandardGlyphVector.getGlyphInfo(), which derives the glyph advances from the glyph IDs. However, HarfBuzz's default treatment of zero-width characters is to provide the glyph ID of the space character (U+0020) combined with an artificial zero advance (not the font's space glyph advance). Unaware of HarfBuzz's sleight of hand, StandardGlyphVector.getGlyphInfo() retrieves the actual advances of the space glyph (since that was the glyph ID returned) and provides these back up the call chain to LineBreakMeasurer et al.

I think the correct fix is to use hb_buffer_set_invisible_glyph to register 0xFFFF as the invisible glyph ID with HarfBuzz (matching CharToGlyphMapper.INVISIBLE_GLYPH_ID).
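
To make the mechanism concrete, here is a toy model in plain Java. It is not JDK or HarfBuzz code: the glyph IDs for the space and letter glyphs, the advances, and the `shape` / `widthFromGlyphIds` helpers are hypothetical. Only the value 0xFFFF comes from this PR (it matches CharToGlyphMapper.INVISIBLE_GLYPH_ID). The model shows why re-deriving advances from glyph IDs goes wrong when zero-width characters are reported as the space glyph, and why a dedicated invisible glyph ID avoids the problem.

```java
public class InvisibleGlyphModel {

    public static final int SPACE_GLYPH = 3;          // hypothetical ID of the U+0020 glyph
    public static final int LETTER_GLYPH = 36;        // hypothetical ID used for every other char
    public static final int INVISIBLE_GLYPH = 0xFFFF; // same value as CharToGlyphMapper.INVISIBLE_GLYPH_ID

    public static final float SPACE_ADVANCE = 4.0f;   // hypothetical advances
    public static final float LETTER_ADVANCE = 9.0f;

    // The zero-width characters mentioned in this thread.
    public static boolean isZeroWidth(char c) {
        return c == '\u00AD' || c == '\u200F' || c == '\u2067';
    }

    // Toy shaper: zero-width chars are reported with the given substitute glyph ID.
    public static int[] shape(String text, int substituteGlyph) {
        int[] glyphs = new int[text.length()];
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isZeroWidth(c)) {
                glyphs[i] = substituteGlyph;
            } else {
                glyphs[i] = (c == ' ') ? SPACE_GLYPH : LETTER_GLYPH;
            }
        }
        return glyphs;
    }

    // Consumer that only sees glyph IDs and looks the advances up again,
    // treating the invisible glyph ID as having no advance.
    public static float widthFromGlyphIds(int[] glyphs) {
        float width = 0f;
        for (int g : glyphs) {
            if (g == INVISIBLE_GLYPH) {
                continue;
            }
            width += (g == SPACE_GLYPH) ? SPACE_ADVANCE : LETTER_ADVANCE;
        }
        return width;
    }

    public static void main(String[] args) {
        String text = "A\u200FB";
        // Zero-width char reported as the space glyph: its real advance sneaks back in.
        System.out.println(widthFromGlyphIds(shape(text, SPACE_GLYPH)));     // prints 22.0
        // Zero-width char reported as the invisible glyph: no extra advance.
        System.out.println(widthFromGlyphIds(shape(text, INVISIBLE_GLYPH))); // prints 18.0
    }
}
```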

I haven't seen any unwanted side effects, but there is a risk, since this is changing the global HarfBuzz configuration.

For more information on HarfBuzz's behavior in this area, see: https://harfbuzz.github.io/setting-buffer-properties.html


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8270265: LineBreakMeasurer calculates incorrect line breaks with zero-width characters (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23603/head:pull/23603
$ git checkout pull/23603

Update a local copy of the PR:
$ git checkout pull/23603
$ git pull https://git.openjdk.org/jdk.git pull/23603/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23603

View PR using the GUI difftool:
$ git pr show -t 23603

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23603.diff

bridgekeeper bot commented Feb 12, 2025

👋 Welcome back dgredler! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Feb 12, 2025

@gredler This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8270265: LineBreakMeasurer calculates incorrect line breaks with zero-width characters

Reviewed-by: achung, prr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 307 new commits pushed to the master branch:

  • cd9f1d3: 8286204: [Accessibility,macOS,VoiceOver] VoiceOver reads the spinner value 10 as 1 when user iterates to 10 for the first time on macOS
  • 4cf6316: 8351414: C2: MergeStores must happen after RangeCheck smearing
  • 8a5ed47: 8350148: Native stack overflow when writing Java heap objects into AOT cache
  • 5928209: 8347405: MergeStores with reverse bytes order value
  • f984c2b: 8351505: (fs) Typo in the documentation of java.nio.file.spi.FileSystemProvider.getFileSystem()
  • ffa6340: 8351567: Jar Manifest test ValueUtf8Coding produces misleading diagnostic output
  • 8d8bd0c: 8349492: Update sun/security/pkcs12/KeytoolOpensslInteropTest.java to use a recent Openssl version
  • 73465b9: 8160327: Support for thumbnails present in APP1 marker for JPEG
  • dbdbbd4: 8348597: Update HarfBuzz to 10.4.0
  • 7999091: 8351555: Help section added in JDK-8350638 uses invalid HTML
  • ... and 297 more: https://git.openjdk.org/jdk/compare/55097dd4cbb5d691c12cb0247d66dce593759d59...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@alisenchung, @prrace) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the rfr (Pull request is ready for review) label Feb 12, 2025

openjdk bot commented Feb 12, 2025

@gredler The following label will be automatically applied to this pull request:

  • client

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the client (client-libs-dev@openjdk.org) label Feb 12, 2025

mlbridge bot commented Feb 12, 2025

Webrevs

prrace commented Feb 14, 2025

Early days, but the test fails on macOS:

Exception in thread "main" java.lang.RuntimeException: nextOffset 1 for char 00ad using font Dialog: 2 != 1
    at FormatCharAdvanceTest.assertEqual(FormatCharAdvanceTest.java:289)
    at FormatCharAdvanceTest.testChar(FormatCharAdvanceTest.java:282)
    at FormatCharAdvanceTest.testChars(FormatCharAdvanceTest.java:165)
    at FormatCharAdvanceTest.main(FormatCharAdvanceTest.java:154)

gredler commented Feb 14, 2025

Interesting, I've been testing on Linux. I'll break out the Mac to see what's going on.

gredler commented Feb 14, 2025

@prrace Two findings here:

First, it looks like macOS needs an extra pixel of wiggle room in the max string width that we measure; I've given it two pixels, just to be extra sure that the test is stable.

Second, the combination of (macOS Dialog font + chars U+200F or U+2067) has HarfBuzz removing the zero-width chars instead of replacing them with the invisible glyph. I think it has something to do with the font tables in that specific macOS font. It looks like in this scenario ExtendedTextSourceLabel.getLineBreakIndex(int, float) was communicating an early line break to the caller, rather than assuming that the shaper omitted or combined glyphs.
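
For the first finding, a toy sketch of why a little slack stabilizes this kind of check. It assumes the measured width can fall slightly short of the summed glyph advances on some platforms, which this thread does not confirm as the cause. The greedy `nextOffset` helper is a hypothetical stand-in for LineBreakMeasurer.nextOffset, and the advances are made-up values.

```java
public class WrapSlackSketch {

    // Greedy stand-in for a line breaker: counts the leading glyphs whose
    // summed advance still fits within wrappingWidth.
    public static int nextOffset(float[] advances, float wrappingWidth) {
        float used = 0f;
        int i = 0;
        while (i < advances.length && used + advances[i] <= wrappingWidth) {
            used += advances[i];
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        float[] advances = {9.25f, 9.25f, 9.25f}; // hypothetical advances, sum 27.75
        float measured = 27.0f;                   // hypothetical measurement, slightly short

        // No slack: the last glyph is pushed to the next line.
        System.out.println(nextOffset(advances, measured));        // prints 2
        // Two pixels of slack, as described above: everything fits.
        System.out.println(nextOffset(advances, measured + 2.0f)); // prints 3
    }
}
```

The slack only loosens the upper bound, so it should be harmless on platforms where the measured width already matches.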

@alisenchung

> @prrace Two findings here:
>
> First, it looks like macOS needs an extra pixel of wiggle room in the max string width that we measure; I've given it two pixels, just to be extra sure that the test is stable.

Should you add these pixels only when running the test on macOS? Or do these pixels not matter on other platforms?

@alisenchung alisenchung left a comment

Also, please update the copyright years on the changed files.

gredler commented Feb 21, 2025

> Should you add only these pixels when running the test on macos? Or do these pixels not matter on other platforms?

It doesn't matter on other platforms, since it's just a little bit of extra leniency. I'm relatively new to the codebase, but the other tests I've seen that run similar checks just add the necessary leniency across the board.

> Also, update copyright years on changed files

Done, thanks!

gredler commented Mar 5, 2025

@prrace This one is also ready for a second review, if you have time.

prrace commented Mar 6, 2025

> @prrace This one is also ready for a second review, if you have time.

I need to look again at the changes, but I'm going to start by submitting a new build/test job for the latest version.

prrace commented Mar 6, 2025

I just deleted a comment added to the wrong PR!

gredler commented Mar 10, 2025

@prrace Did you get a chance to submit that test job for this PR? If so, were there any failures? Thanks!

@prrace prrace left a comment

Testing passed. I can't see a specific fault here, but I'm not sure we have enough test coverage. Fingers crossed.

@openjdk openjdk bot added the ready (Pull request is ready to be integrated) label Mar 11, 2025

gredler commented Mar 11, 2025

/integrate

@openjdk openjdk bot added the sponsor (Pull request is ready to be sponsored) label Mar 11, 2025

openjdk bot commented Mar 11, 2025

@gredler
Your change (at version b9b707a) is now ready to be sponsored by a Committer.

gredler commented Mar 13, 2025

This PR is ready to be sponsored. Thanks!

prrace commented Mar 13, 2025

/sponsor

openjdk bot commented Mar 13, 2025

Going to push as commit 7fc776e.
Since your change was applied there have been 359 commits pushed to the master branch:

  • 3da5e3f: 8349350: Unable to print using InputSlot and OutputBin print attributes at the same time
  • aa047ee: 8319192: Remove javax.swing.plaf.synth.SynthLookAndFeel.load(URL url)
  • cd1be91: 8319055: JCMD should not buffer the whole output of commands
  • 248c373: 8351266: JFR: -XX:StartFlightRecording:report-on-exit
  • 03ef79c: 8346470: Improve WriteBarrier JMH to have old-to-young refs
  • b50fe9b: 8280818: Expand bug8033699.java to iterate over all LaFs
  • 771e160: 8351323: Parameterize compiler and linker flags for iconv
  • 0ff1c08: 8349984: (jdeps) jdeps can use String.repeat instead of String.replaceAll
  • 37ec796: 8351500: G1: NUMA migrations cause crashes in region allocation
  • 4e51a8c: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long)
  • ... and 349 more: https://git.openjdk.org/jdk/compare/55097dd4cbb5d691c12cb0247d66dce593759d59...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated (Pull request has been integrated) label Mar 13, 2025
@openjdk openjdk bot closed this Mar 13, 2025
@openjdk openjdk bot removed the ready, rfr, and sponsor labels Mar 13, 2025

openjdk bot commented Mar 13, 2025

@prrace @gredler Pushed as commit 7fc776e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
