Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Selection or insertion point renders incorrectly when inside ligature #11302

Open
mbrubeck opened this issue May 20, 2016 · 20 comments
Open

Selection or insertion point renders incorrectly when inside ligature #11302

mbrubeck opened this issue May 20, 2016 · 20 comments
Assignees

Comments

@mbrubeck
Copy link
Contributor

@mbrubeck mbrubeck commented May 20, 2016

If Servo renders this textbox with a font that has an ffi ligature (which includes the default fonts on most systems), and you use the arrow and/or shift keys to position the insertion point or selection boundary within the "ffi" string, then Servo draws the selection or insertion point at the wrong location (after the ligature instead of inside it):

<textarea style="font-family: sans-serif;">ffi</textarea>

We could fix this by re-running text shaping when a selection boundary is moved into a ligature, splitting the text run at the selection boundary so the characters on opposite sides no longer form ligatures. But this would require reconstructing flows (which is expensive), and it would make glyphs change as the cursor moves around (which is ugly).

A better solution is to compute where inside the ligature to draw the selection or caret. Gecko does this just by dividing the total width of the ligature evenly over each grapheme clusters it contains. This will require extra flags in the glyph store to record which glyph entries are ligature starts and cluster starts. Then advance_for_byte_range will need to use this information to adjust the result if the range includes partial ligatures. For comparison, Gecko’s code for this is mostly in GetAdvanceWidth and ComputeLigatureData.

@shinglyu
Copy link
Member

@shinglyu shinglyu commented May 23, 2016

Can I give it a try?

@shinglyu
Copy link
Member

@shinglyu shinglyu commented May 23, 2016

How do I create a ffi ligature? if I use your example code, does it become a ligature automatically? I also tried

<textarea style="font-family: sans-serif;">&#xfb03;</textarea> 

The selection seems to be rendered correctly, but it is treated as a single character.

@shinglyu
Copy link
Member

@shinglyu shinglyu commented May 23, 2016

ping @mbrubeck

@mbrubeck
Copy link
Contributor Author

@mbrubeck mbrubeck commented May 23, 2016

Yes, using the example code, "ffi" (three characters) should be rendered as a ligature automatically as long as the font supports it. If the default font doesn't, you can try setting font-family to one that does. "DejaVu Sans" is one that works on my Linux machine.

@shinglyu
Copy link
Member

@shinglyu shinglyu commented May 24, 2016

@shinglyu add to my dashboard

@shinglyu
Copy link
Member

@shinglyu shinglyu commented May 24, 2016

@mbrubeck Thanks! I was able to see the ligature using DejaVu Sans. The problem is that I can't select the f in the middle. the ligature is treated as a single character. See this screen record:

output

I used keyboard, and I tried to use mouse to select the middle f but failed. The ffi on the right hand side is not a ligature because I use the default font.

This is different from what you described. Did I stumble upon another bug?

(p.s. I also tried Firefox, it can let me select the ffi as separate characters even if it's a ligature)

@mbrubeck
Copy link
Contributor Author

@mbrubeck mbrubeck commented May 24, 2016

That screencast shows the bug that I was trying to describe. Servo always paints the entire ligature as selected, even if internally only part of it is selected. (You can type another letter like "x" and only the "real" selection will be replaced.)

@shinglyu
Copy link
Member

@shinglyu shinglyu commented May 25, 2016

Thanks! typing the "x" reveals the secret!

@KiChjang
Copy link
Member

@KiChjang KiChjang commented Jun 11, 2016

@shinglyu Still holding up okay on this issue?

@shinglyu
Copy link
Member

@shinglyu shinglyu commented Jun 13, 2016

@KiChjang ya, still working on it, just has a vacation and a workweek coming, so probably not much progress until next week.

@shinglyu
Copy link
Member

@shinglyu shinglyu commented Jun 29, 2016

@mbrubeck I went through the code and got some rough idea on how to implement this. But I'm a little confused about how to use the various Glyph* structs. For instance, does the starts_ligature in GlyphEntry and the ligature_start in GlyphData useful for our case? And it not clear to me how the GlyphStore is constructed, so I"m not sure where I can set the flag?

@mbrubeck
Copy link
Contributor Author

@mbrubeck mbrubeck commented Jun 30, 2016

I went through the code and got some rough idea on how to implement this. But I'm a little confused about how to use the various Glyph* structs… And it not clear to me how the GlyphStore is constructed, so I'm not sure where I can set the flag?

Here's a quick overview of the different types:

  1. Font::shape_text creates an empty GlyphStore, then passes it to the Shaper which fills it in.
  2. The Shaper creates GlyphData structs. These are temporary objects that are just used to as arguments to the GlyphStore::add_glyph* methods.
  3. The GlyphStore takes each GlyphData from the shaper and translates it into a GlyphEntry. Some glyphs require more data than can fit into a 32-bit GlyphEntry; in that case it also creates one or more DetailedGlyphs.

In the end we have a GlyphStore, which contains an sequence of GlyphEntries and a sequence of DetailedGlyphs.

For instance, does the starts_ligature in GlyphEntry and the ligature_start in GlyphData useful for our case?

Yes, the cluster_start and ligature_start fields/arguments are used to pass the information that we need. However, we currently don’t actually use these fields for anything. (They get passed from the Shaper, to the GlyphStore, to the GlyphEntry constructor, which then discards them.) GlyphEntry used to store this information in its bit flags, but I removed those in #7434 because the flags were never actually used or tested, and I don't think they were set correctly.

Fixing this bug will require re-adding those flags, and making sure save_glyph_results passes correct values for them. Then advance_for_byte_range can read the flags, as described above.

@shinglyu
Copy link
Member

@shinglyu shinglyu commented Jul 1, 2016

Thanks! That's super helpful!

@shinglyu
Copy link
Member

@shinglyu shinglyu commented Jul 1, 2016

@mbrubeck : Should I follow the old convention and use a negative flag? Although I don't quite understand what "missing chars can be memset with zeros in one fell swoop" means.

@shinglyu
Copy link
Member

@shinglyu shinglyu commented Jul 6, 2016

I almost finished porting the Gecko algorithm to Servo. But there are a few blockers I need to fix before it works:

  • cluster_start flag looks wrong inside ligatures, I'll have to compare with Gecko.
  • the advance width before the text run is not drawn. (The one after the text run is OK)
  • If we extend the selection from the right side into the ligature, the ligature if broken into separate glyphs, not sure why.
@shinglyu
Copy link
Member

@shinglyu shinglyu commented Jul 7, 2016

@mbrubeck : I'm a little bit confused about how ligature continuations are processed. The second f and i in an ffi both returns is_cluster_start==false, so I think I might set it wrong. But when I check with gdb, the glyph code never enters the GlyphEntry::complex() function, where I set the flag. But in the iterator code, the entry.is_simple() returns false for the second f and i. In the harfbuzz code I only see it process the first f, I don't know how the second f and i get into the GlyphStore and how their flags are set?

@mbrubeck
Copy link
Contributor Author

@mbrubeck mbrubeck commented Jul 7, 2016

When the string "ffi" is shaped as a ligature, there is only one glyph. The first 'f' byte will map to the 'ffi' glyph, and the second and third bytes currently map to nothing. (We only call add_glyph_for_byte_index on byte_range.start, not on any other indices in byte_range, so the other bytes keep their initial value of GlyphEntry(0) in the GlyphStore::entry_buffer, which means they "complex" glyphs with zero DetailedGlyphs.) If we use a negative flag, then we'll need to explicitly set it for the GlyphEntry on each byte index in (byte_range.start + 1 .. byte_range.end).

@shinglyu
Copy link
Member

@shinglyu shinglyu commented Jul 11, 2016

@mbrubeck Do you think we should use a negative or positive flag? I don't quite understand the reason for a negative flag in Gecko. And does it imply that we only need to explictly set the "Not ligature start" case if we use a positive flag?

@jdm
Copy link
Member

@jdm jdm commented Oct 18, 2016

I'm removing the assigned and less easy tags because it seems like this was abandoned, and it also turned out to be more complicated than anticipated.

@shinglyu
Copy link
Member

@shinglyu shinglyu commented Oct 19, 2016

Sorry I was distracted by other tasks. I had a partial implementation here, which is able to select part of the ligature, but the rendering is not 100% correct. ANyone interested can pick it up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.