-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add harfbuzz shaping support #28
Conversation
69a0fcc
to
4425f99
Compare
Hey, I am not familiar with harfbuzz so much to be able to accept this PR just like that. It also makes Typr.js 15x larger :D I somehow fixed the Urdu support in my local version of Typr.js . Now I am trying to solve typing. The user has to be able to click into a rendered text, and the cursor will appear, and the user can type. My current problem is, that String characters and glyphs are not 1-to-1. The number of glyphs can be smaller or even larger, than the number of characters, depending on substitutions. I need some kind of map between characters and glyphs, so that I can find the output glyph for each character, and the original character for each glyph. Does harfbuzz provide such map? Is there any public manual for harfbuzz (either C or WASM version), so that I could see what it is capable of? |
Cluster numbers are for that :) |
BTW. what about the Bidi algorithm? Is it included in Harfbuzz? If I give it a string with both right-to-left and left-to-right runs, will it position glyphs correctly? Also, I need to separate long paragraphs into multiple runs. I use my own code for it right now. I just want to know, how much of my code will I be able to replace :) |
It is not included as harfbuzz supposed to work on text with same direction and script so you have to separate your text by direction and script before feeding it into harfbuzz.
They are provided with the json object the added function have, the "cl" field. |
Well, it is what you wanted! :) ~200kb if zipped
|
So I just found another bug in Khmer rendered by Typr.js : https://imgur.com/OcUt5sS (top is Google Chrome, bottom is Typr.js) I will try to invest more time into learning about HarfBuzz, as I think it may save me a lot of time in the future. |
You can never replicate all the considerations and sometime hacks harfbuzz have to support all the scripts :) |
Hey, so I tried it and it seems to be working quite well :) Do I understand these parts correctly?
I have some questions:
|
Correct.
Yes, always is ordered.
Both possible, isn't by number what I've put here already? I think I
changed it to that afterward
…On Fri, Mar 29, 2019, 5:02 PM Photopea ***@***.*** wrote:
Hey, so I tried it and it seems to be working quite well :) Do I
understand these parts correctly?
- HarfBuzz can detect the writing direction from Unicode values -
_hb_buffer_guess_segment_properties
- in the output, glyphs are ordered left-to-right, no matter what the
logical order was
- the "cl" property denotes the cluster beginning. It is not the index
of the initial character, but an index into the input byte array (so if
"cl"s go like 0, 2, 4 ... there can be 2 characters per cluster, or 1
character per cluster, which uses two bytes in UTF8).
I have some questions:
- in the output array, "cl" values are always in increasing, or in
decreasing order (so that I could get a cluster size)?
- is it possible to make HarfBuzz identify glyphs by the number -
index (in "glyf","CFF","SVG") instead of a string (the "g" property in the
output) ?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#28 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAy3wde8vKbQKjHQmU_evTzQkQjORxl-ks5vbgfAgaJpZM4cRRnY>
.
|
Oh, just change last argument of _serialize from 0 to 4
/*HB_BUFFER_SERIALIZE_FLAG_NO_GLYPH_NAMES*/
…On Fri, Mar 29, 2019, 5:54 PM Ebrahim Byagowi ***@***.*** wrote:
Correct.
Yes, always is ordered.
Both possible, isn't by number what I've put here already? I think I
changed it to that afterward
On Fri, Mar 29, 2019, 5:02 PM Photopea ***@***.*** wrote:
> Hey, so I tried it and it seems to be working quite well :) Do I
> understand these parts correctly?
>
> - HarfBuzz can detect the writing direction from Unicode values -
> _hb_buffer_guess_segment_properties
> - in the output, glyphs are ordered left-to-right, no matter what the
> logical order was
> - the "cl" property denotes the cluster beginning. It is not the
> index of the initial character, but an index into the input byte array (so
> if "cl"s go like 0, 2, 4 ... there can be 2 characters per cluster, or 1
> character per cluster, which uses two bytes in UTF8).
>
> I have some questions:
>
> - in the output array, "cl" values are always in increasing, or in
> decreasing order (so that I could get a cluster size)?
> - is it possible to make HarfBuzz identify glyphs by the number -
> index (in "glyf","CFF","SVG") instead of a string (the "g" property in the
> output) ?
>
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub
> <#28 (comment)>, or mute
> the thread
> <https://github.com/notifications/unsubscribe-auth/AAy3wde8vKbQKjHQmU_evTzQkQjORxl-ks5vbgfAgaJpZM4cRRnY>
> .
>
|
But when I am calling your function shapeStringToGlyphs for the second time with the same arguments, I am getting an error "OOM", which happens during var buffer = module._malloc(_font.byteLength); |
I see you are allocating two "buffer"s , but freeing just one of them. |
Oh fixed. Also take a look at https://github.com/harfbuzz/harfbuzz/blob/master/src/hb-ot-color.h you may find things there interesting also. Except glyf/CFF parsing, you can drop all of the font parsing you have locally using HarfBuzz. |
It is important however hb_face_t instances be cached somehow as preparing it inside iharfbuzz is a bit costly. |
hb_font_set_scale should be called for setting font size also sometime but it should be scaled by 16 to get subpixel information also
Hmm. Maybe not in some corner cases, but most cases I am aware at least. |
Also, note, that malloc() returns a pointer (integer), and you can not do
Instead, you should use
I am still wondering, how zero-ended text is popular in low-level programming :D To me, it always made more sense to represent it with a pointer + length (and pass as two integers instead of one) :) You save one Byte and can have zeros inside a string. |
BTW. I think it would be better to keep Typr.js and WASM HarfBuzz separated and independent. Typr.js is a tiny library, that can handle 95% of use cases. Making it 10x bigger, just to cover the remaining 5% of use cases is not worth it. I would like to change the interface of Typr.js to make it similar to HarfBuzz (add Typr.U.shape() function, that would return an array in the same format as HarfBuzz shape()). Programmers will be able to switch from Typr.js to HarfBuzz easily, as the interface will be identical, in case they need Urdu etc. |
**** I do plan to use HarfBuzz heavily in Photopea, and I will probably stop using Typr.js in Photopea completely (or use just parts of it). |
Oooh, fixed :) The problem is there we can't commit to some JavaScript interface, but well, let's see :) |
And the idea was to you have something to play :) Let's close this so |
Sadly we don't support some of things Typr.js support already, a glyf/CFF glyph points interface, but that is in the roadmap. |
What do you mean? All I need is the shape() function of HarfBuzz. Please, don't add any extra font processing, it would only make HarfBuzz bigger. |
NULL-terminated strings is not a requirement here, the length of the string can be passed (and should be, in this case since the string is not already NULL-terminated) instead of -1 to |
That is just in the plan for next year or so and may never happen and it is possible to reduce the size by subsetting the library by needed APIs.
Correct, I'll apply that on harfbuzz PR. |
@ebraminio BTW. how do I set the text direction in your WASM module? My Characters are typed in a following order: ... arabic1 ... 2018 .- 2019 .... arabic2 .... And they need to be printed as ... arabic2 ... 2019 -. 2018 .... arabic1 .... So the four-character sequence " .- " (between 2018 and 2019, which are in LTR) needs to be shaped in RTL order. I tried module._hb_buffer_set_direction(hbuf, 2); // or 1 But it returns glyphs with ax=0. |
You need to segment your input before feeding it into harfbuzz, setting direction won't be useful in that IIUC. |
I did "segment" it. I need to call shape() only on a string " .- " and need the output in RTL. Currently, module._hb_buffer_guess_segment_properties(); guesses the direction to be LTR. |
Aha, yes that call should be removed but you need to detect and set the script also, hmm, I should look if is possible to have the guess but set the direction also without script, most of clients do both at the same time so may is not supported, I am not sure. |
I solved it. Guess Properties only guesses the unset properties (if you set direction in advance, it will not change it).
But I had to find the constants 4 and 5 by trying :D |
Nice. We should have some way for that also, you can read the headers or use this https://36-24718300-gh.circle-artifacts.com/0/root/project/ABI.json "name" : "HB_DIRECTION_LTR", "value" : "4"; "name" : "HB_DIRECTION_RTL", "value" : "5" for now. |
@ebraminio I have a strange problem with this WASM version of HarfBuzz. I have two fonts: A - 700 kB and B - 10 MB. When I malloc() a space for A at the start and use it, everything works well. When I malloc() a space for B at the start, everything works well too. But when I allocate A, use it, then free() it and try to malloc() space B, I get OOM error (out of memory probably). My single shaping looks like this:
|
Maybe something is not freed, I don't see it in your code but created with a wasm version with a larger heap anyway, harfbuzzjs.zip |
Could you please take a closer look at my code? Maybe I am freeing something that should not be freed. Or can you test it yourself while using several large font files multiple times? |
Hmm. I don't see the issue here using your code, I guess having more of your code will be more helpful. |
Hopefully you will be unblocked with this ;)