Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reaching ttf_parser's face. #13

Closed
laurmaedje opened this issue Nov 5, 2020 · 7 comments
Closed

Reaching ttf_parser's face. #13

laurmaedje opened this issue Nov 5, 2020 · 7 comments

Comments

@laurmaedje
Copy link
Contributor

@RazrFalcon I've ported all the GSUB/GPOS-specific subtables (you can have a look here) and am now thinking about how to port the remaining code in the hb-ot-layout-* files. In doing that, I want to use some GDEF-related functions from ttf_parser. But it turns out that hb-ot-layout.h and hb-ot-layout.hh expose a lot of functions, which only take rb_face_t (and not rb_font_t). The ttfp_face, however, is stored in the font and not in the face and thus unreachable.

Now, I basically see two options:

  • Either somehow move the ttfp_face into the rb_face_t (not that easy because rb_face_t is still in C++),
  • or change a lot of occurences all the way up the call chain to rb_font_t and basically don't make the font/face distiction (which rustybuzz currently doesn't do anyway at the API level).

What do you think?

(If you have any other comments on the code I've produced so far, feel also free to review a bit.)

@RazrFalcon
Copy link
Owner

Wow! I haven't expected that.

Yes, rb_face_t and rb_font_t separation have no meaning in rustybuzz. In hb, they are used construct faces hierarchic, but we're not using it rb. So you can merge them together and use only rb_face_t everywhere. Then send a pull request, so the gsub/gpos code would be based on the new logic. Basically do not do this in your branch. Split it into multiple patches.

As for the new code, I will look into it later.
But the first thing I saw is DynArray. Isn't it the same as LazyArray16<u8>?

Also, prefer kind to type_.

@laurmaedje
Copy link
Contributor Author

laurmaedje commented Nov 5, 2020

DynArray is not quite the same as LazyArray because the stride is runtime-defined. The ValueRecords in GPOS are flexible in size depending on bitflags, so I had to come up with something. I can change type_ to kind. The top-level (not subtables) GSUB/GPOS code is pretty rough anyway because it's not yet integrated.

I'm hesitant to merge the logic as is because the FFI is pretty hacky. Basically, I construct the data slices for the subtables from the C++ table's this pointer and just set the len to isize::MAX because I couldn't get at the length easily and that was good enough for testing. Also, currently every subtable still exists in C++ as a small shim, which is not so nice, but was necessary so far because the C++ code uses an acceleration structure (which basically collects the coverage of the subtables into a bloom-filter like set) and that needs the coverage for each subtable. Maybe it would be okay to just remove the acceleration thing for the time being (it's just for performance as far as I can see) to simplify the porting and maybe add it in later with some benchmarking. In that case, it should be relatively straightforward to finish up the more top-level GSUB/GPOS and make the FFI a bit cleaner.

The remaining layout API and the font/face thing could then maybe be a separate PR.

@RazrFalcon
Copy link
Owner

Yes, the integration part is a complex one. That's why I wanted to implement AAT first, so we could easily break everything while finalizing GSUB/GPOS.

I suggest trying to remove AAT completely. Then remove all the C++ code (we have to port set and other containers). Then add your GSUB/GPOS + layout. And then re-add AAT as an optional C++ dependency, or even as pure Rust.

I haven't looked at hb source like 5 month, so maybe I'm missing something. But since GSUB/GPOS is the final milestone, I think it's fine to break the code/architecture a bit.

@laurmaedje
Copy link
Contributor Author

I think that might not even be necessary. It's definitely a step I would regard as last resort because my experience is that removing and rewriting lots of code at once (i.e. without testing in between) leads to very hard-to-find bugs. I studied the dependencies between the hb-ot-layout files and the rest of the code a bit and I think with some care and with removing the accelerations structures for now and merging font and face, it should be possible to finish GSUB/GPOS porting incrementally without too much changes to other code. I might have missed something, but I also didn't find that many dependencies from AAT on GSUB/GPOS-specific code.

@laurmaedje
Copy link
Contributor Author

I removed the acceleration structures (actually, hb-set is now not used anywhere anymore) and I also removed the subtable shims from GSUB/GPOS. So all lookups now go through a few FFI functions (rb_subst_lookup_apply, ...). These still take only a pointer without a length and therefore have undefined behaviour when the table uses invalid offsets. Since it's a bit difficult to get the length from deep in the call stack, I don't see how to easily fix that without porting the top-level GSUB/GPOS logic (scripts, language systems, features, lookup applying ...). But porting that top-level logic is blocked on merging font and face because the exported opentype API functions (e.g. rb_ot_layout_language_get_required_feature) depend on it, but only receive the rb_face_t and not the ttfp_face.

You mentioned earlier that you would prefer multiple patches. Do you prefer having a PR with the current code (with the undefined behaviour for invalid fonts) and have the font/face merging and the rest separately or should I continue on my branch?

@laurmaedje
Copy link
Contributor Author

Actually, I just realized that I can just get the GSUB data+length this way and pass that through FFI:

const char *data = (const char*)face->table.GSUB->table.get_blob();
unsigned int length = face->table.GSUB->table.get_length();

So top-level GSUB/GPOS is not blocked on font/face merging and it should probably be done before the PR to get rid of the undefined behaviour.

@RazrFalcon
Copy link
Owner

Hmm... Can we use Face::table_data to pass the table data to hb table parsing functions? So we can remove rb_face_t/rb_font_t completely? It would be way easier to simply call the C++ table parsing code from Rust.

Still, I've already forgot how it all works internally. All I know is that UB is bad, because it's way too easy to trigger it and disabling hb's table caching will decrease the performance.

I agree, that slow incremental update is always better. I've already burnt myself way too many times. But this part of hb is so interconnected, that a gradual rewrite becomes almost impossible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants