Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

freetype/harfbuzz: APIs to query font metadata #1215

Merged
merged 2 commits into from
Oct 15, 2020
Merged

Conversation

ezdiy
Copy link
Member

@ezdiy ezdiy commented Oct 12, 2020

This is necessary for proper language support in mupdf with arbitrary
fonts in html/pdf - mupdf expects a real font server for its callbacks,
we're told abstract typeface, script and language that must be
accounted for whenever offering close-kin substitutes of a typeface.

There are marginal uses for this elsewhere, eg. localized font
names in UI - koreader/koreader#6763


This change is Reviewable

@ezdiy
Copy link
Member Author

ezdiy commented Oct 12, 2020

This is one of the few things that can be chipped away from (already humongous) mupdf as there are no cross dependencies + some marginal use outside of it, so better make splinter PRs.

@ezdiy ezdiy requested a review from NiLuJe October 12, 2020 14:29
Copy link
Contributor

@poire-z poire-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks alright.

Comment on lines 5 to 7
#include <freetype/freetype.h>
#include <freetype/ftsynth.h>
#include <freetype/ftoutln.h>

#include FT_FREETYPE_H
#include FT_SYNTHESIS_H
#include FT_OUTLINE_H
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These #include FT__xxx were replaced with #include <freetype/xxx.h> in #1214 .
Is that an explicite revert, or you did not see it/forgot it while rebasing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was officially deprecated in 2.10.3, which is why it was coincidental to #1214 ;).

(It's a remnant of... DOS (!) era short filename FS limitations).

TL;DR: Yes, please use sane "modern" includes ^^.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, indeed accidental rebase resolution

Comment on lines +10 to +12
-- Table of ranges for entire scripts, ordered by UCDN_SCRIPT_* macros, starting with UCDN_SCRIPT_COMMON
-- Source data is https://www.unicode.org/Public/5.2.0/ucd/Scripts.txt
-- UCDN is internally used to map character classes (and detect scripts) both by harfbuzz and mupdf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments.
No risk these macros/indices could get out of sync between HB and MuPDF if a new script is introduced between others?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In mu/hb, the order is hardcoded without regard for the source data (that is compiled into UC DB contents later). So I'm essentially doing the same here (in gen.lua). The symbolic indices aren't all that interesting to us - mupdf tells us this indice directly as int, for HB its a bit awkward because they translate the UCD canonical order to their own hb_script_t, maintaining a lookup table between the two which sucks. Luckily, these indices are probably never gonna be used with HB directly.

Comment on lines +8 to +13
local hb_face_t = {}
hb_face_t.__index = hb_face_t
ffi.metatype("hb_face_t", hb_face_t)

-- Dump contents of OT name fields
function hb_face_t:getNames(maxlen)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unexperienced with all these metatype and ffi tricks.
But what's this hb_face_t ? Where does it end up - as it's not attached to the HB object we get when require(harfbuzz) ?
Or is that just attaching additional methods to the libharfbuzz internal hb_face_t type - that other code may use ? (so, these additional methods would be accessible only when used via ffi?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LuaJIT attaches it implicitly to any hb_face_t* typed C pointer we spawn through hb_face_create_* calls in various places. This is not as unparadigmatic as it may seem - lua does same sort of thing if you attach metatable to string, number or a function - in which case you too get mt per type, as opposed to value as would happen to UD/table.

Comment on lines +46 to +47
function hb_face_t:getCoverage()
local set, tmp = hb.hb_set_create(), hb.hb_set_create()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not trying to understand all that part :)

assert(ft2.FT_Done_Face(self) == 0, "freetype error when freeing face")
end

FTFace_mt.__gc = FTFace_mt.__index.done;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more need for this __gc ? We should not forget to explicitely call done() when done ? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It never worked properly (and be glad, as it would lead to a double-free). __gc is used only for types that do ffi.new() - which we don't (we ffi new its pointer box, different type).

Comment on lines +155 to +158
local kls = tonumber(ffi.cast("TT_OS2*", os2).sFamilyClass)
if kls ~= 0 then -- 0 is usually bogus
-- class 8 = sans, otherwise serif
finfo.serif = bit.band(kls, 0x0800) == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhhh, in crengine, I killed the loosy support for serif/sans as it looked we couldn't get that info :)
I'll pretend I didn't see that :) (and it would be painful as we'd need the user to choose a main font for serif and another for sans - and I don't really care)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not really all that accurate (panose tells a bit more for some fonts), but it's still 50/50, with heavy preference for serif, so eh.

end
end
if finfo.serif == nil then
--TODO: Maybe it's worth to consult panose tables too?
Copy link
Member

@NiLuJe NiLuJe Oct 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're also often depressingly wrong ^^.

It's been a recurrent source of issues for Access (the !RMSDK renderer used by Kobo).

To the extent that I usually run my fonts through this: http://trac.ak-team.com/trac/browser/niluje/Configs/trunk/Kindle/Kobo_Hacks/Patches/fix-panose.py

This is necessary for proper language support in mupdf with arbitray
fonts for html/pdf, as it expects a real font server for its callbacks
- we're told abstract typeface, script and language, that should be
accounted for whenever offering close-kin substitutes of a typeface.

There are also other uses for this on the frontend, eg. localized font
names in UI.
@ezdiy ezdiy merged commit e4a66e5 into koreader:master Oct 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants