New font format spec #995

Closed
puzrin opened this issue Apr 3, 2019 · 69 comments
Labels
architecture pinned Not closed automatically

Comments

@puzrin
Contributor

puzrin commented Apr 3, 2019

This issue is to stabilize the new font format at the binary level. It accommodates the features discussed in #990.

Comments and suggestions welcome.

Latest spec: https://github.com/littlevgl/lv_font_conv/blob/master/doc/font_spec.md

@embeddedt embeddedt added architecture pinned Not closed automatically and removed not-template labels Apr 3, 2019
@puzrin
Contributor Author

puzrin commented Apr 3, 2019

The initial idea of keeping chunk headers in the font head was to make binary search possible. But this does not seem very useful:

  • because the number of chunks will be small
  • a cache can be implemented at runtime

I tend to move chunk headers back into the chunks, to keep this info more "condensed".

@puzrin
Contributor Author

puzrin commented Apr 3, 2019

Moved chunk heads to chunk bodies.

https://github.com/littlevgl/lv_font_conv/blob/font_spec/font_spec.md - here is my "sandbox" of updates, if anyone is interested.

@puzrin
Contributor Author

puzrin commented Apr 4, 2019

After investigating kerning info, I found the data was becoming too complicated. I tend to pack it closer to OpenType tables (without chains).

  1. Glyph data should have an internal continuous enumeration (not pinned directly to char codes)
  2. The "continuous"/"sparse" compact storage applies to char codes only, not to glyph bitmaps. In this form it has an equivalent in the OpenType spec (subtables).

Pros:

  • Friendlier to future extensions
  • Easier to understand/verify
  • Reduces the "inventing a new standard" effort => fewer design errors
  • (?) Able to deduplicate identical glyphs from different scripts (Unicode blocks)

Cons

  • May need some state store in RAM (pointers to active tables and so on) for effective work
  • Some overhead for small fonts (8px bitonal)

@puzrin
Contributor Author

puzrin commented Apr 4, 2019

Updated first message with new format, based on OpenType principles.

History: https://github.com/littlevgl/lv_font_conv/commits/font_spec

@kisvegabor, @embeddedt, @beibean, I think this format is usable. We need to discuss it, check for missed features and stabilize it.

@kisvegabor
Member

@puzrin This format will be used internally, and writers can convert it to the required format, right?

@puzrin
Contributor Author

puzrin commented Apr 5, 2019

IMHO it would be nice to use it in LVGL directly.

Anyway, when we design complex data, we should think in the "bytes" domain, not in a specific language. That is clearer, and makes it easier to apply external patterns (from OpenType in our case), to quickly check feature availability (like binary search) and so on. I've completely rewritten the spec 3 times; this would not have been so easy if it were in C.

Probably this data can be expressed in C, but for me this would be a waste of time. If someone provides me a correct example of such a translation, I could probably try to create such a writer. But I would strongly recommend starting with a redesign of the public API (how LVGL should interact with new fonts). Expressing font data in C is an internal thing, and can be postponed to the end.

We are still in the middle of the new font data/API design. It's too early to speak about a C writer. I tried to explain the whole process here https://github.com/littlevgl/lv_font_conv/issues/7#issuecomment-480119718

@puzrin
Contributor Author

puzrin commented Apr 5, 2019

Under new API i mean something like:

  • font->render(string)
  • font->getGlyphId(charCode)
  • font->getBitmapInfo(id)
  • font->getKerning(left_id, right_id)

That's only an example to demonstrate that the existing API should be reworked to reflect the new data/features. Don't consider these real methods.

@kisvegabor
Member

kisvegabor commented Apr 5, 2019

It sounds reasonable to update the font API this way; however, as I stated in other posts, it can be part of another release.

For this release, it would be a nice goal to migrate to the new font converter and use the bounding box model.

@puzrin
Contributor Author

puzrin commented Apr 5, 2019

@kisvegabor I don't agree. It is not worth planning to jump over a chasm in two jumps :). A data format change is a pain in the ass every time. Reworking the API is MUCH simpler than changing the data twice.

To be constructive, I suggest creating a new issue about the font API rework, collecting requirements and designing function signatures. This does not involve coding, allows us to verify almost everything and to estimate the cost. If we find the cost unacceptable, we can start searching for a plan B. But the new format is very modular and should not be difficult to switch.

@puzrin
Contributor Author

puzrin commented Apr 5, 2019

For example: if you find the decompressor implementation difficult, we can support "raw bits" in the 6.0 release, and add compression in 6.1. This will not affect the existing spec and API.

@puzrin
Contributor Author

puzrin commented Apr 6, 2019

Updated TBD section with questions. Need to discuss.

@kisvegabor
Member

To be constructive, I suggest creating a new issue about the font API rework, collecting requirements and designing function signatures. This does not involve coding, allows us to verify almost everything and to estimate the cost. If we find the cost unacceptable, we can start searching for a plan B. But the new format is very modular and should not be difficult to switch.

Sounds good. Please open a new issue for the new font API and we will see if it's possible to add it in v6.0

@kisvegabor
Member

In light of the progress in the API topic, we need to clarify what is the exact purpose of this data representation format.

Anyway, to make some progress here I answered the general questions of the TBD section:

Coordinate system. Where to place (0, 0) ? Baseline? Ascent? Descent?

I vote for the ascent. It's the most natural while drawing. We should have "baseline" attribute too (e.g. baseline == 12) to enable positioning the label with its baseline.

Should we keep advanceWidth in separate table? (need measure size difference)
current location seems ok for quick lookup.

I see no disadvantage of storing advanceWidth in different tables. When people need a monospace font they usually have a resource-constrained environment. So saving memory, in this case, is important.

Need samples for CJK subsets, to understand need of spatial subtables (still can
use lot of holed Subtable Format 6 with binary search for fast lookup).
Do we need something special for proper CJK render?

As you already know currently we have continuous and sparse fonts. Sparse is tested with CJK by a lot of people and they had no problem with that. So I think we don't need anything else than supporting sparse fonts.

Should we care about data align? What is the size cost of it?

Yes, sections/tables should be aligned to 4 bytes boundary.

@puzrin
Contributor Author

puzrin commented Apr 14, 2019

In light of the progress in the API topic, we need to clarify what is the exact purpose of this data representation format.

That's a part of "standard" design process:

[ collect use cases ] => [ design data flow ] => [ design api/code ]

  1. This data reflects required features (New font format requirements #990).
  2. This format helps to verify logical correctness.
  3. It's used to design writer (font convertor) & reader (lvgl api).

I vote for the ascent. It's the most natural while drawing. We should have "baseline" attribute too (e.g. baseline == 12) to enable positioning the label with its baseline.

Just to clarify, you suggest set (0, 0) to upper left corner? And glyph will be drawn "right and down" (except case with negative shifts)?

I see no disadvantage of storing advanceWidth in different tables.

I'd say, i could not find reason to create additional table. OpenType's one is for more complicated features, not used in our case.

When people need a monospace font they usually have a resource-constrained environment. So saving memory, in this case, is important.

That's not related to storage location. Just set number of bits in advanceWidth to 0 + add default value field to font header.

As you already know currently we have continuous and sparse fonts. Sparse is tested with CJK by a lot of people and they had no problem with that. So I think we don't need anything else than supporting sparse fonts.

Can i see examples from real world? I need to understand subsets of used charcodes

That's required to understand do we need additional subtable format, or we can use multiple of existing subtables.

Yes, sections/tables should be aligned to 4 bytes boundary.

Ok, i will align everything according to addressed data width

The only question is with the glyf table. It would be nice to align each "glyph bitstream start" to 4 bytes, but this can add notable overhead. Need to compare losses for real fonts.

@kisvegabor
Member

That's a part of "standard" design process:
[ collect use cases ] => [ design data flow ] => [ design api/code ]

  • This data reflects required features (New font format requirements #990).
  • This format helps to verify logical correctness.
  • It's used to design writer (font convertor) & reader (lvgl api).

I agree it is useful to verify logical correctness, but I'm not sure it's required to specify the exact format (bits, sizes, alignment). However, if you think it will help later, I believe you. :)

Just to clarify, you suggest set (0, 0) to upper left corner? And glyph will be drawn "right and down" (except case with negative shifts)?

Yes.

AdvanceWidth table:

That's not related to storage location. Just set number of bits in advanceWidth to 0 + add default value field to font header.

Ok, if it works here. Probably in lv_font_t it really should be a different table but don't mix LittlevGL specific things here.

Can i see examples from real world? I need to understand subsets of used charcodes

I can't link any exact resources. I just saw here that people sent screenshots with CJK characters to ask something.

That's required to understand do we need additional subtable format, or we can use multiple of existing subtables.

I'm not sure subtables really could work in this case because characters can be really sparse. My experience is that CJK symbols, because they mean words, can't be grouped into logical subsets. Have a look at this. The symbols are just divided into 4 equal parts.

@puzrin
Contributor Author

puzrin commented Apr 14, 2019

I agree it is useful to verify logical correctness, but I'm not sure it's required to specify the exact format (bits, sizes, alignment). However, if you think it will help later, I believe you. :)

If people can understand something wrong, they will :). It's much cheaper to write an exact spec than to do multiple code rewrites and try to find the difference between the JS writer and the C reader :).

Ok, if it works here. Probably in lv_font_t it really should be a different table but don't mix LittlevGL specific things here.

If you speak about head table - that's for internal implementations and NOT lv_font_t. I guess, it can partially go to extension of lv_font_t, specific for LVGL font engine.

Note, the font spec is not about a particular implementation. It's about the data that makes features possible. head stores assorted properties, because those are required.

Can i see examples from real world? I need to understand subsets of used charcodes

I can't link any exact resources. I just saw here that people sent screenshots with CJK characters to ask something.

That's required to understand do we need additional subtable format, or we can use multiple of existing subtables.

I'm not sure subtables really could work in this case because characters can be really sparse. My experience is that CJK symbols, because they mean words, can't be grouped into logical subsets. Have a look at this. The symbols are just divided into 4 equal parts.

Ordered tuples with delta-coded index should be ok:

subtable_format_sparsed[] = {
  { uint16_t (CJK1 - range_start), uint16_t glyph_id1 }
  { uint16_t (CJK2 - range_start), uint16_t glyph_id2 }
  ...
}
  • 4 bytes per entry (every subtable covers a range up to 65536 code points wide).
  • Ready for binary search.
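
A minimal sketch of how a reader could look up a glyph in such a sparse subtable, assuming the entries are sorted by the delta-coded char code. The names `sparse_entry_t` and `sparse_lookup`, and the use of glyph id 0 for "not found", are assumptions made for illustration and are not taken from the spec.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of the sparse subtable: 4 bytes in total. */
typedef struct {
    uint16_t code_delta; /* char code minus range_start, sorted ascending */
    uint16_t glyph_id;
} sparse_entry_t;

/* Returns the glyph id for `code`, or 0 when the code is not in the table
 * (treating 0 as "no glyph" is an assumption of this sketch). */
static uint16_t sparse_lookup(const sparse_entry_t *tbl, size_t count,
                              uint32_t range_start, uint32_t code)
{
    if (code < range_start || code - range_start > 0xFFFFu) return 0;
    uint16_t key = (uint16_t)(code - range_start);

    size_t lo = 0, hi = count; /* search the half-open interval [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tbl[mid].code_delta == key) return tbl[mid].glyph_id;
        if (tbl[mid].code_delta < key) lo = mid + 1;
        else hi = mid;
    }
    return 0;
}
```

Because each entry is a fixed 4 bytes, the table can be searched in place without unpacking.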

The only reason why I did not add this yet: there are no direct analogues in the OpenType spec, and I can't understand why.

https://docs.microsoft.com/en-us/typography/opentype/spec/cmap there are Format2 & Format4 but those break my brain :).

@puzrin
Contributor Author

puzrin commented Apr 14, 2019

Refreshed spec according to comments. Key points:

  • Updated header info (drop bboxMax, leave Y min/max only)
  • Improve cmap subtables descriptions
  • Add sparsed subtable format
  • Aligned everything possible
  • Rearranged kerning table to simplify aligned access

Diff: yireo-joomla1/lv_font_conv@43c6f31

@kisvegabor
Member

kisvegabor commented Apr 14, 2019

Ordered tuples with delta-coded index should be ok:

Something like that could work.

There are Format2 & Format4 but those break my brain :).

It really looks complicated. I don't think we need to follow OpenType so strictly.

Glyph's bitmap alignment:
As they will be stored and used byte-based I don't think alignment is required there. Now it's not aligned and there were no problems with that. A problem could occur if e.g. an uint32_t would be read from a not 4 aligned address. But for bytes, it should be fine.


The updated spec looks good to me. Do we need to keep this issue open?

@puzrin
Contributor Author

puzrin commented Apr 14, 2019

Glyph's bitmap alignment:
As they will be stored and used byte-based I don't think alignment is required there. Now it's not aligned and there were no problems with that. A problem could occur if e.g. an uint32_t would be read from a not 4 aligned address. But for bytes, it should be fine.

Content of glyph data is bit stream. Normal operation usually is:

  • prefetch some bytes
  • peek the requested number of bits, until a new prefetch is needed

With aligned data we could prefetch 4 bytes at once. I guess, there should not be big difference, but byte access looks not nice.
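
A rough sketch of the prefetch/peek pattern described above, refilling one byte at a time so that no alignment is required. The MSB-first bit order, the `bit_reader_t` type and the function names are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t len;     /* total bytes available */
    size_t pos;     /* index of the next byte to prefetch */
    uint32_t acc;   /* prefetched bits, right-aligned */
    unsigned nbits; /* number of valid bits in acc */
} bit_reader_t;

static void br_init(bit_reader_t *br, const uint8_t *data, size_t len)
{
    br->data = data;
    br->len = len;
    br->pos = 0;
    br->acc = 0;
    br->nbits = 0;
}

/* Read `n` bits (1..16), most significant bit first.
 * Bits past the end of the data read as zero. */
static uint32_t br_read(bit_reader_t *br, unsigned n)
{
    while (br->nbits < n) { /* prefetch until enough bits are buffered */
        uint32_t byte = (br->pos < br->len) ? br->data[br->pos] : 0;
        br->pos++;
        br->acc = (br->acc << 8) | byte;
        br->nbits += 8;
    }
    br->nbits -= n;
    return (br->acc >> br->nbits) & ((1u << n) - 1u);
}
```

With 4-byte aligned glyph data the refill loop could load a whole word at once; the byte-wise version works for any start address.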

The updated spec looks good to me. Do we need to keep this issue open?

Yes, i'd like to keep it open for 1-2 weeks.

  • Too little feedback in general.
  • We have not collected feedback from the API issue.
  • Need to decide where to land spec document.

@kisvegabor
Member

kisvegabor commented Apr 15, 2019

With aligned data we could prefetch 4 bytes at once. I guess, there should not be big difference, but byte access looks not nice.

I see no advantage of accessing it as 32 bit data. You can consider the glyphs data as an uint8_t array:

const uint8_t * p = &glyph_bitmap[glyph_start];
p[x] ... ;

It's quite convenient and practical

Yes, i'd like to keep it open for 1-2 weeks.

Ok

@puzrin
Contributor Author

puzrin commented Apr 15, 2019

See updated TBD. Should we expose recommended glyph buffer size into header? This looks too "low-level-ish".

  • The right value is MAX(all_glyphs)
  • Each glyph's value is wordAlign(bitmap_width) * bitmap_height (bitmap_width depends on bpp)

This will require to scan loca / glyf tables.
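
A small sketch of how such a hint could be computed while scanning the glyph boxes. It assumes that `wordAlign` means "round the row size in bytes up to a multiple of 4"; the `glyph_box_t` type and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t w, h; } glyph_box_t; /* bitmap size in pixels */

/* Bytes per bitmap row, rounded up to a 4-byte word
 * (assumed meaning of wordAlign in this sketch). */
static size_t row_bytes_aligned(uint16_t width_px, uint8_t bpp)
{
    size_t bits = (size_t)width_px * bpp;
    size_t bytes = (bits + 7) / 8;
    return (bytes + 3) & ~(size_t)3;
}

/* Largest unpacked glyph buffer needed over all glyphs. */
static size_t max_glyph_buffer(const glyph_box_t *g, size_t count, uint8_t bpp)
{
    size_t max = 0;
    for (size_t i = 0; i < count; i++) {
        size_t sz = row_bytes_aligned(g[i].w, bpp) * g[i].h;
        if (sz > max) max = sz;
    }
    return max;
}
```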

I can't explain why i don't like to hardcode allocator hint into font.

@kisvegabor
Member

I think storing this information is useful and I see no problem with that. It's better if we do it instead of leaving it to the user.

@puzrin
Contributor Author

puzrin commented Apr 20, 2019

Updated spec. Found some minor issues after started with bin writer.

  • missed bpp info :)
  • added Descent field
  • BBox X/Y are signed, BBox W/H are unsigned. Split bit lengths into separate fields.
  • advanceWidth MAY be in fixed point format (if kerning info used, last 4 bits should be fractional). Added format field + appropriate notes. For "ugly" OLED fonts this may be integer.

Diff

@puzrin
Contributor Author

puzrin commented Apr 21, 2019

https://iamvdo.me/en/blog/css-font-metrics-line-height-and-vertical-align - this is very useful article to understand font align inside of line.

I'm not sure we need features like font mixtures and different aligns, but still worth to read.

@kisvegabor
Member

Not sure i understand you again :). See first post, font header table section. All this data seems already exist, and generated in --format bin.

I wanted to be 100% sure that we will have an "ultimate line-height" because it's the most important from LittlevGL's point of view. But then it seems fine. Thank you.

Next week I'll make some experiments with kerning too. When it's ready, I'll be able to provide a template file for LittlevGL built-in fonts. You mentioned that you are not interested in creating an lvgl6 text writer. Is this still true?

@puzrin
Contributor Author

puzrin commented Apr 28, 2019

You mentioned that you are not interested in creating an lvgl6 text writer. Is this still true?

The most boring part is inventing good text format. I'm not interested in spending time for this. But if someone provides ready format, i could try to make JS part, if that helps.

Under "ready format" i mean "no magical numbers" (except array sizes). For example, i have no ideas how to define data offsets in glyf & cmap tables.

@kisvegabor
Member

kisvegabor commented Apr 29, 2019

It would be great if you created the JS part. Thank you.

I will make a ready text format. Is it good if we will have a small example font? Like this

@puzrin
Contributor Author

puzrin commented Apr 29, 2019

Is it good if we will have a small example font?

Yes. In case of "text format", an example font with a couple of dummy glyphs should be enough.

@kisvegabor
Member

Awesome!

@puzrin
Contributor Author

puzrin commented Apr 30, 2019

Need to decide where to land font format spec:

  • lvgl repo
  • lv_font_conv repo

lv_font_conv would be more logical if script will be used independent, with other libs (to keep project consistent). In other cases lvgl may be preferable.

I'll be ok with any choice, if i have write access to spec next 1-2 months, for polish.

@kisvegabor
Member

lv_font_conv would be more logical if script will be used independent, with other libs (to keep project consistent). In other cases lvgl may be preferable.

I agree. Please upload it there.

@puzrin
Contributor Author

puzrin commented May 1, 2019

Landed to https://github.com/littlevgl/lv_font_conv/blob/master/doc/font_spec.md, replaced first post with reference.

Seems all done, but i'd suggest keep issue open until API finished. To discuss data change & clarification requests (i hope this will not happen but can not guarantee 100%).

@puzrin
Contributor Author

puzrin commented May 4, 2019

Kerning table size

Usually ~50% of glyphs have kerning. The size of the kerning table in Format 0 has a quadratic dependency on the glyph count.

If we dump ALL Roboto glyphs (edge case), size will be huge:

$ env DEBUG=* ./lv_font_conv.js --font ./node_modules/roboto-fontface/fonts/roboto/Roboto-Regular.woff -r 0x20-0x10FFFF --size 16 --bpp 3 --format bin -o test.bin
  font last_id: 894 +0ms
  font minY: -4 +2ms
  font maxY: 18 +1ms
  font glyphIdFofmat: 1 +0ms
  font kerningFormat: 0 +25ms
  font advanceWidthFormat: 1 +0ms
  font xy_bits: 5 +3ms
  font wh_bits: 5 +1ms
  font advanceWidthBits: 10 +4ms
  font monospaced: false +0ms
  font indexToLocFormat: 0 +32ms
  font.table.head table size = 44 +0ms
  font.table.cmap table size = 1412 +0ms
  font.table.loca table size = 1800 +0ms
  font.table.glyph table size = 32484 +0ms
  font.table.kern 447 kerned glyphs of 893, 413 max list, 52248 total pairs +0ms
  font.table.kern table size = 261252 +1ms
  font font size: 296992 +161ms

Simple optimization (with a 2-dimensional array) will reduce the size to ~160K only.

As far as I understand, real kerning is defined not for glyphs but for classes (groups of glyphs). For example, for the AO, AC & AQ pairs, O, C and Q belong to the same class.

  • See Format 2.
  • Also see Format 3, it seems more easy to understand, and more compact

So, we have to store:

  1. Mapping chars to Left Class ID (893 bytes), ~ 40 classes
  2. Mapping chars to Right Class ID (893 bytes), ~ 40 classes
  3. Table leftClassesCount * rightClassesCount * kern_value_size, ~1600 or 3200 bytes.

Big difference!
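
A minimal sketch of what a class-based lookup over those three arrays could look like. The layout here (one byte per class id, class 0 meaning "no kerning", a row-major matrix of 8-bit values) and all names are assumptions for illustration; the exact binary layout is whatever the spec defines.

```c
#include <stdint.h>

typedef struct {
    const uint8_t *left_class;  /* glyph id -> left class, 0 = no kerning */
    const uint8_t *right_class; /* glyph id -> right class, 0 = no kerning */
    const int8_t  *values;      /* left_count * right_count, row-major */
    uint8_t left_count;
    uint8_t right_count;
} kern_classes_t;

/* Kerning value for the pair (left_id, right_id), 0 if the pair is not kerned. */
static int8_t kern_get(const kern_classes_t *k, uint16_t left_id, uint16_t right_id)
{
    uint8_t l = k->left_class[left_id];
    uint8_t r = k->right_class[right_id];
    if (l == 0 || r == 0) return 0;
    return k->values[(l - 1) * k->right_count + (r - 1)];
}
```

Glyphs that kern identically (such as O, C and Q on the right side of A) share one class id, so they all resolve to the same matrix cell.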

Of course, this may work only if the original font defines kerning via such a table. But all modern fonts should have it (because that's more convenient for font authors).

I did not investigated details, may be for ascii Format 0 will be more compact.


The second point is a "micro optimization" for the kerning value size. 8 bits are not enough for huge fonts (> 50px). Currently the spec switches to 16-bit values.

An alternative may be to introduce a kerningMultiplier property. Actually, 7-bit (+sign) resolution is enough, if used with a proper scale. It requires a trivial FP multiplication (hardware int mul support should be available on all lvgl targets). Or we can use a 2^n multiplier.

Just an idea.

This also open ability to create "font packs" with different font sizes and shared kerning table. BUT, at first glance, after add Kerning subtable Format 3 support, saving will be not significant.


Over the next 1-2 weeks I will try to invent a "table restorer from glyph pairs" algorithm.

@puzrin
Contributor Author

puzrin commented May 4, 2019

Checked cmap subtables for full Roboto dump:

12 subtable(s): 9 "format 0", 3 "sparse"

That means, in worst case 4 binary search iterations required to find subtable. Nice. Should not affect performance.
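
To illustrate the iteration count, here is a sketch of a binary search over subtable ranges that also reports how many steps it took. It assumes subtables are sorted by start code and do not overlap; `cmap_range_t` and `find_subtable` are hypothetical names, not part of the spec.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t range_start;  /* first char code covered by the subtable */
    uint16_t range_length; /* number of char codes covered */
} cmap_range_t;

/* Returns the index of the subtable covering `code`, or -1 if none does.
 * `*steps` (optional) receives the number of loop iterations used. */
static int find_subtable(const cmap_range_t *t, size_t count,
                         uint32_t code, unsigned *steps)
{
    size_t lo = 0, hi = count; /* half-open interval [lo, hi) */
    unsigned n = 0;
    int found = -1;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        n++;
        if (code < t[mid].range_start) {
            hi = mid;
        } else if (code - t[mid].range_start >= t[mid].range_length) {
            lo = mid + 1;
        } else {
            found = (int)mid;
            break;
        }
    }
    if (steps) *steps = n;
    return found;
}
```

For 12 subtables the interval shrinks 12, 6, 3, 1, so the loop never runs more than 4 times.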

@puzrin
Contributor Author

puzrin commented May 5, 2019

yireo-joomla1/lv_font_conv@e4e0a85

Updated spec & code to use kerningScale FP12.4. That will simplify upcoming kerning subformats changes.


Also pushed to temporary kern branch calculation of glyph "classes" (grouping) stat, required for subtables format2/format3 support.

0x20-0x7f:

$ env DEBUG=* ./lv_font_conv.js --font ./node_modules/roboto-fontface/fonts/roboto/Roboto-Regular.woff -r 0x20-0x7F --size 16 --bpp 3 --format bin -o test.bin
  ...
  font.table.glyph table size = 2836 +0ms
  font.table.kern 51 kerned glyphs of 95, 40 max list, 434 total pairs +0ms
  font.table.kern unique right classes: 40 +8ms
  font.table.kern unique left classes: 35 +2ms
  font.table.kern table size = 1316 +0ms
  font font size: 4524 +25ms

0x20-0x10FFFF:

$ env DEBUG=* ./lv_font_conv.js --font ./node_modules/roboto-fontface/fonts/roboto/Roboto-Regular.woff -r 0x20-0x10FFFF --size 16 --bpp 3 --format bin -o test.bin
  ...
  font.table.glyph table size = 32484 +0ms
  font.table.kern 447 kerned glyphs of 893, 413 max list, 52248 total pairs +0ms
  font.table.kern unique right classes: 83 +116ms
  font.table.kern unique left classes: 103 +124ms
  font.table.kern table size = 261252 +1ms
  font font size: 296992 +412ms

In Format 3 kerning will be:

  • ascii - 96*2 + 40*35 => 1592 bytes
  • full -> 893*2 + 83*103 => 10335 bytes

Note, for unknown reasons Roboto stores data in 2 subtables (GPOS, pairs + format2). And its format2 table dimension is ~40*40 (we have twice as many). Need some checks on what happens, but even the preliminary results look nice.

@puzrin puzrin mentioned this issue May 5, 2019
@puzrin
Contributor Author

puzrin commented May 5, 2019

yireo-joomla1/lv_font_conv@70784b6

Upcoming update of Format 3 for kerning table.

@kisvegabor
Member

kisvegabor commented May 6, 2019

I've also seen that usually classes are used to save memory. It seems more effective but we need to ensure that the converter works with any font. At worst, it should skip the kern table if its format is not supported.

Width byte size
I'm not sure 2 bytes are really required. Probably 2 bits for the fractional part would be enough too:

  • 1 bit sign
  • 5 bit integer part
  • 2 bit fractional part

This way we will have -31.75 .. 31.75. It should be enough for 100 px fonts too.
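
A sketch of how such a one-byte value could be packed and unpacked, assuming a sign-magnitude layout (bit 7 sign, bits 6..2 integer part, bits 1..0 fraction) and round-to-nearest. The function names are hypothetical and this layout is only one possible reading of the proposal.

```c
#include <stdint.h>

/* Pack a kerning value in pixels into 1 sign + 5 integer + 2 fraction bits. */
static uint8_t kern_pack(float px)
{
    uint8_t sign = 0;
    if (px < 0) { sign = 0x80; px = -px; }
    if (px > 31.75f) px = 31.75f;              /* clamp to representable range */
    unsigned q = (unsigned)(px * 4.0f + 0.5f); /* round to nearest 0.25 px */
    return (uint8_t)(sign | q);
}

/* Unpack back to pixels. */
static float kern_unpack(uint8_t v)
{
    float mag = (float)(v & 0x7F) / 4.0f;
    return (v & 0x80) ? -mag : mag;
}
```

With this layout a kerning of -1.3 px is stored as -1.25 px, while a kerning of 0.1 px rounds to 0, which is the resolution trade-off being discussed here.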

@puzrin
Contributor Author

puzrin commented May 6, 2019

It seems more effective but we need to ensure that the converter works with any font.

It will select the most optimal format possible. I don't understand why you decided it cannot work with some fonts.

At worst, it should skip the kern table if its format is not supported.

Sounds like "magic". Convertor works as it can, and write least possible size. It's user's responsibility to take modern font.

Width byte size
I'm not sure 2 bytes are really required. Probably 2 bits for the fractional part would be enough too:

Are you about width or kerning? Kerning is 0.1 in many cases of 16px font. I don't like to restrict resolution with 2 bits fractional. And i don't like to restrict max possible size with 200px (some chars will have kerning > 31)

Also note, I avoid bit machinery in the spec. That's too dangerous a data optimization, except for boolean flags. One day it may turn out that the bits are not enough, and this cannot be fixed because the data has become too coupled.

@kisvegabor
Member

Sounds like "magic". Convertor works as it can, and write least possible size. It's user's responsibility to take modern font.

I absolutely don't agree. We need to add as few restrictions as possible. If the user wants to use a font for any reason we need to do our best to make possible the use of that fonts.

Are you about width or kerning? Kerning is 0.1 in many cases of 16px font. I don't like to restrict resolution with 2 bits fractional. And i don't like to restrict max possible size with 200px (some chars will have kerning > 31)

Limiting the font size around 200 px and 0.25 px resolution is reasonable compared to the cost of an extra byte. With proper rounding the error will be <= 0.125 px which is very acceptable.

@puzrin
Contributor Author

puzrin commented May 7, 2019

I absolutely don't agree. We need to add as few restrictions as possible. If the user wants to use a font for any reason we need to do our best to make possible the use of that fonts.

Seems I missed the point of the discussion. The initial (source) font is correct, and the author drew it with appropriate features (with or without kerning, depending). Changing the author's intent is not good. The convertor keeps those features and writes a file of minimal size. What's wrong?

Limiting the font size around 200 px and 0.25 px resolution is reasonable compared to the cost of an extra byte. With proper rounding the error will be <= 0.125 px which is very acceptable.

IMO, you discuss problem that not exists.

yireo-joomla1/lv_font_conv@e4e0a85

Current spec has scale property, 8-bit kerning values and no restriction about font size.

@embeddedt
Member

embeddedt commented May 7, 2019

@puzrin

And i don't like to restrict max possible size with 200px

As far as I know, LittlevGL is not designed with ultra-high-density screens or extremely large screens in mind. On my computer monitor (~27in with a resolution of 1920x1080) I can only display 4-5 lines of text with a 200px font size. Most embedded systems have much smaller displays. Therefore, I can't see a practical use case where 200px fonts would be required. If, at some point in the future, such display technology becomes mainstream in embedded systems, then an extension of the font specification to allow that would make sense. However, at the moment, I don't see why we need to support this for an embedded GUI library designed for constrained environments.

@puzrin
Contributor Author

puzrin commented May 7, 2019

I consider a bad practice to select restricted solutions at data level. If i can suggest more universal solution for the same cost, why should i select one less universal?

On my computer monitor (~27in with a resolution of 1920x1080) I can only display 4-5 lines of text with a 200px font size.

  1. A device can show a big single line of rolling text. Just an imagined example.
  2. Spec is more universal thing, than for "lvgl only"

@puzrin
Contributor Author

puzrin commented May 7, 2019

I've updated convertor & spec.

Now it tries to build kerning with Subtable Format 3 and Subtable Format 0, then selects the smaller one. Size of full Roboto dump (16px, 3 bpp) reduced from 250K to 50K.

Example with 2 languages, english and russian:

$ env DEBUG=* ./lv_font_conv.js --font ./node_modules/roboto-fontface/fonts/roboto/Roboto-Regular.woff -r 0x20-0x7F --size 16 --bpp 3 --format bin -o test.bin -r 0x401,0x410-0x44F,0x451
  ...
  font.table.head table size = 44 +0ms
  font.table.cmap 2 subtable(s): 2 "format 0", 0 "sparse" +0ms
  font.table.cmap table size = 224 +2ms
  font.table.loca table size = 336 +0ms
  font.table.glyph table size = 5104 +0ms
  font.table.kern 104 kerned glyphs of 161, 72 max list, 1668 total pairs +0ms
  font.table.kern table format0 size = 5008 +3ms
  font.table.kern unique left classes: 60 +4ms
  font.table.kern unique right classes: 53 +1ms
  font.table.kern table format3 size = 3508 +3ms
  font font size: 9228 +38ms

@embeddedt
Member

The initial (source) font is correct, and the author drew it with appropriate features (with or without kerning, depending). Changing the author's intent is not good. The convertor keeps those features and writes a file of minimal size. What's wrong?

We cannot introduce new restrictions on the user. If, for whatever reason, they wish to use a font in a different way than intended, that's their decision, not ours.

@puzrin
Contributor Author

puzrin commented May 7, 2019

We cannot introduce new restrictions on the user. If, for whatever reason, they wish to use a font in a different way than intended, that's their decision, not ours.

Option to drop kerning data was added 2 weeks ago: yireo-joomla1/lv_font_conv@2a915e0. And this is convertor's feature, not related to data spec.

@kisvegabor
Member

kisvegabor commented May 8, 2019

Kerning types
What happens if the user tries to convert an "exotic" font which kerning format is not supported? Will he get an error or just a warning saying that "kerning format is unknown so it is skipped".

Kerning value
How is it stored in higher resolution in one byte? Send a link please.

Kerning format 3
Where are the letters for each class are stored? I.e. how do you know that 'C', O, Q are in the same class?

Max height
We need to decide the requirements first then write spec accordingly. 200 px fonts are enough in our case so 1 byte is required here.

@puzrin
Contributor Author

puzrin commented May 8, 2019

What happens if the user tries to convert an "exotic" font which kerning format is not supported? Will he get an error or just a warning saying that "kerning format is unknown so it is skipped".

We use opentype.js to extract kerning data. I guess, if format not supported it returns 0.

Kerning value
How is it stored in higher resolution in one byte? Send a link please.

You mix the terms "scale" and "resolution". Resolution stays the same, 8 bits. Scale is stored in the header.

Kerning format 3
Where are the letters for each class are stored? I.e. how do you know that 'C', O, Q are in the same class?

Glyphs having exactly the same combinations of kerning values have the same class. The convertor scans the values and builds class lists.

https://github.com/littlevgl/lv_font_conv/blob/master/lib/font/table_kern.js#L111-L125
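
The linked code is JavaScript; the idea can be sketched in C as follows. This is a simplified illustration and not a port of the convertor: it assumes a dense n*n kerning matrix as input, handles only the left side, and uses class 0 for glyphs without kerning. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* kern is an n*n row-major matrix: kern[l * n + r] is the value for pair (l, r).
 * out_class[l] receives 0 for glyphs with no kerning, else a 1-based class id.
 * Rows with identical content get the same class. Returns the class count
 * (this sketch assumes it stays below 256). */
static size_t build_left_classes(const int8_t *kern, size_t n, uint8_t *out_class)
{
    size_t classes = 0;
    for (size_t i = 0; i < n; i++) {
        const int8_t *row = kern + i * n;
        int all_zero = 1;
        for (size_t c = 0; c < n; c++) {
            if (row[c] != 0) { all_zero = 0; break; }
        }
        out_class[i] = 0;
        if (all_zero) continue;

        /* Reuse the class of an earlier glyph with exactly the same row. */
        for (size_t j = 0; j < i; j++) {
            if (out_class[j] != 0 && memcmp(row, kern + j * n, n) == 0) {
                out_class[i] = out_class[j];
                break;
            }
        }
        if (out_class[i] == 0) out_class[i] = (uint8_t)++classes;
    }
    return classes;
}
```

Right classes could be built the same way by comparing columns instead of rows.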

@puzrin
Contributor Author

puzrin commented May 8, 2019

Note about single kerning value format. Now we have FP4 in 2 places:

  • kerning values (FP4.4)
  • kerning scale (FP12.4)

We could store instead:

  • kerning value as Uint8
  • Kerning scale as FP8.8

I don't know whether it is worth doing such a thing or not. Technically there is no difference, because those must be multiplied anyway.
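
A sketch of that multiplication for the first variant (signed FP4.4 value times FP12.4 scale), done purely in integers. The function names and the round-half-away-from-zero choice are assumptions for illustration.

```c
#include <stdint.h>

/* Multiply an FP4.4 kerning value by an FP12.4 scale.
 * The result has 4 + 4 = 8 fractional bits. */
static int32_t kern_scaled_fp8(int8_t value_fp4_4, uint16_t scale_fp12_4)
{
    return (int32_t)value_fp4_4 * (int32_t)scale_fp12_4;
}

/* Round a value with 8 fractional bits to whole pixels
 * (half away from zero, avoiding right shifts of negative numbers). */
static int32_t fp8_to_px(int32_t v)
{
    return v >= 0 ? (v + 128) / 256 : -((-v + 128) / 256);
}
```

For example, a stored value of -24 (-1.5 in FP4.4) with scale 32 (2.0 in FP12.4) gives -768, which is -3 px. Storing the value as a plain 8-bit integer with an FP8.8 scale would yield the same kind of product with 8 fractional bits.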

@kisvegabor
Member

In light of this comment I close this issue. Let's continue in #1057

3 participants