Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storing fonts outside of linked executable #1237

Closed
kisvegabor opened this issue Oct 24, 2019 · 26 comments
Closed

Storing fonts outside of linked executable #1237

kisvegabor opened this issue Oct 24, 2019 · 26 comments
Labels

Comments

@kisvegabor
Copy link
Member

@kisvegabor kisvegabor commented Oct 24, 2019

@amirgon @puzrin

Let continue the discussion of C font format serialized here.

There is a topic in the Forum about it as well where the following question was asked:

The first difficulty is that in the lv_font_t structure definition there are function pointers which cannot be stored in external flash.

@amirgon do you have any idea for it?

@embeddedt

This comment has been minimized.

Copy link
Member

@embeddedt embeddedt commented Oct 24, 2019

I would actually side with @puzrin in regards to whether we serialize the C format or use the existing binary format. I think it makes more sense to implement support for the binary format.

Serialization is inherently a complicated procedure (especially when pointers are involved). It seems to me that it would make much more sense to choose a format that was already designed for the binary use case. Plus, as mentioned by @amirgon, there are platform issues like endianness that need to be taken into account with the serialized C format, whereas the binary format just doesn't have those problems.

@C47D

This comment has been minimized.

Copy link
Contributor

@C47D C47D commented Oct 24, 2019

I may be talking nonse but, is it possible to use protocol buffers to serialize the fonts? nanopb is the implementation of protocol buffers in C targeting embedded systems or use something like mcufont.

@puzrin

This comment has been minimized.

Copy link

@puzrin puzrin commented Oct 24, 2019

@embeddedt, note, binary reader should be guarded with tests. That may be not obvious, but important. Those will be ~ replacement of "text format readability".

@kisvegabor i'd suggest to remember why you NOT selected binary format from the begining and how to resolve those issues with serialization. Then answer to yourself why serialisation is better that direct binary format use.

@amirgon

This comment has been minimized.

Copy link
Contributor

@amirgon amirgon commented Oct 24, 2019

The first difficulty is that in the lv_font_t structure definition there are function pointers which cannot be stored in external flash.

@amirgon do you have any idea for it?

Well, that struct (lv_font_t) will have to be kept in RAM.
Fortunately, it's a very small top level struct and there are no function pointers elsewhere. All the other structs can be on ROM/flash.
The script will process lv_font_t but will not fill these function pointer fields (keep them NULL). It will fill all the other fields of that struct so we could copy them to the struct instance on RAM.
Then, we can allocate lv_font_t instance on RAM, copy its contents from flash and only fill in the missing function pointers.

I would actually side with @puzrin in regards to whether we serialize the C format or use the existing binary format. I think it makes more sense to implement support for the binary format.

@embeddedt Actually I agree with you, mainly because the original binary format is platform independent and implementation independent.

However,

  • When working on the new font converter, you (and @kisvegabor) chose not to use the binary format on lvgl and use the generated C file instead, probably for some good reasons.
  • No one volunteers to do that work. It would require resolving the binary format related issues you had, and a good understanding of the way fonts are represented and used.
  • The existing binary format is not fully "battle tested", as @puzrin put it.
    Trying to use it has a good chance to expose bugs/incompatibilities in the font converted, and maybe in the binary format itself.

So ideally - yes, using the existing binary format is a better idea.
Practically, however, I think I can solve the serialization issue (and my solution is relatively simple, even with pointers taken into account), and I can do it without knowing almost anything about the font internals, and keep the existing font structures as is with minimal impact on lvgl code and performance.

So - which do you choose? Do you want to wait for the "proper" solution with the original binary format, or would you consider my suggested "shortcut"?

is it possible to use protocol buffers to serialize the fonts?

@C47D I'm not sure Protocol Buffers could really help here.
If you have some idea how we could use it, please share it with us.
The problems I see are:

  • I don't know of a simple way to automatically convert C structs into protobuf .proto file.
    We could do it the other way around, define .proto file and create structs from it, but it would become a bit cumbersome in my opinion.
  • My suggestion allows putting the data directly on ROM/flash and just cast the structs there (assuming the ROM/flash are mapped to the microprocessor address space).
    In my suggestion there is no "deserialization" process that takes the data, parses it and fills in the structs, as you would find on standard serialization tools, protobuf included.
    If we were using serialization, all the parsed/deserialized data would have to consume RAM.

btw @kisvegabor for this reason I don't like the title of this issue. It should be "Font Binary Format" or something like that, because my solution is not really serialization per se.

@embeddedt embeddedt changed the title Font serializer Storing fonts outside of linked executable Oct 24, 2019
@embeddedt

This comment has been minimized.

Copy link
Member

@embeddedt embeddedt commented Oct 24, 2019

you (and @kisvegabor) chose not to use the binary format on lvgl and use the generated C file instead, probably for some good reasons.

From what I can recall (and rereading the issues) it was to:

  • Save development effort, since we could reuse some of the existing font code.
  • Make it easier for people to import fonts, since they can add a C file to their IDE instead of converting a binary.

Trying to use it has a good chance to expose bugs/incompatibilities in the font converted, and maybe in the binary format itself.

If proper tests are implemented, that shouldn't be an issue. Also keep in mind that I think @puzrin developed most of the font converter using the binary format and added the C format near the end.

for this reason I don't like the title of this issue. It should be "Font Binary Format" or something like that, because my solution is not really serialization per se.

Title changed.

@amirgon

This comment has been minimized.

Copy link
Contributor

@amirgon amirgon commented Oct 24, 2019

@embeddedt you didn't tell us what you think about:

So - which do you choose? Do you want to wait for the "proper" solution with the original binary format, or would you consider my suggested "shortcut"?

That was not a rhetorical question, I would really like to know your opinion.

@embeddedt

This comment has been minimized.

Copy link
Member

@embeddedt embeddedt commented Oct 24, 2019

I would really like to know your opinion.

There are pros and cons for both - I am not sure which one to choose. In the interest of clean code I would choose the binary format, but for ease of implementation I would choose your idea.

@amirgon

This comment has been minimized.

Copy link
Contributor

@amirgon amirgon commented Oct 24, 2019

There are pros and cons for both - I am not sure which one to choose. In the interest of clean code I would choose the binary format, but for ease of implementation I would choose your idea.

Another consideration is practical.
AFAIK no one volunteered to implement it with the original binary format.
I'm not sure where it stands on your/@kisvegabor's priorities and timelines.

I'm willing to spend some time on my suggested workaround, but I will do that only if we all agree it's an acceptable solution.

@amirgon

This comment has been minimized.

Copy link
Contributor

@amirgon amirgon commented Oct 24, 2019

or use something like mcufont.

That's interesting.

@kisvegabor , @puzrin, How does the font-converter / rendering-engine of lvgl compare with mcufont?
I noticed it is used as the font engine in µGFX.

@puzrin

This comment has been minimized.

Copy link

@puzrin puzrin commented Oct 24, 2019

Problem is with cost of next rewrite. Every new "kludge" increase it.

How does the font-converter / rendering-engine of lvgl compares with mcufont?

It compress data on byte level, that's less effective than pixel-level of lv_font_conv. I did not dived into details. Reason about size was enough to loose interest.

@zeile218

This comment has been minimized.

Copy link

@zeile218 zeile218 commented Oct 25, 2019

We extended the font system so it can handle paths like with the images over the filesystem function. You supply a path for the font "S:/somefont.bin" and it gets loaded into RAM. The bin file is the font data that gets generated by the online converter.

@amirgon

This comment has been minimized.

Copy link
Contributor

@amirgon amirgon commented Oct 25, 2019

We extended the font system so it can handle paths like with the images over the filesystem function. You supply a path for the font "S:/somefont.bin" and it gets loaded into RAM. The bin file is the font data that gets generated by the online converter.

Interesting.

@zeile218 - Could you share your code?

@kisvegabor

This comment has been minimized.

Copy link
Member Author

@kisvegabor kisvegabor commented Oct 25, 2019

@amirgon

When working on the new font converter, you (and @kisvegabor) chose not to use the binary format on lvgl and use the generated C file instead, probably for some good reasons.

My main concern against the binary format was that it's more difficult to understand/read/debug and therefore less contributor friendly. If I remember well @amirgon is the fourth person who is interested in binary format but so far I don't know anybody who really implemented it. (Maybe @zeile218 ?)

Besides in my opinion (so it's arguable) the C format is more flexible and easier to change because the compiler does a lot of things instead of you.


Supporting the binary format is a good idea. If you haven't seen yet here is the specification of the binary format. I don't think there are too many bugs because @puzrin carefully tests everything.

I was interested in the serializer because it could be a general-purpose tool whose result is almost fully transparent from lvgl's point of view. However, if the C font needs to be generated first, and it needs to be preprocessed then it seems a little bit circumstantial. Not mentioning the issues of the memory layout and packing.

So if @amirgon or anybody else has the interest to add support for the binary I'm absolutely in. You might get some inspiration (math, searching, etc) from the C format, and hopefully, @zeile218 also can share something which can be used to easier get started.


@puzrin
An lvgl_bin format still might be required to initialize a basic lv_font_t structure with a pinter to the hex data, function pointers and the basic metrics. What do you think?

@puzrin

This comment has been minimized.

Copy link

@puzrin puzrin commented Oct 25, 2019

My main concern against the binary format was that it's more difficult to understand/read/debug and therefore less contributor friendly. If I remember well @amirgon is the fourth person who is interested in binary format but so far I don't know anybody who really implemented it. (Maybe @zeile218 ?)

Besides in my opinion (so it's arguable) the C format is more flexible and easier to change because the compiler does a lot of things instead of you.

This may be true, until you write code without tests and use a lot of manual debug. When tests exist, that's all not a problem at all.

As you can see, my code has good tests coverage. When i do changes, i break tests sometime, and bugs are fixed very quick, because i see problems immediately.

Another problem i pointed at start - text format is not suitable to describe complicated data. I agree, it's readable. But this doesn't help to be suitable :)

@puzrin
An lvgl_bin format still might be required to initialize a basic lv_font_t structure with a pinter to the hex data, function pointers and the basic metrics. What do you think?

My opinion is, you propagate font internals to upper level code without need and use very aggressive (and fragile) manual optimizations. While benefits in RAM is not significant - lower than occupied by any single widget. As a result - code is tight coupled and you have to fall into compatibility issues on every change attempt.

I can understand idea "const pointers do not consume memory". But if you count real memory spends, benefit will be not significant.

IMO, on font load you could create dynamic descriptor with all required intermediate data, format-specific function pointers and so on. That will be not a big struct. Then, all next communications can be done via API, with data in memory:

  • glyph descriptor (something like "glyph slot" in freetype)
  • bitmap buffer

As you can see, this is independent on font internals and does not require const pointers to font inner.

@kisvegabor

This comment has been minimized.

Copy link
Member Author

@kisvegabor kisvegabor commented Oct 25, 2019

This may be true, until you write code without tests and use a lot of manual debug. When tests exist, that's all not a problem at all.

I suggest not starting this bloody topic again. 🙂

IMO, on font load you could create dynamic descriptor with all required intermediate data, format-specific function pointers and so on. That will be not a big struct. Then, all next communications can be done via API, with data in memory:

lv_font_t looks like this now:

/** Describe the properties of a font*/
typedef struct _lv_font_struct
{
    /** Get a glyph's  descriptor from a font*/
    bool (*get_glyph_dsc)(const struct _lv_font_struct *, lv_font_glyph_dsc_t *, uint32_t letter, uint32_t letter_next);

    /** Get a glyph's bitmap from a font*/
    const uint8_t * (*get_glyph_bitmap)(const struct _lv_font_struct *, uint32_t);

    /*Pointer to the font in a font pack (must have the same line height)*/
    uint8_t line_height;      /**< The real line height where any text fits*/
    uint8_t base_line;        /**< Base line measured from the top of the line_height*/
    uint8_t subpx  :2;        /**< An element of `lv_font_subpx_t`*/
    void * dsc;               /**< Store implementation specific data here*/
#if LV_USE_USER_DATA
    lv_font_user_data_t user_data; /**< Custom user data for font. */
#endif
} lv_font_t;

So in your suggestion, we need to get line_height, base_line line_height and subpx from the binary data and need an lv_font_bin_create(lv_font_t * font_p, const uint8_t * bin) function to initialize font_p, right?

@puzrin

This comment has been minimized.

Copy link

@puzrin puzrin commented Oct 25, 2019

So in your suggestion, we need to get line_height, base_line line_height and subpx from the binary data and need an lv_font_bin_create(lv_font_t * font_p, const uint8_t * bin) function to initialize font_p, right?

Principle is a bit different. Descriptor goal is:

  • Identify font.
  • Cache for fast operations.

For example, when speed is not required, you can extract data from bin everytime and store nothing. Another example - when you know format of loca table, kerning table, bpp resolution, compression... - you can assign function pointers to specific data extractors.

So, design process looks like this:

  • clarify external API (i'd suggest to inspect FreeType for ideas)
  • check every method implementation and add cache variables for critical chains.

As you can see, i don't name concrete field (for example, "you need to store vars from HEAD table"), because that's not correct. I explain principle, how to decide when data is important to be in RAM. Also note, this data should be private - user should not access it directly to avoid next spin of tight coupled coding.


So, lv_font_bin_create(lv_font_t * font_p, const uint8_t * bin) will create some kind of font_p, but it's content may differ significantly from what you use on upper levels.

@turoksama

This comment has been minimized.

Copy link
Contributor

@turoksama turoksama commented Oct 28, 2019

Hello there,
Is it convenient to do the bin font job from the very beginning?
something like:

#define LV_FONT_DECLARE(font_name) extern lv_font_t font_name;
#define LV_FONT_BIN_DECLARE(bin_font_struct_var_name, bin) {lv_font_bin_create(lv_font_t * bin_font_struct_var_name, const uint8_t * bin);}
@kisvegabor

This comment has been minimized.

Copy link
Member Author

@kisvegabor kisvegabor commented Oct 29, 2019

@turoksama
If LV_FONT_BIN_DECLARE is declared in a header file at the beginning of a C file then evaluating ot to a function won't work. So it should be rather a declaration.

lv_font_bin_create(&my_font, bin_data); // Can't call a function from here

void main(void) {

}
@turoksama

This comment has been minimized.

Copy link
Contributor

@turoksama turoksama commented Oct 30, 2019

@kisvegabor
No, it is not.
What i mean is declare in lv_init, perhaps as callback.

@kisvegabor

This comment has been minimized.

Copy link
Member Author

@kisvegabor kisvegabor commented Oct 30, 2019

@turoksama So your goal would be to make it work similarly to LV_FONT_CUSTOM_DECLARE?

@turoksama

This comment has been minimized.

Copy link
Contributor

@turoksama turoksama commented Oct 30, 2019

Yes. I prefer callbacks within integrated init sequence.

@kisvegabor

This comment has been minimized.

Copy link
Member Author

@kisvegabor kisvegabor commented Nov 2, 2019

@turoksama I still can't imagine how it could work. I don't know how it is possible to declare a global variable from a function (lv_font_t) and call callback a the same time.
But maybe I've just missed something. Can you give a code example to show how you mean it?

@stale

This comment has been minimized.

Copy link

@stale stale bot commented Nov 23, 2019

This issue or pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@kisvegabor

This comment has been minimized.

Copy link
Member Author

@kisvegabor kisvegabor commented Nov 28, 2019

@amirgon Do you still have time/interest to deal with this feature?

It seems it'll be required for me a little bit later too so we can join our forces! 🙂

@stale stale bot removed the stale label Nov 28, 2019
@amirgon

This comment has been minimized.

Copy link
Contributor

@amirgon amirgon commented Nov 28, 2019

@amirgon Do you still have time/interest to deal with this feature?

It seems it'll be required for me a little bit later too so we can join our forces!

I'm interested in it, but it's lower priority for now.
One of the reasons is that Micropython is working towards dynamic code loading support (like shared libraries on embedded platforms! for both Python and C code!).
This means that the fonts, even as they are represented now, could be provided as external components without the need to rebuild the firmware.

From the discussion we had above I understand that my original suggestion is less favorable (parsing the existing font structs and serializing them). Most people voted for the complete solution of changing the font's binary representation to match @puzrin binary format, although unfortunately no one volunteered to implement this.

Implementing the complete solution, from my point of view, would require much more effort because it requires deep understanding of both the existing font structures and @puzrin's font binary format (and the relation between them, and how fonts work in general, etc.)
My original suggestion didn't require any of these because it would just serialize any data opaquely, and as a bonus could be applied in general to any data represented statically on C structs. But it's clear we are not going on that way now.

@kisvegabor

This comment has been minimized.

Copy link
Member Author

@kisvegabor kisvegabor commented Nov 30, 2019

One of the reasons is that Micropython is working towards dynamic code loading support (like shared libraries on embedded platforms! for both Python and C code!).

It's awesome! Actually, It'd be required for me with Micropython too, so the same approach could work here as well. Please let us know if you made any progress with it.


To keep the list of issues clean I close this PR for now. We can reopen it if someone will be interested in this topic again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.