sparse font support #78

Closed
hotpeperoncino opened this Issue Jan 17, 2018 · 38 comments

5 participants
hotpeperoncino commented Jan 17, 2018

Far East fonts such as Chinese and Japanese mostly have sparse data (the valid Unicode values are sparse).
The current font implementation appears to expect font data that is not sparse.

I suggest logic like this:

  1. If the data is sparse, map the Unicode value to an index in the font data.
     If the data is not sparse, use the Unicode value as the index.
  2. Use the index to retrieve the font data.

Please provide a mechanism like this for handling sparse fonts in the library.
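
The two steps above could be sketched roughly as follows. This is only an illustration: `sketch_font_t` and `glyph_index` are hypothetical names, not part of the library, and the dense case uses an offset from the first code point as the index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical font descriptor, for illustration only. */
typedef struct {
    uint32_t first_unicode;       /* first code point covered by a dense font */
    uint32_t glyph_count;         /* number of glyphs stored */
    const uint32_t *unicode_list; /* NULL for a dense font, else one code point per glyph */
} sketch_font_t;

/* Step 1: map a code point to a glyph index, or return -1 if it is not stored. */
static int32_t glyph_index(const sketch_font_t *font, uint32_t unicode)
{
    if (font->unicode_list == NULL) {
        /* Dense font: the offset from the first code point is the index. */
        if (unicode < font->first_unicode) return -1;
        uint32_t idx = unicode - font->first_unicode;
        return idx < font->glyph_count ? (int32_t)idx : -1;
    }

    /* Sparse font: look the code point up in the list. */
    for (uint32_t i = 0; i < font->glyph_count; i++) {
        if (font->unicode_list[i] == unicode) return (int32_t)i;
    }
    return -1;
}
```

Step 2 would then use the returned index to address the glyph data, whatever its storage format is.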

Member

kisvegabor commented Jan 17, 2018

Hi,

I'm planning to implement this feature in the 5.1 release.

Author

hotpeperoncino commented Jan 17, 2018

I'll wait for it.


xiaoleizii commented Jan 18, 2018

[Image: 20180118175156]


xiaoleizii commented Jan 18, 2018

I'm sorry, my English is not good; this passage was translated with Google Translate.
The number of commonly used Chinese characters is huge (the common set is 20902 characters, Unicode range 0x4E00-0x9FA5), so I would need a large external flash to store the bitmap data for this contiguous range. In general, though, an embedded system needs only a limited set of characters, usually 200-500. A solution is therefore needed to build a font file from just those 200-500 characters, whose Unicode values are not contiguous, so that Chinese fonts can be used without increasing hardware cost.
I have currently implemented what I need by modifying the lv_font_get_bitmap() and lv_font_get_width() functions in lv_font.c, but this is not universally applicable, and I hope you can consider adding support for this requirement.
I think the easiest way is this: when the font is created, the Unicode value of each character is also stored in an array. To draw a character, search that array for its Unicode value, then return the corresponding bitmap from the dot-matrix data.
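
A minimal sketch of that lookup, assuming the Unicode array is kept sorted and every glyph occupies the same number of bytes. The names `find_glyph` and `glyph_offset` are hypothetical and not taken from the library or from the modified lv_font.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Binary search in a sorted array of code points.
 * Returns the position of `unicode` in the array, or -1 if it is not stored. */
static int32_t find_glyph(const uint32_t *sorted_unicode, size_t count, uint32_t unicode)
{
    size_t lo = 0;
    size_t hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (sorted_unicode[mid] == unicode) return (int32_t)mid;
        if (sorted_unicode[mid] < unicode) lo = mid + 1;
        else hi = mid;
    }
    return -1;
}

/* Byte offset of a glyph in the dot-matrix data, assuming fixed-size glyphs
 * stored in the same order as the code point array. */
static size_t glyph_offset(int32_t index, size_t bytes_per_glyph)
{
    return (size_t)index * bytes_per_glyph;
}
```

With 200-500 entries a linear scan would also work; the binary search only matters if the table grows.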

Member

kisvegabor commented Jan 18, 2018

Hi,
Thank you for the image!
Support for Chinese characters will be added in the next release!

My idea is the following:
Currently lv_font_t describes fonts with arrays. I'm planning to extend it with 2 functions:

  1. read_bitmap(unicode_letter, buf), which fills buf with the bit-based image of the letter. You can load it from anywhere, even from external flash; it depends on your implementation.
  2. read_width(unicode_letter), which gets the width of a character (it can return a constant for Chinese letters).

In the next release you will be able to create a font this way:

static lv_font_t custom_font;
custom_font.read_bitmap = my_font_30_read_bitmap;
custom_font.read_width = my_font_30_read_width;
custom_font.height = 30;
custom_font.first_unicode = 0xABCD;
custom_font.last_unicode = 0xBCDE;

...

style1.text.font = &custom_font;

Fonts can still be concatenated, e.g. to add symbol fonts:
lv_font_add(&symbol_30, &custom_font); //Extend custom_font with a symbol font

What do you think? Will it be usable for Chinese characters?


xiaoleizii commented Jan 18, 2018

Yes, this is an ideal way to achieve it. My current solution works the same way.
In addition, my development IDE is MDK, which has very poor support for UTF-8 Chinese, so I have been using GB2312 encoding and an external font chip to provide Chinese fonts.
With the proposed approach I can replace the UTF-8 decoding with GB2312 decoding. Thank you for your help.

Author

hotpeperoncino commented Jan 18, 2018

The image looks good!

I think it is difficult to decide on an in-memory font data format that uses memory efficiently.
"Sparse" means 0-255 plus the following ranges:

Japanese-style punctuation (3000 - 303f)
Hiragana (3040 - 309f)
Katakana (30a0 - 30ff)
Full-width roman characters and half-width katakana (ff00 - ffef)
CJK unified ideographs - common and uncommon kanji (4e00 - 9faf)

We could impose a precondition that all glyphs in a range have the same height and width. Such a precondition would probably help reduce memory use.
A Japanese font needs about 7000 characters. In my experience, with 12x9-dot characters the font data comes to about 150 Kbytes: 126 Kbytes of font images, 14 Kbytes of indexes, and more.
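
One way to arrive at these figures is sketched below. The layout is an assumption, not something stated in the thread: 1 bit per pixel, glyphs 12 dots wide and 9 dots tall, each row padded to whole bytes, and a 2-byte index entry per glyph. The function names are hypothetical.

```c
#include <stdint.h>

/* Bytes for one 1-bit-per-pixel glyph with each row padded to whole bytes. */
static uint32_t glyph_bytes(uint32_t width, uint32_t height)
{
    return ((width + 7u) / 8u) * height;
}

/* Total bytes for `glyphs` fixed-size glyphs plus `index_bytes` of index per glyph. */
static uint32_t font_bytes(uint32_t glyphs, uint32_t width, uint32_t height,
                           uint32_t index_bytes)
{
    return glyphs * (glyph_bytes(width, height) + index_bytes);
}
```

Under these assumptions a glyph takes 18 bytes, so 7000 glyphs take 126000 bytes of image data and 14000 bytes of index, which matches the 126 Kbytes and 14 Kbytes quoted above.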

Member

kisvegabor commented Jan 19, 2018

You can "segment" a font by initializing several lv_font_t variables and concatenating them with lv_font_add(child_font, parent_font). The height must be the same within a font, though.
This way you can assign your own functions to each range.
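
A rough sketch of how such segments might be chained and searched. The library's internal implementation is not shown in this thread, so everything here is hypothetical: `sketch_font_t`, `sketch_font_add` and `sketch_font_find` are illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical font segment covering one contiguous Unicode range. */
typedef struct sketch_font {
    uint32_t first_unicode;
    uint32_t last_unicode;
    uint8_t height;
    struct sketch_font *next; /* next segment in the chain, or NULL */
} sketch_font_t;

/* Append `child` to the end of `parent`'s chain.
 * Returns 0 on success, -1 if an argument is NULL or the heights differ. */
static int sketch_font_add(sketch_font_t *child, sketch_font_t *parent)
{
    if (child == NULL || parent == NULL || child->height != parent->height) return -1;
    while (parent->next != NULL) parent = parent->next;
    parent->next = child;
    return 0;
}

/* Find the segment whose range contains `letter`, or NULL if none does. */
static const sketch_font_t *sketch_font_find(const sketch_font_t *font, uint32_t letter)
{
    for (; font != NULL; font = font->next) {
        if (letter >= font->first_unicode && letter <= font->last_unicode) return font;
    }
    return NULL;
}
```

Each segment found this way could then supply its own bitmap and width functions for its range.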


xiaoleizii commented Jan 19, 2018

Most Japanese characters have few strokes, while most commonly used Chinese characters have many. In my experience, Chinese characters need at least 16 * 16 dots to be recognizable, and at least 24 * 24 dots to display well. The text in the picture uses a 30 * 30 dot matrix; with anti-aliasing, the number of dots would have to increase to at least 60 * 60. A 30-pixel font needs at least 2.25 MB to store the dot-matrix data. Therefore, it is very important to be able to load a user font with non-contiguous Unicode values.
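
The size estimate can be roughly reproduced as below, assuming 1 bit per pixel with tightly packed bits and the 20902-character set mentioned earlier in the thread. Both assumptions, and the function name, are illustrative.

```c
#include <stdint.h>

/* Bytes needed for `glyphs` glyphs of width x height dots at 1 bit per pixel,
 * with all bits tightly packed. */
static uint64_t packed_font_bytes(uint64_t glyphs, uint64_t width, uint64_t height)
{
    uint64_t bits = glyphs * width * height;
    return (bits + 7u) / 8u;
}
```

Under these assumptions the full set at 30 * 30 needs 2351475 bytes, which is in line with the figure above, while a 500-character subset needs only 56250 bytes.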

Member

kisvegabor commented Jan 19, 2018

I see. The described mechanism will make it possible to use non-contiguous blocks. Your read_bitmap function gets a Unicode letter as an argument. In the function you can use a table to map this letter to a bitmap. Maybe you have only 80 bitmaps. If you get one of these 80 letters, you return its bitmap; otherwise you return a default (e.g. empty) bitmap.

For example (only A, B and C are stored):

| Letter argument | Bitmap |
|---|---|
| A | Get bitmap of 'A' |
| B | Get bitmap of 'B' |
| C | Get bitmap of 'C' |
| X | Not implemented, get empty buffer |

Member

kisvegabor commented Jan 20, 2018

@xiaoleizii Did you end up using LittlevGL's UTF-8 decoder? If yes, can you test the GUI from your previous post again? I've made some minor fixes in the UTF-8 decoding and want to be sure it still works well.


xiaoleizii commented Jan 20, 2018

Yes, I am still using LittlevGL's UTF-8 decoder. I will try it later.

Author

hotpeperoncino commented Jan 20, 2018

My inadequate explanation led to a misunderstanding. The segment mechanism is not enough.
Sparse means that the data is sparse within the CJK unified range itself: there are irregular gaps between the code points that are actually used, as with Japanese kanji.


xiaoleizii commented Jan 21, 2018

Because I made some changes in the GUI library, I cannot simply overwrite it with the new version to test.
The UTF-8 decoding is in lv_txt.c, which I compared against the new version with Beyond Compare.
The modified part is the decoding of 4-byte UTF-8 characters.
Since the Chinese characters in my editor are 3 bytes long in UTF-8, I cannot test the 4-byte section yet.
Chinese still displays normally at the moment; later I will find some four-byte UTF-8 characters for testing.


xiaoleizii commented Jan 21, 2018

I tested it with five four-byte Chinese characters, and the result is not correct.
The test characters are "𪜀𪜁𪜂𪜃𪜄"; the expected Unicode values are 0x0002A700-0x0002A704, but the actual decoded values are 0x0002A700, 0x0002A740, 0x0002A780, 0x0002A7C0, 0x0002A800.
Reference code page: http://www.qqxiuzi.cn/zh/hanzi-unicode-bianma.php?zfj=kzc
    /*4 bytes UTF-8 code*/
    else if((txt[*i] & 0xF8) == 0xF0) {
        result = (uint32_t)(txt[*i] & 0x07) << 18;
        (*i)++;

        if((txt[*i] & 0xC0) != 0x80) return 0;  /*Invalid UTF-8 code*/
        result += (uint32_t)(txt[*i] & 0x3F) << 12;
        (*i)++;

        if((txt[*i] & 0xC0) != 0x80) return 0;  /*Invalid UTF-8 code*/
        result += (uint32_t)(txt[*i] & 0x3F) << 6;
        (*i)++;

        if((txt[*i] & 0xC0) != 0x80) return 0;  /*Invalid UTF-8 code*/
        result += (uint32_t)(txt[*i] & 0x3F) << 6;  /*<-- the problem line*/
        (*i)++;
    }

There is a bug in the four-byte UTF-8 decoding: the last byte must not be shifted.
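
A standalone sketch of the four-byte branch with the shift on the last byte removed. The function name `utf8_decode_4byte` is hypothetical and the function is not the library's decoder; it only isolates the corrected arithmetic.

```c
#include <stdint.h>

/* Decode one 4-byte UTF-8 sequence starting at txt[*i] and advance *i past
 * the bytes that were consumed. Returns the code point, or 0 if the bytes
 * are not a valid 4-byte sequence. */
static uint32_t utf8_decode_4byte(const uint8_t *txt, uint32_t *i)
{
    uint32_t result;

    if ((txt[*i] & 0xF8) != 0xF0) return 0;      /* not a 4-byte lead byte */
    result = (uint32_t)(txt[*i] & 0x07) << 18;
    (*i)++;

    if ((txt[*i] & 0xC0) != 0x80) return 0;      /* invalid continuation byte */
    result += (uint32_t)(txt[*i] & 0x3F) << 12;
    (*i)++;

    if ((txt[*i] & 0xC0) != 0x80) return 0;      /* invalid continuation byte */
    result += (uint32_t)(txt[*i] & 0x3F) << 6;
    (*i)++;

    if ((txt[*i] & 0xC0) != 0x80) return 0;      /* invalid continuation byte */
    result += (uint32_t)(txt[*i] & 0x3F);        /* last byte: no shift */
    (*i)++;

    return result;
}
```

U+2A701 is encoded as F0 AA 9C 81. With the extra shift the low six bits land at bit 6, which gives 0x2A740 instead of 0x2A701 and matches the wrong values reported above.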


xiaoleizii commented Jan 21, 2018

This problem was not introduced by the recent changes to the decoder; the previous version was also incorrect. If the left shift of the last byte is removed, both the old and the modified version decode correctly.


xiaoleizii commented Jan 21, 2018

The encodings of the five four-byte Chinese characters in memory are shown below.
[Image: 20180121122610]

Member

kisvegabor commented Jan 21, 2018

@xiaoleizii Thank you very much for the deep examination! I will fix it!

Member

kisvegabor commented Jan 21, 2018

@hotpeperoncino
I think I understood you well. See the updated example (I hope it is more informative).

Let's say we have only 3 (sparse) characters: 'A', 'G' and 'P'.

| Letter argument | Bitmap |
|---|---|
| A | Get bitmap of 'A' |
| B | Not implemented, get empty buffer |
| C | Not implemented, get empty buffer |
| ... | ... |
| G | Get bitmap of 'G' |

Here is a code example:

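#include <stdint.h>
#include <string.h>

typedef struct {
  uint32_t unicode;        /*Unicode value of the letter*/
  const uint8_t * bitmap;  /*Bitmap of the letter (the arrays are defined elsewhere)*/
  uint16_t width;          /*Width of the letter*/
}my_bitmap_t;

static const my_bitmap_t bitmap_30_dsc[] = {
  { 'A', bitmap_30_of_a, 20},
  { 'G', bitmap_30_of_g, 22},
  { 'P', bitmap_30_of_p, 18},
};

void my_bitmap_30_get(uint32_t letter, uint8_t * buf)
{
  uint32_t i;
  for(i = 0; i < sizeof(bitmap_30_dsc) / sizeof(bitmap_30_dsc[0]); i++) {
    if(bitmap_30_dsc[i].unicode == letter) {
      /*Copies 30 * width bytes, i.e. assumes one byte per pixel*/
      memcpy(buf, bitmap_30_dsc[i].bitmap, 30 * bitmap_30_dsc[i].width);
      return;
    }
  }
  /*Letter not stored: buf is left unchanged*/
}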

So LittlevGL will call this function to get the bitmap of a letter. It is up to you how to implement the bitmaps; they don't have to be in a contiguous block.


xiaoleizii commented Jan 22, 2018

I did it this way:
[Image: 20180122100344]
Then I call GetChineseUtf8Array() from the function lv_font_init().
Then I modified the other two functions:
[Image: 20180122100408]
[Image: 20180122100425]
This is one way to get the Chinese dot-matrix data.
The current display looks like this:
[Image: 20180122101306]

Member

kisvegabor commented Jan 22, 2018

Awesome! Thank you for sharing!
I'm thinking of something similar, but I would like to implement the 'bitmap get' via a callback function in lv_font_t so that it fits smoothly into the current mechanism.

Author

hotpeperoncino commented Jan 31, 2018

@kisvegabor thank you for your understanding.


sirius65535 commented Jan 31, 2018

@kisvegabor Sounds great! Chinese fonts have always been a serious problem for me. When do you plan to release 5.1? I'm looking forward to it!


sirius65535 commented Feb 1, 2018

@xiaoleizii Bro, can I add you on WeChat?


qwert1213131 commented Feb 1, 2018

Nice one, poster above.
You have my respect.


sirius65535 commented Feb 1, 2018

@qwert1213131 Comrade Chen Duxiu, please sit down.

Member

kisvegabor commented Feb 5, 2018

Hi,
Sparse font support is progressing well (along with a lot of other updates).
I would like to add some built-in Asian fonts as a reference or example, but I'm not familiar with these characters.
Can you tell me a Unicode range (max 255 characters) that should be added as a built-in font?


sirius65535 commented Feb 6, 2018

@kisvegabor Hey bro, Chinese fonts have a huge number of characters. Some of them we hardly ever use, and the ones we use very often are not contiguous in Unicode. So I think you could take a simple Chinese sentence like “你好老王昨晚你在哪吃的饭在餐馆吃的你媳妇手艺不错”. The Unicode escapes for this sentence are “\u4f60\u597d\u8001\u738b\u6628\u665a\u4f60\u5728\u54ea\u5403\u7684\u996d\u5728\u9910\u9986\u5403\u7684\u4f60\u5ab3\u5987\u624b\u827a\u4e0d\u9519”
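
To see how sparse this sentence is, the code points can be sorted and de-duplicated, as in the sketch below. The helper names `cmp_u32` and `sort_unique` are hypothetical; the code points are the ones quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Code points of the sentence quoted above, in order of appearance. */
static const uint32_t sentence[] = {
    0x4f60, 0x597d, 0x8001, 0x738b, 0x6628, 0x665a, 0x4f60, 0x5728,
    0x54ea, 0x5403, 0x7684, 0x996d, 0x5728, 0x9910, 0x9986, 0x5403,
    0x7684, 0x4f60, 0x5ab3, 0x5987, 0x624b, 0x827a, 0x4e0d, 0x9519
};

/* qsort comparator for uint32_t values. */
static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a;
    uint32_t y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Sort `buf` in place and remove duplicates.
 * Returns the number of unique values left at the start of `buf`. */
static size_t sort_unique(uint32_t *buf, size_t count)
{
    if (count == 0) return 0;
    qsort(buf, count, sizeof buf[0], cmp_u32);

    size_t out = 1;
    for (size_t i = 1; i < count; i++) {
        if (buf[i] != buf[out - 1]) buf[out++] = buf[i];
    }
    return out;
}
```

Applied to a copy of `sentence`, this leaves 19 distinct characters between 0x4E0D and 0x9986, a span of 19322 code points, so a contiguous table over that range would be almost entirely empty.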

Member

kisvegabor commented Feb 6, 2018

Thank you, @sirius65535
What does 你好老王昨晚你在哪吃的饭在餐馆吃的你媳妇手艺不错 mean? Google Translate gave a strange result. :)


qwert1213131 commented Feb 6, 2018

@kisvegabor This is a Chinese joke.

  • xxx says: Hello Lao Wang, where did you eat last night? In the restaurant?
  • Lao Wang says: Your wife is a good cook.

Member

kisvegabor commented Feb 6, 2018

Oh, thank you! :)

I will add these letters.


xiaoleizii commented Feb 7, 2018

@sirius65535 That's a deep trick, bro.


xiaoleizii commented Feb 7, 2018

I was very busy these days, so I did not have much time to test this GUI library. Here is a picture of my first application that uses this GUI.
[Image: _20180207123458]
I think this GUI is very useful for embedded applications with weak hardware performance.
I use only an STM32F407VET6 to drive a 5-inch LCD (854 * 480 dots) with a capacitive touch screen through FSMC.
[Image: _20180207124140]
@kisvegabor thank you very much!
PS: The Chinese characters provided above are not a good example.
Try "富强民主文明和谐自由平等公正法治爱国敬业诚信友善"
[Image: timg]


qwert1213131 commented Feb 7, 2018

The bro above is solid!


sirius65535 commented Feb 7, 2018

@qwert1213131 @xiaoleizii I bow to the masters.

Member

kisvegabor commented Feb 7, 2018

@xiaoleizii Thank you for the image! It looks great! It would be nice to put an example GUI with Asian characters on https://littlevgl.com. If you or anybody else has a good-looking one, please send it to me!

And for all: please talk in English :)

Member

kisvegabor commented Feb 24, 2018

Please check #125 and test the sparse font support! :)

Member

kisvegabor commented Mar 7, 2018

Sparse font support is officially added in v5.1.
Thank you for the contribution!
