question about your dataset number? #1

Closed
Johnson-yue opened this issue Sep 24, 2020 · 2 comments

Comments

@Johnson-yue
Contributor

Hi, good job and thanks for sharing, but I have a question about the numbers reported for your dataset.
[image]

482 Chinese fonts and a total of 19,514 characters? The average should then be 19,514 / 482 ≈ 40 characters per font, so why do you say it is 6,654?

@SanghyukChun
Collaborator

SanghyukChun commented Sep 24, 2020

@Johnson-yue
Hi, thanks for your interest and for asking the very first question about our paper.

By "19,514 characters" we did not mean the number of images (or glyphs),
but the number of distinct characters, i.e., Unicode code points.

Thus, each font has 6,654 images on average, and the union of the Unicode characters covered by all fonts in the train set has 19,514 entries.
On the other hand, the total number of "images" in the train set is 6,654 * 482 = 3,207,228.

Please don't hesitate to reach out if you have any other questions.
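
The distinction between the three numbers above can be illustrated with a small sketch. This is not the authors' code: the function name `dataset_stats` and the toy font-to-character mapping are hypothetical, chosen only to show how the size of the union, the average per font, and the total image count are computed differently.

```python
def dataset_stats(font_to_chars):
    """Summarize a mapping from font name to the set of characters it covers.

    `font_to_chars` is a hypothetical structure used for illustration only.
    """
    num_fonts = len(font_to_chars)
    # One image (glyph) per (font, character) pair.
    total_images = sum(len(chars) for chars in font_to_chars.values())
    # Distinct characters covered by at least one font.
    unique_chars = set().union(*font_to_chars.values())
    return {
        "num_fonts": num_fonts,
        "unique_chars": len(unique_chars),
        "total_images": total_images,
        "avg_images_per_font": total_images / num_fonts,
    }


# Toy data: three fonts with overlapping character coverage.
toy = {
    "font_a": set("ABCD"),
    "font_b": set("ABCDEF"),
    "font_c": set("DEFGH"),
}
stats = dataset_stats(toy)
print(stats)

# With the figures quoted in this thread: average images per font * fonts.
print(6654 * 482)
```

In the toy data the union has 8 characters while the fonts hold 15 images in total, so dividing the union size by the number of fonts says nothing about the average per font. The same applies to dividing 19,514 by 482.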

@Johnson-yue
Contributor Author

Hi, I am building the LMDB file now; sorry for the late reply.
Building the LMDB is very slow: every 6,734 "images" take 544 s.
For your LMDB, with 482 fonts * 6,654 characters, how large is the file?

So you mean that the set of Unicode characters available differs from font to font, and the 482 fonts together cover 19,514 characters. Yes, I understand now, thank you.

By the way, I tested AGIS-Net, and it is impossible to reproduce the performance reported in their paper using their GitHub repo. After I asked two questions, they closed the issue... Thanks for your reply.
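
As a rough back-of-the-envelope check on the build time mentioned above, the sketch below extrapolates the reported rate (6,734 images per 544 s) to the 3,207,228 images quoted earlier in the thread. It assumes the rate stays constant over the whole build, which may not hold in practice; the variable names are illustrative only.

```python
# Figures quoted in this thread.
images_per_batch = 6734      # images written per measured interval
seconds_per_batch = 544      # duration of that interval, in seconds
total_images = 6654 * 482    # average images per font * number of fonts

# Assumption: throughput is constant for the entire build.
images_per_second = images_per_batch / seconds_per_batch
estimated_seconds = total_images / images_per_second
estimated_hours = estimated_seconds / 3600

print(f"{images_per_second:.1f} images/s, about {estimated_hours:.0f} hours in total")
```

Under that assumption the full build would take roughly three days.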

8uos pushed a commit that referenced this issue Sep 28, 2020
remove some unused code
This was referenced Apr 17, 2023