How to use the data produced by the build_dataset.py file during training #29

Closed
zj916716524 opened this issue Jan 4, 2022 · 9 comments


@zj916716524

Hi, I am very interested in your work and would like to train your network on my own dataset. I built the dataset with build_dataset.py, but its format does not quite match the dataset format used in training. Could you explain exactly how the training data should be prepared? Thank you for reading; I look forward to your reply!

@zj916716524
Author

How should unseen_unis and seen_unis, and unseen_fonts and seen_fonts, be defined in the 'valid' field of the train.json file?

@zj916716524
Author

zj916716524 commented Jan 6, 2022

I have rewritten the file following the format of train.json, using my own dataset:
unseen_unis: characters that are not seen during training
seen_unis: all characters that are seen in the training set
unseen_fonts: fonts that are not seen during training
seen_fonts: fonts used during training
I don't know if this is the correct way to write it. As for avail, I listed the fonts seen and unseen in training together with the characters each font contains.
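
A minimal sketch of the split described above, for illustration only. It assumes that avail is a dictionary mapping each font name to the list of characters it contains, and that characters are stored as hex code point strings; neither layout is confirmed by the repository. The function name build_valid_field and the toy font names are hypothetical.

```python
import json


def build_valid_field(avail, unseen_fonts, unseen_unis):
    """Derive the four lists of the 'valid' field from an avail mapping.

    avail: dict mapping font name -> list of characters the font contains
           (assumed layout, not confirmed by the repository).
    """
    unseen_fonts = set(unseen_fonts)
    unseen_unis = set(unseen_unis)

    # Fonts that are not held out are the ones used for training.
    seen_fonts = sorted(f for f in avail if f not in unseen_fonts)

    # Characters present in at least one training font, minus held-out ones.
    seen_unis = set()
    for font in seen_fonts:
        seen_unis.update(avail[font])
    seen_unis -= unseen_unis

    return {
        "seen_fonts": seen_fonts,
        "unseen_fonts": sorted(unseen_fonts),
        "seen_unis": sorted(seen_unis),
        "unseen_unis": sorted(unseen_unis),
    }


# Toy data for illustration only.
example_avail = {
    "fontA": ["4E00", "4E01", "4E02"],
    "fontB": ["4E00", "4E02"],
    "fontC": ["4E00", "4E01"],
}
example_valid = build_valid_field(example_avail, ["fontC"], ["4E02"])
print(json.dumps({"valid": example_valid}, indent=2))
```

Keeping the held-out characters out of seen_unis makes the two character lists disjoint, which matches the definitions given above.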

@8uos
Collaborator

8uos commented Jan 7, 2022

Hi, sorry for the late reply.
First, I recommend using this repository (clovaai/fewshot-font-generation), because it builds the dataset directly from TTF files, so you do not need to build the LMDB datasets.

That said, the way you are building the dataset for this repository is correct.

@zj916716524
Author

zj916716524 commented Jan 7, 2022

Thank you very much for your reply. I trained the network on the dataset I built myself, and training failed with this error:
{
File "/home/zeng/for_translation_stroke/lffont-master/datasets/p1dataset.py", line 129, in
content_imgs = torch.cat([self.env_get(self.env, self.content_font, uni, self.transform)for uni in trg_unis]).unsqueeze_(1)
File "train.py", line 141, in
env_get = lambda env, x, y, transform: transform(read_data_from_lmdb(env, f'{x}_{y}')['img'])
TypeError: 'NoneType' object is not subscriptable

}
Source code:
{
content_imgs = torch.cat([self.env_get(self.env, self.content_font, uni, self.transform)for uni in trg_unis]).unsqueeze_(1)
}
I don't know what caused this error. Is it related to insufficient GPU memory?
Thanks a lot for the link. So should I build the dataset using that repository?

@zj916716524
Author

I ran the code from your other paper, MX-Font. Can the dataset preparation method from that work be applied to LF-Font? The network ran successfully when I built the MX-Font dataset, and I got exciting results.

@8uos
Collaborator

8uos commented Jan 7, 2022

File "train.py", line 141, in
env_get = lambda env, x, y, transform: transform(read_data_from_lmdb(env, f'{x}_{y}')['img'])
TypeError: 'NoneType' object is not subscriptable

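This error is usually caused by a character that does not exist in the given font: in that case the lookup for the key f'{x}_{y}' returns None, and indexing it with ['img'] raises the TypeError.
Please check your meta file carefully.
If you ran build_dataset.py, you can find the JSON file containing the dictionary of available font-character pairs at the path given by --json_path.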
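
One way to run that check is sketched below. It is only an illustration: it assumes the JSON file maps each font name to a list of available characters, and the helper name find_missing_pairs and the toy font names are hypothetical. Any pair it reports corresponds to a key of the form f'{font}_{uni}' that would be missing from the LMDB and would trigger the error above. Judging from the traceback, the content font is read for every target character, so it is worth including in the check.

```python
import json


def load_avail(json_path):
    """Load the font -> characters dictionary (assumed layout) from JSON."""
    with open(json_path, encoding="utf-8") as f:
        return json.load(f)


def find_missing_pairs(avail, fonts, unis):
    """Return (font, uni) pairs requested by the meta file but absent in avail."""
    missing = []
    for font in fonts:
        # A font that is missing entirely has no available characters.
        available = set(avail.get(font, []))
        for uni in unis:
            if uni not in available:
                missing.append((font, uni))
    return missing


# Toy data for illustration only; real data would come from load_avail(...).
example_avail = {"content": ["4E00", "4E01"], "fontA": ["4E00"]}
example_missing = find_missing_pairs(
    example_avail, ["content", "fontA", "fontZ"], ["4E00", "4E01"]
)
for font, uni in example_missing:
    print(f"missing key: {font}_{uni}")
```

An empty result means every font and character named in the meta file has an entry in the dictionary.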

@8uos
Collaborator

8uos commented Jan 7, 2022

I ran the code from your other paper, MX-Font. Can the dataset preparation method from that work be applied to LF-Font? The network ran successfully when I built the MX-Font dataset, and I got exciting results.

The repository clovaai/fewshot-font-generation uses dataset code very similar to that of the MX-Font repository, and it also supports LF-Font.
It is much easier to use because it does not require building an LMDB dataset, and its meta file is much simpler.
Please check the repository and its docs/Dataset.md.

@zj916716524
Author

Thank you very much for your quick reply. I will modify my project according to your suggestions. If I run into further problems, I will ask here.

@SanghyukChun
Collaborator

Closing the issue, assuming the answer resolves the problem.
Please re-open the issue as necessary.
