
The network output is slightly different from our expectations #39

Closed
xa0082249956 opened this issue Nov 16, 2020 · 6 comments


xa0082249956 commented Nov 16, 2020

Hi,
we use about 50k pairs of pictures generated by the Toonify StyleGAN for paired training, with the Toonify model provided in the README as the weights.
After training, the results in the training log all look very good. However, results on other inputs are not as expected.

For example, this image is given by the latent code optimized directly with StyleGAN:
[image]

However, pixel2style2pixel gives this one:
[image]

More training results from pixel2style2pixel:
[images]

@xa0082249956 xa0082249956 changed the title The network output does not match expectations The network output is slightly different from our expectations Nov 16, 2020
yuval-alaluf (Collaborator) commented Nov 16, 2020

Hi @xa0082249956,
Can you please clarify what you mean by "the Toonify model provided in the readme as the weight"?
Are you training on your 50,000 pairs of (real, toon) images starting from the pretrained model we uploaded to the repo?

xa0082249956 (Author) commented

Hi, we are training with:

```
python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=experiment_disney \
--workers=8 \
--batch_size=2 \
--test_batch_size=2 \
--test_workers=8 \
--val_interval=2500 \
--save_interval=5000 \
--encoder_type=GradualStyleEncoder \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=1 \
--w_norm_lambda=0.025 \
--stylegan_weights=/data/pixel2style2pixel/ffhq_cartoon_blended.pt
```

And the data are pairs like (real, toon).
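For the paired setup, we point the source and target roots at different folders. A minimal sketch of what we changed, assuming the repo's configs/paths_config.py and configs/data_configs.py layout (our folder paths below are placeholders):

```python
# configs/paths_config.py -- the paired folders (paths are placeholders)
dataset_paths = {
    'ffhq': '/data/pairs/real',        # source: real faces
    'ffhq_toon': '/data/pairs/toon',   # target: matching toonified faces
    'celeba_test': '/data/pairs/test_real',
}

# configs/data_configs.py -- make 'ffhq_encode' map real -> toon
# instead of the default real -> real
from configs import transforms_config
from configs.paths_config import dataset_paths

DATASETS = {
    'ffhq_encode': {
        'transforms': transforms_config.EncodeTransforms,
        'train_source_root': dataset_paths['ffhq'],
        'train_target_root': dataset_paths['ffhq_toon'],  # changed: targets are the toons
        'test_source_root': dataset_paths['celeba_test'],
        'test_target_root': dataset_paths['celeba_test'],
    },
}
```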

yuval-alaluf (Collaborator) commented

Cool. So a few points:

  1. We train our toonify model using only real images (no paired data), so the loss lambdas we use may not be optimal for your paired training. I would consider trying to decrease w_norm_lambda (see the command sketch after this list).
  2. In any case, I think the results you got are not bad at all (especially the second one). That said, tuning the lambda values may give slightly better results since you have paired data to work with.
     Please note that the results may not be as good as those you get with latent optimization, which takes substantially longer at inference time than our method.
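For example, a variant of your command with a smaller w_norm_lambda (the 0.005 below is just a starting point to experiment with, not a tuned setting):

```
python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=experiment_disney \
--workers=8 \
--batch_size=2 \
--test_batch_size=2 \
--test_workers=8 \
--val_interval=2500 \
--save_interval=5000 \
--encoder_type=GradualStyleEncoder \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=1 \
--w_norm_lambda=0.005 \
--stylegan_weights=/data/pixel2style2pixel/ffhq_cartoon_blended.pt
```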

xa0082249956 (Author) commented

Great! Thanks for your reply. I will try it.

brightmart commented

Two questions regarding "we use about 50k pairs of pictures generated by Toonify Stylegan to do pairs-training":

  1. Where can I find the 'Toonify StyleGAN' model?
  2. How is the paired training done?

yuval-alaluf (Collaborator) commented

Hi @brightmart
The toonify StyleGAN model is provided in the README and can be downloaded here:
https://drive.google.com/file/d/1r3XVCt_WYUKFZFxhNH-xO2dTtF6B5szu/view?usp=sharing

To do paired training, you first need to generate pairs of (real, toon) images. While I don't know exactly how the paired data was generated in this case, you can check out the following Google Colab from Justin Pinkney for an example of how to generate them:
https://colab.research.google.com/drive/1s2XPNMwf6HDhrJ1FMwlW1jl-eQ2-_tlk?usp=sharing#scrollTo=cuMEHnpmI1Mj
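The core idea, as a minimal sketch rather than the exact Colab code: map each latent once, then feed the same w to both the original FFHQ generator and the blended toonify generator, so each output pair is aligned. This assumes the rosinality-style Generator bundled with this repo; the FFHQ checkpoint filename and the 'g_ema' key are assumptions to verify against your files.

```python
import os
import torch
from torchvision.utils import save_image
from models.stylegan2.model import Generator  # rosinality port bundled with pSp

device = 'cuda'

def load_generator(ckpt_path):
    g = Generator(1024, 512, 8).to(device)
    ckpt = torch.load(ckpt_path, map_location=device)
    g.load_state_dict(ckpt['g_ema'])  # 'g_ema' key is an assumption; inspect your checkpoint
    g.eval()
    return g

g_real = load_generator('stylegan2-ffhq-config-f.pt')  # original FFHQ weights (path assumed)
g_toon = load_generator('ffhq_cartoon_blended.pt')     # blended toonify weights from the README

os.makedirs('pairs/real', exist_ok=True)
os.makedirs('pairs/toon', exist_ok=True)

with torch.no_grad():
    mean_w = g_real.mean_latent(4096)
    for i in range(50_000):
        z = torch.randn(1, 512, device=device)
        w = g_real.style(z)  # map once, reuse the same w for both generators
        real, _ = g_real([w], input_is_latent=True,
                         truncation=0.7, truncation_latent=mean_w)
        toon, _ = g_toon([w], input_is_latent=True,
                         truncation=0.7, truncation_latent=mean_w)
        # generator outputs are in [-1, 1]; rescale to [0, 1] before saving
        save_image((real.clamp(-1, 1) + 1) / 2, f'pairs/real/{i:05d}.png')
        save_image((toon.clamp(-1, 1) + 1) / 2, f'pairs/toon/{i:05d}.png')
```

The Colab linked above also covers the model blending and face alignment steps; treat this sketch as only the final pair-dumping stage.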
