Not able to replicate results #4

Closed

rathodhare opened this issue Mar 19, 2021 · 11 comments

Comments

@rathodhare

Hello, I want to build on your work for further research, so I am using your repository to replicate your results on my dataset. While training, however, I do not get cartoonish images; the output looks nearly identical to the real input image. The problem is that the discriminator is not training properly: it always gives near-zero (i.e. real-like) patch predictions, around 0.3~0.5 mean for cartoon images and ~0.01 for real images, so the generator never gets a useful signal to yield cartoon images.

I have followed the procedure you describe, and also the one in the original CartoonGAN paper: I pretrained the generator to reproduce real images (6k images from the Flickr30k dataset) for 10 epochs, then pretrained the discriminator as a normal classifier for 50 epochs (6k Flickr images, 4.6k anime images from three Hayao Miyazaki movies: From Up on Poppy Hill, Princess Mononoke, and Spirited Away, plus 4.6k corresponding edge-smoothed images); the problem I described above was already present at this stage. After that I trained the full GAN for 50 epochs on the same data.
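For reference, my understanding of the discriminator objective from the CartoonGAN paper is roughly the following. This is a minimal PyTorch sketch, not your code; netD, netG, photo, cartoon, and smoothed are placeholders from my own pretraining script, it assumes netD outputs raw logits, and the 1=cartoon / 0=fake label convention may be flipped in a given implementation:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(netD, netG, photo, cartoon, smoothed):
    # Patch predictions; targets are maps of the same shape as D's output.
    pred_cartoon = netD(cartoon)               # push towards 1 ("cartoon")
    pred_smooth  = netD(smoothed)              # edge-smoothed cartoons -> 0
    pred_fake    = netD(netG(photo).detach())  # generated images -> 0
    ones  = torch.ones_like(pred_cartoon)
    zeros = torch.zeros_like(pred_fake)
    return (F.binary_cross_entropy_with_logits(pred_cartoon, ones)
            + F.binary_cross_entropy_with_logits(pred_smooth, zeros)
            + F.binary_cross_entropy_with_logits(pred_fake, zeros))
```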

Can you suggest what the problem is? I have not made any changes to your code except for writing new code to pretrain the generator and discriminator. Maybe the problem lies in the dataset, or am I doing something else wrong? Please help me out.

It would be great if you could share your dataset with us. Thanks.

@zimonitrome
Collaborator

Hello.

We used around 60k anime images from roughly 10-15 movies. I am afraid we cannot share the dataset, since it is protected by copyright.

The problem might lie in the learning rate, which we raised somewhat during training. Try using a lower one than specified.

I could also try to re-train the network in a couple of weeks or so to confirm this issue.

@rathodhare
Author

rathodhare commented Mar 25, 2021

Hey, can you at least share which movies you used to build the dataset, and how frames were extracted from them? Then I can build the same dataset on my own. Please also specify the quality (360p, 720p, and so on).

P.S. I have now made some progress with the training after changing some hyperparameters. Now when I train, it is exactly the other way around: the discriminator loss becomes very small after a few epochs, but the generator is not able to learn (errG just fluctuates around 2) and keeps producing images almost identical to the input.

I have arrived at the conclusion that the batch size may be the problem: I have kept the batch size at 4, since anything above that is not supported by my GPU (at 256x256 image size). Do you support this argument, or do you think there could be another issue?

Finally, I would ask you to share the exact hyperparameters you used for a successful training run, whether a small batch size is an issue, and how you made your dataset (the movie names and how frames were extracted from them). It would be a huge help for my research. Thanks.
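In case it helps others with the same memory limit: one generic workaround I am considering is gradient accumulation, i.e. emulating a larger effective batch by summing gradients over several small batches before each optimizer step. A self-contained sketch (not from this repo; model, loader, etc. are dummy stand-ins, and note that accumulation does not reproduce the batch-norm statistics of a true large batch):

```python
import torch
from torch import nn, optim

# Dummy stand-ins; in practice these would be the generator/discriminator,
# their optimizers, and the image DataLoader yielding micro-batches of 4.
model = nn.Linear(8, 1)
optimizer = optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()
loader = [(torch.randn(4, 8), torch.randn(4, 1)) for _ in range(16)]

accum_steps = 8  # 8 micro-batches of 4 -> effective batch size 32
optimizer.zero_grad()
for i, (x, y) in enumerate(loader):
    loss = criterion(model(x), y) / accum_steps  # scale so gradients average
    loss.backward()                              # gradients accumulate in .grad
    if (i + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```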

@Yash619

Yash619 commented Mar 26, 2021

Hey, can you upload the "trained_netD.pth" file? I am training on my own dataset, but I cannot continue without that file. It would be really helpful if you could share it with me. Thanks.

@zimonitrome
Collaborator

zimonitrome commented Mar 28, 2021

@rathodhare The hyperparameters should not be changed from the ones specified.
The movies were all the Studio Ghibli movies. Resolution shouldn't matter, since all images are resized and cropped to 256x256.
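Roughly, the preprocessing amounts to something like this (a torchvision sketch of "resized and cropped to 256x256"; the exact transform lives in the repo and may differ):

```python
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),      # shorter side -> 256 px, aspect ratio kept
    transforms.RandomCrop(256),  # take a 256x256 patch from the frame
    transforms.ToTensor(),       # HWC uint8 -> CHW float in [0, 1]
])
```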

I am missing the dataset myself at the moment, but I am re-acquiring it and can try training soon to confirm that it still works.

Edit: The movies used are:

  • Nausicaä of the Valley of the Wind
  • Castle in the Sky
  • Grave of the Fireflies
  • My Neighbor Totoro
  • Kiki's Delivery Service
  • Only Yesterday
  • Porco Rosso
  • Ocean Waves
  • Pom Poko
  • Whisper of the Heart
  • Princess Mononoke
  • My Neighbors the Yamadas
  • Spirited Away
  • The Cat Returns
  • Howl's Moving Castle
  • Tales from Earthsea
  • Ponyo
  • Arrietty
  • From Up on Poppy Hill
  • The Wind Rises
  • The Tale of the Princess Kaguya
  • When Marnie Was There

@zimonitrome
Collaborator

I found a few bugs in the code, but so far nothing seems to prevent training from achieving the results presented in the paper.

@rathodhare did you pre-train the generator at all? We did not have a file for this before, but one is now available in this branch:

https://github.com/FilipAndersson245/cartoon-gan/tree/replicationfixes

The code is WIP; for now it trains for a static 10 epochs on the real dataset.
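For context, this pre-training phase is the initialization step from the CartoonGAN paper: train the generator alone to reconstruct real photos under a VGG content loss. A rough sketch of the idea (pretrain.py in the branch is the authoritative code; the VGG variant and layer cutoff here are assumptions based on the paper):

```python
import torch.nn.functional as F
from torchvision.models import vgg19

# Frozen VGG19 feature extractor, cut around conv4_4 as in the paper.
vgg = vgg19(pretrained=True).features[:26].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def content_loss(netG, photo):
    # L1 distance between VGG features of the photo and its reconstruction.
    return F.l1_loss(vgg(netG(photo)), vgg(photo))
```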

I am currently training to make sure everything works as expected.

@zimonitrome
Collaborator

Training for ~34 epochs seems successful.

Epoch 0:

[image: 0_0]

Epoch 34:

[image: 24_368]

Trained using pretrain.py and train.py.

@rathodhare See if you can confirm, or I'll close this issue in a few weeks.

@Yash619

Yash619 commented Mar 29, 2021 via email

@zimonitrome
Collaborator

@Yash619 This has been moved to another issue: #5

I am not sure we still have the original trained discriminator weights, but I can at least upload the new one I am currently training. I will upload it tomorrow.

@zimonitrome
Collaborator

Training for 50-80 epochs has generated the following:

[image: 63_441]

Features presented in the paper (mono-color blobs, eyes, etc.) are all present here. We deem it successfully replicated.

@rathodhare
Author

So just to confirm: the current version of the code will replicate the results, with no changes to the code in any way, by pretraining on Flickr30k (the entire dataset?) and then training on 60k anime images from the movies you described together with the same Flickr30k set? Also, how powerful is your GPU? Mine shows CUDA out of memory at batch size 4, so I can't hope to train at batch size 32...

Please confirm the above, and upload the final trained generator and discriminator, and also the pretrained generator if possible. Thanks for all your hard efforts :)

@zimonitrome
Collaborator

Yes, this is correct.

You also need to process your anime images with create_smooth_dataset.ipynb.
It's a very simple notebook that could be turned into a normal .py file if needed.
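For reference, the gist of that notebook is the edge-smoothing step from the CartoonGAN paper: blur the cartoon images around their detected edges, so the discriminator can penalize generator outputs that lack crisp edges. Roughly (an OpenCV sketch; the kernel sizes and thresholds here are guesses, the notebook is authoritative):

```python
import cv2
import numpy as np

def smooth_edges(img_bgr: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)              # detect edges
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.dilate(edges, kernel) > 0           # widen the edge regions
    blurred = cv2.GaussianBlur(img_bgr, (5, 5), 0)
    out = img_bgr.copy()
    out[mask] = blurred[mask]                      # blur only near the edges
    return out
```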

We currently have access to an RTX 3090, hence the enormous batch size. You could experiment with a higher learning rate for quicker results. The default is 1e-4, but I would not go above 1e-3.
