
Building Inference Model #6

Closed
Youngwoo-git opened this issue Aug 13, 2021 · 12 comments

Comments

@Youngwoo-git

Hello, I'm trying to build an inference model to test several of the things shown under "Additional examples" in your README, such as turning photos into a cubism painting style, but it's proving a bit challenging to find the matching pre-trained models (if you've got them).

If you do happen to have them, would you be able to share a link?

Thanks in advance :)

@rinongal (Owner)

You can find most of the models here: https://drive.google.com/drive/folders/1Z76nD8pXIL2O5f6xV8VjM4DUCmhbzn0l

If there's a specific one you're looking for that isn't there, let me know and I'll look for it and add it.
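
For reference, once you've downloaded a checkpoint, inference looks roughly like this. This is a minimal sketch: it assumes Rosinality's `model.py` (with his `Generator` class) is importable, that the weights sit under a `g_ema` key, and the `cubism.pt` file name is just a placeholder for whichever model you grab from the folder.

```python
import torch

from model import Generator  # Rosinality's stylegan2-pytorch generator class

device = "cuda"

# Placeholder file name -- use whichever checkpoint you downloaded.
ckpt = torch.load("cubism.pt", map_location=device)

g_ema = Generator(size=1024, style_dim=512, n_mlp=8).to(device)
g_ema.load_state_dict(ckpt["g_ema"], strict=False)  # "g_ema" key is an assumption
g_ema.eval()

with torch.no_grad():
    mean_latent = g_ema.mean_latent(4096)    # average W, used for truncation
    z = torch.randn(4, 512, device=device)   # 4 random latent codes
    images, _ = g_ema([z], truncation=0.7, truncation_latent=mean_latent)
```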

@Youngwoo-git (Author)

Thanks!

Would it also be possible for you to share the method you used to train the models you provided above? For example, what are the possible "source classes" and "target classes", and how can I train on my own dataset?

@rinongal (Owner) commented Aug 17, 2021

First of all, you can find a few tips here: #5

Other than that, what I usually do is start with a fairly high number of iterations: if I 'overshoot', I can always check which iterations gave good results and just re-train with that number.

Source and target classes - let your imagination run wild. It doesn't really work at the level of detail you see in VQGAN + CLIP, but you can take a look at our paper for some prompt inspiration. Things that work very well are usually photo-to-artistic-or-rendering-style changes, changes to creatures with distinct looks, or changes to specific celebrities / fictional characters. It's also easier to modify GANs that already have a lot of variability to begin with. The LSUN church GAN, for example, really lets you run wild - pretty much everything I tried worked there.

You can even try nonsensical stuff like our Nicolas Cage dogs, 'Dog' -> 'Avocado Dog' or 'Car' -> 'Car covered in lion fur' and get something (though it's often not what you expect).

Training with your own set - we don't use any dataset to train. If what you mean is that you want to modify a GAN that was trained on your own dataset, it depends on what you used to train that GAN - was it StyleGAN2 using the official version / Rosinality's code, or StyleGAN-ADA?
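
For concreteness, a training run with your own prompt pair looks roughly like this. The flag names are as I remember them from our README, so double-check them against `train.py`; the paths and iteration count are placeholders / example values.

```sh
python train.py --frozen_gen_ckpt /path/to/stylegan2-ffhq-config-f.pt \
                --output_dir /path/to/output \
                --size 1024 \
                --batch 2 \
                --iter 301 \
                --source_class "photo" \
                --target_class "cubism painting"
```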

@Youngwoo-git (Author)

The tips you shared were helpful! Thank you :)

About training again: are the pre-trained models from the official StyleGAN2 (from NVIDIA) compatible with this "StyleGAN-NADA" network? I'm planning various experiments using StyleGANv2, and before I begin I just want to make sure I'm on the right track.

Thanks :)

@rinongal (Owner)

We're using Rosinality's PyTorch implementation of StyleGAN2, so you'll need a model compatible with that. That said, there's a script in the repo that converts the .pkl model files from the official TensorFlow implementations of both StyleGAN2 and StyleGAN-ADA to Rosinality's .pt format (and you can see how to use it in the colab).

If you train with the PyTorch version of StyleGAN-ADA, there's a script out there (by justinpinkney and dvschultz) that converts 1024x1024 models, but last I checked it didn't work well for lower resolutions. I don't know if it has been fixed since.

To sum things up, your 'safe' options are:

  1. Train with Rosinality's StyleGAN2 implementation.
  2. Train with the official TensorFlow StyleGAN2 implementation.
  3. Train with the official TensorFlow StyleGAN-ADA implementation.

And if you used option 2 or 3, you can convert the model using the provided script and fine-tune it with our method.
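
As a rough sketch, the conversion step for option 2 looks like this. It follows Rosinality's documented `convert_weight.py` usage; `--repo` has to point at a clone of the official TensorFlow code, since that code is needed to unpickle the .pkl file.

```sh
# A clone of the official TF implementation is needed to load the .pkl
git clone https://github.com/NVlabs/stylegan2 ~/stylegan2

# Writes a Rosinality-format stylegan2-ffhq-config-f.pt next to the .pkl
python convert_weight.py --repo ~/stylegan2 stylegan2-ffhq-config-f.pkl
```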

@Altheim commented Aug 27, 2021

> You can find most of the models here: https://drive.google.com/drive/folders/1Z76nD8pXIL2O5f6xV8VjM4DUCmhbzn0l
>
> If there's a specific one you're looking for that isn't there, let me know and I'll look for it and add it.

I failed to convert the SG2 model. Could you provide afhqdog.pt and afhqcat.pt? Thanks!

@rinongal (Owner)

Did you use the afhq models from the SG-ADA TensorFlow or PyTorch implementation? You need to use the former, from here.

Alternatively, the colab notebook is already set up to download and convert the right afhqdog model, so you can just download the result (and it should be easy to modify the code of step 2 to get the cat model as well).
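
If it helps, the modification to step 2 amounts to swapping the file name in the download-and-convert cell, roughly as below. The CDN URL is how I remember the TensorFlow SG2-ADA release hosting its pretrained models, and whether the converter takes the ADA repo path the same way is an assumption, so verify both.

```sh
# TF SG2-ADA pretrained AFHQ cat model (URL from memory -- verify it)
wget https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada/pretrained/afhqcat.pkl

# Same converter as before, with --repo pointing at the TF SG2-ADA clone
python convert_weight.py --repo ~/stylegan2-ada afhqcat.pkl
```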

@rinongal (Owner)

Did you manage to resolve your issue? Anything I can still help with?

@Altheim commented Sep 13, 2021

Yes, I have solved that issue, but I've run into a new problem: the models obtained from new prompt words lack diversity. For example, the eyes and hair always tend to look the same.

@Youngwoo-git (Author)

Many issues are now resolved :) thank you for the help!

@rinongal (Owner)

> Yes, I have solved that issue, but I've run into a new problem: the models obtained from new prompt words lack diversity. For example, the eyes and hair always tend to look the same.

Unfortunately, that does tend to happen with texts that are linked to specific colors (e.g. names of specific characters/people). It may be possible to avoid this by explicitly freezing network weights that are tied to, e.g., hair. I'll see if I can get around to testing that at some point, but it's fairly involved and will likely take some time (if anyone reading this wants to give it a try and open a PR, feel free to do so 😉).

Other than that, I'm afraid I don't have a 'trivial' solution.
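
If anyone does want to experiment with it, the rough idea would be something like the sketch below. It's untested: the layer names follow Rosinality's Generator module naming, and which layers actually control hair/eye color is exactly the open question, so the indices here are a hypothetical starting point.

```python
from model import Generator  # Rosinality's stylegan2-pytorch generator

generator = Generator(size=1024, style_dim=512, n_mlp=8)

# In Rosinality's model, convs.0 ... convs.N run from coarse to fine;
# the finest layers mostly affect color/texture. These indices are a
# hypothetical choice to experiment with, not a tested recipe.
FROZEN_PREFIXES = ("convs.12", "convs.13", "to_rgbs.6")

for name, param in generator.named_parameters():
    if name.startswith(FROZEN_PREFIXES):
        param.requires_grad = False  # exclude from fine-tuning updates
```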

> Many issues are now resolved :) thank you for the help!

Happy to help! Let me know if you need anything else!

@rinongal (Owner)

@Youngwoo-git There's a follow-up work published on arxiv this week (https://arxiv.org/abs/2110.08398) which might help with the problem you're experiencing. I'm closing this issue for now, but feel free to re-open if you need further help.
