
Generated images are completely black?! 😵 What am I doing wrong? #15

Closed
illtellyoulater opened this issue Mar 8, 2022 · 10 comments

Comments

@illtellyoulater

Hello,
I am on Windows 10, and my GPU is a PNY NVIDIA GTX 1660 Ti (6 GB).
I installed V-Diffusion like so:

  • conda create --name v-diffusion python=3.8
  • conda activate v-diffusion
  • conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch (as per the PyTorch website instructions)
  • pip install requests tqdm

The problem is that when I run cfg_sample.py or clip_sample.py, the generated images are completely black, even though the inference process seems to run fine and without errors.
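
For reference, I'm invoking the scripts with a text prompt as the only argument, along the lines of the README examples (the prompt here is just a placeholder):

```
python cfg_sample.py "a landscape painting"
python clip_sample.py "a landscape painting"
```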

Things I've tried:

  • installing a previous PyTorch version with conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
  • removing V-Diffusion conda environment completely and recreating it anew
  • uninstalling the NVIDIA drivers and performing a clean driver install (I tried both the NVIDIA Studio and the Game Ready drivers)
  • uninstalling and reinstalling Conda completely

But nothing helped... and at this point I don't know what else to try...

The only interesting piece of information I could gather is that this problem also happens with another text-to-image project called Big Sleep, where, just like with V-Diffusion, the inference process appears to run correctly but the generated images are all black.

I think there must be some simple detail I'm overlooking... which is making me go insane... 😵
Please let me know if you think you can help!
Thanks!

@crowsonkb
Owner

I'm not entirely sure what's going on, but if you are getting an all-black image, it is probably because at some point during the inference process a value becomes NaN (due to floating-point overflow or some other problem); the NaN then propagates to the entire image, which ends up displayed as black. This is probably happening with Big Sleep too, but I don't know why there either...
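
If you want to confirm that, a quick ad-hoc check (not something that's in the repo, just a sketch) is to inspect the output tensor in cfg_sample.py right before it gets converted to an image:

```python
import torch

def report_nonfinite(x: torch.Tensor, label: str = "sample") -> None:
    """Print how many NaN/Inf entries a tensor contains."""
    nans = torch.isnan(x).sum().item()
    infs = torch.isinf(x).sum().item()
    print(f"{label}: {nans} NaNs, {infs} Infs out of {x.numel()} values")

# e.g. call report_nonfinite(out) on the sampled tensor just before it is
# clamped and saved as a PIL image; nonzero counts confirm the NaN theory.
```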

@illtellyoulater
Author

illtellyoulater commented Mar 8, 2022

@crowsonkb you nailed it :) This is exactly what's happening, at least for Big Sleep, where I reported the image containing NaN values in more detail: please see lucidrains/big-sleep#129 (comment). I really hope you can provide some more insight!
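
For anyone who wants to double-check on the image side, a throwaway snippet like this works (the filename is just an example of one of the saved outputs):

```python
from PIL import Image
import numpy as np

img = np.asarray(Image.open("out_0.png"))  # one of the all-black outputs
print(img.min(), img.max(), img.mean())    # all zeros for the black images
```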

I'm very inexperienced with ML and the related libraries, so I cannot fully debug this on my own. But I need to get to the bottom of it, because I'm a little concerned these problems could be caused by the GPU I recently got (a PNY NVIDIA GTX 1660 Ti 6 GB); if it turns out to be faulty I will have to return it... I know it seems unlikely, but why am I the only one facing this issue? 😔

Just when I was so excited that my new purchase would finally let me run bigger models, now I have to deal with this weird, mysterious problem...

@illtellyoulater
Author

illtellyoulater commented Mar 8, 2022

Hey @crowsonkb, hold on, look at this!
Another ML project is joining the "black images" party...

In fact, I've just found out that in my case glide-text2im is also generating black images!!! 😮

Now this is starting to get a little weird, isn't it?

@crowsonkb
Owner

That is really weird! If you run cc12m_1_cfg in CPU mode (use cfg_sample.py with --device cpu and set --steps 25 or something so it goes faster), you get an image output that isn't black, right?
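
That is, something along these lines (the prompt is just an example):

```
python cfg_sample.py "a watercolor landscape" --device cpu --steps 25
```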

@illtellyoulater
Author

Yes, right! Not black at all! I get a very colorful image using the CPU... This is so puzzling and frustrating at the same time...

@crowsonkb
Owner

I would really suspect the GPU might be bad at this point tbh.

@illtellyoulater
Author

illtellyoulater commented Mar 19, 2022

Just a quick update. I think we can rule out a faulty GPU, as I finally got Big Sleep and a couple of other projects working. In those cases the solution was installing the latest version of torch with pip, like so:
pip3 install torch==1.11.0+cu115 torchvision==0.12.0+cu115 -f https://download.pytorch.org/whl/torch_stable.html

However, this did not work for either v-diffusion or glide-text2im, which continue to generate NaN values.
All I can say for now is I suspect it has to do with the clip library...

@illtellyoulater
Author

illtellyoulater commented Mar 21, 2022

OK, apparently, for some reason I'm only having problems with projects involving OpenAI models.
Those models are the ones generating NaN values, while models from other projects seem to work just fine (provided I'm using the torch==1.11.0+cu115 package).

What could be preventing the OpenAI models in particular from working as expected?
Is it possible that some models are built or trained without taking compatibility with some recent GPUs into account?
Please let me know anything you suspect might be remotely useful! 🤔
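
One hypothesis I'd like to test (I may well be wrong about this): openai/CLIP keeps its weights in float16 when the model is loaded on CUDA, and half precision is exactly where overflow-to-NaN problems tend to show up. If that's the cause, casting the model back to full precision should make the NaNs disappear, something like:

```python
import clip
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model = model.float()  # cast the fp16 CUDA weights back to fp32 to rule out half-precision NaNs
```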

@woctezuma

woctezuma commented Mar 21, 2022

Provide links to:

  • recent projects which work,
  • OpenAI projects which do not work.

Then compare the requirements.
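
For example:

```
# in the environment that works:
pip freeze > working.txt
# in the environment that fails:
pip freeze > failing.txt
diff working.txt failing.txt
```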

@illtellyoulater
Author

illtellyoulater commented Mar 29, 2022

I fixed it by installing (with pip) a version of torch built against CUDA Toolkit v10.2, like this:

pip install torch==1.10.1+cu102 torchvision==0.11.2+cu102 -f https://download.pytorch.org/whl/torch_stable.html

With any CUDA Toolkit version higher than that it would not work... I'm not sure whether this strict requirement comes from my system's particular configuration, from this project, or from the CLIP project.

Anyway, using v10.2 was enough to make cfg_sample.py work, but not to make clip_sample.py work.

To fix that as well, I had to make the following change to clip.py, at line 118:

https://github.com/openai/CLIP/blob/main/clip/clip.py#L118-L123

where, before the if condition, I had to add:

name = "ViT-B/32"

in order to force this model.
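
The patched section then looks roughly like this (the surrounding lines are copied from openai/CLIP at the time of writing; only the first assignment is my addition):

```python
# clip/clip.py, inside load(), just before the model-name check:
name = "ViT-B/32"  # my addition: hard-code the smallest model so it fits in 6 GB of VRAM
if name in _MODELS:
    model_path = _download(_MODELS[name], download_root or os.path.expanduser("~/.cache/clip"))
elif os.path.isfile(name):
    model_path = name
else:
    raise RuntimeError(f"Model {name} not found; available models = {available_models()}")
```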
Otherwise, with other models, I would get one of:

  • a CUDA out of memory error,
  • a CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling 'cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)',
  • or a RuntimeError: mat1 dim 1 must match mat2 dim 0.

That's it!
Maybe it doesn't work 100%, but it works, and I'm already having fun with it! :)
Thank you for working on this!
