Generated images are completely black?! 😵 What am I doing wrong? #129

Closed
illtellyoulater opened this issue Mar 8, 2022 · 7 comments

Comments

@illtellyoulater

illtellyoulater commented Mar 8, 2022

Hello,
I am on Windows 10, and my GPU is a PNY Nvidia GTX 1660 Ti 6 GB.
I installed Big Sleep like so:

  • conda create --name bigsleep python=3.8
  • conda activate bigsleep
  • conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch (as per the PyTorch website instructions)
  • pip install big-sleep

The problem is that when I launch dream --text="a beautiful mind" --num-cutouts=64 --image-size=512 --save_every=10 --seed=12345 the generated images are completely black (although the inference process seems to run nicely and without errors).

Things I've tried:

  • installing an earlier PyTorch version with conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
  • removing the big-sleep conda environment completely and recreating it anew
  • uninstalling nvidia drivers and performing a new clean driver install (I tried both Nvidia Studio drivers and Nvidia Game Ready drivers)
  • uninstalling and reinstalling Conda completely

But nothing helped... and at this point I don't know what else to try...

The only interesting piece of information I could gather is that, for some reason, this problem also happens with another text-to-image project called v-diffusion-pytorch: as with Big Sleep, the inference process appears to run correctly but the generated images are all black.

I think there must be some simple detail I'm overlooking... which is making me go insane... 😵
Please let me know something if you think you can help!
THANKS !

@htoyryla

htoyryla commented Mar 8, 2022

"the generated images are completely black (although the inference process seems to run nicely and without errors)"

Most often this happens because, when processing images with Python, pixel values can be represented either as 0..1 floats or as 0..255 integers. If the generated image is in the 0..1 range but the library used to write it to a file expects 0..255, the result is a black image.

Some libraries are clever enough to adapt to the correct range based on the type (integer or float), but often they are not. It could even be that the Windows implementation of a library behaves differently.
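To make the range mismatch concrete, here is a minimal sketch in plain Python. The helper names naive_to_bytes and scaled_to_bytes are hypothetical and are not part of Big Sleep or torchvision; the sketch only illustrates why 0..1 floats turn black when they are cast to integers without rescaling.

```python
def naive_to_bytes(pixels):
    """Cast floats to int without rescaling: every value in [0, 1) truncates to 0."""
    return [int(p) for p in pixels]


def scaled_to_bytes(pixels):
    """Rescale 0..1 floats to 0..255 integers, clamping out-of-range values."""
    return [max(0, min(255, round(p * 255))) for p in pixels]


# One row of a hypothetical grayscale image stored as 0..1 floats.
row = [0.0, 0.25, 0.75, 0.99]

naive = naive_to_bytes(row)
scaled = scaled_to_bytes(row)

print(naive)   # [0, 0, 0, 0]  -> an all-black row
print(scaled)  # [0, 64, 191, 252]
```

If a saved image is black for this reason, the values in memory are still valid numbers, which is what distinguishes this case from a tensor full of nan.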

Just my 2c worth.

@illtellyoulater
Author

illtellyoulater commented Mar 8, 2022

@htoyryla thanks so much for the input!
Based on your reasoning, the first thing I did was inspect the image variable that is later saved to an image file... and apparently it contains nothing but nan values, like so:

print(image)

tensor([[[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         ...,
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]],

        [[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         ...,
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]],

        [[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         ...,
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]]])
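Rather than reading the printout by eye, the share of nan entries can be counted. The sketch below works on plain nested lists so that it runs without PyTorch; nan_fraction is a hypothetical helper, not something from big_sleep.py. On a real tensor, torch.isnan(image).any() answers the same question directly.

```python
import math


def nan_fraction(values):
    """Return the share of NaN entries in an arbitrarily nested list of floats."""
    total = 0
    nans = 0
    stack = [values]
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            stack.extend(item)  # descend into nested rows/channels
        else:
            total += 1
            nans += math.isnan(item)
    return nans / total if total else 0.0


nan = float("nan")
broken = [[[nan, nan], [nan, nan]]]    # stand-in for the tensor printed above
healthy = [[[0.1, 0.9], [0.5, 0.3]]]   # stand-in for a valid 0..1 image

print(nan_fraction(broken))   # 1.0
print(nan_fraction(healthy))  # 0.0
```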

image is created like so:
image = self.model.model()[best].cpu()
(in big_sleep.py, line 464)

So I wonder if this line could be causing any problem?
I admit I have close to zero knowledge about Torch or other ML libraries... I am actually a total newbie at ML in general, so all I can do is basically try to draw some attention to some potentially problematic code, like I did, but that's all...

I really hope this could be of any help to you or to someone else knowing more than me, who could actually try to understand what's going on...

Another super simple speculation I could make: the image file is saved using the torchvision.utils.save_image(...) function, and I am skeptical that such an important and popular library would suffer from the kind of int/float inconsistency you described in your reply... which would narrow the potentially problematic code down to just big_sleep.py.

But if there's a problem in that file, then how could I be the only one experiencing it given there are many other Windows users?
It's so frustrating... :\

@htoyryla

htoyryla commented Mar 8, 2022

If image is full of nans then it is definitely not the 0..1 vs 0..255 problem, and torchvision for sure works correctly when given a torch tensor. Something runs amok in the model itself. I don't have time to look deeper; it has been a year since I used this project.

But what I'd look at in this case... The process is iterative, so I'd look at the image as it evolves. An iterative process like this can run amok at some point, for instance if the learning rate is too high, and then you may get nans.
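As a toy illustration of how an iterative process can end in nans when the learning rate is too high, here is plain gradient descent on f(x) = x**2. This is not Big Sleep's optimizer; first_nan_step and its parameters are assumptions made up for the sketch.

```python
import math


def first_nan_step(lr, steps=2000, x=1.0):
    """Minimise f(x) = x**2 by gradient descent.

    Returns the index of the first step at which x becomes NaN,
    or None if x stays a valid number for all steps.
    """
    for step in range(steps):
        grad = 2.0 * x          # derivative of x**2
        x = x - lr * grad       # plain gradient descent update
        if math.isnan(x):
            return step
    return None


# lr = 1.5 multiplies x by -2 each step: it overflows to inf,
# and the next update computes inf - inf, which is nan.
print(first_nan_step(lr=1.5))

# lr = 0.1 multiplies x by 0.8 each step and converges towards 0.
print(first_nan_step(lr=0.1))   # None
```

Checking the value at every step, as suggested above, shows whether it is bad from the very first iteration or only after the process has diverged.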

But my perspective is different, being an ai artist working with my own code, so I run into these situations all the time and have to solve them myself.

@illtellyoulater
Author

Hey, hold on, look at this!
Another ML project is joining the "black images" party...

In fact I've just found out that in my case glide-text2im is also generating black images!!! 😮

Now this is starting to get a little weird, isn't it?

@illtellyoulater
Author

illtellyoulater commented Mar 25, 2022

I fixed it!

Apparently the problem was that the CUDA toolkit version I was installing with the conda package (11.3) was not recent enough to match my video drivers.

So I uninstalled the conda packages for torch, torchvision and cudatoolkit, and installed torch and torchvision via pip, taking care to pick a build that bundles a much more recent CUDA toolkit version (11.5):

pip3 install torch==1.11.0+cu115 -f https://download.pytorch.org/whl/torch_stable.html
pip3 install torchvision==0.12.0+cu115 -f https://download.pytorch.org/whl/torch_stable.html
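After a reinstall like this, the CUDA version the installed build was compiled against can be checked from Python. This is a small sketch, not part of the original fix; describe_torch_build is a hypothetical helper, and it falls back gracefully when torch is not installed. The attributes torch.__version__ and torch.version.cuda and the function torch.cuda.is_available() are standard PyTorch.

```python
try:
    import torch
except ImportError:  # allow the snippet to run on machines without PyTorch
    torch = None


def describe_torch_build():
    """Summarise the installed PyTorch build and its CUDA support."""
    if torch is None:
        return {"installed": False}
    return {
        "installed": True,
        "torch": torch.__version__,               # e.g. a string ending in +cu115
        "cuda_build": torch.version.cuda,         # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


info = describe_torch_build()
print(info)
```

With the wheels above, one would expect the reported version string to carry the cu115 tag, though the exact output depends on the machine.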

Now I can happily dream! ;)
Thanks again for your help and to all those who have worked on this!

Btw, the other project which was also generating black images (glide-text2im) did not benefit from this fix, so that was just a coincidence! It has to be caused by something else...

@MrPalais

@illtellyoulater I had to reinstall numpy 1.22.3, but your trick works, thanks a lot :)

@illtellyoulater
Author

@MrPalais glad I could help :)
