# Pokémon text to image

This notebook demonstrates inference with Stable Diffusion fine-tuned on Pokémon to generate new Pokémon from text prompts. The model has been ported to Hugging Face Diffusers for easier inference.

For more details about the fine-tuning process and how to make your own specialised model, see [this guide](https://github.com/LambdaLabsML/examples/tree/main/stable-diffusion-finetuning).

Also see the following links for more info:

- [Lambda Diffusers](https://github.com/LambdaLabsML/lambda-diffusers)
- [Captioned Pokémon dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions)
- [Model weights in Diffusers format](https://huggingface.co/lambdalabs/sd-pokemon-diffusers)
- [Original model weights](https://huggingface.co/justinpinkney/pokemon-stable-diffusion)
- [Training code](https://github.com/justinpinkney/stable-diffusion)

Created by Justin Pinkney at [Lambda Labs](https://lambdalabs.com/)

In [57]:
!nvidia-smi

Thu Feb  8 10:48:47 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA RTX A6000               On  | 00000000:01:00.0 Off |                  Off |
| 30%   32C    P8              27W / 300W |  44635MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A6000               On  | 00000000:25:00.0 Off |  

In [2]:
!pip install diffusers==0.3.0
!pip install transformers scipy ftfy

Collecting diffusers==0.3.0
  Downloading diffusers-0.3.0-py3-none-any.whl (153 kB)
Collecting importlib-metadata (from diffusers==0.3.0)
  Using cached importlib_metadata-7.0.1-py3-none-any.whl.metadata (4.9 kB)
Collecting filelock (from diffusers==0.3.0)
  Using cached filelock-3.13.1-py3-none-any.whl.metadata (2.8 kB)
Collecting huggingface-hub>=0.8.1 (from diffusers==0.3.0)
  Using cached huggingface_hub-0.20.3-py3-none-any.whl.metadata (12 kB)
Collecting numpy (from diffusers==0.3.0)
  Using cached numpy-1.26.4-cp312-cp312-win_amd64.whl.metadata (61 kB)
Collecting regex!=2019.12.17 (from diffusers==0.3.0)
  Using cached regex-2023.12.25-cp312-cp312-win_amd64.whl.metadata (41 kB)


In [3]:
from PIL import Image

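def image_grid(imgs, rows, cols):
    """Tile equally sized PIL images into a rows x cols grid, filled row by row."""
    assert len(imgs) == rows * cols, "need exactly rows * cols images"

    w, h = imgs[0].size
    grid = Image.new("RGB", size=(cols * w, rows * h))

    for i, img in enumerate(imgs):
        # Column index is i % cols, row index is i // cols.
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid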
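The paste offsets that `image_grid` computes can be checked without loading any images. The sketch below reproduces only the index arithmetic; `grid_offsets` is a helper name introduced here for illustration, and the 512x512 tile size is an assumed example value, not something the notebook states.

```python
def grid_offsets(n_images, rows, cols, w, h):
    """Return the top-left (x, y) paste position for each image, row by row."""
    if n_images != rows * cols:
        raise ValueError("need exactly rows * cols images")
    # Same arithmetic as image_grid: column = i % cols, row = i // cols.
    return [((i % cols) * w, (i // cols) * h) for i in range(n_images)]


# Hypothetical example: four tiles of an assumed 512x512 size in a 2x2 grid.
offsets = grid_offsets(4, rows=2, cols=2, w=512, h=512)
print(offsets)  # [(0, 0), (512, 0), (0, 512), (512, 512)]
```

Images fill the grid left to right, then top to bottom, so the four samples from a single prompt would fit either a 1x4 strip or a 2x2 square.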

In [88]:
import torch
from diffusers import StableDiffusionPipeline
from torch import autocast

pipe = StableDiffusionPipeline.from_pretrained("lambdalabs/sd-pokemon-diffusers", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "Yoda"
scale = 10
n_samples = 4

# The NSFW checker is sometimes confused by the Pokémon images; you can disable
# it here at your own risk.
disable_safety = False

if disable_safety:
    def null_safety(images, **kwargs):
        # Pass the images through unchanged and report no NSFW content.
        return images, False
    pipe.safety_checker = null_safety

with autocast("cuda"):
    images = pipe(n_samples*[prompt], guidance_scale=scale).images

# Save each sample as 000000.png, 000001.png, ...
for idx, im in enumerate(images):
    im.save(f"{idx:06}.png")

Unexpected exception formatting exception. Falling back to standard exception


Traceback (most recent call last):
  File "/home/team03/.local/lib/python3.10/site-packages/diffusers/utils/import_utils.py", line 704, in _get_module
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/team03/.local/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/__init__.py", line 31, in <module>
    from .pipeline_stable_diffusion import StableDiffusionPipeline
  File "/home/team03/.local/lib/python3.10/site-packages/diffusers/pipelines
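
The traceback above is cut off before the final error message, so the root cause is not visible here. Because it originates from an import inside `diffusers`, and the install cell pins `diffusers==0.3.0` while the traceback paths point at a user-level `site-packages` directory for Python 3.10, a reasonable first step is to confirm which package versions the running kernel actually sees. The following is a minimal sketch using only the standard library; `installed_versions` is a helper name introduced here for illustration, and the package list is an assumption about what is relevant.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of packages worth checking for this notebook.
for name, version in installed_versions(
    ["diffusers", "transformers", "torch", "torchvision"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

If the reported `diffusers` version differs from the pinned one, the kernel is probably importing from a different environment than the one `pip` installed into.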

In [89]:
from diffusers import StableDiffusionImageVariationPipeline
from PIL import Image
from torchvision import transforms  # provides the preprocessing steps below

device = "cuda:0"
sd_pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers",
    revision="v2.0",
)
sd_pipe = sd_pipe.to(device)

# Convert to RGB so a PNG with an alpha channel still has three channels.
im = Image.open("bulbasaur.png").convert("RGB")
tform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize(
        (224, 224),
        interpolation=transforms.InterpolationMode.BICUBIC,
        antialias=False,
    ),
    transforms.Normalize(
        [0.48145466, 0.4578275, 0.40821073],
        [0.26862954, 0.26130258, 0.27577711]),
])
# Add a batch dimension: (3, 224, 224) -> (1, 3, 224, 224).
inp = tform(im).to(device).unsqueeze(0)

out = sd_pipe(inp, guidance_scale=3)
out["images"][0].save("result.jpg")


Unexpected exception formatting exception. Falling back to standard exception


Traceback (most recent call last):
  File "/home/team03/.local/lib/python3.10/site-packages/diffusers/utils/import_utils.py", line 704, in _get_module
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/team03/.local/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/__init__.py", line 31, in <module>
    from .pipeline_stable_diffusion import StableDiffusionPipeline
  File "/home/team03/.local/lib/python3.10/site-packages/diffusers/pipelines
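
The `transforms.Normalize` step in the image-variation cell above subtracts a per-channel mean and divides by a per-channel standard deviation, after `ToTensor` has scaled pixel values to the range 0 to 1. The sketch below applies the same arithmetic to a single RGB pixel in plain Python; `normalize_pixel`, `MEAN`, and `STD` are names introduced here for illustration, with the constants copied from that cell.

```python
# Per-channel constants copied from the Normalize call in the cell above.
MEAN = [0.48145466, 0.4578275, 0.40821073]
STD = [0.26862954, 0.26130258, 0.27577711]


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(c - m) / s for c, m, s in zip(rgb, mean, std)]


# A pixel equal to the mean maps to zero in every channel.
print(normalize_pixel(MEAN))

# A pixel one standard deviation above the mean maps to roughly one.
print(normalize_pixel([m + s for m, s in zip(MEAN, STD)]))
```

Because the transform is applied per channel, an image with an extra alpha channel would not match the three-element mean and standard deviation lists.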