# Robot Face Generator Tool # 
Written by: Jasper Bosschart
## Context ##

This tool generates robot faces with an image generation AI model. <br>
The tool is built around the HuggingFace🤗 [Diffusers](https://huggingface.co/docs/diffusers/index) library, which provides access to a wide variety of AI models hosted on the HuggingFace🤗 [website](https://huggingface.co/). The tool uses the image generation model ["stable-diffusion-v1-5"](https://huggingface.co/runwayml/stable-diffusion-v1-5) by runwayml. <br>
The tool is still a work in progress and is subject to change. It is currently built in a Jupyter Notebook using Python, which allows it to be run on a web server. This is necessary because the tool requires large amounts of computational power, more than a simple laptop can provide. Future development might enable a standalone application.
## Sources ##
For this project the following sources have been used:
- https://huggingface.co/blog/stable_diffusion#how-does-stable-diffusion-work
- https://huggingface.co/runwayml/stable-diffusion-v1-5
- https://huggingface.co/docs/diffusers/using-diffusers/write_own_pipeline
- https://huggingface.co/docs/diffusers/v0.16.0/en/optimization/fp16
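
The fp16 optimization guide in the list above is the reason the pipeline is later loaded with `torch_dtype=torch.float16`. As a rough illustration of why half precision helps, the sketch below estimates the memory taken by the latent tensor that Stable Diffusion v1 denoises. It assumes the v1 layout of 4 latent channels and 8x spatial downscaling; the helper names `latent_shape` and `tensor_bytes` are hypothetical, and the estimate covers only the latents, not the model weights or activations.

```python
# Bytes used by one element of each floating-point format.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2}


def latent_shape(batch_size, height=512, width=512):
    """Shape of the latent tensor for a batch of images.

    Assumption: Stable Diffusion v1 layout (4 channels, 8x downscaling).
    """
    return (batch_size, 4, height // 8, width // 8)


def tensor_bytes(shape, dtype):
    """Memory needed to store a dense tensor of the given shape and dtype."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_ELEMENT[dtype]


shape = latent_shape(batch_size=2)
fp32_bytes = tensor_bytes(shape, "float32")
fp16_bytes = tensor_bytes(shape, "float16")
print(shape, fp32_bytes, fp16_bytes)
```

The same factor of two applies to every tensor stored in half precision, which is why fp16 is a common first step when GPU memory is tight.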

In [1]:
!pip install xformers==0.0.16
!pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
!pip install diffusers 
!pip install accelerate 
!pip install transformers

Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu116
Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable


In [2]:
import torch
import accelerate 
import transformers
from diffusers import StableDiffusionPipeline

2023-05-31 19:00:57.890224: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-31 19:00:58.089736: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-05-31 19:00:59.044201: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-05-31 19:00:59.044306

In [None]:
torch.cuda.empty_cache()

In [3]:
from PIL import Image

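def image_grid(imgs, rows, cols):
    """Paste equally sized PIL images into a rows x cols grid, filling row by row."""
    assert len(imgs) == rows * cols, "rows * cols must equal the number of images"

    # All tiles are assumed to share the size of the first image.
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols * w, rows * h))

    for i, img in enumerate(imgs):
        # Column index is i % cols, row index is i // cols.
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid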
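
To make the indexing in `image_grid` easier to follow, the sketch below computes only the paste coordinates, without touching any images. The helper name `paste_positions` is hypothetical; it mirrors the `i % cols` and `i // cols` arithmetic used above.

```python
def paste_positions(count, cols, tile_w, tile_h):
    """Top-left pixel coordinates for `count` tiles laid out row by row."""
    return [
        ((i % cols) * tile_w, (i // cols) * tile_h)
        for i in range(count)
    ]


# Four 512x512 tiles arranged in two columns.
positions = paste_positions(4, 2, 512, 512)
print(positions)
```

The first two tiles share the top row and the next two start a new row 512 pixels lower.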

In [4]:
# torch.backends.cuda.matmul.allow_tf32 = False
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe.enable_sequential_cpu_offload()
#generator = torch.Generator("cuda").manual_seed(1024)

Fetching 19 files:   0%|          | 0/19 [00:00<?, ?it/s]

`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.


In [5]:
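# Note: enable_sequential_cpu_offload() already moves submodules to the GPU on demand.
# Moving the whole pipeline to CUDA as well undoes the memory savings, and newer
# Diffusers releases warn or raise an error here. Use offloading or this line, not both.
pipe = pipe.to("cuda")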
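
The cells above assume that a CUDA GPU is available. A minimal sketch of a guard that picks the device and a matching precision is shown below. The helper names `pick_device` and `pick_dtype_name` are hypothetical, and the import is wrapped so that the sketch also runs on a machine without PyTorch.

```python
try:
    import torch  # optional here so the sketch also runs without PyTorch
except ImportError:
    torch = None


def pick_device():
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu"."""
    if torch is not None and torch.cuda.is_available():
        return "cuda"
    return "cpu"


def pick_dtype_name(device):
    """Half precision on the GPU, full precision on the CPU."""
    return "float16" if device == "cuda" else "float32"


device = pick_device()
print(device, pick_dtype_name(device))
```

In the notebook, the returned name could be mapped to `torch.float16` or `torch.float32` before calling `from_pretrained`, since half precision is mainly intended for GPU execution.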

In [None]:
prompt = "robot portrait, intricate, elegant, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, 8k"
neg_prompt = "human features"


In [None]:
prompt = "robot portrait, realistic"
neg_prompt = "painting"
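
Both prompt cells follow the same pattern: a subject followed by comma-separated style keywords. The sketch below builds such a string from a list, which can make it easier to try different keyword combinations. The helper name `build_prompt` is hypothetical and not part of Diffusers.

```python
def build_prompt(subject, modifiers=()):
    """Join a subject and optional style keywords into one prompt string."""
    parts = [subject.strip()]
    # Skip empty entries so stray commas do not end up in the prompt.
    parts += [m.strip() for m in modifiers if m.strip()]
    return ", ".join(parts)


example_prompt = build_prompt(
    "robot portrait",
    ["intricate", "elegant", "highly detailed"],
)
example_neg_prompt = build_prompt("human features")
print(example_prompt)
print(example_neg_prompt)
```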


In [None]:
!nvidia-smi

In [None]:
!nvcc --version

In [None]:
!nvcc --run /path/to/sample/cuda/cublas

In [None]:
num_images = 2
Multi_prompt = [prompt] * num_images
Multi_prompt_N = [neg_prompt] * num_images

images = pipe(prompt=Multi_prompt, negative_prompt=Multi_prompt_N).images

grid = image_grid(images, rows=1, cols=num_images)
display(grid)
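
Because `image_grid` requires `rows * cols` to equal the number of images exactly, the grid shape has to change whenever `num_images` changes. The sketch below derives a valid shape from the image count. The helper name `grid_shape` and the default limit of four columns are assumptions made for illustration.

```python
def grid_shape(count, max_cols=4):
    """Return (rows, cols) with rows * cols == count and cols <= max_cols."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # Try the widest layout first; cols == 1 always divides count.
    for cols in range(min(count, max_cols), 0, -1):
        if count % cols == 0:
            return count // cols, cols


print(grid_shape(2))
print(grid_shape(6))
```

The result could then be passed on as `rows, cols = grid_shape(len(images))` followed by `image_grid(images, rows, cols)`.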

In [None]:
grid.save("robot1.png")

In [None]:
images = pipe(prompt=prompt, negative_prompt=neg_prompt, num_images_per_prompt=2).images

grid = image_grid(images, rows=1, cols=2)
display(grid)
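
Each run of the cell above produces different images because no seed is fixed. The commented-out `torch.Generator("cuda").manual_seed(1024)` line earlier in the notebook hints at seeding. A minimal sketch of deriving one seed per image is shown below; the helper name `batch_seeds` is hypothetical.

```python
def batch_seeds(base_seed, count):
    """One distinct, reproducible seed per image in a batch."""
    return [base_seed + i for i in range(count)]


seeds = batch_seeds(1024, 2)
print(seeds)
```

Assuming a CUDA device, each seed could be turned into `torch.Generator("cuda").manual_seed(seed)` and the resulting list passed to the pipeline through its `generator` argument, so that a particular image can be regenerated later from its seed.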

In [None]:
prompt = "portrait photograph of a robot"
image = pipe(prompt).images[0]
#image.show("image.png")
#display(image)

In [None]:
display(grid)

In [None]:
display(images[1])  # images holds two results here, so valid indices are 0 and 1

In [None]:
image.save("can1.png")
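
The save cells use fixed file names such as `robot1.png` and `can1.png`, so saving again overwrites the previous result. The sketch below picks the next unused numbered name instead. The helper name `next_output_path` and the `robot` prefix are assumptions made for illustration.

```python
from pathlib import Path


def next_output_path(directory, prefix="robot", suffix=".png"):
    """Return the first path of the form <prefix><n><suffix> that does not exist yet."""
    directory = Path(directory)
    index = 1
    while (directory / f"{prefix}{index}{suffix}").exists():
        index += 1
    return directory / f"{prefix}{index}{suffix}"


# Only computes a path; nothing is written to disk here.
print(next_output_path("."))
```

In the notebook, the returned path could be passed straight to `image.save(...)` or `grid.save(...)`.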

In [None]:
@InProceedings{Rombach_2022_CVPR,
    author    = {Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bj\"orn},
    title     = {High-Resolution Image Synthesis With Latent Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {10684-10695}
}