#### **MSc Data Science and AI for Creative Industries Thesis Project** 
# AI Concept Art Generator

This notebook lets you try out my concept art generator, a Stable Diffusion model I fine-tuned on anime characters. With this code, you can create rough images to inspire character designs for your project. You can pull this notebook directly or copy it into your own notebook or Google Colab space.

Image generation relies on Hugging Face's [Diffusers](https://github.com/huggingface/diffusers/tree/main/examples/text_to_image) git repository. For easier use, you may want a Hugging Face account so that you can access the models and datasets involved.

Training was carried out on a P5000 GPU using the Paperspace Gradient platform.

## SET UP ENVIRONMENT

Please ensure that all appropriate libraries, repositories and scripts are installed to use the generator.

In [None]:
# Install libraries

%pip install torch
%pip install accelerate

In [None]:
#Install Diffusers git repository

!git clone https://github.com/huggingface/diffusers
%cd diffusers
%pip install .
# Relative path, so this also works outside Paperspace's /notebooks directory
%cd examples/text_to_image
%pip install -r requirements.txt

In [None]:
!accelerate config default
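
Before moving on, it can help to confirm that the installs above succeeded. The cell below is a minimal sketch, not part of the original setup: `missing_packages` and `REQUIRED_PACKAGES` are hypothetical names, and the package list is an assumption based on the imports used later in this notebook.

```python
import importlib.util

# Assumed list of import names this notebook relies on (adjust as needed).
REQUIRED_PACKAGES = ["torch", "accelerate", "diffusers", "huggingface_hub"]


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported as missing, re-run the install cells above before continuing.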

## SET UP LOGIN

As stated earlier, for the best experience with the generator you may want to log in with a Hugging Face account (although this might not be necessary).

In [None]:
# Log in to Hugging Face to access models and saving features
from huggingface_hub import interpreter_login

interpreter_login()
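
If you would rather not type a token interactively, a non-interactive login is one possible alternative. The sketch below assumes you have already stored your access token in the `HF_TOKEN` environment variable; `token_from_env` and `login_from_env` are hypothetical helper names, while `login(token=...)` is the `huggingface_hub` function.

```python
import os


def token_from_env(env=os.environ, var_name="HF_TOKEN"):
    """Return the access token stored in `var_name`, or None if it is unset or blank."""
    token = env.get(var_name, "").strip()
    return token or None


def login_from_env(env=os.environ):
    """Log in with the token from the environment; return False if no token is set."""
    token = token_from_env(env)
    if token is None:
        return False
    # Imported lazily so the helpers can be defined before huggingface_hub is installed.
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling `login_from_env()` returns `False` when no token is set, in which case you can fall back to `interpreter_login()`.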

## USING THE MODEL

With the model, you can generate one image at a time or several at once. Write text prompts as a comma-separated list of phrases, as this makes them easier for the model to interpret. If you would like characters with darker skin tones, add "dark skin" and "tan skin" to your text prompt.

**Please take note of the following**
- This is an experimental generator fine-tuned on a small dataset, so there may be issues with the output (especially with darker-skinned characters). You are advised to alter the text prompts or re-run the model until you obtain images to your liking.

- The output is **not** meant to be high-resolution or realistic. The point of this generator is to produce rough sketches that can inspire your character designs, so you are free to draw the final designs however you want, changing the colour scheme, skin complexion, gender, etc.
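
The prompt format described above (a list of phrases, optionally with "dark skin" and "tan skin" added) can be sketched as a small helper. This is only an illustration: `build_prompt` and its `darker_skin` flag are hypothetical names and are not used elsewhere in this notebook.

```python
def build_prompt(phrases, darker_skin=False):
    """Join phrases into one comma-separated prompt string."""
    # Drop empty entries and stray whitespace around each phrase.
    cleaned = [p.strip() for p in phrases if p.strip()]
    if darker_skin:
        # Add the two recommended phrases, skipping any that are already present.
        for extra in ("dark skin", "tan skin"):
            if extra not in cleaned:
                cleaned.append(extra)
    return ", ".join(cleaned)


example_prompt = build_prompt(
    ["a man in a hunter outfit", "white hair", "red eyes", "punk motif"],
    darker_skin=True,
)
print(example_prompt)
# → a man in a hunter outfit, white hair, red eyes, punk motif, dark skin, tan skin
```

The resulting string can be passed to the pipeline as the `prompt` in the generation cells.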

In [None]:
from huggingface_hub import model_info

# LoRA weights ~3 MB
model_path = "Christabelle/sd_anime_concept_generator"

info = model_info(model_path)
model_base = info.cardData["base_model"]
print(model_base)

In [None]:
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler, UniPCMultistepScheduler

pipe = StableDiffusionPipeline.from_pretrained(model_base, torch_dtype=torch.float16)
# Alternative scheduler:
# pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Using this scheduler is faster
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

Single image generation

In [None]:
from PIL import Image

pipe.unet.load_attn_procs(model_path)
pipe.to("cuda")

# Code adapted from:
# https://www.reddit.com/r/StableDiffusion/comments/wxba44/comment/ilqa7an/?utm_source=share&utm_medium=web2x&context=3
pipe.safety_checker = lambda images, **kwargs: (images, [False] * len(images))

prompt = "a man in a hunter outfit, white hair, dark skin, tan skin, red eyes, punk motif"
negative_prompt = "monochrome,low res,poorly drawn face, mutated body parts, deformed body features, bad anatomy, worst quality, low quality"
image = pipe(prompt, negative_prompt=negative_prompt, num_inference_steps=250).images[0]
image.save("magician.png")
image.show()

Multiple image generation

In [None]:
# Code from: 
# https://colab.research.google.com/github/LambdaLabsML/lambda-diffusers/blob/main/notebooks/pokemon_demo.ipynb#scrollTo=so1GmFN0q_M4

from PIL import Image

def image_grid(imgs, rows, cols):
    assert len(imgs) == rows*cols

    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))
    grid_w, grid_h = grid.size
    
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i%cols*w, i//cols*h))
    return grid

In [None]:
from torch import autocast

prompt = "a man or woman in a school uniform, moon motif, magician, blue hair, mint hair, holding a staff"
scale = 7.5
n_samples = 4

pipe.safety_checker = lambda images, **kwargs: (images, [False] * len(images))

negative_prompt = "monochrome,low res,poorly drawn face, mutated body parts, deformed body features, bad anatomy, worst quality, low quality"

with autocast("cuda"):
    images = pipe(
        n_samples * [prompt],
        negative_prompt=n_samples * [negative_prompt],
        guidance_scale=scale,
        num_inference_steps=250,
    ).images

grid = image_grid(images, rows=2, cols=2)
grid
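
The grid above hard-codes `rows=2, cols=2`, which only matches `n_samples = 4`, and `image_grid` asserts that `rows * cols` equals the number of images. If you change `n_samples`, one option is to derive the layout from it. The helper below is a sketch with a hypothetical name, `grid_shape`; it picks the most square layout whose cell count is exactly the number of images.

```python
import math


def grid_shape(n_images):
    """Return (rows, cols) with rows * cols == n_images, as close to square as possible."""
    if n_images < 1:
        raise ValueError("n_images must be at least 1")
    rows = math.isqrt(n_images)
    # Step down until rows divides n_images exactly; rows == 1 always does.
    while n_images % rows != 0:
        rows -= 1
    return rows, n_images // rows


print(grid_shape(4))  # → (2, 2)
print(grid_shape(6))  # → (2, 3)
```

With this in place, `rows, cols = grid_shape(n_samples)` followed by `image_grid(images, rows=rows, cols=cols)` would work for any sample count, although prime counts fall back to a single row.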