# Stable Diffusion playbook

### Mal Minhas, 02.11.22
v0.2

## Introduction

[Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion) is a state-of-the-art deep learning text-to-image model released in 2022 which can <em>"generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt."</em> Importantly, it differs from its two major competitors, [DALL-E 2](https://openai.com/dall-e-2/) and [Midjourney](https://www.midjourney.com/home/), in its accessibility:
> Stable Diffusion's code and model weights have been released publicly, and it can run on most consumer hardware equipped with a modest GPU. This marked a departure from previous proprietary text-to-image models such as DALL-E and Midjourney which were accessible only via cloud services.

You can sign up for free to get Stable Diffusion [here](). You get 200 free credits for image generation. Beyond that, additional images cost $10 per 1,000. The screenshot below shows the membership control panel, where you can generate and copy an API key for programmatic access. The Stability SDK for Stable Diffusion is available [here](https://github.com/Stability-AI/stability-sdk).

In [1]:
import IPython.display
IPython.display.Image(url="membership.png", width=1200, height=900)
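
To make the pricing concrete, here is a rough cost estimate in Python. It is only a sketch: it assumes one credit buys one image and that paid images are billed pro rata at $10 per 1,000. The function name `estimateCost` and both assumptions are illustrative and are not taken from the service's documentation.

```python
def estimateCost(num_images, free_credits=200, price_per_thousand=10.0):
    """Estimate the cost in dollars of generating num_images images.

    Assumption: one credit buys one image, and images beyond the free
    allowance are billed pro rata at price_per_thousand dollars per 1000.
    """
    billable = max(0, num_images - free_credits)
    return billable * price_per_thousand / 1000

print(estimateCost(150))   # within the free allowance, so nothing is billed
print(estimateCost(1200))  # 1000 billable images
```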

## Configuration

To use the Stability SDK, first set the `STABILITY_HOST` environment variable to the `grpc.stability.ai:443` endpoint and the `STABILITY_KEY` environment variable to your Stable Diffusion API key. To get your API key, visit [https://beta.dreamstudio.ai/membership](https://beta.dreamstudio.ai/membership):

In [2]:
import getpass, os

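def configureEnvVars(key, host='grpc.stability.ai:443'):
    # NB: the host is not prefixed with "https" and has no trailing slash.
    os.environ['STABILITY_HOST'] = host
    os.environ['STABILITY_KEY'] = key

def getAPIKey(file=None):
    # Prefer a key file if one is given and exists, then an already-set
    # environment variable, and only then prompt interactively.
    if file and os.path.exists(file):
        with open(file) as f:
            return f.read().strip()  # drop any trailing newline in the key file
    key = os.environ.get('STABILITY_KEY')
    if key:
        return key
    return getpass.getpass('Enter your API Key: ')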

configureEnvVars(getAPIKey('.stableDiffusionKey'))
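
Because the host must be given without a scheme and without a trailing slash, it can help to clean up a pasted URL before storing it. The helper below is a minimal sketch; `normalizeHost` is a hypothetical name and is not part of the Stability SDK.

```python
def normalizeHost(host):
    """Strip a URL scheme and any trailing slashes from a host string.

    Hypothetical helper, not part of the Stability SDK.
    """
    for scheme in ('https://', 'http://'):
        if host.startswith(scheme):
            host = host[len(scheme):]
    return host.rstrip('/')

print(normalizeHost('https://grpc.stability.ai:443/'))  # grpc.stability.ai:443
```

The cleaned value could then be passed as the `host` argument of `configureEnvVars`.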

Now set up the Stability API client:

In [3]:
from stability_sdk import client

stability_api = client.StabilityInference(
    key=os.environ['STABILITY_KEY'], 
    verbose=True,
    )

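The following functions encapsulate generating an image from a text prompt, generating an image from a seed image plus a prompt, and a wrapper that optionally saves and displays the result: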

In [4]:
import io, warnings
from PIL import Image
from IPython.display import display
import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation

def generateImage(text):
    img = None
    # the object returned is a python generator
    answers = stability_api.generate(
        prompt=text,
        seed=34567, # if provided, specifying a random seed makes results deterministic
        steps=30, # defaults to 50 if not specified
    )
    # iterating over the generator produces the api response
    for resp in answers:
        for artifact in resp.artifacts:
            if artifact.finish_reason == generation.FILTER:
                warnings.warn(
                    "Your request activated the API's safety filters and could not be processed. "
                    "Please modify the prompt and try again.")
            if artifact.type == generation.ARTIFACT_IMAGE:
                img = Image.open(io.BytesIO(artifact.binary))
    return img

def generateImageFromImage(img, text):
    result = None  # stays None if no image comes back, rather than echoing the input image
    answers = stability_api.generate(
        prompt=text,
        init_image=img,
        seed=54321, # if init_image was itself generated by SD, a seed different from the one used to create it may give better results
        start_schedule=0.6, # this controls the "strength" of the prompt relative to the init image
    )
    # iterating over the generator produces the api response
    for resp in answers:
        for artifact in resp.artifacts:
            if artifact.finish_reason == generation.FILTER:
                warnings.warn(
                    "Your request activated the API's safety filters and could not be processed. "
                    "Please modify the prompt and try again.")
            if artifact.type == generation.ARTIFACT_IMAGE:
                result = Image.open(io.BytesIO(artifact.binary))
    return result

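def textToImage(text, img=None, target=None, show=False):
    # With no seed image, generate from the prompt alone; otherwise modify the seed image.
    if img is None:
        img = generateImage(text)  # use the argument, not the notebook-level prompt_text
    else:
        img = generateImageFromImage(img, text)
    if img is not None:
        if target:
            img.save(target)  # Pillow infers the file format from the extension
        if show:
            display(img)
    return img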
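
When saving to a `target` path, the image format follows from the file extension. The sketch below makes that mapping explicit, which can be useful for rejecting an unsupported extension before spending credits on a generation call. The helper name `formatFromTarget` and the small extension table are assumptions for illustration; Pillow supports more formats than are listed here.

```python
import os

# Illustrative subset of file extension -> Pillow format name.
EXTENSION_FORMATS = {
    '.png': 'PNG',
    '.jpg': 'JPEG',
    '.jpeg': 'JPEG',
    '.webp': 'WEBP',
}

def formatFromTarget(target):
    """Return the Pillow format name for a target path (hypothetical helper)."""
    ext = os.path.splitext(target)[1].lower()
    if ext not in EXTENSION_FORMATS:
        raise ValueError(f"Unsupported image extension: {ext!r}")
    return EXTENSION_FORMATS[ext]

print(formatFromTarget('engineers.png'))  # PNG
```

The returned name can be passed as the `format` argument of Pillow's `Image.save`.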

## Prompt Engineering - text to image

Now you can input your prompt:

In [5]:
prompt_text = input("Please enter your prompt ->")

Please enter your prompt -> a well functioning engineering team


And generate your image with it. `textToImage` can display the image directly when called with `show=True`. Here we leave display off and set a target file to store the image in.

In [6]:
img = textToImage(prompt_text, target='engineers.png', show=False)

Now we can display the file we created:

In [7]:
IPython.display.Image(url="engineers.png", width=1200, height=900)

## Prompt Engineering - using a seed image

We can now use the image from the previous stage as input into another generation stage to get modified results.  In this case we are turning our image into a cartoon.

In [8]:
img2 = textToImage(prompt_text + " cartoon", img=img, target='engineers2.png', show=False)

In [9]:
IPython.display.Image(url="engineers2.png", width=1200, height=900)
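
The same pattern extends to several styles at once. The sketch below only builds the prompt and filename pairs, so it runs without an API key. The helper name `styleVariants`, the filename scheme, and the extra "oil painting" style are assumptions for illustration.

```python
def styleVariants(prompt, styles, stem='engineers'):
    """Return (prompt, target filename) pairs, one per style suffix."""
    return [(f"{prompt} {style}", f"{stem}_{style.replace(' ', '_')}.png")
            for style in styles]

pairs = styleVariants('a well functioning engineering team',
                      ['cartoon', 'oil painting'])
for variant_prompt, target in pairs:
    print(variant_prompt, '->', target)
    # With the notebook's helpers loaded, each pair could be passed on as:
    # textToImage(variant_prompt, img=img, target=target)
```

Each call would consume credits, so it is worth checking the generated list before running the generation step.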