# Chapter 2: Building Strong Prompts

---

**Lesson:**

As we have seen, adding detail to your prompt often adds complexity and quality to the generated image. This is not to say more is *always* better. In general, though, being as specific as possible in your Stable Diffusion prompt leads to outputs that more closely resemble the images you want.

In [None]:
%%capture
# Install dependencies
%pip install --no-build-isolation --force-reinstall \
    "boto3>=1.28.57" \
    "awscli>=1.29.57" \
    "botocore>=1.31.57"

%pip install --quiet "pillow>=9.5,<10"

# Python Built-Ins:
import base64
import io
import json
import os
import sys

# External Dependencies:
import boto3
from PIL import Image

module_path = ".."
sys.path.append(os.path.abspath(module_path))
from utils import bedrock, print_ww


# ---- ⚠️ Un-comment and edit the below lines as needed for your AWS setup ⚠️ ----

# os.environ["AWS_DEFAULT_REGION"] = "<REGION_NAME>"  # E.g. "us-east-1"
# os.environ["AWS_PROFILE"] = "<YOUR_PROFILE>"
# os.environ["BEDROCK_ASSUME_ROLE"] = "<YOUR_ROLE_ARN>"  # E.g. "arn:aws:..."

boto3_bedrock = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None)
)

modelId = "stability.stable-diffusion-xl"

## Main components of a strong prompt

The most important thing to keep in mind is that Stable Diffusion prompts work best as a comma-separated list of descriptions. It is up to you to decide what those descriptions are and which matter most. That said, here is a common template of categories to keep in mind when crafting your prompt:

 - Subject
 - Setting
 - Overall characteristics (e.g., lighting, style, color, quality)

Let's dig into each of these categories in the examples below, building upon our image with each step.
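
The template above can be sketched as a small helper that joins the descriptions from each category into one comma-separated prompt. This is only an illustration: `build_prompt` and its argument names are hypothetical and not part of any Stable Diffusion or Bedrock API.

```python
def build_prompt(subject, setting=(), characteristics=()):
    """Join descriptions from each category into one comma-separated prompt."""
    parts = [*subject, *setting, *characteristics]
    # Drop empty entries and stray whitespace so the prompt has no dangling commas.
    return ", ".join(p.strip() for p in parts if p and p.strip())


# Hypothetical example values, taken from the prompts used later in this chapter.
example_prompt = build_prompt(
    subject=["a human coder", "wearing a tuxedo"],
    setting=["an ornate and oak room"],
    characteristics=["cinematic lighting"],
)
print(example_prompt)
```

Keeping the categories separate makes it easy to change one of them, for example the setting, while leaving the rest of the prompt untouched.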


## Examples:

**Example 2.1 - Subject**

The subject is the focus of the image. It can be a person, animal, landscape, inanimate object - anything / anywhere your mind can take you.

At its most basic, we could simply input the following:

In [None]:
prompt = "a human"

Stable Diffusion draws on its vast corpus of human and human-related images to generate its own image. However, without further descriptive elements in the prompt, this image can take many forms, particularly if the cfg_scale is set low and the seed is left at its default.

In [None]:
request = json.dumps({
    "text_prompts": (
        [{"text": prompt}]
    )
})

response = boto3_bedrock.invoke_model(body=request, modelId=modelId)
response_body = json.loads(response.get("body").read())

print(response_body["result"])
base_64_img_str = response_body["artifacts"][0].get("base64")
print(f"{base_64_img_str[0:80]}...")

os.makedirs("data", exist_ok=True)
eg2_1a = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, "utf-8"))))
eg2_1a.save("data/eg2_1a.png")
eg2_1a
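
The decoding step in the cell above is repeated for every image in this chapter, so it may help to see it in isolation. The sketch below assumes only the response keys already used above (`result`, `artifacts`, `base64`); the helper name `extract_image_bytes` and the stand-in response are hypothetical, and the stand-in payload is not a real image.

```python
import base64


def extract_image_bytes(response_body):
    """Return the decoded bytes of the first image in a response body."""
    artifacts = response_body.get("artifacts") or []
    if not artifacts:
        # Surface the reported result so a failed generation is easier to diagnose.
        raise ValueError(
            f"no artifacts in response (result={response_body.get('result')!r})"
        )
    return base64.b64decode(artifacts[0]["base64"])


# Stand-in response with the same keys as above, so the sketch runs offline.
# The "result" value and the payload are placeholders, not real model output.
fake_body = {
    "result": "success",
    "artifacts": [{"base64": base64.b64encode(b"not a real image").decode("utf-8")}],
}
image_bytes = extract_image_bytes(fake_body)
print(len(image_bytes))
```

With a real response, passing the returned bytes to `Image.open(io.BytesIO(image_bytes))` should give the same PIL image as the cell above.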

To better control the output, we can add keywords that describe our subject more specifically. Who is this human? What are they doing? What do they look like? What are some small nuances in their clothing, hair color, eyes, or posture? There is a world of possibility in what our subject could look like, and we have to dictate this clearly to Stable Diffusion.

Let's continue by adding some more flavor to our subject. Feel free to uncomment and experiment further with the parameters.

In [None]:
prompt = "a human coder, wearing a tuxedo, typing furiously, determined look on the face"

# cfg_scale = 10
# seed = 12345
# steps = 40
# style_preset = "photographic"  # (e.g. photographic, digital-art, cinematic, ...)
# clip_guidance_preset = "FAST_GREEN" # (e.g. FAST_BLUE FAST_GREEN NONE SIMPLE SLOW SLOWER SLOWEST)
# sampler = "K_DPMPP_2S_ANCESTRAL" # (e.g. DDIM, DDPM, K_DPMPP_SDE, K_DPMPP_2M, K_DPMPP_2S_ANCESTRAL, K_DPM_2, K_DPM_2_ANCESTRAL, K_EULER, K_EULER_ANCESTRAL, K_HEUN, K_LMS)
# width = 768

In [None]:
request = json.dumps({
    "text_prompts": (
        [{"text": prompt}]
    ),
#    "cfg_scale": cfg_scale,
#    "seed": seed,
#    "steps": steps,
#    "style_preset": style_preset,
#    "clip_guidance_preset": clip_guidance_preset,
#    "sampler": sampler,
#    "width": width
})

response = boto3_bedrock.invoke_model(body=request, modelId=modelId)
response_body = json.loads(response.get("body").read())

print(response_body["result"])
base_64_img_str = response_body["artifacts"][0].get("base64")
print(f"{base_64_img_str[0:80]}...")

os.makedirs("data", exist_ok=True)
eg2_1b = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, "utf-8"))))
eg2_1b.save("data/eg2_1b.png")
eg2_1b
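
Instead of commenting parameters in and out of the request, one option is a helper that includes only the parameters that were actually set. This is a minimal sketch: `build_request` is a hypothetical name, and the parameter names are the ones used in the cells above.

```python
import json


def build_request(prompt, **params):
    """Serialize a request body, leaving out any parameter set to None."""
    body = {"text_prompts": [{"text": prompt}]}
    # Only forward parameters the caller actually provided.
    body.update({key: value for key, value in params.items() if value is not None})
    return json.dumps(body)


example_request = build_request(
    "a human coder, wearing a tuxedo",
    cfg_scale=10,
    seed=12345,
    steps=None,  # left unset, so it is omitted from the request
)
print(example_request)
```

The resulting string can be passed as `body` to `boto3_bedrock.invoke_model` in the same way as the `request` variable above.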

**Example 2.2 - Setting**

Closely tied to the subject is the setting. The setting is the environment surrounding the subject, which may include where and when the subject finds itself (such as a quiet village in the mountains during the American Revolution or a monk soldier dancing in a temple during the Song dynasty).

In [None]:
prompt = "a human coder, wearing a tuxedo, typing furiously, determined look on the face, window overlooking a 1950s English town, an ornate and oak room"

# cfg_scale = 10
# seed = 12345
# steps = 40
# style_preset = "photographic"  # (e.g. photographic, digital-art, cinematic, ...)
# clip_guidance_preset = "FAST_GREEN" # (e.g. FAST_BLUE FAST_GREEN NONE SIMPLE SLOW SLOWER SLOWEST)
# sampler = "K_DPMPP_2S_ANCESTRAL" # (e.g. DDIM, DDPM, K_DPMPP_SDE, K_DPMPP_2M, K_DPMPP_2S_ANCESTRAL, K_DPM_2, K_DPM_2_ANCESTRAL, K_EULER, K_EULER_ANCESTRAL, K_HEUN, K_LMS)
# width = 768

In [None]:
request = json.dumps({
    "text_prompts": (
        [{"text": prompt}]
    ),
#    "cfg_scale": cfg_scale,
#    "seed": seed,
#    "steps": steps,
#    "style_preset": style_preset,
#    "clip_guidance_preset": clip_guidance_preset,
#    "sampler": sampler,
#    "width": width
})

response = boto3_bedrock.invoke_model(body=request, modelId=modelId)
response_body = json.loads(response.get("body").read())

print(response_body["result"])
base_64_img_str = response_body["artifacts"][0].get("base64")
print(f"{base_64_img_str[0:80]}...")

os.makedirs("data", exist_ok=True)
eg2_2 = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, "utf-8"))))
eg2_2.save("data/eg2_2.png")
eg2_2

**Example 2.3 - Overall Characteristics**

This is where things get interesting. Here, we can take the subject and setting we described above and completely change the style of the image if we so desire. We can do the following:

 - *style*: paint an image in the manner of Claude Monet
 - *quality*: take a grainy or super high-resolution photograph
 - *medium*: sketch a cartoon on a pad of paper or generate a digital image
 - *color*: draw a vibrant image of a rainbow kaleidoscope
 - *lighting*: encapsulate a warm family dinner with cinematic lighting
 
 
Let's implement some of the above characteristics in our prompt.

In [None]:
prompt = "a human coder, wearing a tuxedo, typing furiously, determined look on the face, window overlooking a 1950s English town, an ornate and oak room, cinematic lighting, grainy quality"

# cfg_scale = 10
# seed = 12345
# steps = 40
# style_preset = "photographic"  # (e.g. photographic, digital-art, cinematic, ...)
# clip_guidance_preset = "FAST_GREEN" # (e.g. FAST_BLUE FAST_GREEN NONE SIMPLE SLOW SLOWER SLOWEST)
# sampler = "K_DPMPP_2S_ANCESTRAL" # (e.g. DDIM, DDPM, K_DPMPP_SDE, K_DPMPP_2M, K_DPMPP_2S_ANCESTRAL, K_DPM_2, K_DPM_2_ANCESTRAL, K_EULER, K_EULER_ANCESTRAL, K_HEUN, K_LMS)
# width = 768

In [None]:
request = json.dumps({
    "text_prompts": (
        [{"text": prompt}]
    ),
#    "cfg_scale": cfg_scale,
#    "seed": seed,
#    "steps": steps,
#    "style_preset": style_preset,
#    "clip_guidance_preset": clip_guidance_preset,
#    "sampler": sampler,
#    "width": width
})

response = boto3_bedrock.invoke_model(body=request, modelId=modelId)
response_body = json.loads(response.get("body").read())

print(response_body["result"])
base_64_img_str = response_body["artifacts"][0].get("base64")
print(f"{base_64_img_str[0:80]}...")

os.makedirs("data", exist_ok=True)
eg2_3 = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, "utf-8"))))
eg2_3.save("data/eg2_3.png")
eg2_3
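
To compare how the overall characteristics change the result, one approach is to keep the subject and setting fixed and generate a prompt per set of characteristics. The sketch below only builds the prompt strings. The variant names and the "monet" and "sketch" descriptions are illustrative choices, not recommended values.

```python
base_prompt = (
    "a human coder, wearing a tuxedo, typing furiously, determined look on the face, "
    "window overlooking a 1950s English town, an ornate and oak room"
)

# Hypothetical characteristic sets to compare; swap in your own.
variants = {
    "cinematic_grainy": ["cinematic lighting", "grainy quality"],
    "monet": ["in the style of Claude Monet", "vibrant colors"],
    "sketch": ["pencil sketch on paper"],
}

# Append each set of characteristics to the same base description.
variant_prompts = {
    name: ", ".join([base_prompt, *traits]) for name, traits in variants.items()
}

for name, text in variant_prompts.items():
    print(f"{name}: {text}")
```

Sending each of these prompts with the same seed makes it easier to attribute differences between the images to the characteristics rather than to random variation.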

## Exercises:

Let's experiment with some image generation that requires detail in the prompt.

**Exercise 2.1 - Replicating stories**

Using proper formatting, the included parameters, and detailed descriptions, generate an image based on a scene in a book you read or a movie you saw recently.

In [None]:
prompt = "INSERT PROMPT"

# Starting values copied from the examples above; replace them with your own.
cfg_scale = 10
seed = 12345
steps = 40
style_preset = "photographic"  # (e.g. photographic, digital-art, cinematic, ...)
clip_guidance_preset = "FAST_GREEN"  # (e.g. FAST_BLUE FAST_GREEN NONE SIMPLE SLOW SLOWER SLOWEST)
sampler = "K_DPMPP_2S_ANCESTRAL"  # (e.g. DDIM, DDPM, K_DPMPP_SDE, K_DPMPP_2M, K_DPMPP_2S_ANCESTRAL, K_DPM_2, K_DPM_2_ANCESTRAL, K_EULER, K_EULER_ANCESTRAL, K_HEUN, K_LMS)
width = 768

In [None]:
request = json.dumps({
    "text_prompts": (
        [{"text": prompt}]
    ),
    "cfg_scale": cfg_scale,
    "seed": seed,
    "steps": steps,
    "style_preset": style_preset,
    "clip_guidance_preset": clip_guidance_preset,
    "sampler": sampler,
    "width": width
})

response = boto3_bedrock.invoke_model(body=request, modelId=modelId)
response_body = json.loads(response.get("body").read())

print(response_body["result"])
base_64_img_str = response_body["artifacts"][0].get("base64")
print(f"{base_64_img_str[0:80]}...")

os.makedirs("data", exist_ok=True)
ex2_1 = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, "utf-8"))))
ex2_1.save("data/ex2_1.png")
ex2_1

**Exercise 2.2 - Capturing an image**

See if you can recreate an image of any particular scene in your life, be it your office, park, house, etc. Think about the details required to make this come to life.

In [None]:
prompt = "INSERT PROMPT"

# Starting values copied from the examples above; replace them with your own.
cfg_scale = 10
seed = 12345
steps = 40
style_preset = "photographic"  # (e.g. photographic, digital-art, cinematic, ...)
clip_guidance_preset = "FAST_GREEN"  # (e.g. FAST_BLUE FAST_GREEN NONE SIMPLE SLOW SLOWER SLOWEST)
sampler = "K_DPMPP_2S_ANCESTRAL"  # (e.g. DDIM, DDPM, K_DPMPP_SDE, K_DPMPP_2M, K_DPMPP_2S_ANCESTRAL, K_DPM_2, K_DPM_2_ANCESTRAL, K_EULER, K_EULER_ANCESTRAL, K_HEUN, K_LMS)
width = 768

In [None]:
request = json.dumps({
    "text_prompts": (
        [{"text": prompt}]
    ),
    "cfg_scale": cfg_scale,
    "seed": seed,
    "steps": steps,
    "style_preset": style_preset,
    "clip_guidance_preset": clip_guidance_preset,
    "sampler": sampler,
    "width": width,
})

response = boto3_bedrock.invoke_model(body=request, modelId=modelId)
response_body = json.loads(response.get("body").read())

print(response_body["result"])
base_64_img_str = response_body["artifacts"][0].get("base64")
print(f"{base_64_img_str[0:80]}...")

os.makedirs("data", exist_ok=True)
ex2_2 = Image.open(io.BytesIO(base64.decodebytes(bytes(base_64_img_str, "utf-8"))))
ex2_2.save("data/ex2_2.png")
ex2_2