# Amazon Bedrock and Stability AI Stable Diffusion 3.5 Demo

In this notebook, we use the AWS SDK for Python (Boto3) to create gaming assets with [Amazon Bedrock](https://aws.amazon.com/bedrock/) and the [Stability AI](https://stability.ai/stable-image) Stable Diffusion 3.5 Large model, with the help of [Anthropic Claude 3](https://www.anthropic.com/claude). Stability AI's models can help organizations address practical gaming needs, from concept art and character design to level creation and marketing campaigns. However, it is essential to approach this technology responsibly and ethically: consider potential biases, respect intellectual property rights, and mitigate the risks of misuse.

The notebook demonstrates how to produce a series of images for a game called “Mystic Realms” using LLM-refined prompts. By combining the ideation capabilities of LLMs with image generation, this workflow helps gaming teams produce tailored, high-quality, early-stage visual assets more efficiently. Gaming professionals who use these models while staying aware of their limitations and ethical considerations can extend what is possible in game design and visual content creation.


Technologies:

- **Amazon Bedrock**: Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API.

- **Stable Diffusion 3.5 Large**: Stable Diffusion 3.5 Large (SD3.5 Large), available through Amazon Bedrock, is Stability AI's most advanced text-to-image model to date. With 8.1 billion parameters, it generates high-quality, 1-megapixel images from text descriptions, which makes it well suited to creating detailed game environments quickly. Its architecture, based on the Multimodal Diffusion Transformer (MMDiT), combines multiple pretrained text encoders for better text understanding and uses QK normalization to improve training stability.

- **Anthropic Claude 3 Model**: Claude 3 is a family of state-of-the-art large language models developed by Anthropic, offering a 200K-token context window.



## Prerequisites


### Python Environment

Create a virtual Python environment and install the required packages.


In [19]:
%%sh
# Install Python requirements
python3 -m pip install -r requirements.txt -Uq


### Authenticate with Your AWS Credentials

Your method of authentication may vary depending on your environment.


In [20]:
# Authenticate with AWS using your credentials

import os

# os.environ["AWS_ACCESS_KEY_ID"] = ""
# os.environ["AWS_SECRET_ACCESS_KEY"] = ""
# os.environ["AWS_SESSION_TOKEN"] = ""

## Define functions
Define the text-to-image, image-to-image, and other utility functions.

In [21]:
import base64
import io
import json
import logging
import boto3
from PIL import Image
import time
from enum import Enum, unique

from botocore.exceptions import ClientError

GENERATED_IMAGES = "./generated_images"


In [22]:
# Amazon Bedrock Model ID used throughout this notebook
# Model IDs: https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html#model-ids-arns
MODEL_ID = "stability.sd3-5-large-v1:0" 


In [23]:
# Create the output directory if it does not already exist
os.makedirs(GENERATED_IMAGES, exist_ok=True)

### Define text to image function

In [24]:
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to generate an image with SD3.5 Large
"""

class ImageError(Exception):
    """
    Custom exception for errors returned by SD3.5 Large
    """

    def __init__(self, message):
        self.message = message


# Set up logging for notebook environment
logger = logging.getLogger(__name__)
if logger.hasHandlers():
    logger.handlers.clear()
handler = logging.StreamHandler()
logger.addHandler(handler)
formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
handler.setFormatter(formatter)
logger.setLevel(logging.INFO)


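def generate_image_from_text(model_id, body):
    """
    Generate an image using SD3.5 Large on demand.
    Args:
        model_id (str): The model ID to use.
        body (str) : The request body to use.
    Returns:
        image_bytes (bytes): The image generated by the model.
    Raises:
        ImageError: If the response reports a finish reason (for example, a filtered request).
    """

    logger.info("Generating image with SD3.5 Large model %s", model_id)

    bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

    response = bedrock.invoke_model(modelId=model_id, body=body)
    response_body = json.loads(response["body"].read())

    # A non-null finish reason means the request was filtered or failed
    finish_reason = (response_body.get("finish_reasons") or [None])[0]
    if finish_reason is not None:
        raise ImageError(f"Image generation error. Finish reason: {finish_reason}")

    # The image is returned as a base64-encoded string
    image_data = base64.b64decode(response_body["images"][0])

    logger.info("Successfully generated image with the SD3.5 Large model %s", model_id)
    return image_data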

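def text_to_image_request(
    model_id,
    positive_prompt,
    # negative_prompt,
    save_image_path=None,
    seed=1664300763
):
    """
    Generate an image from a text prompt and save it to disk.
    Args:
        model_id (str): The model ID to use.
        positive_prompt (str): The positive prompt to use.
        save_image_path (str, optional): Path to save the image to. Defaults to a
            timestamped file in GENERATED_IMAGES.
        seed (int): Seed passed to the model so that a run can be reproduced.
    """

    # Build request body
    body = json.dumps(
        {
            "prompt": positive_prompt,
            "mode": "text-to-image",
            "seed": seed,
        }
    )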

    # Generate and save image
    try:
        image = generate_image_from_text(model_id=model_id, body=body)

        if  save_image_path:
            generated_image_path = save_image_path
        else:
            epoch_time = int(time.time())
            generated_image_path = f"{GENERATED_IMAGES}/image_{epoch_time}.jpg"
        
        logger.info(f"Generated image: {generated_image_path}")
        with open(generated_image_path, "wb") as file:
            file.write(image)

        print(f"The generated image has been saved to {generated_image_path}.")
        
    except ClientError as err:
        message = err.response["Error"]["Message"]
        logger.error("A client error occurred: %s", message)
    except ImageError as err:
        logger.error(err.message)

    else:
        logger.info(f"Finished generating image with SD3.5 Large model {model_id}.")
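The text-to-image request body above is kept small. As a minimal sketch, assuming the SD3.5 Large request format on Bedrock also accepts the optional `seed`, `aspect_ratio`, `negative_prompt`, and `output_format` fields (check the model's parameter reference before relying on them), a helper could build the body and leave out any option that is not set. `build_text_to_image_body` is a hypothetical name and is not part of this notebook.

```python
import json


def build_text_to_image_body(
    prompt,
    seed=None,
    aspect_ratio=None,
    negative_prompt=None,
    output_format=None,
):
    """Return a JSON request body string, omitting any option left as None."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")

    payload = {"prompt": prompt, "mode": "text-to-image"}

    # Field names below are assumptions about the SD3.5 Large request schema
    optional = {
        "seed": seed,
        "aspect_ratio": aspect_ratio,
        "negative_prompt": negative_prompt,
        "output_format": output_format,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(payload)


example_body = build_text_to_image_body(
    "A floating island at dawn", seed=42, aspect_ratio="16:9"
)
print(example_body)
```

The returned string could be passed as `body` to `generate_image_from_text`. Keeping the body construction in a pure function makes it easy to inspect or unit test without calling Bedrock.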

In [25]:
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Shows how to generate an image from a reference image with SD3.5 Large (on demand).
"""

class ImageToImageRequest:
    """
    Class for handling image to image request parameters.
    """

    def __init__(
        self,
        image_width,
        image_height,
        positive_prompt,
        # negative_prompt,
        init_image_mode="IMAGE_STRENGTH",
        image_strength=0.5,
        cfg_scale=7,
        clip_guidance_preset="SLOWER",
        sampler="K_DPMPP_2M",
        samples=1,
        seed=1,
        steps=30,
        style_preset="photographic",
        extras=None,
    ):
        self.image_width = image_width
        self.image_height = image_height
        self.positive_prompt = positive_prompt
        # self.negative_prompt = negative_prompt
        self.init_image_mode = init_image_mode
        self.image_strength = image_strength
        self.cfg_scale = cfg_scale
        self.clip_guidance_preset = clip_guidance_preset
        self.sampler = sampler
        self.samples = samples
        self.seed = seed
        self.steps = steps
        self.style_preset = style_preset
        self.extras = extras


@unique
class StylesPresets(Enum):
    """
    Enumerator for SD3.5 Large style presets.
    """

    THREE_D_MODEL = "3d-model"
    ANALOG_FILM = "analog-film"
    ANIME = "anime"
    CINEMATIC = "cinematic"
    COMIC_BOOK = "comic-book"
    DIGITAL_ART = "digital-art"
    ENHANCE = "enhance"
    FANTASY_ART = "fantasy-art"
    ISOMETRIC = "isometric"
    LINE_ART = "line-art"
    LOW_POLY = "low-poly"
    MODELING_COMPOUND = "modeling-compound"
    NEON_PUNK = "neon-punk"
    ORIGAMI = "origami"
    PHOTOGRAPHIC = "photographic"
    PIXEL_ART = "pixel-art"
    TILE_TEXTURE = "tile-texture"


def generate_image_from_image(model_id, body):
    """
    Generate an image using SD3.5 Large on demand.
    Args:
        model_id (str): The model ID to use.
        body (str) : The request body to use.
    Returns:
        image_bytes (bytes): The image generated by the model.
    """

    logger.info("Generating image with SD3.5 Large model %s", model_id)

    bedrock = boto3.client(service_name="bedrock-runtime")

    accept = "application/json"
    content_type = "application/json"

    response = bedrock.invoke_model(
        body=body, modelId=model_id, accept=accept, contentType=content_type
    )
    response_body = json.loads(response.get("body").read())
    logger.info(f"Bedrock result: {response_body['result']}")

    base64_image = response_body.get("artifacts")[0].get("base64")
    base64_bytes = base64_image.encode("ascii")
    image_bytes = base64.b64decode(base64_bytes)

    finish_reason = response_body.get("artifacts")[0].get("finishReason")

    if finish_reason == "ERROR" or finish_reason == "CONTENT_FILTERED":
        raise ImageError(f"Image generation error. Error code is {finish_reason}")

    logger.info("Successfully generated image with the SD3.5 Large model %s", model_id)

    return image_bytes


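def image_to_image_request(
    imageToImageRequest,
    source_image,
    save_image_path=None,
    save_image_folder=None,
):
    """
    Generate an image from a source image and save it to disk.
    Args:
        imageToImageRequest (ImageToImageRequest): The image to image request to use.
        source_image (str): Path of the source image to use.
        save_image_path (str, optional): Exact path to save the generated image to.
        save_image_folder (str, optional): Folder to save the generated image to when
            save_image_path is not given. Defaults to GENERATED_IMAGES.
    """

    # Resize the source image to the requested dimensions
    image = Image.open(source_image)
    new_image = image.resize(
        (imageToImageRequest.image_width, imageToImageRequest.image_height)
    )

    # splitext handles extensions of any length (e.g. ".jpeg"); JPEG requires RGB mode
    tmp_image_path = f"{os.path.splitext(source_image)[0]}_tmp.jpg"
    new_image.convert("RGB").save(tmp_image_path)

    # Read the resized image back and encode it as a base64 string
    with open(tmp_image_path, "rb") as image_file:
        init_image = base64.b64encode(image_file.read()).decode("utf8")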

    # Build request body
    body = json.dumps(
        {
            "text_prompts": [
                {"text": imageToImageRequest.positive_prompt, "weight": 1}
                # {"text": imageToImageRequest.negative_prompt, "weight": -1},
            ],
            "init_image": init_image,
            "init_image_mode": imageToImageRequest.init_image_mode,
            "image_strength": imageToImageRequest.image_strength,
            "cfg_scale": imageToImageRequest.cfg_scale,
            "clip_guidance_preset": imageToImageRequest.clip_guidance_preset,
            "sampler": imageToImageRequest.sampler,
            "samples": imageToImageRequest.samples,
            "seed": imageToImageRequest.seed,
            "steps": imageToImageRequest.steps,
            "style_preset": imageToImageRequest.style_preset,
        }
    )

    try:
        logger.info(f"Source image: {source_image}")
        image_bytes = generate_image_from_image(model_id=MODEL_ID, body=body)
        image = Image.open(io.BytesIO(image_bytes))
        epoch_time = int(time.time())

        if save_image_path is not None:
            generated_image_path = save_image_path
        elif save_image_folder is not None:
            generated_image_path = f"{save_image_folder}/image_{epoch_time}_{imageToImageRequest.seed}_{imageToImageRequest.sampler}_{imageToImageRequest.image_strength}_{imageToImageRequest.cfg_scale}_{imageToImageRequest.steps}_{imageToImageRequest.style_preset}.jpg"
        else:
            generated_image_path = f"{GENERATED_IMAGES}/image_{epoch_time}_{imageToImageRequest.seed}_{imageToImageRequest.sampler}_{imageToImageRequest.image_strength}_{imageToImageRequest.cfg_scale}_{imageToImageRequest.steps}_{imageToImageRequest.style_preset}.jpg"


        logger.info(f"Generated image: {generated_image_path}")
        image.save(generated_image_path, format="JPEG", quality=95)

    except ClientError as err:
        message = err.response["Error"]["Message"]
        logger.error("A client error occurred: %s", message)
    except ImageError as err:
        logger.error(err.message)

    else:
        logger.info(f"Finished generating image with SD3.5 Large model {MODEL_ID}.")
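The image-to-image request above uses field names such as `text_prompts`, `init_image`, and `cfg_scale`, and reads `artifacts` from the response. These look like the request format of the earlier Stable Diffusion XL models rather than the `prompt`/`mode` body used for text-to-image in this notebook, so the call may be rejected when `MODEL_ID` points at SD3.5 Large. The following is a hedged sketch, assuming SD3.5 Large on Bedrock accepts `"mode": "image-to-image"` together with a base64-encoded `image` and a `strength` between 0 and 1; verify the field names against the Bedrock model parameter reference. `build_image_to_image_body` is a hypothetical helper.

```python
import base64
import json


def build_image_to_image_body(prompt, image_bytes, strength=0.5, seed=None):
    """Return a JSON body for an image-to-image request (field names are assumed)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")

    payload = {
        "prompt": prompt,
        "mode": "image-to-image",
        # The source image travels as a base64-encoded string
        "image": base64.b64encode(image_bytes).decode("utf-8"),
        "strength": strength,
    }
    if seed is not None:
        payload["seed"] = seed
    return json.dumps(payload)


# Placeholder bytes stand in for the contents of a real image file
example_body = build_image_to_image_body("Same scene at night", b"raw image bytes", strength=0.7)
print(json.loads(example_body)["mode"])
```

Under the same assumption, the response would carry the result in `images`, as in `generate_image_from_text`, rather than in `artifacts`.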

In [26]:
from PIL import Image
from IPython.display import display

def display_image(source_image_name, width=None, height=None):
    source_image = Image.open(source_image_name)
    if width and height:
        display(source_image.resize((width, height)))
    else:
        display(source_image)
    print(source_image_name)

### Define the Claude function

In [27]:
def invoke_claude(client, prompt, max_tokens_to_sample=2000, modelId="anthropic.claude-3-sonnet-20240229-v1:0", temperature=1, top_k=250, top_p=0.999, stop_sequences=None, retry=3):
    """
    Send a single-turn prompt to Claude 3 on Bedrock and return the text of the reply.
    Retries up to `retry` times, waiting 60 seconds between attempts.
    """
    # Use None as the default to avoid sharing a mutable list between calls
    stop_sequences = list(stop_sequences or [])
    body_dict = {"messages": [
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": prompt
              }
            ]}],
            "max_tokens": max_tokens_to_sample,
            "temperature": temperature,
            "top_k": top_k,
            "top_p": top_p,
            "stop_sequences": stop_sequences+["\n\nHuman:"],
            "anthropic_version": "bedrock-2023-05-31"}
    body = json.dumps(body_dict)

    request = {
      "modelId": modelId,
      "contentType": "application/json",
      "accept": "application/json",
      "body": body
    }

    last_error = None
    for trial in range(retry):
        try:
            response = client.invoke_model(**request)
            response_body = json.loads(response.get('body').read())
            return response_body["content"][0]["text"]
        except Exception as e:
            last_error = e
            print(str(e))
            if trial < retry - 1:
                print("Bedrock request failed (possibly throttled). Retrying in a minute.")
                time.sleep(60)

    # Every attempt failed: raise instead of returning an undefined response
    raise RuntimeError(f"Bedrock request failed after {retry} attempts") from last_error
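The retry loop above waits a fixed 60 seconds after each failure. A common alternative for throttled APIs is exponential backoff with jitter, where the wait grows with each attempt and is randomized so that parallel callers do not retry in lockstep. The sketch below is illustrative only: `call_with_backoff` and its parameters are hypothetical, and it retries on any exception, which a real implementation would likely narrow to throttling errors.

```python
import random
import time


def call_with_backoff(func, max_attempts=4, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func() until it succeeds, sleeping with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Cap the exponential delay, then pick a random wait up to that cap
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


attempts = {"count": 0}


def flaky_call():
    """Simulated request that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated throttling")
    return "ok"


outcome = call_with_backoff(flaky_call, base_delay=0.01)
print(outcome, "after", attempts["count"], "attempts")
```

In this notebook, the wrapped call could be a `lambda` around `client.invoke_model(**request)`. The `sleep` parameter exists so the waiting can be replaced in tests.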

## Let's start the demo

### Generate the game asset concepts

In [28]:
prompt = """You are an expert game concept artist and world-builder for MysticMingle Games, specializing in creating immersive visual narratives for the Mystic Realms universe.

Your task is to generate comprehensive, creative concepts for game assets, characters, environments, and marketing materials that capture the unique essence of Mystic Realms in json format.

Key Generation Parameters:
1. Game Genre: Fantasy RPG with multi-dimensional magical world
2. Core Visual Themes: 
   - Elemental magic
   - Shapeshifting mechanics
   - Technological-magical fusion
   - Mythical creature interactions

Asset Generation Requirements:
- Detailed character designs with unique magical attributes
- Environment concepts showcasing diverse, interconnected realms
- Prop and asset designs that reflect the game's magical-technological aesthetic
- Color palettes that emphasize mystical and ethereal qualities


Target Audience: 
- Gamers aged 18-35
- Fans of complex, immersive fantasy worlds
- Players who enjoy strategic magical gameplay

Artistic Style: 
- High-detail, semi-realistic with mystical overtones
- Color palette: Jewel tones, ethereal blues, magical purples


json Output Format:
[
    {
        "assetType": "Character/Environment/Prop",
        "name": "Specific Asset Name",
        "description": "Detailed visual and thematic description",
        "uniqueFeatures": "Special magical or technological attributes"
    }
]


"""
client = boto3.client(service_name="bedrock-runtime")
result = invoke_claude(client, prompt)
print(result)

```json
[
  {
    "assetType": "Character",
    "name": "Arenys, the Shapeshifter Mage",
    "description": "A powerful sorceress with the ability to transform into mythical creatures. Her human form is a lithe, agile woman with long azure hair and glowing amethyst eyes. When shapeshifted, she takes on draconic qualities with scales shimmering like opals and wings that leave a trail of stardust.",
    "uniqueFeatures": "Shapeshifting abilities, elemental magic affinities, mystical energy aura"
  },
  {
    "assetType": "Environment",
    "name": "Arcanian Skylands",
    "description": "A realm suspended in the celestial skies, where floating islands drift amidst swirling nebulae. Crystal spires erupt from the islands, channeling raw magical energy. Atmospheric plants and trees flourish, their roots intertwined with pulsating ley lines. The skies are a kaleidoscope of auroras and cosmic phenomena.",
    "uniqueFeatures": "Anti-gravitational properties, dynamic celestial phenomena, float

In [29]:
def clean_json_string(result):
    if result.startswith('```json\n'):
        result = result[8:]
    if result.endswith('\n```'):
        result = result[:-4]
    return result.strip()
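`clean_json_string` handles a reply that starts and ends exactly with a JSON code fence. Model output does not always follow that shape: it may add a sentence before the fence or omit the fence entirely. As a rough sketch of a more tolerant approach (`extract_json` is a hypothetical helper, not part of the notebook), the function below looks for a fenced block first and otherwise falls back to the outermost bracketed span.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built at runtime to keep this example easy to embed
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(text):
    """Parse JSON from model output that may include a code fence or extra prose."""
    match = FENCED_BLOCK.search(text)
    candidate = match.group(1) if match else text

    # Find the first opening bracket or brace, then the last matching closer
    starts = [i for i in (candidate.find("["), candidate.find("{")) if i != -1]
    if not starts:
        raise ValueError("no JSON array or object found")
    start = min(starts)
    closer = "]" if candidate[start] == "[" else "}"
    end = candidate.rfind(closer)
    if end < start:
        raise ValueError("unterminated JSON value")

    return json.loads(candidate[start:end + 1])


reply = "Here are the concepts:\n" + FENCE + 'json\n[{"name": "Arenys"}]\n' + FENCE
print(extract_json(reply))
```

A reply that was cut off before the closing bracket (for example, because `max_tokens` was reached) raises `ValueError` rather than returning partial data, so the caller can decide whether to retry.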

In [30]:
result = clean_json_string(result)

In [31]:
import json
data = json.loads(result)
for item in data:
    print(item['assetType'])
    print(item['name'])
    print(item['description'])
    print(item['uniqueFeatures'])
#     final_prompt = internal_prompt.format(
                    
#             )

Character
Arenys, the Shapeshifter Mage
A powerful sorceress with the ability to transform into mythical creatures. Her human form is a lithe, agile woman with long azure hair and glowing amethyst eyes. When shapeshifted, she takes on draconic qualities with scales shimmering like opals and wings that leave a trail of stardust.
Shapeshifting abilities, elemental magic affinities, mystical energy aura
Environment
Arcanian Skylands
A realm suspended in the celestial skies, where floating islands drift amidst swirling nebulae. Crystal spires erupt from the islands, channeling raw magical energy. Atmospheric plants and trees flourish, their roots intertwined with pulsating ley lines. The skies are a kaleidoscope of auroras and cosmic phenomena.
Anti-gravitational properties, dynamic celestial phenomena, floating architecture
Prop
Runeforged Technoblade
An elegant yet deadly weapon that fuses arcane runes with advanced technology. The blade is composed of a crystalline alloy that refracts l

### Parse the asset concepts and generate prompts for Stable Diffusion 3.5 Large

In [32]:
import json

def parse_json_string_to_prompt(json_string):
    prompts = []
    internal_prompt = """
    You are an expert who wants to use the stable diffusion model to generate gaming assets and worlds. Please use the following content to generate the prompts for stable diffusion model:
    - "assetType": {assetType}
    - "name": {name}
    - "description": {description}
    - "uniqueFeatures": {uniqueFeatures}
    
    """
    tailend_prompt = """
    Output format should be json format as below:
    [
        {
            "positive_prompt": "prompt"
        }
    ]"""
    
    try:
        data = json.loads(json_string)
        for item in data:
            final_prompt = internal_prompt.format(
                assetType=item['assetType'],
                name=item['name'],
                description=item['description'],
                uniqueFeatures=item['uniqueFeatures'],
            )
            generated_prompt = invoke_claude(client, final_prompt + tailend_prompt)
            print(f"generated_prompt: {generated_prompt}")
            prompts.append(generated_prompt)
    except json.JSONDecodeError:
        print("Invalid JSON string")
    
    return prompts

# Sample JSON string

#json_string = result

prompts = parse_json_string_to_prompt(result)
print(prompts)


generated_prompt: [
    {
        "positive_prompt": "A powerful sorceress with long azure hair and glowing amethyst eyes, lithe and agile in her human form, transforming into a majestic draconic creature with opal-shimmering scales and wings trailing stardust, surrounded by an aura of mystical elemental energy, assetType:character, name:Arenys the Shapeshifter Mage"
    }
]
generated_prompt: [
    {
        "positive_prompt": "A breathtaking environment of floating islands suspended in a celestial sky, drifting amidst swirling nebulae and auroras, crystal spires erupting from the islands and channeling raw magical energy, atmospheric plants and trees with roots intertwined with pulsating ley lines, anti-gravitational properties, dynamic celestial phenomena, floating architecture, astonishing cosmic vistas, masterpiece, 8k, photorealistic"
    }
]
generated_prompt: [
    {
        "positive_prompt": "An elegant prop sword with a crystalline blade composed of a refracting alloy, with in

### Generate images for the asset concepts

In [33]:
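import time
for index, prompt_json in enumerate(prompts):
    # Strip any code fence around the reply before parsing it
    prompt = json.loads(clean_json_string(prompt_json))
    POSITIVE_PROMPT = None  # reset so a prompt from a previous iteration is never reused
    for item in prompt:
        if "positive_prompt" in item:
            POSITIVE_PROMPT = item["positive_prompt"]
            print(POSITIVE_PROMPT)

    if POSITIVE_PROMPT is None:
        print(f"No positive_prompt found for concept {index}; skipping.")
        continue

    SAVE_IMAGE_PATH = f"./generated_images/game_{index}.jpg"
    print(f"Attempting to save image to: {SAVE_IMAGE_PATH}")

    text_to_image_request(
        MODEL_ID,
        POSITIVE_PROMPT,
        SAVE_IMAGE_PATH
    )
    # Pause between requests (helps avoid throttling)
    time.sleep(60)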

2025-01-22 00:37:02,270 - INFO - Generating image with SD3.5 Large model stability.sd3-5-large-v1:0


A powerful sorceress with long azure hair and glowing amethyst eyes, lithe and agile in her human form, transforming into a majestic draconic creature with opal-shimmering scales and wings trailing stardust, surrounded by an aura of mystical elemental energy, assetType:character, name:Arenys the Shapeshifter Mage
Attempting to save image to: ./generated_images/game_0.jpg


2025-01-22 00:37:08,213 - INFO - Successfully generated image with the SD3.5 Large model stability.sd3-5-large-v1:0
2025-01-22 00:37:08,219 - INFO - Generated image: ./generated_images/game_0.jpg
2025-01-22 00:37:08,265 - INFO - Finished generating image with SD3.5 Large model stability.sd3-5-large-v1:0.


The generated image has been saved to ./generated_images/game_0.jpg.


2025-01-22 00:38:08,267 - INFO - Generating image with SD3.5 Large model stability.sd3-5-large-v1:0


A breathtaking environment of floating islands suspended in a celestial sky, drifting amidst swirling nebulae and auroras, crystal spires erupting from the islands and channeling raw magical energy, atmospheric plants and trees with roots intertwined with pulsating ley lines, anti-gravitational properties, dynamic celestial phenomena, floating architecture, astonishing cosmic vistas, masterpiece, 8k, photorealistic
Attempting to save image to: ./generated_images/game_1.jpg


2025-01-22 00:38:14,053 - INFO - Successfully generated image with the SD3.5 Large model stability.sd3-5-large-v1:0
2025-01-22 00:38:14,056 - INFO - Generated image: ./generated_images/game_1.jpg
2025-01-22 00:38:14,095 - INFO - Finished generating image with SD3.5 Large model stability.sd3-5-large-v1:0.


The generated image has been saved to ./generated_images/game_1.jpg.


2025-01-22 00:39:14,096 - INFO - Generating image with SD3.5 Large model stability.sd3-5-large-v1:0


An elegant prop sword with a crystalline blade composed of a refracting alloy, with intricate arcane runes etched along the edges, a pulsing power core in the hilt, emanating elemental energies, Runeforged Technoblade, highly detailed, 8k, photorealistic
Attempting to save image to: ./generated_images/game_2.jpg


2025-01-22 00:39:19,621 - INFO - Successfully generated image with the SD3.5 Large model stability.sd3-5-large-v1:0
2025-01-22 00:39:19,624 - INFO - Generated image: ./generated_images/game_2.jpg
2025-01-22 00:39:19,669 - INFO - Finished generating image with SD3.5 Large model stability.sd3-5-large-v1:0.


The generated image has been saved to ./generated_images/game_2.jpg.


2025-01-22 00:40:19,670 - INFO - Generating image with SD3.5 Large model stability.sd3-5-large-v1:0


A towering mechanized humanoid figure, fusion of gears, metal plates and conduits pulsing with ethereal energies, gemstones as visual receptors, slivers of metallic flesh imbued with faint consciousness, sentient machinery, mystical power source, morphing mechanical anatomy, epic fantasy, magic technology
Attempting to save image to: ./generated_images/game_3.jpg


2025-01-22 00:40:25,523 - INFO - Successfully generated image with the SD3.5 Large model stability.sd3-5-large-v1:0
2025-01-22 00:40:25,560 - INFO - Generated image: ./generated_images/game_3.jpg
2025-01-22 00:40:25,627 - INFO - Finished generating image with SD3.5 Large model stability.sd3-5-large-v1:0.


The generated image has been saved to ./generated_images/game_3.jpg.
