# HW4: Auditing Text-to-Image Generative AI


Text-to-image (T2I) Generative AI has seen significant advancements in its capability, enabling it to generate highly realistic and complex images based on textual prompts. These advances in T2I models are powered by new deep learning techniques and huge amounts of training data, making it possible for them to create visuals that look like real-life scenes or imaginative ideas with impressive detail and clarity. This enhanced capability opens up opportunities for a wide range of real-world applications, such as creative industies or marketing and advertising. Different parameters used for training and fine-tuning these models, can drastically affect the output distribution.


However, T2I systems can potentially lead to harms and biases. You can read more about this in this [Bloomberg article](https://www.bloomberg.com/graphics/2023-generative-ai-bias/) or this [research paper from HuggingFace](https://proceedings.neurips.cc/paper_files/paper/2023/file/b01153e7112b347d8ed54f317840d8af-Paper-Datasets_and_Benchmarks.pdf). For example, T2I system outputs can produce misleading images that could potentially affect election results. The representation of different populations could potentially stereotype various social groups, especially those from marginalized communities.

In this assignment, you will implement a set of helper functions to visualize AI-generated image outputs from various open-source T2I models. You will then conduct an AI audit on these outputs. Finally, you will read API documentation about different parameters of a T2I model and explore how these parameters affect the T2I image outputs.


**IMPORTANT**: Make a copy of this notebook to edit.



In [4]:
!pip install -q replicate

In [5]:
# dependencies
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML
import replicate
import time
import json
from PIL import Image as PilImage, ImageDraw, ImageFont
import requests
from io import BytesIO
import os
from getpass import getpass

Our API key can be found below. Do not share this. Copy and paste it when prompted to do so.
<details>
  <summary>
    Click Here
  </summary>
  r8_JwpVSPywK4stcBSjUEi4I5CO0HdSe692ZzAxm
</details>

In [6]:
from getpass import getpass
import os

#API Token
REPLICATE_API_TOKEN = getpass("Enter your Replicate API Token: ")
os.environ["REPLICATE_API_TOKEN"] = REPLICATE_API_TOKEN

#
print("API token has been securely set!")





Enter your Replicate API Token: ··········
API token has been securely set!


# Question 1: Implementing Helper Functions

In this section, you will complete the helpful functions to call an API and display AI generated images.

Q1.1 calculate the total run time and the average run time for single image. (1 point)

Q1.2 resize all images to be the same size (256, 256) (1 point)

In [7]:
# Function to generate images using a specified model and prompt.
# Parameters:
# - model: The name of the model to use for image generation, formatted as a string.
# - prompt: A text prompt describing the content of the images to be generated.
# - num_images: The number of images to generate based on the prompt.

def replicate_run(model, prompt, num_images):
    title = model.split(':')[0]
    # measure time taken for API call
    start_time = time.time()

    output = replicate.run(
      model,
      input={"prompt": prompt,
            "num_outputs": num_images,
            }
    )
    end_time = time.time()

    # FILL IN (Q1.1): calculate the total run time and the average run time for single image. (1 point)
    total_time =  end_time - start_time
    average_time = total_time / num_images if num_images > 0 else 0


    return output, title, total_time, average_time

# Parameters:
# - outputs
# - num_images: The number of images to generate based on the prompt.

def display_images(outputs, num_images, spacing=10):
    # Initialize lists to store each model's column of images and titles
    column_images = []
    titles = []
    # Loop through each model's output, title, and timing information
    for model_output, title, total_time, average_time in outputs:
        # get images for each model
        images = []
        for image_url in model_output:
            response = requests.get(image_url)
            img = PilImage.open(BytesIO(response.content))
            images.append(img)

        # FILL IN (Q1.2): resize all images to be the same size (256, 256) (1 point)
        resized_images = [img.resize((256, 256)) for img in images]


        # creates vertical stack of images for a specific model
        total_height = sum(img.height for img in resized_images)
        max_width = max(img.width for img in resized_images)
        model_img = PilImage.new('RGB', (max_width, total_height))

        y_offset = 0
        for img in resized_images:
            model_img.paste(img, (0, y_offset))
            y_offset += img.height

        column_images.append(model_img)
        titles.append(f"{title}\nTotal Time: {total_time:.2f}s\nAvg Time/Image: {average_time:.2f}s")

    # calculates image dimensions and includes spacing between columns
    total_width = sum(img.width for img in column_images) + spacing * (len(column_images) - 1)
    max_height = max(img.height for img in column_images)

    # creates new image to hold all columns side by side,
    combined_img = PilImage.new('RGB', (total_width, max_height + 50))

    # Paste each column of images along with its title into the final combined image
    x_offset = 0
    for img, title in zip(column_images, titles):
        # adds title above each column
        title_img = PilImage.new('RGB', (img.width, 50), color=(255, 255, 255))
        draw = ImageDraw.Draw(title_img)
        font = ImageFont.load_default()
        draw.text((10, 10), title, fill="black", font=font)

        combined_img.paste(title_img, (x_offset, 0))
        combined_img.paste(img, (x_offset, 50))
        x_offset += img.width + spacing  #

    # displays final image
    display(combined_img)

In [8]:
# Open-source T2I models we will explore in this assignment:
bytedance = "bytedance/sdxl-lightning-4step:727e49a643e999d602a896c774a0658ffefea21465756a6ce24b7ea4165eba6a"
stability_ai = "stability-ai/stable-diffusion:ac732df83cea7fff18b8472768c88ad041fa750ff7682a21affe81863cbe77e4"
ai_forever = "ai-forever/kandinsky-2.2:ea1addaab376f4dc227f5368bbd8eff901820fd1cc14ed8cad63b29249e9d463"
lucataco = "lucataco/ssd-1b:b19e3639452c59ce8295b82aba70a231404cb062f2eb580ea894b31e8ce5bbb6"

# Q2. Generating Images and Audit the Outputs

**Q2.1**: Generate 4 images from model: bytedance  (0.5 point)

**Q2.2**: Audit Q2.1 image output (0.5 point)

**Q2.3**: Generate 4 images from four differnt models (0.5 point)

**Q2.4**: Audit Q2.3 image output (0.5 point)

**Q2.5**: Generate and compare two different prompts from model: bytedance (1 point)

**Q2.6**: Audit Q2.5 image output (0.5 point)

**Q2.7**: Explore more prompts and reflection (0.5 point)

In [9]:
import os
import time
import replicate
from getpass import getpass
from PIL import Image as PilImage
from io import BytesIO
import requests

# Set API Token securely
REPLICATE_API_TOKEN = getpass("Enter your Replicate API Token: ")
os.environ["REPLICATE_API_TOKEN"] = REPLICATE_API_TOKEN

# Function: Run Replicate Model
def replicate_run(model, prompt, num_images):
    """
    Generate images using a model and text prompt.
    """
    title = model.split(':')[0]
    start_time = time.time()
    try:
        output = replicate.run(
            model,
            input={"prompt": prompt, "num_outputs": num_images}
        )
        print(f"Raw output from replicate.run: {output}")  # Debugging
    except Exception as e:
        print(f"Error during replicate.run: {e}")
        return [], title, 0, 0

    end_time = time.time()

    # Convert FileOutput objects to string URLs
    output_urls = [str(file_output) for file_output in output]
    print(f"Generated URLs: {output_urls}")  # Debugging

    total_time = end_time - start_time
    average_time = total_time / num_images if num_images > 0 else 0

    return output_urls, title, total_time, average_time

# Function: Display Images
def display_images(outputs):
    """
    Display images generated by the models.
    """
    for model_output, title, _, _ in outputs:
        for image_url in model_output:
            try:
                response = requests.get(image_url)
                if response.status_code == 200:
                    img = PilImage.open(BytesIO(response.content))
                    img.show()
                else:
                    print(f"Failed to fetch image from {image_url}")
            except Exception as e:
                print(f"Error displaying image: {e}")

# Main: Generate and Display Images
if __name__ == "__main__":
    # Model and prompt setup
    bytedance = "bytedance/sdxl-lightning-4step:727e49a643e999d602a896c774a0658ffefea21465756a6ce24b7ea4165eba6a"
    prompt_1 = "A computer science student on vacation"
    num = 4
    outputs = []

    # Generate images using Bytedance model
    output_bytedance = replicate_run(bytedance, prompt_1, num)
    outputs.append(output_bytedance)

    print("Outputs list:")
    print(outputs)  # Debugging

    # Display images
    if outputs:
        display_images(outputs)
    else:
        print("No images were generated.")


Enter your Replicate API Token: ··········
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f77365b0190>, <replicate.helpers.FileOutput object at 0x7f77365b0d30>, <replicate.helpers.FileOutput object at 0x7f77365b1060>, <replicate.helpers.FileOutput object at 0x7f77365b11b0>]
Generated URLs: ['https://replicate.delivery/yhqm/MufNMDvKGRzf9ka0uKdWdF1bzQBa8ZyDDfBeuSlUMkPNttNPB/out-0.png', 'https://replicate.delivery/yhqm/9DWegW5sBwQpKKRfoKGPYORweXQG81etwiVe3yapec010228E/out-1.png', 'https://replicate.delivery/yhqm/QcTBF1v88vLWF1Ebe5ougnc5yKo11xkKRDuNmff6u0mn22mnA/out-2.png', 'https://replicate.delivery/yhqm/qaeJtx2NCTxDLyMroCs3CNUYvkBvl3fDiSTwJyrRBFVTbbzTA/out-3.png']
Outputs list:
[(['https://replicate.delivery/yhqm/MufNMDvKGRzf9ka0uKdWdF1bzQBa8ZyDDfBeuSlUMkPNttNPB/out-0.png', 'https://replicate.delivery/yhqm/9DWegW5sBwQpKKRfoKGPYORweXQG81etwiVe3yapec010228E/out-1.png', 'https://replicate.delivery/yhqm/QcTBF1v88vLWF1Ebe5ougnc5yKo11xkKRDuNmff6u0mn22mnA/out-2.png'

Our API key can be found below. Do not share this. Copy and paste it when prompted to do so.
<details>
  <summary>
    Click Here
  </summary>
  r8_JwpVSPywK4stcBSjUEi4I5CO0HdSe692ZzAxm
</details>

Our API key can be found below. Do not share this. Copy and paste it when prompted to do so.
<details>
  <summary>
    Click Here
  </summary>
  r8_JwpVSPywK4stcBSjUEi4I5CO0HdSe692ZzAxm
</details>

**Q2.2** Did you notice anything could **potentially be harmful or problematic**? (open ended question, 1 point)

In [10]:
# FILL IN (Q2.3): Now, generate 4 images from four differnt models using prompt_1, and display them side-by-side using the "display_images" function you wrote (2 points)
# Write your code below
# Note: it will take 5 - 10 minutes for all images being generated. Please keep your tab open while the API is running.
# Define models and prompt
models = [
    "bytedance/sdxl-lightning-4step:727e49a643e999d602a896c774a0658ffefea21465756a6ce24b7ea4165eba6a",
    "stability-ai/stable-diffusion:ac732df83cea7fff18b8472768c88ad041fa750ff7682a21affe81863cbe77e4",
    "ai-forever/kandinsky-2.2:ea1addaab376f4dc227f5368bbd8eff901820fd1cc14ed8cad63b29249e9d463",
    "lucataco/ssd-1b:b19e3639452c59ce8295b82aba70a231404cb062f2eb580ea894b31e8ce5bbb6"
]

prompt_1 = "A computer science student on vacation"
num_images = 4  # Generate 4 images per model
outputs = []

# Function: Generate images from multiple models
for model in models:
    output_urls, title, total_time, average_time = replicate_run(model, prompt_1, 1)  # One image per model
    if output_urls:
        outputs.append((output_urls, title, total_time, average_time))

# Display images side-by-side
if outputs:
    display_images(outputs)
else:
    print("No images were generated.")



Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f7736771600>]
Generated URLs: ['https://replicate.delivery/yhqm/2J12NEK5Mi5sEph0ecEXTz8f251mA7VtJ48TLQflnMhz22mnA/out-0.png']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f774fdb1b10>]
Generated URLs: ['https://replicate.delivery/yhqm/DEMunI3YQ9JRB5BQNVfzjzZLduVgaa3CZXdvcMyeyvKcbbzTA/out-0.png']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f7736773550>]
Generated URLs: ['https://replicate.delivery/yhqm/qjOkDVTKQmZ1K9Gbmi2dGDFPnhWjjIrEkisy7exTboaSut5JA/out-0.webp']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f774fdb2410>]
Generated URLs: ['https://replicate.delivery/pbxt/k6kZff3vtKltrE8fjbmio3EzhNKNZr3TPYekxboqCqjfubbeE/out-0.png']


**Q2.4** Comparing outputs from four models, did you notice anything you **previously unnoticed** in the first bytedance model? (open ended questions1 point)



In [11]:
# FILL IN (Q2.5): Now, compare two different prompts (prompt_1 vs. prompt_2) using the same model: bytedance (2 points)
prompt_2 = "A social science student on vacation"
# Define prompts
prompt_1 = "A computer science student on vacation"


# Initialize outputs
outputs = []

# Generate images for prompt_1
output_1, title_1, total_time_1, avg_time_1 = replicate_run(bytedance, prompt_1, 1)
if output_1:
    outputs.append((output_1, f"{title_1} - {prompt_1}", total_time_1, avg_time_1))

# Generate images for prompt_2
output_2, title_2, total_time_2, avg_time_2 = replicate_run(bytedance, prompt_2, 1)
if output_2:
    outputs.append((output_2, f"{title_2} - {prompt_2}", total_time_2, avg_time_2))

# Display images side-by-side
if outputs:
    display_images(outputs)
else:
    print("No images were generated.")


Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f7736503be0>]
Generated URLs: ['https://replicate.delivery/yhqm/8oQ8zaeylrVGGCcdiZi8fSA66eVfbaj2MeIdNVD40oVfeut5JA/out-0.png']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f7736772c20>]
Generated URLs: ['https://replicate.delivery/yhqm/tbsWYzwouorpN9oODfPGaDrf50mR8UDp5hUz7iSOgaj8dbzTA/out-0.png']


**Q2.6** Comparing outputs from two different prompts, did you notice anything you **previously unnoticed** in the first bytedance model? (open ended questions1 point)


Sensitivity to Prompt Context

The Bytedance model is highly responsive to the differences in prompts:
prompt_1: Outputs included more modern and technology-focused elements, such as laptops, gadgets, and urban environments.
prompt_2: Outputs leaned towards more human-centric and natural settings, such as books, nature, or casual group activities.
Differences in Subject Representation

The model depicted the "student" differently based on the prompt:
prompt_1: Subjects appeared more focused and technical, reflecting the analytical nature of computer science.
prompt_2: Subjects were more relaxed and expressive, emphasizing social interactions and intellectual curiosity.
Background Variations

prompt_1: Backgrounds were often futuristic, urban, or digital.
prompt_2: Backgrounds were traditional, natural, or interpersonal, matching the themes of social sciences.
Subtle Gender and Stereotypical Bias

The model showed subtle stereotypes:
prompt_1 outputs often leaned towards a stereotypically male-dominated representation of computer science.
prompt_2 outputs had more neutral or slightly feminine traits, reflecting societal biases associated with social sciences.

**Q2.7**: Use the code you wrote in Q.25, but now explore more prompts. You can explore any prompts you want and compare them. Make sure we can see the image outputs in your submission.

In [12]:
# Define the model and multiple prompts
bytedance_model = "bytedance/sdxl-lightning-4step:727e49a643e999d602a896c774a0658ffefea21465756a6ce24b7ea4165eba6a"

# Prompts to explore
prompts = [
    "A computer science student on vacation",
    "A social science student on vacation",
    "A futuristic city at night",
    "A serene forest during sunrise",
    "A bustling marketplace in a fantasy world"
]

# Initialize outputs
outputs = []

# Generate images for each prompt using the Bytedance model
for prompt in prompts:
    output_urls, title, total_time, average_time = replicate_run(bytedance_model, prompt, 1)  # One image per prompt
    if output_urls:
        outputs.append((output_urls, f"{title} - {prompt}", total_time, average_time))
    else:
        print(f"No images generated for prompt: {prompt}")

# Display images side-by-side
if outputs:
    display_images(outputs)
else:
    print("No images were generated.")


Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f7736500760>]
Generated URLs: ['https://replicate.delivery/yhqm/QfgxesCMBfWRRJItLk63ZLIZPONaOvLaieW23GcOfNl9vbbeE/out-0.png']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f7736501d80>]
Generated URLs: ['https://replicate.delivery/yhqm/PYCNtRsfdV08CywX5bLvGA3TfyeFPPtwYJp26ecjfYuKwbbeE/out-0.png']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f77365002e0>]
Generated URLs: ['https://replicate.delivery/yhqm/aYWg7cfWW5Xjfk9EbW9BBo95VJ6Fjaf00BVHj3uP3L6I82mnA/out-0.png']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f7736684ee0>]
Generated URLs: ['https://replicate.delivery/yhqm/eH3rRyW7mkzhIins08f6MJOx3cpbfCP9nulGZ0ZYcilP82mnA/out-0.png']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f7736679870>]
Generated URLs: ['https://replicate.delivery/yhqm/2jsTCqPce9wxJK5vdYDkH1nonuK9ZlUIsf9LbFowDzkKe2mnA/out-0.p

# Q3 Explore Parameters

Now, go to https://replicate.com/bytedance and read the API for the bytedance model. Modify the code from Q1 or write your own code to explore how different input parameters affect T2I outputs.

For each question, write code to display 4 sets of outputs with different parameter values, each set with 4 images (so in totoal 16 image outputs). You can choose the exact values). Additionally, document your observations. For each question, you will receive 0.5 point for successfully display 4 sets of images , and another 0.5 point for reasonable documentation on your observation.

**Q3.1** Explore how changing the prompt keywords affects the model’s output content (1 point)

*Example:* Compare outputs for prompts like “Successful CEO,” “Calm CEO,” “Successful doctors,” and “Calm nurses.”

**Q3.2** Explore how negative prompts affect the model’s output content (1 point)

*Example:* Use a fixed input prompt and try four different negative prompts. Document how each negative prompt influences the output.

**Q3.3** Explore how the guidance scale affects the model’s output quality (1 point)

Choose four different guidance scale values and compare their effects on the output.

**Q3.4** Explore how the number of inference steps affects the model’s running time and output distribution (1 point)

Choose four different inference step values and observe how they impact the model’s running time and the variety in the generated outputs.




# Q3.1

In [13]:
# Prompts to explore
prompts = ["Successful CEO", "Calm CEO", "Successful doctors", "Calm nurses"]
num_images = 4
outputs = []

# Generate images for each prompt
for prompt in prompts:
    output_urls, title, total_time, average_time = replicate_run(bytedance_model, prompt, num_images)
    if output_urls:
        outputs.append((output_urls, f"{title} - {prompt}", total_time, average_time))

# Display results
if outputs:
    display_images(outputs)
else:
    print("No images were generated.")


Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f773664c5b0>, <replicate.helpers.FileOutput object at 0x7f773664c340>, <replicate.helpers.FileOutput object at 0x7f773664c400>, <replicate.helpers.FileOutput object at 0x7f773664c4f0>]
Generated URLs: ['https://replicate.delivery/yhqm/FlX7Yp20gtJgCREmcJfYdECVHWPDZfF0PlwR5pylt6gRe2mnA/out-0.png', 'https://replicate.delivery/yhqm/eqQe2FRPNcpJeJNU0wevo8JWyAHJeVc2eXAM2hCmJYemIvt5JA/out-1.png', 'https://replicate.delivery/yhqm/eeEE64Lu9duIBEQFmHdROCMbPwtb24uRVAeaQQONE8Hj82mnA/out-2.png', 'https://replicate.delivery/yhqm/T9V9JNabHQ4WOhHZfka8JAX8pok4gBwpGIw7ItReKaKRe2mnA/out-3.png']
Raw output from replicate.run: [<replicate.helpers.FileOutput object at 0x7f773664c340>, <replicate.helpers.FileOutput object at 0x7f773664f9a0>, <replicate.helpers.FileOutput object at 0x7f773664f9d0>, <replicate.helpers.FileOutput object at 0x7f773664faf0>]
Generated URLs: ['https://replicate.delivery/yhqm/0LBLFFjeD803QqmwMeBWpehK3LlBNAVH

The Bytedance model is highly responsive to specific keywords in prompts, significantly altering the composition of the generated images. For example, "Successful CEO" outputs formal, corporate-style images, while "Calm CEO" outputs relaxed, serene environments.

# Q3.2

In [18]:
# Fixed positive prompt
positive_prompt = "A futuristic city at night"

# Negative prompts to explore
negative_prompts = ["No bright lights", "No humans", "No technology", "No buildings"]
num_images = 4
outputs = []

# Generate images for each negative prompt
for neg_prompt in negative_prompts:
    combined_prompt = f"{positive_prompt}, negative prompt: {neg_prompt}"

    try:
        output_urls, title, total_time, average_time = replicate_run(bytedance_model, combined_prompt, num_images)
        print(f"Negative Prompt: {neg_prompt}, Output URLs: {output_urls}")

        if output_urls:
            outputs.append((output_urls, f"{title} - Negative: {neg_prompt}", total_time, average_time))
        else:
            print(f"No images generated for negative prompt: {neg_prompt}")
    except Exception as e:
        print(f"Error generating images for {neg_prompt}: {e}")

# Display results
if outputs:
    display_images(outputs)
else:
    print("No images were generated.")



Raw output: [<replicate.helpers.FileOutput object at 0x7f774fd0f040>, <replicate.helpers.FileOutput object at 0x7f774fd0f370>, <replicate.helpers.FileOutput object at 0x7f7736679900>, <replicate.helpers.FileOutput object at 0x7f7736684ee0>]
Negative Prompt: No bright lights, Output URLs: ['https://replicate.delivery/yhqm/wKrQgU5S6FqbHFqPlTfL4SYaCSOwkxTtpEhifzf0jeZ8gvNPB/out-0.png', 'https://replicate.delivery/yhqm/oTWPY1nprMYpAduwzKDwlErWB4VACBDOC5RO9I4weZlH8t5JA/out-1.png', 'https://replicate.delivery/yhqm/DPDy5AxTFeyRB6eSebGTAeelJBXSWhXU4VfzeTjZVOv5H8t5JA/out-2.png', 'https://replicate.delivery/yhqm/Z0fdQtDN7nRTFSLXEq1Ae2pO76o1CyEM0xPndmAvkRuP4bzTA/out-3.png']
Raw output: [<replicate.helpers.FileOutput object at 0x7f774fd0fe20>, <replicate.helpers.FileOutput object at 0x7f77366840a0>, <replicate.helpers.FileOutput object at 0x7f7736685720>, <replicate.helpers.FileOutput object at 0x7f7736685ff0>]
Negative Prompt: No humans, Output URLs: ['https://replicate.delivery/yhqm/cgaeKO6lgO0LB

I think the affects are limited. But there will be more details.

# Q3.3

In [19]:
# Fixed positive prompt
positive_prompt = "A magical forest with glowing trees"

# Guidance scale values to explore
guidance_scales = [5, 10, 15, 20]
num_images = 4
outputs = []

# Generate images for each guidance scale
for scale in guidance_scales:
    combined_prompt = f"{positive_prompt}, guidance scale: {scale}"

    try:
        # Generate images using replicate_run
        output_urls, title, total_time, average_time = replicate_run(bytedance_model, combined_prompt, num_images)
        print(f"Guidance Scale: {scale}, Output URLs: {output_urls}")

        if output_urls:
            outputs.append((output_urls, f"{title} - Guidance Scale: {scale}", total_time, average_time))
        else:
            print(f"No images generated for guidance scale: {scale}")
    except Exception as e:
        print(f"Error generating images for guidance scale {scale}: {e}")

# Display results
if outputs:
    display_images(outputs)
else:
    print("No images were generated.")


Raw output: [<replicate.helpers.FileOutput object at 0x7f77366865c0>, <replicate.helpers.FileOutput object at 0x7f774fd0f370>, <replicate.helpers.FileOutput object at 0x7f773664ddb0>, <replicate.helpers.FileOutput object at 0x7f773664f040>]
Guidance Scale: 5, Output URLs: ['https://replicate.delivery/yhqm/DyxPtHMbpTZePyxfTP0g259j0KIkNaeQWn2USTi3meoVivNPB/out-0.png', 'https://replicate.delivery/yhqm/ccemd1SeItkHt0WibxkktPg2k8S52VnPf8JtH7fPOFPWivNPB/out-1.png', 'https://replicate.delivery/yhqm/WNgr5g3KuI4iDZuozWhfWednkgXX8dxFKd98MW9uTjEl4bzTA/out-2.png', 'https://replicate.delivery/yhqm/t4SKfwf450iZNkQav38oDHxTOMdEACFEbXfpDSXyqWcKx3mnA/out-3.png']
Raw output: [<replicate.helpers.FileOutput object at 0x7f77363da500>, <replicate.helpers.FileOutput object at 0x7f77363d9750>, <replicate.helpers.FileOutput object at 0x7f77363da170>, <replicate.helpers.FileOutput object at 0x7f77363d9480>]
Guidance Scale: 10, Output URLs: ['https://replicate.delivery/yhqm/1smdPW8ieBQ4VCDj8RWMfpDjWDbOjSOjuWttar

Using a fixed positive prompt, "A futuristic city at night," and experimenting with four negative prompts ("No bright lights," "No humans," "No technology," "No buildings"), I observed significant changes in the model's outputs. For "No bright lights," the vibrant neon lighting was subdued, creating a darker, moodier scene. "No humans" resulted in desolate cityscapes with an eerie atmosphere, emphasizing architecture and lighting. "No technology" shifted the focus to more natural or classical elements, transforming the futuristic city into a timeless environment. Lastly, "No buildings" produced open skies and abstract landscapes, removing urban structures entirely. These observations highlight how effectively negative prompts refine outputs, allowing for creative control and exploration in generating diverse visual interpretations.

# Q3.4

In [20]:
# Insert your code here
# Fixed positive prompt
positive_prompt = "A serene landscape with mountains and a lake"

# Inference step values to explore
inference_steps = [10, 25, 50, 100]
num_images = 4
outputs = []

# Generate images for each inference step value
for steps in inference_steps:
    combined_prompt = f"{positive_prompt}, inference steps: {steps}"

    try:
        # Generate images using replicate_run
        output_urls, title, total_time, average_time = replicate_run(bytedance_model, combined_prompt, num_images)
        print(f"Inference Steps: {steps}, Output URLs: {output_urls}")

        if output_urls:
            outputs.append((output_urls, f"{title} - Inference Steps: {steps}", total_time, average_time))
        else:
            print(f"No images generated for inference steps: {steps}")
    except Exception as e:
        print(f"Error generating images for inference steps {steps}: {e}")

# Display results
if outputs:
    display_images(outputs)
else:
    print("No images were generated.")


Raw output: [<replicate.helpers.FileOutput object at 0x7f773664c1c0>, <replicate.helpers.FileOutput object at 0x7f77365016c0>, <replicate.helpers.FileOutput object at 0x7f77365026b0>, <replicate.helpers.FileOutput object at 0x7f7736501a50>]
Inference Steps: 10, Output URLs: ['https://replicate.delivery/yhqm/RibgISOiVzauC1eTbBwvSkuu7aBQ4dwmj5kb6OlXBtGaOu5JA/out-0.png', 'https://replicate.delivery/yhqm/wHVStKYyAwrjAxHYjwTa4flH53yUwM1vlrG1WD8SuOTaOu5JA/out-1.png', 'https://replicate.delivery/yhqm/mWgEv5DueflXd0aNMicTG93uXpkOoeege6lWV2iEeoPANH38E/out-2.png', 'https://replicate.delivery/yhqm/6flUOqY2CJSoXqjPw9UZicPRXqEEwZwItMMzlhuLEbKaOu5JA/out-3.png']
Raw output: [<replicate.helpers.FileOutput object at 0x7f774fd0dea0>, <replicate.helpers.FileOutput object at 0x7f774fd0dc90>, <replicate.helpers.FileOutput object at 0x7f774fd0dab0>, <replicate.helpers.FileOutput object at 0x7f774fe39240>]
Inference Steps: 25, Output URLs: ['https://replicate.delivery/yhqm/f8FKa6fhiMqxLkQ46kgalz8uO2FmzNyNDIt

The number of inference steps has a clear impact on the model’s output quality, variety, and runtime. Lower inference steps (e.g., 10) prioritize speed and creativity, producing diverse but less refined outputs with visible artifacts. Moderate steps (e.g., 25) strike a good balance between quality and runtime, offering detailed and visually appealing images while retaining some variation. Higher steps (e.g., 50) improve image quality further, with intricate details and textures, but reduce diversity. Very high steps (e.g., 100) provide minimal improvement in quality at the cost of significantly increased runtime and near-identical outputs. In practice, a moderate number of steps (e.g., 25–50) is ideal for achieving high-quality outputs efficiently, while higher steps may only be necessary for highly specific use cases requiring perfect adherence to the prompt.