<a href="https://colab.research.google.com/github/Kanikaa1010/Stable-diffusion-portfolio/blob/main/notebooks/SD_V3_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>




# SD_V3.1.1

In [None]:
#@title PIP Installs

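# Pinned versions this notebook was written against
!pip install -q huggingface_hub==0.16.4 diffusers==0.21.4 transformers==4.32.1 accelerate==0.21.0 torch==2.0.1
# NOTE: the --upgrade install below replaces the pinned versions above with the latest releases.
# Keep only one of the two commands: the pinned one for reproducibility, or this one for current packages.
!pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu118 diffusers transformers accelerate numpy --upgrade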


Installs AI Tools: This cell installs the Python libraries the notebook depends on: diffusers and transformers for loading and running the model, plus accelerate, huggingface_hub, and torch.
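Because the install cell mixes pinned and upgraded packages, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the standard library; the `PINNED` dictionary and the `check_versions` helper are hypothetical names introduced here, with the version numbers copied from the pinned install command.

```python
from importlib import metadata

# Versions copied from the pinned install command (hypothetical helper data)
PINNED = {
    "huggingface_hub": "0.16.4",
    "diffusers": "0.21.4",
    "transformers": "4.32.1",
    "accelerate": "0.21.0",
    "torch": "2.0.1",
}

def check_versions(pinned):
    """Map each package to (installed version or None, wanted version, match flag)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop local build tags such as "+cu118" before comparing
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, matches)
    return report

for name, (installed, wanted, matches) in check_versions(PINNED).items():
    status = "ok" if matches else "differs"
    print(f"{name}: installed={installed}, pinned={wanted} ({status})")
```

A package reported as "differs" is not necessarily a problem; it only indicates that the upgrade command, rather than the pinned one, determined the installed version.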

In [None]:
#@title   Import libraries and define basic functions
import gc
import os
import time
from datetime import datetime

import matplotlib.pyplot as plt
import nltk
import numpy as np
import pandas as pd
import torch
from diffusers import StableDiffusionPipeline
from PIL import Image

# Download necessary NLTK data
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

# Memory optimization function
def optimize_memory():
    """Free up GPU memory"""
    # Collect unreachable Python objects first so their tensors are released,
    # then return the cached, now-unused blocks to the GPU driver.
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    print("Memory optimized")

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.



[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.


* **Sets up for AI Tasks**: The code imports the libraries used throughout the notebook, including torch (tensor computation on GPU or CPU), PIL (image handling), and nltk (text processing).
* **Downloads Data**: It downloads the punkt tokenizer and the averaged_perceptron_tagger data that nltk needs for text analysis.
* **Memory Management**: It defines optimize_memory, which runs Python garbage collection and releases PyTorch's cached GPU memory. This lowers the risk of out-of-memory errors when working with large AI models.
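To see whether clearing the cache actually frees anything, the memory counters can be read before and after. This is a minimal sketch, assuming PyTorch's `torch.cuda.memory_allocated` and `torch.cuda.memory_reserved` counters; the helper names `memory_report` and `optimize_and_report` are hypothetical, and the guarded import only exists so the sketch also runs where torch or a GPU is missing.

```python
import gc

try:
    import torch
except ImportError:  # allows the sketch to run where torch is not installed
    torch = None

def memory_report():
    """Return allocated and reserved GPU memory in MiB, or None without CUDA."""
    if torch is None or not torch.cuda.is_available():
        return None
    mib = 1024 ** 2
    return {
        "allocated_mib": torch.cuda.memory_allocated() / mib,
        "reserved_mib": torch.cuda.memory_reserved() / mib,
    }

def optimize_and_report():
    """Free memory and return the (before, after) reports."""
    before = memory_report()
    gc.collect()                  # drop unreachable Python objects first
    if before is not None:
        torch.cuda.empty_cache()  # then release cached, unused GPU blocks
    return before, memory_report()

before, after = optimize_and_report()
print("before:", before)
print("after: ", after)
```

Emptying the cache lowers reserved memory only. Memory held by live tensors, such as the loaded pipeline, stays allocated until those objects are deleted.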



In [None]:
#@title Initialize Stable Diffusion with memory optimizations
def initialize_stable_diffusion():
    # Create directories
    project_dir = "./sd_project"
    images_dir = os.path.join(project_dir, "images")
    os.makedirs(images_dir, exist_ok=True)

    # Determine device
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}")

    # Load the model with memory optimization
    model_id = "runwayml/stable-diffusion-v1-5"
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16 if device == "cuda" else torch.float32,
        # `variant` selects the fp16 weight files; `revision="fp16"` is deprecated in diffusers
        variant="fp16" if device == "cuda" else None,
        use_safetensors=True,
    )

    # Apply memory optimizations
    if device == "cuda":
        pipe.enable_attention_slicing(1)  # Slice attention to reduce memory

    # Move to device
    pipe = pipe.to(device)

    print("Model loaded successfully")
    return pipe, images_dir

# Initialize model
pipe, images_dir = initialize_stable_diffusion()

🧠 Loads AI Image Generator: Sets up the Stable Diffusion model to generate images from text prompts using pre-trained weights.

⚙️ Device Selection: Automatically chooses GPU (if available) for faster processing, otherwise falls back to CPU.

💾 Memory Optimization: Uses attention slicing, a technique that computes attention in several small chunks instead of all at once (`enable_attention_slicing(1)` sets the slice size to 1), saving GPU memory at the cost of a bit of speed. This helps when running large models on limited hardware.

🚀 Ready to Use: The function returns a ready pipeline and a folder path to save generated images.
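The idea behind attention slicing can be illustrated outside the pipeline. The sketch below is a conceptual NumPy example, not the diffusers implementation: the array shapes and the functions `attention` and `sliced_attention` are assumptions made for illustration. It shows that processing one head at a time gives the same result while only one slice's score matrix exists at any moment.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention over all heads at once.

    The scores array has shape (heads, tokens, tokens), which dominates memory.
    """
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def sliced_attention(q, k, v, slice_size=1):
    """Same computation, but only `slice_size` heads are processed at a time."""
    out = np.empty((q.shape[0], q.shape[1], v.shape[2]), dtype=q.dtype)
    for start in range(0, q.shape[0], slice_size):
        end = start + slice_size
        out[start:end] = attention(q[start:end], k[start:end], v[start:end])
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 64, 40)) for _ in range(3))
full = attention(q, k, v)
sliced = sliced_attention(q, k, v, slice_size=1)
print("results match:", np.allclose(full, sliced))
```

With 8 heads and a slice size of 1, the temporary score matrix is one eighth the size of the unsliced one, while the loop adds overhead. That is the memory-for-speed trade described above.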

In [None]:
#@title  Create a simplified prompt enhancer
def enhance_prompt(prompt, quality_level="high"):
    """Enhance a prompt with quality descriptors"""
    quality_descriptors = {
        "low": "good quality",
        "medium": "high quality, detailed",
        "high": "high quality, detailed, sharp focus, 8k",
        "ultra": "masterpiece, best quality, ultra detailed, 8k HDR, sharp focus"
    }

    # Get quality descriptor
    quality = quality_descriptors.get(quality_level, quality_descriptors["high"])

    # Enhance prompt
    enhanced = f"{prompt}, {quality}"
    return enhanced

# Create a simplified negative prompt generator
def create_negative_prompt():
    """Create a standard negative prompt"""
    return ("deformed, distorted, disfigured, poorly drawn, bad anatomy, wrong anatomy, "
            "extra limb, missing limb, floating limbs, disconnected limbs, mutation, mutated, "
            "ugly, disgusting, blurry, low quality")

* **Improves Prompts:** The code defines functions to enhance the text prompts given to the AI model.
* **Quality Levels:** The enhance_prompt function adds quality descriptors (like "high quality" or "8k") based on the chosen level.
* **Negative Prompts:** The create_negative_prompt function generates a standard "negative prompt" (things to avoid in the image, like deformities) to improve image quality.
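The behaviour of the quality levels, including the fallback for an unrecognised level, can be checked without loading the model. The sketch below repeats a compact copy of `enhance_prompt` so it runs on its own; the module-level name `QUALITY_DESCRIPTORS` and the level `"unknown"` are introduced here for illustration only.

```python
# Compact copy of the notebook's descriptors so the sketch is self-contained
QUALITY_DESCRIPTORS = {
    "low": "good quality",
    "medium": "high quality, detailed",
    "high": "high quality, detailed, sharp focus, 8k",
    "ultra": "masterpiece, best quality, ultra detailed, 8k HDR, sharp focus",
}

def enhance_prompt(prompt, quality_level="high"):
    """Append quality descriptors; unknown levels fall back to "high"."""
    quality = QUALITY_DESCRIPTORS.get(quality_level, QUALITY_DESCRIPTORS["high"])
    return f"{prompt}, {quality}"

for level in ("low", "medium", "ultra", "unknown"):
    print(f"{level:>7}: {enhance_prompt('A small cottage in the forest', level)}")
```

Note that a misspelled level is silently treated as "high" rather than raising an error, which is convenient in a demo but can hide typos.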

In [None]:
#@title Basic image generation function
def generate_image(pipe, prompt, negative_prompt=None, steps=30, guidance=7.5, height=512, width=512):
    """Generate an image with Stable Diffusion"""

    # Use default negative prompt if none provided
    if negative_prompt is None:
        negative_prompt = create_negative_prompt()

    # Optimize memory before generation
    optimize_memory()

    # Generate image
    start_time = time.time()
    print(f"Generating image with {steps} steps...")

    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=steps,
        guidance_scale=guidance,
        height=height,
        width=width
    ).images[0]

    end_time = time.time()
    print(f"Image generated in {end_time - start_time:.2f} seconds")

    return image

# Save image function
def save_image(image, images_dir, filename=None, prompt=None):
    """Save an image with optional metadata"""
    if filename is None:
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"image_{timestamp}.png"

    save_path = os.path.join(images_dir, filename)
    image.save(save_path)

    # Save prompt as text file if provided
    if prompt:
        text_path = save_path.replace('.png', '_prompt.txt')
        with open(text_path, 'w') as f:
            f.write(prompt)

    print(f"Image saved to {save_path}")
    return save_path

* **Generates Images:** The code defines a generate_image function that takes a text prompt and uses the Stable Diffusion model to create an image.
* **Customization:** You can control generation through the number of inference steps (`steps`), the image size (`height`, `width`), and the guidance scale (`guidance`), which sets how closely the image should follow the prompt. A negative prompt specifies what to avoid in the image.
* **Memory Optimization:** It calls optimize_memory before generating each image.
* **Saves Images:** The save_image function saves the generated images to a specified directory, optionally including the original prompt in a text file.
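Saving only the prompt loses the other settings that produced an image. As one possible extension, the sketch below writes all generation parameters to a JSON file next to the image. The function `save_metadata` and the `_meta.json` naming are hypothetical and not part of the notebook; the path is derived with `pathlib` so that only the file name, not a directory that happens to contain ".png", is altered.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

def save_metadata(image_path, prompt, negative_prompt=None, **params):
    """Write generation settings to `<image stem>_meta.json` beside the image."""
    image_path = Path(image_path)
    meta_path = image_path.with_name(f"{image_path.stem}_meta.json")
    record = {
        "image": image_path.name,
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "saved_at": datetime.now().isoformat(timespec="seconds"),
        **params,  # e.g. steps, guidance, height, width
    }
    meta_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return meta_path

# Usage with a temporary directory standing in for images_dir
with tempfile.TemporaryDirectory() as tmp:
    meta = save_metadata(
        Path(tmp) / "image_test.png",
        "A small cottage in the forest, high quality, detailed",
        steps=20, guidance=7.5, height=384, width=384,
    )
    print(meta.name)
    print(json.loads(meta.read_text(encoding="utf-8"))["steps"])
```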

In [None]:
#@title Test single image generation
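# Import display up front so the fallback in the except block can also use it
from IPython.display import display

try: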
    # Use a simple prompt
    prompt = "A small cottage in the forest"
    print(f"Original prompt: {prompt}")

    # Enhance prompt
    enhanced_prompt = enhance_prompt(prompt, "medium")  # Medium for less complexity
    print(f"Enhanced prompt: {enhanced_prompt}")

    # Generate negative prompt
    negative_prompt = create_negative_prompt()
    print("Using standard negative prompt")

    # Generate image (with small size for testing)
    image = generate_image(
        pipe=pipe,
        prompt=enhanced_prompt,
        negative_prompt=negative_prompt,
        steps=20,  # Lower steps for quicker generation
        height=384,  # Smaller size to conserve memory
        width=384
    )

    # Save and display the image
    save_image(image, images_dir, prompt=enhanced_prompt)
    from IPython.display import display
    display(image)

except Exception as e:
    print(f"Error in image generation: {e}")

    # If it fails, try with absolute minimal settings
    try:
        print("Attempting with minimal settings...")
        optimize_memory()

        minimal_image = pipe(
            prompt="Forest cottage",
            num_inference_steps=15,
            height=256,
            width=256
        ).images[0]

        save_image(minimal_image, images_dir, filename="minimal_test.png")
        display(minimal_image)

    except Exception as e2:
        print(f"Even minimal settings failed: {e2}")
        print("Your Colab environment may not have enough GPU resources for Stable Diffusion")

Original prompt: A small cottage in the forest
Enhanced prompt: A small cottage in the forest, high quality, detailed
Using standard negative prompt
Error in image generation: name 'pipe' is not defined
Attempting with minimal settings...
Memory optimized
Even minimal settings failed: name 'pipe' is not defined
Your Colab environment may not have enough GPU resources for Stable Diffusion


* **Tests Image Generation:** This code block attempts to generate and display a single image using the defined functions.
* **Prompt Enhancement:** It uses the enhance_prompt function to improve the initial prompt.
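* **Error Handling:** If the first attempt raises any exception, the cell retries with much simpler settings (a shorter prompt, 15 steps, 256×256) to check that the core pipeline works. Because every exception type is caught, the closing message about GPU resources can be misleading: in the output above both attempts failed with `name 'pipe' is not defined`, which means the initialization cell had not been run.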
* **Image Display:** It saves the generated image and displays it within the environment (like a Jupyter Notebook or Google Colab).
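One way to make the fallback messages more accurate is to inspect the exception before suggesting a cause. The sketch below is a hypothetical helper, `diagnose_failure`, which is not part of the notebook. It assumes that CUDA out-of-memory errors carry the phrase "out of memory" in their message, which is how PyTorch reports them.

```python
def diagnose_failure(exc):
    """Return a short hint based on the type and message of an exception."""
    if isinstance(exc, NameError):
        return ("A required name is undefined; run the earlier cells "
                "(for example the one that creates `pipe`) first.")
    if isinstance(exc, MemoryError) or "out of memory" in str(exc).lower():
        return ("Out of memory; try a smaller height/width, fewer steps, "
                "or restart the runtime.")
    return f"Unexpected {type(exc).__name__}: {exc}"

# Usage: reproduce the failure seen in the output above
try:
    pipe  # noqa: F821 - intentionally undefined in this standalone sketch
except Exception as e:
    print(diagnose_failure(e))
```

Calling such a helper inside the `except` blocks would report a missing `pipe` as a setup problem instead of attributing it to GPU resources.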

In [None]:
#@title  Simple parameter comparison
# Only run this if the single-image test above succeeded

try:
    print("\nCreating simple parameter comparison...")
    optimize_memory()

    # Use a very simple prompt
    simple_prompt = "Mountain landscape"

    # Test different step counts (keeping everything else minimal)
    steps_to_test = [15, 30]
    images = []

    for steps in steps_to_test:
        print(f"Generating with {steps} steps...")
        img = pipe(
            prompt=simple_prompt,
            negative_prompt=create_negative_prompt(),
            num_inference_steps=steps,
            height=384,
            width=384
        ).images[0]

        images.append(img)
        save_image(img, images_dir, filename=f"steps_{steps}.png")

    # Display comparison
    fig, axes = plt.subplots(1, len(images), figsize=(12, 6))

    for i, (img, steps) in enumerate(zip(images, steps_to_test)):
        axes[i].imshow(img)
        axes[i].set_title(f"Steps: {steps}")
        axes[i].axis('off')

    plt.tight_layout()
    plt.savefig(os.path.join(images_dir, "steps_comparison.png"))
    plt.show()

except Exception as e:
    print(f"Parameter comparison failed: {e}")
    print("Try running with even smaller settings or on CPU")

* **Compares Image Quality:** This code tests how different settings (specifically, the number of steps) affect the quality of the generated images.
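* **Iterative Generation:** It generates one image per entry in `steps_to_test` (15 and 30) from the same prompt, varying only `num_inference_steps`, the number of denoising iterations. More steps usually means more refined detail at the cost of longer generation time.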
* **Visual Comparison:** It then displays these generated images side-by-side using Matplotlib, making it easy to see the impact of changing the "steps" parameter.
* **Error Handling:** Includes a try...except block to handle potential issues during image generation and provides suggestions for troubleshooting.
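The same comparison can be extended to more than one parameter by building the list of settings up front. The sketch below is a minimal example using `itertools.product`; the helper `build_grid`, the guidance values, and the file-name pattern are assumptions for illustration, and nothing is generated until each entry is passed to the pipeline.

```python
from itertools import product

def build_grid(steps_values, guidance_values):
    """Return one settings dict per (steps, guidance) combination."""
    return [
        {
            "steps": steps,
            "guidance": guidance,
            "filename": f"steps_{steps}_cfg_{guidance}.png",
        }
        for steps, guidance in product(steps_values, guidance_values)
    ]

grid = build_grid([15, 30], [5.0, 7.5])
for config in grid:
    print(config)
```

Each entry could then be passed to the notebook's `generate_image(pipe, prompt, steps=config["steps"], guidance=config["guidance"])`. For a fair comparison it also helps to fix the random seed, since the pipeline call accepts a `generator` argument; otherwise each image starts from different noise.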

In [None]:
#@title  Test prompt variations
try:
    print("\nTesting prompt variations...")

    # Base prompt
    base_prompt = "A forest"

    # Create variations manually (to avoid NLTK complexity)
    variations = [
        f"{base_prompt}, photorealistic, high quality",
        f"{base_prompt}, digital art style, vibrant colors",
        f"{base_prompt}, oil painting, masterpiece"
    ]

    # Display the variations
    for i, var in enumerate(variations):
        print(f"Variation {i+1}: {var}")

    # Generate just the first variation to demonstrate
    print("Generating image for variation 1...")
    optimize_memory()

    var_image = pipe(
        prompt=variations[0],
        negative_prompt=create_negative_prompt(),
        num_inference_steps=20,
        height=384,
        width=384
    ).images[0]

    save_image(var_image, images_dir, filename="prompt_variation.png", prompt=variations[0])
    display(var_image)

except Exception as e:
    print(f"Prompt variation test failed: {e}")

* **Explores Different Prompts:** This code tests how different text prompts affect the generated images.
* **Prompt Variations:** It creates several variations of a base prompt (e.g., "A forest") by adding different artistic styles or descriptive words.
* **Demonstration:** The code then generates and displays an image using the first prompt variation to demonstrate the effect of the changes.
* **Error Handling:** Includes a try...except block to catch any potential errors during the image generation process.
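The variations above are written out by hand; they can also be built from a list of style suffixes. The sketch below is a small illustration with a hypothetical helper, `build_variations`, that skips blank and duplicate styles. The style strings are the ones used in the cell.

```python
def build_variations(base_prompt, styles):
    """Combine a base prompt with each distinct, non-empty style suffix."""
    variations = []
    seen = set()
    for style in styles:
        style = style.strip()
        if not style or style in seen:
            continue  # skip blanks and repeats
        seen.add(style)
        variations.append(f"{base_prompt}, {style}")
    return variations

styles = [
    "photorealistic, high quality",
    "digital art style, vibrant colors",
    "oil painting, masterpiece",
]
for i, variation in enumerate(build_variations("A forest", styles), start=1):
    print(f"Variation {i}: {variation}")
```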