# Install Required Libraries
In this section, we install the required libraries for Generative AI, focusing on image generation using Stable Diffusion.


Install the required external libraries with pip:

In [None]:
!pip install diffusers transformers accelerate

# Here we install the required libraries with pip, i.e. Python's standard package manager.
# The '!' prefix tells the notebook to run the line as a shell command in Colab's runtime.
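
After the install finishes, it can be useful to confirm that the packages are actually importable before moving on. The following is a minimal sketch using only the standard library; the helper name `check_packages` is hypothetical and not part of any of the installed libraries.

```python
import importlib.util

# Packages installed by the pip command above (the import names match the pip names here).
REQUIRED = ["diffusers", "transformers", "accelerate"]


def check_packages(names):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_packages(REQUIRED)
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'MISSING'}")
```

If any package is reported as missing, rerunning the install cell (and restarting the runtime if needed) is usually the first thing to try.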

# Introduction to Generative AI
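Generative AI involves creating new content, such as images, text, or audio, using AI models. Two terms come up often in image generation, both originating from GANs (Generative Adversarial Networks):
- **Generator**: Creates new images based on input (e.g., random noise or a text prompt).
- **Discriminator**: Judges whether an image looks real or generated. It is used when training GANs; diffusion models such as Stable Diffusion do not have one.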

In this notebook, we'll use a pre-trained Stable Diffusion model to generate images from text prompts.


# Import Libraries
Import the necessary libraries for using the Stable Diffusion model.


Importing the necessary packages

In [None]:
# The libraries installed earlier are now available in our runtime environment.
# Next, we import the specific packages that we'll use in this program.

# Import the StableDiffusionPipeline class from the diffusers library,
# which allows us to generate images from text prompts using a pre-trained model.
from diffusers import StableDiffusionPipeline

# Import the PyTorch library, which provides tools for deep learning and tensor computations.
import torch

# Import the Image module from Pillow (imported as PIL, the maintained fork of the Python Imaging Library),
# which helps us to work with images (opening, saving, and manipulating images).
from PIL import Image


# Load a Pre-trained Model
Here we use the `diffusers` library to load a pre-trained Stable Diffusion model from the Hugging Face Hub. This model generates images based on a given text prompt.


In [None]:
# Load the Stable Diffusion model pipeline from the pretrained model hosted on the Hugging Face Hub.
# "runwayml/stable-diffusion-v1-5" is the specific version of the model we want to use.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Move the model to the GPU (CUDA) if available for faster processing;
# otherwise, use the CPU. This helps the program run more efficiently depending on the hardware.
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

# Print a message to confirm the model has been successfully loaded and is ready to use.
print("Model loaded successfully!")


# Generate an Image
Provide a text prompt to the Stable Diffusion model to generate an image.


In [None]:
# Ask the user to enter a text description (called a "prompt") that they want to turn into an image.
prompt = input("Enter prompt : ")

# Use the Stable Diffusion pipeline to generate an image from the user's text prompt.
# The pipeline returns an output object whose `.images` attribute is a list of PIL images; we take the first one ([0]).
image = pipe(prompt).images[0]

# Display the generated image inline in the notebook.
# (image.show() opens an external image viewer, which is not available in a hosted runtime such as Colab.)
from IPython.display import display
display(image)


Enter prompt : A couple is walking in a snow valley in sunsine weather


  0%|          | 0/50 [00:00<?, ?it/s]

# Save the Generated Image
Save the generated image to the local directory for future use.


In [None]:
# Save the generated image to a file named "generated_image7.png" in the current working directory
# (in Colab, this is the runtime's file system, not your own computer).
image.save("generated_image7.png")

# Print a message to let the user know that the image has been saved successfully.
print("Image saved as 'generated_image7.png'")


Image saved as 'generated_image7.png'
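
The file name above carries a hand-picked number, so rerunning the cell overwrites the previous image. One possible way to avoid that is to pick the next unused number automatically. This is a sketch only; `next_free_path` is a hypothetical helper, and the demo uses a temporary directory with empty placeholder files instead of real images.

```python
import tempfile
from pathlib import Path


def next_free_path(directory=".", stem="generated_image", suffix=".png"):
    """Return the first path of the form <stem><n><suffix> that does not exist yet."""
    directory = Path(directory)
    n = 1
    while (directory / f"{stem}{n}{suffix}").exists():
        n += 1
    return directory / f"{stem}{n}{suffix}"


# Demo in a throwaway directory: the second call skips the name taken by the first.
with tempfile.TemporaryDirectory() as tmp:
    first = next_free_path(tmp)
    first.touch()  # stands in for image.save(first)
    second = next_free_path(tmp)
    print(first.name, second.name)
```

In the notebook, the save step would then read `image.save(next_free_path())`.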


# Generator and Discriminator Concepts
- **Generator**: Here, the Stable Diffusion pipeline plays the generator role: it turns a text prompt into an image by starting from random noise and denoising it step by step.
- **Discriminator**: Not part of Stable Diffusion. In GANs (Generative Adversarial Networks), the discriminator is trained alongside the generator to tell real images from generated ones.

This distinction is important in understanding how Generative AI works:
1. The generator creates content.
2. In a GAN, the discriminator evaluates that content, and its feedback is used to train the generator.
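
As a rough intuition for how a diffusion model generates content without a discriminator, the toy sketch below starts from random noise and removes a share of the remaining noise at each of 50 steps (the step count shown in the progress bar earlier). This is not how Stable Diffusion is implemented: a real model uses a neural network to predict the noise from the noisy image and the prompt, whereas this sketch cheats by already knowing the target. The function name `toy_denoise` and the three-number "image" are assumptions made purely for illustration.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Start from Gaussian noise and move toward `target` over `steps` refinements."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise, same length as the target
    for t in range(steps):
        # A real diffusion model PREDICTS this noise; here we compute it from the known target.
        noise = [xi - ti for xi, ti in zip(x, target)]
        # Remove an equal share of the remaining noise; the final step removes what is left.
        x = [xi - ni / (steps - t) for xi, ni in zip(x, noise)]
    return x


target = [0.2, 0.5, 0.9]  # stand-in for the pixel values of an "image"
result = toy_denoise(target)
print([round(v, 3) for v in result])
```

The point of the sketch is the loop structure: many small corrections applied to noise, rather than a single generator pass judged by a discriminator.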


# Install Libraries for Discriminator
We need `torchvision` to use a pre-trained ResNet-50 model for evaluating or classifying the generated image.


In [None]:
# This command installs the 'torchvision' library directly within the notebook environment.
# 'torchvision' provides additional tools and datasets for computer vision tasks in PyTorch.
# The exclamation mark (!) lets us run shell commands like pip install from inside the notebook.
!pip install torchvision


# Load a Pre-trained Discriminator
Here, we load a ResNet-50 model pre-trained on ImageNet. It is an image classifier rather than a true GAN discriminator, but in this notebook it stands in for one by labeling the generated image.


In [None]:
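# Import tools to transform images and load pre-trained models from torchvision
import torchvision.transforms as transforms
from torchvision.models import resnet50, ResNet50_Weights
import torch.nn as nn

# Load the ResNet-50 model pre-trained on the ImageNet dataset.
# The `weights` argument replaces the deprecated `pretrained=True` (torchvision 0.13 and later);
# IMAGENET1K_V1 is the same set of weights that `pretrained=True` used to load.
# This model will be used as a stand-in "discriminator" to evaluate images
discriminator = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)

# Set the model to evaluation mode, so layers such as batch normalization use their stored statistics instead of updating them
discriminator.eval()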

# Define a series of transformations to prepare input images for the ResNet-50 model:
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize input image to 224x224 pixels, which is the expected size for ResNet-50
    transforms.ToTensor(),          # Convert the image from PIL format to a PyTorch tensor (numeric array)
    transforms.Normalize(           # Normalize the tensor using mean and standard deviation values from ImageNet dataset
        mean=[0.485, 0.456, 0.406],  # These values center the color channels for better model performance
        std=[0.229, 0.224, 0.225]
    )
])

# Print confirmation that the ResNet-50 discriminator is ready for use
print("Discriminator (ResNet-50) loaded successfully!")
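
To make the preprocessing concrete, the sketch below reproduces the arithmetic of the `ToTensor` and `Normalize` steps for a single RGB pixel in plain Python: each channel is first scaled from the 0-255 range to 0-1, then the channel mean is subtracted and the result is divided by the channel standard deviation. The helper `normalize_pixel` is hypothetical and exists only for illustration; torchvision applies the same formula to whole tensors.

```python
# ImageNet channel statistics, as used in the transform above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply the ToTensor scaling and the Normalize step to one (R, G, B) pixel."""
    scaled = [channel / 255 for channel in rgb]  # ToTensor: 0-255 -> 0.0-1.0
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]  # Normalize: (x - mean) / std


mid_gray = normalize_pixel((128, 128, 128))
print([round(v, 3) for v in mid_gray])  # [0.074, 0.205, 0.426]
```

A mid-gray pixel ends up close to zero in every channel, which is the purpose of the normalization: inputs are centered around the values the model saw during training.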


# Evaluate the Generated Image
We use the pre-trained ResNet-50 to classify the generated image and output the top predictions. This only loosely mimics a discriminator: the classifier reports what it thinks the image shows, not whether the image looks real.


In [None]:
# Convert the generated PIL image into a tensor that the ResNet model can process.
# The 'transform' function applies resizing, tensor conversion, and normalization.
# unsqueeze(0) adds a batch dimension since the model expects input in batches.
input_image = transform(image).unsqueeze(0)  # Shape: [1, 3, 224, 224]

# Pass the processed image through the ResNet-50 model (discriminator).
# 'torch.no_grad()' disables gradient calculations since we're only doing inference (not training),
# which saves memory and speeds up computation.
with torch.no_grad():
    output = discriminator(input_image)

# Download the file containing human-readable labels for ImageNet classes.
# The file is a JSON list in which the position of each name is its numeric class ID.
!wget -qO imagenet_classes.json https://raw.githubusercontent.com/anishathalye/imagenet-simple-labels/master/imagenet-simple-labels.json

import json
# Load the ImageNet class labels from the downloaded JSON file.
with open("imagenet_classes.json", "r") as f:
    labels = json.load(f)

# Apply softmax to convert raw model outputs (logits) into probabilities.
probabilities = torch.nn.functional.softmax(output[0], dim=0)

# Find the top 5 categories with the highest probabilities.
top5_prob, top5_catid = torch.topk(probabilities, 5)

# Print the top 5 predicted labels along with their confidence scores.
print("Top 5 Predictions for the Generated Image:")
for prob, catid in zip(top5_prob, top5_catid):
    print(f"{labels[catid.item()]}: {prob.item():.4f}")
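
For readers who want to see what the softmax and top-k steps compute, here is a small plain-Python sketch on three made-up logits. The logit values and the three label names are hypothetical; the real model outputs 1000 logits, one per ImageNet class.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, names, k):
    """Return the k (name, probability) pairs with the highest probability."""
    ranked = sorted(zip(probs, names), reverse=True)
    return [(name, p) for p, name in ranked[:k]]


toy_logits = [2.0, 1.0, 0.1]               # hypothetical raw model outputs
toy_labels = ["alp", "ski", "snowmobile"]  # hypothetical class names

toy_probs = softmax(toy_logits)
for name, p in top_k(toy_probs, toy_labels, k=2):
    print(f"{name}: {p:.4f}")
```

Note that softmax always produces a probability distribution over the known classes, so a high score only means "most similar among the ImageNet categories", not that the generated image is good or realistic.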
