# **Imagic Image-Text Embedding Notebook**

This notebook generates image-text embeddings with the Imagic framework. It supports training an embedding for a single image-text pair as well as batch creation across multiple images and text prompts. The notebook is structured to align with the objectives outlined in the thesis.

## **1. Environment Setup**
Install necessary libraries and clone the required repository.

In [None]:
!git clone https://github.com/Reouth/Movie-Character-Identification-With-Perosnalized-Generative-Models.git

%pip install -qq git+https://github.com/huggingface/diffusers.git
%pip install -q accelerate
%pip install -q bitsandbytes
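
After the install cell finishes, it can help to confirm that the packages are importable before starting a long training run. The following is a minimal sketch using only the standard library. The helper name `find_missing_packages` is hypothetical and not part of the repository; the package list is taken from the install cell, plus `torch`, which the notebook imports later.

```python
import importlib.util

# Package names from the install cell; torch is imported later in the notebook.
REQUIRED_PACKAGES = ["torch", "diffusers", "accelerate", "bitsandbytes"]


def find_missing_packages(packages):
    """Return the subset of `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing_packages(REQUIRED_PACKAGES)
if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages are importable.")
```

If anything is reported missing, rerunning the install cell (and restarting the runtime if Colab asks for it) is usually the first thing to try.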

## **2. Import Libraries**
Load necessary Python libraries and scripts.

In [None]:
import os
import torch
import gc

# Change directory to cloned repository
os.chdir('/content/Movie-Character-Identification-With-Perosnalized-Generative-Models')

from models.Diffusion import ImagicTrain
from handlers import ImageHandler

## **3. Configure Authentication**
Log in to Hugging Face to access the Stable Diffusion model.

In [None]:
from huggingface_hub import notebook_login
!git config --global credential.helper store
notebook_login()

## **4. Mount Google Drive**
Store and retrieve files from Google Drive.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## **5. Configuration**
Set model and training parameters.

*   `text_inputs` specifies the prompts used for generating embeddings.
*   `images_folder_path` defines where the input images are stored.
*   `output_path` determines where the embeddings will be saved.

In [None]:
# Configuration
MODEL_NAME = "CompVis/stable-diffusion-v1-4"
SEED = 42
RESOLUTION = 1024
EMB_LEARNING_RATE = 1e-3
LEARNING_RATE = 2e-6
EMB_TRAIN_STEPS = 2000
MAX_TRAIN_STEPS = 4000

# Path configurations
text_inputs = ["a photo of a person"]  # Add text prompts
images_folder_path = "/content/drive/MyDrive/thesis_OO_SD/ex_machina/ID_images"
output_path = "/content/drive/MyDrive/thesis_OO_SD/ex_machina/Imagic_embeddings/"
os.makedirs(output_path, exist_ok=True)
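
Before training, a quick check that `images_folder_path` exists and actually contains images can save a failed run. The sketch below is an illustration only: `list_image_files` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension list is an assumption about the input data. It does not replace `ImageHandler.upload_images`, which the batch cell uses.

```python
import os

# Assumption: these extensions cover the input images; adjust as needed.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def list_image_files(folder, extensions=IMAGE_EXTENSIONS):
    """Return sorted paths of files in `folder` whose extension matches."""
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"Image folder not found: {folder}")
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(extensions)  # case-insensitive match
    )
```

Calling `list_image_files(images_folder_path)` after mounting Drive should return a non-empty list; an empty list or a `FileNotFoundError` usually points to a wrong path or an unmounted drive.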


## **6. Image-Text Embedding**
Generate embeddings for an image and text prompt pair.

In [None]:
def train_imagic_embedding(image_path, text, output_folder, model_name=MODEL_NAME):
    """
    Trains Imagic embeddings for a single image-text pair.

    Args:
        image_path (str): Path to the input image.
        text (str): Text prompt for embedding.
        output_folder (str): Directory to save the embeddings.
        model_name (str): Pretrained model name.
    """
    if os.path.isdir(os.path.join(output_folder, "vae")):
        print(f"Embeddings already exist for {image_path} with text: {text}")
        return

    print(f"Training embeddings for {image_path} with text: {text}")
    !accelerate launch ImagicTrain.py \
        --pretrained_model_name_or_path={model_name} \
        --output_dir="{output_folder}" \
        --input_image="{image_path}" \
        --target_text="{text}" \
        --seed={SEED} \
        --resolution={RESOLUTION} \
        --mixed_precision="fp16" \
        --use_8bit_adam \
        --gradient_accumulation_steps=1 \
        --emb_learning_rate={EMB_LEARNING_RATE} \
        --learning_rate={LEARNING_RATE} \
        --emb_train_steps={EMB_TRAIN_STEPS} \
        --max_train_steps={MAX_TRAIN_STEPS} \
        --gradient_checkpointing
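
The `!accelerate launch` line only works inside IPython or Colab. As a sketch of how the same call might look in a plain Python script, the command can be built as an argument list and run with `subprocess`. The function names `build_imagic_command` and `run_imagic_training` are hypothetical; the flags and default values are copied from the cell and configuration above, and the script is assumed to be reachable as `ImagicTrain.py` from the working directory, as in the original command.

```python
import subprocess


def build_imagic_command(image_path, text, output_folder, model_name,
                         seed=42, resolution=1024,
                         emb_learning_rate=1e-3, learning_rate=2e-6,
                         emb_train_steps=2000, max_train_steps=4000):
    """Build the `accelerate launch` argument list used in the cell above."""
    return [
        "accelerate", "launch", "ImagicTrain.py",
        f"--pretrained_model_name_or_path={model_name}",
        f"--output_dir={output_folder}",
        f"--input_image={image_path}",
        f"--target_text={text}",
        f"--seed={seed}",
        f"--resolution={resolution}",
        "--mixed_precision=fp16",
        "--use_8bit_adam",
        "--gradient_accumulation_steps=1",
        f"--emb_learning_rate={emb_learning_rate}",
        f"--learning_rate={learning_rate}",
        f"--emb_train_steps={emb_train_steps}",
        f"--max_train_steps={max_train_steps}",
        "--gradient_checkpointing",
    ]


def run_imagic_training(**kwargs):
    """Run training as a subprocess; raises CalledProcessError on failure."""
    subprocess.run(build_imagic_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems when a prompt or path contains spaces, and `check=True` surfaces a failed training run as an exception instead of letting the loop continue silently.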


In [None]:
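#@title Single Image-Text Embedding

single_text = text_inputs[0]  # first prompt in text_inputs
single_image = os.path.join(images_folder_path, "Mitzi_2.jpg")  # one image from the folder

# Assumption: mirror the batch layout, <output_path>/<prompt>/<image name without extension>
single_image_name = os.path.splitext(os.path.basename(single_image))[0]
single_output_folder = os.path.join(output_path, single_text.replace(' ', '_'), single_image_name)
train_imagic_embedding(single_image, single_text, single_output_folder)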

In [None]:
#@title Batch Image-Text Embeddings
for text in text_inputs:
    text_path = text.replace(' ', '_')
    if text == "":
        text_path = "no_text_prompt"
    text_folder_path = os.path.join(output_path, text_path)
    os.makedirs(text_folder_path, exist_ok=True)

    for image_name, _, image_path in ImageHandler.upload_images(images_folder_path):
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()

        output_folder_path = os.path.join(text_folder_path, image_name)
        train_imagic_embedding(image_path, text, output_folder_path)
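
The batch cell stores each result under `<output_path>/<prompt folder>/<image name>` and treats a `vae` subfolder as the sign of a finished run. The sketch below restates that convention as small helpers and lists the pairs that still need training. All four function names are hypothetical and not part of the repository; the naming rules are taken from the batch cell and from `train_imagic_embedding`.

```python
import os


def prompt_to_folder_name(text):
    """Map a prompt to a folder name, following the batch cell's convention."""
    return text.replace(" ", "_") if text else "no_text_prompt"


def embedding_output_dir(output_path, text, image_name):
    """Return <output_path>/<prompt folder>/<image_name>."""
    return os.path.join(output_path, prompt_to_folder_name(text), image_name)


def is_trained(output_folder):
    """Mirror the notebook's check: a `vae` subfolder marks a finished run."""
    return os.path.isdir(os.path.join(output_folder, "vae"))


def pending_pairs(output_path, texts, image_names):
    """List (text, image_name) pairs that have no saved embedding yet."""
    return [
        (text, name)
        for text in texts
        for name in image_names
        if not is_trained(embedding_output_dir(output_path, text, name))
    ]
```

Running `pending_pairs(output_path, text_inputs, image_names)` with the image names from the input folder gives a rough estimate of how much work remains after an interrupted Colab session.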