# DiffyFace - Inference Notebook

- Load pretrained LoRA weights
- Generate face images from text prompts
- Control random seed for reproducibility
- Display generated images
- Save images to organized folders

## Setup
Run the cells below to set up the environment and load the model.


## Step 1: Install Dependencies

Install required packages for inference.


In [None]:
# Install required packages
%pip install -q diffusers==0.27.2 transformers==4.40.1 accelerate==0.29.3 huggingface-hub==0.22.2 peft==0.10.0 safetensors torch torchvision torchaudio pillow
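
After installation it can help to confirm which versions are actually active, since the pinned versions may differ from what the runtime already provides. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is an assumption made for this example and is not part of DiffyFace.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Package names taken from the install cell above.
report = installed_versions(
    ["diffusers", "transformers", "accelerate", "huggingface-hub", "peft"]
)
for name, ver in report.items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If a version differs from the pinned one, restarting the runtime after the install cell is usually the first thing to try.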


In [None]:
!git clone https://github.com/rigvedrs/DiffyFace.git

## Step 2: Import Libraries and Setup Paths


In [None]:
import os
import sys
from pathlib import Path
import torch
from PIL import Image
from IPython.display import display
import json
from datetime import datetime

# Setup paths - Works for both Colab and local environments
def find_project_root():
    """Find the project root directory."""
    # Check if we're in Google Colab
    try:
        import google.colab
        IN_COLAB = True
    except ImportError:
        IN_COLAB = False
    
    if IN_COLAB:
        # In Colab, check common locations after git clone
        possible_paths = [
            Path('/content/DiffyFace'),  
            Path('/content/drive/MyDrive/DiffyFace'),  # If cloned to drive
            Path.cwd(),  # Current directory
        ]
        
        # Check which path contains the project structure
        for path in possible_paths:
            if (path / "Generation" / "inference.ipynb").exists() or \
               (path / "checkpoints").exists() or \
               (path / "README.md").exists():
                return path
        
        # If not found, use current directory
        return Path.cwd()
    else:
        # Local environment - use current directory or find project root
        current = Path.cwd()
        # Walk up to find project root (has Generation folder)
        for parent in [current] + list(current.parents):
            if (parent / "Generation").exists() and (parent / "Training").exists():
                return parent
        return current

# Find and set project root
PROJECT_ROOT = find_project_root()
sys.path.insert(0, str(PROJECT_ROOT))

# Create output directory for images (Images/colab)
OUTPUT_DIR = PROJECT_ROOT / "Images" / "colab"
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

print(f"✓ Project root: {PROJECT_ROOT}")
print(f"✓ Output directory: {OUTPUT_DIR}")
print(f"✓ PyTorch version: {torch.__version__}")
print(f"✓ CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"✓ GPU: {torch.cuda.get_device_name(0)}")
    
# Verify project structure
if (PROJECT_ROOT / "Generation").exists():
    print("✓ Generation folder found")
if (PROJECT_ROOT / "checkpoints").exists():
    print("✓ Checkpoints folder found")


In [None]:
from huggingface_hub import hf_hub_download

# Check if checkpoints already exist
checkpoint_dir = PROJECT_ROOT / "checkpoints" / "lora30k"
checkpoint_file = checkpoint_dir / "pytorch_lora_weights.safetensors"
USE_LOCAL_MODEL = False  # Set to True if you have the model locally

if checkpoint_file.exists():
    print(f"✓ Checkpoints already exist at: {checkpoint_file}")
else:
    print("Downloading DiffyFace LoRA weights from Hugging Face...")
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    
    try:
        hf_hub_download(
            repo_id="rigvedrs/DiffyFace",
            filename="pytorch_lora_weights.safetensors",
            local_dir=str(checkpoint_dir),
            local_dir_use_symlinks=False
        )
        print(f"✓ Successfully downloaded checkpoints to: {checkpoint_dir}")
    except Exception as e:
        print(f"✗ Error downloading checkpoints: {e}")
        print("Please download manually or check your internet connection")


**⚠️ Important:** Make sure the checkpoint download cell above completed successfully before proceeding to load the model.

The checkpoint file should be at: `checkpoints/lora30k/pytorch_lora_weights.safetensors`
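
A quick way to confirm the file is in place is to check its path and size before loading the model. This is a sketch only; `checkpoint_status` is a hypothetical helper, and the relative path assumes the notebook runs from the project root.

```python
from pathlib import Path


def checkpoint_status(path):
    """Return a short human-readable status line for a checkpoint file."""
    path = Path(path)
    if not path.is_file():
        return f"missing: {path}"
    size_mb = path.stat().st_size / (1024 * 1024)
    return f"found: {path} ({size_mb:.1f} MB)"


# Relative path from the note above; adjust it if your project root differs.
print(checkpoint_status("checkpoints/lora30k/pytorch_lora_weights.safetensors"))
```

A file that is reported as found but is only a few kilobytes in size usually points to an interrupted download.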

## Step 3: Load the Model

Load the Stable Diffusion 2.1 model with LoRA weights. **Requires CUDA/GPU.**


In [None]:
from diffusers import StableDiffusionPipeline
import torch

# Check CUDA availability
if not torch.cuda.is_available():
    raise RuntimeError("CUDA not available! DiffyFace requires a GPU. Please use Google Colab with GPU or a machine with CUDA support.")

device = "cuda"
print(f"✓ Using device: {device}")

# Setup paths
checkpoint_path = PROJECT_ROOT / "checkpoints" / "lora30k"
weight_name = "pytorch_lora_weights.safetensors"

# Use a local copy of the base model if USE_LOCAL_MODEL was set in the download cell
local_model_path = PROJECT_ROOT / "models" / "stable-diffusion-2-1"
use_local = USE_LOCAL_MODEL

if use_local:
    print(f"✓ Using local model from: {local_model_path}")
    model_path = str(local_model_path)
else:
    print("Using Hugging Face model: rigvedrs/Diffy-2-1")
    model_path = "rigvedrs/Diffy-2-1"

# Check if checkpoint file exists
checkpoint_file = checkpoint_path / weight_name
if not checkpoint_file.exists():
    raise FileNotFoundError(
        f"LoRA checkpoint file not found at {checkpoint_file}\n"
        f"Please run the checkpoint download cell above first, or download manually from:\n"
        f"https://huggingface.co/rigvedrs/DiffyFace/tree/main"
    )

# Load LoRA state dict
print(f"Loading LoRA weights from {checkpoint_path}...")
try:
    state_dict, network_alphas = StableDiffusionPipeline.lora_state_dict(
        str(checkpoint_path),
        weight_name=weight_name
    )
    print("✓ LoRA weights loaded successfully")
except Exception as e:
    raise RuntimeError(
        f"Failed to load LoRA weights: {e}\n"
        f"Make sure the checkpoint file exists at {checkpoint_path / weight_name}"
    ) from e

# Load the base Stable Diffusion 2.1 model
print(f"Loading Stable Diffusion 2.1 model from {model_path}...")
try:
    pipe = StableDiffusionPipeline.from_pretrained(
        model_path,
        torch_dtype=torch.float16,
        local_files_only=use_local  # Use local files only if using local model
    ).to(device)
    print("✓ Base model loaded successfully")
except Exception as e:
    error_msg = str(e)
    if "401" in error_msg or "Unauthorized" in error_msg:
        print("\n" + "="*80)
        print("AUTHENTICATION ERROR")
        print("="*80)
        print("If authentication is required, you need to:")
        print("1. Get your Hugging Face token from: https://huggingface.co/settings/tokens")
        print("2. Set the HF_TOKEN environment variable or call huggingface_hub.login()")
        print("\nAlternatively, download the model locally:")
        print("  python Generation/download_and_upload_model.py --download-only")
        print("  Then set USE_LOCAL_MODEL = True in the previous cell")
        print("="*80)
    elif "not cached locally" in error_msg.lower():
        print("\n" + "="*80)
        print("MODEL NOT FOUND")
        print("="*80)
        print("The model is not cached locally. Options:")
        print("1. Authenticate with Hugging Face (HF_TOKEN or huggingface_hub.login())")
        print("2. Download the model locally:")
        print("   python Generation/download_and_upload_model.py --download-only")
        print("   Then set USE_LOCAL_MODEL = True")
        print("="*80)
    raise

# Load LoRA weights into model
print("Loading LoRA weights into model...")
pipe.load_lora_into_unet(
    state_dict, network_alphas, pipe.unet, adapter_name='diffyface_lora'
)
pipe.load_lora_into_text_encoder(
    state_dict, network_alphas, pipe.text_encoder, adapter_name='diffyface_lora'
)
pipe.set_adapters(["diffyface_lora"], adapter_weights=[1.0])

print("✓ Model loaded successfully and ready for inference!")


In [None]:
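# Minimal variant of the previous cell without error handling.
# Running it is optional: it reloads the same base model and LoRA weights.
from diffusers import StableDiffusionPipeline
import torch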

# Check CUDA availability
if not torch.cuda.is_available():
    raise RuntimeError("CUDA not available! DiffyFace requires a GPU. Please use Google Colab with GPU or a machine with CUDA support.")

device = "cuda"
print(f"✓ Using device: {device}")

# Setup paths
checkpoint_path = PROJECT_ROOT / "checkpoints" / "lora30k"
weight_name = "pytorch_lora_weights.safetensors"

# Load LoRA state dict
print(f"Loading LoRA weights from {checkpoint_path}...")
state_dict, network_alphas = StableDiffusionPipeline.lora_state_dict(
    str(checkpoint_path),
    weight_name=weight_name
)

# Load the base Stable Diffusion 2.1 model
print("Loading Stable Diffusion 2.1 model from rigvedrs/Diffy-2-1...")
pipe = StableDiffusionPipeline.from_pretrained(
    "rigvedrs/Diffy-2-1",
    torch_dtype=torch.float16
).to(device)

# Load LoRA weights into model
pipe.load_lora_into_unet(
    state_dict, network_alphas, pipe.unet, adapter_name='diffyface_lora'
)
pipe.load_lora_into_text_encoder(
    state_dict, network_alphas, pipe.text_encoder, adapter_name='diffyface_lora'
)
pipe.set_adapters(["diffyface_lora"], adapter_weights=[1.0])

print("✓ Model loaded successfully and ready for inference!")
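
The adapter weight of `1.0` passed to `set_adapters` applies the LoRA update at full strength. As a rough illustration of what such a weight means, the sketch below uses the standard LoRA formulation `W + scale * (alpha / rank) * B @ A` on toy matrices in plain Python. The function names and numbers are made up for this example, and how `diffusers` and `peft` apply the scale internally is not shown here.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_b, lora_a, alpha, scale=1.0):
    """Return weight + scale * (alpha / rank) * (lora_b @ lora_a)."""
    rank = len(lora_a)  # lora_a has shape (rank, in_features)
    factor = scale * alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + factor * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 update (illustrative values only).
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # shape (out_features, rank)
lora_a = [[0.5, 0.5]]    # shape (rank, in_features)

full = apply_lora(weight, lora_b, lora_a, alpha=1.0, scale=1.0)
off = apply_lora(weight, lora_b, lora_a, alpha=1.0, scale=0.0)
print("scale=1.0:", full)
print("scale=0.0:", off)
```

With a scale of `0.0` the base weights are returned unchanged, which is why lowering the adapter weight moves the output back toward the base model.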


## Step 4: Inference Function

Function to generate images with customizable parameters.


In [None]:
def generate_face(
    prompt: str,
    negative_prompt: str = "",
    num_inference_steps: int = 50,
    guidance_scale: float = 7.5,
    seed: int = None,
    save_image: bool = True,
    display_image: bool = True
):
    """
    Generate a face image from a text prompt.
    
    Args:
        prompt: Text description of the face to generate
        negative_prompt: What to avoid in the generation
        num_inference_steps: Number of denoising steps (more = better quality, slower)
        guidance_scale: How closely to follow the prompt (higher = more adherence)
        seed: Random seed for reproducibility (None = random)
        save_image: Whether to save the image to disk
        display_image: Whether to display the image in the notebook
        
    Returns:
        Generated PIL Image
    """
    # Use a dedicated generator when a seed is given so the global RNG state is untouched
    generator = None
    if seed is not None:
        generator = torch.Generator(device="cpu").manual_seed(seed)
        print(f"Using seed: {seed}")
    
    print(f"Generating image with prompt: '{prompt}'")
    print(f"Inference steps: {num_inference_steps}, Guidance scale: {guidance_scale}")
    
    # Generate image
    with torch.no_grad():
        result = pipe(
            prompt,
            negative_prompt=negative_prompt if negative_prompt else None,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
            generator=generator,
            cross_attention_kwargs={"scale": 1.0} if device == "cuda" else None,
            output_type="pil"
        )
    
    image = result.images[0]
    
    # Save image if requested
    if save_image:
        # Build a filesystem-safe name: keep letters and digits, join words with "_"
        filename = "".join(ch if ch.isalnum() else " " for ch in prompt)
        filename = "_".join(filename.split())
        
        # Add timestamp and seed if provided
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        seed_str = f"_seed{seed}" if seed is not None else ""
        filename = f"{timestamp}_{filename[:50]}{seed_str}.png"
        
        # Ensure output directory exists
        OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
        
        save_path = OUTPUT_DIR / filename
        image.save(save_path)
        print(f"✓ Image saved to: {save_path}")
    
    # Display image if requested
    if display_image:
        display(image)
    
    return image
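
The naming scheme for saved files (timestamp, a prompt slug of at most 50 characters, and an optional seed suffix) can be sketched in isolation. `build_filename` is a hypothetical helper written for this example; it keeps only letters and digits in the slug, and its `now` parameter exists only to make the result predictable.

```python
from datetime import datetime


def build_filename(prompt, seed=None, now=None):
    """Build '<timestamp>_<slug>[_seed<N>].png' from a prompt."""
    now = now or datetime.now()
    # Replace anything that is not a letter or digit with a space, then join words.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in prompt)
    slug = "_".join(cleaned.split())[:50].strip("_")
    seed_part = f"_seed{seed}" if seed is not None else ""
    return f"{now:%Y%m%d_%H%M%S}_{slug}{seed_part}.png"


print(build_filename("A young woman, with glasses!", seed=42))
```

Keeping the seed in the file name makes it possible to regenerate an image later from the name alone, provided the prompt and parameters are also known.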


In [None]:
# ============================================
# CONFIGURATION - Modify these values
# ============================================

# Your prompt describing the face
PROMPT = "A happy 25 year old male with blond hair and a french beard smiles with visible teeth."

# Negative prompt (what to avoid)
NEGATIVE_PROMPT = "blurry, distorted, low quality, deformed"

# Generation parameters
NUM_INFERENCE_STEPS = 50  # More steps = better quality but slower (20-100)
GUIDANCE_SCALE = 7.5      # How closely to follow prompt (1.0-20.0)
SEED = 42                 # Random seed (None for random, or any integer for reproducibility)

# Options
SAVE_IMAGE = True         # Save image to disk
DISPLAY_IMAGE = True      # Display image in notebook

# ============================================
# Generate the image
# ============================================

image = generate_face(
    prompt=PROMPT,
    negative_prompt=NEGATIVE_PROMPT,
    num_inference_steps=NUM_INFERENCE_STEPS,
    guidance_scale=GUIDANCE_SCALE,
    seed=SEED,
    save_image=SAVE_IMAGE,
    display_image=DISPLAY_IMAGE
)


## Batch Generation

Generate multiple images with different prompts or seeds.


In [None]:
# Example: Generate multiple images with different prompts and seeds
prompts = [
    "A happy 25 year old male with blond hair and a french beard smiles with visible teeth.",
    "A young woman with dark hair and glasses looks serious.",
    "An elderly man with gray hair and a kind expression."
]

for i, prompt in enumerate(prompts):
    print(f"\n{'='*80}")
    print(f"Generating image {i+1}/{len(prompts)}")
    print(f"{'='*80}")
    
    image = generate_face(
        prompt=prompt,
        negative_prompt="blurry, distorted, low quality",
        num_inference_steps=50,
        guidance_scale=7.5,
        seed=42 + i,  # Different seed for each image
        save_image=True,
        display_image=True
    )


## View Saved Images

List and display all saved images.


In [None]:
# List all saved images
saved_images = sorted(OUTPUT_DIR.glob("*.png"))
print(f"Total saved images: {len(saved_images)}")
print(f"\nSaved images in {OUTPUT_DIR}:")

for img_path in saved_images[-10:]:  # Show last 10 images
    print(f"  - {img_path.name}")

# Display the most recent image
if saved_images:
    print(f"\nMost recent image:")
    latest_image = Image.open(saved_images[-1])
    display(latest_image)
    print(f"File: {saved_images[-1].name}")
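
Because the timestamp and seed are encoded in each file name, they can be recovered later without opening the image. The sketch below assumes names of the form `<timestamp>_<slug>[_seed<N>].png` with a non-negative seed; `parse_saved_name` and `NAME_PATTERN` are hypothetical names introduced for this example.

```python
import re

# Matches names such as '20240102_030405_A_happy_man_seed42.png'.
NAME_PATTERN = re.compile(
    r"^(?P<timestamp>\d{8}_\d{6})_(?P<slug>.*?)(?:_seed(?P<seed>\d+))?\.png$"
)


def parse_saved_name(name):
    """Split a saved file name into timestamp, prompt slug and seed (or None)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    seed = match.group("seed")
    return {
        "timestamp": match.group("timestamp"),
        "slug": match.group("slug"),
        "seed": int(seed) if seed is not None else None,
    }


print(parse_saved_name("20240102_030405_A_happy_man_seed42.png"))
```

Names that do not follow the pattern return `None`, so unrelated PNG files in the output folder can simply be skipped.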


## Advanced: Custom Generation

For more control, use this cell to experiment with different parameters.


In [None]:
# Advanced generation with full control
custom_prompt = "A 30 year old Asian woman with long black hair, wearing glasses, smiling warmly"
custom_negative = "blurry, distorted, low quality, deformed, ugly"
custom_steps = 75  # Higher quality, slower
custom_guidance = 8.0  # Stronger prompt adherence
custom_seed = 123  # Set to None for random

image = generate_face(
    prompt=custom_prompt,
    negative_prompt=custom_negative,
    num_inference_steps=custom_steps,
    guidance_scale=custom_guidance,
    seed=custom_seed,
    save_image=True,
    display_image=True
)


## Tips for Better Results

1. **Detailed Prompts**: Be specific about age, gender, hair, facial features, expression
   - Good: "A happy 25 year old male with blond hair and a french beard smiles with visible teeth."
   - Bad: "A person"

2. **Negative Prompts**: Use to avoid unwanted features
   - "blurry, distorted, low quality, deformed, ugly"

3. **Inference Steps**: 
   - 20-30: Fast, lower quality
   - 50: Balanced (recommended)
   - 75-100: High quality, slower

4. **Guidance Scale**:
   - 5.0-7.5: More creative, less strict
   - 7.5-10.0: Balanced (recommended)
   - 10.0-15.0: Very strict prompt adherence

5. **Seed**: Reuse the same seed together with the same prompt and parameters to reproduce the same image
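
The step and guidance ranges above can be captured as named presets so that experiments stay consistent. This is only a sketch: the preset names and the exact values are assumptions picked from within the ranges listed in the tips, and `settings_for` is a hypothetical helper.

```python
# Illustrative presets; values are chosen from the ranges in the tips above.
PRESETS = {
    "fast": {"num_inference_steps": 25, "guidance_scale": 7.5},
    "balanced": {"num_inference_steps": 50, "guidance_scale": 7.5},
    "quality": {"num_inference_steps": 75, "guidance_scale": 8.0},
}


def settings_for(preset, **overrides):
    """Return a copy of a preset's settings with optional overrides applied."""
    if preset not in PRESETS:
        raise ValueError(f"Unknown preset {preset!r}; choose from {sorted(PRESETS)}")
    return {**PRESETS[preset], **overrides}


print(settings_for("quality", seed=123))
```

The returned dictionary uses the same keyword names as `generate_face`, so it could be unpacked into a call alongside a prompt.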
