<a href="https://colab.research.google.com/github/your-username/flux-transparent-png/blob/main/colab/generate_transparent_png.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Flux.1 Transparent PNG Generation

This notebook generates PNG images with transparent backgrounds using a trained VAE model with Flux.1 dev.

## Setup

First, let's mount Google Drive (when running in Colab), clone the repository, and install the required dependencies.

In [None]:
# Check if running in Google Colab
import sys
IN_COLAB = 'google.colab' in sys.modules
print(f"Running in Colab: {IN_COLAB}")

# Mount Google Drive if in Colab
if IN_COLAB:
    from google.colab import drive
    drive.mount('/content/drive')
    print("Google Drive mounted at /content/drive")

In [None]:
# Clone the repository
!git clone https://github.com/your-username/flux-transparent-png.git
%cd flux-transparent-png/python

In [None]:
# Install dependencies
!pip install torch torchvision diffusers transformers pillow numpy matplotlib tqdm

## Configuration

Set up the generation configuration parameters.

In [None]:
# Configuration parameters
MODEL_PATH = "/content/drive/MyDrive/VAE-DECODER/transparent_vae.pt"  # Path to trained VAE model
OUTPUT_DIR = "/content/drive/MyDrive/VAE-DECODER/OUT"  # Output directory for generated images
USE_DECODER_ONLY = False  # Whether to use only the decoder for generation
HEIGHT = 512  # Height of generated images
WIDTH = 512  # Width of generated images
GUIDANCE_SCALE = 3.5  # Guidance scale for generation
NUM_INFERENCE_STEPS = 50  # Number of inference steps
SEED = 42  # Random seed for reproducibility

# Create output directory (quoted so that paths containing spaces still work)
!mkdir -p "{OUTPUT_DIR}"
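
Before launching a long generation run, it can help to sanity-check these values. The following is a minimal sketch and not part of the repository: `validate_config` is a hypothetical helper, and the default `multiple=16` is an assumption about the pipeline's resolution constraints, not something this notebook documents.

```python
import os


def validate_config(model_path, height, width, guidance_scale,
                    num_inference_steps, multiple=16):
    """Return a list of problems; an empty list means the config looks sane.

    `multiple` is an assumed resolution constraint; adjust it to match
    whatever the generation script actually requires.
    """
    problems = []
    if not os.path.isfile(model_path):
        problems.append(f"model file not found: {model_path}")
    for name, value in (("height", height), ("width", width)):
        if value <= 0 or value % multiple != 0:
            problems.append(
                f"{name}={value} should be a positive multiple of {multiple}"
            )
    if guidance_scale < 0:
        problems.append(f"guidance_scale={guidance_scale} should not be negative")
    if num_inference_steps < 1:
        problems.append(
            f"num_inference_steps={num_inference_steps} should be at least 1"
        )
    return problems
```

Calling `validate_config(MODEL_PATH, HEIGHT, WIDTH, GUIDANCE_SCALE, NUM_INFERENCE_STEPS)` and printing the returned list surfaces a missing checkpoint before any model loading starts.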

## Define Prompts

Define the prompts for image generation.

In [None]:
# Define prompts
prompts = [
    "A beautiful red rose on a transparent background",
    "A cute cartoon cat with big eyes on a transparent background",
    "A golden trophy with a star on top on a transparent background",
    "A colorful butterfly with detailed wings on a transparent background",
    "A simple logo design with abstract shapes on a transparent background"
]

# Save prompts to a file
with open("prompts.txt", "w") as f:
    for prompt in prompts:
        f.write(f"{prompt}\n")
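
The file written above holds one prompt per line. As an illustration of that format only, a reader for it could look like the sketch below; `load_prompts` is a hypothetical name, and the generation script may parse the file differently.

```python
def load_prompts(path):
    """Read one prompt per line, ignoring blank lines and surrounding whitespace."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

Reading the file back with `load_prompts("prompts.txt")` is a quick way to confirm that every prompt was saved as intended.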

## Generate Images

Now let's generate transparent PNG images using the trained model.

In [None]:
# Generate images
!python generate_transparent_png.py \
  --model_path="{MODEL_PATH}" \
  --prompts_file="prompts.txt" \
  --output_dir="{OUTPUT_DIR}" \
  --height={HEIGHT} \
  --width={WIDTH} \
  --guidance_scale={GUIDANCE_SCALE} \
  --num_inference_steps={NUM_INFERENCE_STEPS} \
  --seed={SEED} \
  {'--use_decoder_only' if USE_DECODER_ONLY else ''}
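
The `!python` line relies on IPython substituting `{...}` expressions into a shell string, which can break when a prompt contains quotes. A possible alternative is to build the argument list in Python and run it with `subprocess`. This is a sketch under assumptions: `build_command` is a hypothetical helper, and it uses only the flags that appear in this notebook's cells.

```python
import sys


def build_command(model_path, output_dir, height, width, guidance_scale,
                  num_inference_steps, seed, prompts_file=None, prompt=None,
                  use_decoder_only=False, script="generate_transparent_png.py"):
    """Build an argument list for the generation script (no shell quoting needed)."""
    # Exactly one prompt source must be given.
    if (prompts_file is None) == (prompt is None):
        raise ValueError("pass exactly one of prompts_file or prompt")
    cmd = [
        sys.executable, script,
        f"--model_path={model_path}",
        f"--output_dir={output_dir}",
        f"--height={height}",
        f"--width={width}",
        f"--guidance_scale={guidance_scale}",
        f"--num_inference_steps={num_inference_steps}",
        f"--seed={seed}",
    ]
    if prompts_file is not None:
        cmd.append(f"--prompts_file={prompts_file}")
    else:
        cmd.append(f"--prompt={prompt}")
    if use_decoder_only:
        cmd.append("--use_decoder_only")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which raises an exception if the script exits with an error instead of failing silently.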

## Display Generated Images

Let's display the generated transparent PNG images.

In [None]:
import glob
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

# Find generated images
image_files = sorted(glob.glob(f"{OUTPUT_DIR}/*.png"))
print(f"Found {len(image_files)} generated images")

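# Display images
n = len(image_files)
cols = 2
# Keep at least one row so plt.subplots does not fail when no images are found
rows = max(1, (n + cols - 1) // cols)

# squeeze=False always returns a 2D array of axes, so flatten() works for any grid size
fig, axes = plt.subplots(rows, cols, figsize=(12, 6 * rows), squeeze=False)
axes = axes.flatten()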

for i, image_file in enumerate(image_files):
    if i < len(axes):
        # Load image (alpha_composite requires both images to be RGBA)
        img = Image.open(image_file).convert('RGBA')
        
        # Create checkerboard background to show transparency
        bg_size = 10  # Size of checkerboard squares
        bg = Image.new('RGBA', img.size, (255, 255, 255, 255))
        for y in range(0, img.height, bg_size):
            for x in range(0, img.width, bg_size):
                if (x // bg_size + y // bg_size) % 2 == 0:
                    for dy in range(bg_size):
                        for dx in range(bg_size):
                            if x+dx < img.width and y+dy < img.height:
                                bg.putpixel((x+dx, y+dy), (200, 200, 200, 255))
        
        # Composite image over checkerboard background
        composite = Image.alpha_composite(bg, img)
        
        # Display image
        axes[i].imshow(composite)
        axes[i].set_title(f"Image {i+1}")
        axes[i].axis('off')

# Hide unused axes
for i in range(len(image_files), len(axes)):
    axes[i].axis('off')

plt.tight_layout()
plt.show()
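
The per-pixel `putpixel` loops above become slow for large images. A vectorized NumPy version of the same checkerboard is sketched below. The helper names `make_checkerboard` and `composite_on_checkerboard` are illustrative and not part of the repository; the colors and square size mirror the values used in the cell above.

```python
import numpy as np
from PIL import Image


def make_checkerboard(width, height, square=10, light=255, dark=200):
    """Return an opaque RGBA checkerboard image of the given size."""
    ys, xs = np.indices((height, width))
    # Same parity rule as the loop version: squares with an even index sum are dark.
    dark_mask = ((xs // square) + (ys // square)) % 2 == 0
    gray = np.where(dark_mask, dark, light).astype(np.uint8)
    alpha = np.full_like(gray, 255)
    # A (height, width, 4) uint8 array is interpreted as RGBA.
    return Image.fromarray(np.dstack([gray, gray, gray, alpha]))


def composite_on_checkerboard(img, square=10):
    """Composite an image over a checkerboard so transparent areas are visible."""
    img = img.convert("RGBA")  # alpha_composite requires RGBA input
    bg = make_checkerboard(img.width, img.height, square)
    return Image.alpha_composite(bg, img)
```

With these helpers, the display loop can call `axes[i].imshow(composite_on_checkerboard(Image.open(image_file)))` in place of the nested loops.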

## Generate Custom Image

You can also generate a custom image with your own prompt.

In [None]:
# Custom prompt
custom_prompt = "A majestic eagle with spread wings on a transparent background"  # Change this to your desired prompt

# Generate custom image
!python generate_transparent_png.py \
  --model_path="{MODEL_PATH}" \
  --prompt="{custom_prompt}" \
  --output_dir="{OUTPUT_DIR}/custom" \
  --height={HEIGHT} \
  --width={WIDTH} \
  --guidance_scale={GUIDANCE_SCALE} \
  --num_inference_steps={NUM_INFERENCE_STEPS} \
  --seed={SEED} \
  {'--use_decoder_only' if USE_DECODER_ONLY else ''}

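# Display custom image
custom_image_files = sorted(glob.glob(f"{OUTPUT_DIR}/custom/*.png"))
if not custom_image_files:
    raise FileNotFoundError(f"No PNG files found in {OUTPUT_DIR}/custom")
custom_image_file = custom_image_files[0]
img = Image.open(custom_image_file).convert('RGBA')  # alpha_composite requires RGBA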

# Create checkerboard background to show transparency
bg_size = 10  # Size of checkerboard squares
bg = Image.new('RGBA', img.size, (255, 255, 255, 255))
for y in range(0, img.height, bg_size):
    for x in range(0, img.width, bg_size):
        if (x // bg_size + y // bg_size) % 2 == 0:
            for dy in range(bg_size):
                for dx in range(bg_size):
                    if x+dx < img.width and y+dy < img.height:
                        bg.putpixel((x+dx, y+dy), (200, 200, 200, 255))

# Composite image over checkerboard background
composite = Image.alpha_composite(bg, img)

# Display image
plt.figure(figsize=(10, 10))
plt.imshow(composite)
plt.title(f"Custom Image: {custom_prompt}")
plt.axis('off')
plt.show()
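
A checkerboard preview shows transparency visually, but it can also be checked numerically. The sketch below reports whether an image has an alpha channel and what fraction of its pixels are fully transparent. `transparency_report` is a hypothetical helper; it only inspects images with an explicit alpha band, so palette-based transparency is reported as having no alpha.

```python
import numpy as np
from PIL import Image


def transparency_report(img, threshold=0):
    """Summarize the alpha channel of a PIL image.

    Pixels whose alpha is <= threshold count as transparent.
    """
    if "A" not in img.getbands():
        return {"has_alpha": False, "transparent_fraction": 0.0}
    alpha = np.asarray(img.getchannel("A"))
    fraction = float((alpha <= threshold).mean())
    return {"has_alpha": True, "transparent_fraction": fraction}
```

For example, `transparency_report(Image.open(custom_image_file))` returning a `transparent_fraction` of `0.0` would suggest the background was not removed.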

## Experiment with Different Parameters

You can experiment with different parameters to see how they affect the generated images.

In [None]:
import os  # needed for os.path.getctime below

# Parameters to experiment with
experiment_prompt = "A crystal clear water droplet on a transparent background"
guidance_scales = [1.0, 3.5, 7.0]
inference_steps = [20, 50, 100]

# Create experiment directory
experiment_dir = f"{OUTPUT_DIR}/experiment"
!mkdir -p {experiment_dir}

# Generate images with different parameters
results = []

for gs in guidance_scales:
    for steps in inference_steps:
        output_path = f"{experiment_dir}/gs_{gs}_steps_{steps}.png"
        
        # Generate image
        !python generate_transparent_png.py \
          --model_path="{MODEL_PATH}" \
          --prompt="{experiment_prompt}" \
          --output_dir="{experiment_dir}" \
          --height={HEIGHT} \
          --width={WIDTH} \
          --guidance_scale={gs} \
          --num_inference_steps={steps} \
          --seed={SEED} \
          {'--use_decoder_only' if USE_DECODER_ONLY else ''}
        
        # Find the generated image
        image_files = glob.glob(f"{experiment_dir}/*.png")
        if image_files:
            latest_image = max(image_files, key=os.path.getctime)
            results.append((gs, steps, latest_image))

# Display experiment results
rows = len(guidance_scales)
cols = len(inference_steps)
fig, axes = plt.subplots(rows, cols, figsize=(15, 5 * rows))

for i, gs in enumerate(guidance_scales):
    for j, steps in enumerate(inference_steps):
        # Find matching result
        matching_results = [r for r in results if r[0] == gs and r[1] == steps]
        if matching_results:
            image_file = matching_results[0][2]
            img = Image.open(image_file).convert('RGBA')  # alpha_composite requires RGBA
            
            # Create checkerboard background
            bg_size = 10
            bg = Image.new('RGBA', img.size, (255, 255, 255, 255))
            for y in range(0, img.height, bg_size):
                for x in range(0, img.width, bg_size):
                    if (x // bg_size + y // bg_size) % 2 == 0:
                        for dy in range(bg_size):
                            for dx in range(bg_size):
                                if x+dx < img.width and y+dy < img.height:
                                    bg.putpixel((x+dx, y+dy), (200, 200, 200, 255))
            
            # Composite image
            composite = Image.alpha_composite(bg, img)
            
            # Display image
            axes[i, j].imshow(composite)
            axes[i, j].set_title(f"GS: {gs}, Steps: {steps}")
            axes[i, j].axis('off')

plt.tight_layout()
plt.show()
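
The nested loops above visit every combination of guidance scale and step count. The same grid can be expressed with `itertools.product`, which keeps each combination and its label together in one record. This is a sketch: `parameter_grid` is a hypothetical helper, and the label format simply mirrors the `gs_{gs}_steps_{steps}` pattern from the `output_path` variable in the cell above.

```python
from itertools import product


def parameter_grid(guidance_scales, inference_steps):
    """Return one dict per (guidance_scale, num_inference_steps) combination."""
    return [
        {
            "guidance_scale": gs,
            "num_inference_steps": steps,
            "label": f"gs_{gs}_steps_{steps}",
        }
        for gs, steps in product(guidance_scales, inference_steps)
    ]
```

Iterating over `parameter_grid(guidance_scales, inference_steps)` yields the combinations in the same order as the nested loops, with guidance scale as the outer dimension.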

## Conclusion

You have successfully generated transparent PNG images using the trained VAE model with Flux.1 dev. You can continue to experiment with different prompts and parameters to create more images.

If you want to train your own model, check out the `train_transparent_png.ipynb` notebook.