Subject-driven image generation and editing using BLIP Diffusion and OpenVINO

Inference result demos (images omitted here) are provided for three tasks: zero-shot subject-driven generation, controlled subject-driven generation (Canny-edge), and controlled subject-driven generation (Scribble).

BLIP-Diffusion is a text-to-image diffusion model with built-in support for multimodal subject-and-text conditioning. It enables zero-shot subject-driven generation and efficient fine-tuning for customized subjects with up to 20x speedup. In addition, BLIP-Diffusion can be flexibly combined with ControlNet and prompt-to-prompt to enable novel subject-driven generation and editing applications.
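For orientation, the zero-shot use case looks roughly like the sketch below when run through the Hugging Face diffusers BlipDiffusionPipeline. This is an illustrative sketch only: the checkpoint name, the prompts, and the local image path are assumptions, and the notebook itself loads and runs the original model as described in the contents below.

```python
import torch
from diffusers.pipelines import BlipDiffusionPipeline
from diffusers.utils import load_image

# Load a pretrained BLIP-Diffusion pipeline (checkpoint name is an assumption)
pipe = BlipDiffusionPipeline.from_pretrained("Salesforce/blipdiffusion", torch_dtype=torch.float32)

# One reference image of the subject plus a text prompt drive the generation
subject_image = load_image("dog.jpg")  # user-supplied reference image of the subject

result = pipe(
    "swimming underwater",  # text prompt
    subject_image,          # reference image of the subject
    "dog",                  # source subject category
    "dog",                  # target subject category
    guidance_scale=7.5,
    num_inference_steps=25,
    neg_prompt="lowres, cropped, worst quality, low quality",
    height=512,
    width=512,
).images
result[0].save("zero_shot_dog.png")
```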

Notebook contents

The tutorial consists of the following steps:

  • Prerequisites
  • Load the model
  • Infer the original model
    • Zero-Shot subject-driven generation
    • Controlled subject-driven generation (Canny-edge)
    • Controlled subject-driven generation (Scribble)
  • Convert the model to OpenVINO Intermediate Representation (IR) (conversion pattern sketched after this list)
    • QFormer
    • Text encoder
    • ControlNet
    • UNet
    • Variational Autoencoder (VAE)
    • Select inference device
  • Inference
    • Zero-Shot subject-driven generation
    • Controlled subject-driven generation (Canny-edge)
    • Controlled subject-driven generation (Scribble)
  • Interactive inference
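
To make the conversion and device-selection steps above concrete, the sketch below shows the general OpenVINO pattern the notebook applies to each submodel (QFormer, text encoder, ControlNet, UNet, VAE): trace the PyTorch module with ov.convert_model, save it as IR, then compile it on a chosen device and run inference. The TinyBlock module, file name, and input shapes are placeholders so the snippet runs on its own; the real submodels and their example inputs come from the loaded pipeline.

```python
from pathlib import Path

import numpy as np
import openvino as ov
import torch

# Placeholder stand-in for one of the pipeline submodels
class TinyBlock(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.gelu(x)

ir_path = Path("tiny_block.xml")
if not ir_path.exists():
    # Trace the PyTorch module and save it as OpenVINO IR (.xml + .bin)
    ov_model = ov.convert_model(TinyBlock().eval(), example_input=torch.zeros(1, 4, 64, 64))
    ov.save_model(ov_model, str(ir_path))

# Select an inference device and compile the IR
core = ov.Core()
print("Available devices:", core.available_devices)
device = "AUTO"  # or an explicit device from the list above, e.g. "CPU" or "GPU"
compiled = core.compile_model(str(ir_path), device)

# Inference with the compiled model: inputs are numpy arrays, outputs are indexed by port
result = compiled(np.zeros((1, 4, 64, 64), dtype=np.float32))[compiled.output(0)]
print(result.shape)
```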

Installation instructions

This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. For details, please refer to Installation Guide.