add hunyuan dit image generation (#2108)

TO DO
- [x] extend descriptions
- [x] table of contents
- [x] device selection
- [x] readme and meta
openvinotoolkit · Jun 18, 2024 · 8a6ed47
1 parent 584ce55
commit 8a6ed47

Showing 8 changed files with 1,344 additions and 3 deletions.
diff --git a/.ci/ignore_convert_execution.txt b/.ci/ignore_convert_execution.txt
@@ -60,4 +60,5 @@ notebooks/llava-next-multimodal-chatbot/llava-next-multimodal-chatbot.ipynb
 notebooks/stable-video-diffusion/stable-video-diffusion.ipynb
 notebooks/llm-agent-langchain/llm-agent-langchain.ipynb
 notebooks/hello-npu/hello-npu.ipynb
-notebooks/yolov10-optimization/yolov10-optimization.ipynb
+notebooks/yolov10-optimization/yolov10-optimization.ipynb
+notebooks/hunyuan-dit-image-generation/hunyuan-dit-image-generation.ipynb
diff --git a/.ci/ignore_treon_docker.txt b/.ci/ignore_treon_docker.txt
@@ -68,3 +68,4 @@ notebooks/dynamicrafter-animating-images/dynamicrafter-animating-images.ipynb
 notebooks/yolov10-optimization/yolov10-optimization.ipynb
 notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb
 notebooks/speechbrain-emotion-recognition/speechbrain-emotion-recognition.ipynb
+notebooks/hunyuan-dit-image-generation/hunyuan-dit-image-generation.ipynb
diff --git a/.ci/ignore_treon_linux.txt b/.ci/ignore_treon_linux.txt
@@ -67,4 +67,5 @@ notebooks/hello-npu/hello-npu.ipynb
 notebooks/stable-cascade-image-generation/stable-cascade-image-generation.ipynb
 notebooks/dynamicrafter-animating-images/dynamicrafter-animating-images.ipynb
 notebooks/yolov10-optimization/yolov10-optimization.ipynb
-notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb
+notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb
+notebooks/hunyuan-dit-image-generation/hunyuan-dit-image-generation.ipynb
diff --git a/.ci/ignore_treon_mac.txt b/.ci/ignore_treon_mac.txt
@@ -69,4 +69,5 @@ notebooks/stable-cascade-image-generation/stable-cascade-image-generation.ipynb
 notebooks/dynamicrafter-animating-images/dynamicrafter-animating-images.ipynb
 notebooks/yolov10-optimization/yolov10-optimization.ipynb
 notebooks/nano-llava-multimodal-chatbot/nano-llava-multimodal-chatbot.ipynb
-notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb
+notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb
+notebooks/hunyuan-dit-image-generation/hunyuan-dit-image-generation.ipynb
diff --git a/.ci/ignore_treon_win.txt b/.ci/ignore_treon_win.txt
@@ -66,3 +66,4 @@ notebooks/stable-cascade-image-generation/stable-cascade-image-generation.ipynb
 notebooks/dynamicrafter-animating-images/dynamicrafter-animating-images.ipynb
 notebooks/yolov10-optimization/yolov10-optimization.ipynb
 notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb
+notebooks/hunyuan-dit-image-generation/hunyuan-dit-image-generation.ipynb
diff --git a/.ci/spellcheck/.pyspelling.wordlist.txt b/.ci/spellcheck/.pyspelling.wordlist.txt
@@ -190,6 +190,10 @@ distil
 DistilBERT
 distilbert
 distiluse
+DIT
+DiT
+DiT’s
+DiT’s
 DL
 DocLayNet
 docstring
@@ -218,6 +222,7 @@ editability
 EfficientNet
 EfficientSAM
 EfficientSAMs
+Embedder
 embeddings
 EnCodec
 encodec
@@ -292,6 +297,9 @@ HH
 hoc
 HuggingFace
 huggingfacehub
+Hunyuan
+hunyuan
+HunyuanDIT
 Husain
 HWC
 hyperparameters
@@ -325,6 +333,7 @@ InstructPix
 intel
 InternLM
 internlm
+Interpolative
 invertible
 intervaling
 im
@@ -661,6 +670,7 @@ RMBG
 RoBERTa
 roberta
 ROI
+RoPE
 Ruizhongtai
 Runtime
 runtime

diff --git a/notebooks/hunyuan-dit-image-generation/README.md b/notebooks/hunyuan-dit-image-generation/README.md
@@ -0,0 +1,33 @@
+# Image generation with HunyuanDIT and OpenVINO
+
+Hunyuan-DiT is a powerful text-to-image diffusion transformer with fine-grained understanding of both English and Chinese. The model architecture expertly blends diffusion models and transformer networks to unlock the potential of Chinese text-to-image generation.
+
+![](https://raw.githubusercontent.com/Tencent/HunyuanDiT/main/asset/framework.png)
+
+More details about the model can be found in the original [repository](https://github.com/Tencent/HunyuanDiT), [project web page](https://dit.hunyuan.tencent.com/), and [paper](https://arxiv.org/abs/2405.08748).
+
+This tutorial shows how to convert and run the Hunyuan-DiT model using OpenVINO. Additionally, we use [NNCF](https://github.com/openvinotoolkit/nncf) to optimize the model in low precision.
+
+The notebook provides a simple interface for interacting with the model through text instructions in English or Chinese. In this demonstration, the user provides an input instruction and the model generates an image.
+The image below shows an example of a generated image.
+![](https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/b541d7d9-da82-4fe9-a98b-e744cb25c3c6)
+
+>**Note**: The demonstrated models may require at least 32 GB of RAM for conversion and running.
+
+
+## Notebook Contents
+
+The tutorial consists of the following steps:
+
+- Install prerequisites
+- Prepare Diffusers pipeline
+- Convert PyTorch models and compress model weights
+- Prepare OpenVINO inference pipeline
+- Run model inference
+- Launch interactive demo
+
+## Installation Instructions
+
+This is a self-contained example that relies solely on its own code.<br>
+We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
+For details, please refer to the [Installation Guide](../../README.md).
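
A note on the `.ci/ignore_*.txt` hunks in this commit: in several of them the previous last line shows up as both removed and re-added, which is most likely because the file did not end with a newline before the new notebook path was appended. The following is a minimal sketch, not part of this repository, of a helper that registers a notebook path in such a list. The function name `append_entry` is hypothetical, and the one-path-per-line layout is assumed from the hunks shown here.

```python
from pathlib import Path


def append_entry(list_file: Path, entry: str) -> bool:
    """Append `entry` as its own line unless it is already listed.

    Hypothetical helper for illustration only. Returns True when the
    file was modified and False when the entry was already present.
    """
    text = list_file.read_text(encoding="utf-8") if list_file.exists() else ""

    # Skip the write when the exact path is already on a line of its own.
    if entry in text.splitlines():
        return False

    # Terminate a dangling last line so the new entry starts on a fresh line.
    if text and not text.endswith("\n"):
        text += "\n"

    # End the file with a newline so the next addition is a one-line diff.
    list_file.write_text(text + entry + "\n", encoding="utf-8")
    return True
```

Under these assumptions, calling `append_entry(Path(".ci/ignore_treon_win.txt"), "notebooks/hunyuan-dit-image-generation/hunyuan-dit-image-generation.ipynb")` once per list would add the path to each file, and running it again would leave the files unchanged.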