diff --git a/.github/workflows/build_documentation.yml b/.github/workflows/build_documentation.yml
index bd45b08d24f7..67229d634c91 100644
--- a/.github/workflows/build_documentation.yml
+++ b/.github/workflows/build_documentation.yml
@@ -16,7 +16,7 @@ jobs:
install_libgl1: true
package: diffusers
notebook_folder: diffusers_doc
- languages: en ko zh
+ languages: en ko zh ja
secrets:
token: ${{ secrets.HUGGINGFACE_PUSH }}
diff --git a/.github/workflows/build_pr_documentation.yml b/.github/workflows/build_pr_documentation.yml
index 18b606ca754c..f5b666ee27ff 100644
--- a/.github/workflows/build_pr_documentation.yml
+++ b/.github/workflows/build_pr_documentation.yml
@@ -15,4 +15,4 @@ jobs:
pr_number: ${{ github.event.number }}
install_libgl1: true
package: diffusers
- languages: en ko zh
+ languages: en ko zh ja
diff --git a/docs/source/ja/_toctree.yml b/docs/source/ja/_toctree.yml
new file mode 100644
index 000000000000..7af1f9f2b28d
--- /dev/null
+++ b/docs/source/ja/_toctree.yml
@@ -0,0 +1,10 @@
+- sections:
+  - local: index
+    title: 🧨 Diffusers
+  - local: quicktour
+    title: Quicktour
+  - local: stable_diffusion
+    title: Effective and efficient diffusion
+  - local: installation
+    title: Installation
+  title: Get started
\ No newline at end of file
diff --git a/docs/source/ja/index.md b/docs/source/ja/index.md
new file mode 100644
index 000000000000..6e8ba78dd55f
--- /dev/null
+++ b/docs/source/ja/index.md
@@ -0,0 +1,98 @@
+
+
+
+
+
+
+
+
+# Diffusers
+
+🤗 Diffusers is a library providing state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or want to train your own diffusion model, 🤗 Diffusers is a modular toolbox that supports both. Our library is designed with a focus on [usability over performance](conceptual/philosophy#usability-over-performance), [simple over easy](conceptual/philosophy#simple-over-easy), and [customizability over abstractions](conceptual/philosophy#tweakable-contributorfriendly-over-abstraction).
+
+The library has three main components:
+
+- State-of-the-art [diffusion pipelines](api/pipelines/overview) that can generate with just a few lines of code.
+- Interchangeable [noise schedulers](api/schedulers/overview) for balancing the trade-off between generation speed and quality.
+- Pretrained [models](api/models) that can be used as building blocks, and combined with schedulers, to create your own end-to-end diffusion systems.
+
+
+
+## Supported pipelines
+
+| Pipeline | Paper/Repository | Tasks |
+|---|---|:---:|
+| [alt_diffusion](./api/pipelines/alt_diffusion) | [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679) | Image-to-Image Text-Guided Generation |
+| [audio_diffusion](./api/pipelines/audio_diffusion) | [Audio Diffusion](https://github.com/teticio/audio-diffusion.git) | Unconditional Audio Generation |
+| [controlnet](./api/pipelines/controlnet) | [Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543) | Image-to-Image Text-Guided Generation |
+| [cycle_diffusion](./api/pipelines/cycle_diffusion) | [Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance](https://arxiv.org/abs/2210.05559) | Image-to-Image Text-Guided Generation |
+| [dance_diffusion](./api/pipelines/dance_diffusion) | [Dance Diffusion](https://github.com/williamberman/diffusers.git) | Unconditional Audio Generation |
+| [ddpm](./api/pipelines/ddpm) | [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
+| [ddim](./api/pipelines/ddim) | [Denoising Diffusion Implicit Models](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation |
+| [if](./if) | [**IF**](./api/pipelines/if) | Image Generation |
+| [if_img2img](./if) | [**IF**](./api/pipelines/if) | Image-to-Image Generation |
+| [if_inpainting](./if) | [**IF**](./api/pipelines/if) | Image-to-Image Generation |
+| [latent_diffusion](./api/pipelines/latent_diffusion) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752)| Text-to-Image Generation |
+| [latent_diffusion](./api/pipelines/latent_diffusion) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752)| Super Resolution Image-to-Image |
+| [latent_diffusion_uncond](./api/pipelines/latent_diffusion_uncond) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation |
+| [paint_by_example](./api/pipelines/paint_by_example) | [Paint by Example: Exemplar-based Image Editing with Diffusion Models](https://arxiv.org/abs/2211.13227) | Image-Guided Image Inpainting |
+| [pndm](./api/pipelines/pndm) | [Pseudo Numerical Methods for Diffusion Models on Manifolds](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation |
+| [score_sde_ve](./api/pipelines/score_sde_ve) | [Score-Based Generative Modeling through Stochastic Differential Equations](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
+| [score_sde_vp](./api/pipelines/score_sde_vp) | [Score-Based Generative Modeling through Stochastic Differential Equations](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
+| [semantic_stable_diffusion](./api/pipelines/semantic_stable_diffusion) | [Semantic Guidance](https://arxiv.org/abs/2301.12247) | Text-Guided Generation |
+| [stable_diffusion_adapter](./api/pipelines/stable_diffusion/adapter) | [**T2I-Adapter**](https://arxiv.org/abs/2302.08453) | Image-to-Image Text-Guided Generation |
+| [stable_diffusion_text2img](./api/pipelines/stable_diffusion/text2img) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Text-to-Image Generation |
+| [stable_diffusion_img2img](./api/pipelines/stable_diffusion/img2img) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Image-to-Image Text-Guided Generation |
+| [stable_diffusion_inpaint](./api/pipelines/stable_diffusion/inpaint) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Text-Guided Image Inpainting |
+| [stable_diffusion_panorama](./api/pipelines/stable_diffusion/panorama) | [MultiDiffusion](https://multidiffusion.github.io/) | Text-to-Panorama Generation |
+| [stable_diffusion_pix2pix](./api/pipelines/stable_diffusion/pix2pix) | [InstructPix2Pix: Learning to Follow Image Editing Instructions](https://arxiv.org/abs/2211.09800) | Text-Guided Image Editing|
+| [stable_diffusion_pix2pix_zero](./api/pipelines/stable_diffusion/pix2pix_zero) | [Zero-shot Image-to-Image Translation](https://pix2pixzero.github.io/) | Text-Guided Image Editing |
+| [stable_diffusion_attend_and_excite](./api/pipelines/stable_diffusion/attend_and_excite) | [Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models](https://arxiv.org/abs/2301.13826) | Text-to-Image Generation |
+| [stable_diffusion_self_attention_guidance](./api/pipelines/stable_diffusion/self_attention_guidance) | [Improving Sample Quality of Diffusion Models Using Self-Attention Guidance](https://arxiv.org/abs/2210.00939) | Text-to-Image Generation Unconditional Image Generation |
+| [stable_diffusion_image_variation](./stable_diffusion/image_variation) | [Stable Diffusion Image Variations](https://github.com/LambdaLabsML/lambda-diffusers#stable-diffusion-image-variations) | Image-to-Image Generation |
+| [stable_diffusion_latent_upscale](./stable_diffusion/latent_upscale) | [Stable Diffusion Latent Upscaler](https://twitter.com/StabilityAI/status/1590531958815064065) | Text-Guided Super Resolution Image-to-Image |
+| [stable_diffusion_model_editing](./api/pipelines/stable_diffusion/model_editing) | [Editing Implicit Assumptions in Text-to-Image Diffusion Models](https://time-diffusion.github.io/) | Text-to-Image Model Editing |
+| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-to-Image Generation |
+| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Image Inpainting |
+| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Depth-Conditional Stable Diffusion](https://github.com/Stability-AI/stablediffusion#depth-conditional-stable-diffusion) | Depth-to-Image Generation |
+| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Super Resolution Image-to-Image |
+| [stable_diffusion_safe](./api/pipelines/stable_diffusion_safe) | [Safe Stable Diffusion](https://arxiv.org/abs/2211.05105) | Text-Guided Generation |
+| [stable_unclip](./stable_unclip) | Stable unCLIP | Text-to-Image Generation |
+| [stable_unclip](./stable_unclip) | Stable unCLIP | Image-to-Image Text-Guided Generation |
+| [stochastic_karras_ve](./api/pipelines/stochastic_karras_ve) | [Elucidating the Design Space of Diffusion-Based Generative Models](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation |
+| [text_to_video_sd](./api/pipelines/text_to_video) | [Modelscope's Text-to-video-synthesis Model in Open Domain](https://modelscope.cn/models/damo/text-to-video-synthesis/summary) | Text-to-Video Generation |
+| [unclip](./api/pipelines/unclip) | [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125)(implementation by [kakaobrain](https://github.com/kakaobrain/karlo)) | Text-to-Image Generation |
+| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Text-to-Image Generation |
+| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Image Variations Generation |
+| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Dual Image and Text Guided Generation |
+| [vq_diffusion](./api/pipelines/vq_diffusion) | [Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822) | Text-to-Image Generation |
+| [stable_diffusion_ldm3d](./api/pipelines/stable_diffusion/ldm3d_diffusion) | [LDM3D: Latent Diffusion Model for 3D](https://arxiv.org/abs/2305.10853) | Text to Image and Depth Generation |
diff --git a/docs/source/ja/installation.md b/docs/source/ja/installation.md
new file mode 100644
index 000000000000..dbfd19d6cb7a
--- /dev/null
+++ b/docs/source/ja/installation.md
@@ -0,0 +1,145 @@
+
+
+# Installation
+
+You can install Diffusers for whichever deep learning library you're working with.
+
+🤗 Diffusers is tested on Python 3.8+, PyTorch 1.7.0+, and Flax. Follow the installation instructions below for the deep learning library you are using:
+
+- [PyTorch](https://pytorch.org/get-started/locally/) installation instructions.
+- [Flax](https://flax.readthedocs.io/en/latest/) installation instructions.
+
+## Install with pip
+
+It is recommended to install Diffusers in a [virtual environment](https://docs.python.org/3/library/venv.html).
+If you're unfamiliar with Python virtual environments, take a look at this [guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
+A virtual environment makes it easier to manage different projects and avoid compatibility issues between dependencies.
+
+Start by creating a virtual environment in your project directory:
+
+```bash
+python -m venv .env
+```
+
+Activate the virtual environment:
+
+```bash
+source .env/bin/activate
+```
+
+🤗 Diffusers also relies on the 🤗 Transformers library, and you can install both with the following command:
+
+
+
+```bash
+pip install diffusers["torch"] transformers
+```
+
+
+```bash
+pip install diffusers["flax"] transformers
+```
+
+
+
+## Install from source
+
+Before installing 🤗 Diffusers from source, make sure you have `torch` and 🤗 Accelerate installed.
+
+To install `torch`, refer to the `torch` [installation](https://pytorch.org/get-started/locally/#start-locally) guide.
+
+To install 🤗 Accelerate:
+
+```bash
+pip install accelerate
+```
+
+Then install 🤗 Diffusers from source with the following command:
+
+```bash
+pip install git+https://github.com/huggingface/diffusers
+```
+
+This command installs the bleeding edge `main` version rather than the latest `stable` version.
+The `main` version is useful for staying up-to-date with the latest developments,
+for instance if a bug has been fixed since the last official release but a new release hasn't been rolled out yet.
+However, this also means the `main` version may not always be stable.
+We strive to keep the `main` version operational, and most issues are usually resolved within a few hours or a day.
+If you run into a problem, please open an [Issue](https://github.com/huggingface/diffusers/issues/new/choose)!
+
+## Editable install
+
+You will need an editable install if you'd like to:
+
+* Use the `main` version of the source code.
+* Test code changes while contributing to 🤗 Diffusers.
+
+Clone the repository and install 🤗 Diffusers with the following commands:
+
+```bash
+git clone https://github.com/huggingface/diffusers.git
+cd diffusers
+```
+
+
+
+```bash
+pip install -e ".[torch]"
+```
+
+
+```bash
+pip install -e ".[flax]"
+```
+
+
+
+These commands link the folder you cloned the repository to with your Python library paths.
+Python will now look inside the folder you cloned to, in addition to the normal library paths.
+For example, if your Python packages are typically installed in `~/anaconda3/envs/main/lib/python3.8/site-packages/`, Python will also search the `~/diffusers/` folder you cloned to.
+
+
+
+You must keep the `diffusers` folder if you want to keep using the library.
+
+
+
+Now you can easily update your clone to the latest version of 🤗 Diffusers with the following command:
+
+```bash
+cd ~/diffusers/
+git pull
+```
+
+Your Python environment will find the `main` version of 🤗 Diffusers on the next run.
+
+## Notice on telemetry logging
+
+Our library gathers telemetry information during `from_pretrained()` requests.
+The data gathered includes the version of Diffusers and PyTorch/Flax, and the requested model or pipeline class.
+It also includes the path to a pretrained checkpoint if it is hosted on the Hub.
+This usage data helps us debug issues and prioritize new features.
+Telemetry is only sent when loading models and pipelines from the HuggingFace Hub, and it is not collected during local usage.
+
+We understand that not everyone wants to share additional information, and we respect your privacy.
+You can disable telemetry collection by setting the `DISABLE_TELEMETRY` environment variable from your terminal:
+
+On Linux/MacOS:
+```bash
+export DISABLE_TELEMETRY=YES
+```
+
+On Windows:
+```bash
+set DISABLE_TELEMETRY=YES
+```
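As a rough sketch of how such an opt-out can be honored (the helper below is hypothetical, not the actual Diffusers implementation), a library can check the environment variable before sending anything:

```python
import os

def telemetry_enabled() -> bool:
    # Hypothetical helper: treat common truthy values of DISABLE_TELEMETRY as an opt-out.
    return os.environ.get("DISABLE_TELEMETRY", "").upper() not in ("1", "YES", "TRUE", "ON")

os.environ["DISABLE_TELEMETRY"] = "YES"
print(telemetry_enabled())  # False: nothing would be sent
```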
diff --git a/docs/source/ja/quicktour.md b/docs/source/ja/quicktour.md
new file mode 100644
index 000000000000..04c93af4168c
--- /dev/null
+++ b/docs/source/ja/quicktour.md
@@ -0,0 +1,316 @@
+
+
+[[open-in-colab]]
+
+# Quicktour
+
+Diffusion models are trained to denoise random Gaussian noise step-by-step to generate a sample of interest, such as an image or audio. This has sparked a tremendous amount of interest in generative AI, and you have probably seen examples of diffusion-generated images on the internet. 🧨 Diffusers is a library aimed at making diffusion models broadly accessible to everyone.
+
+Whether you're a developer or an everyday user, this quicktour will introduce you to 🧨 Diffusers and help you get up and generating quickly! There are three main components of the library to know about:
+
+* The [`DiffusionPipeline`] is a high-level end-to-end class designed to rapidly generate samples from pretrained diffusion models.
+* Popular pretrained [model](./api/models) architectures and modules that can be used as building blocks for creating diffusion systems.
+* Many different [schedulers](./api/schedulers/overview) - algorithms that control how noise is added for training, and how denoised images are generated during inference.
+
+This quicktour will show you how to use the [`DiffusionPipeline`] for inference, and then walk you through how to combine a model and a scheduler to replicate what's happening inside the [`DiffusionPipeline`].
+
+
+
+This quicktour is a simplified version of the introductory 🧨 Diffusers [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/diffusers_intro.ipynb) to help you get started quickly. If you want to learn more about 🧨 Diffusers' goals, design philosophy, and additional details about its core API, check out the notebook!
+
+
+
+Before you begin, make sure you have all the necessary libraries installed:
+
+```py
+# uncomment to install the necessary libraries in Colab
+#!pip install --upgrade diffusers accelerate transformers
+```
+
+- [🤗 Accelerate](https://huggingface.co/docs/accelerate/index) speeds up model loading for inference and training.
+- [🤗 Transformers](https://huggingface.co/docs/transformers/index) is required to run the most popular diffusion models, such as [Stable Diffusion](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/overview).
+
+## DiffusionPipeline
+
+The [`DiffusionPipeline`] is the easiest way to use a pretrained diffusion system for inference. It is an end-to-end system containing the model and the scheduler. You can use the [`DiffusionPipeline`] out-of-the-box for many tasks. For a complete list of supported tasks, check out the [🧨 Diffusers Summary](./api/pipelines/overview#diffusers-summary) table.
+
+| **Task**                     | **Description**                                                                                              | **Pipeline**    |
+|------------------------------|--------------------------------------------------------------------------------------------------------------|-----------------|
+| Unconditional Image Generation | generate an image from Gaussian noise | [unconditional_image_generation](./using-diffusers/unconditional_image_generation) |
+| Text-Guided Image Generation | generate an image from a text prompt | [conditional_image_generation](./using-diffusers/conditional_image_generation) |
+| Text-Guided Image-to-Image Translation | generate a new image from an image and a text prompt | [img2img](./using-diffusers/img2img) |
+| Text-Guided Image-Inpainting | fill in the masked part of an image, given the image, a mask, and a text prompt | [inpaint](./using-diffusers/inpaint) |
+| Text-Guided Depth-to-Image Translation | generate an image from a text prompt while preserving structure via depth estimation | [depth2img](./using-diffusers/depth2img) |
+
+Start by creating an instance of a [`DiffusionPipeline`] and specify which pipeline checkpoint you would like to download.
+You can use the [`DiffusionPipeline`] for any [checkpoint](https://huggingface.co/models?library=diffusers&sort=downloads) stored on the Hugging Face Hub.
+In this quicktour, you'll load the [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) checkpoint to generate images from text.
+
+
+
+For [Stable Diffusion] models, please carefully read the [license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) before running the model. 🧨 Diffusers implements a [`safety_checker`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/safety_checker.py) to prevent offensive or harmful content, but the model's improved image generation capabilities can still produce potentially harmful content.
+
+
+
+Load the model with the [`~DiffusionPipeline.from_pretrained`] method:
+
+```python
+>>> from diffusers import DiffusionPipeline
+
+>>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_safetensors=True)
+```
+The [`DiffusionPipeline`] downloads and caches all modeling, tokenization, and scheduling components. You'll see that the Stable Diffusion pipeline is composed of the [`UNet2DConditionModel`] and [`PNDMScheduler`], among other things:
+
+```py
+>>> pipeline
+StableDiffusionPipeline {
+ "_class_name": "StableDiffusionPipeline",
+ "_diffusers_version": "0.13.1",
+ ...,
+ "scheduler": [
+ "diffusers",
+ "PNDMScheduler"
+ ],
+ ...,
+ "unet": [
+ "diffusers",
+ "UNet2DConditionModel"
+ ],
+ "vae": [
+ "diffusers",
+ "AutoencoderKL"
+ ]
+}
+```
+
+We strongly recommend running the pipeline on a GPU because the model consists of roughly 1.4 billion parameters.
+You can move the generator object to a GPU, just as you would with any PyTorch module:
+
+```python
+>>> pipeline.to("cuda")
+```
+
+Now you can pass a text prompt to the `pipeline` to generate an image, and then access the denoised image. By default, the image output is wrapped in a [`PIL.Image`](https://pillow.readthedocs.io/en/stable/reference/Image.html?highlight=image#the-image-class) object.
+
+```python
+>>> image = pipeline("An image of a squirrel in Picasso style").images[0]
+>>> image
+```
+
+
+

+
+
+You can save the image with the `save` function:
+
+```python
+>>> image.save("image_of_squirrel_painting.png")
+```
+
+### Local pipeline
+
+You can also use the pipeline locally. The only difference is you need to download the weights first:
+
+```bash
+!git lfs install
+!git clone https://huggingface.co/runwayml/stable-diffusion-v1-5
+```
+
+Then load the saved weights into the pipeline:
+
+```python
+>>> pipeline = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5", use_safetensors=True)
+```
+
+Now you can run the pipeline just as you did in the section above.
+
+### Swapping schedulers
+
+Different schedulers come with different denoising speeds and quality trade-offs. The best way to find out which one works best for you is to try them out! One of the main features of 🧨 Diffusers is the ability to easily switch between schedulers. For example, to replace the default [`PNDMScheduler`] with the [`EulerDiscreteScheduler`], load it with the [`~diffusers.ConfigMixin.from_config`] method:
+
+```py
+>>> from diffusers import EulerDiscreteScheduler
+
+>>> pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_safetensors=True)
+>>> pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)
+```
+
+Try generating an image with the new scheduler and see if you notice a difference!
+
+In the next section, you'll take a closer look at the components - the model and the scheduler - that make up the [`DiffusionPipeline`], and learn how to use these components to generate an image of a cat.
+
+## Models
+
+Most models take a noisy sample and, at each timestep, predict the *remaining noise* (other models learn to predict the previous sample directly, or the velocity or [`v-prediction`](https://github.com/huggingface/diffusers/blob/5e5ce13e2f89ac45a0066cb3f369462a3cf1d9ef/src/diffusers/schedulers/scheduling_ddim.py#L110)). You can mix and match models to create other diffusion systems.
+
+Models are initialized with the [`~ModelMixin.from_pretrained`] method, which also locally caches the model so it's faster the next time you load it. For this quicktour, you'll load the [`UNet2DModel`], a basic image generation model with a checkpoint trained on cat images:
+
+```py
+>>> from diffusers import UNet2DModel
+
+>>> repo_id = "google/ddpm-cat-256"
+>>> model = UNet2DModel.from_pretrained(repo_id, use_safetensors=True)
+```
+
+To access the model parameters, call `model.config`:
+
+```py
+>>> model.config
+```
+
+The model configuration is a 🧊 frozen 🧊 dictionary, which means those parameters can't be changed after the model is created. This is intentional and ensures that the parameters used to define the model architecture at the start remain the same, while other parameters can still be adjusted during inference.
+
+Some of the most important parameters are:
+
+* `sample_size`: the height and width dimension of the input sample.
+* `in_channels`: the number of input channels of the input sample.
+* `down_block_types` and `up_block_types`: the types of down- and upsampling blocks used to create the UNet architecture.
+* `block_out_channels`: the number of output channels of the downsampling blocks; also used, in reverse order, for the number of input channels of the upsampling blocks.
+* `layers_per_block`: the number of ResNet blocks present in each UNet block.
+
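To make the roles of these parameters concrete, here is a plain-Python sketch (not the real `UNet2DModel` config; the values are illustrative) showing how such a frozen configuration could be represented, and how the reversed `block_out_channels` would feed the upsampling path:

```python
from types import MappingProxyType

# Hypothetical, read-only config mirroring the parameters described above.
config = MappingProxyType({
    "sample_size": 256,
    "in_channels": 3,
    "down_block_types": ("DownBlock2D", "AttnDownBlock2D"),
    "up_block_types": ("AttnUpBlock2D", "UpBlock2D"),
    "block_out_channels": (128, 256),
    "layers_per_block": 2,
})

# Upsampling blocks consume the downsampling channels in reverse order.
up_in_channels = tuple(reversed(config["block_out_channels"]))
print(up_in_channels)  # (256, 128)
```

Trying to assign into the `MappingProxyType` raises a `TypeError`, which is the "frozen" behavior the text describes.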
+To use the model for inference, create an image shape filled with random Gaussian noise. It should have a `batch` axis because the model can receive multiple random noises, a `channel` axis corresponding to the number of input channels, and a `sample_size` axis for the height and width of the image:
+
+```py
+>>> import torch
+
+>>> torch.manual_seed(0)
+
+>>> noisy_sample = torch.randn(1, model.config.in_channels, model.config.sample_size, model.config.sample_size)
+>>> noisy_sample.shape
+torch.Size([1, 3, 256, 256])
+```
+
+For inference, pass the noisy image and a `timestep` to the model. The `timestep` indicates how noisy the input image is, which helps the model determine its position in the diffusion process. Use the `sample` method to get the model output:
+
+```py
+>>> with torch.no_grad():
+... noisy_residual = model(sample=noisy_sample, timestep=2).sample
+```
+
+To generate actual examples though, you'll need a scheduler to guide the denoising process. In the next section, you'll learn how to couple a model with a scheduler.
+
+## Schedulers
+
+Schedulers manage going from a noisy sample to a less noisy sample given the model output - in this case, the `noisy_residual`.
+
+
+
+
+🧨 Diffusers is a toolbox for building diffusion systems. While the [`DiffusionPipeline`] is a convenient way to get started with a pre-built diffusion system, you can also choose your own model and scheduler components separately to build a custom diffusion system.
+
+
+
+For this quicktour, you'll instantiate the [`DDPMScheduler`] with the [`~diffusers.ConfigMixin.from_config`] method:
+
+```py
+>>> from diffusers import DDPMScheduler
+
+>>> scheduler = DDPMScheduler.from_config(repo_id)
+>>> scheduler
+DDPMScheduler {
+ "_class_name": "DDPMScheduler",
+ "_diffusers_version": "0.13.1",
+ "beta_end": 0.02,
+ "beta_schedule": "linear",
+ "beta_start": 0.0001,
+ "clip_sample": true,
+ "clip_sample_range": 1.0,
+ "num_train_timesteps": 1000,
+ "prediction_type": "epsilon",
+ "trained_betas": null,
+ "variance_type": "fixed_small"
+}
+```
+
+
+
+💡 Notice how the scheduler is instantiated from a configuration. Unlike a model, a scheduler does not have trainable weights and is parameter-free!
+
+
+
+Some of the most important parameters are:
+
+* `num_train_timesteps`: the length of the denoising process, or in other words, the number of timesteps required to process random Gaussian noise into a data sample.
+* `beta_schedule`: the type of noise schedule to use for inference and training.
+* `beta_start` and `beta_end`: the start and end noise values for the noise schedule.
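As an illustration of what `beta_start`, `beta_end`, and `num_train_timesteps` describe, a `"linear"` beta schedule can be sketched in plain Python (a simplified stand-in for what the scheduler computes internally, using the configuration values printed above):

```python
def linear_betas(beta_start=0.0001, beta_end=0.02, num_train_timesteps=1000):
    # Interpolate linearly from beta_start to beta_end over the training timesteps.
    step = (beta_end - beta_start) / (num_train_timesteps - 1)
    return [beta_start + i * step for i in range(num_train_timesteps)]

betas = linear_betas()
print(len(betas), betas[0])  # 1000 values, starting at beta_start and ending near beta_end
```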
+
+To predict a slightly less noisy image, pass the following to the scheduler's [`~diffusers.DDPMScheduler.step`] method: the model output, the `timestep`, and the current `sample`.
+```py
+>>> less_noisy_sample = scheduler.step(model_output=noisy_residual, timestep=2, sample=noisy_sample).prev_sample
+>>> less_noisy_sample.shape
+```
+
+The `less_noisy_sample` can be passed to the next `timestep`, where it'll get even less noisy!
+
+Let's bring it all together now and visualize the entire denoising process.
+
+First, create a function that post-processes and displays the denoised image as a `PIL.Image`:
+
+```py
+>>> import PIL.Image
+>>> import numpy as np
+
+
+>>> def display_sample(sample, i):
+... image_processed = sample.cpu().permute(0, 2, 3, 1)
+... image_processed = (image_processed + 1.0) * 127.5
+... image_processed = image_processed.numpy().astype(np.uint8)
+
+... image_pil = PIL.Image.fromarray(image_processed[0])
+... display(f"Image at step {i}")
+... display(image_pil)
+```
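The post-processing above maps model outputs from the `[-1, 1]` range into `[0, 255]` pixel values; the same arithmetic for a single value, in plain Python without torch, looks like:

```python
def to_pixel(value: float) -> int:
    # Map a sample value in [-1, 1] to an 8-bit pixel value in [0, 255],
    # mirroring the (sample + 1.0) * 127.5 step used above.
    return int((value + 1.0) * 127.5)

print([to_pixel(v) for v in (-1.0, 0.0, 1.0)])  # [0, 127, 255]
```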
+
+To speed up the denoising process, move the input and the model to a GPU:
+
+```py
+>>> model.to("cuda")
+>>> noisy_sample = noisy_sample.to("cuda")
+```
+
+Now create a denoising loop that predicts the residual of the less noisy sample, and computes the less noisy sample with the scheduler:
+
+```py
+>>> import tqdm
+
+>>> sample = noisy_sample
+
+>>> for i, t in enumerate(tqdm.tqdm(scheduler.timesteps)):
+... # 1. predict noise residual
+... with torch.no_grad():
+... residual = model(sample, t).sample
+
+... # 2. compute less noisy image and set x_t -> x_t-1
+... sample = scheduler.step(residual, t, sample).prev_sample
+
+... # 3. optionally look at image
+... if (i + 1) % 50 == 0:
+... display_sample(sample, i + 1)
+```
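Conceptually, each iteration of the loop above removes part of the predicted noise from the current sample. A toy scalar version of that idea (not the actual DDPM update rule, which also involves the beta schedule and added variance) might look like:

```python
# Toy illustration: a "model" that predicts the remaining noise exactly,
# and a loop that removes half of it at every step.
sample = 5.0   # noisy starting value
noise = 5.0    # the noise component the model learns to predict

for t in range(10):
    predicted_noise = noise                   # stand-in for model(sample, t)
    sample = sample - 0.5 * predicted_noise   # stand-in for scheduler.step(...)
    noise = noise * 0.5                       # the remaining noise shrinks each step

print(round(sample, 4))  # 0.0049 - the sample approaches 0 as the noise is removed
```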
+
+Sit back and watch as a cat is generated from nothing but noise! 😻
+
+
+

+
+
+## Next steps
+
+Hopefully you generated some cool images with 🧨 Diffusers in this quicktour! For your next steps, you can:
+
+* Train or finetune a model in the [training](./tutorials/basic_training) tutorial.
+* See official and community examples of [training or finetuning scripts](https://github.com/huggingface/diffusers/tree/main/examples#-diffusers-examples) for a variety of use cases.
+* Learn more about loading, accessing, changing, and comparing schedulers in the [Using different Schedulers](./using-diffusers/schedulers) guide.
+* Explore prompt engineering, speed and memory optimizations, and tips and tricks for generating higher-quality images in the [Stable Diffusion](./stable_diffusion) guide.
+* Dive deeper into speeding up 🧨 Diffusers with guides on optimized [PyTorch on a GPU](./optimization/fp16), [Stable Diffusion on Apple Silicon (M1/M2)](./optimization/mps), and [ONNX Runtime](./optimization/onnx).
diff --git a/docs/source/ja/stable_diffusion.md b/docs/source/ja/stable_diffusion.md
new file mode 100644
index 000000000000..fb5afc49435b
--- /dev/null
+++ b/docs/source/ja/stable_diffusion.md
@@ -0,0 +1,260 @@
+
+
+# Effective and efficient diffusion
+
+[[open-in-colab]]
+
+Getting the [`DiffusionPipeline`] to generate images in a certain style, or to include exactly what you want, can be tricky. Often, you have to run the [`DiffusionPipeline`] several times before you end up with an image you're happy with. But generating something out of nothing is a computationally intensive process, especially if you're running inference over and over again.
+
+This is why it's important to get the most *computational* (speed) and *memory* (GPU RAM) efficiency from the pipeline, to reduce the time between inference cycles so you can iterate faster.
+
+This tutorial walks you through how to generate faster and better with the [`DiffusionPipeline`].
+
+Begin by loading the [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) model:
+
+```python
+from diffusers import DiffusionPipeline
+
+model_id = "runwayml/stable-diffusion-v1-5"
+pipeline = DiffusionPipeline.from_pretrained(model_id, use_safetensors=True)
+```
+
+The example prompt used here is a portrait photo of an old warrior chief, but feel free to use your own prompt:
+
+```python
+prompt = "portrait photo of a old warrior chief"
+```
+
+## Speed
+
+
+
+💡 If you don't have access to a GPU, you can use one for free from a GPU provider like [Colab](https://colab.research.google.com/)!
+
+
+
+One of the simplest ways to speed up inference is to place the pipeline on a GPU, just as you would with any PyTorch module:
+
+```python
+pipeline = pipeline.to("cuda")
+```
+
+To make sure you can use the same image and improve on it, use a [`Generator`](https://pytorch.org/docs/stable/generated/torch.Generator.html) and set a seed for [reproducibility](./using-diffusers/reproducibility):
+
+```python
+import torch
+
+generator = torch.Generator("cuda").manual_seed(0)
+```
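The same principle, shown with Python's built-in `random` module rather than torch, illustrates why fixing a seed makes results repeatable:

```python
import random

# Two sources of randomness seeded identically produce identical sequences -
# the same property that makes a seeded torch.Generator give reproducible images.
run1 = random.Random(0)
run2 = random.Random(0)

print([run1.random() for _ in range(3)] == [run2.random() for _ in range(3)])  # True
```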
+
+Now you can generate an image:
+
+```python
+image = pipeline(prompt, generator=generator).images[0]
+image
+```
+
+
+

+
+
+This process took ~30 seconds on a T4 GPU (it might be faster if your allocated GPU is better than a T4). By default, the [`DiffusionPipeline`] runs inference with full `float32` precision for 50 inference steps. You can speed this up by switching to a lower precision like `float16`, or by running fewer inference steps.
+
+Start by loading the model in `float16` and generating an image:
+
+```python
+import torch
+
+pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, use_safetensors=True)
+pipeline = pipeline.to("cuda")
+generator = torch.Generator("cuda").manual_seed(0)
+image = pipeline(prompt, generator=generator).images[0]
+image
+```
+
+
+

+
+
+This time, it only took ~11 seconds to generate the image, which is almost 3x faster than before!
+
+
+
+💡 We strongly suggest always running your pipelines in `float16`.
+
+
+
+Another option is to reduce the number of inference steps. Choosing a more efficient scheduler can help decrease the number of steps without sacrificing output quality. You can find the schedulers that are compatible with the current model in the [`DiffusionPipeline`] by calling the `compatibles` method:
+
+```python
+pipeline.scheduler.compatibles
+[
+ diffusers.schedulers.scheduling_lms_discrete.LMSDiscreteScheduler,
+ diffusers.schedulers.scheduling_unipc_multistep.UniPCMultistepScheduler,
+ diffusers.schedulers.scheduling_k_dpm_2_discrete.KDPM2DiscreteScheduler,
+ diffusers.schedulers.scheduling_deis_multistep.DEISMultistepScheduler,
+ diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler,
+ diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler,
+ diffusers.schedulers.scheduling_ddpm.DDPMScheduler,
+ diffusers.schedulers.scheduling_dpmsolver_singlestep.DPMSolverSinglestepScheduler,
+ diffusers.schedulers.scheduling_k_dpm_2_ancestral_discrete.KDPM2AncestralDiscreteScheduler,
+ diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler,
+ diffusers.schedulers.scheduling_pndm.PNDMScheduler,
+ diffusers.schedulers.scheduling_euler_ancestral_discrete.EulerAncestralDiscreteScheduler,
+ diffusers.schedulers.scheduling_ddim.DDIMScheduler,
+]
+```
+
+The Stable Diffusion model uses the [`PNDMScheduler`] by default, which usually requires ~50 inference steps, but more performant schedulers like the [`DPMSolverMultistepScheduler`] only require ~20 or 25 inference steps. Use the [`ConfigMixin.from_config`] method to load a new scheduler:
+
+```python
+from diffusers import DPMSolverMultistepScheduler
+
+pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
+```
+
+Now set `num_inference_steps` to 20:
+
+```python
+generator = torch.Generator("cuda").manual_seed(0)
+image = pipeline(prompt, generator=generator, num_inference_steps=20).images[0]
+image
+```
+
+
+

+
+
+Great, you've managed to cut the inference time down to just 4 seconds! ⚡️
+
+## Memory
+
+The other key to improving pipeline performance is consuming less memory. The easiest way to see how many images you can generate at once is to try out different batch sizes until you get an `OutOfMemoryError` (OOM).
+
+Create a function that generates a batch of images from a prompt and a list of `Generators`. Assign each `Generator` a seed so you can reuse it if it produces a good result.
+
+```python
+def get_inputs(batch_size=1):
+ generator = [torch.Generator("cuda").manual_seed(i) for i in range(batch_size)]
+ prompts = batch_size * [prompt]
+ num_inference_steps = 20
+
+ return {"prompt": prompts, "generator": generator, "num_inference_steps": num_inference_steps}
+```
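+
+The reason for seeding each `Generator` is reproducibility: the same seed always produces the same noise, so a good image can be regenerated exactly. The same idea with Python's own RNG; `torch.Generator("cuda").manual_seed(i)` behaves analogously:
+
+```python
+import random
+
+def noise(seed, n=3):
+    # Deterministic sequence of pseudo-random values for a given seed.
+    rng = random.Random(seed)
+    return [rng.random() for _ in range(n)]
+
+assert noise(0) == noise(0)  # same seed -> identical values
+assert noise(0) != noise(1)  # different seeds -> different values
+```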
+
+Start with `batch_size=4` and see how much memory is consumed:
+
+```python
+from diffusers.utils import make_image_grid
+
+images = pipeline(**get_inputs(batch_size=4)).images
+make_image_grid(images, 2, 2)
+```
+
+Unless you have a GPU with a large amount of vRAM, the code above probably returned an `OOM` error! Most of the memory is taken up by the cross-attention layers. Instead of running this operation in a batch, running it sequentially saves a significant amount of memory. All you have to do is use the [`~DiffusionPipeline.enable_attention_slicing`] function:
+
+```python
+pipeline.enable_attention_slicing()
+```
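+
+Why slicing is safe: the sliced computation produces exactly the same values as the batched one, it just materializes fewer of them at a time. A toy illustration in plain Python, with an arbitrary pairwise `score` function standing in for the attention computation:
+
+```python
+def pairwise(score, queries, keys):
+    # All rows at once: the intermediate is len(queries) * len(keys) entries.
+    return [[score(q, k) for k in keys] for q in queries]
+
+def pairwise_sliced(score, queries, keys, slice_size):
+    # Same result, computed only a few query rows at a time.
+    rows = []
+    for i in range(0, len(queries), slice_size):
+        rows.extend(pairwise(score, queries[i:i + slice_size], keys))
+    return rows
+
+q, k = [1, 2, 3, 4], [10, 20]
+score = lambda a, b: a * b
+assert pairwise(score, q, k) == pairwise_sliced(score, q, k, slice_size=2)
+```
+
+In real attention each slice's intermediate is consumed immediately, so peak memory shrinks with the slice size.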
+
+Now try increasing the `batch_size` to 8!
+
+```python
+images = pipeline(**get_inputs(batch_size=8)).images
+make_image_grid(images, rows=2, cols=4)
+```
+
+
+<!-- image: batch of 8 generated images with attention slicing enabled -->
+
+Whereas before you couldn't even generate a batch of 4 images, now you can generate a batch of 8 images at ~3.5 seconds per image! This is probably the fastest you can go on a T4 GPU without sacrificing quality.
+
+## Quality
+
+In the previous two sections, you learned how to optimize the speed of your pipeline with `fp16`, reduce the number of generation steps with a more performant scheduler, and enable attention slicing to reduce memory consumption. Now let's focus on how to improve the quality of the generated images.
+
+### Better checkpoints
+
+The most obvious step is to use a better checkpoint. The Stable Diffusion model is a good starting point, and several improved versions have been released since its official launch. However, using a newer version doesn't automatically mean you'll get better results. You'll still have to experiment with different checkpoints yourself and do a little research (such as using [negative prompts](https://minimaxir.com/2022/11/stable-diffusion-negative-prompt/)) to get the best results.
+
+As the field grows, there are more and more high-quality checkpoints finetuned to produce particular styles. Try exploring the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) and the [Diffusers Gallery](https://huggingface.co/spaces/huggingface-projects/diffusers-gallery) to find one you're interested in!
+
+### Better pipeline components
+
+You can also try replacing the current pipeline components with a newer version. Let's load the latest [autoencoder](https://huggingface.co/stabilityai/stable-diffusion-2-1/tree/main/vae) from Stability AI into the pipeline and generate some images:
+
+```python
+from diffusers import AutoencoderKL
+
+vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16).to("cuda")
+pipeline.vae = vae
+images = pipeline(**get_inputs(batch_size=8)).images
+make_image_grid(images, rows=2, cols=4)
+```
+
+
+<!-- image: batch of images generated with the updated VAE -->
+
+### Better prompt engineering
+
+The text prompt you use to generate an image is so important that the craft of writing it is called *prompt engineering*. Some points to consider during prompt engineering are:
+
+- How is the image, or an image similar to the one I want to generate, stored on the internet?
+- What additional detail can I give to steer the model towards the style I want?
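+
+One way to act on these points systematically is to keep the subject and the style details separate and join them only when generating. A small helper sketch; the keywords are illustrative:
+
+```python
+def build_prompt(subject, details):
+    # Join the subject with comma-separated style keywords.
+    return ", ".join([subject] + details)
+
+details = ["tribal panther make up", "side profile", "50mm portrait photography"]
+print(build_prompt("portrait photo of a warrior chief", details))
+# portrait photo of a warrior chief, tribal panther make up, side profile, 50mm portrait photography
+```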
+
+With this in mind, let's improve the prompt to include color and higher-quality details:
+
+```python
+prompt += ", tribal panther make up, blue on red, side profile, looking away, serious eyes"
+prompt += " 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta"
+```
+
+Generate a batch of images with the new prompt:
+
+```python
+images = pipeline(**get_inputs(batch_size=8)).images
+make_image_grid(images, rows=2, cols=4)
+```
+
+
+<!-- image: batch of images generated with the improved prompt -->
+
+Pretty impressive! Let's tweak the second image, corresponding to the `Generator` with seed `1`, a bit more by adding some text about the age of the subject:
+
+```python
+prompts = [
+ "portrait photo of the oldest warrior chief, tribal panther make up, blue on red, side profile, looking away, serious eyes 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta",
+ "portrait photo of a old warrior chief, tribal panther make up, blue on red, side profile, looking away, serious eyes 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta",
+ "portrait photo of a warrior chief, tribal panther make up, blue on red, side profile, looking away, serious eyes 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta",
+ "portrait photo of a young warrior chief, tribal panther make up, blue on red, side profile, looking away, serious eyes 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta",
+]
+
+generator = [torch.Generator("cuda").manual_seed(1) for _ in range(len(prompts))]
+images = pipeline(prompt=prompts, generator=generator, num_inference_steps=25).images
+make_image_grid(images, 2, 2)
+```
+
+
+<!-- image: 2x2 grid of images with age-varied prompts -->
+
+## Next steps
+
+In this tutorial, you learned how to optimize a [`DiffusionPipeline`] for computational and memory efficiency, as well as how to improve the quality of the generated outputs. If you're interested in making your pipeline even faster, take a look at the following resources:
+
+- Learn how [PyTorch 2.0](./optimization/torch2.0) and [`torch.compile`](https://pytorch.org/docs/stable/generated/torch.compile.html) can yield 5 - 300% faster inference speed. On an A100 GPU, image generation can be up to 50% faster!
+- If you can't use PyTorch 2, we recommend installing [xFormers](./optimization/xformers). Its memory-efficient attention mechanism works well with PyTorch 1.13.1 for faster speed and reduced memory consumption.
+- Other optimization techniques, such as model offloading, are covered in [this guide](./optimization/fp16).