[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ChenyangQiQi/FateZero/blob/main/colab_fatezero.ipynb)

# FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

[Chenyang Qi](https://chenyangqiqi.github.io/), [Xiaodong Cun](http://vinthony.github.io/), [Yong Zhang](https://yzhang2016.github.io), [Chenyang Lei](https://chenyanglei.github.io/), [Xintao Wang](https://xinntao.github.io/), [Ying Shan](https://scholar.google.com/citations?hl=zh-CN&user=4oXBp9UAAAAJ), and [Qifeng Chen](https://cqf.io)


[![Project Website](https://img.shields.io/badge/Project-Website-orange)](https://fate-zero-edit.github.io/)
[![arXiv](https://img.shields.io/badge/arXiv-2303.09535-b31b1b.svg)](https://arxiv.org/abs/2303.09535)
[![GitHub](https://img.shields.io/github/stars/ChenyangQiQi/FateZero?style=social)](https://github.com/ChenyangQiQi/FateZero)

In [1]:
#@markdown Check type of GPU and VRAM available.
!nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv,noheader

Tesla T4, 15360 MiB, 15101 MiB
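
The CSV line printed above can also be read programmatically, for example to log how much VRAM is free before launching an edit. The following is a minimal sketch, not part of FateZero; `parse_gpu_line` is a hypothetical helper and assumes exactly the three fields queried above.

```python
def parse_gpu_line(line):
    """Parse one line of `nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv,noheader`."""
    name, total, free = [field.strip() for field in line.split(",")]

    def to_mib(field):
        # Memory fields look like "15360 MiB"; keep only the integer part.
        return int(field.split()[0])

    return {"name": name, "total_mib": to_mib(total), "free_mib": to_mib(free)}


# The line reported by the cell above.
gpu = parse_gpu_line("Tesla T4, 15360 MiB, 15101 MiB")
print(gpu["name"], gpu["free_mib"], "MiB free of", gpu["total_mib"])
```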


In [2]:
#@title Install requirements

!git clone https://github.com/ChenyangQiQi/FateZero /content/FateZero
%cd /content/FateZero
# %pip install -r requirements.txt
%pip install -q -U --pre triton
%pip install -q diffusers[torch]==0.11.1 transformers==4.26.0 bitsandbytes==0.35.4 \
decord accelerate omegaconf einops ftfy gradio imageio-ffmpeg xformers

Cloning into '/content/FateZero'...
remote: Enumerating objects: 1602, done.
remote: Counting objects: 100% (130/130), done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 1602 (delta 120), reused 106 (delta 104), pack-reused 1472
Receiving objects: 100% (1602/1602), 204.94 MiB | 28.43 MiB/s, done.
Resolving deltas: 100% (541/541), done.
/content/FateZero
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torch 2.1.0+cu121 requires triton==2.1.0, but you have triton 2.2.0 which is incompatible.

In [3]:
#@title Download pretrained model

#@markdown Name/Path of the initial model.
MODEL_NAME = "CompVis/stable-diffusion-v1-4" #@param {type:"string"}

#@markdown Whether the model should be downloaded from a remote repo. Untick it if the model is loaded from a local path.
download_pretrained_model = True #@param {type:"boolean"}
if download_pretrained_model:
    !git lfs install
    !git clone https://huggingface.co/$MODEL_NAME ckpt/$MODEL_NAME
    MODEL_NAME = f"./ckpt/{MODEL_NAME}"
print(f"[*] MODEL_NAME={MODEL_NAME}")

Updated git hooks.
Git LFS initialized.
Cloning into 'ckpt/CompVis/stable-diffusion-v1-4'...
remote: Enumerating objects: 768, done.
remote: Total 768 (delta 0), reused 0 (delta 0), pack-reused 768
Receiving objects: 100% (768/768), 685.42 KiB | 23.63 MiB/s, done.
Resolving deltas: 100% (140/140), done.
Filtering content: 100% (18/18), 21.71 GiB | 98.72 MiB/s, done.
[*] MODEL_NAME=./ckpt/CompVis/stable-diffusion-v1-4
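
A large `git lfs` clone can be interrupted, so it may help to confirm that the checkpoint folder looks complete before editing. The sketch below is not part of FateZero. The list of expected entries is an assumption based on the usual diffusers layout of a Stable Diffusion checkpoint, and `missing_checkpoint_entries` is a hypothetical helper.

```python
import os

# Entries usually present in a diffusers-format Stable Diffusion checkpoint.
# This list is an assumption, not taken from the FateZero code.
EXPECTED_ENTRIES = ["model_index.json", "scheduler", "text_encoder", "tokenizer", "unet", "vae"]


def missing_checkpoint_entries(model_dir, expected=EXPECTED_ENTRIES):
    """Return the expected files or folders that are absent from model_dir."""
    return [name for name in expected if not os.path.exists(os.path.join(model_dir, name))]


# Path printed by the download cell above.
missing = missing_checkpoint_entries("./ckpt/CompVis/stable-diffusion-v1-4")
print("missing entries:", missing)
```

An empty list suggests the clone finished. Any listed entry is a hint to re-run the download cell.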


# **Usage**


## FateZero Edit with low resource cost


In [15]:
#@markdown Edit config

#@markdown More details of the configuration will be given soon.

from omegaconf import OmegaConf

VIDEO_FILE = 'data/teaser_car-turn' #@param {type:"string"}

VIDEO_ID = VIDEO_FILE.split('/')[-1]

RESULT_DIR = 'result/'+VIDEO_ID

CONFIG_NAME = "config/"+VIDEO_ID+".yaml"

source_prompt = "a silver jeep driving down a curvy road in the countryside" #@param {type:"string"}
edit_prompt = "an infrared video of a yellow taxi driving down a curvy road in the countryside"  #@param {type:"string"}
EMPHYSIS_WORD = "infrared" #@param {type:"string"}
EMPHYSIS_VALUE = 10 #@param {type:"number"}
video_length = 8 #@param {type:"number"}
INVERSION_STEP = 8 #@param {type:"number"}
REPLACE_STRENGTH = 0.8 #@param {type:"slider", min:0, max:1, step:0.1}
STORE_ATTENTION_ON_disk = False #@param {type:"boolean"}
width = 512
height = 512

config = {
  "pretrained_model_path": MODEL_NAME,
  "logdir": RESULT_DIR,
  "dataset_config": {
    "path": VIDEO_FILE,
    "prompt": source_prompt,
    "n_sample_frame": video_length,
    "sampling_rate": 1,
    "stride": 80,
    "offset":
    {
        "left": 0,
        "right": 0,
        "top": 0,
        "bottom": 0,
    }
  },
  "editing_config":{
      "use_invertion_latents": True,
      "use_inversion_attention": True,
      "guidance_scale": 7.5,
      "editing_prompts":[
          source_prompt,
          edit_prompt,
      ],
      "p2p_config":[
          {
          "cross_replace_steps":{
              "default_": REPLACE_STRENGTH
              },
          "self_replace_steps": REPLACE_STRENGTH,
          "blend_self_attention": True,
          "blend_th": [2, 2],
          "is_replace_controller": False
          },
          {
          "cross_replace_steps":{
              "default_": REPLACE_STRENGTH
              },
          "self_replace_steps": REPLACE_STRENGTH,
          "eq_params":{
              "words":[EMPHYSIS_WORD],
              "values": [EMPHYSIS_VALUE]
            },
          "use_inversion_attention": True,
          "is_replace_controller": False
          }],
      "clip_length": "${..dataset_config.n_sample_frame}",
      "sample_seeds": [0],
      "num_inference_steps": INVERSION_STEP,
      "prompt2prompt_edit": True
  },
  "disk_store": STORE_ATTENTION_ON_disk,
  "model_config":{
      "lora": 160,
      "SparseCausalAttention_index": ['mid'],
      "least_sc_channel": 640
  },
  "test_pipeline_config":{
    "target": "video_diffusion.pipelines.p2p_ddim_spatial_temporal.P2pDDIMSpatioTemporalPipeline",
    "num_inference_steps": "${..validation_sample_logger.num_inference_steps}"
  },
  "seed": 0,
}

OmegaConf.save(config, CONFIG_NAME)
print('Saved new config to', CONFIG_NAME)

Saved new config to config/teaser_car-turn.yaml
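
Before launching, it can be worth checking that the form values are consistent with each other. The sketch below is not part of FateZero; `check_edit_settings` is a hypothetical helper. It assumes that an emphasis word only has an effect when it appears as a whitespace-separated word in the edit prompt, and it takes the 0 to 1 range for the replace strength from the slider bounds above.

```python
def check_edit_settings(edit_prompt, emphasis_words, replace_strength):
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    prompt_words = edit_prompt.lower().split()
    for word in emphasis_words:
        # Assumption: the emphasis word is matched against whole words of the prompt.
        if word.lower() not in prompt_words:
            problems.append(f"emphasis word {word!r} does not appear in the edit prompt")
    # The slider above allows values from 0 to 1.
    if not 0.0 <= replace_strength <= 1.0:
        problems.append("replace strength must be between 0 and 1")
    return problems


# Values from the configuration cell above.
prompt = "an infrared video of a yellow taxi driving down a curvy road in the countryside"
print(check_edit_settings(prompt, ["infrared"], 0.8))  # prints []
```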


In [16]:
!accelerate launch test_fatezero.py --config=$CONFIG_NAME

2024-01-30 17:33:35.494151: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-30 17:33:35.494208: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-30 17:33:35.495726: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-30 17:33:35.503694: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
The following values were not passed to `acce

### Show the results

In [17]:
from base64 import b64encode
import glob
import os
import sys

from IPython.display import HTML, display

# Pick the most recently written result video.
mp4_name = max(glob.glob('./result/*/sample/step_0.mp4'), key=os.path.getmtime)

print(mp4_name)
# Embed the video in the page as a base64 data URL.
with open(mp4_name, 'rb') as f:
    data_url = "data:video/mp4;base64," + b64encode(f.read()).decode()

print('Display animation: {}'.format(mp4_name), file=sys.stderr)
display(HTML("""
  <video width=512 controls>
        <source src="%s" type="video/mp4">
  </video>
  """ % data_url))

./result/teaser_car-turn_240130-173345/sample/step_0.mp4


Display animation: ./result/teaser_car-turn_240130-173345/sample/step_0.mp4
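
Result folders are named `<video id>_<timestamp>`, so when several videos have been edited, the newest run can be found by comparing the timestamp suffix rather than the whole folder name. The sketch below is not part of FateZero. The `%y%m%d-%H%M%S` format is an assumption inferred from the single folder name shown above, and `run_timestamp` is a hypothetical helper.

```python
from datetime import datetime


def run_timestamp(path):
    """Extract the run time from ./result/<video id>_<YYMMDD-HHMMSS>/sample/step_0.mp4."""
    run_dir = path.split("/")[-3]        # e.g. "teaser_car-turn_240130-173345"
    stamp = run_dir.rsplit("_", 1)[-1]   # e.g. "240130-173345"
    return datetime.strptime(stamp, "%y%m%d-%H%M%S")


paths = [
    "./result/teaser_car-turn_240130-173345/sample/step_0.mp4",
    # Hypothetical older run whose name sorts last alphabetically.
    "./result/zebra_240129-101500/sample/step_0.mp4",
]
latest = max(paths, key=run_timestamp)
print(latest)
```

Unlike file modification times, the suffix does not change when results are copied or synced to another location.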


## Edit your video

In [7]:
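#@markdown Upload your video (.mp4) by running this cell, or skip this cell to use the default data.

import os
import shutil
from base64 import b64encode

from google.colab import files
from IPython.display import HTML, display

uploaded = files.upload()
if not uploaded:
    # Without this check, dst_path below is never defined and the cell fails with a NameError.
    raise RuntimeError("No file was uploaded. Skip this cell to use the default data.")

# Move every uploaded file into data/; the last one is used below.
for filename in uploaded.keys():
    dst_path = os.path.join("data", filename)
    shutil.move(filename, dst_path)

# Frames are written to a folder named after the video, without its extension.
file_id = os.path.splitext(dst_path)[0]

! mkdir -p $file_id
# Chain both filters in one -vf option; ffmpeg keeps only the last -vf it is given.
! ffmpeg -hide_banner -loglevel error -i $dst_path -vf "scale=512:512,fps=25" $file_id/%05d.png

# Embed the uploaded video in the page as a base64 data URL.
with open(dst_path, 'rb') as f:
    data_url = "data:video/mp4;base64," + b64encode(f.read()).decode()

display(HTML("""
  <video width=512 controls>
        <source src="%s" type="video/mp4">
  </video>
  """ % data_url))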
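
ffmpeg keeps only the last `-vf` option given for an output, so resizing and frame-rate conversion need to be chained in a single filter graph. The sketch below builds such a command as an argument list, which also avoids shell quoting problems with unusual file names. It is not part of FateZero; `frame_extraction_command` and the file name `my_clip` are hypothetical.

```python
import shlex


def frame_extraction_command(video_path, out_dir, size=512, fps=25):
    """Build an ffmpeg command that resizes and resamples in one -vf filter chain."""
    filters = f"scale={size}:{size},fps={fps}"
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", video_path,
        "-vf", filters,
        f"{out_dir}/%05d.png",
    ]


cmd = frame_extraction_command("data/my_clip.mp4", "data/my_clip")  # hypothetical file name
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the output folder exists.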