# Stable Diffusion Videos

This notebook allows you to generate videos by interpolating the latent space of [Stable Diffusion](https://github.com/CompVis/stable-diffusion).

You can either dream up different versions of the same prompt, or morph between different text prompts (with seeds set for each for reproducibility).
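
To give a feel for what "interpolating" between two latent vectors can mean, here is a minimal sketch of spherical linear interpolation (slerp) in plain Python. Treat it as an illustration under assumptions: the function name `slerp`, the `dot_threshold` parameter, and the toy 2-D vectors are hypothetical and not taken from this notebook, and the library used below may interpolate differently.

```python
import math


def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherical linear interpolation between two equal-length vectors.

    t=0 returns v0, t=1 returns v1, values in between follow the arc.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)

    # Nearly parallel vectors: fall back to plain linear interpolation
    # to avoid dividing by a sine that is close to zero.
    if abs(dot) > dot_threshold:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


# Toy example: walk a quarter circle from the x axis to the y axis.
frames = [slerp(i / 4, [1.0, 0.0], [0.0, 1.0]) for i in range(5)]
```

Each entry of `frames` would correspond to one intermediate point; in a real pipeline every such point is decoded into an image, and the images are stitched into a video.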

If you like this notebook:
- consider giving the [repo a star](https://github.com/nateraw/stable-diffusion-videos) ⭐️
- consider following me on GitHub [@nateraw](https://github.com/nateraw)

You can file issues and feature requests [here](https://github.com/nateraw/stable-diffusion-videos/issues).

Enjoy 🤗

## Setup

In [None]:
%%capture
! pip install realesrgan stable_diffusion_videos[realesrgan]
! git config --global credential.helper store

### Authenticate with Hugging Face Hub

You need a registered 🤗 Hugging Face Hub account and an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).

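> ⚠️ **Important**: You must also go to the [model repository](https://huggingface.co/CompVis/stable-diffusion-v1-4) and click "Access Repository" so you can download the model. Note that the pipeline below loads `runwayml/stable-diffusion-v1-5`, so the same step may be needed on that repository.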

In [None]:
from huggingface_hub import notebook_login

notebook_login()

In [None]:
#@title Connect to Google Drive to Save Outputs

#@markdown If you want to connect Google Drive, click the checkbox below and run this cell. You'll be prompted to authenticate.

#@markdown If you just want to save your outputs in this Colab session, you can skip this cell.

connect_google_drive = True #@param {type:"boolean"}

#@markdown Then, in the interface, use this path as the `output` in the Video tab to save your videos to Google Drive:

#@markdown > /content/gdrive/MyDrive/stable_diffusion_videos


if connect_google_drive:
    from google.colab import drive

    drive.mount('/content/gdrive')

## Generate video clip for Ballet Dancer

### Set up seeds and prompts

#### Seed generator

Run this cell only once to generate random seeds, then copy the printed list into the `seeds` variable of the parameters cell below.

In [None]:
import random

# Number of seeds to generate; adjust as needed
N = 61

print([random.randint(2000000000, 8000000000) for _ in range(N)])
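
As an alternative to copying the printed list by hand, the seeds themselves can be made reproducible by drawing them from a dedicated, seeded generator. This is only a sketch: the helper `make_seeds` and the `master_seed` parameter are hypothetical names, and the range simply mirrors the one used in the cell above.

```python
import random


def make_seeds(n, master_seed, low=2_000_000_000, high=8_000_000_000):
    """Return n pseudo-random seeds drawn from an isolated, seeded generator."""
    rng = random.Random(master_seed)  # does not touch the global random state
    return [rng.randint(low, high) for _ in range(n)]


seeds_a = make_seeds(5, master_seed=1234)
seeds_b = make_seeds(5, master_seed=1234)
print(seeds_a == seeds_b)  # the same master seed gives the same list
```

With this approach only `master_seed` has to be recorded to regenerate the full list later.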

#### Parameters

In [None]:
audio_path = '/content/Ballet Dancer - draft 3.mp3'
# Audio duration in seconds
# duration = 350
duration = 70

# Audio offsets in seconds (one every 5 seconds)
offsets = list(range(0, duration, 5))

# List of original seeds used for the first part of the song
# NOTE: seeds must be fixed rather than random to ensure
#       reproducibility
seeds = [
    6871596188, 7042399203, 4066412822, 7484131661, 3425344691, 3779981234,
    7163318970, 7148941744, 3335470119, 6964904650, 5851479726, 7041406661,
    2823231592,
]

fps = 30
# Number of frames to interpolate between each pair of consecutive offsets
steps = [(b - a) * fps for a, b in zip(offsets, offsets[1:])]


# NOTE: the limit is about 2 on a free Colab GPU and about 10 on a premium GPU
batch_size = 10

# NOTE: path relative to the Google Drive root
drive_folder = 'stable-diffusion'
name = 'ballet-dancer'
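
To make the arithmetic behind `offsets` and `steps` concrete, the sketch below recomputes them with the values from the cell above. The name `interval` is introduced here for illustration only; the notebook hard-codes the 5.

```python
fps = 30
duration = 70   # seconds, as in the parameters cell
interval = 5    # seconds between prompt changes (illustrative name)

offsets = list(range(0, duration, interval))
steps = [(b - a) * fps for a, b in zip(offsets, offsets[1:])]

print(len(offsets))      # 14 offsets: 0, 5, ..., 65
print(len(steps))        # 13 transitions between consecutive offsets
print(steps[0])          # 150 frames per 5-second transition
print(sum(steps) / fps)  # 65.0 seconds of video in total
```

Because `range` excludes its end value, the last offset is 65 rather than 70, so the transitions cover 65 seconds of the audio.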

#### Prompts

In [None]:
# NOTE: artists[2] was supposed to be 'rhads'; the mistake was made at the
#       first seeding and is kept as-is
artists = ['ivan aivazovsky', 'greg rutkowski', 'rutkowski']
container = 'beautiful painting'
cues = [
    'digital art',
    'hyper detailed, sharp focus, soft light',
    'octane render',
    'ray tracing',
    'trending on artstation'
]

template = ''.join([
    'A ',
    container,
    ' of {0} by ',
    ' and '.join(artists),
    ', in style of ',
    '. '.join(cues)
])

# NOTE: the number of prompts must be (duration / seconds per prompt),
#       i.e. one prompt per audio offset
prompts = [
    template.format(prompt)
    for prompt in [
      # 0:00
      'a ballet dancer girl in a city',
      'a ballet dancer girl watching a skyscraper',
      'a ballet dancer girl dancing with a skyscraper',
      'a skyscraper transforming into a burger',
      'a ballet dancer girl eating a burger in a city',
      'a burger transforming into a man shadow',
      'a ballet dancer girl watching a man shadow leaving',
      'a ballet dancer girl crying in a city',
      'a ballet dancer girl disappearing in dust',
      'a ballet dancer girl wake up in a forest',
      'a ballet dancer girl dancing in a forest',
      'a ballet dancer girl dancing with a tree',
      # 1:00
    ]
]

print(prompts)
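
Before starting a long render, it can help to check that the prompt, seed, and step lists line up. The sketch below assumes that the pipeline expects one seed per prompt and one interpolation step count per pair of consecutive prompts; that expectation is an assumption about `walk`, not something this notebook states, and `check_walk_inputs` is a hypothetical helper.

```python
def check_walk_inputs(prompts, seeds, steps):
    """Return a list of human-readable mismatches (empty if consistent)."""
    problems = []
    if len(seeds) != len(prompts):
        problems.append(f"{len(prompts)} prompts but {len(seeds)} seeds")
    expected_steps = len(prompts) - 1
    if len(steps) != expected_steps:
        problems.append(
            f"{len(prompts)} prompts need {expected_steps} step counts, "
            f"got {len(steps)}"
        )
    return problems


# Placeholder lists with the same lengths as the ones defined in this notebook:
# 12 prompts, 13 seeds, 13 step counts.
problems = check_walk_inputs(["prompt"] * 12, [0] * 13, [150] * 13)
for problem in problems:
    print(problem)
```

With the lists as written above, the check reports two mismatches, so it may be worth trimming the seeds and offsets (or adding prompts) before generating.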

### Load model from Hugging Face

This step will take a couple of minutes the first time you run it.

In [None]:
import torch

from stable_diffusion_videos import StableDiffusionWalkPipeline

from diffusers.models import AutoencoderKL
from diffusers.schedulers import LMSDiscreteScheduler

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    'runwayml/stable-diffusion-v1-5',
    vae=AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-ema"),
    torch_dtype=torch.float16,
    revision="fp16",
    safety_checker=None,
    scheduler=LMSDiscreteScheduler(
        beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear"
    )
).to("cuda")
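
For readers curious about the scheduler arguments, the sketch below shows what a `"scaled_linear"` beta schedule is commonly understood to compute: values spaced linearly between the square roots of `beta_start` and `beta_end`, then squared. It is a plain-Python approximation, not the library's code; `scaled_linear_betas` is a hypothetical helper, and `num_train_timesteps=1000` is assumed to match the scheduler's default.

```python
def scaled_linear_betas(beta_start=0.00085, beta_end=0.012,
                        num_train_timesteps=1000):
    """Betas spaced linearly in square-root space, then squared."""
    lo, hi = beta_start ** 0.5, beta_end ** 0.5
    step = (hi - lo) / (num_train_timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(num_train_timesteps)]


betas = scaled_linear_betas()
print(betas[0], betas[-1])  # close to beta_start and beta_end
```

The schedule starts near `beta_start` and rises monotonically to about `beta_end`, growing more slowly at first than a plain linear ramp would.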

### Generate video

In [None]:
path = pipeline.walk(
    prompts=prompts,
    seeds=seeds,
    num_inference_steps=50,
    guidance_scale=10,
    margin=1.0,
    smooth=0.2,
    resume=True,
    upsample=True,
    num_interpolation_steps=steps,
    height=512, width=512,
    audio_filepath=audio_path,
    audio_start_sec=offsets[0],
    fps=fps,
    batch_size=batch_size,
    output_dir=f'/content/gdrive/MyDrive/{drive_folder}',
    name=name,
)
print(f'video generated at {path}')