# Stable Diffusion Animations With Simple Variations

Notebook by David Marx ([@DigThatData](https://twitter.com/digthatdata))

Shared under MIT license

## How this animation technique works

For each text prompt you provide, the notebook will...

1. Generate an image based on that text prompt
2. Use the generated image as the `init_image` and combine it with the same text prompt to generate variations of the first image. This produces a short sequence of very similar images based on the original text prompt
3. This image sequence is then repeated several times to produce a longer sequence
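
The three steps above can be sketched in a few lines. This is only an illustration, not the notebook's actual code: `make_variations` and `frames_for_prompt` are hypothetical names, and plain strings stand in for images so the sketch runs without an API key.

```python
def make_variations(prompt, n_variations):
    # Hypothetical stand-in for the image API: item 0 plays the role of the
    # initial image, the remaining items play the role of its close variations.
    return [f"{prompt} [variation {i}]" for i in range(n_variations)]

def frames_for_prompt(prompt, n_variations=3, repeat=6):
    variations = make_variations(prompt, n_variations)  # steps 1 and 2
    return variations * repeat                          # step 3: loop the short sequence

frames = frames_for_prompt("a red fox", n_variations=3, repeat=2)
print(len(frames))             # 6
print(frames[:3] == frames[3:])  # True: the same three images, played twice
```

Cycling through a handful of near-identical images is what gives the animation its "vibrating" look.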

The technique demonstrated in this notebook was inspired by a [video](https://www.youtube.com/watch?v=WJaxFbdjm8c) created by Ben Gillin.


In [None]:
# @title # 1. 🔑 Provide your API Key
# @markdown Running this cell will prompt you to enter your API Key below. 

# @markdown To get your API key, visit https://beta.dreamstudio.ai/membership

# @markdown ---

# @markdown A note on security best practices: **don't publish your API key.**

# @markdown We're using a form field designed for sensitive data like passwords.
# @markdown This notebook does not save your API key in the notebook itself,
# @markdown but instead loads your API Key into the colab environment. This way,
# @markdown you can make changes to this notebook and share it without concern
# @markdown that you might accidentally share your API Key. 
# @markdown 


import os, getpass

os.environ['STABILITY_KEY'] = getpass.getpass('Enter your API Key')

!pip install stability-sdk

from stability_sdk import client

stability_api = client.StabilityInference(
    key=os.environ['STABILITY_KEY'], 
    verbose=False,
)

In [None]:
# @title # 2. 📜 Provide Text prompts

# @markdown Put each prompt on its own line, between the two `"""` delimiters

# @markdown Below, we're using the lyrics to the song "Virus" by Del The Funky Homosapien

text = """
Global controls will have to be imposed
And a world governing body, will be created to enforce them
Crises, precipitate change
Secretly plotting your demise

I wanna devise a virus
To bring dire straits to your environment
Crush your corporations with a mild touch
Trash your whole computer system and revert you to papyrus
I want to make a super virus
Strong enough to cause blackouts in every single metropolis
Cause they don't wanna unify us
So fuck-it total anarchy and can't nobody stop us

You see late in the evening
Fucked up on my computer and my mind starts roaming
I create like a heathen
The first cycles of this virus I can send through a modem
Infiltration hits your station
No Microsoft or enhanced DOS will impede
Society thinks they're safe when
Bingo! Hard drive crashes from the rending
A lot of hackers tried viruses before
Vaporize your text like so much white out
I want it where a file replication is a chore
Lights out shut down entire White House
I don't want just a bug that could be corrected
I'm erecting immaculate design
Break the nation down, section by section
Even to the greatest minds it's impossible to find

I wanna devise a virus
To bring dire straits to your environment
Crush your corporations with a mild touch
Trash your whole computer system and revert you to papyrus
I wanna devise a virus
To bring dire straits to your environment
Crush your corporations with a mild touch
Trash your whole computer system and revert you to papyrus

We have already planned
The plan is programmed into every one of my thousand robots
We will not hesitate; we will destroy the Homosapien!
Please, stay where you are

Psst, ay, I'm makin' some shit in my basement
Bout to do it to 'em, don't tell 'em though
Alright I love you, peace

I want to develop a super virus
Better by far than that old Y2K
This is 3030 the time of global unification
Break right through they
Terminals, burn 'em all, slaves to silicon
Corrupt politicians with leaders and their keywords
F.B.I. and spies stealin' bombs
De-cipitate they plans in their face and catch the fever
Everybody loot the stores get your canned goods
Even space stations are having a hard time
Peacekeepers seek to take our manhood
Which results in the form of global apartheid
Ghettos are trash dumps with gas pumps
Exploding and burnt out since before the great union
The last punks walk around like masked monks
Ready to manipulate the database or break through 'em
Human rights come in a hundredth place
Mass production has always been number one
New Earth has become a repugnant place
So it's time to spread the fear to thunder some

Too long have we tried 
to extend our glorious empire out to the stars
Only to be driven back

I wanna devise a virus
To bring dire straits to your environment
Crush your corporations with a mild touch
Trash your whole computer system and revert you to papyrus
I wanna devise a virus
To bring dire straits to your environment
Crush your corporations with a mild touch
Trash your whole computer system and revert you to papyrus
""" 

In [None]:
# @title # 3. Provide a "theme" prompt (optional)

# @markdown If you provide a "theme prompt" here, we'll append it to the end
# @markdown of each text prompt to provide some consistency across the full 
# @markdown animation sequence.

# tack a suffix on the prompt to give the overall generation some commonality
theme_prompt = "extremely detailed, painted by ralph steadman and radiohead, beautiful, wow" # @param {type: 'string'}

In [None]:
# @title # 4. Animation parameters

add_caption = True # @param {type:'boolean'}
display_frames_as_we_get_them = True # @param {type:'boolean'}
n_variations=3 # @param {type:'integer'}
repeat=6 # @param {type:'integer'}
image_consistency=0.85 # @param {type:"slider", min:0, max:1, step:0.01}
fps = 12 # @param {type:"slider", min:4, max:60, step:1}
output_filename = 'output.mp4' # @param {type:'string'}
max_video_duration_in_seconds = 300 # @param {type:'integer'}


max_frames = fps * max_video_duration_in_seconds

print(f"max total frames: {max_frames}")
print(f"Max API requests: {int(max_frames/repeat)}")

# @markdown ---

# @markdown `add_caption` - Whether or not to overlay the prompt text on the image

# @markdown `display_frames_as_we_get_them` - Displaying frames will make the notebook slightly slower

# @markdown `n_variations` - How many unique variations to generate for a given text prompt

# @markdown `repeat` - How many times to repeat the sequence of variations, which lengthens the animation for that prompt


# @markdown `image_consistency` - controls similarity between images generated by the prompt.
# @markdown - 0: ignore the init image
# @markdown - 1: as true as possible to the init image

# @markdown `fps` - Frames-per-second of generated animations

# @markdown `output_filename` - filename your video will be saved to

# @markdown `max_video_duration_in_seconds` - The 'generating images from prompts' procedure will exit early after enough frames have been created for a video of this duration.


max total frames: 3600
Max API requests: 600
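
The two numbers printed above follow from simple arithmetic: each prompt contributes `n_variations * repeat` frames, but only the `n_variations` unique images cost an API request. The sketch below works this through for the default settings; `animation_budget` is a hypothetical helper, not part of the notebook.

```python
def animation_budget(n_variations, repeat, fps, max_video_duration_in_seconds):
    max_frames = fps * max_video_duration_in_seconds
    frames_per_prompt = n_variations * repeat
    return {
        "max_frames": max_frames,
        # repeated frames are reused, so only 1 in `repeat` frames needs a request
        "max_api_requests": max_frames // repeat,
        "frames_per_prompt": frames_per_prompt,
        "seconds_per_prompt": frames_per_prompt / fps,
    }

budget = animation_budget(
    n_variations=3, repeat=6, fps=12, max_video_duration_in_seconds=300
)
for name, value in budget.items():
    print(f"{name}: {value}")
```

With these settings each prompt stays on screen for 1.5 seconds (18 frames at 12 fps).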


In [None]:
# @title # 5. 🚀 Generate images!

import getpass
import io
import os
from subprocess import Popen, PIPE
import warnings

from IPython.display import display
from PIL import Image, ImageDraw, ImageFont
import textwrap
from tqdm.notebook import tqdm

import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation

#################################
# set up a few helper functions #
#################################

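def get_image_for_prompt(prompt, max_retries=3, **kargs):
  # retry when the API's safety filter rejects a request
  # (process_response signals this by raising RuntimeError)
  while max_retries:
    try:
      answers = stability_api.generate(prompt=prompt, **kargs)
      response = process_response(answers)
      for img in response:
        yield img
      break
    except RuntimeError:
      max_retries -= 1
      warnings.warn(f"mitigation triggered, retries remaining: {max_retries}")
  else:
    # retries exhausted: fail with a clear message rather than yielding nothing
    raise RuntimeError(f"no image generated after exhausting retries for prompt: {prompt!r}")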

def process_response(answers):
  # iterating over the generator produces the api response
  for resp in answers:
      for artifact in resp.artifacts:
          #print(artifact.finish_reason)
          if artifact.finish_reason == generation.FILTER:
              #warnings.warn(
              #    "Your request activated the API's safety filters and could not be processed."
              #    "Please modify the prompt and try again.")
              raise RuntimeError
          if artifact.type == generation.ARTIFACT_IMAGE:
              img = Image.open(io.BytesIO(artifact.binary))
              yield img

def get_variations_w_init(prompt, init_image, **kargs):
  return list(get_image_for_prompt(prompt=prompt, init_image=init_image, **kargs))

def get_close_variations_from_prompt(prompt, n_variations=2, image_consistency=.7):
  """
  prompt: a text prompt
  n_variations: total number of images to return
  image_consistency: float in [0,1], controls similarity between images generated by the prompt.
                     you can think of this as controlling how much "visual vibration" there will be.
                     - 0=regenerate each image independently without consideration for other images generated by prompt
                     - 1=images are all completely identical
  """
  images = list(get_image_for_prompt(prompt))
  #display(images[0])
  for _ in range(n_variations - 1):
     img = get_variations_w_init(prompt, images[0], start_schedule=(1-image_consistency))[0]
     #display(img)
     images.append(img)
  return images

# /usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf
def add_caption2image(
      image, 
      caption, 
      text_font='LiberationSans-Regular.ttf', 
      font_size=20,
      fill_color=(255, 255, 255),
      stroke_color=(0, 0, 0), #stroke_fill
      stroke_width=2,
      align='center',
      ):
    # via https://stackoverflow.com/a/59104505/819544
    wrapper = textwrap.TextWrapper(width=50) 
    word_list = wrapper.wrap(text=caption) 
    # join the wrapped lines back into a single multi-line caption
    caption_new = '\n'.join(word_list)

    draw = ImageDraw.Draw(image)

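    # To use a different font, download it and pass the path to its .ttf file as `text_font`.
    font = ImageFont.truetype(text_font, size=font_size)
    # `draw.textsize` was removed in Pillow 10; measure the caption's bounding box instead.
    left, top, right, bottom = draw.multiline_textbbox(
        (0, 0), caption_new, font=font, stroke_width=stroke_width)
    w, h = right - left, bottom - top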
    W,H = image.size
    x,y = 0.5*(W-w),0.90*H-h
    draw.text(
        (x,y), 
        caption_new,
        font=font,
        fill=fill_color, 
        stroke_fill=stroke_color,
        stroke_width=stroke_width,
        align=align,
    )

    return image

############
# Do stuff #
############

# split the lyrics into lines
lyrics = [line.strip() for line in text.split('\n') if line.strip()]

# generate animation
frames = []
#for lyric in lyrics[:12]:
for lyric in lyrics:
  prompt = f"{lyric}, {theme_prompt}"
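  # use the values chosen in the "Animation parameters" cell rather than hard-coded ones
  images = get_close_variations_from_prompt(prompt, n_variations=n_variations, image_consistency=image_consistency)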

  if add_caption:
    images = [add_caption2image(im, lyric) for im in images]

  if display_frames_as_we_get_them:
    print(lyric)
    for im in images:
      display(im)
  images *= repeat
  frames.extend(images)
  if len(frames) >= max_frames:
    break
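
In `get_close_variations_from_prompt`, `image_consistency` reaches the API as `start_schedule = 1 - image_consistency`. The sketch below isolates that mapping; the helper name and the range check are illustrative additions, not part of the notebook.

```python
def consistency_to_start_schedule(image_consistency):
    # The range check is an added safeguard; the notebook relies on the slider's bounds.
    if not 0.0 <= image_consistency <= 1.0:
        raise ValueError("image_consistency must be in [0, 1]")
    return 1.0 - image_consistency

for consistency in (0.0, 0.5, 0.85, 1.0):
    start_schedule = consistency_to_start_schedule(consistency)
    print(f"image_consistency={consistency:.2f} -> start_schedule={start_schedule:.2f}")
```

Following the parameter descriptions above, a consistency of 0 maps to a `start_schedule` of 1 (the init image is ignored), and a consistency of 1 maps to a `start_schedule` of 0 (stay as close as possible to the init image).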

In [None]:
# @title # 6. 🎥 Compile your video!

cmd_in = ['ffmpeg', '-y', '-f', 'image2pipe', '-vcodec', 'png', '-r', str(fps), '-i', '-']
cmd_out = ['-vcodec', 'libx264', '-r', str(fps), '-pix_fmt', 'yuv420p', '-crf', '1', '-preset', 'veryslow', output_filename]

cmd = cmd_in + cmd_out

p = Popen(cmd, stdin=PIPE)
for im in tqdm(frames):
  im.save(p.stdin, 'PNG')
p.stdin.close()

print("Encoding video...")
p.wait()
print("Video complete.")
print(f"Video saved to: {output_filename}")

  0%|          | 0/1776 [00:00<?, ?it/s]

Encoding video...
Video complete.
Video saved to: output.mp4
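
The cell above pipes PNG frames to `ffmpeg` over stdin. The sketch below rebuilds the same command as a reusable function and checks that `ffmpeg` is on the `PATH` before you start encoding. `build_ffmpeg_cmd` and its `crf`/`preset` defaults are illustrative assumptions; the notebook itself uses `-crf 1` and `-preset veryslow`, which favor quality over encoding time and file size.

```python
import shutil

def build_ffmpeg_cmd(output_filename, fps, crf=18, preset="medium"):
    # input side: read a stream of PNG images from stdin at the given frame rate
    cmd_in = ["ffmpeg", "-y", "-f", "image2pipe", "-vcodec", "png",
              "-r", str(fps), "-i", "-"]
    # output side: encode H.264 in a widely playable pixel format
    cmd_out = ["-vcodec", "libx264", "-r", str(fps), "-pix_fmt", "yuv420p",
               "-crf", str(crf), "-preset", preset, output_filename]
    return cmd_in + cmd_out

cmd = build_ffmpeg_cmd("output.mp4", fps=12)
print(" ".join(cmd))

if shutil.which("ffmpeg") is None:
    print("ffmpeg not found on PATH; install it before compiling the video")
```

The resulting list can be passed to `Popen(cmd, stdin=PIPE)` exactly as in the cell above.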


In [None]:
# @title # 7. 📺 Enjoy your animation!

download_video = True # @param {type:'boolean'}
embed_video_in_notebook = False # @param {type:'boolean'}

# @markdown NB: only embed short videos

if download_video:
    from google.colab import files
    files.download(output_filename)

if embed_video_in_notebook:
    from IPython.display import display, Video
    display(Video(output_filename, embed=True))

**MIT License**
```
Copyright 2022 David Marx

Permission is hereby granted, free of charge, to any person obtaining a copy of 
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```