# Video GAN

This notebook is both a replication exercise and an experiment in novel artistic output. The goal is to generate abstracted videos that evoke the subjective affect of objects, actions, and scenes in motion, using generative adversarial networks (GANs) trained on videos of the desired subjects.

The generative models are based on ["Generating Videos with Scene Dynamics" (2016)](http://www.cs.columbia.edu/~vondrick/tinyvideo/paper.pdf) and ["Improving Video Generation for Multi-functional Applications" (2017)](https://arxiv.org/pdf/1711.11453.pdf).

The creative part of this project is still nebulous, but it will require manipulating the generated videos so that they can be projected in a live setting alongside musical compositions. TBD...

This notebook runs on Google Colaboratory for TPU access. The code will be refactored once it has been validated.

In [0]:
import cv2
import glob
import numpy as np
import os
import tensorflow as tf

## Settings

In [0]:
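# Video settings
video_dir = ''   # Directory containing the input videos
video_size = []  # Target size passed to tf.image.resize_images
frame_int = 2    # Keep every frame_int-th frame
frame_cap = 32   # Maximum number of frames kept per video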

# Training parameters
epochs = 50
z_dim = 100
read_threads = 16

# Adam optimizer
learning_rate = 0.0001
beta1 = 0.5

# Output frequency
sample_rate = 100

# Use eager execution
tf.enable_eager_execution()

## Video Processing

### Extract frames

In [0]:
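videos = glob.glob(os.path.join(video_dir, '*.avi'))

# For each video in the directory, capture every frame_int-th frame (up to frame_cap frames) in a 4D array.
for vnum, video in enumerate(videos):
  # glob already returns paths that include video_dir, so no further join is needed.
  description = os.path.splitext(video)[0]
  vidcap = cv2.VideoCapture(video)
  success, image = vidcap.read()
  if not success:
    continue  # Skip files that cannot be decoded.
  height, width, channels = image.shape
  output = np.zeros((frame_cap, height, width, channels), dtype=np.uint8)
  loc, frames = 0, 0
  while success and frames < frame_cap:
    output[frames] = image
    loc += frame_int
    frames += 1
    # Seek to the next sampled frame index.
    vidcap.set(cv2.CAP_PROP_POS_FRAMES, loc)
    success, image = vidcap.read()
  vidcap.release()
  # cv2.imwrite expects a single image, so stack the frames vertically (assumed storage layout).
  stacked = output.reshape(frame_cap * height, width, channels)
  cv2.imwrite(description + str(vnum) + '.jpg', stacked)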
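
To make the sampling rule concrete, here is a minimal sketch of which source frames are kept for a given `frame_int` and `frame_cap`. The helper name `sampled_frame_indices` is hypothetical and is not part of the extraction code; it only illustrates the index arithmetic.

```python
def sampled_frame_indices(total_frames, frame_int, frame_cap):
    """Return the source-frame indices kept when taking every
    frame_int-th frame, up to frame_cap frames (hypothetical helper)."""
    return list(range(0, total_frames, frame_int))[:frame_cap]

# With the settings above (frame_int=2, frame_cap=32), a 100-frame video
# keeps 32 frames, spanning source frames 0 through 62.
indices = sampled_frame_indices(100, 2, 32)
print(len(indices), indices[0], indices[-1])  # 32 0 62

# A 10-frame video runs out of frames before reaching the cap.
print(sampled_frame_indices(10, 2, 32))  # [0, 2, 4, 6, 8]
```

Videos shorter than `frame_int * frame_cap` frames fill only part of the pre-allocated array, so the remaining frames stay all zeros. Whether to drop, loop, or pad such clips is left open here.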

### Read frames into tf data object

In [0]:
# Reads a video image, decodes it into a dense tensor, and resizes it to the desired shape.
def _parse_function(filename, label):
  image_string = tf.read_file(filename)
  image_decoded = tf.image.decode_jpeg(image_string)
  image_resized = tf.image.resize_images(image_decoded, video_size)
  return image_resized, label

# File name vector.
video_files = glob.glob(os.path.join(video_dir, '*.jpg'))
filenames = tf.constant(video_files)

# Label vector.
labels = tf.constant([os.path.splitext(vid)[0] for vid in video_files])

dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(_parse_function)
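
A decoded JPEG is a single 3D image (height, width, channels), while a video clip needs a fourth frame axis. One way to bridge the two, assuming each clip is stored as one image with its frames stacked vertically, is to reshape after decoding. The NumPy sketch below illustrates the round trip; `stack_frames` and `unstack_frames` are hypothetical helper names, and the same reshape could be applied to a tensor inside the parse function.

```python
import numpy as np

def stack_frames(clip):
    """(frames, H, W, C) -> (frames * H, W, C): frames stacked top to bottom."""
    frames, height, width, channels = clip.shape
    return clip.reshape(frames * height, width, channels)

def unstack_frames(image, frames):
    """(frames * H, W, C) -> (frames, H, W, C): inverse of stack_frames."""
    total_height, width, channels = image.shape
    if total_height % frames != 0:
        raise ValueError('image height is not a multiple of the frame count')
    return image.reshape(frames, total_height // frames, width, channels)

# Round trip on a tiny synthetic clip: 4 frames of 2x3 pixels, 3 channels.
clip = np.arange(4 * 2 * 3 * 3, dtype=np.uint8).reshape(4, 2, 3, 3)
stacked = stack_frames(clip)
restored = unstack_frames(stacked, frames=4)
print(stacked.shape)                   # (8, 3, 3)
print(np.array_equal(clip, restored))  # True
```

Note that JPEG compression is lossy, so pixel values read back from disk will differ slightly from the originals even though the layout is preserved.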

## Utilities

## Model

In [0]:
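class VideoGAN():

  def __init__(self):
    pass  # TODO: not yet implemented.

  def build_model(self):
    pass  # TODO: not yet implemented.

  def train(self):
    pass  # TODO: not yet implemented.

  def generator(self):
    pass  # TODO: not yet implemented.

  def discriminator(self):
    pass  # TODO: not yet implemented.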
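
Before filling in the generator, it can help to check that its layer sizes reach the target clip shape. The sketch below works through the output-size arithmetic for a stack of 3D transposed convolutions. The starting volume of 2x4x4 (frames x height x width), the kernel size of 4, the stride of 2, the padding of 1, and the four layers are all assumptions loosely modeled on the first cited paper, not settings confirmed by this notebook.

```python
def transposed_conv_size(size, kernel=4, stride=2, padding=1):
    """Output length of a transposed convolution along one axis
    (no output padding, dilation of 1)."""
    return (size - 1) * stride - 2 * padding + kernel

def generator_output_shape(start=(2, 4, 4), layers=4):
    """Apply the same transposed convolution to each axis, layer by layer.
    The default start shape and layer count are assumed, not confirmed."""
    shape = start
    for _ in range(layers):
        shape = tuple(transposed_conv_size(s) for s in shape)
    return shape

# Each layer doubles every axis: 2x4x4 -> 4x8x8 -> 8x16x16 -> 16x32x32 -> 32x64x64.
print(generator_output_shape())  # (32, 64, 64)
```

Under these assumptions the generator would produce 32 frames, which matches `frame_cap = 32` in the settings. With TensorFlow's `'SAME'` padding, a stride-2 transposed convolution likewise doubles each axis.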