# Perceiver Autoencoder

In this notebook, I prototype the **full** version of the Perceiver autoencoder. This includes the training loop and whatever data handling the loop requires. Moving data between disk, host memory, and GPU is out of scope, however.

**Goals:**
A fully adjustable, parameterizable Perceiver for predictive coding.
 - **Encoder**: Adjustable encoder shape and number of re-exposures as the data is compressed.
	 - Query = current latent state; key-values = input byte array.
		 - Re-exposures: query = current latent state estimate; key-values = input byte array.
		 - **Option**: residual connections around TF blocks.
		 - **Option**: use different block types for each block after the first.
	 - Let's avoid a series of intermediate {number of tokens, token dimension} sizes between the original byte array and the final latent state.
	 - The only adjustment should be the number of re-exposures, i.e., the number of different $\mathbb{R}^{M \times D} \to \mathbb{R}^{N \times C}$ encoders that re-query the byte array using the current latent estimate.
 - **Latent-latent**: Number of distinct blocks, and number of block repeats between exposures to new information.
 - **Decoder**: Similar to the encoder: queries = positional codes we want to reconstruct, key-values = latent matrix.
	 - Avoid intermediate dimensionalities.
	 - Open question: for repeated querying, do we use query = current reconstruction, key-values = latent matrix?
		 - **Optional**: residual connections here, too.
		 - **Optional**: use the same blocks for every step of re-exposure.
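
The query/key-value roles listed above can be made concrete with a small NumPy sketch. This is only an illustration under stated assumptions: `CrossAttentionBlock` is a hypothetical single-head attention without layer norm or MLP, the shapes `M, D, N, C, E` are arbitrary, cycling through the blocks round-robin is one possible way to reuse blocks across re-exposures, and the decoder performs a single query step. It is not the notebook's eventual TensorFlow implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

class CrossAttentionBlock:
    """Hypothetical single-head cross-attention block (illustration only)."""

    def __init__(self, query_dim, kv_dim, out_dim, rng):
        # Random projections stand in for learned weights.
        self.w_q = rng.normal(scale=query_dim ** -0.5, size=(query_dim, out_dim))
        self.w_k = rng.normal(scale=kv_dim ** -0.5, size=(kv_dim, out_dim))
        self.w_v = rng.normal(scale=kv_dim ** -0.5, size=(kv_dim, out_dim))

    def __call__(self, queries, key_values):
        q = queries @ self.w_q        # (num_queries, out_dim)
        k = key_values @ self.w_k     # (num_kv, out_dim)
        v = key_values @ self.w_v     # (num_kv, out_dim)
        weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (num_queries, num_kv)
        return weights @ v            # (num_queries, out_dim)

def encode(latent, byte_array, blocks, num_total, residual=True):
    """Re-expose the latent (N, C) to the byte array (M, D) `num_total` times."""
    for step in range(num_total):
        block = blocks[step % len(blocks)]   # assumption: round-robin block reuse
        update = block(latent, byte_array)   # query = latent, key-values = bytes
        latent = latent + update if residual else update
    return latent

def decode(positional_codes, latent, block):
    """Query the latent matrix with the positional codes to reconstruct."""
    return block(positional_codes, latent)   # query = codes, key-values = latent

rng = np.random.default_rng(0)
M, D, N, C, E = 50, 8, 6, 4, 10              # arbitrary illustrative sizes
byte_array = rng.normal(size=(M, D))
initial_latent = rng.normal(size=(N, C))
positional_codes = rng.normal(size=(M, E))

encoder_blocks = [CrossAttentionBlock(C, D, C, rng) for _ in range(2)]
decoder_block = CrossAttentionBlock(E, C, D, rng)

latent = encode(initial_latent, byte_array, encoder_blocks, num_total=3)
reconstruction = decode(positional_codes, latent, decoder_block)
print(latent.shape, reconstruction.shape)
```

Because each encoder step maps directly from the $M \times D$ byte array to the $N \times C$ latent, no intermediate token counts or dimensions appear; the only knob in this sketch is `num_total`.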


## Pseudocode 

```python
class Model:
    """Skeleton only; bodies marked `...` are still to be written."""

    def __init__(self, encoder, num_distinct_encoder, num_total_encoder, residual_encoder,
                 latent_evolver, num_distinct_latent, num_total_latent_cycles,
                 decoder, num_distinct_decoder, num_total_decoder, residual_decoder):
        # Governing class variables
        self.encoder = encoder                            # [several TF blocks]
        self.num_distinct_encoder = num_distinct_encoder  # int
        self.num_total_encoder = num_total_encoder        # int
        self.residual_encoder = residual_encoder          # bool

        self.latent_evolver = latent_evolver              # [several TF blocks]
        self.num_distinct_latent = num_distinct_latent    # int
        # int; if this is a 2-tuple, we randomly select some number of
        # latent cycles in that range.
        self.num_total_latent_cycles = num_total_latent_cycles

        self.decoder = decoder                            # [several TF blocks]
        self.num_distinct_decoder = num_distinct_decoder  # int
        self.num_total_decoder = num_total_decoder        # int
        self.residual_decoder = residual_decoder          # bool

        # Class state variables
        self.latent_state = ...  # TF variable, learnable initial positional code

    def encode(self, input_tokens):
        ...

    def evolve_latent(self):
        ...

    def decode(self, positional_codes):
        ...

    def test(self, new_datum):
        """Basically just `call`, but it won't incorporate the new datum into
        the latent state. It will also add the test performance to the model's
        `test loss` records.
        """
        ...

    def call(self, new_datum, return_latent=False):
        """Given some new patches, we calculate the "surprise",
        then incorporate the data into the latent state.

        We finally return the surprise value, just for metric tracking.
        """
        # Computing surprise (`loss` is a placeholder; the loss is not chosen yet)
        predicted_input = self.decode(new_datum.positional_codes)
        new_loss = loss(predicted_input, new_datum.tokens)

        # Incorporating new info -> latent state, performing latent evolutions.
        self.encode(new_datum)
        self.evolve_latent()

        # Returning values
        if return_latent:
            return new_loss, self.latent_state
        return new_loss
```
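
Two details of the pseudocode can be pinned down in a few lines: how `num_total_latent_cycles` might be resolved when it is a 2-tuple, and how a `num_distinct_*` / `num_total_*` pair could map onto a sequence of block applications. The helper names `resolve_num_cycles` and `block_schedule` are hypothetical, the tuple is assumed to be an inclusive `(low, high)` range, and round-robin reuse of the distinct blocks is only one possible scheduling choice.

```python
import random

def resolve_num_cycles(num_total_latent_cycles, rng=random):
    """Return a concrete cycle count from an int or a (low, high) 2-tuple.

    Assumption: the 2-tuple is an inclusive range sampled uniformly.
    """
    if isinstance(num_total_latent_cycles, int):
        return num_total_latent_cycles
    low, high = num_total_latent_cycles
    return rng.randint(low, high)

def block_schedule(num_distinct, num_total):
    """Index of the distinct block applied at each of `num_total` steps.

    Assumption: distinct blocks are reused round-robin.
    """
    return [step % num_distinct for step in range(num_total)]

fixed = resolve_num_cycles(4)                           # int is passed through
sampled = resolve_num_cycles((2, 5), random.Random(0))  # drawn from 2..5
schedule = block_schedule(num_distinct=2, num_total=5)
print(fixed, sampled, schedule)
```

Sampling the cycle count per call, rather than fixing it, would keep the latent evolver from depending on one exact depth; whether that is the intent of the 2-tuple option is not stated in the pseudocode.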

## 0: Imports & Data Acquisition

In [1]:
## Import Box 
import os 
import sys 
import random
import pathlib
import itertools
import collections
import math

import tensorflow as tf 
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm
import cv2
# Some modules to display an animation using imageio.
import imageio

2022-10-17 14:37:48.229837: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-17 14:37:48.525672: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2022-10-17 14:37:48.647148: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-10-17 14:37:49.596170: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; 

In [2]:
## GPU Setup
physical_devices = tf.config.list_physical_devices("GPU")
for device in physical_devices:
    tf.config.experimental.set_memory_growth(device, True)

In [3]:
## Get some data 
# Utility imports  
sys.path.append("../src")
import video_loader as vl
import video_preprocess as vp 

## Meta/constants 
DATA_FOLDER = "../datasets/downloads"
num_videos, num_frames = 16, 20
output_size = (120, 180)

patch_height = 16
patch_width = 16
patch_duration = 3

batch_size = 1

# Fourier feature codes 
k_space = 15
mu_space = 20 
k_time = 64 
mu_time = 200

print("Getting VideoSet...")
VideoSet = vl.get_videoset(DATA_FOLDER, num_videos, num_frames, output_size=output_size)

print("Making patches from Videoset...")
PatchSet = vp.make_patchset(VideoSet, patch_duration, patch_height, patch_width)

print("Making the flat patch set...")
FlatPatchSet = vp.patch_to_flatpatch(PatchSet, batch_size=batch_size)

print("Adding codes to the PatchSet...")
CodedPatchedSet = PatchSet.map(lambda x: vp.add_spacetime_codes(x, 
		k_space=k_space, mu_space=mu_space, k_time=k_time, mu_time=mu_time))

print("Flattening the coded + patched dataset...")
FlatCodedPatchedSet = vp.patch_to_flatpatch(CodedPatchedSet, batch_size=batch_size)

Getting VideoSet...


  0%|          | 0/16 [00:00<?, ?it/s]2022-10-17 14:38:10.690864: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-17 14:38:11.978166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 20168 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3090 Ti, pci bus id: 0000:1a:00.0, compute capability: 8.6
2022-10-17 14:38:11.979143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 22279 MB memory:  -> device: 1, name: NVIDIA GeForce RTX 3090 Ti, pci bus id: 0000:68:00.0, compute capability: 8.6
100%|██████████| 16/16 [00:03<00:00,  4.87it/s]


Making patches from Videoset...
Making the flat patch set...
Adding codes to the PatchSet...
Flattening the coded + patched dataset...
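
The internals of `vp.add_spacetime_codes` are not shown in this notebook, so the following is only a sketch of what Fourier feature codes with these constants could look like. It assumes that `k_*` is the number of frequency bands and `mu_*` the target sampling rate, with frequencies spaced linearly between 1 and `mu / 2` as in the Fourier features described for the original Perceiver, and that each coordinate is normalized to `[-1, 1]`. The helpers `fourier_codes` and `spacetime_code` are hypothetical names.

```python
import numpy as np

def fourier_codes(positions, num_bands, max_freq):
    """Fourier features for positions in [-1, 1].

    Returns an array of shape positions.shape + (2 * num_bands + 1,):
    sines, cosines, and the raw position itself.
    """
    positions = np.asarray(positions, dtype=np.float64)[..., None]
    # Assumption: bands spaced linearly from 1 up to the Nyquist frequency mu / 2.
    freqs = np.linspace(1.0, max_freq / 2.0, num_bands)
    angles = np.pi * positions * freqs
    return np.concatenate([np.sin(angles), np.cos(angles), positions], axis=-1)

def spacetime_code(t, y, x, k_space=15, mu_space=20, k_time=64, mu_time=200):
    """Concatenate codes for the two spatial axes and the time axis."""
    return np.concatenate([
        fourier_codes(y, k_space, mu_space),
        fourier_codes(x, k_space, mu_space),
        fourier_codes(t, k_time, mu_time),
    ], axis=-1)

# Code for a single patch position at the centre of the clip.
code = spacetime_code(t=0.0, y=0.0, x=0.0)
print(code.shape)
```

Under these assumptions each patch gains `2 * (2 * 15 + 1) + (2 * 64 + 1) = 191` code dimensions; the real width depends on how `vp.add_spacetime_codes` is written.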
