# Encoder-Decoder chapter notes
This type of model is really two models in one. Given a variable-length input, the first half (the encoder) transforms it into a meaningful fixed-length vector, and that is then fed into the second half (the decoder), the traditional LSTM we know and love. The book discusses some strange shenanigans about how this was used for natural language processing, where a huge improvement to the model came from feeding the input in backwards??? That was very strange. I didn't understand that one bit.

The structure of this is similar to the CNN_LSTM<br>
Input --> Encoder --> Decoder --> Dense --> Output

### Uses
* Translating languages
* Predicting code execution
* Image captioning

In [2]:
# Necessary neural net imports
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Dense
from tensorflow.keras import activations

# Necessary number manip
import numpy as np
import pandas as pd
import math
import random as rand

# Necessary plotting imports
import matplotlib.pyplot as plt
import matplotlib.patches as patches

# I also want to hide stinky warning boxes, and 
# this: https://stackoverflow.com/questions/9031783/hide-all-warnings-in-ipython
# had some solutions.
def hide_warnings():
    # display() is needed here: an HTML object that is only created
    # (not returned or displayed) never reaches the notebook output.
    from IPython.display import HTML, display
    display(HTML('''<script>
    var code_show_err = false; 
    var code_toggle_err = function() {
     var stderrNodes = document.querySelectorAll('[data-mime-type="application/vnd.jupyter.stderr"]')
     var stderr = Array.from(stderrNodes)
     if (code_show_err){
         stderr.forEach(ele => ele.style.display = 'block');
     } else {
         stderr.forEach(ele => ele.style.display = 'none');
     }
     code_show_err = !code_show_err
    } 
    document.addEventListener('DOMContentLoaded', code_toggle_err);
    </script>
    To toggle on/off output_stderr, click <a onclick="javascript:code_toggle_err()">here</a>.'''))
    
hide_warnings()
###########################################################
print(f"Tensor Flow Version: {tf.__version__}")


2022-02-01 20:33:33.111421: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-02-01 20:33:33.111452: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


Tensor Flow Version: 2.4.0


---
O ho! On page 128 I get a clue to the way ```TimeDistributed``` works!!!
> The same weights can be used to output each time step in the output sequence by wrapping the ```Dense``` layer in a ```TimeDistributed``` wrapper.

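So, if I read this right, the wrapper reuses one ```Dense``` layer, with one set of weights, at every time step of the output, instead of learning a separate set of weights for each step.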
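
To convince myself what "the same weights at each time step" means, here is a minimal NumPy sketch. It is not Keras itself: the names `W`, `b` and `dense` and all the sizes are made up for illustration, and the "layer" is just a matrix multiply with no activation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical decoder output: 4 time steps, 3 features each
decoder_out = rng.normal(size=(4, 3))

# ONE set of Dense weights (3 features in, 2 units out), made up for this sketch
W = rng.normal(size=(3, 2))
b = np.zeros(2)

def dense(x):
    # A Dense layer without an activation is just x @ W + b
    return x @ W + b

# "Time distributed": loop over the time steps, reusing the same W and b
per_step = np.stack([dense(step) for step in decoder_out])

# Same thing in one shot, since each row of decoder_out is one time step
all_at_once = decoder_out @ W + b

print(per_step.shape)                      # (4, 2)
print(np.allclose(per_step, all_at_once))  # True
```

The point is that there is only one `W` and one `b`, no matter how many time steps come out of the decoder.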

---

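We have an issue: the encoder outputs one fixed-length vector per example, which is 2D (samples, features), while the decoder LSTM expects 3D input (samples, time steps, features), so the two do not fit together directly. To get around this, we can use the ```RepeatVector``` layer as an adapter. It repeats the encoder's vector once for each time step of the output sequence.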
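
A rough NumPy sketch of that shape problem and of what the repeating step does to it. This only mimics the shapes and is not the Keras layer; `batch`, `units` and `n_out_steps` are made-up values for illustration.

```python
import numpy as np

batch, units = 2, 5      # made-up sizes for illustration
n_out_steps = 3          # hypothetical: how many output characters we want

# Encoder result: one fixed-length vector per example -> 2D
encoded = np.arange(batch * units, dtype=float).reshape(batch, units)
print(encoded.shape)     # (2, 5)

# Copy each vector n_out_steps times along a new time axis -> 3D
repeated = np.repeat(encoded[:, np.newaxis, :], n_out_steps, axis=1)
print(repeated.shape)    # (2, 3, 5)

# Every time step holds the same encoded vector
print(np.array_equal(repeated[:, 0, :], repeated[:, 2, :]))  # True
```

So the decoder sees the same summary vector at every output step and has to work out from its own state which character comes next.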

# Defining the Addition Prediction Problem
If I write 10+6= I would expect 16. So in total I would have "10+6=16". For the computer I can define this as 
```
Input: ['1', '0', '+', '6']
Output: ['1', '6']
```
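
As a quick sanity check of that framing, a tiny sketch that splits a written-out sum into the two character lists. `split_problem` is a hypothetical helper just for this note, not something from the book.

```python
def split_problem(text):
    # Hypothetical helper: "10+6=16" -> (['1', '0', '+', '6'], ['1', '6'])
    left, right = text.split("=")
    return list(left), list(right)

chars_in, chars_out = split_problem("10+6=16")
print(chars_in)   # ['1', '0', '+', '6']
print(chars_out)  # ['1', '6']
```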

I imagine this data generation is going to be the hardest part!

### random_sum_pairs()

In [36]:
def random_sum_pairs(n_examples, largest_n, n_summation):
    '''
    Given:
    * n_examples: How many examples we want to generate
    * largest_n: The largest number we want
    * n_summation: the number of numbers we will be summing
    
    Output:
    * X: [[1,2,...n_summation],[largest_n, 2, ...],...n_examples]
    * y: [m, l,...n_examples]
    '''
    X, y = list(), list()
    for _ in range(n_examples):
        left_side = [rand.randint(1,largest_n) for _ in range(n_summation)]
        right_side = sum(left_side)
        X.append(left_side)
        y.append(right_side)
    return X,y

# Well this works well :)
random_sum_pairs(1, 10, 2)

([[2, 5]], [7])

The book now says we need to turn these into padded strings. Well, let's first think about how we will do this. We need to ensure that we don't run out of space, and yet don't create too much padding. We can figure out how many characters a number takes up via ```floor(log10(largest_n)+1)```. For example, plugging in 32 gives us 2 digits, 100 gives us 3, etc. Now we can multiply that by ```n_summation```, and add ```n_summation-1``` (for the ```+``` characters between the numbers). The longest possible output is the sum of ```n_summation``` copies of ```largest_n```. The full formulas for the lengths would be<br>
**INPUT**
> ```max_left = n_summation * np.floor(np.log10(largest_n)+1) + (n_summation-1)```

**OUTPUT**
> ```max_right =  np.floor(np.log10(n_summation * largest_n)+1)```
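
A small pure-Python check of the digit-count formula, assuming positive integers only. `n_digits` and `max_lengths` are names I made up for this sketch; the loop compares the log10 version against simply counting the characters of `str(n)`.

```python
import math

def n_digits(n):
    # Number of decimal digits in a positive integer
    return math.floor(math.log10(n) + 1)

def max_lengths(n_summation, largest_n):
    # Worst case input: largest_n repeated, joined by '+'
    max_left = n_summation * n_digits(largest_n) + (n_summation - 1)
    # Worst case output: the sum of n_summation copies of largest_n
    max_right = n_digits(n_summation * largest_n)
    return max_left, max_right

# Cross-check the log10 formula against counting characters
for n in (1, 9, 10, 32, 99, 100, 150):
    assert n_digits(n) == len(str(n))

print(max_lengths(2, 10))  # (5, 2)
```

For two numbers up to 10, the longest input is "10+10" (5 characters) and the longest output is "20" (2 characters), which matches.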

In [50]:
n_summation=3
largest_n=50
max_left = n_summation * np.floor(np.log10(largest_n)+1) + (n_summation-1)
max_right = np.floor(np.log10(n_summation * largest_n)+1)
print("Max input: ", max_left)
print("Max output: ", max_right)

Max input:  8.0
Max output:  3.0


### to_string(X,y,n_summation,largest_n)

In [107]:
def to_string(X_num,y_num,n_summation,largest_n):
    '''
    Turn the number lists from random_sum_pairs() into
    left-padded strings, e.g. [9, 1] -> "  9+1" and 10 -> "10"
    (with n_summation=2 and largest_n=10)
    '''
    X,y = list(), list()
    # Longest possible strings, as whole numbers of characters
    max_left = int(n_summation * np.floor(np.log10(largest_n)+1) + (n_summation-1))
    max_right = int(np.floor(np.log10(n_summation * largest_n)+1))
    for X_pattern, y_pattern in zip(X_num, y_num):
        # X padding
        X_str = "+".join([str(value) for value in X_pattern])
        X_pad = " "*(max_left-len(X_str))
        X.append(X_pad+X_str)
        
        # y padding
        y_str = str(y_pattern)
        y_pad = " "*(max_right-len(y_str))
        y.append(y_pad+y_str)
    return X,y
        
n_examples = 1
n_summation = 2
largest_n = 10
X,y = random_sum_pairs(n_examples, largest_n, n_summation)
print(X,y)
X,y = to_string(X,y,n_summation,largest_n)

[[9, 1]] [10]


In [87]:
def make_dummies(sequence, categories=None):
    '''
    Given a list "sequence" return it one-hot encoded
    This automatically finds all the unique values in
    the input and makes them the categories. Requires
    numpy and pandas as imports
    Usage: make_dummies([1,5,6])
    Returns an np array: 
    array([[1, 0, 0],
           [0, 1, 0],
           [0, 0, 1]], dtype=uint8)
    '''
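    if categories is None:
        # Sort so the column order is the same on every run
        categories = sorted(set(np.array(sequence).flatten()))
    sequence = pd.DataFrame(sequence,
                            dtype=pd.CategoricalDtype(categories=categories))
    # Ask for uint8 explicitly; newer pandas versions default to bool
    sequence = pd.get_dummies(sequence, dtype=np.uint8)
    return np.array(sequence)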

In [108]:
categories = [str(x) for x in range(0,10)]
categories.append('+')
categories.append(' ')
print(X[0])
make_dummies(list(X[0]), categories=categories)

  9+1


array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
       [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
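
Going the other way is useful for reading predictions later. A minimal sketch, assuming the same 12-character alphabet as in the cell above (digits, then '+', then ' '); `one_hot_decode` is a hypothetical helper, and the rows are built by hand so the snippet does not depend on ```make_dummies```.

```python
import numpy as np

# Same alphabet order as the categories list: digits, then '+', then ' '
alphabet = [str(d) for d in range(10)] + ['+', ' ']

def one_hot_decode(encoded, alphabet):
    # Hypothetical helper: find the hot column in each row, look up its character
    return "".join(alphabet[i] for i in np.argmax(encoded, axis=1))

# Build the rows for "  9+1" by hand: ' ', ' ', '9', '+', '1'
encoded = np.eye(len(alphabet), dtype=np.uint8)[[11, 11, 9, 10, 1]]
print(repr(one_hot_decode(encoded, alphabet)))  # '  9+1'
```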

In [95]:
list(X)

[' 6+10']

In [96]:
X

[' 6+10']