# Recurrent Neural Network
### Austin Bean
### 06/20/2019
# nn notes

<!-- idea: write in JMD to produce PDF via command below, but convert to ipynb to make executable notebook. -->
<!-- in principle notebook may be more important so probably worth writing that first... -->
<!-- see: https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html -->
<!--from within the directory: weave(joinpath(pwd(), "nn_notes.jmd"), out_path=:pwd, doctype="md2pdf")-->
<!-- convert to ipynb: convert_doc("nn_notes.jmd", "nn_notes.ipynb") -->
<!-- http://weavejl.mpastell.com/stable/notebooks/#Output-to-Jupyter-notebooks-1 -->
<!--
This can run within the Jupyter environment after conversion using the function above
Requires:
- install anaconda
- install IJulia
- from within Julia, run:
using IJulia
notebook()

note that the weave command should be run from a terminal, not within Atom.

Julia markdown syntax:
https://docs.julialang.org/en/v1/stdlib/Markdown/#Markdown-1

Weave extension:
http://weavejl.mpastell.com/stable/publish/
(under supported markdown syntax)
-->


- Word embeddings come from word2vec.
- Should the RNN be trained before or after the embeddings?
- In other words: which features go into the model?

## Import some labeled data

- Word embeddings come from word2vec.
- Embeddings are organized in the file `train_embedding.jl`.
- Labels come from a Stata file.
- Import the data, then pull out the labels and the inputs.


First load the data:

In [None]:
using Flux, CSVFiles, DataFrames, Word2Vec

dat1 = DataFrame(load("embedded_data.csv"));

Now separate the labels:

In [None]:
# The labels are stored in column x1.
target = convert(Vector{Float64}, dat1.x1)

Next load the inputs. The longest diet sentence is 22 words long, so there are at most 22 columns in this data.

In [None]:
# The inputs are columns x2 through x22, one row per sentence.
input = Matrix{Float64}(dat1[:, Symbol.("x", 2:22)])

Load the vector of word embeddings:

In [None]:
embed_1 = wordvectors("./diet_embed");

get_vector(embed_1, "NEOSURE")

- Now set up the RNN.
- A bidirectional RNN is best, I think.
- At each step it should take a (word, state) pair as input and output a new state to be fed back in.
<!-- https://colah.github.io/posts/2015-08-Understanding-LSTMs/ -->

<!-- https://github.com/maetshju/flux-blstm-implementation/blob/master/01-blstm.jl -->
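
The (word, state) to state loop described above can be sketched in plain Julia, without Flux. This is only a minimal illustration: the names `rnn_step`, `scan_states` and `bi_states`, the `tanh` update, and the toy weights are all assumptions made for the example, and a real bidirectional RNN would normally use separate trained weights for each direction.

```julia
# Hypothetical helper names; none of these come from Flux or from these notes.

# One recurrent step: (word, state) -> new state.
# Assumes a scalar word input `x` and a state vector `s`.
rnn_step(x, s, Wx, Ws) = tanh.(Wx .* x .+ Ws * s)

# Feed the words through `rnn_step` in order and keep the state after each word.
function scan_states(xs, s0, Wx, Ws)
    states = Vector{Vector{Float64}}()
    s = s0
    for x in xs
        s = rnn_step(x, s, Wx, Ws)
        push!(states, s)
    end
    return states
end

# Bidirectional pass: scan left-to-right and right-to-left,
# then concatenate the two states at each position.
# For brevity both directions share the same weights here.
function bi_states(xs, s0, Wx, Ws)
    fwd = scan_states(xs, s0, Wx, Ws)
    bwd = reverse(scan_states(reverse(xs), s0, Wx, Ws))
    return [vcat(f, b) for (f, b) in zip(fwd, bwd)]
end

d  = 3                        # state size, chosen arbitrarily
Wx = fill(0.1, d)             # toy weights, not trained
Ws = fill(0.05, d, d)
xs = [1.0, -2.0, 0.5, 3.0]    # a toy "sentence" of four scalar inputs
out = bi_states(xs, zeros(d), Wx, Ws)
```

Each element of `out` has length `2d`: the first half summarizes the words up to that position, the second half summarizes the words from that position to the end.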

## RNN from Goldberg:

- We are given an input sequence $x_{1}, \ldots, x_{n}$. Each input is represented by a vector $x_{i} \in \mathbb{R}^{d_{in}}$.
- In my case the inputs are the words. Each word has a one-dimensional input in $\mathbb{R}$ (so $d_{in} = 1$) from the embeddings given by word2vec, though this need not be the case more generally. Single sentences are stored as rows of the `input` table above.
- The RNN maps the sequence of inputs $x_{1}, \ldots, x_{n}$ to the output $y_{n} \in \mathbb{R}^{d_{out}}$:
$$
y_{1:n} = RNN^* (x_{1:n})
$$
$$
y_{i} = RNN(x_{1:i})
$$
$$
x_{i} \in \mathbb{R}^{d_{in}}, y_{i} \in \mathbb{R}^{d_{out}}
$$
- Think of $y_{i}$ as a different output for each possible $i=1, \ldots, n$, computed from the prefix $x_{1:i}$.
- $y_{1:n}=RNN^* (x_{1:n})$ represents the entire sequence of outputs $y_{1}, y_{2}, y_{3}, \ldots, y_{n}$.
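
The equations above can also be written recursively, assuming the usual state-passing formulation of an RNN. The state $s_{i}$, the initial state $s_{0}$, and the functions $R$ and $O$ are notation introduced here for illustration and are not used elsewhere in these notes:

$$
s_{i} = R(s_{i-1}, x_{i}), \qquad y_{i} = O(s_{i})
$$
$$
RNN^* (x_{1:n}; s_{0}) = y_{1:n}
$$

Unrolling the recursion shows why $y_{i}$ depends on the whole prefix $x_{1:i}$. For example, with $i = 3$:

$$
y_{3} = O(s_{3}) = O(R(R(R(s_{0}, x_{1}), x_{2}), x_{3}))
$$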

In this case the features for a sentence are simply the corresponding row of the `input` table. A whole row, however, does not capture the fact that the set of features fed to the model grows with $i$: the output $y_{i}$ depends only on $x_{1:i}$.


# Implementation

Now define some functions.

In [None]:
"""
    inp_vec(x, i; bi=false)

Take a vector of features `x` (one sentence) and return the input vector at position `i`.

With `bi = false`, return the embeddings of the words `1:i`.
With `bi = true`, return the words `1:i` followed by the words `i:end` in reverse
order, so that word `i` appears twice.
"""
function inp_vec(x, i; bi=false)
  if bi
    # Forward context up to i, then backward context read from the end down to i.
    return vcat(x[1:i], reverse(x[i:end]))
  else
    # Slicing already returns a copy, so the caller's data is never mutated.
    return x[1:i]
  end
end
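
A short usage sketch may make the growing input concrete. The values in `sentence` are made up for illustration and do not come from the data, and `inp_vec` is repeated in compact form only so that the block runs on its own.

```julia
# Compact restatement of `inp_vec` so this block is self-contained.
function inp_vec(x, i; bi=false)
    return bi ? vcat(x[1:i], reverse(x[i:end])) : x[1:i]
end

sentence = [0.1, 0.2, 0.3, 0.4, 0.5]   # made-up embedding values

# Unidirectional: the input grows by one word at each position i.
prefixes = [inp_vec(sentence, i) for i in 1:length(sentence)]
# prefixes[3] == [0.1, 0.2, 0.3]

# Bidirectional at i = 2: words 1:2, then words 2:end in reverse order.
both = inp_vec(sentence, 2; bi=true)
# both == [0.1, 0.2, 0.5, 0.4, 0.3, 0.2]
```

Note that with `bi = true` the result always has one more element than the sentence, because word `i` is included in both halves.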