### Sparse Representation
A method of directly mapping a word or meaning to a specific dimension of a vector

> Suppose the first dimension encodes shape and the second encodes color, where 0 means "like an apple" and 1 means "like a banana".<br>
Apple : [0,0]<br>
Banana : [1,1]<br>
Pear : [0,1]<br>
A pear has a shape similar to an apple and a color similar to a banana.

Shortcomings :<br>
The world contains an almost unlimited number of meaning categories, so it is not practical to assign every meaning its own dimension of the word vector.
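
The fruit example can be sketched in a few lines of plain Python. The dimension names `shape` and `color` and the helper `shared_dimensions` are assumptions made for this illustration, not part of any library.

```python
# Each dimension has a fixed, hand-assigned meaning (an assumption for this sketch):
# 0 = apple-like, 1 = banana-like.
DIMENSIONS = ("shape", "color")

fruits = {
    "apple":  (0, 0),
    "banana": (1, 1),
    "pear":   (0, 1),
}

def shared_dimensions(a, b):
    """Return the names of the dimensions on which two fruits agree."""
    return [name for name, x, y in zip(DIMENSIONS, fruits[a], fruits[b]) if x == y]

print(shared_dimensions("pear", "apple"))   # ['shape']
print(shared_dimensions("pear", "banana"))  # ['color']
```

Every new kind of meaning would need another entry in `DIMENSIONS`, which is exactly the shortcoming described here.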





### Distributed Representation

#### Distributional hypothesis
Words that appear in similar contexts have similar meanings.

#### Distributed representation
Word vectors are adjusted little by little so that words appearing in similar contexts move closer together and words that do not move farther apart.

- *No single dimension of the vector carries a specific meaning. The meaning is distributed across several dimensions of the vector.*
- *Similarity between words can be calculated.*
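
One common way to calculate similarity between word vectors is cosine similarity. The sketch below uses hand-written 3-dimensional vectors that are purely hypothetical; real vectors would come from training.

```python
import math

# Hypothetical word vectors, made up for illustration only.
vectors = {
    "coffee": [0.9, 0.1, 0.3],
    "tea":    [0.8, 0.2, 0.4],
    "dog":    [-0.5, 0.9, -0.2],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sim_close = cosine_similarity(vectors["coffee"], vectors["tea"])
sim_far = cosine_similarity(vectors["coffee"], vectors["dog"])
print(f"coffee vs tea: {sim_close:.3f}")
print(f"coffee vs dog: {sim_far:.3f}")
```

Vectors pointing in nearly the same direction score close to 1, while unrelated or opposed vectors score near 0 or below.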

### Embedding Layer
Essentially a vocabulary lookup for the computer.

The Embedding layer maps each word that comes in as input to its distributed representation.

THEN HOW DOES IT MAP THE WORDS?<br>
*By using One-hot Encoding*
> One-hot Encoding : <br>
- Express N words as N-dimensional vectors.
- Put 1 at the index assigned to the word and 0 everywhere else.


In [None]:
import tensorflow as tf

vocab = {
    'i' : 0,
    'need' : 1,
    'some' : 2,
    'more' : 3,
    'coffee' : 4,
    'cake' : 5,
    'cat' : 6,
    'dog' : 7
}

sentence = 'i i i i need some more coffee coffee coffee'

_input = [vocab[w] for w in sentence.split()]

vocab_size = len(vocab)

one_hot = tf.one_hot(_input,vocab_size)
print(one_hot.numpy())

[[1. 0. 0. 0. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0.]]


In [None]:
distribution_size = 2
linear = tf.keras.layers.Dense(units=distribution_size,use_bias=False)
one_hot_linear = linear(one_hot)

print('Linear Weight')
print(linear.weights[0].numpy())

print('\nOne-Hot Linear Result')
print(one_hot_linear.numpy())

Linear Weight
[[-0.22614408 -0.14133441]
 [ 0.6897342   0.5778433 ]
 [-0.47542423  0.43041813]
 [ 0.5307485  -0.20240974]
 [-0.5897074  -0.02290857]
 [ 0.20062464  0.65808415]
 [ 0.2632308  -0.3739692 ]
 [ 0.11422312  0.02952564]]

One-Hot Linear Result
[[-0.22614408 -0.14133441]
 [-0.22614408 -0.14133441]
 [-0.22614408 -0.14133441]
 [-0.22614408 -0.14133441]
 [ 0.6897342   0.5778433 ]
 [-0.47542423  0.43041813]
 [ 0.5307485  -0.20240974]
 [-0.5897074  -0.02290857]
 [-0.5897074  -0.02290857]
 [-0.5897074  -0.02290857]]
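
The result above shows that multiplying a one-hot vector by a weight matrix simply selects one row of that matrix. A minimal NumPy sketch of the same idea follows; the random `weight` matrix is a stand-in for the Dense layer's weight, not its actual values.

```python
import numpy as np

vocab_size, distribution_size = 8, 2
rng = np.random.default_rng(0)

# Stand-in for the linear layer's weight (random values, for illustration only).
weight = rng.normal(size=(vocab_size, distribution_size))

# Same word indices as the sentence used above.
word_ids = np.array([0, 0, 0, 0, 1, 2, 3, 4, 4, 4])
one_hot = np.eye(vocab_size)[word_ids]   # shape (10, 8)

by_matmul = one_hot @ weight             # one-hot encoding followed by a linear layer
by_lookup = weight[word_ids]             # direct row lookup by index

print(np.allclose(by_matmul, by_lookup))  # True
```

Looking up rows by index gives the same result without building the one-hot matrix, which is why an embedding layer can take word indices directly.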


In [None]:
some_words = tf.constant([3,57,35])

print('Sentence for Embedding : ',some_words.shape)
embedding_layer = tf.keras.layers.Embedding(input_dim=64,output_dim=100)

print('Embedded Sentence : ',embedding_layer(some_words).shape)
print('Weight Form of Embedding Layer : ',embedding_layer.weights[0])

Sentence for Embedding :  (3,)
Embedded Sentence :  (3, 100)
Weight Form of Embedding Layer :  <tf.Variable 'embedding/embeddings:0' shape=(64, 100) dtype=float32, numpy=
array([[ 0.01321033, -0.01810903,  0.00808107, ...,  0.00741605,
        -0.01477399, -0.00947008],
       [-0.03704268,  0.0153797 ,  0.03202203, ..., -0.01356131,
        -0.01437421, -0.03563211],
       [ 0.01879281, -0.00787201,  0.00132377, ..., -0.02735884,
         0.01027121,  0.01781802],
       ...,
       [ 0.04857221, -0.03118016,  0.03480772, ...,  0.01560559,
        -0.03280371,  0.0484472 ],
       [ 0.04397647, -0.02406415, -0.04433781, ...,  0.04927639,
         0.04118277, -0.04138882],
       [-0.03885026,  0.02158064,  0.02357776, ...,  0.02342123,
        -0.00356811, -0.03133275]], dtype=float32)>


The Embedding layer only looks up the vector that corresponds to each word index, and that lookup is not differentiable with respect to its integer input.
Therefore, the output of another layer cannot be fed into the Embedding layer and trained through it.

*The Embedding layer should be connected directly to the input.*


### Sequential
- Data arranged as a list can still count as sequential data even when its elements have no connection to one another.
- In deep learning, however, sequential data must have connections between the elements of the list.


### RNN
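A model that processes sequential data.
It reuses the same weights at every step: an input weight of shape `(input dimension, output dimension)` is applied to each element of the sequence in turn, together with a recurrent weight applied to the previous step's output.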
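
The step-by-step update can be sketched in NumPy as follows. This is a simplified picture of a simple RNN without bias, not the Keras implementation; the names `w_input` and `w_recurrent` and the random values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
steps, input_dim, units = 5, 100, 64

x = rng.normal(size=(steps, input_dim))                     # one sequence of 5 embedded words
w_input = rng.normal(scale=0.1, size=(input_dim, units))    # applied to the current input
w_recurrent = rng.normal(scale=0.1, size=(units, units))    # applied to the previous output

h = np.zeros(units)        # initial state
outputs = []
for x_t in x:              # the same weights are reused at every step
    h = np.tanh(x_t @ w_input + h @ w_recurrent)
    outputs.append(h)

all_steps = np.stack(outputs)   # one output per step
final_step = all_steps[-1]      # output of the last step only
print(all_steps.shape, final_step.shape)  # (5, 64) (64,)
```

Keeping every step's output corresponds to what `return_sequences=True` returns, while keeping only the last one corresponds to the default behavior.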

*Problems of RNN*
- Vanishing gradient : the gradient shrinks as it is propagated back through the steps, so the influence of the early part of the input fades and that information is lost.
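
The effect can be illustrated numerically. The sketch below accumulates the derivative of the current state with respect to the initial state for a small tanh RNN; the layer size, number of steps, and small random weights are assumptions chosen to make the shrinkage visible, not values from the code in this document.

```python
import numpy as np

rng = np.random.default_rng(0)
units, steps = 16, 30
w_recurrent = rng.normal(scale=0.1, size=(units, units))  # small weights, for illustration

h = np.zeros(units)
jacobian = np.eye(units)   # d h_t / d h_0, accumulated step by step
norms = []
for _ in range(steps):
    x_t = rng.normal(size=units)            # stand-in for the input contribution
    h = np.tanh(x_t + h @ w_recurrent)
    # Chain rule: d h_t[j] / d h_{t-1}[i] = W[i, j] * (1 - h_t[j] ** 2)
    step_jacobian = w_recurrent * (1.0 - h ** 2)
    jacobian = jacobian @ step_jacobian
    norms.append(np.linalg.norm(jacobian))

print(f"after 1 step:   {norms[0]:.3e}")
print(f"after {steps} steps: {norms[-1]:.3e}")
```

Because each step multiplies by another factor, the sensitivity to early inputs shrinks rapidly in this setting as the sequence gets longer.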
 

In [None]:
sentence = 'What time is it ?'
dic = {
    'is' : 0,
    'it' : 1,
    'What' : 2,
    'time' : 3,
    '?' : 4
}

print('Sentence for RNN',sentence)

sentence_tensor = tf.constant([[dic[word] for word in sentence.split()]])

print('word mapping for Embedding : ',sentence_tensor.numpy())
print('Form of input data : ',sentence_tensor.shape)

embedding_layer = tf.keras.layers.Embedding(input_dim=len(dic),output_dim=100)
emb_out = embedding_layer(sentence_tensor)

print('\nEmbedding Result : ',emb_out.shape)
print('Weight Form of Embedding layer : ',embedding_layer.weights[0].shape)

rnn_seq_layer = \
tf.keras.layers.SimpleRNN(units=64, return_sequences=True,use_bias=False)
rnn_seq_out = rnn_seq_layer(emb_out)

print('\nRNN Result : ',rnn_seq_out.shape)
print('Weight Form of RNN layer : ',rnn_seq_layer.weights[0].shape)

rnn_fin_layer = tf.keras.layers.SimpleRNN(units=64,use_bias=False)
rnn_fin_out = rnn_fin_layer(emb_out)

print('\n RNN Result : ',rnn_fin_out.shape)
print('Weight Form of RNN layer : ',rnn_fin_layer.weights[0].shape)

Sentence for RNN What time is it ?
word mapping for Embedding :  [[2 3 0 1 4]]
Form of input data :  (1, 5)

Embedding Result :  (1, 5, 100)
Weight Form of Embedding layer :  (5, 100)

RNN Result :  (1, 5, 64)
Weight Form of RNN layer :  (100, 64)

 RNN Result :  (1, 64)
Weight Form of RNN layer :  (100, 64)


To judge whether a sentence is positive or negative, you can read the entire sentence and check the output of the last step.<br>
When generating a sentence, however, or when a judgment is needed at every step, you need the output of every step.
 - In that case, pass `return_sequences=True` to `tf.keras.layers.SimpleRNN`.

### LSTM
Long Short-Term Memory<br>
The model is designed to avoid the vanishing gradient problem.<br>

- An LSTM is basically an RNN layer with 4 different weights.<br>
- Each weight is part of a structure called a "gate", which determines which data should or should not affect the next step.
- Because the LSTM introduces a new concept called the 'cell state', it preserves old input data without much loss.
- The cell state has data added to or subtracted from it as it passes through the gates.

> GRU : a modified version of LSTM<br>
- The Forget Gate and Input Gate are combined into an Update Gate<br>
- The Cell state and Hidden state are also combined

In [None]:
lstm_seq_layer = tf.keras.layers.LSTM(units=64, return_sequences=True, use_bias=False)
lstm_seq_out = lstm_seq_layer(emb_out)

print("\nLSTM Result (all step outputs):", lstm_seq_out.shape)
print("Weight Form of LSTM layer:", lstm_seq_layer.weights[0].shape)

lstm_fin_layer = tf.keras.layers.LSTM(units=64, use_bias=False)
lstm_fin_out = lstm_fin_layer(emb_out)

print("\nLSTM Result (final step output):", lstm_fin_out.shape)
print("Weight Form of LSTM layer:", lstm_fin_layer.weights[0].shape)


LSTM Result (all step outputs): (1, 5, 64)
Weight Form of LSTM layer: (100, 256)

LSTM Result (final step output): (1, 64)
Weight Form of LSTM layer: (100, 256)
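
The `(100, 256)` weight shape is 4 times the 64 units, one block for each of the 4 weights. A single LSTM step can be sketched in NumPy as below. This follows the standard LSTM equations without bias; the order of the four blocks inside the kernel and the random values are assumptions for illustration, not a description of the Keras internals.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
input_dim, units = 100, 64

# One kernel holding four blocks side by side: three gates plus the candidate values.
kernel = rng.normal(scale=0.1, size=(input_dim, 4 * units))         # (100, 256)
recurrent_kernel = rng.normal(scale=0.1, size=(units, 4 * units))   # (64, 256)

x_t = rng.normal(size=input_dim)   # current input
h_prev = np.zeros(units)           # previous hidden state
c_prev = np.zeros(units)           # previous cell state

z = x_t @ kernel + h_prev @ recurrent_kernel   # shape (256,)
z_i, z_f, z_g, z_o = np.split(z, 4)            # block order is an assumption

i = sigmoid(z_i)    # input gate: how much new information to add
f = sigmoid(z_f)    # forget gate: how much of the old cell state to keep
g = np.tanh(z_g)    # candidate values to write into the cell state
o = sigmoid(z_o)    # output gate: how much of the cell state to expose

c_t = f * c_prev + i * g    # cell state: drop some old data, add some new
h_t = o * np.tanh(c_t)      # hidden state passed on to the next step
print(kernel.shape, c_t.shape, h_t.shape)  # (100, 256) (64,) (64,)
```

The cell state is updated only by element-wise scaling and addition, which is what lets old information pass through many steps without much loss.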


### Bidirectional RNN
- An RNN that also processes the input in the reverse direction.<br>
- Two overlapping RNNs running in opposite directions.<br>
- Use `tf.keras.layers.Bidirectional()`<br>
- It is more advantageous for tasks such as machine translation than for analyzing or generating sentences.

In [None]:
import tensorflow as tf

sentence = 'What time is it ?'
dic = {
    'is' : 0,
    'it': 1,
    'What' : 2,
    'time' : 3,
    '?' : 4,
}

sentence_tensor = tf.constant([[dic[word] for word in sentence.split()]])

embedding_layer = tf.keras.layers.Embedding(input_dim=len(dic),output_dim=100)
emb_out = embedding_layer(sentence_tensor)

print('Form of input data :',emb_out.shape)

bi_rnn = \
tf.keras.layers.Bidirectional(
    tf.keras.layers.SimpleRNN(units=64,use_bias=False,return_sequences=True) 
)
bi_out = bi_rnn(emb_out)

print('Bidirectional RNN Result : ',bi_out.shape)

Form of input data : (1, 5, 100)
Bidirectional RNN Result :  (1, 5, 128)
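
The output dimension doubles from 64 to 128 because the outputs of the two directions are concatenated. A rough NumPy sketch of that idea follows, assuming a simple tanh RNN without bias; `run_rnn` and `make_weights` are hypothetical helpers written for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
steps, input_dim, units = 5, 100, 64
x = rng.normal(size=(steps, input_dim))   # one sequence of 5 embedded words

def make_weights():
    """Create random input and recurrent weights (illustrative values only)."""
    return (rng.normal(scale=0.1, size=(input_dim, units)),
            rng.normal(scale=0.1, size=(units, units)))

def run_rnn(sequence, w_input, w_recurrent):
    """Run a simple tanh RNN over the sequence and return the output of every step."""
    h = np.zeros(w_recurrent.shape[0])
    outputs = []
    for x_t in sequence:
        h = np.tanh(x_t @ w_input + h @ w_recurrent)
        outputs.append(h)
    return np.stack(outputs)

# Each direction has its own weights.
forward = run_rnn(x, *make_weights())
# Read the sequence in reverse, then flip the outputs back so the steps line up.
backward = run_rnn(x[::-1], *make_weights())[::-1]

bi_out = np.concatenate([forward, backward], axis=-1)
print(bi_out.shape)  # (5, 128)
```

At each position the combined vector carries context from both the words before it and the words after it.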
