## 1. Why CNN?
A **CNN** preserves the local information in a sentence, so **the order in which words and expressions appear is reflected in learning**.
> https://ratsgo.github.io/natural%20language%20processing/2017/03/19/CNN/
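
To make the claim about word order concrete, here is a minimal NumPy sketch (not part of the original notebook). It compares an order-blind average of word vectors with a single bigram convolution filter. The embeddings and the filter are random placeholders, and the helper names `bag_of_words` and `conv_features` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-dimensional embeddings (random values, for illustration only).
emb = {w: rng.normal(size=2) for w in ["i", "love", "you"]}

def bag_of_words(tokens):
    # Averaging the embeddings discards word order entirely.
    return np.mean([emb[t] for t in tokens], axis=0)

def conv_features(tokens, kernel):
    # Slide a window of kernel.shape[0] words over the sentence ('VALID' padding)
    # and take the elementwise product-sum of each window with the kernel.
    x = np.stack([emb[t] for t in tokens])  # [seq_len, emb_dim]
    k = kernel.shape[0]
    return np.array([np.sum(x[i:i + k] * kernel) for i in range(len(tokens) - k + 1)])

kernel = rng.normal(size=(2, 2))  # one bigram filter

a = ["i", "love", "you"]
b = ["you", "love", "i"]

print(np.allclose(bag_of_words(a), bag_of_words(b)))  # True: order is lost
print(conv_features(a, kernel))  # differs from the line below: order matters
print(conv_features(b, kernel))
```

The averaged representation is identical for both sentences, while the convolution output changes when the words are reordered.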

## 2. Practice Code


In [1]:
import tensorflow as tf
import numpy as np

tf.reset_default_graph()

In [2]:
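# TextCNN parameters
embedding_size = 2 # dimension of each word embedding
sequence_length = 3 # words per sentence
num_classes = 2 # 0 (negative) or 1 (positive)
filter_sizes = [2,2,2] # each filter spans 2 consecutive words (bi-gram)
num_filters = 3 # filters per filter size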

In [3]:
# every sentence has 3 words (sequence_length = 3)
sentences = ["i love you", "he loves me", "she likes baseball", "he hates football", "i hate you", "sorry for that", "this is awful"]
# label for each sentence: 1 = positive, 0 = negative
labels = [1,1,1,0,0,0,0]

word_list = " ".join(sentences).split()
print(word_list)
#remove duplicates
word_list = list(set(word_list))
print(word_list)
word_dict = {w: i for i, w in enumerate(word_list)}
print(word_dict)
vocab_size = len(word_dict)
print(vocab_size)

['i', 'love', 'you', 'he', 'loves', 'me', 'she', 'likes', 'baseball', 'he', 'hates', 'football', 'i', 'hate', 'you', 'sorry', 'for', 'that', 'this', 'is', 'awful']
['likes', 'me', 'sorry', 'that', 'awful', 'she', 'he', 'football', 'for', 'hates', 'is', 'you', 'i', 'baseball', 'hate', 'loves', 'love', 'this']
{'likes': 0, 'me': 1, 'sorry': 2, 'that': 3, 'awful': 4, 'she': 5, 'he': 6, 'football': 7, 'for': 8, 'hates': 9, 'is': 10, 'you': 11, 'i': 12, 'baseball': 13, 'hate': 14, 'loves': 15, 'love': 16, 'this': 17}
18
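
Because `set` iteration order for strings can change between Python runs (string hashing is randomized by default), the word ids printed above are not guaranteed to be reproduced on a rerun. A possible alternative, sketched here as an assumption and not as the notebook's method, is to deduplicate with `dict.fromkeys`, which keeps first-occurrence order:

```python
sentences = ["i love you", "he loves me", "she likes baseball",
             "he hates football", "i hate you", "sorry for that", "this is awful"]

# dict.fromkeys keeps the first occurrence of each word in insertion order,
# so the ids no longer depend on set iteration order.
unique_words = list(dict.fromkeys(" ".join(sentences).split()))
word_dict = {w: i for i, w in enumerate(unique_words)}

print(word_dict["i"], word_dict["love"], word_dict["you"])  # 0 1 2
print(len(word_dict))  # 18
```

With this variant the ids are stable across runs, which makes printed inputs and saved embeddings comparable.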


In [4]:
inputs = []
for sen in sentences : 
    inputs.append(np.asarray([word_dict[n] for n in sen.split()]))

outputs = []
for out in labels:
    outputs.append(np.eye(num_classes)[out]) # one-hot encode the label for the softmax loss; np.eye builds the identity matrix

In [5]:
print(inputs)

[array([12, 16, 11]), array([ 6, 15,  1]), array([ 5,  0, 13]), array([6, 9, 7]), array([12, 14, 11]), array([2, 8, 3]), array([17, 10,  4])]


In [6]:
print(outputs)

[array([0., 1.]), array([0., 1.]), array([0., 1.]), array([1., 0.]), array([1., 0.]), array([1., 0.]), array([1., 0.])]
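
The one-hot step can also be written without a loop. The sketch below is an equivalent formulation under the same `labels` and `num_classes`, not a change to the notebook: it indexes the identity matrix with the whole label list at once.

```python
import numpy as np

num_classes = 2
labels = [1, 1, 1, 0, 0, 0, 0]

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the full label list encodes every label in one step.
one_hot = np.eye(num_classes)[labels]

print(one_hot.shape)            # (7, 2)
print(one_hot[0], one_hot[3])   # [0. 1.] [1. 0.]
```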


In [7]:
# model
X = tf.placeholder(tf.int32, [None, sequence_length])
Y = tf.placeholder(tf.int32, [None, num_classes])

# make lookup table: map each word ID to a vector
W = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0))
embedded_chars = tf.nn.embedding_lookup(W, X) # [batch_size, sequence_length, embedding_size]
embedded_chars = tf.expand_dims(embedded_chars, -1) # add channel(=1) [batch_size, sequence_length, embedding_size, 1]
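
The lookup and the added channel axis can be checked in plain NumPy. This is a shape-only sketch with a random table standing in for the trainable `W`; the name `table` is hypothetical, and the two id sequences are copied from the printed `inputs` above.

```python
import numpy as np

vocab_size, embedding_size = 18, 2
rng = np.random.default_rng(0)

# Lookup table: one row per word id (random placeholder values).
table = rng.uniform(-1.0, 1.0, size=(vocab_size, embedding_size))

# Two sentences as word-id sequences.
batch = np.array([[12, 16, 11], [6, 15, 1]])

embedded = table[batch]                   # [batch_size, sequence_length, embedding_size]
embedded = np.expand_dims(embedded, -1)   # add a channel axis of size 1

print(embedded.shape)  # (2, 3, 2, 1)
```

An embedding lookup is just row indexing into the table; the trailing axis exists only because `conv2d` expects a channel dimension.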



In [8]:
pooled_outputs = []
for i, filter_size in enumerate(filter_sizes):
    filter_shape = [filter_size, embedding_size, 1, num_filters]
    W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1))
    b = tf.Variable(tf.constant(0.1, shape=[num_filters]))
    
    # make conv layer
    conv = tf.nn.conv2d(embedded_chars,      # [batch_size, sequence_length, embedding_size, 1]
                        W,                   # [filter_size, embedding_size, 1, num_filters(=3)]
                        strides=[1,1,1,1],   # slide one batch item and one word position at a time
                        padding='VALID')
    h = tf.nn.relu(tf.nn.bias_add(conv, b))
    
    # max-pooling
    # ksize = size of the max-pooling window
    pooled = tf.nn.max_pool(h,
                            ksize=[1, sequence_length - filter_size + 1, 1, 1], # [batch, height, width, channel]
                            strides=[1, 1, 1, 1],
                            padding='VALID')
    pooled_outputs.append(pooled) # dim of pooled : [batch_size, output_height(=1), output_width(=1), num_filters(=3)]
    
num_filters_total = num_filters * len(filter_sizes)
h_pool = tf.concat(pooled_outputs, 3) # concatenate on the channel axis (axis 3); h_pool : [batch_size, 1, 1, num_filters(=3) * 3]
h_pool_flat = tf.reshape(h_pool, [-1, num_filters_total]) # [batch_size, num_filters_total(=9)]
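
The shapes in this cell are easier to follow for a single sentence. The NumPy sketch below mirrors the three steps (a 'VALID' convolution along the word axis, ReLU, and max pooling over all positions) and then concatenates the three branches. Weights are random placeholders and the helper `branch` is hypothetical; it illustrates the arithmetic and is not a replacement for the TensorFlow graph.

```python
import numpy as np

sequence_length, embedding_size = 3, 2
filter_sizes, num_filters = [2, 2, 2], 3
rng = np.random.default_rng(0)

x = rng.normal(size=(sequence_length, embedding_size))  # one embedded sentence

def branch(x, filters, bias):
    filter_size = filters.shape[0]
    # The filter spans the full embedding width, so it only slides along the
    # word axis and yields sequence_length - filter_size + 1 positions.
    positions = x.shape[0] - filter_size + 1
    conv = np.stack([
        np.tensordot(x[i:i + filter_size], filters, axes=([0, 1], [0, 1]))
        for i in range(positions)
    ])                                   # [positions, num_filters]
    h = np.maximum(conv + bias, 0.0)     # ReLU
    return h.max(axis=0)                 # max over positions -> [num_filters]

pooled = [
    branch(x, rng.normal(size=(fs, embedding_size, num_filters)), np.full(num_filters, 0.1))
    for fs in filter_sizes
]
features = np.concatenate(pooled)        # [num_filters * len(filter_sizes)]

print(pooled[0].shape, features.shape)   # (3,) (9,)
```

Each branch reduces the sentence to one value per filter, so the classifier input has `num_filters * len(filter_sizes)` = 9 features regardless of where in the sentence a pattern occurred.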

In [9]:
#tf.reset_default_graph()

# Model-Training
Weight = tf.get_variable('W', shape=[num_filters_total, num_classes], 
                    initializer=tf.contrib.layers.xavier_initializer())
Bias = tf.Variable(tf.constant(0.1, shape=[num_classes]))
model = tf.nn.xw_plus_b(h_pool_flat, Weight, Bias)  
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=model, labels=Y))
optimizer = tf.train.AdamOptimizer(0.001).minimize(cost)

# training
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for epoch in range(5000):
    _, loss = sess.run([optimizer, cost], feed_dict={X: inputs, Y: outputs})
    if (epoch+1)%500 == 0:
        print('Epoch:', '%06d' % (epoch + 1), 'cost =', '{:.6f}'.format(loss))

# Model-Predict
hypothesis = tf.nn.softmax(model)
predictions = tf.argmax(hypothesis, 1)

# test
test_text = 'sorry hate you'
tests = []
tests.append(np.asarray([word_dict[n] for n in test_text.split()]))

predict = sess.run([predictions], feed_dict={X: tests})
result = predict[0][0]
if result == 0:
    print(test_text,"contains negative meaning...")
else:
    print(test_text,"contains positive meaning!!")


Epoch: 000500 cost = 0.011613
Epoch: 001000 cost = 0.001719
Epoch: 001500 cost = 0.000649
Epoch: 002000 cost = 0.000324
Epoch: 002500 cost = 0.000185
Epoch: 003000 cost = 0.000115
Epoch: 003500 cost = 0.000075
Epoch: 004000 cost = 0.000051
Epoch: 004500 cost = 0.000035
Epoch: 005000 cost = 0.000025
sorry hate you contains negative meaning...
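
For reference, the loss and the prediction step reduce to a softmax, a cross-entropy average, and an argmax. The following NumPy sketch shows that arithmetic on hypothetical logits (the values are made up and do not come from the trained model); `softmax` and `cross_entropy` are illustrative helpers, not TensorFlow functions.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(logits, one_hot_labels):
    # Mean over the batch of -sum(labels * log(softmax(logits))).
    probs = softmax(logits)
    return float(np.mean(-np.sum(one_hot_labels * np.log(probs), axis=1)))

# Hypothetical logits for two sentences; column 0 = negative, column 1 = positive.
logits = np.array([[4.0, -2.0], [-1.0, 3.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])

print(softmax(logits).argmax(axis=1))  # [0 1]
print(cross_entropy(logits, labels))
```

A predicted index of 0 corresponds to the "negative" branch of the notebook's final `if`, and 1 to the "positive" branch.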
