# Adversarial Training on Pointer Generator
## Introduction
The beginning of the introduction.

## Table Of Contents:
* [Load Data & Initialize Model](#load-initialize)
* [Train Pointer Generator](#train-global-1)
    * [Without Coverage](#train-global-1-sub-1)
        * [Generate Tokens](#gen-global-1-sub-1)
        * [Rouge Evaluation](#rouge-global-1-sub-1)
    * [With Coverage](#train-global-1-sub-2)
        * [Generate Tokens](#gen-global-1-sub-2)
        * [Rouge Evaluation](#rouge-global-1-sub-2)
* [Train Generative Adversarial Network](#train-global-2)
    * [Pretrain Discriminator](#train-global-2-sub-1)
        * [Generate Tokens](#gen-global-2-sub-1)
        * [Rouge Evaluation](#rouge-global-2-sub-1)
    * [Adversarial Training](#train-global-2-sub-2)
        * [Generate Tokens](#gen-global-2-sub-2)
        * [Rouge Evaluation](#rouge-global-2-sub-2)
* [Analysis & Conclusion](#analysis-conclusion)
* [Limitations & Future Work](#limit-future)


## Load Data & Initialize Model <a class="anchor" id="load-initialize"></a>

In [1]:
import numpy as np

In [2]:
from data import Data
from model import SummaryModel
import argparse

import tensorflow as tf

tf.compat.v1.disable_eager_execution()
tf.compat.v1.logging.set_verbosity('ERROR')

parser = argparse.ArgumentParser(description = 'Train/Test summarization model', formatter_class = argparse.ArgumentDefaultsHelpFormatter)

# Import Setting
parser.add_argument("--doc_file", type = str, default = './data/doc.p', help = 'path to document file')
parser.add_argument("--vocab_file", type = str, default = './data/vocab.p', help = 'path to vocabulary file')
parser.add_argument("--emb_file", type = str, default = './data/emb.p', help = 'path to embedding file')
parser.add_argument("--src_time", type = int, default = 1000, help = 'maximal # of time steps in source text')
parser.add_argument("--sum_time", type = int, default = 100, help = 'maximal # of time steps in summary')
parser.add_argument("--max_oov_bucket", type = int, default = 280, help = 'maximal # of out-of-vocabulary words in one summary')
parser.add_argument("--train_ratio", type = float, default = 0.1, help = 'ratio of training data')
parser.add_argument("--seed", type = int, default = 888, help = 'seed for splitting data')

# Saving Setting
parser.add_argument("--log", type = str, default = './log/', help = 'logging directory')
parser.add_argument("--save", type = str, default = './model/', help = 'model saving directory')
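parser.add_argument("--checkpoint", type = str, help = 'path to checkpoint')
# store_true instead of type = bool: bool('False') is True, so type = bool cannot be switched off from the command line
parser.add_argument("--autosearch", action = 'store_true', help = "[NOT AVAILABLE] Pass this flag to search for the latest checkpoint")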
parser.add_argument("--save_interval", type = int, default = 1760, help = "Save interval for training")

# Hyperparameter Setting
parser.add_argument("--batch_size", type = int, default = 16, help = 'number of samples in one batch')
parser.add_argument("--gen_lr", type = float, default = 1e-3, help = 'learning rate for generator')
parser.add_argument("--dis_lr", type = float, default = 1e-3, help = 'learning rate for discriminator')
parser.add_argument("--cov_weight", type = float, default = 1e-3, help = 'weight of the coverage loss')

params = vars(parser.parse_args([]))

# params['load_pretrain'] = True
# # untrained model with glove embedding fine tuned on NYT dataset
# params['checkpoint'] = './model/model_untrain_glove-0' # Uncomment when requiring reloading model

model = SummaryModel(**params)
data = Data(**params)


437380 437660


  self.enc_fw_unit = tf.compat.v1.nn.rnn_cell.LSTMCell(self.num_unit, name='encoder_forward_cell')
  self.enc_bw_unit = tf.compat.v1.nn.rnn_cell.LSTMCell(self.num_unit, name='encoder_backward_cell')
  self.dec_unit = tf.compat.v1.nn.rnn_cell.LSTMCell(self.num_unit, state_is_tuple=False, name='decoder_cell')
  self.dis_enc_unit = tf.compat.v1.nn.rnn_cell.LSTMCell(self.num_unit, name='dis_enc_unit')
  self.dis_dec_unit = tf.compat.v1.nn.rnn_cell.LSTMCell(self.num_unit, name='dis_dec_unit')
  self.bas_enc_unit = tf.compat.v1.nn.rnn_cell.LSTMCell(self.num_unit, name='bas_enc_unit')
  self.bas_dec_unit = tf.compat.v1.nn.rnn_cell.LSTMCell(self.num_unit, name='bas_dec_unit')
2022-12-10 17:53:08.745869: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
2022-12-10 17:53:08.893713: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
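
Boolean command-line options are easy to get wrong with `argparse`: `type=bool` calls `bool()` on the raw string, and every non-empty string, including `"False"`, is truthy. The sketch below contrasts that with a flag-style option. The two parsers are standalone illustrations and are not the notebook's `parser`.

```python
import argparse

# Pitfall: bool("False") is True, so any non-empty value enables the option.
naive = argparse.ArgumentParser()
naive.add_argument("--autosearch", type=bool, default=False)
naive_value = naive.parse_args(["--autosearch", "False"]).autosearch

# A flag-style option is unambiguous: present means True, absent means False.
flag = argparse.ArgumentParser()
flag.add_argument("--autosearch", action="store_true")
flag_absent = flag.parse_args([]).autosearch
flag_present = flag.parse_args(["--autosearch"]).autosearch

print(naive_value, flag_absent, flag_present)
```

Because the notebook calls `parser.parse_args([])`, the default is what actually reaches `SummaryModel` and `Data` here; the difference only matters when the script is run from a shell.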


In [3]:
train_data = data.get_next_epoch()
test_data = data.get_next_epoch_test()
src, ref, gen, tokens, scores, attens, gt_attens = None, None, None, None, None, None, None
for feed_dict in train_data:
    real, fake, real_len, fake_len = model.sess.run(
        [model.real_reward, model.fake_reward, model.sum_len, model.tokens_len], feed_dict=feed_dict)
    print(np.mean(real[1, 0:int(real_len[1])]))
    print(np.mean(fake[1, 0:int(fake_len[1])]))
    break

for feed_dict in test_data:
    tokens, scores, attens = model.beam_search(feed_dict)
    src, ref, gen = data.id2word(feed_dict, tokens)
    gt_attens = model.sess.run(model.atten_dist, feed_dict = feed_dict)
#     print(src, ref, gen, gt_attens)
    print(gen)
    break

IndexError: index 100 is out of bounds for axis 0 with size 100
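
The traceback is not included, so the failing line is unknown. One plausible cause, stated here purely as an assumption, is that a sequence length equal to `--sum_time` (100 by default) is used as an index into an axis of size 100, whose valid indices stop at 99. The sketch below reproduces that kind of error with a toy array and shows one way to guard against it; `rewards` and `length` are hypothetical names.

```python
import numpy as np

sum_time = 100                  # same value as the --sum_time default
rewards = np.zeros(sum_time)    # toy stand-in for one row of per-step values

length = 100                    # hypothetical: a summary that fills every step
try:
    rewards[length]             # valid indices are 0 .. sum_time - 1
    message = ""
except IndexError as err:
    message = str(err)

# Clamping the index keeps it inside the array.
safe_index = min(length, sum_time - 1)
last_value = rewards[safe_index]

print(message)
print(safe_index)
```

Slicing with `rewards[0:length]` is safe for the same `length`, so the culprit, if this guess is right, would be a direct index or a gather rather than a slice.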

## Train Pointer Generator<a class="anchor" id="train-global-1"></a>
### Train without coverage<a class="anchor" id="train-global-1-sub-1"></a>

In [4]:
train_max_epoch = 2
print (f'Start from step {model.sess.run(model.gen_global_step)}')
for i in range(train_max_epoch):
    print (f'Train Epoch {i}')
    train_data = data.get_next_epoch()
    model.train_one_epoch(train_data, data.n_train_batch, coverage_on = False)

Start from step 0
Train Epoch 0


  0%|          | 0/220 [00:00<?, ?it/s]

IndexError: index 100 is out of bounds for axis 0 with size 100

#### Generate tokens <a class="anchor" id="gen-global-1-sub-1"></a>

In [None]:
train_data = data.get_next_epoch()
test_data = data.get_next_epoch_test()

In [None]:
src, ref, gen, tokens, scores, attens, gt_attens = None, None, None, None, None, None, None

In [None]:
for feed_dict in train_data:
    real, fake, real_len, fake_len = model.sess.run([model.real_reward, model.fake_reward, model.sum_len, model.tokens_len], feed_dict = feed_dict)
    break

In [None]:
print (np.mean(real[1, 0:int(real_len[1])]))
print (np.mean(fake[1, 0:int(fake_len[1])]))
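
The two prints above average the per-step rewards of sample 1 over its valid time steps only. A minimal sketch of the same computation for a whole batch follows, assuming the rewards have shape `[batch, time]` and the lengths have shape `[batch]`; `masked_mean` and the toy arrays are hypothetical and not part of `SummaryModel`.

```python
import numpy as np

def masked_mean(rewards, lengths):
    """Average each row of `rewards` over its first `lengths[i]` steps."""
    rewards = np.asarray(rewards, dtype=float)
    lengths = np.asarray(lengths).astype(int)
    steps = np.arange(rewards.shape[1])[None, :]   # shape [1, time]
    mask = steps < lengths[:, None]                # shape [batch, time]
    totals = (rewards * mask).sum(axis=1)
    return totals / np.maximum(lengths, 1)         # guard against zero length

toy_rewards = np.array([[0.2, 0.4, 0.9, 0.9],
                        [0.5, 0.5, 0.5, 0.5]])
toy_lengths = np.array([2, 4])
toy_means = masked_mean(toy_rewards, toy_lengths)
print(toy_means)
```

Padding steps are excluded by the mask, so the values 0.9 in the first toy row do not affect its mean.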

In [None]:
%%time
for feed_dict in test_data:
    tokens, scores, attens = model.beam_search(feed_dict)
    src, ref, gen = data.id2word(feed_dict, tokens)
    gt_attens = model.sess.run(model.atten_dist, feed_dict = feed_dict)
    break
    

In [None]:
x = 0
print ("".join(src[x]).replace("(OOV)",""), end = '\n\n')
print ("".join(ref[x]).replace("(OOV)",""), end = '\n\n')
print ("".join(gen[x]).replace("(OOV)",""), end = '\n\n')
print (scores[x])

In [None]:
def generate_top_k_tokens(top_k, coverage):
    test_data = data.get_next_epoch_test()
    src = [[] for i in range(top_k)]
    ref = [[] for i in range(top_k)]
    gen = [[] for i in range(top_k)]
    for feed_dict in test_data:
        tokens, scores, attens = model.beam_search(feed_dict, coverage_on = coverage, top_k = top_k)
        for i in range(top_k):
            src[i], ref[i], gen[i] = data.id2word(feed_dict, tokens[i])
#         feed_dict['coverage_on:0'] = coverage
#         gt_attens = model.sess.run(model.atten_dist, feed_dict = feed_dict)
        break
    return src, ref, gen, scores

def print_generated_tokens(src, ref, gen, scores):
    print ("".join(src[0][0]).replace("(OOV)", ""), end = '\n\n')
    print ("".join(ref[0][0]).replace("(OOV)", ""), end = '\n\n')
    for i in range(len(src)):
        print ("".join(gen[i][0]).replace("(OOV)", ""))
        print (scores[i][0], end = '\n\n')

In [None]:
test1_src, test1_ref, test1_gen, test1_scores = generate_top_k_tokens(10, False)
print_generated_tokens(test1_src, test1_ref, test1_gen, test1_scores)

#### Rouge Evaluation<a class="anchor" id="rouge-global-1-sub-1"></a>

In [None]:
from rouge import Rouge
rouge = Rouge()

def rouge_evaluation(ref, gen):
    # remove pairs whose generated summary is empty; ref is filtered first
    # because both filters test the unfiltered gen
    ref = [r for r, g in zip(ref, gen) if g != ""]
    gen = [g for g in gen if g != ""]
    # calculate rouge score (hypotheses first, then references)
    rouge_score = rouge.get_scores(gen, ref)
    r1, r2, rl = 0., 0., 0.
    for score in rouge_score:
        r1 = r1 + score['rouge-1']['f']
        r2 = r2 + score['rouge-2']['f']
        rl = rl + score['rouge-l']['f']
    r1 /= len(rouge_score)
    r2 /= len(rouge_score)
    rl /= len(rouge_score)
    print (r1, r2, rl)
    return r1, r2, rl

In [None]:
test1_r1, test1_r2, test1_rl = rouge_evaluation(test1_ref, test1_gen)
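
For intuition about the `'f'` values averaged above, here is a minimal sketch of a ROUGE-1 F-score computed by hand from unigram overlap. It assumes whitespace-separated tokens; the `rouge` package may tokenize and smooth differently, so its numbers need not match this sketch exactly. `rouge_1_f` is a hypothetical helper.

```python
from collections import Counter

def rouge_1_f(hypothesis, reference):
    """Unigram-overlap F1 between two whitespace-tokenized strings."""
    hyp = Counter(hypothesis.split())
    ref = Counter(reference.split())
    overlap = sum((hyp & ref).values())   # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge_1_f("the cat sat", "the cat sat down")
print(score)
```

In the toy pair, all three generated tokens appear in the reference (precision 1.0) but the reference has four tokens (recall 0.75), which gives an F-score of 6/7.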

### Train with coverage<a class="anchor" id="train-global-1-sub-2"></a>

In [None]:
train_max_epoch = 2
print (f'Start from step {model.sess.run(model.gen_global_step)}')
for i in range(train_max_epoch):
    print (f'Train Epoch {i}')
    train_data = data.get_next_epoch()
    model.train_one_epoch(train_data, data.n_train_batch, coverage_on = True, model_name = 'with_coverage')
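
As background for `coverage_on = True`, the sketch below computes the coverage penalty in the form used by pointer-generator networks: the coverage vector at step t is the sum of all earlier attention distributions, and the penalty at step t is the sum over source positions of the elementwise minimum of attention and coverage. Whether `SummaryModel` implements exactly this form is an assumption, and `--cov_weight` presumably scales the term; `coverage_loss` and the toy matrix are hypothetical.

```python
import numpy as np

def coverage_loss(attention):
    """Per-step coverage penalty for attention of shape [dec_steps, src_len]."""
    attention = np.asarray(attention, dtype=float)
    # Coverage before step t: cumulative attention excluding step t itself.
    coverage = np.cumsum(attention, axis=0) - attention
    return np.minimum(attention, coverage).sum(axis=1)

toy_attention = np.array([[0.9, 0.1, 0.0],
                          [0.8, 0.1, 0.1],   # attends to position 0 again
                          [0.0, 0.0, 1.0]])  # moves on to position 2
toy_loss = coverage_loss(toy_attention)
print(toy_loss)
```

The second step is penalized heavily because it repeats the first step's focus, while the third step, which attends to a mostly uncovered position, is penalized only lightly.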

#### Generate tokens<a class="anchor" id="gen-global-1-sub-2"></a>

In [None]:
test2_src, test2_ref, test2_gen, test2_scores = generate_top_k_tokens(3, False)
print_generated_tokens(test2_src, test2_ref, test2_gen, test2_scores)

#### Rouge Evaluation<a class="anchor" id="rouge-global-1-sub-2"></a>

In [None]:
test2_r1, test2_r2, test2_rl = rouge_evaluation(test2_ref, test2_gen)

## Train GAN<a class="anchor" id="train-global-2"></a>
### Pretrain Discriminator<a class="anchor" id="train-global-2-sub-1"></a>

In [None]:
train_max_epoch = 2
print (f'Start from step {model.sess.run(model.gen_global_step_2)}')
for i in range(train_max_epoch):
    print (f'Train Epoch {i}')
    train_data = data.get_next_epoch()
    model.train_one_epoch_pre_dis(train_data, data.n_train_batch, coverage_on = True)

#### Generate tokens<a class="anchor" id="gen-global-2-sub-1"></a>

In [None]:
test3_src, test3_ref, test3_gen, test3_scores = generate_top_k_tokens(3, False)
print_generated_tokens(test3_src, test3_ref, test3_gen, test3_scores)

#### Rouge Evaluation<a class="anchor" id="rouge-global-2-sub-1"></a>

In [None]:
test3_r1, test3_r2, test3_rl = rouge_evaluation(test3_ref, test3_gen)

### Adversarial Training<a class="anchor" id="train-global-2-sub-2"></a>

In [None]:
train_max_epoch = 12
print (f'Start from step {model.sess.run(model.gen_global_step_2)}')
for i in range(train_max_epoch):
    print (f'Train Epoch {i}')
    train_data = data.get_next_epoch()
    model.train_one_epoch_unsup(train_data, data.n_train_batch, coverage_on = True)

#### Generate Tokens<a class="anchor" id="gen-global-2-sub-2"></a>

In [None]:
test4_src, test4_ref, test4_gen, test4_scores = generate_top_k_tokens(3, False)
print_generated_tokens(test4_src, test4_ref, test4_gen, test4_scores)

#### Rouge Evaluation<a class="anchor" id="rouge-global-2-sub-2"></a>

In [None]:
test4_r1, test4_r2, test4_rl = rouge_evaluation(test4_ref, test4_gen)

## Analysis & Conclusion<a class="anchor" id="analysis-conclusion"></a>

## Limitations & Future Work<a class="anchor" id="limit-future"></a>