# NLP Review Scorer (Toy Version)

**Disclaimer: This is only a toy. You should take your rebuttal seriously regardless of the scores given below. Good luck with your paper submission!**

I know some of you have been wondering how to convert a paper review into a numerical score.
Yes, the time has come.

In this notebook, you can convert your review into an overall score (hopefully in the range 1~5).

I assume that you have followed the pre-steps on GitHub: https://github.com/ymcui/NLP-Review-Scorer.

## Step 1: Mount your Google Drive

In [0]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code

Enter your authorization code:
··········
Mounted at /content/drive


## Step 2: Unzip the model to Colab
Note that the model will be updated occasionally based on its prediction performance. Only the latest model is kept here.

In [0]:
!unzip -n /content/drive/My\ Drive/review_model_0711.zip -d /content/bert

Archive:  /content/drive/My Drive/review_model_0711.zip
   creating: /content/bert/model0711/
  inflating: /content/bert/model0711/vocab.txt  
  inflating: /content/bert/model0711/model.ckpt-0.meta  
  inflating: /content/bert/model0711/bert_config.json  
  inflating: /content/bert/model0711/model.ckpt-0.index  
  inflating: /content/bert/model0711/model.ckpt-0.data-00000-of-00001  
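
If the unzip step fails or the files end up somewhere unexpected, listing the archive from Python can help locate the problem. The following is a minimal sketch using the standard `zipfile` module. `list_archive` is a hypothetical helper, and the path is the one used in the cell above, which may differ once the model is updated.

```python
import os
import zipfile

def list_archive(zip_path):
    """Return the member names stored in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()

# Path taken from the unzip cell above; adjust it if the model file is renamed.
model_zip = "/content/drive/My Drive/review_model_0711.zip"

if os.path.exists(model_zip):
    for name in list_archive(model_zip):
        print(name)
else:
    print("Archive not found; check that Google Drive is mounted.")
```

The names printed should match the files that `unzip` reports as inflated.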


## Step 3: Upload dependency files (from GitHub)
Click the 'upload' button to upload:
- modeling.py
- run_classifier.py
- optimization.py
- tokenization.py
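
Before moving on, it may help to confirm that all four files actually landed where the notebook can import them. The following is a small sketch. `missing_files` is a hypothetical helper, and it assumes the files were uploaded to the notebook's current working directory.

```python
import os

REQUIRED_FILES = [
    "modeling.py",
    "run_classifier.py",
    "optimization.py",
    "tokenization.py",
]

def missing_files(names, directory="."):
    """Return the names that do not exist as regular files inside `directory`."""
    return [n for n in names if not os.path.isfile(os.path.join(directory, n))]

missing = missing_files(REQUIRED_FILES)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All dependency files are in place.")
```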

## Step 4: Input your review and RUN!

Note that it is better to remove newline characters ('\n') from your review before copying it into the `review_text` field.

Be careful not to remove the surrounding quotation marks.
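
One way to flatten a multi-line review before pasting it is sketched below. `flatten_review` is a hypothetical helper, not part of the repository, and the sample review is made up for illustration. It collapses every run of whitespace, including '\n', into a single space.

```python
import re

def flatten_review(text):
    """Collapse newlines and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()

# Made-up sample review spanning several lines.
raw_review = """The paper is well written.
The evaluation, however,
is not convincing."""

# repr() prints the flattened text as a quoted Python string literal.
print(repr(flatten_review(raw_review)))
```

Because `repr` escapes any quotes inside the text and adds the surrounding ones, its output should be safe to paste into the `review_text` field as-is.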

In [0]:
# -*- coding: utf-8 -*-
"""run_squad_on_colab.ipynb

Automatically generated by Colaboratory.
"""

import datetime
import json
import os
import pprint
import random
import string
import sys
import tensorflow as tf

'''
assert 'COLAB_TPU_ADDR' in os.environ, 'ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!'
TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']
print('TPU address is', TPU_ADDRESS)

from google.colab import auth
auth.authenticate_user()
with tf.Session(TPU_ADDRESS) as session:
  print('TPU devices:')
  pprint.pprint(session.list_devices())

  # Upload credentials to TPU.
  with open('/content/adc.json', 'r') as f:
    auth_info = json.load(f)
  tf.contrib.cloud.configure_gcs(session, credentials=auth_info)
  # Now credentials are set for all future sessions on this TPU.
'''


"""### Prepare and import BERT modules
With your environment configured, you can now import the BERT modules. This cell imports them from the files uploaded in Step 3. Alternatively, you can install BERT using pip (!pip install bert-tensorflow).
"""

# import python modules defined by BERT
import collections
import modeling
import optimization
import tokenization
from run_classifier import (
    ReviewProcessor,
    PaddingInputExample,
    file_based_convert_examples_to_features,
    file_based_input_fn_builder,
    model_fn_builder,
)
import numpy as np

review_text = "The paper was rather bad that I don't want to see it again. The idea was trivial and the evaluations are not convincing to me at all. We should reject this paper or I won't review for this venue in the future," #@param {type:"raw"}
review_sample= ["emnlp2019","0","0",review_text]

vocab_file='/content/bert/model0711/vocab.txt'
bert_config_file='/content/bert/model0711/bert_config.json'
init_checkpoint='/content/bert/model0711/model.ckpt-0'

do_train=False
do_predict=True #@param ["False", "True"] {type:"raw"}
train_batch_size=32
predict_batch_size=8
eval_batch_size=8
max_seq_length=512
save_checkpoints_steps=5000
do_lower_case=False
use_tpu=False  # Keep False: the TPU setup block above is disabled, so TPU_ADDRESS is never defined.
warmup_proportion=0.1
learning_rate=1e-5
num_train_epochs=1

def main():
  output_dir = '/content/result'
  tf.gfile.MakeDirs(output_dir)
  print('***** Model output directory: {} *****'.format(output_dir))

  ########################################################################

  tf.logging.set_verbosity(tf.logging.INFO)

  bert_config = modeling.BertConfig.from_json_file(bert_config_file)
  tokenizer = tokenization.FullTokenizer(vocab_file=vocab_file, do_lower_case=do_lower_case)

  #validate_flags_or_throw(bert_config)
  
  processor = ReviewProcessor()

  tpu_cluster_resolver = None
  if use_tpu:
    tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)
  is_per_host = tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2
  run_config = tf.contrib.tpu.RunConfig(
      cluster=tpu_cluster_resolver,
      model_dir=output_dir,
      save_checkpoints_steps=save_checkpoints_steps,
      keep_checkpoint_max=2,
      tpu_config=tf.contrib.tpu.TPUConfig(
          iterations_per_loop=1000,
          num_shards=8,
          per_host_input_for_training=is_per_host))


  train_examples = None
  num_train_steps = None
  num_warmup_steps = None

  model_fn = model_fn_builder(
      bert_config=bert_config,
      init_checkpoint=init_checkpoint,
      learning_rate=learning_rate,
      num_train_steps=num_train_steps,
      num_warmup_steps=num_warmup_steps,
      use_tpu=use_tpu,
      use_one_hot_embeddings=use_tpu)

  # If TPU is not available, this will fall back to normal Estimator on CPU
  # or GPU.
  estimator = tf.contrib.tpu.TPUEstimator(
      use_tpu=use_tpu,
      model_fn=model_fn,
      config=run_config,
      train_batch_size=train_batch_size,
      eval_batch_size=eval_batch_size,
      predict_batch_size=predict_batch_size)

  if do_predict:
    predict_examples = processor.get_single_examples([review_sample])
    num_actual_predict_examples = len(predict_examples)
    if use_tpu:
      # TPU requires a fixed batch size for all batches, therefore the number
      # of examples must be a multiple of the batch size, or else examples
      # will get dropped. So we pad with fake examples which are ignored
      # later on.
      while len(predict_examples) % predict_batch_size != 0:
        predict_examples.append(PaddingInputExample())

    predict_file = os.path.join(output_dir, "predict.tf_record")
    file_based_convert_examples_to_features(predict_examples,
                                            max_seq_length, tokenizer,
                                            predict_file)

    tf.logging.info("***** Running prediction*****")
    tf.logging.info("  Num examples = %d (%d actual, %d padding)",
                    len(predict_examples), num_actual_predict_examples,
                    len(predict_examples) - num_actual_predict_examples)
    tf.logging.info("  Batch size = %d", predict_batch_size)

    predict_drop_remainder = True if use_tpu else False
    predict_input_fn = file_based_input_fn_builder(
        input_file=predict_file,
        seq_length=max_seq_length,
        is_training=False,
        drop_remainder=predict_drop_remainder)

    result = estimator.predict(input_fn=predict_input_fn)
    tf.logging.info(result)

    output_predict_file = os.path.join("./test_results.tsv")
    with tf.gfile.GFile(output_predict_file, "w") as writer:
      num_written_lines = 0
      tf.logging.info("***** Predict results *****")
      writer.write("paper\trecommendation\tconfidence\n")
      for (i, prediction) in enumerate(result):
        if i >= num_actual_predict_examples:
          break            
        probabilities = prediction["probabilities"]
        output_line = "\t".join(
            str(class_probability)
            for class_probability in probabilities) + "\n"
        output_line = predict_examples[i].guid + "\t" + output_line
        writer.write(output_line)
        num_written_lines += 1
    tf.logging.info("***********REVIEW**************")
    tf.logging.info(review_text)
    tf.logging.info("***********SCORE***************")
    tf.logging.info("paper\trecommendation\tconfidence")
    tf.logging.info(output_line)
    tf.logging.info("********************************")
    assert num_written_lines == num_actual_predict_examples


if __name__ == '__main__':
  main()


W0711 07:24:57.391819 139746709817216 estimator.py:1984] Estimator's model_fn (<function model_fn at 0x7f18fb8b11b8>) includes params argument, but params are not passed to Estimator.


***** Model output directory: /content/result *****


I0711 07:24:57.394778 139746709817216 estimator.py:209] Using config: {'_save_checkpoints_secs': None, '_num_ps_replicas': 0, '_keep_checkpoint_max': 2, '_task_type': 'worker', '_global_id_in_cluster': 0, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f18fa6d1810>, '_model_dir': '/content/result', '_protocol': None, '_save_checkpoints_steps': 5000, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_tpu_config': TPUConfig(iterations_per_loop=1000, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None, eval_training_input_configuration=2), '_tf_random_seed': None, '_save_summary_steps': 100, '_device_fn': None, '_cluster': None, '_experimental_distribute': None, '_num_worker_replicas': 1, '_task_id': 0, '_log

## Step 5: Check your score at the end of the log

For example:
```
**************REVIEW***********
this is a very good paper, outstanding paper, brilliant paper. I have never seen such a good paper before. It was well-written and the models are novel. The evaluations are sound and the results achieve state-of-the-art performance. It should be definitely accepted or I will be angry.
**************SCORE***********
paper	recommendation	confidence
emnlp2019	3.4766932	3.4420846
********************************
```


```
**************REVIEW***********
The paper was rather bad that I don't want to see it again. The idea was trivial and the evaluations are not convincing to me at all. We should reject this paper or I won't review for this venue in the future,
**************SCORE***********
paper	recommendation	confidence
emnlp2019	2.011398	3.8701794
********************************
```
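
To read the scores programmatically instead of scanning the log, the result line can be split on tabs. The following is a minimal sketch that assumes the three-column layout shown above (`paper`, `recommendation`, `confidence`). `parse_score_line` is a hypothetical helper, and the 1 to 5 check reflects only the range this notebook hopes for, not a guarantee of the model.

```python
def parse_score_line(line):
    """Split one 'paper<TAB>recommendation<TAB>confidence' line into typed fields."""
    paper, recommendation, confidence = line.rstrip("\n").split("\t")
    return paper, float(recommendation), float(confidence)

# Score line copied from the second example above.
paper, recommendation, confidence = parse_score_line("emnlp2019\t2.011398\t3.8701794\n")

# Flag scores that fall outside the hoped-for 1~5 range.
in_range = 1.0 <= recommendation <= 5.0
print(paper, round(recommendation, 2), round(confidence, 2), in_range)
```

The same function could be applied to each non-header line of `test_results.tsv`, the file the prediction cell writes.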