# Experiment 1: Training with sentence-wise embeddings

We explore training all the deep models in the two-stage AES flow on sentence-wise embeddings: the word embeddings of each sentence are averaged into a single 768-dimensional vector, and the sequence of these sentence vectors represents the essay. This yields a tensor of dimension (N x max_sentences x 768), where N is the number of essays and max_sentences was found to be 85.
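
As a rough illustration of the tensor shape described above, the sketch below averages token vectors per sentence and stacks the results per essay. Random arrays stand in for BERT token embeddings, the helper name `essay_tensor` and the toy sizes are assumptions made for this example, and zero rows are used as padding purely for simplicity.

```python
import numpy as np

def essay_tensor(essays_tokens, max_sentences, dim=768):
    """Stack per-sentence mean embeddings into an (N, max_sentences, dim) array.

    essays_tokens: list of essays; each essay is a list of (num_tokens, dim)
    arrays, one array per sentence (hypothetical stand-ins for BERT outputs).
    """
    out = np.zeros((len(essays_tokens), max_sentences, dim), dtype=np.float32)
    for j, sentences in enumerate(essays_tokens):
        # Sentences beyond max_sentences are dropped; missing ones stay zero
        for i, token_embs in enumerate(sentences[:max_sentences]):
            out[j, i] = token_embs.mean(axis=0)  # average over the tokens
    return out

rng = np.random.default_rng(0)
toy_essays = [
    [rng.normal(size=(5, 768)), rng.normal(size=(9, 768))],  # essay with 2 sentences
    [rng.normal(size=(3, 768))],                             # essay with 1 sentence
]
tensor = essay_tensor(toy_essays, max_sentences=85)
print(tensor.shape)  # (2, 85, 768)
```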

## Imports

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# import important libraries and download data
import os
import math
import pandas as pd
import numpy as np
import nltk
import re
from nltk.corpus import stopwords
from gensim.models import Word2Vec
from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold
from sklearn.metrics import cohen_kappa_score
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import random
import multiprocessing
import tensorflow as tf
%matplotlib notebook
import pickle
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from torch.autograd import Variable
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
import xgboost as xgb
from tqdm import tqdm
import string
! pip install tqdm boto3 requests regex sentencepiece sacremoses
! git clone https://github.com/Gaurav-Pande/AES_DL.git && mv AES_DL/data .
! pip install transformers
! pip install xgboost
! pip install language-tool-python 

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.
Collecting boto3
  Downloading boto3-1.21.46-py3-none-any.whl (132 kB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.49-py3-none-any.whl (895 kB)
Collecting jmespath<2.0.0,>=0.7.1
  Downloading jmespath-1.0.0-py3-none-any.whl (23 kB)
Collecting botocore<1.25.0,>=1.24.46
  Downloading botocore-1.24.46-py3-none-any.whl (8.7 MB)

Cloning into 'AES_DL'...
remote: Enumerating objects: 59, done.
remote: Counting objects: 100% (59/59), done.
remote: Compressing objects: 100% (44/44), done.
remote: Total 59 (delta 25), reused 28 (delta 8), pack-reused 0
Unpacking objects: 100% (59/59), done.
Collecting transformers
  Downloading transformers-4.18.0-py3-none-any.whl (4.0 MB)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
  Downloading tokenizers-0.12.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.6 MB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.5.1-py3-none-any.whl (77 kB)
Ins

## RUN Configuration

In [None]:
# Load respective embeddings from file
load_bert_sem = True
load_bert_coh = True
load_bert_prel = False

In [None]:
# Flags to load respective models from file, no training of LSTM models
load_trained_model_sem  = True
load_trained_model_coh = True
load_trained_model_prel = True

In [None]:
# Path for all files
#model_path = '/content/drive/MyDrive/Colab Notebooks/AES/full_embeddings'

In [None]:
# Embedding Type
#embedding = 'full_emb'
embedding = "sen_avg"
#embedding = "para_avg"

# Max Words (for full embedding)
max_words_for_full_emb = 200
max_words_for_full_emb_sem = 300

# Max # Sentences (for sentence average embeddings)
# Same value as the one set alongside the LSTM model definition further down
max_sentences = 128

## Function Definitions

In [None]:
import language_tool_python
tool = language_tool_python.LanguageTool('en-US')
def check_sp_n_grammar (text):
  matches = tool.check(text)
  num_sp_err = 0
  num_gram_err = 0
  num_other_err = 0
  # print ("Spell n Grammar checker: Number of errors detected: ",len(matches))
  # Tally the matches by issue type
  for match in matches:
    if (match.ruleIssueType == "misspelling"):
      num_sp_err += 1
    elif (match.ruleIssueType == "grammar"):
      num_gram_err += 1
    else:
      num_other_err += 1
  #if (matches[i].ruleId == '')
  return (matches, num_sp_err, num_gram_err, num_other_err)

Downloading LanguageTool 5.6: 100%|██████████| 220M/220M [00:03<00:00, 63.0MB/s]
Unzipping /tmp/tmpx5ihe5hq.zip to /root/.cache/language_tool_python.
Downloaded https://www.languagetool.org/download/LanguageTool-5.6.zip to /root/.cache/language_tool_python.
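
The tallying logic in `check_sp_n_grammar` can be tried out without downloading LanguageTool. The following is a minimal sketch that counts issue types with `collections.Counter`; the `tally_issue_types` helper and the `SimpleNamespace` stand-ins for match objects are assumptions made for illustration, not part of `language_tool_python`.

```python
from collections import Counter
from types import SimpleNamespace

def tally_issue_types(matches):
    """Return (spelling, grammar, other) counts from objects with a ruleIssueType attribute."""
    counts = Counter(m.ruleIssueType for m in matches)
    spelling = counts.get("misspelling", 0)
    grammar = counts.get("grammar", 0)
    other = sum(counts.values()) - spelling - grammar
    return spelling, grammar, other

# Hypothetical stand-ins for LanguageTool match objects
fake_matches = [
    SimpleNamespace(ruleIssueType=t)
    for t in ["misspelling", "grammar", "misspelling", "typographical"]
]
result = tally_issue_types(fake_matches)
print(result)  # (2, 1, 1)
```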


In [None]:
# Augment dataframe with handcrafted features - number of spelling errors, grammar errors and other errors - and also add the 3 model scores (semantic, coherence, prompt relevance)
def augment_handcrafted_features (df, prelEval=False):
  from transformers import BertModel, BertConfig, BertTokenizer
  sp_errors = len(df)*[0]
  gr_errors = len(df)*[0]
  oth_errors = len(df)*[0]
  semantic_score = len(df)*[0.0]
  coherence_score = len(df)*[0.0]
  prel_score = len(df)*[0.0]
  tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
  config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
  model = BertModel.from_pretrained('bert-base-uncased', config=config)
  i = 0
  for index, row in df.iterrows():
    input_tensor = prepare_input_data_sen_avg(model=model, tokenizer=tokenizer, text=row.essay).detach().numpy()
    #coherence_score = lstm_model_coh.predict(input_tensor)[0][0]
    #semantic_score = lstm_model_sem.predict(input_tensor)[0][0]
    #df['semantic_score'][index] = semantic_score
    #df['coherence_score'][index] = coherence_score
    semantic_score[i] = lstm_model_sem.predict(input_tensor)[0][0]
    coherence_score[i] = lstm_model_coh.predict(input_tensor)[0][0]
    if (prelEval ==True):
      del input_tensor
      input_tensor = prepare_input_data_sen_avg(model=model, tokenizer=tokenizer, text=row.prompt+row.essay).detach().numpy()
      #prel_score = lstm_model_prel.predict(input_tensor)[0][0]
      #df['prel_score'][index] = prel_score
      prel_score[i] = lstm_model_prel.predict(input_tensor)[0][0]
    del input_tensor

    _,sp, gr, oth = check_sp_n_grammar(row.essay)
    sp_errors[i] = sp
    gr_errors[i] = gr
    oth_errors[i] = oth

    if (i%100==0):
      print('Iter: ',i)
      if (prelEval ==True):
        # The prompt column is only needed when prompt relevance is evaluated
        print ('Combined Essay: ', row.prompt + row.essay)
        print ('Prel Score: ', prel_score[i])
      print ('Norm Score: ', row.normalized_score)
    i += 1


  df['spell_err'] = sp_errors
  df['gram_err'] = gr_errors
  df['oth_err'] = oth_errors
  df['semantic_score'] = semantic_score
  df['coherence_score'] = coherence_score
  df['prel_score'] = prel_score
  return (df)
  #for essay in df['essay']:
  #  _,sp, gr, oth = check_sp_n_grammar(essay)
  #  sp_errors[i] = sp
  #  gr_errors[i] = gr
  #  oth_errors[i] = oth
  #  if (i%100==0):
  #    print('Iter: ',i)
  #  i += 1

In [None]:
# Prepare Input data (1 essay) for prediction
def prepare_input_data(text, max_len=200):
  # Note: relies on globally defined BERT `tokenizer` and `model` objects
  # Truncate to max_len so the padding length below can never be negative
  tokenized_text = tokenizer.encode(text, add_special_tokens=True, max_length=max_len, truncation=True)
  # Right-pad with 0 (the [PAD] token id of bert-base-uncased) up to max_len
  padded_text = np.array( [tokenized_text + [0]*(max_len-len(tokenized_text))])
  # Mask is 1 for real tokens and 0 for padding
  attention_mask_test = np.where(padded_text != 0, 1, 0)
  input_ids = torch.tensor(padded_text)
  attention_mask = torch.tensor(attention_mask_test)
  # Pass the mask so that padding positions are ignored by self-attention
  outputs = model(input_ids, attention_mask=attention_mask)
  # Last hidden state, shape (1, max_len, 768)
  return(outputs[0])


In [None]:
def prepare_input_data_sen_avg(model, tokenizer, text, max_len=200):
  # Builds a (1, max_sentences, 768) tensor for one essay.
  # Uses the global max_sentences; max_len is currently unused.
  lhs = torch.empty(1,max_sentences,768, dtype=torch.float)
  # Embedding of the empty string, used below as the padding vector
  emb_for_padding = tokenizer.encode_plus("", add_special_tokens=True, truncation=True, padding="max_length", return_tensors="pt", max_length=10)
  tt = emb_for_padding['input_ids']
  output = model(tt)
  # Sum of the last 4 hidden layers, then mean over the token axis
  lhs_for_padding = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
  lhs_for_padding_np = np.array(lhs_for_padding.detach().numpy())
  lhs_for_padding_mean = np.mean(lhs_for_padding_np,axis=1)
  lhs_avg_for_padding = torch.tensor(lhs_for_padding_mean[0])
  sentences = re.split(r'\. |\? |! ', text)
  sen_length = len(sentences)
  lhs_sentence_avg = np.zeros((max_sentences,768), dtype=float)
  for i,s in enumerate(sentences):
    if (i>=max_sentences):
      break
    tokenize_sentence = tokenizer.encode(s,add_special_tokens=True, max_length=512, truncation=True)
    tt = torch.tensor(tokenize_sentence)
    tts = tt.reshape(1,len(tt))
    output = model(tts)
    lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
    lhs_sentence_np = np.array(lhs_sentence.detach().numpy())
    lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
    lhs_sentence_avg[i] = lhs_sentence_np_mean[0]

  lhs[0] = torch.tensor(lhs_sentence_avg)

  # Pad the remaining rows with the empty-string embedding
  if (sen_length < max_sentences):
    for i in range (sen_length, max_sentences):
      lhs[0][i]= lhs_avg_for_padding

  #print ("Size of lhs_for_padding: ", lhs_for_padding.shape)
  #print ("Size of lhs_avg_for_padding: ", lhs_avg_for_padding.shape)
  #print ("Size of lhs being returned: ", lhs.shape)
  return (lhs)


In [None]:
# Normalizing the domain1_score
def normalize_value(score, min_value, max_value):
  result =  tf.compat.v1.div(float(tf.subtract(score, min_value)), float(tf.subtract(max_value, min_value)))
  return result
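
The normalization above is ordinary min-max scaling, so it can also be written with plain Python arithmetic. The sketch below is one possible equivalent, not the notebook's own implementation; the names `normalize_score` and `denormalize_score` are assumptions, and the inverse mapping is included only to show how a prediction in [0, 1] could be mapped back to the score range of an essay set.

```python
def normalize_score(score, min_value, max_value):
    """Min-max scale a raw score into the range [0, 1]."""
    return (score - min_value) / (max_value - min_value)

def denormalize_score(value, min_value, max_value):
    """Map a value in [0, 1] back to the nearest integer score."""
    return round(value * (max_value - min_value) + min_value)

# Essay set 1 uses scores from 2 to 12
normalized = normalize_score(8, 2, 12)
print(normalized)                            # 0.6
print(denormalize_score(normalized, 2, 12))  # 8
```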

In [None]:
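# Taking care of NEC: strip anonymised entity placeholders such as @CAPS1 or @LOCATION1, strip URLs, and collapse repeated spaces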
def clean_nec(essay):
    essay = re.sub(r"@[A-Za-z0-9]+", ' ', essay)
    essay = re.sub(r"https?://[A-Za-z0-9./]+", ' ', essay)
    #essay = re.sub(r"[^a-zA-Z.!?']", ' ', essay)
    essay = re.sub(r" +", ' ', essay)
    return essay
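
A quick way to see what `clean_nec` does is to run it on a short string containing the anonymisation placeholders used in the dataset. The function is repeated here only so that the example is self-contained; the sample sentence is modelled on the essay openings shown in the dataframe preview.

```python
import re

def clean_nec(essay):
    # Same steps as the cell above: placeholders, URLs, repeated spaces
    essay = re.sub(r"@[A-Za-z0-9]+", ' ', essay)
    essay = re.sub(r"https?://[A-Za-z0-9./]+", ' ', essay)
    essay = re.sub(r" +", ' ', essay)
    return essay

sample = "Dear @CAPS1 @CAPS2, I believe that using computers"
cleaned = clean_nec(sample)
print(cleaned)  # Dear , I believe that using computers
```

Note that the punctuation surrounding a placeholder is left in place, which is why the cleaned text starts with `Dear ,`.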

In [None]:
# augmenting for coherence model
def coherence_augment(essay):
  x = re.split(r'\. |\? |! ', essay)
  random.shuffle(x)
  return '. '.join(x)
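
The augmentation shuffles sentence order while keeping the sentences themselves, presumably to create less coherent variants of an essay for the coherence model. The sketch below repeats the function with a fixed random seed so that the run is reproducible; the seed and the sample text are assumptions made for this example.

```python
import random
import re

def coherence_augment(essay):
    # Split on sentence-ending punctuation followed by a space, then shuffle
    x = re.split(r'\. |\? |! ', essay)
    random.shuffle(x)
    return '. '.join(x)

random.seed(0)  # fixed seed only to make the example reproducible
essay = "I woke up early. Then I had breakfast. After that I went to school."
shuffled = coherence_augment(essay)
print(shuffled)

# The same sentences are present before and after; only the order may differ
original_parts = sorted(re.split(r'\. |\? |! ', essay))
shuffled_parts = sorted(re.split(r'\. |\? |! ', shuffled))
```

Because the split consumes `? ` and `! ` as well, rejoining with `'. '` turns every internal sentence boundary into a period.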



In [None]:
alphabets= "([A-Za-z])"
prefixes = "(Mr|St|Mrs|Ms|Dr)[.]"
suffixes = "(Inc|Ltd|Jr|Sr|Co)"
starters = r"(Mr|Mrs|Ms|Dr|He\s|She\s|It\s|They\s|Their\s|Our\s|We\s|But\s|However\s|That\s|This\s|Wherever)"
acronyms = "([A-Z][.][A-Z][.](?:[A-Z][.])?)"
websites = "[.](com|net|org|io|gov)"

def split_into_sentences(text):
    text = " " + text + "  "
    text = text.replace("\n"," ")
    text = re.sub(prefixes,"\\1<prd>",text)
    text = re.sub(websites,"<prd>\\1",text)
    if "Ph.D" in text: text = text.replace("Ph.D.","Ph<prd>D<prd>")
    text = re.sub(r"\s" + alphabets + "[.] "," \\1<prd> ",text)
    text = re.sub(acronyms+" "+starters,"\\1<stop> \\2",text)
    text = re.sub(alphabets + "[.]" + alphabets + "[.]" + alphabets + "[.]","\\1<prd>\\2<prd>\\3<prd>",text)
    text = re.sub(alphabets + "[.]" + alphabets + "[.]","\\1<prd>\\2<prd>",text)
    text = re.sub(" "+suffixes+"[.] "+starters," \\1<stop> \\2",text)
    text = re.sub(" "+suffixes+"[.]"," \\1<prd>",text)
    text = re.sub(" " + alphabets + "[.]"," \\1<prd>",text)
    if "”" in text: text = text.replace(".”","”.")
    if "\"" in text: text = text.replace(".\"","\".")
    if "!" in text: text = text.replace("!\"","\"!")
    if "?" in text: text = text.replace("?\"","\"?")
    text = text.replace(".",".<stop>")
    text = text.replace("?","?<stop>")
    text = text.replace("!","!<stop>")
    text = text.replace("<prd>",".")
    sentences = text.split("<stop>")
    sentences = sentences[:-1]
    sentences = [s.strip() for s in sentences]
    return sentences
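
For comparison, other cells in this notebook split sentences with the plain pattern `'\. |\? |! '`. The short example below shows how that simpler split behaves on an abbreviation, which is the kind of case `split_into_sentences` is written to handle; the sample text is an assumption made for illustration.

```python
import re

text = "Dr. Smith arrived. He sat down! Why? Nobody knows."
naive = re.split(r'\. |\? |! ', text)
print(naive)
# ['Dr', 'Smith arrived', 'He sat down', 'Why', 'Nobody knows.']
```

The abbreviation `Dr.` is treated as a sentence boundary, the delimiters are dropped, and the final sentence keeps its period because no space follows it.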

In [None]:
#Loading the dataset
dataset_path = "./data/training_set_rel3.tsv"
data = pd.read_csv(dataset_path, sep="\t", encoding="ISO-8859-1")
min_scores = [2, 1, 0, 0, 0, 0, 0, 0]
max_scores = [12, 6, 3, 3, 4, 4, 30, 60]
data.dropna(axis=1, inplace=True)
data.drop(columns=["rater1_domain1", "rater2_domain1"], inplace=True)
data['normalized_score'] = data.apply(lambda x: float(normalize_value(x['domain1_score'], min_scores[x['essay_set']-1], max_scores[x['essay_set']-1])), axis=1)
data.head()


From /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:1082: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.


Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score
0,1,1,"Dear local newspaper, I think effects computer...",8,0.6
1,2,1,"Dear @CAPS1 @CAPS2, I believe that using compu...",9,0.7
2,3,1,"Dear, @CAPS1 @CAPS2 @CAPS3 More and more peopl...",7,0.5
3,4,1,"Dear Local Newspaper, @CAPS1 I have found that...",10,0.8
4,5,1,"Dear @LOCATION1, I know having computers has a...",8,0.6


In [None]:
data['essay'] = data['essay'].apply(lambda x: clean_nec(x))

In [None]:
data.iloc[68]['essay']

"Some people think it is a good idea and same do not. My opinion is that, I think that people spend a lot of time for good reasons. Here are three reasons why, . grownups working, . students learning how to type, and . communicating with others. My first reason is that parents do a lot of work on computer. For example, they do taxes, paperwork, airline tickets, and the bank. And those are usually all done on the computer so it would be easier if people don't drive. My second reason is students need to learn how to type so they can email or even write on paper. It helps them build learning ability and also, so they can know how to sing the 's. Computers are suppose to be fun for people of any age. My last reason is, communicating with others is a great skill to have so you can talk in person. Computers help because if you mess up of what your trying to say then you could just erase what your trying to say. And in person, you can't. Also you can make plans with one of your friend on some

In [None]:
data.head()

Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score
0,1,1,"Dear local newspaper, I think effects computer...",8,0.6
1,2,1,"Dear , I believe that using computers will ben...",9,0.7
2,3,1,"Dear, More and more people use computers, but ...",7,0.5
3,4,1,"Dear Local Newspaper, I have found that many e...",10,0.8
4,5,1,"Dear , I know having computers has a positive ...",8,0.6


In [None]:
# LSTM Model
from keras.layers import Embedding, Input, LSTM, Dense, Dropout, Lambda, Flatten, Bidirectional, Conv2D, Conv1D, MaxPooling1D, GlobalMaxPooling1D
from keras.models import Sequential,Model, load_model, model_from_config
import keras.backend as K
max_sentences = 128

def get_model(Hidden_dim1=400, Hidden_dim2=128, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, recurrent_dropout=0.4, 
              sen_size=max_sentences, input_size=768, activation='sigmoid', opt_engine='rmsprop', loss_fn='mean_squared_error'):
    """Define the model."""
    model = Sequential()
    model.add(LSTM(Hidden_dim1, dropout=dropout_lstm, recurrent_dropout=recurrent_dropout, input_shape=(sen_size,input_size), return_sequences=return_sequences))
    model.add(LSTM(Hidden_dim2, recurrent_dropout=recurrent_dropout))
    model.add(Dropout(dropout_dense))
    model.add(Dense(1, activation=activation))

    model.compile(loss=loss_fn, optimizer=opt_engine, metrics=['mae'])
    model.summary()
    return model
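
To sanity-check the output of `model.summary()` for the default arguments, the parameter counts can be worked out by hand. The sketch below assumes the standard Keras LSTM parameterisation (four gates, each with input weights, recurrent weights and a bias); the helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with (input_dim + units) weights per unit plus one bias
    return 4 * ((input_dim + units + 1) * units)

def dense_params(input_dim, units):
    # One weight per input plus one bias, per output unit
    return (input_dim + 1) * units

layer1 = lstm_params(768, 400)   # first LSTM: 768-d sentence vectors -> 400 units
layer2 = lstm_params(400, 128)   # second LSTM: 400 -> 128 units
head = dense_params(128, 1)      # final sigmoid unit
total = layer1 + layer2 + head
print(layer1, layer2, head, total)  # 1870400 270848 129 2141377
```

The dropout layer adds no parameters, so under these assumptions the network has a little over 2.1 million trainable weights.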

In [None]:
from keras.layers import Embedding, Input, LSTM, Dense, Dropout, Lambda, Flatten, Bidirectional, Conv2D, Conv1D, MaxPooling1D, GlobalMaxPooling1D
from keras.models import Sequential,Model, load_model, model_from_config
import keras.backend as K
max_sentences = 128

def get_model_CNN(output_dims=10380):
    """Define the CNN model (experimental; restores the draft 1-D convolutional design)."""
    inputs = Input(shape=(768,1))
    x = Conv1D(64, 3, strides=1, padding='same', activation='relu')(inputs)
    # Cuts the size of the output in half, maxing over every 2 inputs
    x = MaxPooling1D(pool_size=2)(x)
    x = Conv1D(128, 3, strides=1, padding='same', activation='relu')(x)
    x = GlobalMaxPooling1D()(x)
    outputs = Dense(output_dims, activation='relu')(x)
    model = Model(inputs=inputs, outputs=outputs, name='CNN')
    model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae','mse'])
    model.summary()

    return model

In [None]:
tpd_train = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/tpd_train.csv')
tpd_xgb = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/tpd_xgb.csv')
tpd_test = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/tpd_test.csv')

In [None]:
average_essay_lens = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/average_essay_set_lengths.csv')

In [None]:
average_essay_lens

Unnamed: 0,essay_set,essay_len
0,1,350
1,2,369
2,3,104
3,4,91
4,5,118
5,6,150
6,7,156
7,8,571


In [None]:
print("train: %d, xgb: %d, test: %d"%(len(tpd_train), len(tpd_xgb), len(tpd_test)))

train: 6488, xgb: 3892, test: 2596
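
The three partitions correspond to roughly a 50/30/20 split of the essays, which the short calculation below confirms from the counts printed above.

```python
sizes = {"train": 6488, "xgb": 3892, "test": 2596}
total = sum(sizes.values())
fractions = {name: round(count / total, 2) for name, count in sizes.items()}
print(total)      # 12976
print(fractions)  # {'train': 0.5, 'xgb': 0.3, 'test': 0.2}
```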


In [None]:
def prepare_embeddings (df, model_type='semantic', train_or_test='test', load_from_file=True, 
                        file_path='/content/drive/MyDrive/Colab Notebooks/AES/experiment_XX' ,
                        max_sentences=128):
  # Arguments Description:
  # ----------------------
  # 
  #   df:             Dataframe containing the essays and scores 
  #
  #   model_type:     Supports 3 model types: 'semantic', 'coherence' and 'prel' (Prompt Relevance)
  #
  #   train_or_test:  Whether these are "training" vectors (essays) or "test" vectors
  #
  #   load_from_file: Boolean flag whether to load the embeddings from previously stored file or generate 
  #                   & save afresh
  #
  #   file_path:      Base directory where models and embeddings will be saved
  #
  #   max_sentences:  Relevant for sentence average (sen_avg) embedding type: maximum no of sentences 
  #                   permissible in an essay
  #
  from transformers import BertModel, BertConfig, BertTokenizer

  print ("Preparing Embeddings...")
  print ("Model Type: ", model_type)
  print ("Train or Test: ", train_or_test)
  if (not df.empty):
    print ("Dataframe provided, Size: ", df.shape)

  if (model_type =='semantic'):
    if (train_or_test=='train'):
      lhs_path = file_path + '/lhs_train.pt'
      y_path = file_path + '/y_train.pt'
    else:
      if (train_or_test == 'test'):
        lhs_path = file_path + '/lhs_test.pt'
        y_path = file_path + '/y_test.pt'
      else:
        print ("Invalid choice for train_or_test. Returning NONE")
        return
  else:
    if (model_type =='coherence'):
      if (train_or_test=='train'):
        lhs_path = file_path + '/lhs_coherence_train.pt'
        y_path = file_path + '/y_train_coh.pt'
      else:
        if (train_or_test == 'test'):
          lhs_path = file_path + '/lhs_coherence_test.pt'
          y_path = file_path + '/y_test_coh.pt'
        else:
          print ("Invalid choice for train_or_test. Returning NONE")
          return
    else:
      if (model_type in ('prel', 'p_rel')):
        if (train_or_test=='train'):
          lhs_path = file_path + '/lhs_prel_train.pt'
          y_path = file_path + '/y_train_prel.pt'
        else:
          if (train_or_test == 'test'):
            lhs_path = file_path + '/lhs_prel_test.pt'
            y_path = file_path + '/y_test_prel.pt'
          else:
            print ("Invalid choice for train_or_test. Returning NONE")
            return
      else:
        print ("Please choose a valid model - one of SEMANTIC, COHERENCE or PREL")
        return

  if (load_from_file == True):
    print ("Loading existing embeddings from file...")
    print ("LHS File chosen: ", lhs_path)
    print ("Y File chosen: ", y_path)
    lhs = torch.load(lhs_path)
    y_gold = torch.load(y_path)
    print ("Loaded, Size of LHS embeddings: ", lhs.shape)
    print ("Loaded, Size of y Gold: ", y_gold.shape)
  else:
    print ("Generating embeddings from scratch & extracting the Y_Gold from dataframe...\n")
    
    if (df.empty):
      print ("Null dataframe, please provide a valid dataframe")
      return

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
    model = BertModel.from_pretrained('bert-base-uncased', config=config)

    essays = df['essay']
    y_gold = df['normalized_score']
    sentences = []
    tokenize_sentences = []
  
    cuda = torch.device('cuda')

    # Embeddings for training vectors
    lhs = torch.empty((len(essays),max_sentences,768), dtype=torch.float)
    emb_for_padding = tokenizer.encode_plus("", add_special_tokens=True, truncation=True, padding="max_length", return_tensors="pt", max_length=10)
    tt = torch.tensor(emb_for_padding['input_ids'])
    output = model(tt)
    #lhs_for_padding = output.hidden_states[11]
    lhs_for_padding = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
    lhs_for_padding_np = np.array(lhs_for_padding.detach().numpy())
    lhs_for_padding_mean = np.mean(lhs_for_padding_np,axis=1)
    lhs_avg_for_padding = torch.tensor(lhs_for_padding_mean[0])
    
    for j,essay in enumerate(essays):
      if (j%200 ==0):
        print ("Iteration: ", j)

      sentences = re.split(r'\. |\? |! ', essay)
      sen_length = len(sentences)
      lhs_sentence_avg = np.zeros((max_sentences,768), dtype=float)

      #for i in range(min(85,len(sentences))):
      for i,s in enumerate(sentences):
        if (i>=max_sentences):
          break
        tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
        tt = torch.tensor(tokenize_sentence)
        tts = tt.reshape(1,len(tt))
        output = model(tts)
        # summing the last 4 hidden layers (alternative: 2nd last layer only)
        #lhs_sentence = output.hidden_states[11]
        lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
        lhs_sentence_np = np.array(lhs_sentence.detach().numpy())
        lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
        lhs_sentence_avg[i] = lhs_sentence_np_mean[0]
  
      lhs[j] = torch.tensor(lhs_sentence_avg)

      if (sen_length < max_sentences):
        for i in range (sen_length, max_sentences):
          lhs[j][i]= lhs_avg_for_padding
  
    torch.save(lhs, lhs_path)
    torch.save(y_gold, y_path)

    print ("Saved LHS & Y_gold...") 
    
  print ("Returning lhs: Shape: ", lhs.shape)
  print ("Returning y_gold: Shape: ", y_gold.shape)
  return lhs, y_gold
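
The nested `if`/`else` chain that picks the file names can also be expressed as a lookup table. The following is a minimal sketch of that alternative, not the notebook's implementation; `EMBEDDING_FILES` and `embedding_paths` are hypothetical names, while the file names themselves are copied from the function above.

```python
# (model_type, train_or_test) -> (embedding file, gold-score file)
EMBEDDING_FILES = {
    ('semantic', 'train'): ('lhs_train.pt', 'y_train.pt'),
    ('semantic', 'test'): ('lhs_test.pt', 'y_test.pt'),
    ('coherence', 'train'): ('lhs_coherence_train.pt', 'y_train_coh.pt'),
    ('coherence', 'test'): ('lhs_coherence_test.pt', 'y_test_coh.pt'),
    ('prel', 'train'): ('lhs_prel_train.pt', 'y_train_prel.pt'),
    ('prel', 'test'): ('lhs_prel_test.pt', 'y_test_prel.pt'),
}

def embedding_paths(file_path, model_type, train_or_test):
    """Return (lhs_path, y_path) or raise ValueError for an unknown combination."""
    try:
        lhs_name, y_name = EMBEDDING_FILES[(model_type, train_or_test)]
    except KeyError:
        raise ValueError(
            "Unsupported combination: model_type=%r, train_or_test=%r"
            % (model_type, train_or_test)
        )
    return file_path + '/' + lhs_name, file_path + '/' + y_name

paths = embedding_paths('/tmp/experiment', 'coherence', 'test')
print(paths)
# ('/tmp/experiment/lhs_coherence_test.pt', '/tmp/experiment/y_test_coh.pt')
```

Raising an exception instead of printing and returning `None` makes an invalid argument fail at the call site rather than later, when the caller tries to unpack the result.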

In [None]:
def prepare_embeddings_updated (df, model_type='semantic', train_or_test='test', load_from_file=True, 
                        file_path='/content/drive/MyDrive/Colab Notebooks/AES/experiment_XX' ,
                        embedding_type='sen_avg',max_sentences=128, max_words=512, hstate='last4sum', gold_field="normalized_score" ):
  # Arguments Description:
  # ----------------------
  # 
  #   df:             Dataframe containing the essays and scores
  #
  #   model_type:     Supports 3 model types: 'semantic', 'coherence' and 'p_rel' (Prompt Relevance)
  #
  #   train_or_test:  Whether these are "training" vectors (essays) or "test" vectors
  #
  #   load_from_file: Boolean flag whether to load the embeddings from previously stored file or generate 
  #                   & save afresh
  #
  #   file_path:      Base directory where models and embeddings will be saved
  #
  #   embedding_type: Supports 3 embedding types: 
  #                     - 'sen_avg':  Averages embeddings for every sentence, embedding vector size per 
  #                                   essay: (max_sentences * 768)
  #                     - 'para_avg': Averages embeddings first for every sentence & then averages these 
  #                                   for the full essay, embedding vector size per essay: (1x768)
  #                     - 'full_emb': Creates embedding for the entire essay - embedding vector size per
  #                                   essay: (max_words * 768)
  #
  #   max_sentences:  Relevant for sentence average (sen_avg) embedding type: maximum no of sentences 
  #                   permissible in an essay
  #
  #   max_words:      Relevant for the full embedding (full_emb) embedding type: maximum no of words
  #                   permissible in an essay
  #
  #   hstate:         How embedding is computed using BERT's hidden states:
  #                     - 'last4sum': Embedding computed by summing the last 4 hidden states of the instantiated BERT model
  #                     - 'second_last':  Embedding computed by picking the 2nd last hidden state of the instantiated BERT model
  #
  #   gold_field:     Dataframe column that holds the gold score (default: 'normalized_score')
  #
  from transformers import BertModel, BertConfig, BertTokenizer

  print ("Preparing Embeddings...")
  print ("Model Type: ", model_type)
  print ("Embedding Type: ", embedding_type)
  print ("hState: ", hstate)
  print ("Save File Directory: ", file_path)
  if (not df.empty):
    print ("Dataframe provided, Size: ", df.shape)

  if (model_type =='semantic'):
    if (train_or_test=='train'):
      lhs_path = file_path + '/lhs_train.pt'
      y_path = file_path + '/y_train.pt'
    else:
      if (train_or_test == 'test'):
        lhs_path = file_path + '/lhs_test.pt'
        y_path = file_path + '/y_test.pt'
      else:
        print ("Invalid choice for train_or_test. Returning NONE")
        return
  else:
    if (model_type =='coherence'):
      if (train_or_test=='train'):
        lhs_path = file_path + '/lhs_coherence_train.pt'
        y_path = file_path + '/y_train_coh.pt'
      else:
        if (train_or_test == 'test'):
          lhs_path = file_path + '/lhs_coherence_test.pt'
          y_path = file_path + '/y_test_coh.pt'
        else:
          print ("Invalid choice for train_or_test. Returning NONE")
          return
    else:
      if (model_type =='p_rel'):
        if (train_or_test=='train'):
          lhs_path = file_path + '/lhs_prel_train.pt'
          y_path = file_path + '/y_train_prel.pt'
        else:
          if (train_or_test == 'test'):
            lhs_path = file_path + '/lhs_prel_test.pt'
            y_path = file_path + '/y_test_prel.pt'
          else:
            print ("Invalid choice for train_or_test. Returning NONE")
            return
      else:
        print ("Please choose a valid model - one of SEMANTIC, COHERENCE or PREL")
        return

  if (load_from_file == True):
    print ("Loading existing embeddings from file...")
    print ("LHS File chosen: ", lhs_path)
    print ("Y File chosen: ", y_path)
    lhs = torch.load(lhs_path)
    y_gold = torch.load(y_path)
    print ("Loaded, Size of LHS embeddings: ", lhs.shape)
    print ("Loaded, Size of y Gold: ", y_gold.shape)
  else:
    print ("Generating embeddings from scratch & extracting the Y_Gold from dataframe...\n")
    
    if (df.empty):
      print ("Null dataframe, please provide a valid dataframe")
      return

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
    model = BertModel.from_pretrained('bert-base-uncased', config=config)

    if (model_type=='p_rel'):
        essays = df['combined_essay']
    else:
        essays = df['essay']
    y_gold = df[gold_field]
    sentences = []
    tokenize_sentences = []
    
    cuda = torch.device('cuda')

    if (embedding_type == 'sen_avg'):
      print ("Using Sentence Average Embedding...")
      # Embeddings for the dataframe provided
      lhs = torch.empty((len(essays),max_sentences,768), dtype=torch.float)
      emb_for_padding = tokenizer.encode_plus("", add_special_tokens=True, truncation=True, padding="max_length", return_tensors="pt", max_length=10)
      tt = torch.tensor(emb_for_padding['input_ids'])
      output = model(tt)
      if (hstate=='second_last'):
        # getting the 2nd last layer
        lhs_for_padding = output.hidden_states[11]
      else:
        if (hstate=='last4sum'):
          lhs_for_padding = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
        else:
          print ("Invalid value provided for hstate")
          return
      #lhs_for_padding = model(tt)[2][-2]
      lhs_for_padding_np = np.array(lhs_for_padding.detach().numpy())
      lhs_for_padding_mean = np.mean(lhs_for_padding_np,axis=1)
      lhs_avg_for_padding = torch.tensor(lhs_for_padding_mean[0])
    
      for j,essay in enumerate(essays):
        if (j%200 ==0):
          print ("Iteration: ", j)

        sentences = re.split(r'\. |\? |! ', essay)
        sen_length = len(sentences)
        lhs_sentence_avg = np.zeros((max_sentences,768), dtype=float)

        #for i in range(min(85,len(sentences))):
        for i,s in enumerate(sentences):
          if (i>=max_sentences):
            break
          tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
          tt = torch.tensor(tokenize_sentence)
          tts = tt.reshape(1,len(tt))
          output = model(tts)
          if (hstate=='second_last'):
            # getting the 2nd last layer
            lhs_sentence = output.hidden_states[11]
          else:
            if (hstate=='last4sum'):
              lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
            else:
              print ("Invalid value provided for hstate")
              return
          #lhs_sentence = model(tts).hidden_states[11]
          lhs_sentence_np = np.array(lhs_sentence.detach().numpy())
          lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
          lhs_sentence_avg[i] = lhs_sentence_np_mean[0]
  
        lhs[j] = torch.tensor(lhs_sentence_avg)

        if (sen_length < max_sentences):
          for i in range (sen_length, max_sentences):
            lhs[j][i]= lhs_avg_for_padding
    else:
      if (embedding_type =='para_avg'):
        print ("Using Paragraph Average Embedding...")
        lhs = torch.empty((len(essays),1,768), dtype=torch.float)
        #lhs = torch.empty((1,768), dtype=torch.float)
        #for j in range(len(prompt_data)):
        for j,essay in enumerate(essays):
          if (j%200 ==0):
            print ("Iteration: ", j)
          sentences = split_into_sentences(essay)
          sen_length = len(sentences)
  
          lhs_sentence_avg = np.zeros((1,768), dtype=float)
          lhs_avg_sen = np.empty((0,768), dtype=float)

          for i in range(min(max_sentences,len(sentences))):
            tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
            tt = torch.tensor(tokenize_sentence)
            tts = tt.reshape(1,len(tt))
            output = model(tts)
            if (hstate=='second_last'):
              # getting the 2nd last layer
              lhs_sentence = output.hidden_states[11]
            else:
              if (hstate=='last4sum'):
                lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
              else:
                print ("Invalid value provided for hstate")
                return
            lhs_sentence_np = np.array(lhs_sentence.detach().numpy())
            lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
            lhs_avg_sen = np.append(lhs_avg_sen,lhs_sentence_np_mean, axis=0)

          lhs_sentence_avg = np.mean(lhs_avg_sen, axis=0, keepdims=True)
          lhs[j] = torch.tensor(lhs_sentence_avg)
      else:
        if (embedding_type =='full_emb'):
          print ("Embedding Type: Full Embedding ...")
          #lhs = torch.zeros((len(essays),max_words,768), dtype=torch.float)
          lhs_np = np.zeros((len(essays), max_words, 768), dtype=float)
          for j,essay in enumerate(essays):
            if (j%200 ==0):
              print ("Iteration: ", j)
            tokenized_essay = tokenizer.encode_plus(essay, add_special_tokens=True, truncation=True, 
                                                    padding="max_length", max_length=max_words, 
                                                    return_tensors="pt")
          
            output = model(**tokenized_essay)
            #print ("HS data type: ", type(output.hidden_states[11]))
            if (hstate=='second_last'):
              # getting the 2nd last layer
              lhs_np[j] = output.hidden_states[11].detach().numpy()
            else:
              if (hstate=='last4sum'):
                lhs_np[j] = (output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]).detach().numpy()#.reshape(max_words,768)
              else:
                print ("Invalid value provided for hstate")
                return
            del output

          lhs = torch.tensor(lhs_np)
          del (lhs_np)
          #torch.save(y_gold, y_path)
          #torch.save(lhs_np, lhs_path)
          #print ("Full EMb: Saved LHS & Y_gold...") 
          #print ("Full Emb: Returning lhs: Shape: ", lhs.shape)
          #print ("Full Emb: Returning y_gold: Shape: ", y_gold.shape)
          #return (torch.tensor(lhs_np), y_gold)

    torch.save(lhs, lhs_path)
    torch.save(y_gold, y_path)

    print ("Saved LHS & Y_gold...") 
    
  print ("Returning lhs: Shape: ", lhs.shape)
  print ("Returning y_gold: Shape: ", y_gold.shape)
  return lhs, y_gold
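
The `sen_avg` branch above averages the token vectors of each sentence and pads short essays up to `max_sentences` rows. A minimal sketch of that pooling-and-padding step follows, using plain Python lists in place of BERT hidden states so it runs without the notebook's dependencies. The helper names `mean_pool` and `essay_matrix` and the 2-dimensional toy vectors are illustrative assumptions, not part of the notebook.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one sentence vector."""
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]


def essay_matrix(sentences, max_sentences, pad_vector):
    """Build a (max_sentences x dim) matrix: one mean-pooled row per sentence,
    truncated to max_sentences and padded with pad_vector when the essay is short."""
    rows = [mean_pool(sentence) for sentence in sentences[:max_sentences]]
    rows += [list(pad_vector) for _ in range(max_sentences - len(rows))]
    return rows


# Toy essay: two sentences with 2-dimensional "token embeddings" (hypothetical values).
toy_sentences = [
    [[1.0, 2.0], [3.0, 4.0]],  # sentence with two tokens
    [[5.0, 6.0]],              # sentence with one token
]
matrix = essay_matrix(toy_sentences, max_sentences=4, pad_vector=[0.0, 0.0])
print(matrix)
```

In the notebook the same idea operates on 768-dimensional vectors, and the padding row is `lhs_avg_for_padding` rather than zeros.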

In [None]:
def evaluate_model (model, lhs_test, y_test):
  y_pred = model.predict(lhs_test.numpy())
  tt1 = np.around(10*y_pred)
  tt2 = tt1.reshape(tt1.shape[0],)
  pred_values = tt2.astype(int)
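  # Round (rather than truncate) the gold scores so they are discretized the same way as the predictions
  tt3 = np.around(10 * np.array(y_test))
  gold_values = tt3.astype(int)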
  # evaluate the model
  result = cohen_kappa_score(gold_values,pred_values,weights='quadratic')
  print("Kappa Score: {}".format(result))
  yy_p = y_pred.reshape(y_pred.shape[0],)
  yy_t = np.array(y_test)
  MSE = np.square(np.subtract(yy_t, yy_p)).mean()
  RMSE = math.sqrt(MSE)
  print ("MSE: ", MSE)
  print ("RMSE: ", RMSE)
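
`evaluate_model` relies on `cohen_kappa_score(..., weights='quadratic')`. To make the metric less opaque, here is a from-scratch sketch of the standard quadratic weighted kappa formula on integer labels. The function name and the `num_labels` parameter are assumptions made for illustration. The sketch weights disagreements by the difference in label values, so it is not guaranteed to match scikit-learn digit for digit in every case, for example when some score levels never occur in the data.

```python
def quadratic_weighted_kappa(gold, pred, num_labels):
    """Quadratic weighted kappa for integer labels in [0, num_labels)."""
    # Observed confusion matrix: rows are gold labels, columns are predictions.
    observed = [[0] * num_labels for _ in range(num_labels)]
    for g, p in zip(gold, pred):
        observed[g][p] += 1

    n = len(gold)
    gold_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(num_labels)) for j in range(num_labels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_labels):
        for j in range(num_labels):
            weight = (i - j) ** 2 / (num_labels - 1) ** 2
            expected = gold_hist[i] * pred_hist[j] / n  # agreement expected by chance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected


perfect = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], num_labels=4)
reversed_binary = quadratic_weighted_kappa([0, 0, 1, 1], [1, 1, 0, 0], num_labels=2)
print(perfect, reversed_binary)
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.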

## Initializing classes and paths

In [None]:
# Initialize the RUN configuration

run_semantic = True
run_coherence = True
run_prelevance = True

In [None]:
import time
import torch
import transformers as ppb
import warnings

# Parent directory on Google Drive where all models and other data are stored
#model_path = '/content/drive/MyDrive/Colab Notebooks/AES/experiment_1'
model_path = '/content/drive/MyDrive/Colab Notebooks/AES/experiment_XX'
sem_model_save_path = model_path + '/lstm_model.pt'
coh_model_save_path = model_path + '/coh-lstm_model-latest.pt'
prel_model_save_path = model_path + '/prel-lstm_model-latest.pt'

data_with_errors_path = model_path + '/data_w_errors.csv'

In [None]:
np.random.seed(42)

In [None]:
regressor = xgb.XGBRegressor(
    n_estimators=200,
    reg_lambda=1,
    gamma=0,
    eta = 0.1,
    max_depth=6,
    objective='reg:squarederror'
)

## CNN

In [None]:
def get_model_CNN(Hidden_dim1=400, Hidden_dim2=128, return_sequences = True, dropout=0.5, recurrent_dropout=0.4, input_size=768,output_dims=10380, activation='relu', bidirectional = False):
    """Define the model."""
    inputs = Input(shape=(768,1))
    x = Conv1D(64, 3, strides=1, padding='same', activation='relu')(inputs)
    #Cuts the size of the output in half, maxing over every 2 inputs
    x = MaxPooling1D(pool_size=2)(x)
    x = Conv1D(128, 3, strides=1, padding='same', activation='relu')(x)
    x = GlobalMaxPooling1D()(x) 
    outputs = Dense(output_dims, activation='relu')(x)
    model = Model(inputs=inputs, outputs=outputs, name='CNN')
    model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae','mse'])
    model.summary()
    return model

In [None]:
class CNN(nn.Module):
	def __init__(self, batch_size, output_size, in_channels, out_channels, kernel_heights, stride, padding, keep_probab, vocab_size, embedding_length, weights):
		super(CNN, self).__init__()
		
		"""
		Arguments
		---------
		batch_size : Size of each batch which is same as the batch_size of the data returned by the TorchText BucketIterator
		output_size : 2 = (pos, neg)
		in_channels : Number of input channels. Here it is 1 as the input data has dimension = (batch_size, num_seq, embedding_length)
		out_channels : Number of output channels after convolution operation performed on the input matrix
		kernel_heights : A list consisting of 3 different kernel_heights. Convolution will be performed 3 times and finally results from each kernel_height will be concatenated.
		keep_probab : Probability of retaining an activation node during dropout operation
		vocab_size : Size of the vocabulary containing unique words
		embedding_length : Embedding dimension of GloVe word embeddings
		weights : Pre-trained GloVe word_embeddings which we will use to create our word_embedding look-up table
		--------
		
		"""
		self.batch_size = batch_size
		self.output_size = output_size
		self.in_channels = in_channels
		self.out_channels = out_channels
		self.kernel_heights = kernel_heights
		self.stride = stride
		self.padding = padding
		self.vocab_size = vocab_size
		self.embedding_length = embedding_length
		
		# self.word_embeddings = nn.Embedding(vocab_size, embedding_length)
		# self.word_embeddings.weight = nn.Parameter(weights, requires_grad=False)

		self.conv1 = nn.Conv2d(in_channels, out_channels, (kernel_heights[0], embedding_length), stride, padding)
		self.conv2 = nn.Conv2d(in_channels, out_channels, (kernel_heights[1], embedding_length), stride, padding)
		self.conv3 = nn.Conv2d(in_channels, out_channels, (kernel_heights[2], embedding_length), stride, padding)
		self.dropout = nn.Dropout(1 - keep_probab)  # nn.Dropout expects the probability of dropping a unit
		self.label = nn.Linear(len(kernel_heights)*out_channels, output_size)
	
	def conv_block(self, input, conv_layer):
		conv_out = conv_layer(input)# conv_out.size() = (batch_size, out_channels, dim, 1)
		activation = F.relu(conv_out.squeeze(3))# activation.size() = (batch_size, out_channels, dim1)
		max_out = F.max_pool1d(activation, activation.size()[2]).squeeze(2)# maxpool_out.size() = (batch_size, out_channels)
		
		return max_out
	
	def forward(self, input_sentences, batch_size=None):
		
		"""
		The idea of the Convolutional Neural Network for Text Classification is very simple. We perform convolution operation on the embedding matrix 
		whose shape for each batch is (num_seq, embedding_length) with kernel of varying height but constant width which is same as the embedding_length.
		We will be using ReLU activation after the convolution operation and then for each kernel height, we will use max_pool operation on each tensor 
		and will filter all the maximum activation for every channel and then we will concatenate the resulting tensors. This output is then fully connected
		to the output layers consisting two units which basically gives us the logits for both positive and negative classes.
		
		Parameters
		----------
		input_sentences: input_sentences of shape = (batch_size, num_sequences)
		batch_size : default = None. Used only for prediction on a single sentence after training (batch_size = 1)
		
		Returns
		-------
		Output of the linear layer containing logits for pos & neg class.
		logits.size() = (batch_size, output_size)
		
		"""
		
		input = input_sentences
		# input.size() = (batch_size, num_seq, embedding_length)
		# print("Input size is: ", input.size())
		input = input.unsqueeze(1)
		# input.size() = (batch_size, 1, num_seq, embedding_length)
		max_out1 = self.conv_block(input, self.conv1)
		max_out2 = self.conv_block(input, self.conv2)
		max_out3 = self.conv_block(input, self.conv3)
		
		all_out = torch.cat((max_out1, max_out2, max_out3), 1)
		# all_out.size() = (batch_size, num_kernels*out_channels)
		fc_in = self.dropout(all_out)
		# fc_in.size()) = (batch_size, num_kernels*out_channels)
		logits = self.label(fc_in)
		
		return logits

In [None]:
class CNN(nn.Module):   
    def __init__(self, batch_size, output_size, in_channels, out_channels, kernel_heights, stride, padding, keep_probab, embedding_dim):
        super(CNN, self).__init__()
  
        self.batch_size = batch_size
        self.output_size = output_size
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_heights = kernel_heights
        self.stride = stride
        self.padding = padding
            # self.vocab_size = vocab_size
        self.embedding_length = embedding_dim[1]
        self.num_sentences = embedding_dim[0]
        # self.sentence_embeddings = essay_embedding

        self.conv1 = nn.Conv2d(in_channels, out_channels, (kernel_heights[0], embedding_dim[1]), stride, padding)
        self.conv2 = nn.Conv2d(in_channels, out_channels, (kernel_heights[1], embedding_dim[1]), stride, padding)
        self.conv3 = nn.Conv2d(in_channels, out_channels, (kernel_heights[2], embedding_dim[1]), stride, padding)
        self.dropout = nn.Dropout(1 - keep_probab)  # nn.Dropout expects the probability of dropping a unit
        self.label = nn.Linear(len(kernel_heights)*out_channels, output_size)
	
    def conv_block(self, input, conv_layer):
        conv_out = conv_layer(input)# conv_out.size() = (batch_size, out_channels, dim, 1)
        activation = F.relu(conv_out.squeeze(3))# activation.size() = (batch_size, out_channels, dim1)
        print ("Activation Shape:", activation.shape)
        max_out = F.max_pool1d(activation, activation.size()[2]).squeeze(2)# maxpool_out.size() = (batch_size, out_channels)

        return max_out
	
    def forward(self, input_sentences, batch_size=None):    
		
        """
        Convolutional network over the sentence embeddings of an essay. We convolve the embedding matrix,
        whose shape for each essay is (num_seq, embedding_length), with kernels of varying height but constant width equal to embedding_length.
        ReLU is applied after each convolution; each feature map is then max-pooled over the sequence dimension
        and the pooled tensors for all kernel heights are concatenated. The result is passed through a fully connected
        layer with output_size units, which gives one logit per score class.

        Parameters
        ----------
        input_sentences: tensor of shape = (batch_size, num_seq, embedding_length)
        batch_size : default = None. Not used in the forward pass.

        Returns
        -------
        Output of the linear layer containing one logit per score class.
        logits.size() = (batch_size, output_size)

        """
		
        input = input_sentences
        # input.size = (self.batch_size, num_seq, self.embedding_length)
        input = input.unsqueeze(1)
        # input.size = (self.batch_size, 1, num_seq, self.embedding_length)
        max_out1 = self.conv_block(input, self.conv1)
        print(max_out1.shape)
        max_out2 = self.conv_block(input, self.conv2)
        max_out3 = self.conv_block(input, self.conv3)

        all_out = torch.cat((max_out1, max_out2, max_out3), 1)
        # all_out.size() = (batch_size, num_kernels*out_channels)
        fc_in = self.dropout(all_out)
        # fc_in.size()) = (batch_size, num_kernels*out_channels)
        logits = self.label(fc_in)

        return logits

In [None]:
def clip_gradient(model, clip_value):
    params = list(filter(lambda p: p.grad is not None, model.parameters()))
    for p in params:
        p.grad.data.clamp_(-clip_value, clip_value)
    
def train_model(model, lhs_train, y_train, epoch, batch_size=32):
    total_epoch_loss = 0
    total_epoch_acc = 0
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    # Note: the optimizer is re-created on every call, so Adam's state is reset each epoch.
    optim = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()))
    steps = 0
    model.train()
    for i in range(0, len(lhs_train), batch_size):
        text = lhs_train[i:i+batch_size].to(device)
        target = y_train[i:i+batch_size].long().to(device)
        # print("text shape: ", text.shape)
        # print("target shape: ", target.shape)
        if text.size(0) != batch_size:  # Skip the final batch if it is smaller than batch_size.
            continue
        optim.zero_grad()
        prediction = model(text)
        # Class-index targets with raw logits: use the functional cross-entropy loss.
        loss = F.cross_entropy(prediction, target)
        num_corrects = (torch.max(prediction, 1)[1].view(target.size()) == target).float().sum()
        acc = 100.0 * num_corrects / len(text)
        loss.backward()
        clip_gradient(model, 1e-1)
        optim.step()
        steps += 1

        if steps % 100 == 0:
            print (f'Epoch: {epoch+1}, Step: {steps}, Training Loss: {loss.item():.4f}, Training Accuracy: {acc.item(): .2f}%')

        total_epoch_loss += loss.item()
        total_epoch_acc += acc.item()

    # Average over the batches that were actually processed.
    return total_epoch_loss/max(steps, 1), total_epoch_acc/max(steps, 1)

def eval_model(model, val_iter):
    total_epoch_loss = 0
    total_epoch_acc = 0
    model.eval()
    with torch.no_grad():
        for idx, batch in enumerate(val_iter):
            text = batch.text[0]
            if (text.size()[0] != 32):
                continue
            target = batch.label
            target = torch.autograd.Variable(target).long()
            if torch.cuda.is_available():
                text = text.cuda()
                target = target.cuda()
            prediction = model(text)
            loss = F.cross_entropy(prediction, target)
            num_corrects = (torch.max(prediction, 1)[1].view(target.size()).data == target.data).sum()
            acc = 100.0 * num_corrects/len(batch)
            total_epoch_loss += loss.item()
            total_epoch_acc += acc.item()

    return total_epoch_loss/len(val_iter), total_epoch_acc/len(val_iter)
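
Both loops above skip any batch that does not contain exactly 32 items, which affects how many steps an epoch really has and therefore what the per-epoch averages should be divided by. The sketch below counts full and partial batches for the 6488 training essays reported later in the notebook. The helper name `batch_sizes` is an assumption made for illustration.

```python
def batch_sizes(n_items, batch_size):
    """Sizes of the consecutive slices produced by range(0, n_items, batch_size)."""
    return [min(batch_size, n_items - start) for start in range(0, n_items, batch_size)]


sizes = batch_sizes(6488, 32)
full_batches = [s for s in sizes if s == 32]
skipped = [s for s in sizes if s != 32]

print("total slices:", len(sizes))
print("full batches:", len(full_batches))
print("skipped batch sizes:", skipped)
```

With these numbers only the last slice is short, so one batch of essays per epoch is never seen during training unless the data is shuffled between epochs.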

In [None]:
#from keras.utils import np_utils
#from numpy import np_utils
y_train_labels = y_train.apply(lambda x: int(10*x))
y_train_tensor = torch.tensor(y_train_labels.values)
one_hot_y = F.one_hot(y_train_tensor, 11)
if torch.cuda.is_available():
    one_hot_y = one_hot_y.cuda()
print (one_hot_y.shape)

torch.Size([6488, 11])


In [None]:
y_train_labels.value_counts()

6     1870
5      909
7      809
3      719
10     667
4      613
2      363
8      310
0      224
9        2
1        2
Name: normalized_score, dtype: int64
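
The class labels above come from `int(10*x)`, which truncates, whereas `evaluate_model` discretizes predictions with `np.around`, which rounds. A small sketch of how the two differ on the normalized scores shown by `tpd_train.head()` (the list name `scores` is an assumption; built-in `round` is used here in place of `np.around`):

```python
# Normalized scores as printed by tpd_train.head() in this notebook.
scores = [0.333333, 0.666667, 0.8, 0.25]

# Truncation, as in y_train.apply(lambda x: int(10*x)).
truncated = [int(10 * s) for s in scores]

# Rounding to the nearest integer; Python's round() sends exact halves
# to the nearest even integer.
rounded = [int(round(10 * s)) for s in scores]

for s, t, r in zip(scores, truncated, rounded):
    print(f"score={s}: truncated={t}, rounded={r}")
```

A score of 0.666667 lands in class 6 under truncation but in class 7 under rounding, so labels and predictions should be discretized with the same rule before they are compared.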

In [None]:
y_train_tensor.shape

torch.Size([6488])

In [None]:
# loss_fn = F.categorical_crossentropy

for epoch in range(10):
    train_loss, train_acc = train_model(model, lhs_train, y_train_tensor, epoch)
    # val_loss, val_acc = eval_model(model, valid_iter)
    
    print(f'Epoch: {epoch+1:02}, Train Loss: {train_loss:.3f}, Train Acc: {train_acc:.2f}%')
    

  return torch.max_pool1d(input, kernel_size, stride, padding, dilation, ceil_mode)


RuntimeError: ignored

In [None]:
total_epoch_loss = 0
total_epoch_acc = 0
model.cuda()
optim = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()))

In [None]:
text = lhs_train[:32]
#target = one_hot_y[:32]
target = y_train_tensor[:32]


In [None]:
print (text.shape)
print (target.shape)

torch.Size([32, 128, 768])
torch.Size([32])


In [None]:
target = torch.autograd.Variable(target).long()
if torch.cuda.is_available():
    text = text.cuda()
    target = target.cuda()

In [None]:
optim.zero_grad()
prediction = model(text)
print ("Size of prediction:", prediction.shape)
print (prediction[5])
print (target[5])

Activation Shape: torch.Size([32, 32, 128])
torch.Size([32, 32])
Activation Shape: torch.Size([32, 32, 127])
Activation Shape: torch.Size([32, 32, 126])
Size of prediction: torch.Size([32, 11])
tensor([-1.1847, -0.3497, -0.2948, -1.7616,  0.8465, -1.2724,  1.5663, -0.0714,
         1.1757, -1.3443, -2.0278], device='cuda:0', grad_fn=<SelectBackward>)
tensor(2, device='cuda:0')
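
For a single example, cross-entropy on raw logits is the negative log-softmax of the target class. The sketch below evaluates it for the 11 logits and the target class printed above, in plain Python. The function name `cross_entropy` is an illustrative assumption and this is not the PyTorch implementation itself.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax of the target class, computed with the
    log-sum-exp trick for numerical stability."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[target]


# Logits of prediction[5] and the class index target[5] from the output above.
logits = [-1.1847, -0.3497, -0.2948, -1.7616, 0.8465, -1.2724,
          1.5663, -0.0714, 1.1757, -1.3443, -2.0278]
target = 2

loss = cross_entropy(logits, target)
predicted_class = max(range(len(logits)), key=lambda k: logits[k])
print("loss:", loss)
print("predicted class:", predicted_class)
```

The loss takes a class index as its target, which is why the training loop uses `y_train_tensor` rather than `one_hot_y`.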


In [None]:
Train ={}
Train["Label"] = torch.empty(20, dtype=torch.long).random_(2)
print (Train["Label"].shape)
print (Train["Label"])

torch.Size([20])
tensor([1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0])


In [None]:
loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(prediction, target)
num_corrects = (torch.max(prediction, 1)[1].view(target.size()).data == target.data).float().sum()
acc = 100.0 * num_corrects/len(text)
loss.backward()
clip_gradient(model, 1e-1)
optim.step()

## CNN BERT 2

In [None]:
batch_size = 32
output_size = 11
in_channels = 1
out_channels = 32
kernel_heights = [1,2,3]
stride = 1
padding = 0
keep_probab = 0.5
embedding_dims = (128,768)

In [None]:
model = CNN(batch_size, output_size, in_channels, out_channels, kernel_heights, stride, padding, keep_probab, embedding_dims)

In [None]:
lhs_train, y_train = prepare_embeddings_updated (tpd_train, model_type='semantic', train_or_test='train', load_from_file=load_bert_sem, embedding_type=embedding, max_words=max_words_for_full_emb_sem, file_path=model_path)

Preparing Embeddings...
Model Type:  semantic
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX
Dataframe provided, Size:  (6488, 6)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_train.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_train.pt
Loaded, Size of LHS embeddings:  torch.Size([6488, 128, 768])
Loaded, Size of y Gold:  (6488,)
Returning lhs: Shape:  torch.Size([6488, 128, 768])
Returning y_gold: Shape:  (6488,)


## CNN BERT

In [None]:
use_gpu = True
seed = 42
max_length = 64
batch_size = 16
lr = 2e-5

In [None]:
from transformers import BertModel, BertConfig, BertTokenizer

In [None]:
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
model = BertModel.from_pretrained('bert-base-uncased', config=config)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [None]:
ts = tokenizer.encode('Hello, how are you doing today?',add_special_tokens=True, max_length=512, padding="max_length", truncation=True)
tt = torch.tensor(ts)
tts = tt.reshape(1,len(tt))
print ("tt shape:", tt.shape)
print ("tts shape:", tts.shape)
output = model(tts)
ooo1 = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
ooo2 = output[2][-4:]
# print ("Size ADDED:", ooo1.shape)
# print ("Size CONCAT 0:", ooo2[0].shape)
# print ("Size CONCAT 1:", ooo2[1].shape)
# print ("Size CONCAT 2:", ooo2[2].shape)
# print ("Size CONCAT 3:", ooo2[3].shape)
x = torch.stack(ooo2, dim=1)
print ("Stacked size:", x.shape)
xx = x.squeeze(3)
print ("Squeezed size:", xx.shape)
num_filters = 32
embed_size = 768
filter_sizes = [1,2,3,4,5]
#conv = nn.Conv2d(4, num_filters, (1, embed_size))
conv = nn.ModuleList([nn.Conv2d(4, num_filters, (K, embed_size)) for K in filter_sizes])
cnx = conv[0](xx)
print("Size of conv output: ", cnx[0].shape)
relx = [F.relu(cnx(x)).squeeze(3) for cnx in conv] 
print("Size of relu output (after squeeze): ", relx[1].shape)
#mpoolx = [F.max_pool1d(relx[0], relx[0].size(2)).squeeze(2)]
# mpoolx = [F.max_pool1d(i, i.size(2)).squeeze(2) for i in relx] 
# out = torch.cat(mpoolx, 1)
# print ("Con Size:", cnx.shape)
# print ("post rel Size:", relx[0].size())
# print ("post rel Size 2:", relx[0].size(2))
# print ("Post maxpool size:", mpoolx[0].size())
# print ("Out:", out.size())

tt shape: torch.Size([512])
tts shape: torch.Size([1, 512])
Stacked size: torch.Size([1, 4, 512, 768])
Squeezed size: torch.Size([1, 4, 512, 768])
Size of conv output:  torch.Size([32, 512, 1])
Size of relu output (after squeeze):  torch.Size([1, 32, 511])
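
The sequence lengths in these outputs follow from the usual convolution size arithmetic. A short sketch, assuming stride 1 and no padding as in the cells above (the helper name `conv_output_length` is an assumption made for illustration):

```python
def conv_output_length(n, kernel, stride=1, padding=0):
    """Output length of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


# 512 token positions with filter heights 1..5, as in the BERT cell above.
token_lengths = {k: conv_output_length(512, k) for k in [1, 2, 3, 4, 5]}

# 128 sentence positions with kernel heights 1..3, as in the CNN class.
sentence_lengths = {k: conv_output_length(128, k) for k in [1, 2, 3]}

print(token_lengths)
print(sentence_lengths)
```

Because each kernel spans the full embedding width of 768, the width dimension collapses to 1 and can be squeezed away, which leaves only the sequence dimension to max-pool over.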


In [None]:
mpoolx[0]

tensor([[0.4199, 0.4277, 0.6975, 0.8807, 1.0234, 0.5523, 0.0000, 0.4113, 0.7582,
         0.4670, 0.6480, 1.2027, 0.7079, 0.8579, 0.7264, 0.2271, 0.6367, 0.3200,
         0.3867, 1.1314, 0.4128, 0.4859, 0.5273, 0.6375, 1.1721, 0.8330, 1.1401,
         0.7078, 0.3995, 0.7792, 1.1293, 1.0822]], grad_fn=<SqueezeBackward1>)

In [None]:
from tensorflow import keras
from tensorflow.keras import layers
from keras.layers import Embedding, Input, LSTM, Dense, Dropout, Lambda, Flatten, Bidirectional, Conv2D, Conv1D, MaxPooling1D, GlobalMaxPooling1D, Concatenate
from keras.models import Sequential,Model, load_model, model_from_config
import keras.backend as K

num_filters = 32
embed_size = 768
filter_sizes = [1,2,3,4,5]
input_shape = [1,128,768,4]
#conv_input = Input(shape=input_shape[1:])
conv_input = Input(shape=input_shape)
conv_op = []
mp_op = []
#conv_op = layers.Conv2D(num_filters, (2,768), input_shape=input_shape[1:], activation="relu")(encoder_input)
for K in filter_sizes:
  print((Conv2D(num_filters, (K, embed_size), input_shape=input_shape,activation="relu")(conv_input)).shape)
  conv_op.append(Conv2D(num_filters, (K, embed_size), input_shape=input_shape,activation="relu")(conv_input))
print ("Con length:", len(conv_op))

#pooling each parallel conv layer
for i in range(0, len(conv_op)):
  print ("Size CONCAT:", i, conv_op[i].shape)
  print ("Squeeze: ", tf.squeeze(conv_op[i],axis=0).shape)
  #mp_op.append(layers.MaxPool1D(conv_op[i].size(2)))(conv_op[i])
  mp_op.append(MaxPooling1D(128)(tf.squeeze(conv_op[i])))

#print("MP OP shape",mp_op[0].shape)
#out = Concatenate(axis=1)(mp_op)
#print("Output shape: ", out.shape)


#print ("Size:", conv_op.shape)
#x = layers.Conv2D(32, 3, activation="relu")(x)
#x = layers.MaxPooling2D(3)(x)
#x = layers.Conv2D(32, 3, activation="relu")(x)
#x = layers.Conv2D(16, 3, activation="relu")(x)
#encoder_output = layers.GlobalMaxPooling2D()(x)

#encoder = keras.Model(encoder_input, encoder_output, name="encoder")
#encoder.summary()

(None, 1, 128, 1, 32)
(None, 1, 127, 1, 32)
(None, 1, 126, 1, 32)
(None, 1, 125, 1, 32)
(None, 1, 124, 1, 32)
Con length: 5
Size CONCAT: 0 (None, 1, 128, 1, 32)
Squeeze:  (1, 128, 1, 32)
Size CONCAT: 1 (None, 1, 127, 1, 32)
Squeeze:  (1, 127, 1, 32)
Size CONCAT: 2 (None, 1, 126, 1, 32)
Squeeze:  (1, 126, 1, 32)
Size CONCAT: 3 (None, 1, 125, 1, 32)
Squeeze:  (1, 125, 1, 32)
Size CONCAT: 4 (None, 1, 124, 1, 32)
Squeeze:  (1, 124, 1, 32)


In [None]:
inputs = Input(shape=(768,1))
x = Conv1D(64, 3, strides=1, padding='same', activation='relu')(inputs)
#Cuts the size of the output in half, maxing over every 2 inputs
x = MaxPooling1D(pool_size=2)(x)
x = Conv1D(128, 3, strides=1, padding='same', activation='relu')(x)
x = GlobalMaxPooling1D()(x) 
outputs = Dense(output_dims, activation='relu')(x)
model = Model(inputs=inputs, outputs=outputs, name='CNN')
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae','mse'])
model.summary()

In [None]:
class CNNBert(nn.Module):
    
    def __init__(self, embed_size, bert_model):
        super(CNNBert, self).__init__()
        filter_sizes = [1,2,3,4,5]
        num_filters = 32
        self.convs1 = nn.ModuleList([nn.Conv2d(4, num_filters, (K, embed_size)) for K in filter_sizes])
        self.dropout = nn.Dropout(0.1)
        self.fc1 = nn.Linear(len(filter_sizes)*num_filters, 1)
        self.sigmoid = nn.Sigmoid()
        self.bert_model = bert_model

    def forward(self, x, input_masks, token_type_ids):
        x = self.bert_model(x, attention_mask=input_masks, token_type_ids=token_type_ids)[2][-4:]
        x = torch.stack(x, dim=1)
        x = [F.relu(conv(x)).squeeze(3) for conv in self.convs1] 
        x = [F.max_pool1d(i, i.size(2)).squeeze(2) for i in x]  
        x = torch.cat(x, 1)
        x = self.dropout(x)  
        logit = self.fc1(x)
        return self.sigmoid(logit)

NameError: ignored

In [None]:
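def prepare_set(text, max_length=128):
    """Return input_ids, attention_mask and token_type_ids for a set of texts, ready in BERT format."""
    global tokenizer

    # Each essay is encoded as one sequence, truncated or padded to max_length tokens.
    # (batch_encode_plus expects strings, so the essays are not split into sentences here.)
    t = tokenizer.batch_encode_plus(list(text),
                        padding='max_length',
                        truncation=True,
                        add_special_tokens=True,
                        max_length=max_length,
                        return_tensors='pt')
    return t['input_ids'], t['attention_mask'], t['token_type_ids']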

In [None]:
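from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup
from sklearn.metrics import f1_score, classification_report

# Assumed device setup: the notebook does not define `device` or `device_ids` in this section.
device = torch.device("cuda" if use_gpu and torch.cuda.is_available() else "cpu")
device_ids = list(range(torch.cuda.device_count()))

def train_bert_cnn(x_train, x_dev, y_train, y_dev, n_epochs=10, model_path="temp.pt", batch_size=batch_size):
    # `model` is assigned later in this function, which makes it a local name, so the
    # module-level BERT model has to be read through globals() to avoid UnboundLocalError.
    bert_model = globals()["model"]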
    
    print([len(x) for x in (y_train, y_dev)])
    y_train, y_dev = ( torch.FloatTensor(t) for t in (y_train, y_dev) )

    train_inputs, train_masks, train_type_ids = prepare_set(x_train, max_length=max_length)
    train_data = TensorDataset(train_inputs, train_masks, train_type_ids, y_train)
    train_sampler = RandomSampler(train_data)
    train_dataloader = DataLoader(train_data, sampler=train_sampler, batch_size=batch_size)

    # Create the DataLoader for our dev set.
    dev_inputs, dev_masks, dev_type_ids = prepare_set(x_dev, max_length=max_length)
    dev_data = TensorDataset(dev_inputs, dev_masks, dev_type_ids, y_dev)
    dev_sampler = SequentialSampler(dev_data)
    dev_dataloader = DataLoader(dev_data, sampler=dev_sampler, batch_size=batch_size)

    model = CNNBert(768, bert_model)
    if len(device_ids) > 1 and device.type == "cuda":
        model = nn.DataParallel(model, device_ids=device_ids)
    model.to(device)

    optimizer = AdamW(model.parameters(), lr=lr, weight_decay=0.9)
    loss_fn = nn.BCELoss()
    train_losses, val_losses = [], []
    np.random.seed(seed)
    torch.manual_seed(seed) 
    if device.type == "cuda":
        torch.cuda.manual_seed_all(seed)

    total_steps = len(train_dataloader) * n_epochs
    scheduler = get_linear_schedule_with_warmup(optimizer, 
                                        num_warmup_steps = 0,
                                        num_training_steps = total_steps)

    model.zero_grad()
    best_score = 0
    best_loss = 1e6

    for epoch in range(n_epochs):

        start_time = time.time()
        train_loss = 0 
        model.train(True)

        for batch in train_dataloader:
            b_input_ids, b_input_mask, b_token_type_ids, b_labels  = tuple(t.to(device) for t in batch)
            y_pred = model(b_input_ids, b_input_mask, b_token_type_ids)
            loss = loss_fn(y_pred, b_labels.unsqueeze(1))
            loss.backward()
            optimizer.step()
            train_loss += loss.item()
            scheduler.step()
            model.zero_grad()

        train_losses.append(train_loss)
        elapsed = time.time() - start_time
        model.eval()
        val_preds = []

        with torch.no_grad(): 
            val_loss = 0
            for batch in dev_dataloader:
                b_input_ids, b_input_mask, b_token_type_ids, b_labels  = tuple(t.to(device) for t in batch)
                y_pred = model(b_input_ids, b_input_mask, b_token_type_ids)
                loss = loss_fn(y_pred, b_labels.unsqueeze(1))
                val_loss += loss.item()
                y_pred = y_pred.cpu().numpy().flatten()
                val_preds += [ int(p >= 0.5) for p in y_pred ] 
                model.zero_grad()

        val_score = f1_score(y_dev.cpu().numpy().tolist(), val_preds)
        val_losses.append(val_loss)    
        print("Epoch %d Train loss: %.4f. Validation F1-Macro: %.4f  Validation loss: %.4f. Elapsed time: %.2fs."% (epoch + 1, train_losses[-1], val_score, val_losses[-1], elapsed))

        if val_score > best_score:
            torch.save(model.state_dict(), "temp.pt")
            print(classification_report(y_dev.cpu().numpy().tolist(), val_preds, digits=4))
            best_score = val_score

    model.load_state_dict(torch.load("temp.pt"))
    model.to(device)
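    # NOTE: `predict` is not defined in this section; this line assumes a module-level predict(self, ...) function exists.
    model.predict = predict.__get__(model)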
    model.eval()
    os.remove("temp.pt")
    return model
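
The training loop above decays the learning rate with `get_linear_schedule_with_warmup` and `num_warmup_steps = 0`. As a rough sketch of what that schedule does, the function below reproduces the linear warmup and decay multiplier in plain Python. The function name and the step count of 4060 (406 batches of 16 for 6488 essays, over 10 epochs) are assumptions made for illustration, not values taken from a run.

```python
def linear_schedule_factor(step, num_warmup_steps, num_training_steps):
    """Multiplier applied to the base learning rate at a given optimizer step."""
    if step < num_warmup_steps:
        # Linear ramp from 0 to 1 during warmup.
        return step / max(1, num_warmup_steps)
    # Linear decay from 1 to 0 over the remaining steps.
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


base_lr = 2e-5       # `lr` from the configuration cell
total_steps = 4060   # hypothetical: 406 batches per epoch * 10 epochs

factors = [linear_schedule_factor(s, 0, total_steps) for s in (0, 2030, 4060)]
print([base_lr * f for f in factors])
```

With no warmup the learning rate starts at its full value and reaches zero on the last step, so later epochs update the model progressively less.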

In [None]:
tpd_train.head()

Unnamed: 0.1,Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score
0,6351,9908,4,The author concludes the story w/this paragrap...,1,0.333333
1,6315,9872,4,I believe that the author concludes the story ...,2,0.666667
2,304,305,1,"Computers, a very much talked about subject. D...",10,0.8
3,8023,12771,5,I think in my opion is that the author was ver...,1,0.25
4,4442,6839,3,The setting that affect the cyclist is the con...,1,0.333333


In [None]:
# NOTE: y_dev must hold the gold scores for tpd_test; it is not defined in this section.
train_bert_cnn(tpd_train.essay, tpd_test.essay, y_train, y_dev, n_epochs=10, model_path="temp.pt", batch_size=batch_size)

## Training Flow - Obsolete Now

### Prompt relevance score

#### Preprocessing

In [None]:
#prompts = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/prompts-aes-kaggle-dataset.csv')

In [None]:
#prompts.columns = ['essay_set','prompt']

In [None]:
#prompts

In [None]:
#prompt_data_test = tpd_test.merge(prompts,on='essay_set')
#prompt_data_train = tpd_train.merge(prompts,on='essay_set')

In [None]:
#prompt_data_train['combined_essay'] = prompt_data_train['prompt'] + prompt_data_train['essay']
#prompt_data_test['combined_essay'] = prompt_data_test['prompt'] + prompt_data_test['essay']

In [None]:
#prompt_data_train = prompt_data_train.drop(columns=['Unnamed: 0'])
#prompt_data_test = prompt_data_test.drop(columns=['Unnamed: 0'])

In [None]:
#prompt_data_test.head()

#### Saving training set

In [None]:
#prompt_data_train.to_csv('/content/drive/MyDrive/Colab Notebooks/AES/prompt_train_tpd.csv')
#prompt_data_test.to_csv('/content/drive/MyDrive/Colab Notebooks/AES/prompt_test_tpd.csv')

#### Load training set


In [None]:
#prompt_data_train = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/prompt_train_tpd.csv')
#prompt_data_test = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/prompt_test_tpd.csv')

In [None]:
#X_train = prompt_data_train
#y_train = prompt_data_train['normalized_score']
#X_test = prompt_data_test
#y_test = prompt_data_test['normalized_score']
# X_train.to_csv('/content/drive/MyDrive/Colab Notebooks/AES/X_train_prel.csv')

In [None]:
#tmp_aug = X_train.sample(frac=0.1)

In [None]:
#tmp_aug.head()

In [None]:
#def return_shuffle_prompt(p, prompts):
#  essay_id = prompts.index[prompts['prompt'] == p].tolist()[0] + 1
#  essay_ids = list(range(1,9))
#  essay_ids.remove(essay_id)
#  ran_id = random.choice(essay_ids)
#  return prompts.iloc[ran_id -1].prompt

In [None]:
#tmp_aug['prompt'] = tmp_aug['prompt'].apply(lambda x: return_shuffle_prompt(x,prompts))

In [None]:
#tmp_aug['combined_essay'] = tmp_aug['prompt'] + tmp_aug['essay']

In [None]:
#tmp_aug.loc[tmp_aug.essay_id==9792]

In [None]:
#X_train = X_train.append(tmp_aug)

In [None]:
#X_train.to_csv('/content/drive/MyDrive/Colab Notebooks/AES/X_train_prel_augmented_tpd.csv')
#X_test.to_csv('/content/drive/MyDrive/Colab Notebooks/AES/X_test_prel_tpd.csv')
#y_train.to_csv('/content/drive/MyDrive/Colab Notebooks/AES/y_train_prel_tpd.csv')
#y_test.to_csv('/content/drive/MyDrive/Colab Notebooks/AES/y_test_prel_tpd.csv')

#### Creating embeddings for the training set

In [None]:
#from transformers import BertModel, BertConfig, BertTokenizer
#
#max_sentences = 100
## Either load precomputed or compute the BERT embeddings & dataset for semantic scoring 
#if (use_existing_bert_prel == True):
#  print ("Experiment 1: Prompt-relevance Model: Using Saved vectors & embeddings...")
#  if (load_trained_model_prel == False):
#    lhs_train_prel = torch.load(lhs_prelevance_train_path)
#    y_train = torch.load(y_train_file_path_prel)
#    print("Prompt-relevance Model: Shape of loaded TRAIN embeddings:",lhs_train_prel.shape)
#  else:
#    print ("Since LOAD LSTM Model is TRUE, not loading the saved training BERT embedding TENSOR")
#  
#  y_test = torch.load(y_test_file_path_prel)
#  lhs_test_prel = torch.load(lhs_prelevance_test_path)
#  
#  print("Prompt-relevance Model: Shape of loaded y_train:",y_train.shape)
#  print("Prompt-relevance Model: Shape of loaded y_test:",y_test.shape)
#  print("Prompt-relevance Model: Shape of loaded TEST embeddings:",lhs_test_prel.shape)
#else:
#  print ("Experiment 1: Prompt-relevance Model: New Train & Test vector split & creating corresponding BERT embeddings...")
#  tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
#  config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
#  model = BertModel.from_pretrained('bert-base-uncased', config=config)
#
#  train_essays = X_train['combined_essay']
#  test_essays = X_test['combined_essay']
#  sentences = []
#  tokenize_sentences = []
#  train_bert_embeddings = []
#  
#  torch.save(y_train, y_train_file_path_prel)
#  torch.save(y_test, y_test_file_path_prel)
#
#  cuda = torch.device('cuda')
#
#  # Embeddings for training vectors
#  lhs_train_prel = torch.empty((len(train_essays),max_sentences,768), dtype=torch.float)
#  emb_for_padding = tokenizer.encode_plus("", add_special_tokens=True, truncation=True, padding="max_length", return_tensors="pt", max_length=10)
#  tt = torch.tensor(emb_for_padding['input_ids'])
#  lhs_for_padding = model(tt)[2][-2]
#  lhs_for_padding_np = np.array(lhs_for_padding.detach().numpy())
#  lhs_for_padding_mean = np.mean(lhs_for_padding_np,axis=1)
#  lhs_avg_for_padding = torch.tensor(lhs_for_padding_mean[0])
#
#for j,essay in enumerate(tqdm(train_essays)):
#  sentences = split_into_sentences(essay)
#  sen_length = len(sentences)
#  
#  lhs_sentence_avg = np.zeros((max_sentences,768), dtype=float)
#  
#  for i in range(min(max_sentences,len(sentences))):
#    tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
#    tt = torch.tensor(tokenize_sentence)
#    tts = tt.reshape(1,len(tt))
#    # getting the 2nd last layer
#    lhs_sentence = model(tts).hidden_states[11]
#    lhs_sentence_np = np.array(lhs_sentence.detach().numpy())
#    lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
#    lhs_sentence_avg[i] = lhs_sentence_np_mean[0]
#  
#  lhs_train_prel[j] = torch.tensor(lhs_sentence_avg)
#
#  if (sen_length < max_sentences):
#   for i in range (sen_length, max_sentences):
#     lhs_train_prel[j][i]= lhs_avg_for_padding
#  
#torch.save(lhs_train_prel, lhs_prelevance_train_path)

#### Creating embedding for the test set

In [None]:
#from transformers import BertModel, BertConfig, BertTokenizer
#tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
#config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
#model = BertModel.from_pretrained('bert-base-uncased', config=config)
#
#max_sentences = 100
#X = prompt_data
#y = prompt_data['normalized_score']
#X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
#fold_count =1
#train_essays = X_train['combined_essay']
#test_essays = X_test['combined_essay']
#sentences = []
#tokenize_sentences = []
#
#lhs_test_prel = torch.empty((len(test_essays),max_sentences,768), dtype=torch.float)
#emb_for_padding = tokenizer.encode_plus("", add_special_tokens=True, truncation=True, padding="max_length", return_tensors="pt", max_length=10)
#tt = torch.tensor(emb_for_padding['input_ids'])
#output = model(tt)
#lhs_for_padding = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
#lhs_for_padding_np = np.array(lhs_for_padding.detach().numpy())
#lhs_for_padding_mean = np.mean(lhs_for_padding_np,axis=1)
#lhs_avg_for_padding = torch.tensor(lhs_for_padding_mean[0])
#
#for j,essay in enumerate(tqdm(test_essays)):
#  sentences = split_into_sentences(essay)
#  sen_length = len(sentences)
#  
#  lhs_sentence_avg = np.zeros((max_sentences,768), dtype=float)
#  
#  for i in range(min(max_sentences,len(sentences))):
#    tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
#    tt = torch.tensor(tokenize_sentence)
#    tts = tt.reshape(1,len(tt))
#    # summing the last four hidden layers
#    output = model(tts)
#    lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
#    lhs_sentence_np = np.array(lhs_sentence.detach().numpy())
#    lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
#    lhs_sentence_avg[i] = lhs_sentence_np_mean[0]
#  
#  lhs_test_prel[j] = torch.tensor(lhs_sentence_avg)
#
#  if (sen_length < max_sentences):
#   for i in range (sen_length, max_sentences):
#     lhs_test_prel[j][i]= lhs_avg_for_padding
#  
#torch.save(lhs_test_prel, lhs_prelevance_test_path)
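
The commented-out cells above mean-pool the token states of each sentence and pad every essay to `max_sentences` rows. A minimal sketch of that shape logic, using random arrays in place of BERT outputs; the helper names `mean_pool` and `build_essay_matrix` are hypothetical, and the zero padding vector is only a stand-in for the mean-pooled embedding of the empty string that the notebook uses:

```python
import numpy as np

def mean_pool(token_states):
    # (num_tokens, hidden_dim) -> (hidden_dim,): average over the token axis
    return np.asarray(token_states, dtype=np.float32).mean(axis=0)

def build_essay_matrix(per_sentence_token_states, pad_vec, max_sentences):
    # Start with every row set to the padding vector, then overwrite the
    # first rows with the sentence averages (extra sentences are truncated).
    essay = np.tile(np.asarray(pad_vec, dtype=np.float32), (max_sentences, 1))
    for i, states in enumerate(per_sentence_token_states[:max_sentences]):
        essay[i] = mean_pool(states)
    return essay

rng = np.random.default_rng(0)
hidden_dim = 768
# Three fake "sentences" with 5, 9 and 3 tokens (stand-ins for BERT hidden states)
sentences = [rng.normal(size=(n_tokens, hidden_dim)) for n_tokens in (5, 9, 3)]
pad_vec = np.zeros(hidden_dim)  # assumption: zeros instead of the empty-string embedding
essay_matrix = build_essay_matrix(sentences, pad_vec, max_sentences=100)
print(essay_matrix.shape)
```

Stacking one such matrix per essay gives the `(N, max_sentences, 768)` tensor that the LSTM consumes.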

#### Training/loading the prompt-relevance LSTM model

In [None]:
## to load LHS for training and evaluation purposes

#lhs_train_prel = torch.load(lhs_prelevance_train_path)
#y_train = torch.load(y_train_file_path_prel)
#print("Prompt-relevance Model: Shape of loaded TRAIN embeddings:",lhs_train_prel.shape)
#y_test = torch.load(y_test_file_path_prel)
#lhs_test_prel = torch.load(lhs_prelevance_test_path)

In [None]:
#max_sentences = 100
#load_trained_model_prel = False
#if (load_trained_model_prel == True):
#    lstm_model_prel = load_model(prel_model_save_path)
#else:
#  lstm_model_prel = get_model(sen_size=max_sentences)
#  lstm_model_prel.fit(lhs_train_prel.numpy(), y_train, batch_size=128, epochs=60)
#  lstm_model_prel.save(prel_model_save_path)

#### Evaluating the prompt-relevance model

In [None]:
#y_pred = lstm_model_prel.predict(lhs_test_prel.numpy())
#tt1 = np.around(10*y_pred)
#tt2 = tt1.reshape(tt1.shape[0],)
#pred_values = tt2.astype(int)
#tt3 = np.array(10* y_test)
#gold_values = tt3.astype(int)
## evaluate the model
#result = cohen_kappa_score(gold_values,pred_values,weights='quadratic')
#print("Kappa Score: {}".format(result))
#yy_p = y_pred.reshape(y_pred.shape[0],)
#yy_t = np.array(y_test)
#MSE = np.square(np.subtract(yy_t, yy_p)).mean()
#RMSE = math.sqrt(MSE)
#print ("Prompt-relevance Model: MSE: ", MSE)
#print ("Prompt-relevance Model: RMSE: ", RMSE)

### Prompt relevance model - w/ augmented dataset

In [None]:
#from transformers import BertModel, BertConfig, BertTokenizer
#tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
#config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
#model = BertModel.from_pretrained('bert-base-uncased', config=config)
#
#max_sentences = 100
#sentences = []
#tokenize_sentences = []
#
#aug_essays = tmp_aug['combined_essay']
#lhs_aug_prel = torch.empty((len(tmp_aug),max_sentences,768), dtype=torch.float)
#emb_for_padding = tokenizer.encode_plus("", add_special_tokens=True, truncation=True, padding="max_length", return_tensors="pt", max_length=10)
#tt = torch.tensor(emb_for_padding['input_ids'])
#output = model(tt)
#lhs_for_padding = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
#lhs_for_padding_np = np.array(lhs_for_padding.detach().numpy())
#lhs_for_padding_mean = np.mean(lhs_for_padding_np,axis=1)
#lhs_avg_for_padding = torch.tensor(lhs_for_padding_mean[0])
#
#for j,essay in enumerate(tqdm(aug_essays)):
#  sentences = split_into_sentences(essay)
#  sen_length = len(sentences)
#  
#  lhs_sentence_avg = np.zeros((max_sentences,768), dtype=float)
#  
#  for i in range(min(max_sentences,len(sentences))):
#    tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
#    tt = torch.tensor(tokenize_sentence)
#    tts = tt.reshape(1,len(tt))
#    # summing the last four hidden layers
#    output = model(tts)
#    lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
#    lhs_sentence_np = np.array(lhs_sentence.detach().numpy())
#    lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
#    lhs_sentence_avg[i] = lhs_sentence_np_mean[0]
#  
#  lhs_aug_prel[j] = torch.tensor(lhs_sentence_avg)
#
#  if (sen_length < max_sentences):
#   for i in range (sen_length, max_sentences):
#     lhs_aug_prel[j][i]= lhs_avg_for_padding
#  
#torch.save(lhs_aug_prel, '/content/drive/MyDrive/Colab Notebooks/AES/prel_aug_lhs.pt')

#### Training/loading the prompt-relevance LSTM model

In [None]:
## to load LHS for training and evaluation purposes
#lhs_train_prel = torch.load(lhs_prelevance_train_path)
#lhs_aug_prel = torch.load('/content/drive/MyDrive/Colab Notebooks/AES/prel_aug_lhs.pt')
#lhs_train_prel_aug = torch.cat((lhs_train_prel,lhs_aug_prel),0)
#y_train = torch.load(y_train_file_path_prel)
#print("Prompt-relevance Model: Shape of loaded TRAIN embeddings:",lhs_train_prel_aug.shape)


In [None]:
#y_train_org = torch.load(y_train_file_path_prel)

In [None]:
#lhs_aug_prel.shape

In [None]:
#print (lhs_train_prel_aug[:5])

In [None]:
#y_test = torch.load(y_test_file_path_prel)
#lhs_test_prel = torch.load(lhs_prelevance_test_path)

In [None]:
#print (y_test[:10])

In [None]:
#y_app = 1038*[0.0]

In [None]:
#y_train = y_train.append(pd.Series(y_app),ignore_index=True)

In [None]:
#y_train.iloc[10000:10500]

In [None]:
#type(y_train)

In [None]:
#max_sentences = 100
#load_trained_model_prel = False
#if (load_trained_model_prel == True):
#    lstm_model_prel = load_model('/content/drive/MyDrive/Colab Notebooks/AES/prel_model_with_adverserial_examples.pkl')
#else:
#  lstm_model_prel = get_model(sen_size=max_sentences)
#  lstm_model_prel.fit(lhs_train_prel_aug.numpy(), y_train, batch_size=128, epochs=60)
#  #lstm_model_prel.fit(lhs_train_prel.numpy(), y_train_org, batch_size=128, epochs=60)
#  lstm_model_prel.save('/content/drive/MyDrive/Colab Notebooks/AES/prel_model_with_adverserial_examples.pkl')

#### Evaluating the prompt-relevance model

In [None]:
#y_pred = lstm_model_prel.predict(lhs_test_prel.numpy())
#tt1 = np.around(10*y_pred)
#tt2 = tt1.reshape(tt1.shape[0],)
#pred_values = tt2.astype(int)
#print ("pred values", pred_values)
#tt3 = np.array(10* y_test)
#gold_values = tt3.astype(int)
#print ("gold values", gold_values)
## evaluate the model
#result = cohen_kappa_score(gold_values,pred_values,weights='quadratic')
#print("Kappa Score: {}".format(result))
#yy_p = y_pred.reshape(y_pred.shape[0],)
#yy_t = np.array(y_test)
#MSE = np.square(np.subtract(yy_t, yy_p)).mean()
#RMSE = math.sqrt(MSE)
#print ("Prompt-relevance Model: MSE: ", MSE)
#print ("Prompt-relevance Model: RMSE: ", RMSE)

## Clean Spelling Errors (CSE): Semantic Score


### Cleaning TPD dataset

In [None]:
# Map escaped UTF-8 byte sequences (and the stray \x92 apostrophe) to ASCII equivalents
UNICODE_TO_ASCII = {
    '\\xe2\\x80\\x99': "'",
    '\\xc3\\xa9': 'e',
    '\\xe2\\x80\\x90': '-',
    '\\xe2\\x80\\x91': '-',
    '\\xe2\\x80\\x92': '-',
    '\\xe2\\x80\\x93': '-',
    '\\xe2\\x80\\x94': '-',
    '\\xe2\\x80\\x98': "'",
    '\x92': "'",
    '\\xe2\\x80\\x9b': "'",
    '\\xe2\\x80\\x9c': '"',
    '\\xe2\\x80\\x9d': '"',
    '\\xe2\\x80\\x9e': '"',
    '\\xe2\\x80\\x9f': '"',
    '\\xe2\\x80\\xa6': '...',
    '\\xe2\\x80\\xb2': "'",
    '\\xe2\\x80\\xb3': "'",
    '\\xe2\\x80\\xb4': "'",
    '\\xe2\\x80\\xb5': "'",
    '\\xe2\\x80\\xb6': "'",
    '\\xe2\\x80\\xb7': "'",
    '\\xe2\\x81\\xba': "+",
    '\\xe2\\x81\\xbb': "-",
    '\\xe2\\x81\\xbc': "=",
    '\\xe2\\x81\\xbd': "(",
    '\\xe2\\x81\\xbe': ")",
}

def unicodetoascii(text):
    # Apply the replacements in the order listed above
    for src, dst in UNICODE_TO_ASCII.items():
        text = text.replace(src, dst)
    return text

In [None]:
import language_tool_python
tool = language_tool_python.LanguageTool('en-US')
def fix_spellings(text):
    text = unicodetoascii(text)
    matches = tool.check(text)
    # (offset, length, first suggestion) for every misspelling that has a suggestion
    fixes = [(x.offset, x.errorLength, x.replacements[0])
             for x in matches if (x.ruleIssueType=='misspelling' and len(x.replacements)>0)]
    # Apply right to left so earlier offsets stay valid; slicing by offset avoids
    # str.replace, which would also rewrite identical substrings inside other words
    for offset, length, replacement in sorted(fixes, key=lambda f: f[0], reverse=True):
        text = text[:offset] + replacement + text[offset+length:]
    return text
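
Replacing by offset rather than by string value matters when the misspelled token also occurs inside another word. The sketch below shows the difference without calling LanguageTool; `apply_replacements` and the hand-written `(offset, length, replacement)` tuples are hypothetical stand-ins for what `tool.check` reports, and the fixes are assumed not to overlap:

```python
def apply_replacements(text, fixes):
    """Apply (offset, length, replacement) fixes given against the original text."""
    # Work from the last offset to the first so earlier offsets are not shifted
    for offset, length, replacement in sorted(fixes, key=lambda f: f[0], reverse=True):
        text = text[:offset] + replacement + text[offset + length:]
    return text

sample = "teh capital is tehran"
by_offset = apply_replacements(sample, [(0, 3, "the")])
by_value = sample.replace("teh", "the")  # also rewrites the start of "tehran"

print(by_offset)
print(by_value)

# Two fixes of different lengths in one pass
essay = "in my opion the athor was right"
fixes = [(essay.index("opion"), len("opion"), "opinion"),
         (essay.index("athor"), len("athor"), "author")]
corrected = apply_replacements(essay, fixes)
print(corrected)
```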

In [None]:
fix_spellings(tpd_train['essay'][3])

'I think in my option is that the author was very comfortable with his words and his way of being human. His parents was originally from Cuba, was in to there culture nice to other in there surrounding. For an example say It was in this simple house that my parents welcomed other refugees to celebrate their arrival to the country and where I celebrated His first birthday.'

In [None]:
# Assign the whole column at once: chained indexing (df.iloc[i]['essay'] = ...) writes to a temporary copy
tpd_train['essay'] = [fix_spellings(essay) for essay in tqdm(tpd_train['essay'])]
tpd_train.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/tpd_train_w_fixed_spellings.csv")

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  
100%|██████████| 6488/6488 [27:24<00:00,  3.94it/s]


In [None]:
# Assign the whole column at once: chained indexing (df.iloc[i]['essay'] = ...) writes to a temporary copy
tpd_test['essay'] = [fix_spellings(essay) for essay in tqdm(tpd_test['essay'])]
tpd_test.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/tpd_test_w_fixed_spellings.csv")

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  
100%|██████████| 2596/2596 [16:56<00:00,  2.56it/s]


In [None]:
tpd_train = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/tpd_train_w_fixed_spellings.csv')

In [None]:
tpd_test = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/tpd_test_w_fixed_spellings.csv')

In [None]:
tpd_train.head()

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,essay_id,essay_set,essay,domain1_score,normalized_score
0,0,6351,9908,4,The author concludes the story w/this paragrap...,1,0.333333
1,1,6315,9872,4,I believe that the author concludes the story ...,2,0.666667
2,2,304,305,1,"Computers, a very much talked about subject. D...",10,0.8
3,3,8023,12771,5,I think in my opion is that the author was ver...,1,0.25
4,4,4442,6839,3,The setting that affect the cyclist is the con...,1,0.333333


### Get Embeddings: CSE Semantic Score

In [None]:
prepare_embeddings_updated(tpd_train, model_type='semantic', train_or_test='train', 
                           load_from_file=False,file_path='/content/drive/MyDrive/Colab Notebooks/AES/clean_spelling_errors')

In [None]:
prepare_embeddings_updated(tpd_test, model_type='semantic', train_or_test='test', 
                           load_from_file=False,file_path='/content/drive/MyDrive/Colab Notebooks/AES/clean_spelling_errors_test')

### Training LSTM model for Semantic Score

In [None]:
#load embeddings
lhs_train = torch.load('/content/drive/MyDrive/Colab Notebooks/AES/clean_spelling_errors/lhs_train.pt')
y_train = torch.load("/content/drive/MyDrive/Colab Notebooks/AES/clean_spelling_errors/y_train.pt")

In [None]:
#load test embeddings
lhs_test = torch.load("/content/drive/MyDrive/Colab Notebooks/AES/clean_spelling_errors_test/lhs_test.pt")
y_test = torch.load("/content/drive/MyDrive/Colab Notebooks/AES/clean_spelling_errors_test/y_test.pt")

In [None]:
clean_semantic_model_save_path = "/content/drive/MyDrive/Colab Notebooks/AES/clean_semantic.pt"

In [None]:
load_trained_model_clean_semantic = False
if (load_trained_model_clean_semantic == True):
    lstm_model_clean_semantic = load_model(clean_semantic_model_save_path)
else:
  if (embedding == 'full_emb'):
    lstm_model_clean_semantic = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_words_for_full_emb_sem, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  else:
    lstm_model_clean_semantic = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_sentences, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  lstm_model_clean_semantic.fit(lhs_train.numpy(), y_train, batch_size=64, epochs=100)
  lstm_model_clean_semantic.save(clean_semantic_model_save_path)



Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_1 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 128, 1028)         7389264   
                                                                 
 lstm_1 (LSTM)               (None, 512)               3155968   
                                                                 
 dropout (Dropout)           (None, 512)               0         
                                                                 
 dense (Dense)               (None, 1)                 513       
                                                                 
Total params: 10,545,745
Trainable params: 10,545,745
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100

Assets written to: /content/drive/MyDrive/Colab Notebooks/AES/clean_semantic.pt/assets
<keras.layers.recurrent.LSTMCell object at 0x7faab00f7c10> has the same name 'LSTMCell' as a built-in Keras object. Consider renaming <class 'keras.layers.recurrent.LSTMCell'> to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
<keras.layers.recurrent.LSTMCell object at 0x7faa87e9b910> has the same name 'LSTMCell' as a built-in Keras object. Consider renaming <class 'keras.layers.recurrent.LSTMCell'> to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.


### Evaluating the semantic model

In [None]:
evaluate_model (lstm_model_clean_semantic, lhs_test, y_test)

Kappa Score: 0.7640924160128812
MSE:  0.022570307359501808
RMSE:  0.15023417507179188
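
`evaluate_model` is defined elsewhere in the notebook. The sketch below is an assumption about what it reports, modelled on the commented-out evaluation cells earlier in this section: normalized scores are scaled by 10 and converted to integers (gold truncated, predictions rounded) before the quadratic weighted kappa, while MSE and RMSE use the raw values. The names `quadratic_weighted_kappa` and `evaluate_scores` are hypothetical; the kappa is written out with NumPy and is the same statistic that `cohen_kappa_score(..., weights='quadratic')` computes.

```python
import math
import numpy as np

def quadratic_weighted_kappa(gold, pred):
    gold = np.asarray(gold, dtype=int)
    pred = np.asarray(pred, dtype=int)
    lo = min(gold.min(), pred.min())
    k = max(gold.max(), pred.max()) - lo + 1
    # Confusion matrix of gold (rows) vs predicted (columns) labels
    observed = np.zeros((k, k))
    for g, p in zip(gold - lo, pred - lo):
        observed[g, p] += 1
    # Expected counts if gold and predicted labels were independent
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / len(gold)
    idx = np.arange(k)
    weights = (idx[:, None] - idx[None, :]) ** 2  # quadratic disagreement penalty
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

def evaluate_scores(y_true, y_pred, scale=10):
    y_true = np.asarray(y_true, dtype=float).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(-1)
    gold = (scale * y_true).astype(int)            # truncates, as in the commented-out cell
    pred = np.around(scale * y_pred).astype(int)   # rounds, as in the commented-out cell
    mse = float(np.mean((y_true - y_pred) ** 2))
    return quadratic_weighted_kappa(gold, pred), mse, math.sqrt(mse)

# Toy data: perfect agreement, then fully reversed predictions
kappa_same, mse_same, rmse_same = evaluate_scores([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
kappa_rev, mse_rev, rmse_rev = evaluate_scores([0.0, 0.5, 1.0], [1.0, 0.5, 0.0])
print(kappa_same, mse_same, rmse_same)
print(kappa_rev, mse_rev, rmse_rev)
```

The kappa is undefined when all gold and predicted labels fall in a single class, because the expected disagreement in the denominator is then zero.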


## Case 1

**Semantic Score:** LSTM

**Coherence Score:** LSTM + NSP Goldens

**Prompt-relevance Score:** LSTM + Cosine Similarity

### Semantic Model

#### Load Semantic Model

In [None]:
load_trained_model_sem = True
if (load_trained_model_sem == True):
    lstm_model_sem = load_model(sem_model_save_path)
else:
  if (embedding == 'full_emb'):
    lstm_model_sem = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_words_for_full_emb_sem, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  else:
    lstm_model_sem = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_sentences, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  lstm_model_sem.fit(lhs_train.numpy(), y_train, batch_size=64, epochs=100)
  lstm_model_sem.save(sem_model_save_path)

#### Evaluating the semantic model

In [None]:
evaluate_model (lstm_model_sem, lhs_test, y_test)

Kappa Score: 0.746204577711272
MSE:  0.02282681984321185
RMSE:  0.15108547197931324


### Coherence Model

#### Creating augmented dataset

In [None]:
print ("Original Train Data Shape", tpd_train.shape)
print ("Original Test Data Shape", tpd_test.shape)

# Perturb a 33% sample with coherence_augment and label those essays 0
samp_tpd_train = tpd_train.sample(frac=0.33)
samp_tpd_train['essay'] = samp_tpd_train['essay'].apply(coherence_augment)
samp_tpd_train['normalized_score'] = 0
# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
aug_data_train = pd.concat([tpd_train, samp_tpd_train], ignore_index=True)
aug_data_train = aug_data_train.drop(columns=['Unnamed: 0','domain1_score'])
print ("Augmented Train Data Shape", aug_data_train.shape)

# Same augmentation for the test split
samp_tpd_test = tpd_test.sample(frac=0.33)
samp_tpd_test['essay'] = samp_tpd_test['essay'].apply(coherence_augment)
samp_tpd_test['normalized_score'] = 0
aug_data_test = pd.concat([tpd_test, samp_tpd_test], ignore_index=True)
aug_data_test = aug_data_test.drop(columns=['Unnamed: 0','domain1_score'])
print ("Augmented Test Data Shape", aug_data_test.shape)

Original Train Data Shape (6488, 6)
Original Test Data Shape (2596, 6)
Augmented Train Data Shape (8629, 4)
Augmented Test Data Shape (3453, 4)
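
The augmented row counts follow directly from `frac=0.33`: pandas samples `round(frac * n_rows)` rows, and those rows are appended to the originals. A small check of that arithmetic (the helper name `augmented_rows` is hypothetical):

```python
def augmented_rows(n_rows, frac):
    # rows after appending a sample of round(frac * n_rows) rows to the original frame
    return n_rows + round(frac * n_rows)

train_rows = augmented_rows(6488, 0.33)
test_rows = augmented_rows(2596, 0.33)
print(train_rows, test_rows)
```

Both values match the shapes printed above (8629 and 3453), so roughly one essay in four in each augmented split carries the score 0.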


In [None]:
aug_data_test

Unnamed: 0,essay_id,essay_set,essay,normalized_score
0,9064,4,The reason why at the end of the story she end...,0.333333
1,8884,4,They probably ended it like that to build susp...,0.333333
2,6929,3,The setting in the essay Rough Road ahead; Do...,0.666667
3,15816,6,"Based on the excerpt, The Mooring Mast, the ob...",0.750000
4,368,1,"Dear , Computers have helped us in many ways. ...",0.700000
...,...,...,...,...
3448,6239,3,Kurmaskie experienced a harsh scenery with hig...,0.000000
3449,12001,5,There is lots of emotion because Narciso talks...,0.000000
3450,15864,6,"Also, ""Most dirigibles from out side of the Un...",0.000000
3451,21390,8,That's is how much i was laughing. I was laugh...,0.000000


#### Creating aug TPD dataset with NSP Goldens

In [None]:
from transformers import BertTokenizer, BertForNextSentencePrediction
import torch
from torch.nn import functional as F
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForNextSentencePrediction.from_pretrained('bert-base-uncased')

Downloading:   0%|          | 0.00/226k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/28.0 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/455k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/570 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/420M [00:00<?, ?B/s]

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForNextSentencePrediction: ['cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertForNextSentencePrediction from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForNextSentencePrediction from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [None]:
def nsp_average(essay):
  score = 0
  avg_score = 0
  sentences = split_into_sentences(essay)
  if len(sentences) != 0:
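    for i in range(len(sentences)-1):
        encoding = tokenizer.encode_plus(sentences[i], sentences[i+1], return_tensors='pt')
        # inference only, so skip gradient tracking
        with torch.no_grad():
            outputs = model(**encoding).logits
        softmax = F.softmax(outputs, dim = 1)
        # class 0 = "sentence B follows sentence A"; np.float was removed in NumPy 1.24
        score = score + float(softmax[0][0])
    # note: divides by the sentence count, not by the number of pairs (len(sentences)-1)
    avg_score = score / len(sentences)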
  # print("Total score: ", score, "Avg. score: ", avg_score)
  # print ("Sentences:\n", sentences)
  return avg_score
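
`nsp_average` sums one score per adjacent sentence pair but divides by the number of sentences. The sketch below isolates that averaging step with a stub scorer in place of BERT; `average_pair_score`, `always_coherent`, and the `divide_by_pairs` flag are hypothetical names introduced only for illustration:

```python
def average_pair_score(sentences, pair_score, divide_by_pairs=False):
    # Fewer than two sentences means no adjacent pair to score
    if len(sentences) < 2:
        return 0.0
    total = sum(pair_score(a, b) for a, b in zip(sentences, sentences[1:]))
    # The notebook divides by the sentence count; the alternative divides by the pair count
    denom = len(sentences) - 1 if divide_by_pairs else len(sentences)
    return total / denom

def always_coherent(a, b):
    # Stub scorer: pretends every pair gets an NSP probability of 1.0
    return 1.0

essay = ["First.", "Second.", "Third.", "Fourth."]
notebook_avg = average_pair_score(essay, always_coherent)
per_pair_avg = average_pair_score(essay, always_coherent, divide_by_pairs=True)
print(notebook_avg, per_pair_avg)
```

With the notebook's denominator, an essay of n sentences can score at most (n-1)/n, so short essays are capped lower than long ones. Whether that is intended is not stated here; switching denominators would change the saved `nsp_golden` targets.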

In [None]:
#get nsp 
aug_data_train['nsp_golden'] = 0
for i in tqdm(range(len(aug_data_train))):
    aug_data_train.loc[i,'nsp_golden'] = nsp_average(aug_data_train['essay'][i])
aug_data_train.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/aug_data_train_coh_with_nsp.csv")

aug_data_test['nsp_golden'] = 0
for i in tqdm(range(len(aug_data_test))):
    aug_data_test.loc[i,'nsp_golden'] = nsp_average(aug_data_test['essay'][i])
aug_data_test.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/aug_data_test_coh_with_nsp.csv")

100%|██████████| 8629/8629 [2:30:37<00:00,  1.05s/it]
100%|██████████| 3453/3453 [58:11<00:00,  1.01s/it]


In [None]:
aug_data_train = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/aug_data_train_coh_with_nsp.csv')
aug_data_test = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/aug_data_test_coh_with_nsp.csv')

#### Creating embeddings for training set

In [None]:
lhs_train_coh, y_train_coh = prepare_embeddings_updated (aug_data_train, model_type='coherence', train_or_test='train', load_from_file=True, embedding_type='sen_avg', max_words=max_words_for_full_emb, file_path='/content/drive/MyDrive/Colab Notebooks/AES/aug_data_train_coh_with_nsp')

Preparing Embeddings...
Model Type:  coherence
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/aug_data_train_coh_with_nsp
Dataframe provided, Size:  (8629, 6)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/aug_data_train_coh_with_nsp/lhs_coherence_train.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/aug_data_train_coh_with_nsp/y_train_coh.pt
Loaded, Size of LHS embeddings:  torch.Size([8629, 128, 768])
Loaded, Size of y Gold:  (8629,)
Returning lhs: Shape:  torch.Size([8629, 128, 768])
Returning y_gold: Shape:  (8629,)


#### Creating embedding for test set

In [None]:
lhs_test_coh, y_test_coh = prepare_embeddings_updated (aug_data_test, model_type='coherence', train_or_test='test', load_from_file=True, embedding_type='sen_avg', max_words=max_words_for_full_emb, file_path='/content/drive/MyDrive/Colab Notebooks/AES/aug_data_test_coh_with_nsp')

Preparing Embeddings...
Model Type:  coherence
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/aug_data_test_coh_with_nsp
Dataframe provided, Size:  (3453, 6)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/aug_data_test_coh_with_nsp/lhs_coherence_test.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/aug_data_test_coh_with_nsp/y_test_coh.pt
Loaded, Size of LHS embeddings:  torch.Size([3453, 128, 768])
Loaded, Size of y Gold:  (3453,)
Returning lhs: Shape:  torch.Size([3453, 128, 768])
Returning y_gold: Shape:  (3453,)


#### Load LSTM Model for Coherence

In [None]:
coh_nsp_model_save_path = "/content/drive/MyDrive/Colab Notebooks/AES/coherence_model_with_nsp_goldens.pt"

In [None]:
load_trained_model_coh_nsp = True
if (load_trained_model_coh_nsp == True):
    lstm_model_coh_nsp = load_model(coh_nsp_model_save_path)
else:
  if (embedding == 'full_emb'):
    lstm_model_coh_nsp = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_words_for_full_emb_sem, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  else:
    lstm_model_coh_nsp = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_sentences, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  lstm_model_coh_nsp.fit(lhs_train_coh.numpy(), aug_data_train.nsp_golden, batch_size=64, epochs=100)
  lstm_model_coh_nsp.save(coh_nsp_model_save_path)



Layer lstm_6 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_7 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


#### Evaluating the coherence model

In [None]:
evaluate_model (lstm_model_coh_nsp, lhs_test=lhs_test_coh, y_test=aug_data_test.nsp_golden)

Kappa Score: 0.738342082081648
MSE:  0.010897710812705432
RMSE:  0.1043921012946163


### Prompt Relevance LSTM Model w/ Cosine Sim Goldens

#### Loading cosine sim dataset

In [None]:
cos_sim_prel_data = torch.load('/content/drive/MyDrive/Colab Notebooks/AES/prompt_data_with_cosine_sim.df')
print("cos_sim_prel_data shape: ", cos_sim_prel_data.shape)
cos_sim_prel_data.head()

cos_sim_prel_data shape:  (4542, 8)


Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt,combined_essay,cosine_sim
0,15177,6,The builders of the empire state building atte...,1,0.25,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.947607
1,14855,6,The builders of the many obstacles when attemp...,4,1.0,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.937073
2,16587,6,The ability to dock dirigibles atop the Empire...,4,1.0,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.89864
3,16368,6,They faced many problems when trying to dock t...,2,0.5,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.86202
4,15281,6,While attempting to allow dirigibles to dock a...,3,0.75,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.930908
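
How the `cosine_sim` column was produced is not shown in this section. As an illustration only, assuming each prompt and each essay is reduced to a single vector (for example by averaging its sentence embeddings), the similarity could be computed like this; the function name and the toy vectors are hypothetical:

```python
import numpy as np

def cosine_similarity(a, b):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Define the similarity with a zero vector as 0 to avoid dividing by zero
    return float(a @ b / denom) if denom else 0.0

prompt_vec = np.array([1.0, 2.0, 0.0])   # toy stand-in for a pooled prompt embedding
essay_vec = np.array([2.0, 4.0, 0.0])    # toy stand-in for a pooled essay embedding
print(cosine_similarity(prompt_vec, essay_vec))
```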


In [None]:
prompt_data_train = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/prompt_data_train.csv")
prompt_data_test = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/prompt_data_test.csv")

In [None]:
lhs_prompts = torch.load( '/content/drive/MyDrive/Colab Notebooks/AES/prompts_lhs.pt')

In [None]:
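def return_shuffle_prompt(p, prompts):
    # essay_set of the prompt currently attached to the essay
    essay_id = int(prompts.essay_set[prompts['prompt'] == p].iloc[0])
    essay_ids = list(range(1,9))
    essay_ids.remove(essay_id)
    # pick one of the other seven prompts at random
    ran_id = random.choice(essay_ids)
    # assumes prompts has the default 0..7 index, ordered by essay_set
    return prompts.prompt[ran_id-1]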
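
The same idea in plain Python, to make the contract explicit: the returned prompt never belongs to the essay's own set. This is a sketch with a hypothetical dictionary `prompts_by_set` of placeholder prompt texts rather than the notebook's `prompts` DataFrame:

```python
import random

def pick_other_prompt(essay_set, prompts_by_set, rng=random):
    # Candidate sets are all prompt sets except the essay's own
    candidates = [s for s in prompts_by_set if s != essay_set]
    return prompts_by_set[rng.choice(candidates)]

# Placeholder prompt texts for the eight essay sets
prompts_by_set = {s: "prompt text for set {}".format(s) for s in range(1, 9)}

rng = random.Random(42)
mismatched = [pick_other_prompt(4, prompts_by_set, rng) for _ in range(200)]
print(len(set(mismatched)))
```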

In [None]:
# Attach a mismatched prompt to a 33% sample of the training essays
samp_prompt_train = prompt_data_train.sample(frac=0.33)
samp_prompt_train['prompt'] = samp_prompt_train['prompt'].apply(lambda x: str(return_shuffle_prompt(x,prompts)))
samp_prompt_train['combined_essay'] = samp_prompt_train['prompt'] + samp_prompt_train['essay']
# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
aug_data_train_prel = pd.concat([prompt_data_train, samp_prompt_train], ignore_index=True)

In [None]:
aug_data_train_prel

Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt,combined_essay
0,9908,4,The author concludes the story w/this paragrap...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
1,9872,4,I believe that the author concludes the story ...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
2,9441,4,The author of the Winter Hibiscus concludes th...,3,1.000000,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
3,9110,4,"From the story, Winter Hibiscus, by Minfong ...",2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
4,10540,4,I believe that the author chose to conclude th...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
...,...,...,...,...,...,...,...
8624,7221,3,The setting sffects the cyclist by the need an...,1,0.333333,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a..."
8625,15703,6,While attempting to allow dirigibles to dock o...,4,1.000000,The author is a second-generation Cuban migran...,The author is a second-generation Cuban migran...
8626,20933,8,Laughter is such a marvelous element to have i...,32,0.533333,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a..."
8627,9071,4,The author chooses to end this story like this...,2,0.666667,The author is a second-generation Cuban migran...,The author is a second-generation Cuban migran...


In [None]:
# Re-pair a random 33% of the test essays with a mismatched prompt.
samp_prompt_test = prompt_data_test.sample(frac=0.33)
samp_prompt_test['prompt'] = samp_prompt_test['prompt'].apply(lambda x: str(return_shuffle_prompt(x,prompts)))
samp_prompt_test['combined_essay'] = samp_prompt_test['prompt'] + samp_prompt_test['essay']
# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent.
aug_data_test_prel = pd.concat([prompt_data_test, samp_prompt_test]).reset_index(drop=True)

In [None]:
aug_data_test_prel

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,essay_id,essay_set,essay,domain1_score,normalized_score,prompt,combined_essay,cosine_sim
0,0,5510,9064,4,The reason why at the end of the story she end...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,0.864728
1,1,5330,8884,4,They probably ended it like that to build susp...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,0.831380
2,2,6230,9787,4,The author coNcludes the story with this parag...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,0.903829
3,3,6272,9829,4,The author concludes the story Winter Hibiscu...,3,1.000000,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,0.945830
4,4,6164,9721,4,He concludes this story like that so you dont ...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,0.894628
...,...,...,...,...,...,...,...,...,...,...
3448,3448,699,702,1,My opinion about the effects of computers is t...,6,0.400000,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,0.890502
3449,3449,6570,10127,4,The author of Winter Hibiscus included the l...,3,1.000000,"More and more people use computers, but not ev...","More and more people use computers, but not ev...",0.925911
3450,3450,5207,7607,3,There are a lot of things that effect the cycl...,2,0.666667,"More and more people use computers, but not ev...","More and more people use computers, but not ev...",0.851332
3451,3451,11559,18807,7,A time when I was patient was when I was in gr...,7,0.233333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,0.773099


In [None]:
aug_data_test_prel.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/aug_data_test_prel.csv")
aug_data_train_prel.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/aug_data_train_prel.csv")

NameError: ignored

In [None]:
aug_data_train_prel=pd.read_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/aug_data_train_prel_with.csv")
aug_data_test_prel=pd.read_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/aug_data_test_prel.csv")

In [None]:
print("train shape: ", aug_data_train_prel.shape)
print("test shape: ", aug_data_test_prel.shape)

train shape:  (8629, 8)
test shape:  (3453, 9)


In [None]:
# Float placeholder column; the loops below fill in the essay/prompt cosine similarity.
aug_data_test_prel['cosine_sim'] = 0.0
aug_data_train_prel['cosine_sim'] = 0.0

In [None]:
from transformers import BertModel, BertConfig, BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
model = BertModel.from_pretrained('bert-base-uncased', config=config)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [None]:
cos = torch.nn.CosineSimilarity(dim=1, eps=1e-08)
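
For reference, the quantity computed by `cos` on two vectors can be sketched in plain Python. This is a simplified illustration rather than PyTorch's implementation: it clamps the product of the norms by `eps`, and the function name `cosine_similarity` is chosen here only for the example.

```python
import math

def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between u and v; eps guards against zero-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)

print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # parallel vectors
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # orthogonal vectors
```

With `dim=1`, the notebook's `cos` applies this along the 768-wide feature axis of `(1, 768)` tensors, so each essay/prompt pair yields a single score.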

In [None]:
# Column position of 'cosine_sim', used for direct (non-chained) assignment below.
col_idx = aug_data_train_prel.columns.get_loc('cosine_sim')

for j in tqdm(range(len(aug_data_train_prel))):
  essay = aug_data_train_prel.essay.iloc[j]
  sentences = split_into_sentences(essay)

  # Collects one mean-pooled 768-d vector per sentence.
  lhs_avg_sen = np.empty((0,768), dtype=float)

  for i in range(min(max_sentences,len(sentences))):
    tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
    tts = torch.tensor(tokenize_sentence).reshape(1,-1)
    # Feature extraction only, so no gradients are needed.
    with torch.no_grad():
      output = model(tts)
    # Sum of the last four hidden layers (matches hState 'last4sum').
    lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
    # Average over the token axis -> shape (1, 768).
    lhs_sentence_np_mean = np.mean(lhs_sentence.detach().numpy(),axis=1)
    lhs_avg_sen = np.append(lhs_avg_sen,lhs_sentence_np_mean, axis=0)

  # Average the sentence vectors -> one essay vector of shape (1, 768).
  lhs_sentence_avg = np.mean(lhs_avg_sen, axis=0, keepdims=True)
  lhs_essay = torch.tensor(lhs_sentence_avg)
#   print("Cosine sim: ", cos(lhs_prompts[aug_data_train_prel.essay_set.iloc[j]-1] , lhs_essay ))
  # Assign through .iloc on the frame itself to avoid chained assignment.
  aug_data_train_prel.iloc[j, col_idx] = float(cos(lhs_prompts[aug_data_train_prel.essay_set.iloc[j]-1] , lhs_essay))

torch.save(aug_data_train_prel, '/content/drive/MyDrive/Colab Notebooks/AES/prel_data/aug_data_train_prel_with_cosine_sim.df')

100%|██████████| 8629/8629 [2:05:38<00:00,  1.14it/s]
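
The loop above pools in two stages: token vectors are averaged into one vector per sentence, and the sentence vectors are then averaged into one vector per essay. A toy sketch with 2-dimensional vectors (instead of 768) makes the arithmetic visible; `mean_rows` and `essay_vector` are illustrative names, not functions from the notebook.

```python
def mean_rows(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def essay_vector(sentences):
    """sentences: list of sentences, each given as a list of token vectors."""
    sentence_vectors = [mean_rows(tokens) for tokens in sentences]
    return mean_rows(sentence_vectors)

toy_essay = [
    [[1.0, 0.0], [3.0, 2.0]],              # sentence 1, two tokens   -> mean [2.0, 1.0]
    [[0.0, 4.0], [0.0, 2.0], [0.0, 0.0]],  # sentence 2, three tokens -> mean [0.0, 2.0]
]
vec = essay_vector(toy_essay)
print(vec)  # [1.0, 1.5]
```

One consequence of this design is that every sentence carries equal weight in the essay vector, regardless of how many tokens it contains.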


In [None]:
# Column position of 'cosine_sim', used for direct (non-chained) assignment below.
col_idx = aug_data_test_prel.columns.get_loc('cosine_sim')

for j in tqdm(range(len(aug_data_test_prel))):
  essay = aug_data_test_prel.essay.iloc[j]
  sentences = split_into_sentences(essay)

  # Collects one mean-pooled 768-d vector per sentence.
  lhs_avg_sen = np.empty((0,768), dtype=float)

  for i in range(min(max_sentences,len(sentences))):
    tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
    tts = torch.tensor(tokenize_sentence).reshape(1,-1)
    # Feature extraction only, so no gradients are needed.
    with torch.no_grad():
      output = model(tts)
    # Sum of the last four hidden layers (matches hState 'last4sum').
    lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
    # Average over the token axis -> shape (1, 768).
    lhs_sentence_np_mean = np.mean(lhs_sentence.detach().numpy(),axis=1)
    lhs_avg_sen = np.append(lhs_avg_sen,lhs_sentence_np_mean, axis=0)

  # Average the sentence vectors -> one essay vector of shape (1, 768).
  lhs_sentence_avg = np.mean(lhs_avg_sen, axis=0, keepdims=True)
  lhs_essay = torch.tensor(lhs_sentence_avg)
#   print("Cosine sim: ", cos(lhs_prompts[aug_data_test_prel.essay_set.iloc[j]-1] , lhs_essay ))
  # Assign through .iloc on the frame itself to avoid chained assignment.
  aug_data_test_prel.iloc[j, col_idx] = float(cos(lhs_prompts[aug_data_test_prel.essay_set.iloc[j]-1] , lhs_essay))

torch.save(aug_data_test_prel, '/content/drive/MyDrive/Colab Notebooks/AES/prel_data/aug_data_test_prel_with_cosine_sim.df')

100%|██████████| 3453/3453 [49:49<00:00,  1.15it/s]


#### Load or Create BERT Embeddings for Training Data for Prompt Relevance Model

In [None]:
temp_df_train = pd.DataFrame()
tpd_train.shape

(6488, 6)

In [None]:
aug_data_train_prel = torch.load("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/aug_data_train_prel_with_cosine_sim.df")
aug_data_test_prel = torch.load("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/aug_data_test_prel_with_cosine_sim.df")

In [None]:
load_bert_prel=True
lhs_train_prel, y_train_prel = prepare_embeddings_updated (aug_data_train_prel, model_type='p_rel', train_or_test='train', load_from_file=load_bert_prel, embedding_type=embedding, max_words=max_words_for_full_emb, file_path='/content/drive/MyDrive/Colab Notebooks/AES/prel_data', gold_field='cosine_sim')

Preparing Embeddings...
Model Type:  p_rel
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/prel_data
Dataframe provided, Size:  (8629, 9)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/prel_data/lhs_prel_train.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/prel_data/y_train_prel.pt
Loaded, Size of LHS embeddings:  torch.Size([8629, 128, 768])
Loaded, Size of y Gold:  (8629,)
Returning lhs: Shape:  torch.Size([8629, 128, 768])
Returning y_gold: Shape:  (8629,)


In [None]:
lhs_test_prel, y_test_prel = prepare_embeddings_updated (aug_data_test_prel, model_type='p_rel', train_or_test='test', load_from_file=load_bert_prel, embedding_type=embedding, max_words=max_words_for_full_emb, file_path='/content/drive/MyDrive/Colab Notebooks/AES/prel_data', gold_field='cosine_sim')

Preparing Embeddings...
Model Type:  p_rel
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/prel_data
Dataframe provided, Size:  (3453, 10)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/prel_data/lhs_prel_test.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/prel_data/y_test_prel.pt
Loaded, Size of LHS embeddings:  torch.Size([3453, 128, 768])
Loaded, Size of y Gold:  (3453,)
Returning lhs: Shape:  torch.Size([3453, 128, 768])
Returning y_gold: Shape:  (3453,)


In [None]:
y_train_prel.head()

0    0.899656
1    0.922672
2    0.938123
3    0.926424
4    0.930165
Name: cosine_sim, dtype: float64

#### Build LSTM Model for Prompt Relevance

In [None]:
prel_model_save_path = "/content/drive/MyDrive/Colab Notebooks/AES/prel_data/"

In [None]:
load_trained_model_prel = False
if (load_trained_model_prel == True):
    lstm_model_prel = load_model(prel_model_save_path)
else:
  if (embedding == 'full_emb'):
    lstm_model_prel = get_model(Hidden_dim1=1540, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_words_for_full_emb, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  else:
    lstm_model_prel = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_sentences, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  lstm_model_prel.fit(lhs_train_prel.numpy(), y_train_prel, batch_size=96, epochs=100)
  lstm_model_prel.save(prel_model_save_path)



Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_1 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 128, 1028)         7389264   
                                                                 
 lstm_1 (LSTM)               (None, 512)               3155968   
                                                                 
 dropout (Dropout)           (None, 512)               0         
                                                                 
 dense (Dense)               (None, 1)                 513       
                                                                 
Total params: 10,545,745
Trainable params: 10,545,745
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epo
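
The parameter counts in the model summary above can be reproduced by hand. An LSTM layer has four gates, each with input weights, recurrent weights and a bias, so a layer with `units` cells on `input_dim` features has `4 * (units * (input_dim + units) + units)` parameters. The sketch below checks this against the summary; the helper names are illustrative.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # Weight matrix plus bias.
    return input_dim * units + units

first = lstm_params(768, 1028)    # lstm:   768-d BERT features -> 1028 units
second = lstm_params(1028, 512)   # lstm_1: 1028 -> 512 units
head = dense_params(512, 1)       # dense:  512 -> 1 output
total = first + second + head
print(first, second, head, total)  # 7389264 3155968 513 10545745
```

Dropout layers add no parameters, which is why the total equals the sum of the three layers.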

## Experiment 1 (b): Sentence-wise embedding using sum of last 4 hidden states

### Semantic Model

#### Creating embedding for training set

In [None]:
lhs_train, y_train = prepare_embeddings_updated (tpd_train, model_type='semantic', train_or_test='train', load_from_file=load_bert_sem, embedding_type=embedding, max_words=max_words_for_full_emb_sem, file_path=model_path)

Preparing Embeddings...
Model Type:  semantic
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX
Dataframe provided, Size:  (6488, 6)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_train.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_train.pt
Loaded, Size of LHS embeddings:  torch.Size([6488, 128, 768])
Loaded, Size of y Gold:  (6488,)
Returning lhs: Shape:  torch.Size([6488, 128, 768])
Returning y_gold: Shape:  (6488,)


#### Creating embedding for test set

In [None]:
lhs_test, y_test = prepare_embeddings_updated (tpd_test, model_type='semantic', train_or_test='test', load_from_file=load_bert_sem, embedding_type=embedding, max_words=max_words_for_full_emb_sem, file_path = model_path)

Preparing Embeddings...
Model Type:  semantic
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX
Dataframe provided, Size:  (2596, 6)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_test.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_test.pt
Loaded, Size of LHS embeddings:  torch.Size([2596, 128, 768])
Loaded, Size of y Gold:  (2596,)
Returning lhs: Shape:  torch.Size([2596, 128, 768])
Returning y_gold: Shape:  (2596,)


#### Training/loading the semantic LSTM model

In [None]:
load_trained_model_sem = True
if (load_trained_model_sem == True):
    lstm_model_sem = load_model(sem_model_save_path)
else:
  if (embedding == 'full_emb'):
    lstm_model_sem = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_words_for_full_emb_sem, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  else:
    lstm_model_sem = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_sentences, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  lstm_model_sem.fit(lhs_train.numpy(), y_train, batch_size=64, epochs=100)
  lstm_model_sem.save(sem_model_save_path)



Layer lstm_2 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_3 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


#### Evaluating the semantic model

In [None]:
evaluate_model (lstm_model_sem, lhs_test, y_test)

Kappa Score: 0.746204577711272
MSE:  0.02282681984321185
RMSE:  0.15108547197931324
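
The body of `evaluate_model` is not shown in this section, so the following is only a sketch of how the two error metrics relate, assuming the usual definitions: MSE is the mean squared difference between gold and predicted scores, and RMSE is its square root. The reported values above are consistent with that relationship.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between two equal-length score lists."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the scores."""
    return math.sqrt(mse(y_true, y_pred))

# Values copied from the evaluation output above.
reported_mse = 0.02282681984321185
reported_rmse = 0.15108547197931324
consistent = math.isclose(math.sqrt(reported_mse), reported_rmse, rel_tol=1e-9)
print(consistent)
```

Since the scores are normalized to the range 0 to 1, an RMSE of about 0.15 corresponds to a typical error of roughly 15% of the score range.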


### Coherence Model

#### Creating augmented dataset

In [None]:
aug_data_train = pd.DataFrame(columns=['essay_id', 'essay_set', 'essay', 'normalized_score'])
aug_data_test = pd.DataFrame(columns=['essay_id', 'essay_set', 'essay', 'normalized_score'])

print ("Original Train Data Shape", tpd_train.shape)
print ("Original Test Data Shape", tpd_test.shape)

# Perturb every training essay, then keep a random 30% of the perturbed copies.
aug_data_train['essay'] = tpd_train['essay'].apply(coherence_augment)
aug_data_train['essay_id'] = tpd_train['essay_id']
aug_data_train['essay_set'] = tpd_train['essay_set']
#aug_data_train['normalized_score'] = 0.0
# Perturbed essays keep one eighth of their original normalized score.
aug_data_train['normalized_score'] = tpd_train['normalized_score'] / 8
aug_data_train = aug_data_train.sample(frac=0.3)

# Same procedure for the test split.
aug_data_test['essay'] = tpd_test['essay'].apply(coherence_augment)
aug_data_test['essay_id'] = tpd_test['essay_id']
aug_data_test['essay_set'] = tpd_test['essay_set']
#aug_data_test['normalized_score'] = 0.0
aug_data_test['normalized_score'] = tpd_test['normalized_score'] / 8
aug_data_test = aug_data_test.sample(frac=0.3)

tpd_train_thin = tpd_train.drop(columns=['Unnamed: 0','domain1_score'])
tpd_test_thin = tpd_test.drop(columns=['Unnamed: 0','domain1_score'])

# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent.
# Combine perturbed and original essays, then shuffle the rows.
aug_data_train = pd.concat([aug_data_train, tpd_train_thin]).sample(frac = 1)
aug_data_test = pd.concat([aug_data_test, tpd_test_thin]).sample(frac = 1)

print ("Augmented Train Data Shape", aug_data_train.shape)
print ("Augmented Test Data Shape", aug_data_test.shape)

Original Train Data Shape (6488, 6)
Original Test Data Shape (2596, 6)
Augmented Train Data Shape (8434, 4)
Augmented Test Data Shape (3375, 4)
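
The augmented sizes follow directly from the sampling fractions. The sketch below assumes that `DataFrame.sample(frac=...)` draws `round(frac * n_rows)` rows, which matches every shape reported in this notebook; `augmented_size` is a hypothetical helper used only for the check.

```python
def augmented_size(n_rows, frac):
    """Rows after appending a frac-sized sample to the original n_rows."""
    return n_rows + round(frac * n_rows)

# Coherence model: 30% perturbed copies appended to the originals.
print(augmented_size(6488, 0.3), augmented_size(2596, 0.3))    # 8434 3375
# Prompt-relevance frames built with frac=0.33 earlier in the notebook.
print(augmented_size(6488, 0.33), augmented_size(2596, 0.33))  # 8629 3453
```

In both cases the perturbed or re-paired rows make up a little under a quarter of the final training set.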


#### Creating embeddings for training set

In [None]:
lhs_train_coh, y_train_coh = prepare_embeddings_updated (aug_data_train, model_type='coherence', train_or_test='train', load_from_file=load_bert_coh, embedding_type=embedding, max_words=max_words_for_full_emb, file_path=model_path)

Preparing Embeddings...
Model Type:  coherence
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX
Dataframe provided, Size:  (8434, 4)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_coherence_train.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_train_coh.pt
Loaded, Size of LHS embeddings:  torch.Size([8434, 128, 768])
Loaded, Size of y Gold:  (8434,)
Returning lhs: Shape:  torch.Size([8434, 128, 768])
Returning y_gold: Shape:  (8434,)


In [None]:
model_path

'/content/drive/MyDrive/Colab Notebooks/AES/full_embeddings'

#### Creating embedding for test set

In [None]:
lhs_test_coh, y_test_coh = prepare_embeddings_updated (aug_data_test, model_type='coherence', train_or_test='test', load_from_file=load_bert_coh, embedding_type=embedding, max_words=200)

Preparing Embeddings...
Model Type:  coherence
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX
Dataframe provided, Size:  (3375, 4)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_coherence_test.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_test_coh.pt
Loaded, Size of LHS embeddings:  torch.Size([3375, 128, 768])
Loaded, Size of y Gold:  (3375,)
Returning lhs: Shape:  torch.Size([3375, 128, 768])
Returning y_gold: Shape:  (3375,)


#### Load LSTM Model for Coherence

In [None]:
if (load_trained_model_coh == True):
    lstm_model_coh = load_model(coh_model_save_path)
else:
  if (embedding == 'full_emb'):
    lstm_model_coh = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_words_for_full_emb, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  else:
    lstm_model_coh = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_sentences, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  lstm_model_coh.fit(lhs_train_coh.numpy(), y_train_coh, batch_size=64, epochs=100)
  lstm_model_coh.save(coh_model_save_path)



Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_1 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


#### Evaluating the coherence model

In [None]:
evaluate_model (lstm_model_coh, lhs_test_coh, y_test_coh)

Kappa Score: 0.7488510161504939
MSE:  0.03650495161279134
RMSE:  0.19106269026890452


In [None]:
coh_model_save_path

'/content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/coh-lstm_model-latest.pt'

### Prompt Relevance Model (Baseline)

#### Prepare prompts data and combined essays with the prompt prepended

In [None]:
prompts = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/prompts-aes-kaggle-dataset.csv')

In [None]:
prompts.columns = ['essay_set','prompt']

In [None]:
prompts

Unnamed: 0,essay_set,prompt
0,1,"More and more people use computers, but not ev..."
1,2,"""All of us can think of a book that we hope no..."
2,3,"The author, on an ambitious cycling trip to Yo..."
3,4,Saeng is a teenaged Vietnamese migrant in the ...
4,5,The author is a second-generation Cuban migran...
5,6,"In their ambition to outshine the other, the a..."
6,7,Write about patience. Being patient means that...
7,8,We all understand the benefits of laughter. Fo...


In [None]:
#data_train_set = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/AES/experiment_3_part_datasettraining_set_33pc_sample.csv')

In [None]:
#data_train_set = data_train_set.drop(columns = ['Unnamed: 0'])

In [None]:
prompt_data_train = tpd_train.merge(prompts,on='essay_set')
prompt_data_test = tpd_test.merge(prompts,on='essay_set')
prompt_data_xgb = tpd_xgb.merge(prompts,on='essay_set')

In [None]:
prompt_data_train.head()

Unnamed: 0.1,Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt
0,6351,9908,4,The author concludes the story w/this paragrap...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...
1,6315,9872,4,I believe that the author concludes the story ...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...
2,5885,9441,4,The author of the Winter Hibiscus concludes th...,3,1.0,Saeng is a teenaged Vietnamese migrant in the ...
3,5556,9110,4,"From the story, Winter Hibiscus, by Minfong ...",2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...
4,6977,10540,4,I believe that the author chose to conclude th...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...


In [None]:
prompt_data_test.head()

Unnamed: 0.1,Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt
0,5510,9064,4,The reason why at the end of the story she end...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...
1,5330,8884,4,They probably ended it like that to build susp...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...
2,6230,9787,4,The author coNcludes the story with this parag...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...
3,6272,9829,4,The author concludes the story Winter Hibiscu...,3,1.0,Saeng is a teenaged Vietnamese migrant in the ...
4,6164,9721,4,He concludes this story like that so you dont ...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...


In [None]:
prompt_data_xgb

Unnamed: 0.1,Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt
0,5736,9291,4,"In the story Winter Hibiscus by Minfong Ho, ...",2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...
1,6843,10402,4,I think the author ended the story with that p...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...
2,6449,10006,4,The author chose to end the story with this pa...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...
3,6434,9991,4,"This sentence concludes the passage, to show h...",3,1.000000,Saeng is a teenaged Vietnamese migrant in the ...
4,6569,10126,4,The author concludes the Story with this parag...,0,0.000000,Saeng is a teenaged Vietnamese migrant in the ...
...,...,...,...,...,...,...,...
3887,12586,21132,8,Laughter is what connects me my friends. It's...,40,0.666667,We all understand the benefits of laughter. Fo...
3888,12411,20915,8,"was , hot and dry hadn't seen any for a few m...",40,0.666667,We all understand the benefits of laughter. Fo...
3889,12826,21437,8,Laughter is the most important part when you ...,34,0.566667,We all understand the benefits of laughter. Fo...
3890,12557,21095,8,Laughter can not only be a benefit to ourselv...,35,0.583333,We all understand the benefits of laughter. Fo...


In [None]:
prompt_data_train['combined_essay'] = prompt_data_train['prompt'] + prompt_data_train['essay']
prompt_data_test['combined_essay'] = prompt_data_test['prompt'] + prompt_data_test['essay']
prompt_data_xgb['combined_essay'] = prompt_data_xgb['prompt'] + prompt_data_xgb['essay']
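
Note that `prompt + essay` joins the two strings with no separator, so the last word of the prompt and the first word of the essay run together. The sketch below illustrates the effect and one possible alternative; `combine` is a hypothetical helper, and switching to it would change the text fed to BERT and therefore any cached embeddings.

```python
def combine(prompt, essay, sep=" "):
    """Join prompt and essay with an explicit separator (assumed: a single space)."""
    return prompt.rstrip() + sep + essay.lstrip()

prompt = "Write about patience."
essay = "I was patient when I waited in line."

fused = prompt + essay            # what plain concatenation produces
spaced = combine(prompt, essay)   # boundary preserved
print(fused)
print(spaced)
```

Whether the fused boundary matters in practice depends on the tokenizer; a WordPiece tokenizer will often still split at the punctuation mark.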

In [None]:
prompt_data_xgb.head()

Unnamed: 0.1,Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt,combined_essay
0,5736,9291,4,"In the story Winter Hibiscus by Minfong Ho, ...",2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
1,6843,10402,4,I think the author ended the story with that p...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
2,6449,10006,4,The author chose to end the story with this pa...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
3,6434,9991,4,"This sentence concludes the passage, to show h...",3,1.0,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
4,6569,10126,4,The author concludes the Story with this parag...,0,0.0,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...


In [None]:
prompt_data_train.head()

Unnamed: 0.1,Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt,combined_essay
0,6351,9908,4,The author concludes the story w/this paragrap...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
1,6315,9872,4,I believe that the author concludes the story ...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
2,5885,9441,4,The author of the Winter Hibiscus concludes th...,3,1.0,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
3,5556,9110,4,"From the story, Winter Hibiscus, by Minfong ...",2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
4,6977,10540,4,I believe that the author chose to conclude th...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...


#### Prepare Augmented Data for building & testing prompt relevance model

In [None]:
temp_df_train = pd.DataFrame(columns=["essay","prompt","j_prompt","essay_set", "essay_id","combined_essay","normalized_score"])
temp_df_test = pd.DataFrame(columns=["essay","prompt","j_prompt","essay_set", "essay_id","combined_essay","normalized_score"])
temp_df_xgb = pd.DataFrame(columns=["essay","prompt","j_prompt","essay_set", "essay_id","combined_essay","normalized_score"])

In [None]:
temp_df_train['essay'] = prompt_data_train["essay"].apply(lambda x: (x))
temp_df_train['essay_id'] = prompt_data_train['essay_id'].apply(lambda x: (x))
temp_df_train['essay_set'] = prompt_data_train['essay_set'].apply(lambda x: (x))
temp_df_train['normalized_score'] = prompt_data_train['normalized_score'].apply(lambda x: (x))
temp_df_train['prompt'] = prompt_data_train['prompt'].apply(lambda x: (x))
temp_df_train['j_prompt'] = np.random.choice(prompts.prompt, size=len(temp_df_train))
temp_df_train.loc[temp_df_train.prompt != temp_df_train.j_prompt, 'normalized_score'] = 0
temp_df_train['combined_essay'] = temp_df_train['j_prompt'] + temp_df_train['essay']
temp_df_train = temp_df_train.drop(columns=["essay", "prompt", "j_prompt"])
prompt_data_train_thin = prompt_data_train.drop(columns=["Unnamed: 0", "essay", "domain1_score", 'prompt'])
temp_df_train = temp_df_train.sample(frac=0.25)
print ("Size of additional augmented TRAIN data: ", temp_df_train.shape)
# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent call
temp_df_train = pd.concat([temp_df_train, prompt_data_train_thin])
temp_df_train = temp_df_train.sample(frac = 1)
temp_df_train = temp_df_train.rename(columns={"combined_essay": "essay"})

temp_df_test['essay'] = prompt_data_test["essay"].apply(lambda x: (x))
temp_df_test['essay_id'] = prompt_data_test['essay_id'].apply(lambda x: (x))
temp_df_test['essay_set'] = prompt_data_test['essay_set'].apply(lambda x: (x))
temp_df_test['normalized_score'] = prompt_data_test['normalized_score'].apply(lambda x: (x))
temp_df_test['prompt'] = prompt_data_test['prompt'].apply(lambda x: (x))
temp_df_test['j_prompt'] = np.random.choice(prompts.prompt, size=len(temp_df_test))
temp_df_test.loc[temp_df_test.prompt != temp_df_test.j_prompt, 'normalized_score'] = 0
temp_df_test['combined_essay'] = temp_df_test['j_prompt'] + temp_df_test['essay']
temp_df_test = temp_df_test.drop(columns=["essay", "prompt", "j_prompt"])
prompt_data_test_thin = prompt_data_test.drop(columns=["Unnamed: 0", "essay", "domain1_score", 'prompt'])
temp_df_test = temp_df_test.sample(frac=0.25)
print ("Size of additional augmented TEST data: ", temp_df_test.shape)
temp_df_test = pd.concat([temp_df_test, prompt_data_test_thin])
temp_df_test = temp_df_test.sample(frac = 1)
temp_df_test = temp_df_test.rename(columns={"combined_essay": "essay"})

#temp_df_xgb['essay'] = prompt_data_xgb["essay"].apply(lambda x: (x))
#temp_df_xgb['essay_id'] = prompt_data_xgb['essay_id'].apply(lambda x: (x))
#temp_df_xgb['essay_set'] = prompt_data_xgb['essay_set'].apply(lambda x: (x))
#temp_df_xgb['normalized_score'] = prompt_data_xgb['normalized_score'].apply(lambda x: (x))
#temp_df_xgb['prompt'] = prompt_data_xgb['prompt'].apply(lambda x: (x))
#temp_df_xgb['j_prompt'] = np.random.choice(prompts.prompt, size=len(temp_df_xgb))
#temp_df_xgb.loc[temp_df_xgb.prompt != temp_df_xgb.j_prompt, 'normalized_score'] = 0
#temp_df_xgb['combined_essay'] = temp_df_xgb['j_prompt'] + temp_df_xgb['essay']
#temp_df_xgb['combined_essay'] = temp_df_xgb['prompt'] + temp_df_xgb['essay']
#temp_df_xgb = temp_df_xgb.drop(columns=["essay", "prompt", "j_prompt"])
#prompt_data_xgb_thin = prompt_data_xgb.drop(columns=["Unnamed: 0", "essay", "domain1_score", 'prompt'])
##temp_df_xgb = temp_df_xgb.sample(frac=0.25)
#print ("Size of additional augmented XGB TEST/TRAIN data: ", temp_df_xgb.shape)
#temp_df_xgb = pd.concat([temp_df_xgb, prompt_data_xgb_thin])
#temp_df_xgb = temp_df_xgb.sample(frac = 1)
#temp_df_xgb = temp_df_xgb.rename(columns={"combined_essay": "essay"})

Size of additional augmented TRAIN data:  (1622, 4)
Size of additional augmented TEST data:  (649, 4)
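
The augmentation cell pairs each essay with a randomly drawn prompt and sets the target to 0 whenever the drawn prompt differs from the essay's own prompt. A minimal, dependency-free sketch of that labelling rule follows. The function name `make_prompt_pairs`, the dict keys and the seed are illustrative assumptions, not names from this notebook.

```python
import random

def make_prompt_pairs(rows, prompts, seed=0):
    """Pair each essay with a randomly drawn prompt (hypothetical helper).

    rows: list of dicts with 'essay', 'prompt' and 'normalized_score' keys.
    prompts: list of candidate prompt strings.
    The score is kept only when the drawn prompt equals the essay's own prompt.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    pairs = []
    for row in rows:
        drawn = rng.choice(prompts)
        score = row["normalized_score"] if drawn == row["prompt"] else 0.0
        # As in the cell above, prompt and essay are joined without a separator.
        pairs.append({"essay": drawn + row["essay"], "normalized_score": score})
    return pairs

rows = [
    {"essay": "essay A", "prompt": "P1 ", "normalized_score": 0.8},
    {"essay": "essay B", "prompt": "P2 ", "normalized_score": 0.5},
]
pairs = make_prompt_pairs(rows, ["P1 ", "P2 "])
```

Mismatched pairs act as negative examples for the prompt relevance model, while matched pairs keep the original normalized score.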


In [None]:
prompt_data_train

Unnamed: 0.1,Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt,combined_essay
0,6351,9908,4,The author concludes the story w/this paragrap...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
1,6315,9872,4,I believe that the author concludes the story ...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
2,5885,9441,4,The author of the Winter Hibiscus concludes th...,3,1.000000,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
3,5556,9110,4,"From the story, Winter Hibiscus, by Minfong ...",2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
4,6977,10540,4,I believe that the author chose to conclude th...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
...,...,...,...,...,...,...,...,...
6483,11629,18884,7,Once upon a time there was a princess named wa...,20,0.666667,Write about patience. Being patient means that...,Write about patience. Being patient means that...
6484,10939,18121,7,"I am never paitent, but which I am it is only ...",14,0.466667,Write about patience. Being patient means that...,Write about patience. Being patient means that...
6485,11355,18591,7,"Have you ever been patient? I have, especially...",24,0.800000,Write about patience. Being patient means that...,Write about patience. Being patient means that...
6486,11541,18789,7,"One day my brother, my mom and me were going t...",19,0.633333,Write about patience. Being patient means that...,Write about patience. Being patient means that...


In [None]:
prompt_data_test

Unnamed: 0.1,Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt,combined_essay
0,5510,9064,4,The reason why at the end of the story she end...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
1,5330,8884,4,They probably ended it like that to build susp...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
2,6230,9787,4,The author coNcludes the story with this parag...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
3,6272,9829,4,The author concludes the story Winter Hibiscu...,3,1.000000,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
4,6164,9721,4,He concludes this story like that so you dont ...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...
...,...,...,...,...,...,...,...,...
2591,2747,3942,2,Things that mean so much to just one person to...,3,0.400000,"""All of us can think of a book that we hope no...","""All of us can think of a book that we hope no..."
2592,2047,3242,2,Censorship is a big controversy in modern soci...,4,0.600000,"""All of us can think of a book that we hope no...","""All of us can think of a book that we hope no..."
2593,2558,3753,2,"In libraries there are a bunch of books, music...",3,0.400000,"""All of us can think of a book that we hope no...","""All of us can think of a book that we hope no..."
2594,3385,4580,2,"Yes, I do think so because everyone has an opi...",2,0.200000,"""All of us can think of a book that we hope no...","""All of us can think of a book that we hope no..."


In [None]:
new_validation_df_xgb = prompt_data_xgb.drop(columns=["Unnamed: 0", "domain1_score","prompt","essay"])
new_validation_df_xgb = new_validation_df_xgb.rename(columns={"combined_essay": "essay"})
new_validation_df_xgb.head()

Unnamed: 0,essay_id,essay_set,normalized_score,essay
0,9291,4,0.666667,Saeng is a teenaged Vietnamese migrant in the ...
1,10402,4,0.333333,Saeng is a teenaged Vietnamese migrant in the ...
2,10006,4,0.666667,Saeng is a teenaged Vietnamese migrant in the ...
3,9991,4,1.0,Saeng is a teenaged Vietnamese migrant in the ...
4,10126,4,0.0,Saeng is a teenaged Vietnamese migrant in the ...


In [None]:
prompts[prompts['essay_set']==2].prompt.iloc[0]

'"All of us can think of a book that we hope none of our children or any other children have taken off the shelf. But if I have the right to remove that book from the shelf -- that work I abhor -- then you also have exactly the same right and so does everyone else. And then we have no books left on the shelf for any of us." --Katherine Paterson, Author. Write a persuasive essay to a newspaper reflecting your vies on censorship in libraries. Do you believe that certain materials, such as books, music, movies, magazines, etc., should be removed from the shelves if they are found offensive? Support your position with convincing arguments from your own experience, observations, and/or reading.'

In [None]:
prompt_data_train.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/prompt_data_train.csv")
prompt_data_test.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/prompt_data_test.csv")
new_validation_df_xgb.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/prel_data/new_validation_df_xgb.csv")

#### Load or Create BERT Embeddings for Training Data for Prompt Relevance Model

In [None]:
embedding

'sen_avg'

In [None]:
load_bert_prel=False
lhs_train_prel, y_train_prel = prepare_embeddings_updated (temp_df_train, model_type='prel', train_or_test='train', load_from_file=load_bert_prel, embedding_type=embedding, max_words=max_words_for_full_emb)

Preparing Embeddings...
Model Type:  prel
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX
Dataframe provided, Size:  (8110, 4)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_prel_train.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_train_prel.pt


KeyboardInterrupt: ignored

#### Load or Create BERT Embeddings for TEST set for Prompt Relevance Model

In [None]:
load_bert_prel=True
lhs_test_prel, y_test_prel = prepare_embeddings (temp_df_test, model_type='prel', train_or_test='test', load_from_file=load_bert_prel)
#lhs_xgb_prel, y_xgb_prel = prepare_embeddings (new_validation_df_xgb, model_type='prel', train_or_test='test', load_from_file=load_bert_prel, file_path='/content/drive/MyDrive/Colab Notebooks/AES/')

Preparing Embeddings...
Model Type:  prel
Train or Test:  test
Dataframe provided, Size:  (3245, 4)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_prel_test.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_test_prel.pt
Loaded, Size of LHS embeddings:  torch.Size([3245, 128, 768])
Loaded, Size of y Gold:  (3245,)
Returning lhs: Shape:  torch.Size([3245, 128, 768])
Returning y_gold: Shape:  (3245,)


### Build LSTM Model for Prompt Relevance

In [None]:
prel_model_save_path

'/content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/prel-lstm_model-latest.pt'

In [None]:
load_trained_model_prel = True
if (load_trained_model_prel == True):
    lstm_model_prel = load_model(prel_model_save_path)
else:
  if (embedding == 'full_emb'):
    lstm_model_prel = get_model(Hidden_dim1=1540, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_words_for_full_emb, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  else:
    lstm_model_prel = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences = True, dropout_dense=0.5, dropout_lstm=0.4, 
                             recurrent_dropout=0.4, sen_size=max_sentences, input_size=768, activation='sigmoid', 
                             opt_engine='adam', loss_fn='mse')
  lstm_model_prel.fit(lhs_train_prel.numpy(), y_train_prel, batch_size=96, epochs=100)
  lstm_model_prel.save(prel_model_save_path)



Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_1 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


In [None]:
#lstm_model_prel.save(prel_model_save_path)

### Validate the Prompt Relevance Model

In [None]:
evaluate_model (lstm_model_prel, lhs_test_prel, y_test_prel)
#evaluate_model (lstm_model_prel, lhs_xgb_prel, y_xgb_prel)

Kappa Score: 0.8722539037517647
MSE:  0.02049779149266475
RMSE:  0.14317049798287618
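
`evaluate_model` is defined elsewhere in the notebook, so its exact steps are not shown here. As a rough sketch, the three reported metrics could be computed as below, assuming integer class labels for the kappa score. The helper names are hypothetical, and the kappa weights are based on label positions, which is how quadratic weighting is usually applied.

```python
import math

def quadratic_weighted_kappa(y_true, y_pred):
    """Cohen's kappa with quadratic weights for integer ratings (sketch)."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    k, n = len(labels), len(y_true)
    if k < 2:
        return 1.0  # only one rating in use: treat as perfect agreement
    observed = [[0.0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[index[t]][index[p]] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2  # quadratic disagreement weight
            num += w * observed[i][j]
            den += w * row[i] * col[j] / n  # expected count under independence
    return 1.0 - num / den

def mse_rmse(y_true, y_pred):
    """Mean squared error and its square root."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return mse, math.sqrt(mse)

kappa = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3])
mse, rmse = mse_rmse([0.0, 1.0], [0.0, 0.0])
```

The reported RMSE is consistent with being the square root of the reported MSE.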


### XGBoost Regression Model

In [None]:
compute_handcrafted_features = False
prelEval = True
if (compute_handcrafted_features == True):
  tpd_xgb = tpd_xgb.merge(prompts, on="essay_set") 
  data_with_handcrafted = augment_handcrafted_features(tpd_xgb, prelEval=prelEval)
  data_with_handcrafted.to_csv(data_with_errors_path)
else:
  data_with_handcrafted = pd.read_csv(data_with_errors_path)
data_with_handcrafted.head()

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,essay_id,essay_set,essay,domain1_score,normalized_score,prompt_x,prompt_y,prompt_x.1,prompt_y.1,prompt,spell_err,gram_err,oth_err,semantic_score,coherence_score,prel_score
0,0,5736,9291,4,"In the story Winter Hibiscus by Minfong Ho, ...",2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,8,1,1,0.803541,0.658268,0.637365
1,1,6843,10402,4,I think the author ended the story with that p...,1,0.333333,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,2,0,2,0.367756,0.304968,0.324565
2,2,6449,10006,4,The author chose to end the story with this pa...,2,0.666667,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,1,0,0,0.560226,0.464632,0.572027
3,3,6434,9991,4,"This sentence concludes the passage, to show h...",3,1.0,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,5,0,0,0.994009,0.726757,0.688145
4,4,6569,10126,4,The author concludes the Story with this parag...,0,0.0,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,Saeng is a teenaged Vietnamese migrant in the ...,1,0,0,0.011546,0.112469,0.197627


In [None]:
training_df_xgb_fin = data_with_handcrafted.drop(columns=['Unnamed: 0', 'domain1_score', 'prompt','Unnamed: 0.1','prompt_x', 'prompt_y', 'prompt_x.1', 'prompt_y.1' ])

In [None]:
training_df_xgb_fin.head()

Unnamed: 0,essay_id,essay_set,essay,normalized_score,spell_err,gram_err,oth_err,semantic_score,coherence_score,prel_score
0,9291,4,"In the story Winter Hibiscus by Minfong Ho, ...",0.666667,8,1,1,0.803541,0.658268,0.637365
1,10402,4,I think the author ended the story with that p...,0.333333,2,0,2,0.367756,0.304968,0.324565
2,10006,4,The author chose to end the story with this pa...,0.666667,1,0,0,0.560226,0.464632,0.572027
3,9991,4,"This sentence concludes the passage, to show h...",1.0,5,0,0,0.994009,0.726757,0.688145
4,10126,4,The author concludes the Story with this parag...,0.0,1,0,0,0.011546,0.112469,0.197627


In [None]:
tmp_merge = training_df_xgb_fin.merge(average_essay_lens, left_on='essay_set', right_on='essay_set')
# training_df_xgb_fin['length_deviation'] = 

In [None]:
tmp_merge

Unnamed: 0,essay_id,essay_set,essay,normalized_score,spell_err,gram_err,oth_err,semantic_score,coherence_score,prel_score,essay_len
0,9291,4,"In the story Winter Hibiscus by Minfong Ho, ...",0.666667,8,1,1,0.803541,0.658268,0.637365,91
1,10402,4,I think the author ended the story with that p...,0.333333,2,0,2,0.367756,0.304968,0.324565,91
2,10006,4,The author chose to end the story with this pa...,0.666667,1,0,0,0.560226,0.464632,0.572027,91
3,9991,4,"This sentence concludes the passage, to show h...",1.000000,5,0,0,0.994009,0.726757,0.688145,91
4,10126,4,The author concludes the Story with this parag...,0.000000,1,0,0,0.011546,0.112469,0.197627,91
...,...,...,...,...,...,...,...,...,...,...,...
3887,21132,8,Laughter is what connects me my friends. It's...,0.666667,4,0,11,0.615803,0.602049,0.537702,571
3888,20915,8,"was , hot and dry hadn't seen any for a few m...",0.666667,9,5,46,0.740394,0.772299,0.609743,571
3889,21437,8,Laughter is the most important part when you ...,0.566667,1,2,6,0.563059,0.589851,0.551155,571
3890,21095,8,Laughter can not only be a benefit to ourselv...,0.583333,5,3,11,0.721744,0.672230,0.619011,571


In [None]:
# e_len: number of alphabetic tokens in the essay (punctuation stripped before the check)
tmp_merge['e_len'] = tmp_merge.essay.apply(lambda x: sum([i.strip(string.punctuation).isalpha() for i in x.split()]))

In [None]:
tmp_merge

Unnamed: 0,essay_id,essay_set,essay,normalized_score,spell_err,gram_err,oth_err,semantic_score,coherence_score,prel_score,essay_len,e_len
0,9291,4,"In the story Winter Hibiscus by Minfong Ho, ...",0.666667,8,1,1,0.803541,0.658268,0.637365,91,132
1,10402,4,I think the author ended the story with that p...,0.333333,2,0,2,0.367756,0.304968,0.324565,91,79
2,10006,4,The author chose to end the story with this pa...,0.666667,1,0,0,0.560226,0.464632,0.572027,91,82
3,9991,4,"This sentence concludes the passage, to show h...",1.000000,5,0,0,0.994009,0.726757,0.688145,91,123
4,10126,4,The author concludes the Story with this parag...,0.000000,1,0,0,0.011546,0.112469,0.197627,91,27
...,...,...,...,...,...,...,...,...,...,...,...,...
3887,21132,8,Laughter is what connects me my friends. It's...,0.666667,4,0,11,0.615803,0.602049,0.537702,571,293
3888,20915,8,"was , hot and dry hadn't seen any for a few m...",0.666667,9,5,46,0.740394,0.772299,0.609743,571,731
3889,21437,8,Laughter is the most important part when you ...,0.566667,1,2,6,0.563059,0.589851,0.551155,571,602
3890,21095,8,Laughter can not only be a benefit to ourselv...,0.583333,5,3,11,0.721744,0.672230,0.619011,571,811


In [None]:
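def fnx(x):
  # Relative deviation of the essay's word count (e_len) from the
  # average length of its essay set (essay_len)
  return (x['e_len']-x['essay_len'])/x['essay_len']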

In [None]:
tmp_merge['length_deviation'] = tmp_merge.apply(fnx, axis=1)
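
The two preceding cells define the length feature: count the alphabetic tokens of an essay, then express the count relative to the average length of its essay set. A stand-alone sketch of both steps is given here. The function names `word_count` and `length_deviation` are hypothetical and only mirror the lambda and `fnx` above.

```python
import string

def word_count(essay):
    """Count whitespace-separated tokens that are alphabetic once punctuation is stripped."""
    return sum(tok.strip(string.punctuation).isalpha() for tok in essay.split())

def length_deviation(e_len, avg_len):
    """Relative deviation of an essay's word count from its essay set's average."""
    return (e_len - avg_len) / avg_len

# "2" is not alphabetic and "well-read" contains a hyphen, so neither is counted.
n_words = word_count("The author, in 2 words: well-read!")

# Values taken from the first row of the table: e_len=132, essay_len=91.
dev = length_deviation(132, 91)
```

A positive value means the essay is longer than the set average, a negative value that it is shorter.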

In [None]:
tmp_merge

Unnamed: 0,essay_id,essay_set,essay,normalized_score,spell_err,gram_err,oth_err,semantic_score,coherence_score,prel_score,essay_len,e_len,length_deviation
0,9291,4,"In the story Winter Hibiscus by Minfong Ho, ...",0.666667,8,1,1,0.803541,0.658268,0.637365,91,132,0.450549
1,10402,4,I think the author ended the story with that p...,0.333333,2,0,2,0.367756,0.304968,0.324565,91,79,-0.131868
2,10006,4,The author chose to end the story with this pa...,0.666667,1,0,0,0.560226,0.464632,0.572027,91,82,-0.098901
3,9991,4,"This sentence concludes the passage, to show h...",1.000000,5,0,0,0.994009,0.726757,0.688145,91,123,0.351648
4,10126,4,The author concludes the Story with this parag...,0.000000,1,0,0,0.011546,0.112469,0.197627,91,27,-0.703297
...,...,...,...,...,...,...,...,...,...,...,...,...,...
3887,21132,8,Laughter is what connects me my friends. It's...,0.666667,4,0,11,0.615803,0.602049,0.537702,571,293,-0.486865
3888,20915,8,"was , hot and dry hadn't seen any for a few m...",0.666667,9,5,46,0.740394,0.772299,0.609743,571,731,0.280210
3889,21437,8,Laughter is the most important part when you ...,0.566667,1,2,6,0.563059,0.589851,0.551155,571,602,0.054291
3890,21095,8,Laughter can not only be a benefit to ourselv...,0.583333,5,3,11,0.721744,0.672230,0.619011,571,811,0.420315


In [None]:
training_df_xgb_fin = tmp_merge

In [None]:
X_xgb = pd.DataFrame()
print ("Input Data shape: ", training_df_xgb_fin.shape)
X_xgb['spell_err'] = training_df_xgb_fin['spell_err']
X_xgb['gram_err'] = training_df_xgb_fin['gram_err']
#X_xgb['num_words'] = training_df_xgb_fin['num_words']
X_xgb['oth_err'] = training_df_xgb_fin['oth_err']
X_xgb['coherence_score'] = training_df_xgb_fin['coherence_score']
X_xgb['semantic_score'] = training_df_xgb_fin['semantic_score']
X_xgb['prel_score'] = training_df_xgb_fin['prel_score']
X_xgb['length_deviation'] = training_df_xgb_fin['length_deviation']
y_xgb = training_df_xgb_fin['normalized_score']
#X_xgb = X_xgb.drop(columns=["Unnamed: 0.1.1"])
#X_xgb = X_xgb.drop(columns=["Unnamed: 0.1"])
#X_xgb = X_xgb.drop(columns=["Unnamed: 0"])
print ("X_xgb feature matrix shape: ", X_xgb.shape)
print ("Original Input Data shape after assignment of X_xgb: ", training_df_xgb_fin.shape)
print ("y shape: ", y_xgb.shape)
print ("Sample Inputs:")
print (X_xgb.head())
print ("Sample Tags - y:")
print (y_xgb.head())
Xgb_train, Xgb_test, ygb_train, ygb_test = train_test_split(X_xgb, y_xgb, test_size=0.2, random_state=42)
Xgb_train_noprel, Xgb_test_noprel, ygb_train_noprel, ygb_test_noprel = train_test_split(X_xgb.drop(columns="prel_score"), y_xgb, test_size=0.2, random_state=42)

Input Data shape:  (3892, 13)
X_xgb feature matrix shape:  (3892, 7)
Original Input Data shape after assignment of X_xgb:  (3892, 13)
y shape:  (3892,)
Sample Inputs:
   spell_err  gram_err  oth_err  ...  semantic_score  prel_score  length_deviation
0          8         1        1  ...        0.803541    0.637365          0.450549
1          2         0        2  ...        0.367756    0.324565         -0.131868
2          1         0        0  ...        0.560226    0.572027         -0.098901
3          5         0        0  ...        0.994009    0.688145          0.351648
4          1         0        0  ...        0.011546    0.197627         -0.703297

[5 rows x 7 columns]
Sample Tags - y:
0    0.666667
1    0.333333
2    0.666667
3    1.000000
4    0.000000
Name: normalized_score, dtype: float64


In [None]:
Xgb_train

Unnamed: 0,spell_err,gram_err,oth_err,coherence_score,semantic_score,prel_score,length_deviation
3731,3,3,8,0.561768,0.621032,0.582219,-0.133100
3079,4,2,9,0.360528,0.549661,0.557382,-0.262821
175,14,0,0,0.506822,0.477708,0.436400,0.307692
278,2,0,0,0.398197,0.433544,0.418570,-0.164835
1074,12,0,7,0.749560,0.707989,0.731731,0.013333
...,...,...,...,...,...,...,...
1130,1,0,5,0.810506,0.780722,0.842737,0.566667
1294,6,1,2,0.764380,0.737699,0.731473,-0.140000
860,7,0,0,0.716068,0.526516,0.783841,0.372881
3507,12,1,4,0.633262,0.619817,0.548944,0.045714


In [None]:
Xgb_train_noprel

Unnamed: 0,spell_err,gram_err,oth_err,coherence_score,semantic_score,length_deviation
3731,3,3,8,0.561768,0.621032,-0.133100
3079,4,2,9,0.360528,0.549661,-0.262821
175,14,0,0,0.506822,0.477708,0.307692
278,2,0,0,0.398197,0.433544,-0.164835
1074,12,0,7,0.749560,0.707989,0.013333
...,...,...,...,...,...,...
1130,1,0,5,0.810506,0.780722,0.566667
1294,6,1,2,0.764380,0.737699,-0.140000
860,7,0,0,0.716068,0.526516,0.372881
3507,12,1,4,0.633262,0.619817,0.045714


In [None]:
Xgb_test

Unnamed: 0,spell_err,gram_err,oth_err,coherence_score,semantic_score,prel_score,length_deviation
3107,1,0,3,0.405781,0.411133,0.350979,-0.615385
2398,8,0,2,0.278744,0.337107,0.310831,-0.571816
3864,15,4,59,0.679779,0.725295,0.612141,0.316988
1187,0,0,0,0.973537,0.996845,0.833963,0.753333
315,6,1,1,0.541859,0.697749,0.658260,0.252747
...,...,...,...,...,...,...,...
3453,8,6,5,0.530352,0.466695,0.526662,-0.234286
2765,24,1,5,0.632146,0.553861,0.625056,0.371795
978,0,1,0,0.685724,0.752813,0.698727,-0.127119
650,1,0,2,0.181323,0.408657,0.316767,-0.449153


In [None]:
Xgb_test_noprel

Unnamed: 0,spell_err,gram_err,oth_err,coherence_score,semantic_score,length_deviation
3107,1,0,3,0.405781,0.411133,-0.615385
2398,8,0,2,0.278744,0.337107,-0.571816
3864,15,4,59,0.679779,0.725295,0.316988
1187,0,0,0,0.973537,0.996845,0.753333
315,6,1,1,0.541859,0.697749,0.252747
...,...,...,...,...,...,...
3453,8,6,5,0.530352,0.466695,-0.234286
2765,24,1,5,0.632146,0.553861,0.371795
978,0,1,0,0.685724,0.752813,-0.127119
650,1,0,2,0.181323,0.408657,-0.449153


In [None]:
ygb_train

3731    0.600000
3079    0.666667
175     0.666667
278     0.333333
1074    0.500000
          ...   
1130    0.750000
1294    0.750000
860     0.750000
3507    0.600000
3174    0.800000
Name: normalized_score, Length: 3113, dtype: float64

In [None]:
ygb_train_noprel

3731    0.600000
3079    0.666667
175     0.666667
278     0.333333
1074    0.500000
          ...   
1130    0.750000
1294    0.750000
860     0.750000
3507    0.600000
3174    0.800000
Name: normalized_score, Length: 3113, dtype: float64

In [None]:
#normalized_Xgb_train=(Xgb_train-Xgb_train.mean())/Xgb_train.std()
from sklearn import preprocessing
load_trained_model_RF = False
second_layer_with_prel = False

if (load_trained_model_RF == True):
  # No need to generate/scale training vectors; simply load the pre-trained Standard Scaler model from file
  # Load Standard Scaler Model
  print ("Not generating input training vectors, only loading standard scaler model from file")
  if (second_layer_with_prel == True):
    sc = pickle.load(open("/content/drive/MyDrive/Colab Notebooks/AES/std_scaler_aes_withprel_15Sep2021", 'rb'))
    print ("Loaded Standard Scaler model WITH PREL from file")
  else:
    sc = pickle.load(open("/content/drive/MyDrive/Colab Notebooks/AES/std_scaler_aes_wo_prel_15Sep2021", 'rb'))
    print ("Loaded Standard Scaler model WITHOUT PREL from file")

else:
  # Fit standard scaler model on input training vectors & scale the input training vectors
  print ("Building standard scaler model from input training vectors, scaling input training vectors, storing the SC model to file!")
  sc = preprocessing.StandardScaler()
  if (second_layer_with_prel == True):
    # Build second layer RF model with prompt relevance
    # Fit on the raw array so that later transform(...values) calls see the same input type
    sc.fit(Xgb_train.values)
    normalized_Xgb_train = sc.transform(Xgb_train.values)
    normalized_Xgb_train_df = pd.DataFrame(normalized_Xgb_train, index=Xgb_train.index, columns=Xgb_train.columns)
    ygb_class = np.around(ygb_train*20).astype(int)
    feature_names = ['spell_err','gram_err','oth_err','coherence_score','semantic_score', 'prel_score','length_deviation']
    # Save Standard Scaler Model
    pickle.dump(sc, open("/content/drive/MyDrive/Colab Notebooks/AES/std_scaler_aes_withprel_15Sep2021", 'wb'))
    print ("Input training vectors scaled and saved Standard Scaler model WITH PREL to a file")
  else:
    # Build second layer RF model without prompt relevance
    # Fit on the raw array so that later transform(...values) calls see the same input type
    sc.fit(Xgb_train_noprel.values)
    normalized_Xgb_train = sc.transform(Xgb_train_noprel.values)
    normalized_Xgb_train_df = pd.DataFrame(normalized_Xgb_train, index=Xgb_train_noprel.index, columns=Xgb_train_noprel.columns)
    ygb_class = np.around(ygb_train_noprel*20).astype(int)
    feature_names = ['spell_err','gram_err','oth_err','coherence_score','semantic_score', 'length_deviation']
    # Save Standard Scaler Model
    pickle.dump(sc, open("/content/drive/MyDrive/Colab Notebooks/AES/std_scaler_aes_wo_prel_15Sep2021", 'wb'))
    print ("Input training vectors scaled & saved Standard Scaler model WITHOUT PREL to a file")


Building standard scaler model from input training vectors, scaling input training vectors, storing the SC model to file!
Input training vectors scaled & saved Standard Scaler model WITHOUT PREL to a file
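
The scaler is fitted on the training features only and then reused for the test split, so the test data never influences the statistics. The sketch below shows this fit/transform separation without any libraries. It assumes column-wise z-scoring with the population standard deviation, which is what scikit-learn's `StandardScaler` computes. The names `fit_scaler` and `transform` are hypothetical.

```python
import math

def fit_scaler(rows):
    """Return per-column (means, stds) computed from the training rows only."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    # Guard against constant columns, which would otherwise divide by zero.
    stds = [s if s > 0 else 1.0 for s in stds]
    return means, stds

def transform(rows, means, stds):
    """Apply previously fitted statistics to any split."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

train = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
test = [[3.0, 12.0]]
means, stds = fit_scaler(train)          # statistics come from train only
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)
```

Saving the fitted statistics, as the cell above does with `pickle`, lets the same scaling be applied at inference time.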


In [None]:
normalized_Xgb_train_df

Unnamed: 0,spell_err,gram_err,oth_err,coherence_score,semantic_score,length_deviation
3731,-0.499252,1.372664,0.849337,0.011327,0.121764,-0.305810
3079,-0.338962,0.694716,1.052128,-0.865662,-0.205519,-0.593669
175,1.263937,-0.661178,-0.772989,-0.228124,-0.535475,0.672337
278,-0.659542,-0.661178,-0.772989,-0.701504,-0.737996,-0.376233
1074,0.943357,-0.661178,0.646546,0.829706,0.520518,0.019135
...,...,...,...,...,...,...
1130,-0.819832,-0.661178,0.240965,1.095305,0.854048,1.247018
1294,-0.018382,0.016769,-0.367408,0.894290,0.656760,-0.321122
860,0.141908,-0.661178,-0.772989,0.683751,-0.311657,0.816996
3507,0.943357,0.016769,0.038174,0.322893,0.116192,0.090990


In [None]:
if (second_layer_with_prel == True):  
  tt = np.array(20* ygb_test)
else:
  tt = np.array(20* ygb_test_noprel)
xgb_gold_values_class = tt.astype(int)
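
Note that the training labels are built with `np.around(ygb_train*20).astype(int)`, which rounds, whereas this cell applies `astype(int)` alone, which truncates toward zero. The sketch below uses Python's built-in `round` and `int` as stand-ins to show that the two conventions can disagree for some normalized scores. Whether this affects the reported kappa has not been checked here. The function names are illustrative.

```python
def to_class_rounded(score, scale=20):
    """Round to the nearest class (ties go to the even integer, as with np.around)."""
    return int(round(score * scale))

def to_class_truncated(score, scale=20):
    """Truncate toward zero, which is what a bare astype(int) does."""
    return int(score * scale)

for score in (0.333333, 0.666667, 1.0):
    print(score, to_class_rounded(score), to_class_truncated(score))
```

For a score of 0.333333 the scaled value is about 6.67, so rounding and truncation land on different classes.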

In [None]:
import matplotlib.pyplot as plt
%matplotlib notebook
def f_importances (coef, names):
  imp = coef[0]
  imp,names = zip(*sorted(zip(imp,names)))
  print ("AS Importance, names: ", imp, names)
  plt.barh(range(len(names)), imp, align='center')
  plt.yticks(range(len(names)), names)
  plt.show()

In [None]:
from sklearn import svm
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier, GradientBoostingClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import NearestNeighbors
import pickle as pkl

clfs = {'RF': RandomForestClassifier (n_estimators=50, n_jobs=-1),
        'ET': ExtraTreesClassifier (n_estimators=10, n_jobs=-1, criterion='entropy'),
        'AS': AdaBoostClassifier (DecisionTreeClassifier(max_depth=1), algorithm='SAMME', n_estimators=200),
        'LR': LogisticRegression (penalty='l1', solver='liblinear', C=1e5),
        'SVM': svm.SVC(kernel='linear', probability=True, random_state=0),
        'GB': GradientBoostingClassifier (learning_rate=0.05, subsample=0.5, max_depth=6, n_estimators=10),
        'NB': GaussianNB(),
        'DT': DecisionTreeClassifier()
        }
model = clfs['RF'].fit(normalized_Xgb_train_df, ygb_class)

In [None]:
if (second_layer_with_prel == True):
  # Scale the test features for the second-layer RF model with prompt relevance
  normalized_Xgb_test = sc.transform(Xgb_test.values)
  normalized_Xgb_test_df = pd.DataFrame(normalized_Xgb_test, index=Xgb_test.index, columns=Xgb_test.columns)
  tt = np.array(20* ygb_test)
else:
  normalized_Xgb_test = sc.transform(Xgb_test_noprel.values)
  normalized_Xgb_test_df = pd.DataFrame(normalized_Xgb_test, index=Xgb_test_noprel.index, columns=Xgb_test_noprel.columns)
  tt = np.array(20* ygb_test_noprel)

xgb_gold_values_class = tt.astype(int)
xgb_y_pred = model.predict(normalized_Xgb_test_df)

In [None]:
normalized_Xgb_test_df

Unnamed: 0,spell_err,gram_err,oth_err,coherence_score,semantic_score,length_deviation
3107,-0.819832,-0.661178,-0.164617,-0.668454,-0.840762,-1.376032
2398,0.302198,-0.661178,-0.367408,-1.222071,-1.180223,-1.279349
3864,1.424227,2.050611,11.191668,0.525610,0.599878,0.692964
1187,-0.980122,-0.661178,-0.772989,1.805783,1.845119,1.661244
315,-0.018382,0.016769,-0.570198,-0.075435,0.473564,0.550410
...,...,...,...,...,...,...
3453,0.302198,3.406506,0.240965,-0.125582,-0.585974,-0.530348
2765,2.866837,0.016769,0.240965,0.318026,-0.186260,0.814585
978,-0.980122,0.016769,-0.772989,0.551518,0.726069,-0.292537
650,-0.819832,-0.661178,-0.367408,-1.646625,-0.852116,-1.007152


In [None]:
result = cohen_kappa_score(xgb_gold_values_class,xgb_y_pred,weights='quadratic')
print("Kappa Score: {}".format(result))

Kappa Score: 0.777496800526493


In [None]:
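# file_name (the saved XGBoost model path) is assigned in a later cell; run that cell first
# Parameters must be passed as keyword arguments, not as a positional dict
bst = xgb.XGBRegressor(n_jobs=4)
bst.load_model(file_name)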

In [None]:
xgb_y_pred = bst.predict(Xgb_test)

In [None]:
xgb_y_pred = regressor.predict(Xgb_test)

NameError: ignored

In [None]:
if (second_layer_with_prel == True):
  pkl.dump(model,open('/content/drive/MyDrive/Colab Notebooks/AES/RandomForest_with_prel.sav','wb'))
else:
  pkl.dump(model,open('/content/drive/MyDrive/Colab Notebooks/AES/RandomForest_wo_prel.sav','wb'))

In [None]:
pd.DataFrame(regressor.feature_importances_.reshape(1, -1), columns=['spell_err','gram_err','oth_err','coherence_score','semantic_score', 'prel_score','length_deviation'])

xgb_y_pred = regressor.predict(Xgb_test)

MSE = np.square(np.subtract(ygb_test, xgb_y_pred)).mean()
RMSE = math.sqrt(MSE)
print ("XGBoost Model: MSE: ", MSE)
print ("XGBoost Model: RMSE: ", RMSE)

file_name = "/content/drive/MyDrive/Colab Notebooks/AES/xgb_reg.pkl"

# save XGB Model
regressor.save_model(file_name)
result = cohen_kappa_score(xgb_gold_values_class,xgb_pred_class,weights='quadratic')
print("Kappa Score: {}".format(result))


## Cosine Similarity for Prompt Relevance & Next Sentence Prediction model for Coherence

In [None]:
#prompt_data['cosine_sim'] = 0.0

In [None]:
from transformers import BertModel, BertConfig, BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
config = BertConfig.from_pretrained('bert-base-uncased', output_hidden_states=True)
model = BertModel.from_pretrained('bert-base-uncased', config=config)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [None]:
lhs_prompts = torch.empty((len(prompts),1,768), dtype=torch.float)

for i in range(len(prompts)):
  prompt = prompts.prompt.iloc[i]
  prompt_sentences = split_into_sentences(prompt)
  sen_length = len(prompt_sentences)

  lhs_sentence_avg = np.zeros((1,768), dtype = float)
  lhs_avg_sen = np.empty((0,768), dtype=float)

  for j in range(min(max_sentences,sen_length)):
    tokenize_sentence = tokenizer.encode(prompt_sentences[j], add_special_tokens=True, max_length=512, truncation=True)
    tt = torch.tensor(tokenize_sentence)
    tts = tt.reshape(1,len(tt))
    # sum the last four hidden layers (the 'last4sum' hidden-state setting)
    output = model(tts)
    lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
    lhs_sentence_np = np.array(lhs_sentence.detach().numpy())
    lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
    lhs_avg_sen = np.append(lhs_avg_sen,lhs_sentence_np_mean, axis=0)
  lhs_sentence_avg = np.mean(lhs_avg_sen, axis=0)

  lhs_prompts[i] = torch.tensor(lhs_sentence_avg)

torch.save(lhs_prompts, '/content/drive/MyDrive/Colab Notebooks/AES/prompts_lhs.pt')

In [None]:
lhs_prompts.shape

torch.Size([8, 1, 768])
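
The pooling used for each prompt (sum the last four hidden layers, average over tokens, then average over sentences) can be tried without loading BERT. The following is a minimal sketch in which random arrays stand in for `output.hidden_states`; the helper names `pool_sentence` and `pool_document` are hypothetical and do not appear in the notebook.

```python
import numpy as np

HIDDEN = 768  # hidden size of bert-base-uncased

def pool_sentence(hidden_states):
    """Sum the last four layers, then average over the token axis.

    hidden_states: list of 13 arrays of shape (1, n_tokens, HIDDEN)
    (embedding output plus 12 encoder layers).
    Returns an array of shape (1, HIDDEN).
    """
    last_four = hidden_states[12] + hidden_states[11] + hidden_states[10] + hidden_states[9]
    return last_four.mean(axis=1)

def pool_document(sentence_vectors):
    """Average per-sentence vectors into one (1, HIDDEN) document vector."""
    stacked = np.concatenate(sentence_vectors, axis=0)  # (n_sentences, HIDDEN)
    return stacked.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
# Stand-in hidden states for three sentences of 5, 9 and 12 tokens.
fake_sentences = [
    [rng.standard_normal((1, n_tokens, HIDDEN)) for _ in range(13)]
    for n_tokens in (5, 9, 12)
]
sentence_vectors = [pool_sentence(hs) for hs in fake_sentences]
doc_vector = pool_document(sentence_vectors)
print(doc_vector.shape)  # (1, 768)
```

Because every sentence is reduced to a single 768-dimensional vector before the final average, a long sentence and a short one contribute equally to the document vector.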

In [None]:
cos = torch.nn.CosineSimilarity(dim=0, eps=1e-08)
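
`CosineSimilarity` reduces over the dimension given by `dim`, so with `dim=0` two inputs of shape `(1, 768)` produce 768 values rather than one score. The sketch below is a NumPy analogue written for illustration (the function `cosine_similarity` is hypothetical, not the PyTorch implementation); it shows how the choice of axis changes the result shape.

```python
import numpy as np

def cosine_similarity(a, b, axis, eps=1e-8):
    """Cosine similarity of a and b, reducing over `axis`."""
    dot = np.sum(a * b, axis=axis)
    norms = np.linalg.norm(a, axis=axis) * np.linalg.norm(b, axis=axis)
    return dot / np.maximum(norms, eps)

rng = np.random.default_rng(0)
prompt_vec = rng.standard_normal((1, 768))
essay_vec = rng.standard_normal((1, 768))

# Reducing over axis 0 compares each column on its own: 768 outputs.
per_column = cosine_similarity(prompt_vec, essay_vec, axis=0)
# Reducing over axis 1 compares the two 768-dimensional vectors: one output.
per_row = cosine_similarity(prompt_vec, essay_vec, axis=1)

print(per_column.shape)  # (768,)
print(per_row.shape)     # (1,)
```

To get a single prompt-to-essay score, either reduce over the feature axis or pass 1-D vectors of length 768.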

In [None]:
input1 = torch.randn(1, 768)
input2 = torch.randn(1, 768)


In [None]:
cos(input1, input2)

tensor([-0.0735, -0.0395, -0.1273,  0.0273, -0.1357,  0.0424, -0.0101,  0.1760,
        -0.0862, -0.0300,  0.0151,  0.0976, -0.0567,  0.1664, -0.0106, -0.0485,
         0.0470, -0.1434, -0.0121,  0.0388, -0.1286,  0.0011,  0.0601,  0.2076,
        -0.0657,  0.0909, -0.0322,  0.0533,  0.0668, -0.1319, -0.0291,  0.1785,
        -0.0184,  0.2033,  0.0617,  0.0654, -0.0874, -0.1517, -0.2302,  0.2155,
         0.1217, -0.1152, -0.0351, -0.0500, -0.0870,  0.1235, -0.0581, -0.0784,
         0.0679,  0.0482,  0.0541, -0.2324,  0.2389,  0.0947,  0.0015,  0.0865,
        -0.0198, -0.1263, -0.1202, -0.1421, -0.1142,  0.0237, -0.1126,  0.1057,
         0.0484, -0.0793,  0.1952, -0.0747,  0.0134, -0.1639,  0.2234, -0.0713,
         0.0472, -0.0408,  0.0570,  0.0212, -0.0656,  0.0661, -0.0477,  0.0899,
        -0.0613,  0.1057,  0.0440, -0.0396,  0.0081,  0.1307, -0.0538, -0.1125,
        -0.0027, -0.0480, -0.0281,  0.1326, -0.0061,  0.1386,  0.0103,  0.0321,
        -0.0608,  0.0912, -0.2205,  0.09

In [None]:
lhs_essay = torch.empty((1,768), dtype=torch.float)
# emb_for_padding = tokenizer.encode_plus("", add_special_tokens=True, truncation=True, padding="max_length", return_tensors="pt", max_length=10)
# tt = torch.tensor(emb_for_padding['input_ids'])
# output = model(tt)
# lhs_for_padding = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
# lhs_for_padding_np = np.array(lhs_for_padding.detach().numpy())
# lhs_for_padding_mean = np.mean(lhs_for_padding_np,axis=1)
# lhs_avg_for_padding = torch.tensor(lhs_for_padding_mean[0])

# positional index of the target column; the column must already exist
cosine_col = prompt_data.columns.get_loc('cosine_sim')

for j in tqdm(range(len(prompt_data))):
  essay = prompt_data.essay.iloc[j]
  sentences = split_into_sentences(essay)

  # one row per sentence: the token-averaged embedding of that sentence
  lhs_avg_sen = np.empty((0,768), dtype=float)

  for i in range(min(max_sentences,len(sentences))):
    tokenize_sentence = tokenizer.encode(sentences[i],add_special_tokens=True, max_length=512, truncation=True)
    tts = torch.tensor(tokenize_sentence).reshape(1, -1)
    with torch.no_grad():  # inference only, no gradients needed
      output = model(tts)
    # sum of the last four hidden layers (9-12), not just the second-to-last one
    lhs_sentence = output.hidden_states[12] + output.hidden_states[11] + output.hidden_states[10] + output.hidden_states[9]
    lhs_sentence_np = lhs_sentence.detach().numpy()
    lhs_sentence_np_mean = np.mean(lhs_sentence_np,axis=1)
    lhs_avg_sen = np.append(lhs_avg_sen,lhs_sentence_np_mean, axis=0)

  lhs_sentence_avg = np.mean(lhs_avg_sen, axis=0, keepdims=True)
  lhs_essay = torch.tensor(lhs_sentence_avg, dtype=torch.float)

  # cos was built with dim=0, so pass 1-D vectors of length 768 to get a single score
  prompt_vec = lhs_prompts[prompt_data.essay_set.iloc[j]-1].squeeze(0)
  prompt_data.iloc[j, cosine_col] = float(cos(prompt_vec, lhs_essay.squeeze(0)))

torch.save(prompt_data, '/content/drive/MyDrive/Colab Notebooks/AES/prompt_data_with_cosine_sim.df')

100%|██████████| 4542/4542 [1:19:42<00:00,  1.05s/it]


In [None]:
lhs_prompts[prompt_data.essay_set.iloc[j]-1].shape

torch.Size([1, 768])

In [None]:
prompt_data.cosine_sim.iloc[0] = 0.0

In [None]:
lhs_essay.shape

torch.Size([768])

In [None]:
lhs_prompts[prompt_data.essay_set.iloc[0]-1].shape

torch.Size([80, 768])

In [None]:
prompt_data.essay_set.iloc[10000]-1

5

In [None]:
prompt_data.head()

Unnamed: 0,essay_id,essay_set,essay,domain1_score,normalized_score,prompt,combined_essay,cosine_sim
0,15177,6,The builders of the empire state building atte...,1,0.25,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.947607
1,14855,6,The builders of the many obstacles when attemp...,4,1.0,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.937073
2,16587,6,The ability to dock dirigibles atop the Empire...,4,1.0,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.89864
3,16368,6,They faced many problems when trying to dock t...,2,0.5,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.86202
4,15281,6,While attempting to allow dirigibles to dock a...,3,0.75,"In their ambition to outshine the other, the a...","In their ambition to outshine the other, the a...",0.930908


In [None]:
prompt_data.iloc[10000]

NameError: ignored

In [None]:
cos(lhs_prompts[prompt_data.essay_set.iloc[j]-1] , lhs_essay)

NameError: ignored

### XGBoost Regression

##### Prepare Handcrafted Data for XGBoost Regression

In [None]:
compute_handcrafted_features = False
if compute_handcrafted_features:
  data_with_handcrafted = augment_handcrafted_features(data)
  data_with_handcrafted.to_csv(data_with_errors_path)
else:
  data_with_handcrafted = pd.read_csv(data_with_errors_path)
data_with_handcrafted.head()

## New coherence model with NSP goldens

### Loading/creating dataset

In [None]:
lhs_train, y_train = prepare_embeddings_updated(tpd_train, model_type='semantic', train_or_test='train', load_from_file=load_bert_sem, embedding_type=embedding, max_words=max_words_for_full_emb_sem, file_path=model_path)

Preparing Embeddings...
Model Type:  semantic
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX
Dataframe provided, Size:  (6488, 6)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_train.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_train.pt
Loaded, Size of LHS embeddings:  torch.Size([6488, 128, 768])
Loaded, Size of y Gold:  (6488,)
Returning lhs: Shape:  torch.Size([6488, 128, 768])
Returning y_gold: Shape:  (6488,)
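
The loaded tensor has a fixed second dimension of 128, so essays with fewer sentences must have been padded and longer ones truncated. The sketch below shows one way to do this, assuming zero padding; the padding value actually used by `prepare_embeddings_updated` is not shown in this section, and `pad_sentences` is a hypothetical helper.

```python
import numpy as np

def pad_sentences(sentence_matrix, max_sentences=128, hidden=768):
    """Truncate or zero-pad a (n_sentences, hidden) matrix to (max_sentences, hidden)."""
    padded = np.zeros((max_sentences, hidden), dtype=np.float32)
    n = min(len(sentence_matrix), max_sentences)
    padded[:n] = sentence_matrix[:n]
    return padded

rng = np.random.default_rng(0)
# Three stand-in essays with 3, 40 and 200 sentence vectors.
essays = [rng.standard_normal((n, 768)).astype(np.float32) for n in (3, 40, 200)]
batch = np.stack([pad_sentences(e) for e in essays])
print(batch.shape)  # (3, 128, 768)
```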


In [None]:
from transformers import BertTokenizer, BertForNextSentencePrediction
import torch
from torch.nn import functional as F
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForNextSentencePrediction.from_pretrained('bert-base-uncased')


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForNextSentencePrediction: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertForNextSentencePrediction from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForNextSentencePrediction from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [None]:
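calculate_nsp_goldens = False
def nsp_average(essay):
  """Average BERT next-sentence probability over adjacent sentence pairs of an essay."""
  score = 0.0
  avg_score = 0.0
  sentences = split_into_sentences(essay)
  if len(sentences) != 0:
    for i in range(len(sentences)-1):
        # truncate so that long sentence pairs stay within BERT's 512-token limit
        encoding = tokenizer.encode_plus(sentences[i], sentences[i+1], return_tensors='pt', truncation=True, max_length=512)
        with torch.no_grad():  # inference only
            outputs = model(**encoding).logits
        softmax = F.softmax(outputs, dim = 1)
        # index 0 is the "sentence B follows sentence A" class; np.float was removed from NumPy, use float
        score = score + float(softmax[0][0])
    # NOTE: divides by the number of sentences, not by the number of pairs (len(sentences)-1)
    avg_score = score / len(sentences)
  # print("Total score: ", score, "Avg. score: ", avg_score)
  # print ("Sentences:\n", sentences)
  return avg_score

if calculate_nsp_goldens:
    # build the column in one pass; avoids chained assignment and label-vs-position indexing
    tpd_train['nsp_golden'] = [nsp_average(essay) for essay in tqdm(tpd_train['essay'])]
    tpd_train.to_csv("/content/drive/MyDrive/Colab Notebooks/AES/tpd_train_with_nsp.csv")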

In [None]:
nsp_tpd_train = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/AES/tpd_train_with_nsp.csv")
nsp_tpd_train

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,essay_id,essay_set,essay,domain1_score,normalized_score,nsp_golden
0,0,6351,9908,4,The author concludes the story w/this paragrap...,1,0.333333,0.666664
1,1,6315,9872,4,I believe that the author concludes the story ...,2,0.666667,0.857135
2,2,304,305,1,"Computers, a very much talked about subject. D...",10,0.800000,0.956475
3,3,8023,12771,5,I think in my opion is that the author was ver...,1,0.250000,0.666370
4,4,4442,6839,3,The setting that affect the cyclist is the con...,1,0.333333,0.666664
...,...,...,...,...,...,...,...,...
6483,6483,12781,21380,8,When I was fourteen years old I think my fami...,37,0.616667,0.962917
6484,6484,7107,11855,5,"In Narciso Rodriguez's memoir, the mood and fe...",3,0.750000,0.857108
6485,6485,1736,1741,1,"Dear, local newspaper, you like computers? I ,...",8,0.600000,0.965063
6486,6486,12296,20770,8,Laughter is indeed an important part of anyon...,39,0.650000,0.934141
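
Because `nsp_average` sums over the `n-1` adjacent sentence pairs but divides by the number of sentences `n`, a fully coherent essay cannot reach 1.0. Values such as 0.666664 and 0.857135 in the table are close to 2/3 and 6/7, which is what this normalisation gives for essays of three and seven sentences whose pairs all score near 1; this is an observation about the numbers, not something the notebook states. The sketch below compares the two normalisations on a made-up example (both function names are hypothetical).

```python
def average_nsp(pair_probs, n_sentences):
    """Sum over adjacent pairs, divide by the number of sentences (as nsp_average does)."""
    return sum(pair_probs) / n_sentences if n_sentences else 0.0

def average_nsp_per_pair(pair_probs):
    """Alternative: divide by the number of pairs instead."""
    return sum(pair_probs) / len(pair_probs) if pair_probs else 0.0

# Made-up three-sentence essay whose two adjacent pairs both look coherent.
pair_probs = [0.99999, 0.99999]
per_sentence = average_nsp(pair_probs, n_sentences=3)
per_pair = average_nsp_per_pair(pair_probs)
print(round(per_sentence, 6), round(per_pair, 6))
```

Dividing by the number of pairs would make the target independent of essay length, but the stored `nsp_golden` column was computed with the per-sentence version.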


### Training/loading the coherence (NSP) LSTM model

In [None]:
coh_nsp_model_save_path = "/content/drive/MyDrive/Colab Notebooks/AES/coherence_model_with_nsp_goldens.pt"

In [None]:
load_trained_model_coh_nsp = False
if load_trained_model_coh_nsp:
  lstm_model_coh_nsp = load_model(coh_nsp_model_save_path)
else:
  # the two embedding modes differ only in the sequence length fed to the LSTM
  sen_size = max_words_for_full_emb_sem if embedding == 'full_emb' else max_sentences
  lstm_model_coh_nsp = get_model(Hidden_dim1=1028, Hidden_dim2=512, return_sequences=True, dropout_dense=0.5, dropout_lstm=0.4,
                                 recurrent_dropout=0.4, sen_size=sen_size, input_size=768, activation='sigmoid',
                                 opt_engine='adam', loss_fn='mse')
  lstm_model_coh_nsp.fit(lhs_train.numpy(), nsp_tpd_train.nsp_golden, batch_size=64, epochs=100)
  lstm_model_coh_nsp.save(coh_nsp_model_save_path)



Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_1 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 128, 1028)         7389264   
                                                                 
 lstm_1 (LSTM)               (None, 512)               3155968   
                                                                 
 dropout (Dropout)           (None, 512)               0         
                                                                 
 dense (Dense)               (None, 1)                 513       
                                                                 
Total params: 10,545,745
Trainable params: 10,545,745
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epo

Assets written to: /content/drive/MyDrive/Colab Notebooks/AES/coherence_model_with_nsp_goldens.pt/assets
<keras.layers.recurrent.LSTMCell object at 0x7f42c12f4ed0> has the same name 'LSTMCell' as a built-in Keras object. Consider renaming <class 'keras.layers.recurrent.LSTMCell'> to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
<keras.layers.recurrent.LSTMCell object at 0x7f42c12ed550> has the same name 'LSTMCell' as a built-in Keras object. Consider renaming <class 'keras.layers.recurrent.LSTMCell'> to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.
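
The parameter counts in the model summary can be checked by hand. A standard LSTM layer has four gates, each with an input weight matrix, a recurrent weight matrix and a bias. The sketch below applies that formula to the layer sizes shown in the summary; the helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Trainable parameters of a standard LSTM layer:
    four gates, each with input weights, recurrent weights and a bias."""
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    """Weights plus biases of a fully connected layer."""
    return input_dim * units + units

lstm_1 = lstm_params(768, 1028)   # first LSTM: 768-dim embeddings -> 1028 units
lstm_2 = lstm_params(1028, 512)   # second LSTM: 1028 -> 512 units
dense = dense_params(512, 1)      # output layer: 512 -> 1
total = lstm_1 + lstm_2 + dense
print(lstm_1, lstm_2, dense, total)
# 7389264 3155968 513 10545745
```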


In [None]:
lhs_test, y_test = prepare_embeddings_updated(tpd_test, model_type='semantic', train_or_test='test', load_from_file=load_bert_sem, embedding_type=embedding, max_words=max_words_for_full_emb_sem, file_path=model_path)

Preparing Embeddings...
Model Type:  semantic
Embedding Type:  sen_avg
hState:  last4sum
Save File Directory:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX
Dataframe provided, Size:  (2596, 6)
Loading existing embeddings from file...
LHS File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/lhs_test.pt
Y File chosen:  /content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/y_test.pt
Loaded, Size of LHS embeddings:  torch.Size([2596, 128, 768])
Loaded, Size of y Gold:  (2596,)
Returning lhs: Shape:  torch.Size([2596, 128, 768])
Returning y_gold: Shape:  (2596,)


In [None]:
evaluate_model(lstm_model_coh_nsp, lhs_test, y_test)

## NSP vs. OG Model on the IELTS Dataset

### Loading IELTS dataset

In [None]:
ielts_dataset = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/AES/essay-training-data-scraped.xlsx - IELTS-Essays.csv", header=0, index_col=False, names=['id','prompt','essay','score','comments', 'COH','LR','GR',"TA"])



In [None]:
ielts_dataset = ielts_dataset.drop(columns=['id'])

In [None]:
ielts_dataset.comments[3]

'This is a great essay. Seems worthy of Band 8. No improvements are necessary, keep up the good work!'

In [None]:
score_min, score_max = ielts_dataset.score.min(), ielts_dataset.score.max()  # compute once, not per row
ielts_dataset['normalized_score'] = ielts_dataset.score.apply(lambda x: float(normalize_value(x, score_min, score_max)))

In [None]:
ielts_dataset

Unnamed: 0,prompt,essay,score,comments,COH,LR,GR,TA,normalized_score
0,As computers are being used more and more in e...,There is no doubt that education and the learn...,8.00,,,,,,0.833333
1,Popular events like the Football World Cup and...,"Every four years, the whole world stops to wat...",8.00,"This is a great essay, the ideas, language, st...",,,,,0.833333
2,Some say that rich countries should help poor ...,"Improvements in health, education and trade ar...",8.00,"This is a great essay, seems to be on a Band 8...",,,,,0.833333
3,As computers are being used more and more in e...,There have been immense advances in technology...,8.00,This is a great essay. Seems worthy of Band 8....,,,,,0.833333
4,Financial education should be a mandatory comp...,It is an obvious fact that financial aspects a...,7.75,"This is a wonderful essay. It covers the task,...",,,,,0.791667
...,...,...,...,...,...,...,...,...,...
210,Some people view teenage conflict with their p...,There is no doubt that adolescence can be a di...,9.00,"COH: 9,9,9,9,9; LR: 8,9; GR: 9,9; TA: 9,9,9,ok",9.0,8.0,9.0,9.0,1.000000
211,Some people believe money is a less important ...,It is widely believed by some that compared wi...,7.00,"COH: 9,9,9,7,4; LR: 6,9; GR: 9,9; TA: 9,8,9,ok",4.0,6.0,9.0,8.0,0.666667
212,People are now living much longer lives than b...,"In recent times, people are beginning to live ...",7.00,"COH: 9,9,6,9,9; LR: 6,9; GR: 9,9; TA: 9, 6.5, ...",6.0,6.0,9.0,6.0,0.666667
213,Nowadays the way many people interact with eac...,It is widely observed that the mode of communi...,7.00,"COH: 9,9,9,9,7; LR: 7,9; GR: 9,9; TA: 5,7,9,ok",7.0,7.0,9.0,,0.666667
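
The `normalized_score` column is consistent with min-max normalisation over a band range of 3.0 to 9.0. `normalize_value` is defined elsewhere in the notebook, so both the formula and the minimum of 3.0 are assumptions inferred from the table (8.0 maps to 0.833333 and 9.0 to 1.0). A minimal sketch:

```python
def min_max_normalize(value, lo, hi):
    """Map value from [lo, hi] onto [0, 1]."""
    return (value - lo) / (hi - lo)

# Assumed band range, inferred from the table rather than read from the data.
lo, hi = 3.0, 9.0
normalized = {band: round(min_max_normalize(band, lo, hi), 6) for band in (7.0, 7.75, 8.0, 9.0)}
print(normalized)
# {7.0: 0.666667, 7.75: 0.791667, 8.0: 0.833333, 9.0: 1.0}
```

Note that this differs from dividing the band by 10, under which a band of 8.0 becomes 0.8 rather than 0.833333.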


### Creating embeddings

In [None]:
# prepare_embeddings_updated(ielts_dataset,load_from_file=False,file_path="/content/drive/MyDrive/Colab Notebooks/AES/IELTS_dataset_data")

In [None]:
lhs_ielts_dataset = torch.load("/content/drive/MyDrive/Colab Notebooks/AES/IELTS_dataset_data/lhs_test.pt")
y_test_ielts_dataset = torch.load("/content/drive/MyDrive/Colab Notebooks/AES/IELTS_dataset_data/y_test.pt")

In [None]:
# scale IELTS bands by 1/10 (e.g. 8.0 -> 0.8); this is not the min-max normalized_score column
non_norm_y_test_ielts = ielts_dataset.score / 10

### Loading NSP Coherence Model

In [None]:
coh_nsp_model_save_path = "/content/drive/MyDrive/Colab Notebooks/AES/coherence_model_with_nsp_goldens.pt"
nsp_model = load_model(coh_nsp_model_save_path)



Layer lstm_6 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_7 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


In [None]:
evaluate_model(nsp_model,lhs_ielts_dataset, non_norm_y_test_ielts)

Kappa Score: 0.013717608825793093
MSE:  0.07298263263898985
RMSE:  0.27015298006683147
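
`evaluate_model` is defined elsewhere in the notebook, so how it computes these three numbers is not visible here. The sketch below shows one plausible way to obtain a kappa score, MSE and RMSE from continuous predictions. It assumes quadratic weighted kappa on scores rounded to integer bands 0 to 10; both the weighting and the discretisation are assumptions, and `quadratic_weighted_kappa` and `regression_report` are hypothetical names.

```python
import numpy as np

def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Quadratic weighted kappa for integer labels in range(n_classes)."""
    observed = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        observed[t, p] += 1
    # Expected counts if the two raters were independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
    idx = np.arange(n_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

def regression_report(y_true, y_pred, n_classes=11):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = float(np.mean((y_true - y_pred) ** 2))
    rmse = float(np.sqrt(mse))
    # Assumed discretisation: scores in [0, 1] -> integer bands 0..n_classes-1.
    true_bands = np.rint(y_true * (n_classes - 1)).astype(int)
    pred_bands = np.rint(y_pred * (n_classes - 1)).astype(int)
    kappa = float(quadratic_weighted_kappa(true_bands, pred_bands, n_classes))
    return {"kappa": kappa, "mse": mse, "rmse": rmse}

# Made-up targets and predictions, for illustration only.
report = regression_report([0.8, 0.7, 0.9, 0.6], [0.88, 0.72, 0.71, 0.58])
print(report)
```

Kappa is computed on discretised bands while MSE and RMSE use the raw values, which is one way a model can have a lower MSE and a lower kappa at the same time.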


### Loading OG model

In [None]:
og_model = load_model("/content/drive/MyDrive/Colab Notebooks/AES/experiment_XX/coh-lstm_model-latest.pt")



Layer lstm will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.




Layer lstm_1 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.


In [None]:
evaluate_model(og_model,lhs_ielts_dataset, non_norm_y_test_ielts)

Kappa Score: 0.1655800969599387
MSE:  0.08179962083601923
RMSE:  0.2860063300628488


In [None]:
ielts_dataset.essay.iloc[1]

'Every four years, the whole world stops to watch international sporting events such as the Olympics and the Football World Cup in which athletes show their best performance to make their country proud. These sporting occasions have proved to be helpful in easing international tension in difficult times when powerful leaders were trying to control the world’s economy and other governments were fighting over the land. The Olympic Games are one of the best examples which prove how sporting events can bring nations together, at least temporarily. From the ancient History, when Greeks and Romans would interrupt battles to participate in the games, to the more recent international disputes, when athletes from Palestine and Israel would forget their differences, compete peacefully and even embrace each other after an event. Moreover, these popular events have called the world’s attention to the terrible consequences of wars; thus some leaders have tried to reach agreements to end their dispu

In [None]:
ielts_dataset.score[1]

8.0

In [None]:
og_model.predict(lhs_ielts_dataset[1].numpy().reshape(1,128,768))

array([[0.88804144]], dtype=float32)

In [None]:
nsp_model.predict(lhs_ielts_dataset[1].numpy().reshape(1,128,768))

array([[0.90515244]], dtype=float32)
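
For this single essay, the two predictions can be compared against the target used during evaluation. The sketch below assumes the model outputs are on the same scale as `non_norm_y_test_ielts` (band divided by 10), so multiplying by 10 gives an approximate band; that reading of the outputs is an assumption.

```python
true_band = 8.0
target = true_band / 10  # same scaling as non_norm_y_test_ielts

# Predictions copied from the two cells above.
predictions = {"og_model": 0.88804144, "nsp_model": 0.90515244}

errors = {}
for name, pred in predictions.items():
    errors[name] = abs(pred - target)
    print(name, "approx. band:", round(pred * 10, 2), "abs. error:", round(errors[name], 4))
```

Both models overshoot the target of 0.8 on this essay, with the OG model slightly closer.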