# <center> Style Transfer Evaluation </center>

Since our intention is to provide machine-based style transfer, we need to address the subject of output evaluation. It follows from our discussion on styling that such a score should include the measures of a triplet {*style source, narrative fluency, and content equivalence*}. Given that our ultimate goal is a perfect imitation of the source style conditioned on the story from the content source, missing any one of these factors will not yield a satisfactory result. First, if the output text does not employ the vocabulary and sentence structure of the style donor, it results in a stylistic miss. Second, if the output employs the style but departs from the content, it fails to form a parallel representation. Third, if the output successfully fuses the content with the style of the input sources but violates general language and writing norms, it results in a poor reading experience. Therefore, to evaluate the quality of style transfer, we need to take all three considerations into account.

In evaluating the results of literary style transfer, we must consider that two dimensions of the metric (naturalness and content preservation) are of the satisficing type, while style is the metric component we optimize for. We leverage the work of Mir et al. [1], where the authors propose

* Style Transfer Intensity 
* Naturalness
* Content preservation 

as key aspects of interest for style transfer on text. The authors propose a set of metrics for automated evaluation and demonstrate that, for each aspect, these metrics correlate more strongly and agree better with human judgement than prior work in the area. We leverage one of these automated metrics, obtained via adversarial classification, to measure naturalness.
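
To make the satisficing versus optimizing distinction concrete, the sketch below first discards candidate outputs that fall under minimum naturalness and content-preservation thresholds, then picks the one with the highest style score. The `Candidate` fields, the thresholds of 0.5, and all scores are illustrative assumptions; they are not values from this notebook or from [1].

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    text: str
    style: float         # optimizing metric: higher is better
    naturalness: float   # satisficing metric: must clear a threshold
    content: float       # satisficing metric: must clear a threshold


def select_best(candidates: Sequence[Candidate],
                min_naturalness: float = 0.5,
                min_content: float = 0.5) -> Optional[Candidate]:
    """Return the most stylized candidate among those that clear both thresholds."""
    acceptable = [c for c in candidates
                  if c.naturalness >= min_naturalness and c.content >= min_content]
    if not acceptable:
        return None
    return max(acceptable, key=lambda c: c.style)


# Hypothetical scores for three outputs of the same source passage.
candidates = [
    Candidate("output A", style=0.95, naturalness=0.20, content=0.90),
    Candidate("output B", style=0.80, naturalness=0.85, content=0.75),
    Candidate("output C", style=0.60, naturalness=0.95, content=0.95),
]
best = select_best(candidates)
print(best.text if best else "no acceptable candidate")
```

Here output A has the strongest style but is rejected for low naturalness, so the selection falls to output B.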

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
import sys
sys.path.append('/content/drive/My Drive/Colab Notebooks/')
import re
import os
import glob
import numpy as np
import pandas as pd
from tokenizer import tokenize
from tokenizer import RE_PATTERN
from collections import Counter
from keras.models import load_model as load_keras_model
from keras.preprocessing.sequence import pad_sequences
try:
    import joblib  # sklearn.externals.joblib was removed in newer scikit-learn releases
except ImportError:
    from sklearn.externals import joblib

Using TensorFlow backend.




# Naturalness

In [3]:
"""EVALUATION OF NATURALNESS
This is used to evaluate the naturalness of output texts of our style transfer model.
For a baseline understanding of what is considered "natural," any method used for automated evaluation of naturalness
also requires an understanding of the human-sourced input texts.
Inspired by the adversarial evaluation approach in "Generating Sentences from a Continuous Space"
(Bowman et al., 2016), we use a pretrained LSTM logistic classifier available from [1]
on samples of input texts and output texts for each style transfer model.
Via adversarial evaluation, the classifiers must distinguish human-generated inputs from machine-generated outputs.
The more natural an output is, the likelier it is to fool an adversarial classifier.

    - Calculate naturalness scores for texts with clf, a NaturalnessClassifier      -> clf.score(...)
You can find examples of more detailed usage commands below.
"""

NATURALNESS_CLASSIFIER_BASE_PATH = '/content/drive/My Drive/NaturalnessClassifier/'
MAX_SEQ_LEN = 30 # for neural classifier

def load_model(path):
    return joblib.load(path)

def invert_dict(dictionary):
    return dict(zip(dictionary.values(), dictionary.keys()))

TEXT_VECTORIZER = load_model('/content/drive/My Drive/vectorizer.pkl')

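# adjust vocabulary to account for padding and unknowns:
# move the token that held index 0 to a new index (0 is the padding value
# in format_inputs), then reserve one more index for out-of-vocabulary tokens
VOCABULARY = TEXT_VECTORIZER.vocabulary_
INVERSE_VOCABULARY = invert_dict(VOCABULARY)
VOCABULARY[INVERSE_VOCABULARY[0]] = len(VOCABULARY)
VOCABULARY['CUSTOM_UNKNOWN'] = len(VOCABULARY)+1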




## DATA PREP
def convert_to_indices(text):
    # tokenize input text
    tokens = re.compile(RE_PATTERN).split(text)
    non_empty_tokens = list(filter(lambda token: token, tokens))

    indices = []

    # collect indices of tokens in vocabulary
    for token in non_empty_tokens:
        if token in VOCABULARY:
            index = VOCABULARY[token]
        else:
            index = VOCABULARY['CUSTOM_UNKNOWN']

        indices.append(index)

    return indices

def format_inputs(texts):
    # prepare texts for use in neural classifier
    texts_as_indices = []
    for text in texts:
        texts_as_indices.append(convert_to_indices(text))
    return pad_sequences(texts_as_indices, maxlen=MAX_SEQ_LEN, padding='post', truncating='post', value=0.)

def merge_datasets(dataset1, dataset2):
    x = []
    x.extend(dataset1)
    x.extend(dataset2)
    return x

def load_dataset(path):
    # read the whole file as a single text, so the result is a one-element list
    with open(path) as f:
        data = [f.read().strip()]
    return data


    
## NATURALNESS CLASSIFIERS
class NaturalnessClassifier:
    '''
    An external classifier was trained for a style transfer model -
    more specifically using its inputs and outputs excluding test samples.

    Use UnigramBasedClassifier (TBD) or NeuralBasedClassifier to load a
    trained classifier and score texts of a given style transfer model.
    The scores represent the probabilities of the texts being 'natural'.

    '''

    pass

class UnigramBasedClassifier(NaturalnessClassifier):
    ''' 
    Might implement in future if necessary

    '''

class NeuralBasedClassifier(NaturalnessClassifier):
    def __init__(self, style_transfer_model_name):
        # os.path.join avoids the doubled slash, since the base path already ends with '/'
        self.path = os.path.join(NATURALNESS_CLASSIFIER_BASE_PATH, f'neural_{style_transfer_model_name}.h5')
        self.classifier = load_keras_model(self.path)

    def score(self, texts):
        inps = format_inputs(texts)
        distribution = self.classifier.predict(inps)
        scores = distribution.squeeze()
        return scores

    def summary(self):
        return self.classifier.summary()
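
The vocabulary adjustment and the padding step in the cell above can be traced on a toy example. This is a minimal sketch under stated assumptions: the three-word vocabulary stands in for `TEXT_VECTORIZER.vocabulary_`, tokens are assumed to be already split (the notebook's `RE_PATTERN` is not reproduced), `MAX_SEQ_LEN` is shortened to 5 for readability, and `to_padded_indices` is a hypothetical helper that imitates `pad_sequences` with `padding='post'` and `truncating='post'` in plain Python.

```python
MAX_SEQ_LEN = 5  # shortened from 30 so the padding is easy to see

# Toy stand-in for TEXT_VECTORIZER.vocabulary_ (token -> index, starting at 0).
vocabulary = {"the": 0, "queen": 1, "passed": 2}

# Same adjustment as in the cell above: move the token that held index 0 to a
# fresh index so that 0 can serve as the padding value, then add an unknown token.
inverse = {index: token for token, index in vocabulary.items()}
vocabulary[inverse[0]] = len(vocabulary)             # "the": 0 -> 3
vocabulary["CUSTOM_UNKNOWN"] = len(vocabulary) + 1   # 3 + 1 = 4


def to_padded_indices(tokens, max_len=MAX_SEQ_LEN):
    """Map tokens to indices, then truncate or right-pad with 0 to max_len."""
    unknown = vocabulary["CUSTOM_UNKNOWN"]
    indices = [vocabulary.get(token, unknown) for token in tokens][:max_len]
    return indices + [0] * (max_len - len(indices))


print(to_padded_indices(["the", "queen", "smiled"]))  # short input: padded
print(to_padded_indices(["the"] * 7))                 # long input: truncated
```

After the adjustment no real token maps to 0, so a padded position cannot be confused with a word.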



## Naturalness Classifier

In [64]:
model = 'CAAE' 
neural_classifier = NeuralBasedClassifier(model)
print(neural_classifier.summary())

Model: "neural_adv_clf"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 30)                0         
_________________________________________________________________
embedding_2 (Embedding)      (None, 30, 256)           2419456   
_________________________________________________________________
lstm_2 (LSTM)                (None, 128)               197120    
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 129       
Total params: 2,616,705
Trainable params: 2,616,705
Non-trainable params: 0
_________________________________________________________________
None
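
The parameter counts in the summary can be checked by hand from the layer shapes. The sketch below assumes the usual Keras layouts (an LSTM with four gates and biases, a dense layer with one bias); the vocabulary size it derives is inferred from the summary, not stated anywhere in the notebook.

```python
embedding_dim = 256
lstm_units = 128
embedding_params = 2_419_456  # taken from the summary above

# Embedding: one embedding_dim-sized vector per vocabulary index.
vocab_size = embedding_params // embedding_dim

# LSTM: four gates, each with input weights, recurrent weights and a bias.
lstm_params = 4 * (lstm_units * (embedding_dim + lstm_units) + lstm_units)

# Dense: one weight per LSTM unit plus a bias, feeding a single output.
dense_params = lstm_units * 1 + 1

total = embedding_params + lstm_params + dense_params
print(vocab_size, lstm_params, dense_params, total)
```

The derived LSTM and dense counts match the 197,120 and 129 reported above, and the total matches 2,616,705.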


## Naturalness Scores for Donor Data

In [62]:
twain_2 = ["The question as to whether there is such a thing as divine right of kings is not settled in this book.  It was found too difficult. That the executive head of a nation should be a person of lofty character and extraordinary ability, was manifest and indisputable; that none but the Deity could select that head unerringly, was also manifest and indisputable; that the Deity ought to make that selection, then, was likewise manifest and indisputable; consequently, that He does make it, as claimed, was an unavoidable deduction. I mean, until the author of this book encountered the Pompadour, and Lady Castlemaine, and some other executive heads of that kind; these were found so difficult to work into the scheme, that it was judged better to take the other tack in this book (which must be issued this fall), and then go into training and settle the question in another book.  It is, of course, a thing which ought to be settled, and I am not going to have anything particular to do next winter anyway."]
dumas_2 = ["In the meanwhile, Monsieur continued his route with an air at once calm and majestic, and the more he thought about it, the less attractive he became of spectators, as there were too many spectators to keep up the exchange; but the good citizens of Blois could not pardon Monsieur for having chosen their gay city for an abode in which to indulge melancholy at his ease, and as often as they caught a glimpse of this demurecy, they stole away gaping, or drew back their heads into the interior of their dwellings, to wander again about and remain thus for a while. "]
not_real = ['Not naturalness is this, machine generated, flipped ngrams replacement decipher hard']

print('Naturalness score for donors - twain_2 - ' + str(neural_classifier.score(twain_2)))
print('Naturalness score for donors - dumas_2 - ' + str(neural_classifier.score(dumas_2)))
print('Naturalness score for not real text - ' + str(neural_classifier.score(not_real)))

Naturalness score for donors - twain_2 - 0.9155893
Naturalness score for donors - dumas_2 - 0.87011606
Naturalness score for not real text - 0.012912448
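
Because `format_inputs` truncates every text to `MAX_SEQ_LEN = 30` token indices, a long passage such as `twain_2` is scored on its opening tokens only. One possible workaround, which this notebook does not use, is to split a passage into consecutive windows and average the window scores. The sketch below is an assumption-laden illustration: it tokenizes on whitespace rather than with `RE_PATTERN`, so window boundaries only approximate the classifier's 30 tokens, and `dummy_score` is a hypothetical stand-in for the real classifier.

```python
from statistics import mean

MAX_SEQ_LEN = 30


def split_into_windows(text, window=MAX_SEQ_LEN):
    """Split text into consecutive chunks of at most `window` whitespace tokens."""
    tokens = text.split()
    return [" ".join(tokens[i:i + window]) for i in range(0, len(tokens), window)]


def windowed_score(text, score_fn, window=MAX_SEQ_LEN):
    """Average score_fn over all windows; score_fn maps a list of texts to scores."""
    chunks = split_into_windows(text, window)
    if not chunks:
        raise ValueError("text contains no tokens")
    return mean(float(s) for s in score_fn(chunks))


# Hypothetical stand-in scorer: fraction of tokens that are alphabetic.
def dummy_score(texts):
    return [sum(t.isalpha() for t in x.split()) / len(x.split()) for x in texts]


passage = " ".join(["word"] * 45 + ["123"] * 15)
print(len(split_into_windows(passage)), windowed_score(passage, dummy_score))
```

With the classifier above, something like `lambda chunks: np.atleast_1d(neural_classifier.score(chunks))` could be passed as `score_fn`; the wrapper matters because `score` squeezes a single result down to a scalar.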


In [63]:
df = pd.read_csv('/content/drive/My Drive/donor.csv')
texts = df['text'].tolist()
scores = neural_classifier.score(texts)
df['naturalness_score'] = scores
df.sample(10)
#df.to_csv('donor_results_caae.csv', index=False)

| Unnamed: 0 | text | author | naturalness_score |
|---|---|---|---|
| 39 | in the; afternoon his father's face was gray,... | Nabokov | 0.493888 |
| 22 | been drinking,” he said quietly. “Or have gon... | Nabokov | 0.108094 |
| 103 | yourself all dirty, Mark. Your hand is black.... | Nabokov | 0.016913 |
| 27 | corn, coal and hog producing area; and finall... | Nabokov | 0.021871 |
| 43 | of all I want to thank you for the generous c... | Nabokov | 0.220617 |
| 286 | one million seven hundred thousand francs, wi... | Dumas | 0.124283 |
| 285 | to the Town-hall; let us go and see the deput... | Dumas | 0.008071 |
| 328 | The queens passed to their own apartments, ac... | Dumas | 0.058468 |
| 79 | see,” he continued, filling Lik’s glass and h... | Nabokov | 0.024451 |
| 223 | Portsmouth; and but for Mr. Crawford and the ... | Austen | 0.03604 |
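
A random sample of ten rows is hard to compare across authors, so a per-author aggregate may be easier to read. The sketch below groups the ten sampled scores shown above using only the standard library. These ten rows are a random sample, so the means are illustrative and should not be read as results for the full donor set.

```python
from collections import defaultdict
from statistics import mean

# (author, naturalness_score) pairs copied from the ten sampled rows above.
sampled = [
    ("Nabokov", 0.493888), ("Nabokov", 0.108094), ("Nabokov", 0.016913),
    ("Nabokov", 0.021871), ("Nabokov", 0.220617), ("Dumas", 0.124283),
    ("Dumas", 0.008071), ("Dumas", 0.058468), ("Nabokov", 0.024451),
    ("Austen", 0.03604),
]

scores_by_author = defaultdict(list)
for author, score in sampled:
    scores_by_author[author].append(score)

for author, scores in sorted(scores_by_author.items()):
    print(f"{author}: n={len(scores)}, mean={mean(scores):.3f}")
```

On the full dataframe the same summary could be obtained with `df.groupby('author')['naturalness_score'].mean()`.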


## Ingested Style-Transferred Data for Different Authors

In [20]:
results = []
for root, dirs, files in os.walk("/content/drive/My Drive"):
    for file in files:
        if file.endswith(".txt"):
            path = os.path.join(root, file)   
            data = load_dataset(path)
            score = neural_classifier.score(data)
            print(path + ' - ' + str(score))
            results.append([path, ' '.join(data), score])


austen_path = '/content/drive/My Drive/Nabokov-style/Austen_raw/'
dumas_path = '/content/drive/My Drive/Nabokov-style/Dumas_raw/'

# score the raw Austen and Dumas files with the same steps
for raw_path in (austen_path, dumas_path):
    os.chdir(raw_path)
    files = glob.glob('*.txt???')
    for file in files:
        path = raw_path + file
        data = load_dataset(path)
        score = neural_classifier.score(data)
        print(path + ' - ' + str(score))
        results.append([path, ' '.join(data), score])

results = pd.DataFrame(results, columns=['file_name', 'text', 'naturalness'])
results.head()

/content/drive/My Drive/rand_donor_original.txt - 0.93960655
/content/drive/My Drive/misc (Nabokov style)/Austen/Austen_117M_10000_Nabokov-All-3.txt - 0.991459
/content/drive/My Drive/misc (Nabokov style)/Austen/Austen-original.txt - 0.9612681
/content/drive/My Drive/misc (Nabokov style)/Austen/Austen-original-2.txt - 0.9915976
/content/drive/My Drive/misc (Nabokov style)/Austen/Austen_117M_10000_Nabokov-All-2.txt - 0.9815973
/content/drive/My Drive/misc (Nabokov style)/Austen/Austen_117M_10000_Nabokov-All.txt - 0.9508127
/content/drive/My Drive/misc (Nabokov style)/Shakespeare/Shakespeare_117M_10000_Nabokov-All.txt - 0.99463415
/content/drive/My Drive/misc (Nabokov style)/Shakespeare/Shake__117M_10000_Nabokov-All.txt - 0.95405936
/content/drive/My Drive/misc (Nabokov style)/Rand/output-nabovokov-12k-117M-reject10000-2.txt - 0.99902236
/content/drive/My Drive/misc (Nabokov style)/Rand/output-nabovokov-12k-117M-reject1000.txt - 0.93960655
/content/drive/My Drive/misc (Nabokov style)/Ran

| Unnamed: 0 | file_name | text | naturalness |
|---|---|---|---|
| 0 | /content/drive/My Drive/rand_donor_original.txt | She sat at the window of the train, her head t... | 0.93960655 |
| 1 | /content/drive/My Drive/misc (Nabokov style)/A... | Mary had neither genius nor taste; but her day... | 0.991459 |
| 2 | /content/drive/My Drive/misc (Nabokov style)/A... | Elizabeth listened in silence, but was not con... | 0.9612681 |
| 3 | /content/drive/My Drive/misc (Nabokov style)/A... | Mary had neither genius nor taste; and though ... | 0.9915976 |
| 4 | /content/drive/My Drive/misc (Nabokov style)/A... | Mary had neither genius nor taste; her only en... | 0.9815973 |


## Processing for Style Transfer Text & Rescoring
* remove non-ASCII characters
* replace special characters (`--`, `-`, `_`) with commas

In [21]:
def remove_non_ascii(text):
    return ''.join(i for i in text if ord(i)<128)
 
# regex=True is required for the alternation pattern in recent pandas versions
results['text_ascii_only'] = results['text'].apply(remove_non_ascii).str.replace('--|-|_', ',', regex=True)
texts = results['text_ascii_only'].tolist()
scores = neural_classifier.score(texts)
results['naturalness_ascii_only'] = scores
results.head()

| Unnamed: 0 | file_name | text | naturalness | text_ascii_only | naturalness_ascii_only |
|---|---|---|---|---|---|
| 0 | /content/drive/My Drive/rand_donor_original.txt | She sat at the window of the train, her head t... | 0.93960655 | She sat at the window of the train, her head t... | 0.939606 |
| 1 | /content/drive/My Drive/misc (Nabokov style)/A... | Mary had neither genius nor taste; but her day... | 0.991459 | Mary had neither genius nor taste; but her day... | 0.995231 |
| 2 | /content/drive/My Drive/misc (Nabokov style)/A... | Elizabeth listened in silence, but was not con... | 0.9612681 | Elizabeth listened in silence, but was not con... | 0.961268 |
| 3 | /content/drive/My Drive/misc (Nabokov style)/A... | Mary had neither genius nor taste; and though ... | 0.9915976 | Mary had neither genius nor taste; and though ... | 0.991598 |
| 4 | /content/drive/My Drive/misc (Nabokov style)/A... | Mary had neither genius nor taste; her only en... | 0.9815973 | Mary had neither genius nor taste; her only en... | 0.981597 |
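
The effect of the cleaning step is easiest to see on a single string. The sketch below applies the same two operations with the standard `re` module instead of pandas; `clean` is a hypothetical helper and the sample sentence is made up for illustration.

```python
import re


def remove_non_ascii(text):
    """Drop every character whose code point is 128 or above."""
    return "".join(ch for ch in text if ord(ch) < 128)


def clean(text):
    """Strip non-ASCII characters, then turn '--', '-' and '_' into commas."""
    # '--' is listed first so a double dash becomes one comma, not two.
    return re.sub(r"--|-|_", ",", remove_non_ascii(text))


sample = "She paused--the caf\u00e9 was half_empty."
print(clean(sample))
```

Note that hyphens inside compound words are replaced as well, so `well-known` becomes `well,known`, and accented letters are dropped rather than transliterated.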


## Save Results

In [0]:
os.chdir('/content/drive/My Drive')
#results.to_csv('All_Results.csv', index=False)