# Attack IMDB model

The IMDB dataset contains movie reviews labeled as either positive or negative. Each review is a paragraph consisting of multiple sentences.

Here, we attempt to attack a **wordCNN** model for the IMDB dataset.

In [62]:
from neural_networks import word_cnn, char_cnn, bd_lstm, lstm
import os
from read_files import split_imdb_files, split_yahoo_files, split_agnews_files
from word_level_process import word_process, get_tokenizer, text_to_vector_for_all
from config import config
from keras.preprocessing import sequence
import numpy as np

dataset = "imdb"
model_name = "pretrained_word_cnn"

In [2]:
import stanfordnlp

# stanfordnlp.download('en')
nlp = stanfordnlp.Pipeline() # This sets up a default neural pipeline in English

Use device: cpu
---
Loading: tokenize
With settings: 
{'model_path': '/Users/weifanjiang/stanfordnlp_resources/en_ewt_models/en_ewt_tokenizer.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
---
Loading: pos
With settings: 
{'model_path': '/Users/weifanjiang/stanfordnlp_resources/en_ewt_models/en_ewt_tagger.pt', 'pretrain_path': '/Users/weifanjiang/stanfordnlp_resources/en_ewt_models/en_ewt.pretrain.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
---
Loading: lemma
With settings: 
{'model_path': '/Users/weifanjiang/stanfordnlp_resources/en_ewt_models/en_ewt_lemmatizer.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
Building an attentional Seq2Seq model...
Using a Bi-LSTM encoder
Using soft attention for LSTM.
Finetune all embeddings.
[Running seq2seq lemmatizer with edit classifier]
---
Loading: depparse
With settings: 
{'model_path': '/Users/weifanjiang/stanfordnlp_resources/en_ewt_models/en_ewt_parser.pt', 'pretrain_path': '/Users/weifanjiang/sta

In [3]:
model = word_cnn(dataset)
model_path = r'./runs/{}/{}.dat'.format(dataset, model_name)
model.load_weights(model_path)
print("successfully load model")

Build word_cnn model...
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
successfully load model


In [4]:
# Data label:
# [1 0] is negative review
# [0 1] is positive review

train_texts, train_labels, test_texts, test_labels = split_imdb_files()
x_train, y_train, x_test, y_test = word_process(train_texts, train_labels, test_texts, test_labels, dataset)
print('successfully load data')

Processing IMDB dataset
successfully load data


`predict_str` lets the model predict directly on the string form of a movie review, without manually converting it to a padded sequence first.

In [5]:
# Predict directly on a string input, so the caller
# does not need to convert it to a sequence first.
def predict_str(model, s):
    maxlen = config.word_max_len[dataset]
    tokenizer = get_tokenizer(dataset)
    # Tokenize to word indices, then pad/truncate at the end to maxlen
    s_seq = tokenizer.texts_to_sequences([s])
    s_seq = sequence.pad_sequences(s_seq, maxlen=maxlen, padding='post', truncating='post')
    # model.predict returns one row per input; only one review was passed
    return model.predict(s_seq)[0]
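
To make the `padding='post', truncating='post'` arguments concrete, here is a minimal NumPy-only sketch of the same idea. `pad_post` is a hypothetical helper written for illustration, not the Keras implementation, and the word indices are made up.

```python
import numpy as np

def pad_post(seqs, maxlen, value=0):
    """Right-pad or right-truncate each index sequence to exactly maxlen."""
    out = np.full((len(seqs), maxlen), value, dtype=np.int32)
    for row, seq in enumerate(seqs):
        kept = seq[:maxlen]            # truncating='post': drop the tail
        out[row, :len(kept)] = kept    # padding='post': fill values go at the end
    return out

# Made-up word indices standing in for tokenizer output
short_review = [12, 7, 431]
long_review = [5, 88, 3, 19, 640, 2]
padded = pad_post([short_review, long_review], maxlen=4)
print(padded)
```

The short review gains one trailing zero and the long review loses its last two indices, so every row has the fixed length the model expects.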

Here we pick one review from the test corpus (a fixed index is used so the example is reproducible).

In [6]:
import random
# idx = random.randint(0, x_test.shape[0] - 1)
idx = 15557

xi = x_test[idx:idx+1]
yi = y_test[idx:idx+1][0]
xi_text = test_texts[idx]

print(xi_text)
print()
print("model predict ", model.predict(xi)[0])
print("predict with predict_str ", predict_str(model, xi_text))
print("true label ", yi)

This is a stupid movie. When I saw it in a movie theater more than half the audience left before it was half over. I stayed to the bitter end. To show fortitude? I caught it again on television and it was much funnier. Still by no means a classic, or even consistently hilarious but the family kinda grew on me. I love Jessica Lundy anyway. If you've nothing better to do and it's free on t.v. you could do worse.

model predict  [0.92145205 0.07800014]
predict with predict_str  [0.92145205 0.07800014]
true label  [1 0]


Break a review into sentences based on the `stanfordnlp` pipeline's output.

In [7]:
def sentence_list(doc):
    sentences = []
    for words in doc.sentences:
        sentence = words.words[0].text
        for word in words.words[1:]:
            if word.upos != 'PUNCT' and not word.text.startswith('\''):
                sentence += ' '
            sentence += word.text
        sentences.append(sentence)
    return sentences
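
The joining rule can be checked without loading the neural pipeline. The sketch below repeats `sentence_list` and feeds it a hand-built stand-in for the parsed document; the `fake_sentence` helper, the token split, and the POS tags are assumptions made for illustration, not real `stanfordnlp` output.

```python
from types import SimpleNamespace

def sentence_list(doc):
    # Same logic as the notebook cell: no space before punctuation or clitics
    sentences = []
    for words in doc.sentences:
        sentence = words.words[0].text
        for word in words.words[1:]:
            if word.upos != 'PUNCT' and not word.text.startswith('\''):
                sentence += ' '
            sentence += word.text
        sentences.append(sentence)
    return sentences

def fake_sentence(tokens):
    # Hypothetical helper: builds an object exposing .words[i].text / .upos
    return SimpleNamespace(words=[SimpleNamespace(text=t, upos=u) for t, u in tokens])

doc = SimpleNamespace(sentences=[
    fake_sentence([("I", "PRON"), ("stayed", "VERB"), ("to", "ADP"),
                   ("the", "DET"), ("bitter", "ADJ"), ("end", "NOUN"), (".", "PUNCT")]),
    fake_sentence([("If", "SCONJ"), ("you", "PRON"), ("'ve", "AUX"),
                   ("nothing", "PRON"), ("better", "ADJ"), (".", "PUNCT")]),
])
rebuilt = sentence_list(doc)
print(rebuilt)
```

Tokens tagged `PUNCT` and tokens starting with an apostrophe are glued to the previous token, which is how `you` and `'ve` are rejoined into `you've`.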

In [8]:
"""
Attributes of a StanfordNLP doc object:
'_text', '_conll_file', '_sentences'

Attributes of doc.conll_file:
'ignore_gapping', '_file', '_from_str', '_sents', '_num_words'

Attributes of a sentence object in doc.sentences:
'_tokens', '_words', '_dependencies'
"""

doc = nlp(xi_text)
sentences = sentence_list(doc)

Compute **Sentence Saliency**.

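Let $x = s_1s_2\dots s_n$ be an input consisting of $n$ sentences, and let $y$ be the true label of $x$. The sentence saliency of sentence $s_k$ is the drop in the true-label probability when $s_k$ is removed:

$$S(y|s_k) = P(y|x) - P(y|s_1s_2\dots s_{k-1}s_{k+1}\dots s_n)$$

The raw scores are then normalized with a softmax so that they can be used as sampling probabilities.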

In [71]:
def sentence_saliency(model, sentences, label):
    true_pred = predict_str(model, ' '.join(sentences))
    if label[0] == 1:
        idx = 0
    else:
        idx = 1
    scores = []
    for i in range(len(sentences)):
        x_hat = ' '.join(sentences[0:i] + sentences[i+1:])
        scores.append(true_pred[idx] - predict_str(model, x_hat)[idx])
    
    # normalize w/ softmax
    determinism = 2.0
    scores = np.array(scores)
    softmax = np.exp(determinism * scores)
    softmax /= np.sum(softmax)
    return softmax

In [63]:
saliency_scores = sentence_saliency(model, sentences, yi)

In [64]:
print(saliency_scores)

[0.12932211 0.12282395 0.11874726 0.12187595 0.1260943  0.1195102
 0.11857406 0.14305218]
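
The leave-one-out computation can be tried on a toy example without the CNN. In this sketch `toy_predict` is a hypothetical keyword-counting stand-in for `predict_str`, and the three sentences are shortened; only the structure (drop in true-label probability, then softmax with the same `determinism` of 2.0) mirrors the cell above.

```python
import numpy as np

def toy_predict(text):
    # Hypothetical classifier: returns [P(negative), P(positive)] from keyword counts
    words = [w.strip('.,!?') for w in text.lower().split()]
    neg = sum(w in {"stupid", "bad"} for w in words)
    pos = sum(w in {"love", "funnier"} for w in words)
    p_neg = (1 + neg) / (2 + neg + pos)
    return np.array([p_neg, 1 - p_neg])

def toy_saliency(sentences, label_idx, determinism=2.0):
    full = toy_predict(' '.join(sentences))[label_idx]
    # Drop in true-label probability when sentence k is removed
    drops = np.array([
        full - toy_predict(' '.join(sentences[:k] + sentences[k + 1:]))[label_idx]
        for k in range(len(sentences))
    ])
    weights = np.exp(determinism * drops)   # softmax numerator
    return drops, weights / weights.sum()

sentences_toy = ["This is a stupid movie.", "I stayed to the end.",
                 "I love Jessica Lundy anyway."]
drops, probs = toy_saliency(sentences_toy, label_idx=0)  # index 0 = negative
print(drops)
print(probs)
```

Removing the first sentence lowers the negative probability, so it gets the largest weight; removing the last one raises it, so its raw score is negative. The softmax turns both into valid sampling probabilities.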


## Sentence paraphrasing

**Back translation**: translate the input sentence into another language, then translate the result back into the original language. This technique is commonly used to evaluate translation quality. Here, we use it to quickly generate a paraphrase of the original sentence.

In [34]:
print(xi_text)

This is a stupid movie. When I saw it in a movie theater more than half the audience left before it was half over. I stayed to the bitter end. To show fortitude? I caught it again on television and it was much funnier. Still by no means a classic, or even consistently hilarious but the family kinda grew on me. I love Jessica Lundy anyway. If you've nothing better to do and it's free on t.v. you could do worse.


Test Google Cloud authentication.

In [39]:
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/Users/weifanjiang/Documents/Personal/My Project-1e7426894fe6.json"

def implicit():
    from google.cloud import storage

    # If you don't specify credentials when constructing the client, the
    # client library will look for credentials in the environment.
    storage_client = storage.Client()

    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())
    print(buckets)
implicit()

[]


In [47]:
from google.cloud import translate_v2 as translate
translate_client = translate.Client()

results = translate_client.get_languages()

for language in results:
    print(u'{name} ({language})'.format(**language))

Afrikaans (af)
Albanian (sq)
Amharic (am)
Arabic (ar)
Armenian (hy)
Azerbaijani (az)
Basque (eu)
Belarusian (be)
Bengali (bn)
Bosnian (bs)
Bulgarian (bg)
Catalan (ca)
Cebuano (ceb)
Chichewa (ny)
Chinese (Simplified) (zh)
Chinese (Traditional) (zh-TW)
Corsican (co)
Croatian (hr)
Czech (cs)
Danish (da)
Dutch (nl)
English (en)
Esperanto (eo)
Estonian (et)
Filipino (tl)
Finnish (fi)
French (fr)
Frisian (fy)
Galician (gl)
Georgian (ka)
German (de)
Greek (el)
Gujarati (gu)
Haitian Creole (ht)
Hausa (ha)
Hawaiian (haw)
Hebrew (iw)
Hindi (hi)
Hmong (hmn)
Hungarian (hu)
Icelandic (is)
Igbo (ig)
Indonesian (id)
Irish (ga)
Italian (it)
Japanese (ja)
Javanese (jw)
Kannada (kn)
Kazakh (kk)
Khmer (km)
Kinyarwanda (rw)
Korean (ko)
Kurdish (Kurmanji) (ku)
Kyrgyz (ky)
Lao (lo)
Latin (la)
Latvian (lv)
Lithuanian (lt)
Luxembourgish (lb)
Macedonian (mk)
Malagasy (mg)
Malay (ms)
Malayalam (ml)
Maltese (mt)
Maori (mi)
Marathi (mr)
Mongolian (mn)
Myanmar (Burmese) (my)
Nepali (ne)
Norwegian (no)
Odia (Oriya)

In [55]:
from google.cloud import translate_v2 as translate

import html
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/Users/weifanjiang/Documents/Personal/My Project-1e7426894fe6.json"

def back_translation(s_in, show_mid=False):
    translate_client = translate.Client()
    # English -> Korean (the pivot language)
    mid_result = translate_client.translate(s_in, target_language="ko")['translatedText']
    if show_mid:
        print(html.unescape(mid_result))
    # Korean -> English
    en_result = translate_client.translate(mid_result, target_language="en")['translatedText']
    # The result may contain HTML entities such as &#39;; decode all of them
    return html.unescape(en_result)

In [56]:
print(xi_text)
print(back_translation(xi_text, show_mid = True))

This is a stupid movie. When I saw it in a movie theater more than half the audience left before it was half over. I stayed to the bitter end. To show fortitude? I caught it again on television and it was much funnier. Still by no means a classic, or even consistently hilarious but the family kinda grew on me. I love Jessica Lundy anyway. If you've nothing better to do and it's free on t.v. you could do worse.
이 바보 같은 영화입니다. 영화관에서 반 이상을 시청하기 전에 관중의 절반 이상이 남았습니다. 나는 쓰라린 끝에 머물렀다. 용기를 나타내려면? 나는 그것을 텔레비전에서 다시 붙 잡았다. 그리고 그것은 훨씬 더 재미 있었다. 여전히 고전적인 일이 아니고 심지어는 항상 재밌지 만 가족이 나에게 자랐습니다. 어쨌든 Jessica Lundy를 좋아합니다. 더 좋은 방법이없고 TV에서 무료로 사용하면 더 나빠질 수 있습니다.
This is a silly movie. More than half of the spectators remained before watching more than half at the cinema. I stayed bitter. To show courage? I caught it on television again. And it was much more fun. It's still not classic and it's always fun, but the family grew up for me. Anyway, I like Jessica Lundy. There is no better way and it can get wors
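
The round trip can also be exercised offline by passing the translator in as a function. This is a sketch under assumptions: `back_translate` and `fake_translate` are hypothetical names, and the lookup table simply replays one sentence pair taken from the output above instead of calling the API.

```python
import html

def back_translate(text, translate_fn, pivot="ko", source="en"):
    """Translate text to the pivot language and back, decoding HTML entities."""
    mid = html.unescape(translate_fn(text, pivot))
    return html.unescape(translate_fn(mid, source))

def fake_translate(text, target):
    # Hypothetical stand-in for the translation client (no network access)
    table = {
        ("To show fortitude?", "ko"): "용기를 나타내려면?",
        ("용기를 나타내려면?", "en"): "To show courage?",
    }
    return table[(text, target)]

paraphrase = back_translate("To show fortitude?", fake_translate)
print(paraphrase)
```

With the real client, `translate_fn` could wrap `translate_client.translate(s, target_language=t)['translatedText']`, which also makes it easy to try pivot languages other than Korean.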

## Implementing sentence-level genetic algorithm

`perturb`: randomly select a sentence from the input paragraph with probability proportional to its saliency score, then rephrase the selected sentence with back translation. Returns the paragraph with one modified sentence.

In [77]:
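def perturb(sentences, saliencies):
    # Sample a sentence index with probability proportional to its saliency
    k = np.random.choice(len(sentences), p=saliencies)
    rephrase = back_translation(sentences[k])
    # Replace only the sampled position, so duplicate sentences stay intact
    new_paragraph = sentences[:k] + [rephrase] + sentences[k+1:]
    return ' '.join(new_paragraph)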
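
The sampling-and-replace step can be tested without the translation API by injecting the rephrasing function. In this sketch `perturb_by_index` and the use of `str.upper` as a fake rephraser are assumptions for illustration; the saliency values are the ones printed earlier, renormalized to guard against rounding in the printed digits.

```python
import numpy as np

def perturb_by_index(sentences, saliencies, rephrase_fn, rng):
    """Rephrase one sentence, chosen with probability proportional to saliency."""
    k = int(rng.choice(len(sentences), p=saliencies))
    new_sentences = list(sentences)
    new_sentences[k] = rephrase_fn(sentences[k])
    return k, new_sentences

# Saliency scores printed earlier in the notebook, renormalized to sum to 1
p = np.array([0.12932211, 0.12282395, 0.11874726, 0.12187595,
              0.1260943, 0.1195102, 0.11857406, 0.14305218])
p = p / p.sum()

toy_sentences = ["one.", "two.", "two.", "three.", "four.", "five.", "six.", "seven."]
rng = np.random.default_rng(0)
k, new_sentences = perturb_by_index(toy_sentences, p, str.upper, rng)
print(k, ' '.join(new_sentences))
```

Working with an index rather than the sentence text means that a repeated sentence (here `two.`) is changed in at most one position.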

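`genetic`: main function to perform the genetic attack. So far it only builds and prints the initial population; the `generation` argument is not used yet.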

In [79]:
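def genetic(x0, y0, model, population, generation):
    # Use the function arguments rather than the notebook globals xi_text / yi
    doc = nlp(x0)
    sentences = sentence_list(doc)
    saliency_scores = sentence_saliency(model, sentences, y0)

    # Initial population: single-sentence perturbations of x0.
    # It is a set, so it may hold fewer than `population` members.
    gen0 = set()
    for i in range(population):
        gen0.add(perturb(sentences, saliency_scores))

    print(gen0)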

In [80]:
genetic(xi_text, yi, model, 3, 10)

{"This is a stupid movie. When I saw it in a movie theater more than half the audience left before it was half over. I stayed to the bitter end. To show fortitude? I caught it again on television and it was much funnier. It's still not classic and it's always fun, but the family grew up for me. I love Jessica Lundy anyway. If you've nothing better to do and it's free on t.v. you could do worse.", "This is a stupid movie. When I saw it in a movie theater more than half the audience left before it was half over. I stayed to the bitter end. To show courage? I caught it again on television and it was much funnier. Still by no means a classic, or even consistently hilarious but the family kinda grew on me. I love Jessica Lundy anyway. If you've nothing better to do and it's free on t.v. you could do worse.", "This is a stupid movie. When I saw it in a movie theater more than half the audience left before it was half over. I stayed to the bitter end. To show fortitude? I caught it on televis
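
The notebook stops at the initial population. As one possible way to continue, the sketch below runs a generic generation loop (fitness evaluation, fitness-proportional parent selection, sentence-wise crossover, mutation, elitism). It is an assumption about how the remaining steps might look, not the author's implementation: `genetic_sketch`, `toy_predict`, and `toy_mutate` are hypothetical, and the toy mutation replaces a keyword instead of calling back translation.

```python
import numpy as np

def toy_predict(text):
    # Hypothetical classifier: returns [P(negative), P(positive)]
    t = text.lower()
    neg, pos = t.count("stupid"), t.count("silly")
    p_neg = (1 + neg) / (2 + neg + pos)
    return np.array([p_neg, 1 - p_neg])

def toy_mutate(sents, rng):
    # Hypothetical stand-in for perturb(): rewrite one randomly chosen sentence
    out = list(sents)
    k = int(rng.integers(len(out)))
    out[k] = out[k].replace("stupid", "silly").replace("Stupid", "Silly")
    return out

def genetic_sketch(sentences, true_idx, predict_fn, mutate_fn,
                   population=4, generations=5, seed=0):
    rng = np.random.default_rng(seed)

    def evaluate(pop):
        # Fitness: probability mass moved away from the true label
        return np.array([1.0 - predict_fn(' '.join(c))[true_idx] for c in pop])

    pop = [mutate_fn(sentences, rng) for _ in range(population)]
    for _ in range(generations):
        fitness = evaluate(pop)
        best_i = int(np.argmax(fitness))
        if fitness[best_i] > 0.5:           # binary task: prediction has flipped
            return pop[best_i], True
        probs = fitness + 1e-9              # avoid division by zero
        probs = probs / probs.sum()
        children = [pop[best_i]]            # elitism: keep the current best
        while len(children) < population:
            a, b = rng.choice(len(pop), size=2, p=probs)
            # Sentence-wise crossover: take each position from either parent
            child = [pop[a][k] if rng.random() < 0.5 else pop[b][k]
                     for k in range(len(sentences))]
            children.append(mutate_fn(child, rng))
        pop = children
    fitness = evaluate(pop)
    best_i = int(np.argmax(fitness))
    return pop[best_i], bool(fitness[best_i] > 0.5)

start = ["This is a stupid movie.", "A stupid plot.", "Stupid ending."]
best, success = genetic_sketch(start, true_idx=0,
                               predict_fn=toy_predict, mutate_fn=toy_mutate)
print(success, ' '.join(best))
```

Crossover works position by position here because every candidate keeps the same number of sentences; a real attack would also need a check that the paraphrase preserves the original meaning.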

---