# Multi-language attacks

TextAttack's four-component framework (goal function, constraints, transformation, search method) makes it straightforward to run attacks in languages other than English. In this tutorial, we:

- Create a model wrapper around Transformers [pipelines](https://huggingface.co/transformers/main_classes/pipelines.html) 
- Initialize a pre-trained [CamemBERT](https://camembert-model.fr/) model for sentiment classification
- Load the AlloCiné movie review sentiment classification dataset (from [`datasets`](https://github.com/huggingface/datasets/))
- Load the `pwws` recipe, but use French synonyms from multilingual WordNet (instead of English synonyms)
- Run an adversarial attack on a French-language model

Voilà!

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/QData/TextAttack/blob/master/docs/2notebook/Example_4_CamemBERT.ipynb)

[![View Source on GitHub](https://img.shields.io/badge/github-view%20source-black.svg)](https://github.com/QData/TextAttack/blob/master/docs/2notebook/Example_4_CamemBERT.ipynb)

In [None]:
from textattack.attack_recipes import PWWSRen2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import ModelWrapper
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, pipeline
from textattack import Attacker

import numpy as np

# Quiet TensorFlow.
import os
if "TF_CPP_MIN_LOG_LEVEL" not in os.environ:
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"


class HuggingFaceSentimentAnalysisPipelineWrapper(ModelWrapper):
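    """Adapt a Transformers sentiment analysis pipeline for TextAttack.

    The pipeline returns a list of responses like

        [{'label': 'POSITIVE', 'score': 0.7817379832267761}]

    We need to convert that to a format TextAttack understands: one row
    of class probabilities per input, ordered [negative, positive], like

        [[0.218262017, 0.7817379832267761]]
    """
    def __init__(self, model):
        # `model` is the Transformers pipeline object, not the bare network.
        self.model = model
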
    def __call__(self, text_inputs):
        raw_outputs = self.model(text_inputs)
        outputs = []
        for output in raw_outputs:
            score = output['score']
            if output['label'] == 'POSITIVE':
                outputs.append([1-score, score])
            else:
                outputs.append([score, 1-score])
        return np.array(outputs)
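
The label-to-probability conversion in `__call__` can be checked without downloading a model. The sketch below repeats that logic as a standalone function, using plain lists instead of a NumPy array. `to_probability_rows` and `fake_pipeline` are hypothetical names introduced only for this illustration, and the canned scores are made up rather than taken from the real model.

```python
def to_probability_rows(raw_outputs):
    """Convert pipeline-style dicts into [P(negative), P(positive)] rows."""
    rows = []
    for output in raw_outputs:
        score = output["score"]
        if output["label"] == "POSITIVE":
            # The score is the confidence in POSITIVE, so it goes second.
            rows.append([1 - score, score])
        else:
            # The score is the confidence in NEGATIVE, so it goes first.
            rows.append([score, 1 - score])
    return rows


def fake_pipeline(texts):
    """Hypothetical stand-in for a sentiment pipeline; returns canned scores."""
    canned = {
        "Un film magnifique.": {"label": "POSITIVE", "score": 0.75},
        "Quel dommage.": {"label": "NEGATIVE", "score": 0.875},
    }
    return [canned[text] for text in texts]


rows = to_probability_rows(fake_pipeline(["Un film magnifique.", "Quel dommage."]))
print(rows)  # [[0.25, 0.75], [0.875, 0.125]]
```

Each row sums to 1 and keeps the same column order regardless of which label the pipeline reports, which is what lets an untargeted classification goal function compare the original and perturbed predictions.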


In [None]:
# Create the model: a French sentiment analysis model.
# see https://github.com/TheophileBlard/french-sentiment-analysis-with-bert
model = TFAutoModelForSequenceClassification.from_pretrained("tblard/tf-allocine")
tokenizer = AutoTokenizer.from_pretrained("tblard/tf-allocine")
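# Use a distinct name so the imported `pipeline` function is not shadowed
# (otherwise re-running this cell fails).
sentiment_pipeline = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

model_wrapper = HuggingFaceSentimentAnalysisPipelineWrapper(sentiment_pipeline)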

# Create the recipe: PWWS uses a WordNet transformation.
recipe = PWWSRen2019.build(model_wrapper)
#
# WordNet defaults to English. Set the default language to French ('fra').
#
# See "Building a free French wordnet from multilingual resources",
# Proceedings of the Sixth International Language Resources and Evaluation (LREC'08),
# European Language Resources Association (ELRA).
recipe.transformation.language = 'fra'

dataset = HuggingFaceDataset('allocine', split='test')

attacker = Attacker(recipe, dataset)
attacker.attack_dataset()
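
To see why changing a single `language` attribute is enough to retarget the recipe, here is a toy sketch of a language-keyed synonym swap. It does not use WordNet or TextAttack: `TOY_SYNONYMS` and `ToyWordSwap` are hypothetical, and the synonym table is hand-written for illustration (the `dessin` to `brouillon` pair mirrors a swap seen in the attack output).

```python
# Toy synonym tables keyed by language code; NOT real WordNet data.
TOY_SYNONYMS = {
    "eng": {"drawing": ["sketch", "draft"]},
    "fra": {"dessin": ["brouillon", "croquis"]},
}


class ToyWordSwap:
    """Hypothetical stand-in for a synonym-based word-swap transformation."""

    def __init__(self, language="eng"):
        self.language = language

    def candidates(self, words, index):
        """Return copies of `words` with words[index] replaced by each synonym."""
        table = TOY_SYNONYMS.get(self.language, {})
        replacements = table.get(words[index].lower(), [])
        return [words[:index] + [r] + words[index + 1:] for r in replacements]


swap = ToyWordSwap()  # defaults to English, like the recipe
words = "Un dessin animé qui brille".split()

swap.language = "eng"
english_candidates = swap.candidates(words, 1)  # no English entry for "dessin"

swap.language = "fra"
french_candidates = [" ".join(c) for c in swap.candidates(words, 1)]
print(english_candidates)
print(french_candidates)
```

With the English table the French word has no candidates, so the search method would have nothing to try; switching the language is what gives the attack usable substitutions.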


All model checkpoint layers were used when initializing TFCamembertForSequenceClassification.

All the layers of TFCamembertForSequenceClassification were initialized from the model checkpoint at tblard/tf-allocine.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFCamembertForSequenceClassification for predictions without further training.
textattack: Unknown if model of class <class 'transformers.pipelines.text_classification.TextClassificationPipeline'> compatible with goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'>.
Reusing dataset allocine_dataset (/root/.cache/huggingface/datasets/allocine_dataset/allocine/1.0.0/bbee2ebb45a067891973b91ebdd40a93598d1e2dd5710b6714cdc2cd81d0ed65)
textattack: Loading datasets dataset allocine, split test.

Attack(
  (search_method): GreedyWordSwapWIR(
    (wir_method):  weighted-saliency
  )
  (goal_function):  UntargetedClassification
  (transformation):  WordSwapWordNet
  (constraints): 
    (0): RepeatModification
    (1): StopwordModification
  (is_black_box):  True
) 




--------------------------------------------- Result 1 ---------------------------------------------
Positive (100%) --> [FAILED]

Magnifique épopée, une belle histoire, touchante avec des acteurs qui interprètent très bien leur rôles (Mel Gibson, Heath Ledger, Jason Isaacs...), le genre de film qui se savoure en famille! :)





--------------------------------------------- Result 2 ---------------------------------------------
Negative (94%) --> Positive (91%)

Je n'ai pas aimé mais pourtant je lui mets [[2]] étoiles car l'expérience est louable. Rien de conventionnel ici. Une visite E.T. mais jonchée d'idées /- originales. Le soucis, tout ceci avait-il vraiment sa place dans un film de S.F. tirant sur l'horreur ? Voici un film qui, à l'inverse de tant d'autres qui y ont droit, mériterait peut-être un remake.

Je n'ai pas aimé mais pourtant je lui mets [[4]] étoiles car l'expérience est louable. Rien de conventionnel ici. Une visite E.T. mais jonchée d'idées /- originales. Le soucis, tout ceci avait-il vraiment sa place dans un film de S.F. tirant sur l'horreur ? Voici un film qui, à l'inverse de tant d'autres qui y ont droit, mériterait peut-être un remake.





--------------------------------------------- Result 3 ---------------------------------------------
Positive (85%) --> Negative (91%)

Un [[dessin]] animé qui brille par sa féerie et ses chansons.

Un [[brouillon]] animé qui brille par sa féerie et ses chansons.





--------------------------------------------- Result 4 ---------------------------------------------
Negative (100%) --> Positive (80%)

[[Si]] c'est là le renouveau du cinéma français, c'est tout [[de]] même foutrement chiant. [[Si]] l'objet est [[très]] stylisé et la tension palpable, le film paraît [[plutôt]] [[creux]].

[[aussi]] c'est là le renouveau du cinéma français, c'est tout [[abolir]] même foutrement chiant. [[tellement]] l'objet est [[prodigieusement]] stylisé et la tension palpable, le film paraît [[peu]] [[trou]].





[Succeeded / Failed / Total] 3 / 2 / 5: 100%|██████████| 5/5 [04:09<00:00, 49.95s/it]

--------------------------------------------- Result 5 ---------------------------------------------
Negative (100%) --> [FAILED]

Et pourtant on s’en Doutait !Second volet très mauvais, sans fraîcheur et particulièrement lourdingue. Quel dommage.



+-------------------------------+--------+
| Attack Results                |        |
+-------------------------------+--------+
| Number of successful attacks: | 3      |
| Number of failed attacks:     | 2      |
| Number of skipped attacks:    | 0      |
| Original accuracy:            | 100.0% |
| Accuracy under attack:        | 40.0%  |
| Attack success rate:          | 60.0%  |
| Average perturbed word %:     | 10.72% |
| Average num. words per input: | 29.4   |
| Avg num queries:              | 324.6  |
+-------------------------------+--------+
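
The three percentage rows in the table follow from the success, failure, and skip counts. The sketch below reproduces them under the assumption that skipped examples are those the model already misclassified before any attack; `attack_summary` is a hypothetical helper, not part of TextAttack.

```python
def attack_summary(succeeded, failed, skipped):
    """Recompute the percentage rows of the results table from raw counts."""
    total = succeeded + failed + skipped
    attacked = succeeded + failed  # examples the model got right originally
    return {
        "original_accuracy": 100.0 * attacked / total,
        "accuracy_under_attack": 100.0 * failed / total,
        "attack_success_rate": 100.0 * succeeded / attacked,
    }


summary = attack_summary(succeeded=3, failed=2, skipped=0)
print(summary)
```

For this run the counts give 100.0% original accuracy, 40.0% accuracy under attack, and a 60.0% attack success rate, matching the table. With only five examples, each percentage moves in steps of 20 points, so these figures are illustrative rather than a robust estimate.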



