
Implementing Attacker class #70

Open
rajaswa opened this issue May 21, 2020 · 16 comments
Labels
enhancement New feature or request help wanted Extra attention is needed Priority: High

Comments

@rajaswa
Member

rajaswa commented May 21, 2020

  • An attacker should take the pre-trained model, the original dataset, the adversarial dataset (made with transforms), and a list of criteria to monitor (like accuracy, loss, etc.) [all of these as PyTorch entities]
  • The attacker should have an .attack() method; when called, the model runs inference over each sample from the given dataset and its adversarial counterpart in the adversarial dataset.
  • Finally, it should report things like the performance difference due to the attacks, the worst-hit attacks, the best-hit attacks, etc. (in terms of the given criteria)
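A minimal, framework-agnostic sketch of what such a class could look like (the names `Attacker`, `attack`, `get_criterion_logs`, and the log format are assumptions for illustration, not a settled API):

```python
# Hypothetical sketch of the proposed Attacker class; names and
# signatures are assumptions, not a settled API.
class Attacker:
    def __init__(self, model, dataset, adversarial_dataset, criteria):
        self.model = model                              # pre-trained model (any callable)
        self.dataset = dataset                          # original samples as (x, label) pairs
        self.adversarial_dataset = adversarial_dataset  # transformed counterparts
        self.criteria = criteria                        # e.g. ['accuracy', 'F1', 'BCELoss']
        self.logs = []

    def attack(self):
        """Run inference over each sample and its adversarial counterpart."""
        self.logs = [
            {"original": self.model(x), "adversarial": self.model(x_adv), "label": y}
            for (x, y), (x_adv, _) in zip(self.dataset, self.adversarial_dataset)
        ]
        return self.logs

    def get_criterion_logs(self):
        """Report, e.g., the accuracy drop caused by the attack."""
        n = len(self.logs)
        clean = sum(log["original"] == log["label"] for log in self.logs) / n
        adv = sum(log["adversarial"] == log["label"] for log in self.logs) / n
        return {"accuracy": clean, "adversarial_accuracy": adv, "drop": clean - adv}
```

A real implementation would wrap the model calls in `torch.no_grad()` and compute all the requested criteria, but the control flow (paired iteration, per-sample logs, aggregate report) would stay the same.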
@rajaswa rajaswa added enhancement New feature or request help wanted Extra attention is needed Priority: High labels May 21, 2020
@abheesht17
Contributor

Sounds interesting 🤩

@rajaswa
Member Author

rajaswa commented May 21, 2020

The ultimate usage can be something like this:

from decepticonlp import attacker
from decepticonlp.transforms import transforms
import torch
from torch.utils.data import Dataset

# define adversarial transforms
tfms = transforms.Compose(
        [
            transforms.AddChar(),
            transforms.ShuffleChar("RandomWordExtractor", True),
            transforms.VisuallySimilarChar(),
            transforms.TypoChar("RandomWordExtractor", probability=0.5),
        ]
    )

# Original dataset
class IMDB_Dataset(Dataset):
    def __init__(self):
        pass    # some code

    def __len__(self):
        pass    # some code

    def __getitem__(self, idx):
        text = data['review_text'][idx]    # get text from sample
        embeddings = getWordEmbeddings(text)    # convert to a sequence of word embeddings
        label = torch.tensor(data['sentiment'][idx])    # sentiment label
        return embeddings, label

# Adversarial dataset
class IMDB_Adversarial_Dataset(Dataset):
    def __init__(self):
        pass    # some code

    def __len__(self):
        pass    # some code

    def __getitem__(self, idx):
        text = data['review_text'][idx]    # get text from sample
        adversarial_text = tfms(text)    # apply adversarial transform
        embeddings = getWordEmbeddings(adversarial_text)    # convert to a sequence of word embeddings
        label = torch.tensor(data['sentiment'][idx])    # sentiment label
        return embeddings, label

# Load pre-trained model
imdb_classifier = torch.load("IMDB_Classifier.pth")
imdb_classifier.eval()

# Set up the attacker
IMDB_attacker = attacker.Attacker()
IMDB_attacker.model = imdb_classifier
IMDB_attacker.dataset = IMDB_Dataset()
IMDB_attacker.adversarial_dataset = IMDB_Adversarial_Dataset()
IMDB_attacker.criterion = ['accuracy', 'F1', 'BCELoss']

# Attack and get logs
IMDB_attacker.attack()
IMDB_attacker.get_criterion_logs()
IMDB_attacker.show_best_attacks()
IMDB_attacker.show_worst_attacks()

#Maybe more functionalities?

@rajaswa
Member Author

rajaswa commented May 21, 2020

This will be a multi-step process; avoid limiting yourself to the example functionalities mentioned above, and please feel free to discuss more functionalities.

@Sharad24
Member

Sharad24 commented May 21, 2020 via email

@abheesht17
Contributor

Maybe we can also add functionality to draw graphs for loss, accuracy, etc., for both the original dataset and the adversarial dataset. More of a utility function than a necessary one, though.
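Such a utility could be sketched like this (the function name, the list-of-values log format, and the output path are all assumptions):

```python
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

def plot_criterion(original_values, adversarial_values, criterion_name, path):
    """Plot a monitored criterion on the original vs. adversarial dataset."""
    plt.figure()
    plt.plot(original_values, label="original")
    plt.plot(adversarial_values, label="adversarial")
    plt.xlabel("batch")
    plt.ylabel(criterion_name)
    plt.legend()
    plt.savefig(path)
    plt.close()
```

For example, `plot_criterion([0.9, 0.91], [0.6, 0.55], "accuracy", "acc.png")` would write a two-line comparison plot to `acc.png`.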

@Sharad24
Member

Sharad24 commented May 21, 2020 via email

@rajaswa
Member Author

rajaswa commented May 21, 2020

Which one seems the better option:

  1. Implement everything in a crude way, as shown in the example above, and then add a logger, huggingface/nlp, and other enhancements slowly over time
  2. Start the implementation with all these things taken into consideration beforehand

@Sharad24
Member

To me, definitely point 1

@Sharad24
Member

You can make a checklist here maybe with the suggestions here to keep track of the enhancements.

@parantak
Contributor

I am not sure if this would be feasible, but we could introduce an option for the different available embeddings. The attacker should have that option, if possible, depending on their model.

@abheesht17
Contributor

The user will define it themselves in their Dataset class; we don't have to worry about which embedding they have used.

@abheesht17
Contributor

What should I check in the test function of the Attacker class?

@Sharad24
Member

Sharad24 commented May 22, 2020 via email

@rajaswa
Member Author

rajaswa commented May 22, 2020

@Sharad24 can you provide some reference examples for these 'integration' type tests?
Maybe from GenRL?

@Sharad24
Member

Sharad24 commented May 22, 2020

Hmm, the integration tests in GenRL are not that good and are sort of brittle as of yet. Try reading this: https://www.fullstackpython.com/integration-testing.html

The goal is to verify that the individual units used inside the attacker (in our case the different transforms, etc.) work together through one API as intended. You don't really have to check the output from each individual unit for every case, as that is already covered by their unit tests. Here, we only test how well the objects work together and whether there are any brittle points in their interfacing.

Although, if there are methods in the Attacker class that work as individual units, there should be unit tests for them.
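As a concrete illustration, an integration test for the attacker could wire toy versions of all the pieces (a transform, paired datasets, a stub model) through the full flow end to end; everything here (`ToyDataset`, `reverse_transform`, the stub model) is a made-up stand-in, not decepticonlp code:

```python
# Hypothetical end-to-end test: checks that transform, datasets, and
# model interoperate, not the correctness of any single unit.

def reverse_transform(text):
    """Toy stand-in for an adversarial transform."""
    return text[::-1]

class ToyDataset:
    def __init__(self, samples, transform=None):
        self.samples = samples
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        text, label = self.samples[idx]
        if self.transform:
            text = self.transform(text)
        return text, label

def test_attacker_end_to_end():
    samples = [("good movie", 1), ("bad movie", 0)]
    model = lambda text: 1 if "good" in text else 0  # stub classifier
    original = ToyDataset(samples)
    adversarial = ToyDataset(samples, transform=reverse_transform)
    # run inference over paired samples, as Attacker.attack() would
    logs = [(model(x), model(x_adv), y)
            for (x, y), (x_adv, _) in zip(original, adversarial)]
    assert all(pred == y for pred, _, y in logs)          # clean predictions intact
    assert any(pred_adv != y for _, pred_adv, y in logs)  # attack flips at least one
```

The unit tests for each transform stay where they are; this test only exercises the interfacing between the pieces.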

@abheesht17
Contributor

Thanks! Will do
