hard label classification #635

cogeid · 2022-04-13T18:58:11Z

What does this PR do?

The previous pull request was accidentally closed and could not be reopened because the original repo was deleted, so I created a new one with the same changes.

This PR adds a new Goal Function, called "HardLabelClassification", which finds the maximum semantic similarity between two pieces of text such that the generated text is outside of the target model's decision boundary.

Summary

This PR adds the Hard Label Classification goal function, which finds the maximum semantic similarity between two pieces of text such that the generated text is outside of the target model's decision boundary. In an example use case, the user would be able to specify "goal-function hard-label-classification". The implementation of the goal function is based on the paper and its corresponding implementation, but only the goal function is being implemented as part of TextAttack.

Additions

Added a new Goal Function file in the classification folder called "hardlabel_classification.py"
Specified the new objective function for the Goal Function in the file
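
For readers unfamiliar with the hard-label setting, here is a minimal, self-contained sketch of the objective described in the summary. It is not the code in this PR and does not use the TextAttack API: the class `HardLabelGoalSketch`, its method names, the `token_overlap` similarity, and `toy_classifier` are all hypothetical stand-ins chosen for illustration. The idea it shows is that the attacker sees only the predicted label, a candidate counts as successful once that label differs from the ground truth, and successful candidates are ranked by similarity to the original text.

```python
from typing import Callable, Iterable, Optional, Tuple


def token_overlap(a: str, b: str) -> float:
    """Toy stand-in for a semantic similarity model (Jaccard overlap of tokens)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


class HardLabelGoalSketch:
    """Hypothetical illustration of a hard-label goal; not the PR's class."""

    def __init__(
        self,
        predict_label: Callable[[str], int],
        similarity: Callable[[str, str], float] = token_overlap,
    ) -> None:
        # Only the predicted label is available, never scores or logits.
        self.predict_label = predict_label
        self.similarity = similarity

    def is_goal_complete(self, candidate: str, true_label: int) -> bool:
        # Success means the candidate lies outside the decision boundary,
        # i.e. the model no longer predicts the ground-truth label.
        return self.predict_label(candidate) != true_label

    def get_score(self, original: str, candidate: str, true_label: int) -> float:
        # Candidates that fail to flip the label score zero; the rest are
        # scored by how similar they remain to the original text.
        if not self.is_goal_complete(candidate, true_label):
            return 0.0
        return self.similarity(original, candidate)

    def best_candidate(
        self, original: str, candidates: Iterable[str], true_label: int
    ) -> Optional[Tuple[str, float]]:
        # Keep only label-flipping candidates, then take the most similar one.
        scored = [
            (c, self.get_score(original, c, true_label))
            for c in candidates
            if self.is_goal_complete(c, true_label)
        ]
        return max(scored, key=lambda pair: pair[1]) if scored else None


def toy_classifier(text: str) -> int:
    """Assumed toy model: label 1 if the token 'good' appears, else 0."""
    return 1 if "good" in text.lower().split() else 0


goal = HardLabelGoalSketch(toy_classifier)
original = "the movie was good"
candidates = ["the movie was really good", "the movie was fine", "terrible"]
best = goal.best_candidate(original, candidates, true_label=1)
print(best)
```

In this toy run, the first candidate keeps the original label and is rejected, while the other two flip it; the one that shares more tokens with the original is preferred. A real setup would replace `token_overlap` with a sentence-encoder similarity and `toy_classifier` with the target model.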

Changes

The "hardlabel-classification" attack argument was created for hard label attacks.

Deletions

There were no deletions made for this PR.

Checklist

The title of your pull request should be a summary of its contribution.
Please write detailed description of what parts have been newly added and what parts have been modified. Please also explain why certain changes were made.
If your pull request addresses an issue, please mention the issue number in the pull request description to make sure they are linked (and people consulting the issue know you are working on it)
To indicate a work in progress please mark it as a draft on Github.
[ ] Make sure existing tests pass.
[ ] Add relevant tests. No quality testing = no merge.
[ ] All public methods must have informative docstrings that work nicely with sphinx. For new modules/files, please add/modify the appropriate .rst file in TextAttack/docs/apidoc.

jxmorris12

@cogeid can you run the formatter (make format) and push? The recipe looks great. Once the checks show up, we can merge.

jxmorris12 · 2022-06-09T22:03:11Z

textattack/goal_functions/classification/hardlabel_classification.py

@@ -0,0 +1,39 @@
+"""
+Determine if an attack has been successful in Hard Label Classficiation.


typo: Classification

jxmorris12 · 2022-06-29T19:38:52Z

@cogeid - everything should work once you format the code, fix the typo, and push again! Please let me know if you're interested in finishing this

hard label classification

4854eb7

jxmorris12 approved these changes Jun 9, 2022


make format

87c4671

qiyanjun merged commit f848247 into master Sep 11, 2023
3 of 5 checks passed
