Enhance augment function #531
@Hanyu-Liu-123 please consider using or improving the newly added metric module #514 to check the quality of the augmented samples...
I'll look into the metric module. I think the USE and perplexity metrics can be really helpful if implemented in the augmenter.
Hi @alexander-zap , this pull request is a truncated version of https://gitlab.com/taforkacc/textattack. Because the original pull request requires extensive structure changes, we were unable to incorporate your full addition at this moment, but would definitely like to include the At your convenience, could you review this pull request that adds the
@Hanyu-Liu-123 please check out https://github.com/QData/TextAttack/blob/master/textattack/attack_results/attack_result.py to figure out how to use the metric module
@Hanyu-Liu-123 please also add a test func in the test_augment_api
@Hanyu-Liu-123 I reviewed the code. The usage of
Thank you so much!
@Hanyu-Liu-123 please add docstring and testing code. Then it is ready to merge!
@qiyanjun Added the docstrings! Here's a sample output when running in interactive mode:
Slides explaining the limitation of fast-augment
high_yield: Whether to return a set of augmented texts that will be relatively similar, or to return only a
    single one.
fast_augment: Stops additional transformation runs when number of successful augmentations reaches
    transformations_per_example
"""
@Hanyu-Liu-123 I am a bit confused by the fast_augment tag...
- If we already have the transformation_per_example argument, what is the purpose of fast_augment?
- When fast_augment = true, we will generate more examples than the transformation_per_example argument??
text
for text in transformed_texts
if len(text.attack_attrs["modified_indices"])
>= num_words_to_swap
@Hanyu-Liu-123 I am confused by the num_words_to_swap use here... Does this specify lower_bound or upper_bound?
num_words_to_swap is a lower bound.
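To make the lower-bound reading concrete, here is a minimal sketch of the filter from the diff above. It is not the PR's code: `FakeText` is a hypothetical stand-in for the transformed text objects, and `filter_by_min_swaps` is an illustrative helper name. Only the `attack_attrs["modified_indices"]` lookup and the `>= num_words_to_swap` comparison come from the snippet under review.

```python
from dataclasses import dataclass, field


@dataclass
class FakeText:
    """Hypothetical stand-in for a transformed text object."""
    text: str
    attack_attrs: dict = field(default_factory=dict)


def filter_by_min_swaps(transformed_texts, num_words_to_swap):
    """Keep texts whose number of modified words is at least num_words_to_swap."""
    return [
        text
        for text in transformed_texts
        if len(text.attack_attrs["modified_indices"]) >= num_words_to_swap
    ]


# Three candidates with 1, 2 and 3 modified word positions.
candidates = [
    FakeText("one swap", {"modified_indices": {0}}),
    FakeText("two swaps", {"modified_indices": {0, 1}}),
    FakeText("three swaps", {"modified_indices": {0, 1, 2}}),
]
kept = [t.text for t in filter_by_min_swaps(candidates, num_words_to_swap=2)]
```

With `num_words_to_swap=2`, the candidate with a single modified word is dropped, while candidates with two or more are kept, so the value acts as a minimum rather than a cap.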
What does this PR do?
Summary
This PR introduces 2 new augmenter parameters, high_yield and fast_augment. The high_yield option was originally implemented in pull request #507 that still requires additional implementation before merging.
When high_yield is set to True, every augmentation that fits the criteria of a successful transformation will be added to the final output. In most cases, the high-yield augmenter will generate far more augmentations than what users specify in transformations_per_example.
When fast_augment is set to True, the augmenter terminates and returns transformations_per_example transformations as soon as the number of successful augmentations reaches transformations_per_example.
This improves the augmenter's running time through early stopping, but may skew the returned augmentations.
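The difference between the two flags can be sketched with a toy loop. This is an illustration under stated assumptions, not the augmenter's implementation: `run_augmenter` and the `rounds` input (one list of successful augmentations per transformation run) are hypothetical, the default mode here simply truncates after all runs finish, and the behaviour when both flags are set is an assumption (early stop wins in this sketch).

```python
def run_augmenter(rounds, transformations_per_example,
                  high_yield=False, fast_augment=False):
    """Toy model of the two flags.

    rounds: hypothetical list of lists, one list of successful
    augmentations per transformation run.
    Returns (augmentations, number_of_runs_executed).
    """
    successes = []
    runs = 0
    for round_outputs in rounds:
        runs += 1
        for text in round_outputs:
            if text not in successes:  # skip duplicates
                successes.append(text)
        # fast_augment: stop as soon as enough successes were collected.
        if fast_augment and len(successes) >= transformations_per_example:
            return successes[:transformations_per_example], runs
    # high_yield: return every successful augmentation.
    if high_yield:
        return successes, runs
    # Default in this sketch: truncate after all runs have finished.
    return successes[:transformations_per_example], runs


rounds = [["a1", "a2"], ["a3"], ["a4", "a5"]]
fast_result = run_augmenter(rounds, 2, fast_augment=True)
high_result = run_augmenter(rounds, 2, high_yield=True)
default_result = run_augmenter(rounds, 2)
```

In this toy run, the fast variant stops after the first run and only ever sees that run's outputs, which is one way the skew mentioned above can arise. The high-yield variant executes all three runs and returns all five augmentations.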
Additions
high_yield and fast_augment parameters in augmenter
Changes
Checklist
.rst file in TextAttack/docs/apidoc.