@sanchit97 This looks awesome! Thank you for the hard work! This will be a great addition to TextAttack.
One (just one, I promise!) concern I have is the perplexity numbers in the test output. They look a little too high (200-300, 1000-2000), though I can't see any issue in the code. One reason might be that we're just using poor samples for testing, but maybe we can run a sanity check on a larger number of samples (e.g. 1,000 from IMDB, Yelp, etc.) just to make sure.
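As a quick plausibility check on the scale of those numbers: perplexity is just the exponential of the mean per-token negative log-likelihood, so a model that assigned uniform probability over GPT-2's 50,257-token vocabulary would score exactly 50,257. A minimal sketch (the function name is illustrative):

```python
import math

def perplexity_from_nlls(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(token_nlls) / len(token_nlls))
```

So values in the 200-2000 range are not impossible for short, perturbed text, but comparing against clean IMDB/Yelp samples would tell us whether they are reasonable.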
Otherwise, most of my comments are minor ones about the docstrings. I'm going to approve the PR since the code and design look good! 🤗
Lastly, here are some suggestions for how we could improve the Metrics module in the future (not necessary for this PR).
Batched passes with GPT-2: I know batching with GPT-2 is tricky because it has no padding token, but we can manually set the padding token to the EOS token and then use attention masking to make sure the padded positions are ignored. We also need to set the labels at those padding positions to -100 when calculating the loss. This will speed up the perplexity metric a lot.
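A minimal sketch of the batched pass described above, assuming a `transformers`-style GPT-2 (the helper name is illustrative; in real use you would also set `tokenizer.pad_token = tokenizer.eos_token` and tokenize with `padding=True`):

```python
import torch

def batched_perplexity(model, input_ids, attention_mask):
    """Per-sequence perplexity; padded positions are masked out of the loss."""
    # Positions where attention_mask == 0 are padding: label them -100 so
    # cross_entropy ignores them.
    labels = input_ids.masked_fill(attention_mask == 0, -100)
    with torch.no_grad():
        logits = model(input_ids, attention_mask=attention_mask).logits
    # Shift so position t predicts token t+1.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    nll = torch.nn.functional.cross_entropy(
        shift_logits.transpose(1, 2),  # (batch, vocab, seq-1)
        shift_labels,
        ignore_index=-100,
        reduction="none",
    )
    n_tokens = (shift_labels != -100).sum(dim=1).clamp(min=1)
    return torch.exp(nll.sum(dim=1) / n_tokens)
```

Because padding is excluded from both attention and the loss, each sequence's perplexity should match what an unbatched pass would produce.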
Global object sharing for GPT-2 and USE: note that we also use GPT-2 and USE as part of our constraints, so it's possible that we have already loaded the model onto our GPUs when running the attacks. It would be nice if we could reuse those instances for metrics without instantiating new models (it's very easy to hit GPU OOM otherwise). One solution is to create a GLOBAL_SHARED_OBJECTS dictionary in shared/utils, give it global scope, insert the GPT-2 and USE models into it the first time they are instantiated, and reuse them in other modules afterwards.
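The sharing scheme above could look something like this (a hypothetical sketch; `GLOBAL_SHARED_OBJECTS` is the name suggested in the comment, while `get_shared` is an illustrative helper):

```python
# Would live in something like textattack/shared/utils.py.
GLOBAL_SHARED_OBJECTS = {}

def get_shared(key, loader):
    """Return the object cached under `key`, calling `loader()` only on first use."""
    if key not in GLOBAL_SHARED_OBJECTS:
        GLOBAL_SHARED_OBJECTS[key] = loader()
    return GLOBAL_SHARED_OBJECTS[key]

# Example: a constraint and a metric could then fetch the same instance, e.g.
#   lm = get_shared("gpt2-lm", lambda: GPT2LMHeadModel.from_pretrained("gpt2"))
```

Passing a zero-argument loader keeps the expensive `from_pretrained` call lazy, so nothing is loaded onto the GPU until some module actually asks for it.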
Originally posted by @jinyongyoo in #514 (review)