
Has ContextualWordEmbsAug gone slow? #248

Closed
rajat-tech-002 opened this issue Oct 28, 2021 · 8 comments

Comments

@rajat-tech-002

I have used ContextualWordEmbsAug on larger datasets before. Has the code been significantly affected by some update? The augmenter is much slower now.

@050644zf

I've been using this augmenter recently too, and it takes 90 seconds to process a single sentence on my machine. It still takes several seconds on Colab.

@rajat-tech-002
Author

@050644zf, so you also faced the same issue?

@050644zf

050644zf commented Nov 1, 2021

@050644zf, so you also faced the same issue?

Yes, here is a testing notebook. You can see it still takes 2.5 seconds to process a single sentence.

@makcedward
Owner

@rajat-tech-002
May I know which nlpaug version you are using? The speed of ContextualWordEmbsAug should have improved after it was tuned to feed multiple inputs to the transformer models at once.
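
For reference, one quick way to check the installed version (a minimal sketch, assuming Python 3.8+ so importlib.metadata is available):

```python
# Print the installed nlpaug version (Python 3.8+).
from importlib.metadata import version

print(version("nlpaug"))
```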

@makcedward
Owner

makcedward commented Nov 21, 2021

Tested versions 1.1.5 through 1.1.8.

Performance is degraded by about 23% (13.3s vs 16.4s) for 100 identical inputs.
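
Roughly, a timing comparison like this can be reproduced with something along these lines (a sketch only: the model name and sentence are placeholders, not the exact benchmark script):

```python
import time

import nlpaug.augmenter.word as naw

# Contextual word substitution backed by a masked language model.
aug = naw.ContextualWordEmbsAug(
    model_path="bert-base-uncased",  # placeholder model
    action="substitute",
)

text = "The quick brown fox jumps over the lazy dog."  # placeholder input

start = time.time()
for _ in range(100):  # 100 identical inputs, as in the comparison above
    aug.augment(text)
print(f"elapsed: {time.time() - start:.1f}s")
```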

@rajat-tech-002
Author

@rajat-tech-002 May I know which nlpaug version you are using? The speed of ContextualWordEmbsAug should have improved after it was tuned to feed multiple inputs to the transformer models at once.

I am using the latest version.

@rajat-tech-002
Author

Please close the issue.
I think the augmenter is simply very slow on longer texts, e.g. the Yahoo dataset with an average sentence length of 127: around 300 seconds for 400 sentences on an A100 GPU. It takes even longer with aug_max = None and aug_p = 0.3 (30%).
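
For context, the configuration being described is roughly the following (a sketch; only aug_p and aug_max come from the numbers above, the rest are placeholders):

```python
import nlpaug.augmenter.word as naw

# With aug_max=None there is no cap on the number of augmented tokens,
# so aug_p=0.3 means ~30% of the tokens in each (long) sentence are
# masked and predicted, multiplying the number of forward passes.
aug = naw.ContextualWordEmbsAug(
    model_path="bert-base-uncased",  # placeholder model
    action="substitute",
    aug_p=0.3,
    aug_max=None,
    device="cuda",  # assumed, since the timing above is on an A100 GPU
)
```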

@makcedward
Owner

makcedward commented Nov 21, 2021

Longer sentences affect performance because the transformer has to process more actual text rather than padding. Secondly, a higher augmentation percentage hurts performance even more.

More technical details:
For example, if 10 tokens are to be augmented, the input needs to pass through the transformer 10 times rather than once. By design of BERT (and other masked language models), masked language modeling predicts one token at a time.
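
A simplified illustration of why the cost scales with the number of augmented tokens (not nlpaug's actual code, just the general masked-LM pattern, sketched with the Hugging Face fill-mask pipeline):

```python
from transformers import pipeline

# Masked language modeling predicts one [MASK] position per forward pass,
# so augmenting N tokens requires N passes through the model.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

tokens = "the quick brown fox jumps over the lazy dog".split()
positions_to_augment = [1, 3, 6]  # e.g. tokens selected by aug_p

for pos in positions_to_augment:
    masked = tokens.copy()
    masked[pos] = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT
    best = fill_mask(" ".join(masked))[0]         # one model pass per token
    tokens[pos] = best["token_str"]

print(" ".join(tokens))
```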

However, I just noticed there is a major performance downgrade from 1.1.3 to 1.1.4: time spent increased from 1s to 9s. The major change was adopting the Hugging Face API rather than my custom implementation. More tests need to be conducted to identify the root cause.
