BERTAug affects Proper Noun #2

Closed
memahesh opened this issue Jun 4, 2019 · 3 comments
Labels
bug Something isn't working

Comments

memahesh commented Jun 4, 2019

Hi,

I have been using your BERTAug for text augmentation. It works fine on a lot of tasks, but it messes up proper nouns.

image

Is there any fix for this?

Owner

makcedward commented Jun 5, 2019

@memahesh
There is a bug. The expected output should join the tokenizer's word pieces back into whole words, so it should not include "#". It will be fixed in the next release.
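
As a rough illustration of what joining the tokenizer pieces means, here is a minimal sketch. It assumes BERT-style WordPiece tokens, where a continuation piece carries a "##" prefix. `merge_wordpieces` and the sample token list are hypothetical and are not part of nlpaug.

```python
def merge_wordpieces(tokens):
    """Join WordPiece tokens into whole words (hypothetical helper, not nlpaug API)."""
    words = []
    for token in tokens:
        if token.startswith("##") and words:
            # Continuation piece: drop the "##" marker and glue it to the previous word.
            words[-1] += token[2:]
        else:
            # Start of a new word (or a stray continuation piece with nothing before it).
            words.append(token)
    return words


# Assumed example split; the real tokenizer may split the name differently.
pieces = ["ma", "##hes", "##h", "is", "a", "good", "person"]
print(" ".join(merge_wordpieces(pieces)))  # mahesh is a good person
```

Without this step, the raw pieces leak into the output and the "#" markers show up in the augmented text.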

In the proper noun case, the augmenter treats the name as a general word. In other words, "Mahesh" is treated as an ordinary word and may be replaced by other words. Here is one possible outcome:
bhesh is a good person
If you prefer to preserve the proper noun, you can include it in a stopwords list and pass that list to the augmenter. Here is an example:

import nlpaug.augmenter.word as naw
from nlpaug.util import Action
stopwords = ['mahesh']
aug = naw.BertAug(action=Action.SUBSTITUTE, stopwords=stopwords)
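
To show the idea behind the stopwords option, here is a minimal, library-free sketch. It is not nlpaug's implementation: `substitute_words`, `replace_fn`, `aug_p`, and `seed` are hypothetical names, `replace_fn` stands in for the BERT prediction, and the case-insensitive stopword match is an assumption.

```python
import random


def substitute_words(text, replace_fn, stopwords=(), aug_p=0.3, seed=None):
    """Replace a random subset of words, never touching stopwords (hypothetical sketch)."""
    rng = random.Random(seed)
    # Assumption: stopwords are compared case-insensitively.
    protected = {w.lower() for w in stopwords}
    words = text.split()
    # Only words outside the stopword list are candidates for replacement.
    candidates = [i for i, w in enumerate(words) if w.lower() not in protected]
    if not candidates:
        return text
    # Augment at least one word, and never more than the number of candidates.
    n_aug = min(len(candidates), max(1, int(len(candidates) * aug_p)))
    for i in rng.sample(candidates, n_aug):
        words[i] = replace_fn(words[i])
    return " ".join(words)


# str.upper is a stand-in for the model's substitution, so the effect is easy to see.
augmented = substitute_words(
    "Mahesh is a good person", str.upper, stopwords=["mahesh"], aug_p=1.0
)
print(augmented)  # Mahesh IS A GOOD PERSON
```

Every other word can change, but the protected name is passed through untouched.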

makcedward added the bug label Jun 5, 2019
@makcedward
Owner

Fixed in the BETA release. You may try again. The fix will be included in the 0.0.4 release.

@anvitha-jain

AttributeError: module 'nlpaug.augmenter.word' has no attribute 'BertAug'
Why?
