When using Character Augmenters (random and keyboard, specifically) on French utterances, I noticed two things:
When there is a space before punctuation (non-breaking space, as described here), it is removed by the augmenter.
The augmenter adds space before and after an apostrophe.
It seems like both of these would be unwanted behaviors, as ideally the augmenter would only make the change specified in the docs, and not change anything else.
For example, when I run this:
nlpaug.augmenter.char.KeyboardAug(min_char=4, aug_word_max=1, aug_char_p=0.1).augment("un espace avant le point d'interrogation ?", n=1)
I get this:
"un esoace avant le point d ' interrogation?"
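To make the two side effects concrete, here is a small check that compares the spacing around each punctuation mark before and after augmentation. It is only a sketch: `spacing_around_punctuation` is a hypothetical helper, not part of nlpaug, and the two strings are copied from the example above.

```python
import string


def spacing_around_punctuation(text):
    """Return (char, space_before, space_after) for each punctuation mark."""
    marks = []
    for i, ch in enumerate(text):
        if ch in string.punctuation:
            # str.isspace() is also True for a non-breaking space (U+00A0).
            before = i > 0 and text[i - 1].isspace()
            after = i + 1 < len(text) and text[i + 1].isspace()
            marks.append((ch, before, after))
    return marks


original = "un espace avant le point d'interrogation ?"
augmented = "un esoace avant le point d ' interrogation?"

print(spacing_around_punctuation(original))
# [("'", False, False), ('?', True, False)]
print(spacing_around_punctuation(augmented))
# [("'", True, True), ('?', False, False)]
```

The apostrophe gains a space on both sides and the question mark loses the one before it, even though only "espace" was supposed to change.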
It seems there is a general problem with the char augmenters whenever certain punctuation chars are provided. The following is annoying: string.punctuation
result: !"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~
And this is what happens when applying one of the noted char augs (with nlpaug.augmenter.char imported as nac): nac.RandomCharAug(action="insert").augment(string.punctuation)
result: ! " # $% & \' () * +, -. /: ; <= >? @ [\\] ^ _ {|} ~`
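For comparison, here is a minimal sketch of a character substitution that cannot alter spacing or punctuation. It splits on whitespace with a capturing group, so the whitespace runs are kept and joining the parts reproduces the layout exactly. This is not how nlpaug is implemented, and I have not checked where in nlpaug the spacing gets lost. `augment_preserving_layout` and its `min_char` and `seed` parameters are hypothetical names used for illustration only.

```python
import random
import re
import string


def augment_preserving_layout(text, min_char=4, seed=0):
    """Substitute one letter in one purely alphabetic word; leave the rest as-is."""
    rng = random.Random(seed)
    # The capturing group keeps the whitespace runs, so "".join(parts) == text.
    parts = re.split(r"(\s+)", text)
    # Only purely alphabetic tokens of at least min_char characters are eligible,
    # so tokens such as "d'interrogation" or "?" are never modified.
    candidates = [i for i, p in enumerate(parts) if p.isalpha() and len(p) >= min_char]
    if candidates:
        i = rng.choice(candidates)
        word = parts[i]
        pos = rng.randrange(len(word))
        # The replacement may happen to equal the original letter.
        replacement = rng.choice(string.ascii_lowercase)
        parts[i] = word[:pos] + replacement + word[pos + 1:]
    return "".join(parts)


sentence = "un espace avant le point d'interrogation ?"
print(augment_preserving_layout(sentence))
# string.punctuation has no alphabetic token, so it comes back unchanged.
print(augment_preserving_layout(string.punctuation) == string.punctuation)
```

Skipping tokens that contain punctuation is a deliberate simplification here; a fuller version would augment the letters inside such tokens while still leaving the punctuation in place.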