In [1]:
! pip install nlpaug fairseq

Collecting nlpaug
  Downloading nlpaug-1.1.10-py3-none-any.whl (410 kB)
Collecting fairseq
  Downloading fairseq-0.10.2-cp37-cp37m-manylinux1_x86_64.whl (1.7 MB)
Collecting hydra-core
  Downloading hydra_core-1.1.1-py3-none-any.whl (145 kB)
Collecting sacrebleu>=1.4.12
  Downloading sacrebleu-2.0.0-py3-none-any.whl (90 kB)
Collecting tabulate>=0.8.9
  Downloading tabulate-0.8.9-py3-none-any.whl (25 kB)
Collecting importlib-resources
  Downloading importlib_resources-5.4.0-py3-none-any.whl (28 kB)
Collecting antlr4-python3-runtime==4.8
  Downloading antlr4-python3-runtime-4.8.tar.gz (112 kB)
Collecting omegaconf==2.1.*
  Downloading omegaconf-2.1.1-py3-none-an

In [2]:
# ! pip install nlpaug fairseq >> /dev/null

In [3]:
import nlpaug.augmenter.char as nac
import nlpaug.augmenter.word as naw

test_sentence = "I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!"

### Character Augmenter

1. keyboard: Augmenter that simulates keyboard typos in the input text.

In [4]:
aug = nac.KeyboardAug(name='Keyboard_Aug', aug_char_min=1, aug_char_max=10, aug_char_p=0.3, aug_word_p=0.3, 
                      aug_word_min=1, aug_word_max=10, stopwords=None, tokenizer=None, reverse_tokenizer=None, 
                      include_special_char=True, include_numeric=True, include_upper_case=True, lang='en', verbose=0, 
                      stopwords_regex=None, model_path=None, min_char=4)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
I geb&ine:y have no kdeZ qhQt the 8uYput of tTid sequsnDr of aorVs will be - it wi,k be jnterewtlHg to find out what nlpaug can do with this!
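
To make the idea behind keyboard augmentation concrete, here is a minimal standard-library sketch. It is not nlpaug's implementation: the `QWERTY_NEIGHBOURS` map and the `keyboard_typos` helper are hypothetical names introduced for illustration, and the map covers only a handful of keys.

```python
import random

# Hypothetical, partial QWERTY neighbour map (illustration only, not nlpaug's data).
QWERTY_NEIGHBOURS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp",
    "n": "bhjm", "s": "awedxz", "t": "rfgy", "h": "gyujnb",
}

def keyboard_typos(text, char_p=0.3, seed=None):
    """Replace some characters with a neighbouring key on a QWERTY layout."""
    rng = random.Random(seed)  # local RNG so a seed makes the result repeatable
    out = []
    for ch in text:
        neighbours = QWERTY_NEIGHBOURS.get(ch.lower())
        # Only characters with known neighbours can be replaced, each with probability char_p.
        if neighbours and rng.random() < char_p:
            out.append(rng.choice(neighbours))  # replacement is always lower case here
        else:
            out.append(ch)
    return "".join(out)

print(keyboard_typos("this is interesting", char_p=0.3, seed=0))
```

Unlike this sketch, the `KeyboardAug` call above first picks which words to touch (`aug_word_p`) and can also emit upper-case, numeric, and special characters.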


2. ocr: Augmenter that simulates OCR errors in the input text.

In [5]:
aug = nac.OcrAug(name='OCR_Aug', aug_char_min=1, aug_char_max=10, aug_char_p=0.3, aug_word_p=0.3, aug_word_min=1, 
                 aug_word_max=10, stopwords=None, tokenizer=None, reverse_tokenizer=None, verbose=0, stopwords_regex=None, 
                 min_char=1)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
I 9enoine1y have no idea what the ootpot of this sequence of wokd8 will be - it will be inteke8tin9 to find out what nlpaug can du with this!
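
The substitutions in the output above (g→9, l→1, s→8, u→o, o→u, r→k) suggest a lookup of visually confusable characters. The sketch below mimics that with a small table read off this one output; `OCR_CONFUSIONS` and `ocr_noise` are hypothetical names, and the table is not nlpaug's actual mapping.

```python
import random

# Hypothetical confusion pairs, taken from the substitutions visible in the output above.
OCR_CONFUSIONS = {"g": "9", "l": "1", "s": "8", "u": "o", "o": "u", "r": "k"}

def ocr_noise(text, word_p=0.3, seed=None):
    """Pick words with probability word_p and swap their confusable characters."""
    rng = random.Random(seed)
    noisy = []
    for word in text.split(" "):
        if rng.random() < word_p:
            # Simplification: every confusable character in a chosen word is swapped.
            word = "".join(OCR_CONFUSIONS.get(ch, ch) for ch in word)
        noisy.append(word)
    return " ".join(noisy)

print(ocr_noise("I genuinely have no idea what the output will be", seed=0))
```

The real augmenter is more selective: in the output above, "output" became "ootpot", so only some of the confusable characters in a chosen word were changed.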


3. random: Augmenter that applies random character errors to the input text.

In [6]:
aug = nac.RandomCharAug(action='substitute', name='RandomChar_Aug', aug_char_min=1, aug_char_max=10, aug_char_p=0.3, 
                        aug_word_p=0.3, aug_word_min=1, aug_word_max=10, include_upper_case=True, include_lower_case=True, 
                        include_numeric=True, min_char=4, swap_mode='adjacent', spec_char='!@#$%^&*()_+', stopwords=None, 
                        tokenizer=None, reverse_tokenizer=None, verbose=0, stopwords_regex=None, candidiates=None)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
I genuinely have no iWea wyft the output of Jlis sUUuenc( of wKrdL 7Bll be - it wm6l be interesting to f#6d out what nlpaug can do Gi+h this!


### Word Augmenter

1. antonym: Augmenter that substitutes words in the input text with their antonyms.

In [7]:
aug = naw.AntonymAug(name='Antonym_Aug', aug_min=1, aug_max=10, aug_p=0.3, lang='eng', stopwords=None, tokenizer=None, 
                     reverse_tokenizer=None, stopwords_regex=None, verbose=0)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
I genuinely lack no idea what the output of this sequence of words will differ - it will differ uninteresting to lose out what nlpaug can unmake with this!


3. context_word_embedding: Augmenter that applies word-level operations to the input text based on contextual word embeddings.

In [8]:
aug = naw.ContextualWordEmbsAug(model_path='bert-base-uncased', model_type='', action='substitute', # temperature=1.0, 
                                top_k=100,
                                # top_p=None, 
                                name='ContextualWordEmbs_Aug', aug_min=1, aug_max=10, aug_p=0.3, 
                                stopwords=None, device='cpu', force_reload=False,
                                # optimize=None, 
                                stopwords_regex=None, 
                                verbose=0, silence=True)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
i genuinely have no clue what his rest of this series of words will say - its will seemed impossible to find just what we can do with this!


4. random: Augmenter that applies random word operations to the input text.

In [9]:
aug = naw.RandomWordAug(action='delete', name='RandomWord_Aug', aug_min=1, aug_max=10, aug_p=0.3, stopwords=None, 
                        target_words=None, tokenizer=None, reverse_tokenizer=None, stopwords_regex=None, verbose=0)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
I genuinely have idea what the of this sequence of will - it will be interesting to what can this!
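
For intuition about the `delete` action, here is a rough standard-library sketch that drops a fraction of the words. The `delete_random_words` helper is hypothetical, and its way of turning `aug_p` into a deletion count (rounding, at least one word, never all of them) is an assumption made for this sketch, not nlpaug's exact rule.

```python
import random

def delete_random_words(text, aug_p=0.3, seed=None):
    """Drop roughly aug_p of the whitespace-separated words, keeping their order."""
    rng = random.Random(seed)
    words = text.split()
    if len(words) <= 1:
        return text  # nothing sensible to delete
    # Assumed rule for this sketch: delete at least one word, never every word.
    n_delete = min(len(words) - 1, max(1, round(aug_p * len(words))))
    drop = set(rng.sample(range(len(words)), n_delete))
    return " ".join(w for i, w in enumerate(words) if i not in drop)

print(delete_random_words("it will be interesting to find out what nlpaug can do", seed=0))
```

As in the nlpaug output above, the surviving words keep their original order, so the result is a subsequence of the input.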


5. spelling: Augmenter that simulates spelling errors in the input text.

In [10]:
aug = naw.SpellingAug(dict_path=None, name='Spelling_Aug', aug_min=1, aug_max=10, aug_p=0.3, stopwords=None, 
                      tokenizer=None, reverse_tokenizer=None, include_reverse=True, stopwords_regex=None, verbose=0)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
I genuinely have in idea what the output of this sequence ot words will be - it weill te intressting ro find out what nlpaug can dos qith These!


6. split: Augmenter that splits words in the input text.

In [11]:
aug = naw.SplitAug(name='Split_Aug', aug_min=1, aug_max=10, aug_p=0.3, min_char=4, stopwords=None, tokenizer=None, 
                   reverse_tokenizer=None, stopwords_regex=None, verbose=0)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
I gen uinely ha ve no idea what the output of this sequence of wor ds will be - it wi ll be intere sting to f ind out wh at nlpaug can do wi th th is!
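
Every word that was split in the output above has at least four characters, which matches `min_char=4`. A minimal standard-library sketch of the same idea follows; `split_words` is a hypothetical helper, not the library's code.

```python
import random

def split_words(text, aug_p=0.3, min_char=4, seed=None):
    """Insert a space inside some words that have at least min_char characters."""
    rng = random.Random(seed)
    out = []
    for word in text.split():
        if len(word) >= min_char and rng.random() < aug_p:
            cut = rng.randint(1, len(word) - 1)  # both halves stay non-empty
            word = word[:cut] + " " + word[cut:]
        out.append(word)
    return " ".join(out)

print(split_words("it will be interesting to find out", seed=0))
```

Removing the spaces from the result gives back the original text without its spaces, which is a handy sanity check for this kind of augmentation.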


7. synonym: Augmenter that substitutes words in the input text with their synonyms.

In [12]:
aug = naw.SynonymAug(aug_src='wordnet', model_path=None, name='Synonym_Aug', aug_min=1, aug_max=10, aug_p=0.3, lang='eng', 
                     stopwords=None, tokenizer=None, reverse_tokenizer=None, stopwords_regex=None, force_reload=False, 
                     verbose=0)

test_sentence_aug = aug.augment(test_sentence)
print(test_sentence)
print(test_sentence_aug)

I genuinely have no idea what the output of this sequence of words will be - it will be interesting to find out what nlpaug can do with this!
One truly have no idea what the output of this sequence of word volition be - it testament be interesting to find out what nlpaug can do with this!
