## Data Augmentation using NLPaug

This notebook demonstrates the usage of character and word augmenters. There are other types too, such as augmenters for sentences, audio, and spectrogram inputs. All of these types and many more can be found in the [GitHub repo](https://github.com/makcedward/nlpaug) and [docs](https://nlpaug.readthedocs.io/en/latest/) of nlpaug.

In [0]:
#Installing the nlpaug package
!pip install nlpaug==0.0.14

Collecting nlpaug
  Downloading https://files.pythonhosted.org/packages/1f/6c/ca85b6bd29926561229e8c9f677c36c65db9ef1947bfc175e6641bc82ace/nlpaug-0.0.14-py3-none-any.whl (101kB)
Installing collected packages: nlpaug
Successfully installed nlpaug-0.0.14


In [0]:
# This is the base text used throughout this notebook
text = "The quick brown fox jumps over the lazy dog ."

### Augmentation at the Character Level


1.   OCR Augmenter: To read textual data from an image, we need an OCR (optical character recognition) model. Once the text is extracted from the image, it may contain errors such as '0' instead of 'o' or '2' instead of 'z'. This augmenter simulates such errors.
2.   Keyboard Augmenter: Typos are fairly common while typing or texting. This augmenter simulates them by substituting characters in words with characters from nearby keys on a keyboard.
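
Both augmenters boil down to swapping individual characters for plausible look-alikes or neighbours. The following is a minimal pure-Python sketch of that idea, not nlpaug's actual implementation. The tables `OCR_CONFUSIONS` and `KEYBOARD_NEIGHBORS`, the function `substitute_chars`, and its `prob` and `seed` parameters are all hypothetical names chosen for illustration; nlpaug uses its own, larger mappings and its own sampling rules.

```python
import random

# Hypothetical, hand-picked tables for illustration only.
# nlpaug ships its own mappings; these are NOT taken from the library.
OCR_CONFUSIONS = {"o": ["0"], "z": ["2"], "l": ["1"], "i": ["1", "l"], "s": ["5"]}
KEYBOARD_NEIGHBORS = {"q": ["w", "a"], "o": ["i", "p", "l"], "e": ["w", "r", "d"], "z": ["a", "x", "s"]}


def substitute_chars(text, table, prob=0.3, seed=None):
    """Replace characters that appear in `table` with a random alternative.

    Each eligible character is replaced independently with probability `prob`.
    """
    rng = random.Random(seed)
    out = []
    for ch in text:
        options = table.get(ch.lower())
        if options and rng.random() < prob:
            out.append(rng.choice(options))
        else:
            out.append(ch)
    return "".join(out)


text = "The quick brown fox jumps over the lazy dog ."
print(substitute_chars(text, OCR_CONFUSIONS, seed=0))      # OCR-style noise
print(substitute_chars(text, KEYBOARD_NEIGHBORS, seed=0))  # keyboard-style typos
```

Because every substitution is one character for one character, the augmented sentence keeps the same length as the original, which matches the behaviour visible in the nlpaug outputs in this notebook.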



In [0]:
#OCR augmenter
import nlpaug.augmenter.char as nac

aug = nac.OcrAug()
augmented_texts = aug.augment(text, n=3)  # n=3 requests three augmented versions of the sentence

print("Original:")
print(text)

print("Augmented Texts:")
print(augmented_texts)

Original:
The quick brown fox jumps over the lazy dog .
Augmented Texts:
['The qoick bruwn fox jumps over the lazy dog .', 'The qoick brown fux jumps over the lazy dog .', 'The qoick bkown fox jumps over the la2y dog .']


In [0]:
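#Keyboard Augmenter
import nlpaug.augmenter.char as nac  # KeyboardAug lives in the character-level module
import nlpaug.augmenter.word as naw  # word-level module, not used in this cell

aug = nac.KeyboardAug()
augmented_text = aug.augment(text, n=3)  # n=3 requests three augmented versions of the sentence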

print("Original:")
print(text)

print("Augmented Text:")
print(augmented_text)

Original:
The quick brown fox jumps over the lazy dog .
Augmented Text:
['The quick brown fox jumps ofer the .azy dog .', 'The qHick vrown fox jumps ovwr the lazy dog .', 'The quick brKwn fox jumps over the lazT dog .']
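
The substitutions can be hard to spot by eye. As a small aid, the sketch below lists every position where an augmented sentence differs from the original. The helper `changed_positions` is a hypothetical name and is not part of nlpaug; it assumes both strings have the same length, which holds for the substitution-only outputs shown above.

```python
def changed_positions(original, augmented):
    """Return (index, original_char, augmented_char) for each differing position."""
    if len(original) != len(augmented):
        raise ValueError("this sketch only handles same-length strings")
    return [(i, a, b) for i, (a, b) in enumerate(zip(original, augmented)) if a != b]


original = "The quick brown fox jumps over the lazy dog ."
# Augmented sentences copied from the keyboard augmenter output shown above.
augmented_texts = [
    "The quick brown fox jumps ofer the .azy dog .",
    "The qHick vrown fox jumps ovwr the lazy dog .",
    "The quick brKwn fox jumps over the lazT dog .",
]

for aug_text in augmented_texts:
    print(changed_positions(original, aug_text))
```

Since the augmenters are random, rerunning the notebook will produce different sentences, so the positions reported will differ from run to run.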


There are other types of character augmenters too. Their details are available in the links mentioned at the beginning of this notebook.

### Augmentation at the Word Level