### Data Augmentation
To reduce overfitting and create a larger dataset from a smaller one, we can use a technique called ``data augmentation``. It applies random transformations to our images so that the model sees more varied examples and generalizes better. These transformations include compression, rotation, stretching, and even color changes.

```python
tf.keras.preprocessing.image.ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    zca_whitening=False,
    zca_epsilon=1e-06,
    rotation_range=0,
    width_shift_range=0.0,
    height_shift_range=0.0,
    brightness_range=None,
    shear_range=0.0,
    zoom_range=0.0,
    channel_shift_range=0.0,
    fill_mode="nearest",
    cval=0.0,
    horizontal_flip=False,
    vertical_flip=False,
    rescale=None,
    preprocessing_function=None,
    data_format=None,
    validation_split=0.0,
    dtype=None,
)
```
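
To make the idea of a random transformation concrete, here is a minimal NumPy sketch that is independent of Keras. The function name `random_flip_and_shift`, the `max_shift` parameter, and the placeholder image are assumptions made for illustration. Unlike `fill_mode="nearest"` above, this sketch wraps shifted pixels around the edge, so it is only an approximation of what `ImageDataGenerator` does.

```python
import numpy as np


def random_flip_and_shift(image, rng, max_shift=0.2):
    """Return a randomly flipped and horizontally shifted copy of an H x W x C image."""
    out = image.copy()
    # Flip left-right with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    # Shift horizontally by up to max_shift of the width, wrapping around the edge.
    width = out.shape[1]
    max_pixels = int(max_shift * width)
    shift = int(rng.integers(-max_pixels, max_pixels + 1))
    return np.roll(out, shift, axis=1)


rng = np.random.default_rng(seed=0)
# Placeholder "image" (hypothetical data, not one of the notebook's files).
image = np.arange(4 * 5 * 3, dtype=np.float32).reshape(4, 5, 3)
augmented = [random_flip_and_shift(image, rng) for _ in range(3)]
```

Each call produces a new variant of the same picture, which is how a handful of source images can be expanded into a larger training set.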

### Imports

In [10]:
import os

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

> Let's generate a dataset of ``10 bees`` and ``10 ants`` from the 2 images that we have in the `images` directory. The directory structure is as follows:


**From:**
```
images
    - bee17.png
    - ant5.png
```

**To:**
```
data
    - bees
        - bee.png
        - bee.png
        ...
    - ants
        - ant.png
        - ant.png
        ...
```

In [3]:
bee_save_path = "data/bees"
ant_save_path = "data/ants"

ant_path = 'images/ant5.png'
bee_path  = 'images/bee17.png'


In [4]:
# Create the output directories if they do not already exist
os.makedirs(bee_save_path, exist_ok=True)
os.makedirs(ant_save_path, exist_ok=True)
    
print("Done")

Done


In [28]:
ant_image = keras.preprocessing.image.load_img(ant_path)
ant = keras.preprocessing.image.img_to_array(ant_image)

bee_image = keras.preprocessing.image.load_img(bee_path)
bee = keras.preprocessing.image.img_to_array(bee_image)

> Now that we have `2` different images, let's create a data generator that will apply random transformations to them.

In [29]:
datagen = keras.preprocessing.image.ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest',
    # rescale=1.0 / 255,  # optional; rescale is a multiplicative factor, not a target size
)

**Note:** `img_to_array` returns float pixel values in the range `0` to `255`, while `plt.imshow()` expects float data in the range `0` to `1`. Divide the `pixels` by `255` before plotting, otherwise the image will appear `white`. For example:
        
```python
plt.imshow(image/255)
```
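
As a minimal sketch of the two value ranges that `plt.imshow()` accepts for RGB data (floats in `0` to `1`, or integers in `0` to `255`), the snippet below prepares a tiny placeholder array both ways. The helper name `to_displayable` and the placeholder values are assumptions for illustration, not part of the notebook.

```python
import numpy as np


def to_displayable(pixels):
    """Scale a float image with values in [0, 255] into the [0, 1] range."""
    return np.clip(pixels / 255.0, 0.0, 1.0)


# Placeholder standing in for an array such as the one returned by img_to_array.
pixels = np.array([[[0.0, 127.5, 255.0]]], dtype=np.float32)

scaled = to_displayable(pixels)     # floats in [0, 1]
as_bytes = pixels.astype(np.uint8)  # alternative: integers in [0, 255]
```

Either `scaled` or `as_bytes` can then be passed to `plt.imshow()`.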
    
> Now let's generate images of bees and ants and save them to their respective directories.

#### `.flow()` function:
```python
ImageDataGenerator.flow(
    x,
    y=None,
    batch_size=32,
    shuffle=True,
    sample_weight=None,
    seed=None,
    save_to_dir=None,
    save_prefix="",
    save_format="png",
    subset=None,
)
```

In [30]:
bee.shape

(140, 140, 3)
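
A single image has shape `(height, width, channels)`, but `.flow()` expects a rank-4 array whose first axis is the batch. The sketch below shows three equivalent ways to add that leading axis in NumPy, using a zero-filled placeholder array (an assumption for illustration) with the same shape as the bee image.

```python
import numpy as np

# Placeholder with the same shape as the bee image shown above.
image = np.zeros((140, 140, 3), dtype=np.float32)

batch_a = image.reshape((-1,) + image.shape)  # reshape with an inferred leading axis
batch_b = np.expand_dims(image, axis=0)       # equivalent
batch_c = image[np.newaxis, ...]              # equivalent

print(batch_a.shape)  # (1, 140, 140, 3)
```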

In [31]:
# Add a leading batch dimension to each image: (H, W, C) -> (1, H, W, C)

bee = bee.reshape((-1, ) + bee.shape)
ant = ant.reshape((-1, ) + ant.shape)


# flow() yields batches indefinitely and writes each one to save_to_dir,
# so we stop manually after 10 images.
i = 0
for batch in datagen.flow(bee, save_prefix='bee', save_to_dir=bee_save_path, save_format='png'):
    i += 1
    if i == 10:
        break
print(f"Saved\n Total:\t {i} images")

i = 0
for batch in datagen.flow(ant, save_prefix='ant', save_to_dir=ant_save_path, save_format='png'):
    i += 1
    if i == 10:
        break
print(f"Saved\n Total:\t {i} images")

print("Done")

Saved
 Total:	 10 images
Saved
 Total:	 10 images
Done
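
Because the iterator returned by `.flow()` loops indefinitely, the counter and `break` are what stop the loops above. An alternative, sketched below under the assumption that you only need a fixed number of batches, is `itertools.islice`. The `endless_batches` generator is a hypothetical stand-in for `datagen.flow(...)` so that the snippet runs without TensorFlow.

```python
from itertools import count, islice


def endless_batches():
    """Hypothetical stand-in for datagen.flow(...): yields batches forever."""
    for n in count():
        yield f"batch-{n}"


# Take exactly 10 batches from the endless generator, then stop.
taken = list(islice(endless_batches(), 10))
print(len(taken))  # 10
```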
