In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Images

Neural networks tend to be well suited to unstructured data, and one area where they have had a massive impact is image-related tasks.

## Images with a Neural Network

In [1]:
# Download a remote dataset of cats and dogs
!curl -O https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip
!unzip -q kagglecatsanddogs_3367a.zip
!ls

In [None]:
import os

num_skipped = 0
for folder_name in ("Cat", "Dog"):
    folder_path = os.path.join("PetImages", folder_name)
    for fname in os.listdir(folder_path):
        fpath = os.path.join(folder_path, fname)
        # Treat files without a "JFIF" marker in their first bytes as corrupted
        with open(fpath, "rb") as fobj:
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)

        if not is_jfif:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)

print("Deleted %d images" % num_skipped)

### Datasets

When working with images, we usually start from image files on disk rather than arrays or dataframes. Keras has a preprocessing function that takes a folder of images and automatically creates a dataset from it.

A dataset (`tf.data.Dataset`) is a built-in TensorFlow data type. You can think of it as loosely analogous to a dataframe, but designed to hold larger volumes of generally non-tabular data and to feed that data through TensorFlow networks. Here the image files on disk are loaded automatically and split into two datasets, training and validation. When fitting the model, we can pass a dataset in much as we would an array.

This type of setup is fairly common when dealing with images. 
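
To make the idea of a dataset more concrete, here is a minimal sketch that builds a small `tf.data.Dataset` from in-memory arrays and iterates over it in batches. The random arrays and the names `images`, `labels`, and `toy_ds` are stand-ins invented for illustration; the real notebook loads images from the `PetImages` folder instead.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 10 random "images" of shape 180x180x3 with 0/1 labels.
images = np.random.rand(10, 180, 180, 3).astype("float32")
labels = np.random.randint(0, 2, size=10).astype("int32")

# Pair each image with its label, then group the pairs into batches of 4.
toy_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

# Iterating over the dataset yields one (image batch, label batch) pair at a time.
batch_shapes = []
for image_batch, label_batch in toy_ds:
    batch_shapes.append((tuple(image_batch.shape), tuple(label_batch.shape)))

print(batch_shapes)
```

With 10 examples and a batch size of 4, the last batch holds only the 2 remaining examples. A dataset like this can be passed straight to `model.fit`, which consumes it batch by batch.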

In [None]:
image_size = (180, 180)
batch_size = 32

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "PetImages",
    validation_split=0.2,
    subset="training",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "PetImages",
    validation_split=0.2,
    subset="validation",
    seed=1337,
    image_size=image_size,
    batch_size=batch_size,
)

In [None]:
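# Baseline model: flatten each image and pass it through fully connected layers
model = Sequential()
model.add(Flatten(input_shape=(180, 180, 3)))
model.add(Dense(512, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dense(512, activation='relu'))
# Single output unit for the cat vs. dog prediction
model.add(Dense(1))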
model.summary()
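
As a rough check on what `model.summary()` reports, the parameter count of this dense model can be worked out by hand. The sketch below is plain arithmetic and assumes the layer sizes shown above; the variable names are illustrative only.

```python
# Hand count of trainable parameters for the dense model defined above.
# Each Dense layer has (inputs * units) weights plus one bias per unit.
layer_sizes = [180 * 180 * 3, 512, 512, 512, 1]  # Flatten output, then Dense units

params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)
print(total_params)
```

Almost all of the parameters sit in the first Dense layer, because every one of the 97,200 flattened pixel values connects to each of its 512 units.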

## CNNs

To handle images better, we can use a different kind of neural network design: a CNN, or convolutional neural network.

In short, a CNN can look at an image "as it is", capturing spatial relationships that processing the image as a flattened array does not. When using a CNN, we first process the image in its original dimensions in the initial layers of the network, then flatten it so it can pass through a more familiar set of layers for the final prediction.
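
One way to see what flattening does to spatial relationships is to check where two neighbouring pixels end up in the flattened array. This is a small NumPy sketch, assuming the 180x180x3 image shape used in this notebook; the chosen pixel coordinates are arbitrary.

```python
import numpy as np

image_shape = (180, 180, 3)  # height, width, colour channels, as in this notebook

# Position in the flattened array of one pixel and of the pixel directly below it
# (row 10, column 20 is an arbitrary choice for illustration).
flat_index_top = np.ravel_multi_index((10, 20, 0), image_shape)
flat_index_below = np.ravel_multi_index((11, 20, 0), image_shape)

# Distance between the two vertically adjacent pixels after flattening
gap = int(flat_index_below - flat_index_top)
print(gap)
```

Two pixels that touch in the image sit a full row of values apart (180 pixels times 3 channels) once flattened, so a dense layer receives no built-in hint that they are neighbours.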

A CNN looks at an image bit by bit: it looks at a small square, slides over a few pixels, looks at another square, and so on. This lets it extract features from specific areas of an image. As an example, think of an image of a bike: a CNN would be able to identify the distinct shape of a seat or handlebars as the image passes through the layers.
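
The sliding-square idea can be sketched in plain NumPy. The function below is an illustrative, unoptimized version of what a single convolutional filter does; `slide_filter`, the toy 6x6 image, and the hand-picked vertical-edge kernel are all assumptions made for this example (in a real CNN the kernel values are learned during training).

```python
import numpy as np

def slide_filter(image, kernel, stride=1):
    """Slide a small kernel over a 2D image and record one response per position."""
    kernel_h, kernel_w = kernel.shape
    out_h = (image.shape[0] - kernel_h) // stride + 1
    out_w = (image.shape[1] - kernel_w) // stride + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            # The small square of the image currently under the kernel
            patch = image[top:top + kernel_h, left:left + kernel_w]
            # Response: elementwise product of patch and kernel, summed up
            output[i, j] = np.sum(patch * kernel)
    return output

# Toy image: left half dark (0), right half bright (1)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to dark-to-bright vertical edges
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = slide_filter(image, kernel)
print(feature_map)
```

The resulting feature map is nonzero only where the square overlaps the boundary between the dark and bright halves, which is the sense in which a filter picks out a feature from one area of the image.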

### CNN Structure

A CNN has some new types of layers:
<ul>
<li> Convolutional layer - 
<li> Pooling layer - 
</ul>
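
As a rough sketch of how these two layer types might fit into a Keras model, the example below stacks `Conv2D` and `MaxPooling2D` layers ahead of the familiar `Flatten` and `Dense` layers. The filter counts, kernel size, pool size, and the 64-unit dense layer are arbitrary choices for illustration, not values taken from this notebook, and `MaxPooling2D` is only one common kind of pooling layer.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative CNN: convolution + pooling blocks first, then flatten and dense layers.
cnn = keras.Sequential([
    keras.Input(shape=(180, 180, 3)),
    layers.Conv2D(16, kernel_size=3, activation="relu"),  # 16 filters, 3x3 squares
    layers.MaxPooling2D(pool_size=2),                     # halves height and width
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),                                     # back to a flat vector
    layers.Dense(64, activation="relu"),
    layers.Dense(1),                                      # single output, as before
])

# Pass a dummy batch of two blank images through the untrained model
dummy_batch = np.zeros((2, 180, 180, 3), dtype="float32")
preds = cnn(dummy_batch)
print(preds.shape)
```

The image keeps its height, width, and channel layout through the convolution and pooling layers, and is only flattened just before the dense layers that produce the final prediction.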

### Convolutional Layer

### Pooling Layer