# Using neural networks to classify cat and dog pictures

In this tutorial, we are going to use machine learning to classify images! We will write a program that automatically learns to recognize whether a picture is of a cat or a dog.

Here is a cat image:

<img src='neural_assets/cat.jpg' width=250 alt=''/>

Meow! I'm too fat and upside down. Classify me as a cat still, will you?

Here is a dog image:

<img src='neural_assets/dog.jpg' width=250 alt=''/>

Bow, wow! I've got this toy in my mouth! Don't tell me I'm not a dog anymore!

Recognizing whether a picture is of a cat or a dog used to be a difficult task for computers. One of the main reasons was that the pictures to be recognized were all quite different. If somebody wrote a program to detect a cat by sensing its eyes and ears in the middle of the image, the cat would outsmart the computer by choosing to appear in the top right corner of the image. This was irritating, and computer scientists found it nearly impossible to write a program that could satisfactorily distinguish between all the possible cat and dog images.

Neural networks (and specifically, an advanced type of them called convolutional neural networks) solve the problem of image recognition rather differently. First, these networks try to sense different features in different parts of the image. In the case of cat and dog images, these features would represent ears, eyes, body contours, tails, legs, etc. Then, these features are used to determine what might be in the picture. For example, cats generally have different ears than dogs, and a neural network might be trained to use this information to distinguish between cat and dog images. In general, there can be hundreds or thousands of such features on which the neural network can be trained. As we keep training neural networks with more and more data, they start recognizing the differences between the images and begin making correct guesses about what's in the image.

So if we train a neural network with lots of cat and dog images, it should start telling us whether the image we give it is of a cat or a dog! In this tutorial, we are going to do exactly that. Before we begin, we will go through a short introduction to what neural networks are and what they are made up of: nodes, edges, and layers.

# A short introduction to neural networks

Neural networks (NNs) are machine learning algorithms inspired by our knowledge of brain function, although machine learning scientists might tell you that neural networks have little to do with how the brain actually works. Typically, these networks are made up of nodes, edges, and layers. Let us briefly see what each of them is.

## Nodes

Nodes are the fundamental units of neural networks. Their main task is to take in some information, process it, and produce an output. For example, a node can take in a number, square it, and output the result. Another node can take in 3 numbers, take the 5th root of their sum, and output the result. Typically, a node takes one or more inputs and gives one output. In pictures, a node looks like this:

<img src='neural_assets/nodes.png'/>
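As a minimal sketch, a node can be written as a small Python function. The names `square_node` and `fifth_root_of_sum_node` are made up for illustration, and the second one assumes the three numbers are added together before the root is taken.

```python
def square_node(x):
    """A node with one input: square the number and output it."""
    return x ** 2


def fifth_root_of_sum_node(a, b, c):
    """A node with three inputs and one output.

    Assumption for illustration: the inputs are added together
    before the 5th root is taken (inputs are expected to be positive).
    """
    return (a + b + c) ** (1 / 5)


print(square_node(4))                      # 16
print(fifth_root_of_sum_node(10, 10, 12))  # 5th root of 32, approximately 2
```

However many inputs a node takes, it hands back a single value, which is what lets nodes be wired together.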

## Edges

Edges carry information from one node to another. They take the output of one node and feed it as input to another node. In this way, edges allow us to connect different nodes together. For example, we can connect three nodes so that the input of the first node gets processed by one node after another and comes out at the end of the third node:

<img src='neural_assets/edges.png'/>

We can see that when we feed 5 into the first node, we get 1.44 as the output of the last node. The intermediate outputs of the nodes are shown in the diagram.
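The same chaining idea can be sketched in code by handing each node's output to the next node. The three node functions below are hypothetical and are not the ones used in the diagram, so the numbers differ from the 1.44 result above.

```python
import math


def add_one(x):
    return x + 1


def halve(x):
    return x / 2


def square_root(x):
    return math.sqrt(x)


def run_chain(nodes, value):
    """Feed `value` into the first node and pass each output to the next node.

    Returns the output of every node, in order, so the intermediate
    values can be inspected.
    """
    outputs = []
    for node in nodes:
        value = node(value)   # the "edge": one node's output is the next node's input
        outputs.append(value)
    return outputs


intermediate = run_chain([add_one, halve, square_root], 5)
print(intermediate)  # [6, 3.0, 1.7320508075688772]
```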

## Layers

Layers are made by stacking many nodes on top of each other. A layer typically takes in many inputs and generates many outputs. Technically, a layer is just a collection of nodes and edges. Here is what a layer might look like in a neural network:

<img src='neural_assets/layer1.png'/>

We can also crisscross the inputs so that each input goes into every node. When that happens, the layer can look like this:

<img src='neural_assets/layer2.png'/>
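Here is a rough sketch of such a crisscrossed layer, in which every input reaches every node. The text above does not say what each node computes, so the weighted-sum rule, the `dense_layer` name, and the weight values are all assumptions chosen for illustration.

```python
def dense_layer(inputs, weights):
    """Every input feeds every node; each node outputs one weighted sum.

    weights[j][i] is the strength of the edge from input i to node j.
    """
    outputs = []
    for node_weights in weights:
        total = sum(w * x for w, x in zip(node_weights, inputs))
        outputs.append(total)
    return outputs


inputs = [1.0, 2.0, 3.0]   # three inputs
weights = [
    [0.5, 0.5, 0.5],       # edges going into node 0
    [1.0, 0.0, -1.0],      # edges going into node 1
]
print(dense_layer(inputs, weights))  # [3.0, -2.0]
```

Three inputs go in and two outputs come out, one per node in the layer.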

You can imagine the crisscrossing happening over and over, as the outputs of one layer become the inputs of another. The layer above does a simple computation, but other, more advanced kinds of layers perform very specialized operations such as applying filters to images or reducing image dimensionality. These kinds of layers are typically employed in applications like image and speech recognition. Although we will use these advanced types of layers in our example later, we will not cover them here.

## Network

In a typical NN, there are many nodes and edges, connected to each other layer after layer. Some NNs are shallow and have few layers; others are deep and have many layers. Deep learning refers to the use of NNs with many layers. Here are some pictures of what shallow and deep NNs look like:

### A shallow neural network

<img src='neural_assets/shallow_net.png' width=300/>

### A deep neural network

<img src='neural_assets/deep_net.png' width=500/>

Source: https://synapse.koreamed.org/DOIx.php?id=10.3348/kjr.2017.18.4.570&vmode=PUBREADER

## How does it work?

Neural networks work in a simple manner: we pass the data to the input layer, and it flows through the layers in between, which are called hidden layers. At the end, the data reaches the output layer, and we get to see the final guess that the network has made about the input.
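To make this flow concrete, here is a small sketch of data passing through one hidden layer and one output layer. It assumes each node computes a weighted sum plus a bias and then applies a ReLU (negative values become zero). That rule, the function names, and the hand-picked weights are illustrative assumptions; in a real network the weights would be learned during training.

```python
def relu(x):
    """Keep positive values, replace negative values with zero."""
    return max(0.0, x)


def layer_forward(inputs, weights, biases):
    """Compute one layer: each node takes all inputs and gives one output."""
    return [
        relu(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the data through the layers one after another."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values


layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer: 2 inputs -> 2 nodes
    ([[1.0, 1.0]], [0.0]),                    # output layer: 2 inputs -> 1 node
]
print(network_forward([2.0, 1.0], layers))  # [2.5]
```

A deeper network in this sketch is just a longer `layers` list.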

# How will we use neural networks to classify images?

In any machine learning project, the bulk of the work happens before we start using algorithms like neural networks. So before we begin using neural networks to classify images, we will have to do what is generally called data preparation, which brings the data into a format that can be fed to a machine learning algorithm. After the data preparation, we will split the data into training and test sets. We will use the training set to train our model and the test set to assess its accuracy. Remember, it is important to assess the accuracy of the model with the test set, because the model has not seen those images during training! The workflow has four steps:

* Data preparation
* Data splitting (into training and test sets)
* Model building (on the training set)
* Model validation (on the test set)
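The splitting and validation steps can be sketched without any machine learning library. Everything below is hypothetical: the file names, the 80/20 split, and the `always_cat` stand-in "model", which exists only to show how accuracy on the test set would be computed.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold out a fraction of them as the test set."""
    shuffled = examples[:]                    # copy so the original list is untouched
    random.Random(seed).shuffle(shuffled)     # fixed seed makes the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(model, examples):
    """Fraction of examples for which the model's guess matches the label."""
    correct = sum(1 for image, label in examples if model(image) == label)
    return correct / len(examples)


# Hypothetical labelled data: (image file name, label) pairs.
examples = [(f"cat_{i}.jpg", "cat") for i in range(5)] + \
           [(f"dog_{i}.jpg", "dog") for i in range(5)]

train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))  # 8 2


def always_cat(image):
    """Stand-in model that guesses "cat" for every image."""
    return "cat"


print(accuracy(always_cat, test_set))  # a value between 0.0 and 1.0
```

Measuring accuracy only on `test_set` matters because a model can score well on images it was trained on without having learned anything general.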

# Data preparation

In [1]:
# TBD

# Data splitting

In [2]:
# TBD

# Model building

In [3]:
# TBD

# Model validation

In [4]:
# TBD

