# Project Report
### By Sean Brouwer
#### May 13, 2019


### Vision

For this project I am classifying dogs by breed, using an existing
dataset of dogs. My motivation for choosing this project is that I
am a bit of a dog lover, and am fascinated by the great variety
among dog breeds, especially when compared to cat breeds. Cat breeds
differ mostly in fur color and face shape, whereas dog breeds differ
widely in size and body structure. This immense variety between dog
breeds should make classification using machine learning easier.
Because I am a beginner with machine learning and neural networks,
this easier classification problem is feasible for me, whereas a cat
classification problem might not be.


### Background

The dataset of dogs used in this project is the "Stanford Dogs
Dataset". This dataset comes with the dog images already labeled
by breed, allowing for the use of supervised machine learning. The
dataset can be found at
"https://www.kaggle.com/jessicali9530/stanford-dogs-dataset".
The dataset contains 20,580 images of dogs from 120 different dog
breeds. Those breeds include common breeds such as golden retrievers
and Rottweilers, but also much more obscure breeds that are mostly
only known to dog breeding enthusiasts. Examples of these breeds
include Welsh springer spaniels and Tibetan mastiffs. The
distribution of dogs in this dataset is relatively even between
the different breeds, typically having between 150 and 250 images for
each. The dogs are also pictured in a variety of poses, so that
recognition can be trained for pictures with the mouth open or closed,
and from the side or front.
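
The claim that the breeds are evenly represented can be checked with a short script. The sketch below assumes the images have been unpacked into one subfolder per breed; that layout, the helper name `count_images_per_breed`, and the placeholder path in the comment are illustrative assumptions, not details taken from the dataset's documentation.

```python
from collections import Counter
from pathlib import Path

TOTAL_IMAGES = 20_580  # from the dataset description
NUM_BREEDS = 120       # from the dataset description

# Average images per breed if the split were perfectly even
mean_per_breed = TOTAL_IMAGES / NUM_BREEDS  # 171.5


def count_images_per_breed(root):
    """Count .jpg/.jpeg files in each breed subfolder of `root`.

    Assumes one subfolder per breed (a hypothetical layout).
    """
    counts = Counter()
    for breed_dir in Path(root).iterdir():
        if breed_dir.is_dir():
            counts[breed_dir.name] = sum(
                1 for f in breed_dir.iterdir()
                if f.suffix.lower() in {".jpg", ".jpeg"}
            )
    return counts


# Example usage (the path is a placeholder):
# counts = count_images_per_breed("path/to/images")
# print(min(counts.values()), max(counts.values()))
```

An average of 171.5 images per breed sits inside the 150 to 250 range quoted above, which is consistent with a fairly balanced dataset.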

This project is not the first dog classification algorithm using the
Stanford Dogs Dataset. Because other projects have been done and have
made their source code available for reference, the base of this
project could be built on a previous project. The main project that
was used in this manner can be found in the following GitHub
repository: "https://github.com/NeverForged/DogClassifier". Without
this code as a guide, I would never have been able to build a
functional neural network to perform the required classifications,
because of my very limited experience with building such networks.

The broad classification of the technology used to make the dog breed
classifier function is machine learning. More specifically, neural
networks were employed and implemented using TensorFlow. TensorFlow
was selected because it is an excellent open source platform for
building and deploying machine learning applications with relative
ease. I also have a bit of experience with TensorFlow from the lab
work and homework done in CS 344. Finally, the previous
projects done on this dataset also used TensorFlow, so examples of the
proper methods for creating TensorFlow neural networks could be
followed to ease some of the confusion and frustration that
complicated projects such as this one commonly entail.

To go into more depth with the technology used in this project, a
convolutional neural network was implemented. A convolutional network
was chosen instead of a regular neural network because convolutional
neural networks are specialized to work well with image analysis.
This network was created with five distinct layers. The first
and third layers are convolutional layers, while the second and
fourth layers use max pooling. The final layer is a fully
connected layer that narrows the network down enough to generate an
output layer. The convolutional layers function by taking small
areas (in this case 3x3 pixels) out of the large picture, and after
each sample sliding the area over by 1 pixel. This is done so that
the subsections have plenty of overlap, to avoid features being cut
in half. Each subsection is compared against filters that the
network learned from the training images, and a strong match
produces a high activation. The max pooling layers each take a small
section of the output from the convolutional layers and keep only
the highest activation value in that section, thereby removing the
less-useful data and leaving the sections that contain recognizable
features.
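
To make the sliding-window idea concrete, here is a minimal pure-Python sketch of one 3x3 convolution (moving 1 pixel at a time) followed by max pooling. It is not the project's TensorFlow code; the toy image, the edge-detecting filter values, and the 2x2 pool size are assumptions chosen only for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` one pixel at a time (stride 1, no padding)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply the k x k patch by the filter and sum the products
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def max_pool(feature_map, size=2):
    """Keep only the largest value in each non-overlapping size x size block."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            block = [feature_map[r + i][c + j]
                     for i in range(size) for j in range(size)]
            row.append(max(block))
        pooled.append(row)
    return pooled


# A 6x6 toy "image" with a vertical edge between columns 2 and 3
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hypothetical 3x3 filter that responds to dark-to-bright vertical edges
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = conv2d_valid(image, kernel)  # 4x4 map, each row is [0, 3, 3, 0]
pooled = max_pool(features, size=2)     # 2x2 map, [[3, 3], [3, 3]]
```

The filter produces high values only where the window overlaps the edge, and pooling keeps those high values while shrinking the map from 4x4 to 2x2. In a real network the filter values are learned during training rather than written by hand.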


### Implementation

This implementation of a convolutional neural network uses five
layers, as mentioned in the background section. The first and third
layers are convolutional layers, the second and fourth layers are
max pooling layers, and the final layer is a fully connected layer.
Together, these layers produce an output layer which provides the
classification of the dog by its breed.
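
One way to see how the five layers narrow an image down to 120 outputs is to trace the tensor shape through each layer. The sketch below does this arithmetic in plain Python. The 64x64 input size, the filter counts (32 and 64), the unpadded convolutions, and the 2x2 pooling windows are all assumptions made for illustration; only the 3x3 window, the 1-pixel step, and the 120 breeds come from this report.

```python
def conv_shape(shape, filters, kernel=3, stride=1):
    """Output shape (height, width, channels) of a convolution with no padding."""
    height, width, _ = shape
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return (out_h, out_w, filters)


def pool_shape(shape, size=2):
    """Output shape of non-overlapping max pooling."""
    height, width, channels = shape
    return (height // size, width // size, channels)


INPUT_SHAPE = (64, 64, 3)  # assumed image size, not taken from the project
NUM_BREEDS = 120           # from the dataset description

conv1 = conv_shape(INPUT_SHAPE, filters=32)  # layer 1 (filter count assumed)
pool1 = pool_shape(conv1)                    # layer 2
conv2 = conv_shape(pool1, filters=64)        # layer 3 (filter count assumed)
pool2 = pool_shape(conv2)                    # layer 4
flattened = pool2[0] * pool2[1] * pool2[2]   # values fed into layer 5
dense_weights = flattened * NUM_BREEDS       # weights in the fully connected layer

for name, shape in [("conv1", conv1), ("pool1", pool1),
                    ("conv2", conv2), ("pool2", pool2)]:
    print(name, shape)
print("flattened:", flattened, "dense weights:", dense_weights)
```

Under these assumptions the fully connected layer alone holds about 1.5 million weights, which shows why the final layer tends to dominate both memory use and the risk of overfitting in a small network like this one.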

Most of the implementation of this project follows exactly the
network layout provided in the GitHub repository
"https://github.com/NeverForged/DogClassifier". The main areas where
this project differs from the existing convolutional neural network
are in its final computation. I found that the original training
epoch count (100) severely overfit the data. By shortening the amount
of training on the data, I was able to bring the training and
validation accuracies to very similar percentages, and even increase
the validation accuracy slightly.
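
In this project the epoch count was shortened by hand. A common alternative, sketched below, is early stopping: watch the validation accuracy after every epoch and stop once it has not improved for a set number of epochs. The function name `best_epoch`, the `patience` parameter, and the accuracy values are all made up for illustration and are not results from this project.

```python
def best_epoch(val_accuracies, patience=2):
    """Return the 1-based epoch with the best validation accuracy seen
    before accuracy failed to improve for `patience` epochs in a row."""
    best_acc = float("-inf")
    best_idx = 0
    waited = 0
    for idx, acc in enumerate(val_accuracies):
        if acc > best_acc:
            best_acc, best_idx, waited = acc, idx, 0
        else:
            waited += 1
            if waited >= patience:
                break  # stop training: no recent improvement
    return best_idx + 1


# Made-up validation accuracies, one per epoch (illustration only)
example_history = [0.20, 0.35, 0.41, 0.40, 0.39, 0.45]
stop_at = best_epoch(example_history, patience=2)  # 3
```

With a patience of 2 the run stops after the fifth epoch and keeps the third, so it never sees the later improvement at epoch six. A larger patience trades extra training time for a lower chance of stopping too soon.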

In this project implementation I also struggled with memory errors.
Some of the network layer computations required more memory than
my laptop could offer, even when I increased the memory stack size
in PyCharm. To work around this, I had to simplify a couple of the
computations done in the network, so that they did not require
multiple copies of the images stored in variables. As I was unable
to run the network before making this change, I do not know how
the results are affected. My best estimate is that the accuracy
decreased, but not by a significant amount.
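
A standard way to reduce peak memory, which is not necessarily the change made in this project, is to process the images in small batches so that only one slice of the data is being worked on at a time. The sketch below shows the pattern in plain Python; `predict` is a hypothetical stand-in for a trained model, and the toy data exists only to make the example runnable.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def batched_accuracy(samples, labels, predict, batch_size=32):
    """Compute accuracy one batch at a time.

    `predict` is a hypothetical callable that maps a list of samples
    to a list of predicted labels.
    """
    correct = 0
    batches = zip(iter_batches(samples, batch_size),
                  iter_batches(labels, batch_size))
    for sample_batch, label_batch in batches:
        predictions = predict(sample_batch)
        correct += sum(p == y for p, y in zip(predictions, label_batch))
    return correct / len(samples)


# Toy stand-in for a model: "predicts" whether a number is even
toy_samples = list(range(10))
toy_labels = [n % 2 == 0 for n in toy_samples]
toy_labels[0] = False  # make one label disagree on purpose

accuracy = batched_accuracy(
    toy_samples, toy_labels,
    predict=lambda batch: [n % 2 == 0 for n in batch],
    batch_size=4,
)  # 0.9
```

Because each batch is scored and then discarded, the intermediate results for the full dataset never have to exist in memory at once.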

Before finding the network I modeled my system after, I struggled
with data. When I downloaded the Stanford Dogs Dataset directly
from its source, it came in a file format I was very unfamiliar
with (.m for a MATLAB file). I struggled for a long while trying
to find a functional method for converting the files to something
I could use, like .jpg. Thankfully, the repository I found at the
previously mentioned GitHub link had a version of the dataset that
had already been converted to .jpg format. If not for this
previously converted dataset, I would have had much more
difficulty in getting this project to work, and may not have had
time to make the performance enhancements I was able to implement.



### Results

The best result I was able to achieve with this system was 54.839%
accuracy on the validation set. Although this number initially does
not sound all that great, it looks much better when one considers
that the correct breed is picked out of 120 possible answers over
half of the time. Random guessing would result in less than one
percent accuracy (1 in 120, or about 0.83%), which is far less than
this convolutional neural network was able to achieve.
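
The comparison with random guessing can be worked out in a few lines. This sketch uses only the two numbers reported here (120 breeds and the 54.839% best accuracy) and assumes that guessing picks each breed with equal probability.

```python
NUM_BREEDS = 120
BEST_ACCURACY = 0.54839  # best result reported in this section

# Accuracy expected from picking one of the 120 breeds uniformly at random
chance = 1 / NUM_BREEDS

# How many times better than chance the network performs
improvement = BEST_ACCURACY / chance

print(f"chance accuracy: {chance:.3%}")                 # chance accuracy: 0.833%
print(f"improvement over chance: {improvement:.1f}x")   # improvement over chance: 65.8x
```

The network is therefore roughly 66 times more accurate than uniform random guessing on this task.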

Initially, the system provided by the GitHub repository performed
much better on the training dataset (over 75% accuracy), but reached
only the low 40s in percent accuracy on the validation set. By
decreasing the number of epochs from 100 to 4, I was able both to
decrease the computation time of the system and to decrease the
overfitting that had occurred with the longer computations. In fact,
overfitting was removed to the extent that the training dataset
performed worse (51.02% accuracy) than the validation dataset. This
most likely happened due to random chance in which images ended up
in each set, but it did indicate that overfitting was nearly
non-existent in the network.


### Implications

Neural networks as a whole act much like a black box. The network
has clear inputs and outputs, but it is very difficult to
understand what happens in between. Because the internal workings
of the network are defined by that network and not by the user
(for the most part), the functioning of the network could happen
in a manner that is discriminatory, without the creator of the
network having any knowledge of it. The discrimination of a network
then would often be discovered much later, after the network is
already deployed and being used to make real-world decisions. One
example would be a network built to estimate how likely an insurance
customer is to default, which learns to link race with customer
rating.

Another ethical consideration for neural networks, especially
those applied to classification problems, is the harm that a
failed classification can cause. One example of this is the Google
Photos labeling system that mistakenly labeled two Black people as
gorillas. This obviously had some negative consequences for
Google, as many people were outraged.

For this system, the ethical concerns listed above are less
serious because of the subject matter. Misclassifying
a dog as the wrong breed will never offend the dog, as it cannot
understand the program, and most dog owners will also not make a
big deal of it. That said, it is still important for a program
such as this to be as accurate as possible, as potential users
would benefit from it not making mistakes.


### Bibliography

https://www.kaggle.com/jessicali9530/stanford-dogs-dataset

https://github.com/NeverForged/DogClassifier

https://github.com/saksham789/DOG-BREED-CLASSIFICATION-STANFORD-DOG-DATASET

https://developers.google.com/machine-learning/crash-course/

https://en.wikipedia.org/wiki/Convolutional_neural_network

https://aleph.se/andart2/ethics/ethics-for-neural-networks/

