# <span style="font-family:Trebuchet MS; font-size:2em;">Generative Dog Images</span>

<img src="https://thehappypuppysite.com/wp-content/uploads/2018/07/how-long-do-golden-retrievers-live-KH-long.jpg" alt="Golden retriever" style="width: 600px;" title="Golden" align="middle"/>

## <span style="font-family:Papyrus; font-size:1em;">Experiment with creating puppy pics</span>

# Table of Content
* [General information](#General-information)
* [Visualisation of pictures](#Visualisation-of-pictures)
* [Breeds of dogs](#Breeds-of-dogs)
* [Local conclusion](#Local-conclusion)

# General information
* **Task:** generate pictures of dogs, given about $20k$ different dog pictures. "This ~~person~~ ~~rental~~ ~~vessel~~ ~~waifu~~ ~~[etc.](https://thisxdoesnotexist.com/)~~ **dog** does not exist".

* **Data:** the data is the [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/), built from ImageNet. One of its $20580$ images is not included, so we have $20579$.
    * 120 dog breeds, from 148 to 252 photos per breed, with 75% quantile equal to 186 photos per breed.
    * Picture name format: `ImageNet `[`synset`](https://en.wikipedia.org/wiki/Synonym_ring)` _ some number`. E.g. `n02090721_42` is an [Irish Wolfhound](https://en.wikipedia.org/wiki/Irish_wolfhound), picture with index 42.
    * Here is a [poster](http://vision.stanford.edu/documents/KhoslaJayadevaprakashYaoFeiFei_FGVC2011.pdf) about the dataset.
* **Metric:** the metric is MiFID (Memorization-informed Fréchet Inception Distance). It is explained on the [forum](https://www.kaggle.com/c/generative-dog-images/discussion/97809#latest-564085) and in the [overview](https://www.kaggle.com/c/generative-dog-images/overview/evaluation).
* [Kaggle overview](https://www.kaggle.com/c/generative-dog-images/overview)
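
To give a feel for the Fréchet part of the metric, here is a minimal sketch of the Fréchet distance between two sets of feature vectors, each modelled as a Gaussian. It is not the competition's scoring code: the function name `frechet_distance` is hypothetical, the features are random numbers rather than Inception activations, and the memorization penalty that turns FID into MiFID is omitted (see the evaluation page for that part).

```python
import numpy as np
from scipy import linalg


def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    Both inputs have shape (n_samples, n_features).
    """
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)
    # Matrix square root of the covariance product; tiny imaginary parts
    # can appear from numerical error and are dropped.
    covmean = linalg.sqrtm(sigma_r.dot(sigma_g))
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_r - mu_g
    return float(diff.dot(diff) + np.trace(sigma_r + sigma_g - 2.0 * covmean))


# Assumption: random stand-ins for image features (8 dimensions, 500 samples).
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
fake = real + 1.0  # same spread, mean shifted by 1 in every dimension

print(frechet_distance(real, real))  # close to 0 for identical sets
print(frechet_distance(real, fake))  # grows with the mean shift
```

Lower values mean the two feature distributions are closer, which is why a generator that produces realistic and diverse dogs should score low.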

[To table of Content](#Table-of-Content)
***

In [None]:
# import
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns
%matplotlib inline
plt.style.use('ggplot')
np.random.seed(2)


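import keras
# _get_available_gpus is a private helper of the standalone Keras TensorFlow backend;
# it may be missing in newer Keras releases
print(keras.backend.tensorflow_backend._get_available_gpus())
print(keras.backend.image_data_format()) # set_image_data_format('channels_first')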

import warnings
warnings.filterwarnings('ignore')

# Visualisation of pictures

Some interesting facts about the data:

* Some pictures contain more than one dog (even $3$ dogs).
* Some pictures show dogs together with people.
* Some pictures contain more than one person (even $4$ people).
* In some pictures the dogs occupy less than $1/5$ of the picture.
* Some pictures contain text (magazine covers, dog show photos, memes and other captioned pictures).
* Even wild predators are included, e.g. [African wild dog](https://en.wikipedia.org/wiki/African_wild_dog) or [Dingo](https://en.wikipedia.org/wiki/Dingo), but not wolves.
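
The "less than $1/5$ of the picture" observation can be checked numerically when a bounding box for the dog is known. The sketch below assumes a box given as `(xmin, ymin, xmax, ymax)` in pixels, for example taken from the dataset's annotation files; the helper name `bbox_area_fraction` and the numbers are hypothetical.

```python
def bbox_area_fraction(bbox, image_size):
    """Return the fraction of the image area covered by a bounding box.

    bbox: (xmin, ymin, xmax, ymax) in pixels; image_size: (width, height).
    """
    xmin, ymin, xmax, ymax = bbox
    width, height = image_size
    # Clip the box to the image so a malformed box cannot exceed 1.0.
    xmin, xmax = max(0, xmin), min(width, xmax)
    ymin, ymax = max(0, ymin), min(height, ymax)
    box_area = max(0, xmax - xmin) * max(0, ymax - ymin)
    return box_area / (width * height)


# Hypothetical example: a 100x120 px dog in a 500x375 px photo.
fraction = bbox_area_fraction((50, 40, 150, 160), (500, 375))
print(f'{fraction:.3f}', fraction < 1 / 5)
```

Pictures with a small fraction are candidates for cropping to the bounding box before training.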


Let's visualize $9$ random pictures from the dataset.

In [None]:
PATH = '../input/all-dogs/all-dogs/'
PATH_LIST = os.listdir(PATH)
print(f'There are {len(PATH_LIST)} pictures of dogs.')

fig, axes = plt.subplots(nrows=3, ncols=3, figsize=(12,10))

for axis in axes.flatten():
    # pick a random picture (the upper bound of randint is exclusive)
    rnd_indx = np.random.randint(0, len(PATH_LIST))
    # https://matplotlib.org/users/image_tutorial.html
    img = plt.imread(os.path.join(PATH, PATH_LIST[rnd_indx]))
    axis.imshow(img)
    # the file name encodes the breed (synset) and the picture index
    axis.set_title(PATH_LIST[rnd_indx])
    axis.set_axis_off()
plt.tight_layout(rect=[0, 0.03, 1, 0.95])

[To table of Content](#Table-of-Content)
***

# Breeds of dogs

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
.tg .tg-ntt9{background-color:#ffffff;color:#ffffff;border-color:#ffffff;text-align:left;vertical-align:top}
.tg .tg-sv5y{background-color:#ffffff;color:#ffffff;border-color:#ffffff;text-align:left}
</style>
<table class="tg">
  <tr>
    <th class="tg-sv5y">
      <a href="http://vision.stanford.edu/aditya86/ImageNetDogs/n02090622.html">
        <img src="https://i.pinimg.com/originals/91/37/80/9137801911ac107b252532cb19ef2470.gif" alt="Borzoi" style="width: 120px;"   title="Borzoi"/>
    </a>  </th>
    <th class="tg-sv5y"><a href="http://vision.stanford.edu/aditya86/ImageNetDogs/n02113799.html">
<img src="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTS2VVaklPQVwMlt5ugBZJ1bUftdKMsMkJha_cIVJlIzDAgjkcw" alt="Standard poodle" style="width: 90px;" title="Standard poodle"/>
</a></th>
    <th class="tg-ntt9"><a href="http://vision.stanford.edu/aditya86/ImageNetDogs/n02111500.html">
<img src="https://i.pinimg.com/originals/60/22/c1/6022c1255e5ee2af51e94b6eff00c2c6.png" alt="Great Pyrenees" style="width: 160px;" title="Great Pyrenees"/>
</a></th>
    <th class="tg-ntt9"><a href="http://vision.stanford.edu/aditya86/ImageNetDogs/n02102973.html">
<img src="https://rlv.zcache.ca/irish_water_spaniel_silhouette_trucker_hat-r9e97977e1b594be8be6764f36a99600b_eahvn_8byvr_307.jpg?rvtype=content" alt="Irish water spaniel" style="width: 130px;" title="Irish water spaniel"/>
</a></th>
    <th class="tg-ntt9"><a href="http://vision.stanford.edu/aditya86/ImageNetDogs/n02106166.html">
<img src="https://img2.embroiderypatterns.com/StockDesign/XLarge/King_Graphics/ed0858.jpg" alt="Border collie" style="width: 150px;" title="Border collie"/>
</a></th>
    <th class="tg-ntt9"><a href="http://vision.stanford.edu/aditya86/ImageNetDogs/n02106662.html">
<img src="https://www.publicdomainpictures.net/pictures/60000/nahled/dog-german-shepherd-silhouette.jpg" alt="German shepherd" style="width: 180px;" title="German shepherd"/>
</a></th>
  </tr>
</table>

* **The silhouettes above are clickable**. A right-click opens the [ImageNetDogs](http://vision.stanford.edu/aditya86/ImageNetDogs/) page with pictures of that breed.
* There are $120$ breeds of dogs that **differ in size, hairiness and other body characteristics**.
* Below is a visualization of the distribution of photos per breed.
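
As a small illustration of how breeds can be counted from file names alone, the sketch below groups a handful of made-up names by their synset prefix. The file names are hypothetical and only follow the `synset_index` pattern described earlier; the `.jpg` extension is an assumption.

```python
from collections import Counter

import numpy as np

# Hypothetical file names following the `<synset>_<index>` pattern.
file_names = ['n02090721_42.jpg', 'n02090721_7.jpg', 'n02113799_1.jpg',
              'n02113799_2.jpg', 'n02113799_3.jpg', 'n02106662_10.jpg']

# The synset (breed code) is everything before the first underscore.
photos_per_breed = Counter(name.split('_', 1)[0] for name in file_names)
counts = np.array(sorted(photos_per_breed.values()))

print(photos_per_breed)
print('min:', counts.min(), 'max:', counts.max(),
      '75% quantile:', np.percentile(counts, 75))
```

Run on the real directory listing, the same summary gives the minimum, maximum and quantiles of photos per breed.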

In [None]:
from collections import defaultdict
by_breeds_dict = defaultdict(list)
for breed_code_and_pict_indx in PATH_LIST:
    breed_code, pict_indx = breed_code_and_pict_indx.split('_')
    by_breeds_dict[breed_code].append(pict_indx) 

In [None]:
df_aux = pd.DataFrame.from_dict(by_breeds_dict, orient='index').T
print(f'There are {df_aux.shape[1]} breeds')

# number of photos per breed (non-null entries in each column)
photos_per_breed = df_aux.count()

fig = plt.figure(figsize=(12, 6))
fig.suptitle('Distribution of number of photos per breed')
fig.add_subplot(121)
sns.boxplot(x=photos_per_breed)
plt.xlabel('# of photos')
#plt.xticks([])
fig.add_subplot(122)
# histplot replaces the deprecated distplot (requires seaborn >= 0.11)
sns.histplot(photos_per_breed, kde=True, stat='density')
plt.xlabel('# of photos')
plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()

[To table of Content](#Table-of-Content)
***

# Local conclusion
**Knowledge of breed features, the distribution of photos and a visual analysis of the pictures may help in this image generation task.**  

_It is also possible to get pose estimates and the number of dogs for some pictures from the [ImageNetDogs](http://vision.stanford.edu/aditya86/ImageNetDogs/) page, but it is unclear whether this qualifies as External Data ("No external data can be added as a data source" and "C. External Data. You may use data other than the Competition Data (“External Data”) to develop and test your models and Submissions...")._

[To table of Content](#Table-of-Content)
***