<a href="https://colab.research.google.com/github/DarkDk123/Dog-Breed-Classification/blob/main/Dog-Breed-Classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🐶 Dog Breeds Classification Using TensorFlow

Have you ever seen a dog and wondered what breed it is?

I have. And then someone says, *"it's a Bulldog"*, and you think: how did they know that?


<img width=30% src="https://media1.tenor.com/m/InjqAJQgZngAAAAd/star-wars-get-ready-for-battle-boys.gif">

In this project we're going to be using machine learning to help us identify different breeds of dogs.

To do this, we'll be using data from the [Kaggle dog breed identification competition](https://www.kaggle.com/c/dog-breed-identification/overview). It consists of a collection of **10,000+ labelled images** of **120 different dog breeds.**

This kind of problem is called **multi-class image classification**. It's multi-class because each image has to be assigned to one of many possible breeds of dog.

Multi-class image classification is an important problem because it's the same kind of technology Tesla uses in their self-driving cars or Airbnb uses in automatically adding information to their listings.

Since the most important step in a deep learning problem is **getting the data ready** *(turning it into numbers)*, that's what we're going to start with.
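As a small illustration of what "turning it into numbers" can mean for the labels, here is a minimal sketch that maps breed names to one-hot vectors. The helper name `one_hot_encode` and the four sample labels are made up for this example; the real dataset has 120 breeds, and the notebook may encode them differently.

```python
def one_hot_encode(labels):
    """Map string labels to one-hot vectors (hypothetical helper, for illustration only)."""
    # Sort the unique class names so the column order is deterministic
    classes = sorted(set(labels))
    index_of = {name: i for i, name in enumerate(classes)}

    encoded = []
    for name in labels:
        row = [0] * len(classes)
        row[index_of[name]] = 1  # mark the column belonging to this label
        encoded.append(row)
    return classes, encoded


# Placeholder labels standing in for the real breed names
sample_labels = ["beagle", "pug", "boxer", "pug"]
classes, encoded = one_hot_encode(sample_labels)
print(classes)  # ['beagle', 'boxer', 'pug']
print(encoded)  # [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1]]
```

Each row has exactly one `1`, in the column of the breed it belongs to, which is the shape of target a multi-class classifier is typically trained against.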

---
### **We're going to go through the following TensorFlow/Deep Learning workflow:** 🚀

#### **1. Problem Definition**

We need to identify a dog's breed based on its image, classifying each dog into one of 120 different breeds.

#### **2. Get data ready for Training**

- **A. Downloading and storing the data *(download from Kaggle, store, import).***
- **B. Preparing the data *(preprocessing, creating the 3 sets, i.e. train, validation and test, and separating X & y).***

#### **3. Choose and fit/train a model.**

Based on the type of problem, either using a pretrained (TensorFlow Hub or other) model, or training one from scratch.\
([TensorFlow Hub](https://www.tensorflow.org/hub), `tf.keras.applications`, [TensorBoard](https://www.tensorflow.org/tensorboard), [EarlyStopping](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping))

#### **4. Evaluating a model.**

Making predictions on the **validation set**, comparing them with the ground truth labels, and deciding the next steps accordingly.

#### **5. Improve the model through experimentation.**

Start with about 1,000 images, make sure everything works, then increase the number of images accordingly.

#### **6. Saving, sharing and reloading your model**
Saving the model once **reasonable results** are achieved.
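
Steps 2B and 5 above (start with a small subset, then split it into training and validation sets) can be sketched roughly as follows. This is only a sketch using the standard library: the helper name `subset_and_split`, the 20% validation fraction, the seed, and the placeholder file names are all assumptions for illustration, not the notebook's actual code.

```python
import random


def subset_and_split(filenames, labels, num_images=1000, val_fraction=0.2, seed=42):
    """Pick up to num_images examples at random, then split them into train/validation."""
    if len(filenames) != len(labels):
        raise ValueError("filenames and labels must have the same length")

    # Keep each image paired with its label while shuffling
    pairs = list(zip(filenames, labels))
    random.Random(seed).shuffle(pairs)
    pairs = pairs[:num_images]

    # The first val_fraction of the subset becomes the validation set
    num_val = int(len(pairs) * val_fraction)
    val_pairs, train_pairs = pairs[:num_val], pairs[num_val:]
    return train_pairs, val_pairs


# Placeholder data standing in for real image paths and breed labels
filenames = [f"train/img_{i}.jpg" for i in range(5000)]
labels = [f"breed_{i % 120}" for i in range(5000)]

train_pairs, val_pairs = subset_and_split(filenames, labels, num_images=1000)
print(len(train_pairs), len(val_pairs))  # 800 200
```

Once the pipeline works on the small subset, raising `num_images` is the only change needed to experiment with more data.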

---

### Getting our workspace ready

Before we get started: we'll be using **TensorFlow 2.x** and **TensorFlow Hub**, so make sure all the packages listed in [`requirements.txt`](./requirements.txt) are installed.

Training is compute-intensive, so please use a machine with a **GPU**, or Google Colab, which offers free GPU resources.\
On Colab, all of the packages come pre-installed.
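
A quick way to check the workspace is sketched below. It reports whether TensorFlow and TensorFlow Hub are installed and whether TensorFlow can see a GPU. The helper name `describe_environment` is hypothetical; the sketch only imports TensorFlow when it is actually installed, so it also runs on a machine without it.

```python
import importlib.util


def describe_environment():
    """Report installed packages and GPU visibility (hypothetical helper)."""
    info = {
        "tensorflow_installed": importlib.util.find_spec("tensorflow") is not None,
        "tensorflow_hub_installed": importlib.util.find_spec("tensorflow_hub") is not None,
        "tensorflow_version": None,
        "gpu_available": False,
    }
    if info["tensorflow_installed"]:
        import tensorflow as tf

        info["tensorflow_version"] = tf.__version__
        # An empty list means TensorFlow cannot see any GPU device
        info["gpu_available"] = len(tf.config.list_physical_devices("GPU")) > 0
    return info


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

On Colab, if `gpu_available` comes back `False`, the runtime type can be switched to GPU from the runtime settings.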


Now, let's start by downloading the data.

### **Getting data ready for Training**

##### **Downloading & Storing the data.**

Since much of machine learning is getting your data ready to be used with a machine learning model, we'll take extra care getting it set up.

There are a few ways we could do this. Many of them are detailed in the [Google Colab notebook on I/O (input and output)](https://colab.research.google.com/notebooks/io.ipynb).

And because the data we're using is hosted on Kaggle, we could even use the [Kaggle API](https://www.kaggle.com/docs/api).

This is great but what if the data you want to use wasn't on Kaggle?

**[Optional for colab]**\
One method is to **upload it to your Google Drive**, mount your drive in this notebook and import the file.

```python
# Running this cell will provide you with a token to link your drive to this notebook
from google.colab import drive

drive.mount("/content/drive")
```

```
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
```


Following the prompts from the cell above, if everything worked, you should see a "drive" folder available under the Files tab.

This means we'll be able to access files in our Google Drive right in this notebook.

For this project, I've [downloaded the data from Kaggle](https://www.kaggle.com/c/dog-breed-identification/data) and uploaded it to my Google Drive as a .zip file under the folder "Data".

Alternatively, you can use the Kaggle API ***(token required!)***:

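```python
# Create the Kaggle config folder and move the API key into it
!mkdir -p /root/.kaggle
!mv "kaggle.json" "/root/.kaggle/"
!chmod 600 /root/.kaggle/kaggle.json  # keep the key readable only by you

# Download the competition data using the API
!kaggle competitions download -c dog-breed-identification
```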

Finally, to access it, we'll have to unzip it.\
***(Or unzip it locally)***

```python
# Use the '-d' parameter as the destination for where the files should go
!unzip "drive/My Drive/Data/dog-breed-identification.zip" -d "drive/My Drive/Data/"
```

*Note: Paths can differ.*
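
If you are unzipping locally, or would rather stay in Python than call the `unzip` command, the standard-library `zipfile` module can do the same job. The sketch below is one possible way; the helper name `unzip_archive` is hypothetical, and the demo builds a tiny throwaway archive in a temporary folder so it runs without the real download.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_archive(zip_path, dest_dir):
    """Extract every file in zip_path into dest_dir and return the archived names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the destination if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Demo on a throwaway archive (placeholder content, not the real dataset)
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("hello.txt", "placeholder file")

    extracted = unzip_archive(demo_zip, Path(tmp) / "Data")
    extracted_exists = (Path(tmp) / "Data" / "hello.txt").exists()

print(extracted)  # ['hello.txt']
```

For the real data, you would pass the path of the downloaded .zip file and your own destination folder instead of the demo paths.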

Now all of our data should be in the `/data` folder.
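
Before moving on, it can help to confirm that the files landed where you expect. The sketch below assumes the extracted archive contains a `labels.csv` file plus `train/` and `test/` folders of `.jpg` images; that layout and the helper name `summarize_data_dir` are assumptions, so adjust them to match your own folder. The demo uses a temporary folder with a few empty placeholder files.

```python
import tempfile
from pathlib import Path


def summarize_data_dir(data_dir):
    """Count images and check for the labels file (assumed layout, see text above)."""
    root = Path(data_dir)
    return {
        "labels_csv_found": (root / "labels.csv").is_file(),
        "train_images": len(list((root / "train").glob("*.jpg"))),
        "test_images": len(list((root / "test").glob("*.jpg"))),
    }


# Demo on a temporary folder with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train").mkdir()
    (root / "test").mkdir()
    (root / "labels.csv").touch()
    for name in ["train/a.jpg", "train/b.jpg", "test/c.jpg"]:
        (root / name).touch()

    summary = summarize_data_dir(root)

print(summary)  # {'labels_csv_found': True, 'train_images': 2, 'test_images': 1}
```

Pointing `summarize_data_dir` at your real data folder should report thousands of training images; a count of zero usually means the path is wrong.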