# Your First Image Classifier: Using k-NN to Classify Images
# Fetch Data

The goal with this dataset is to correctly classify an image as containing a dog, a cat, or a panda.
With only 3,000 images, the Animals dataset is another **introductory** dataset
on which we can quickly train a k-NN model and obtain initial results. The accuracy will not be great, but it can serve as a baseline.

Let's take the following steps:

1. Load libraries
2. Fetch raw data
3. Upload raw data to W&B

<center><img width="600" src="https://drive.google.com/uc?export=view&id=1a-nyAPNPiVh-Xb2Pu2t2p-BhSvHJS0pO"></center>

## Step 01: Setup

Start out by installing the experiment tracking library and setting up your free W&B account:


*   **pip install wandb** – Install the W&B library
*   **import wandb** – Import the wandb library
*   **wandb login** – Log in to your W&B account so you can log all your metrics in one place

In [None]:
!pip install wandb -qU

In [None]:
import wandb
wandb.login()

### Download the dataset zip file

In [None]:
# download the dataset
!gdown https://drive.google.com/uc?id=1drh-JoatOlE26bdFQZ-dubj2A_RB8MNQ
!unzip -qq dataset.zip
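
A quick sanity check after unzipping can confirm that the archive extracted as expected. The sketch below assumes the zip unpacks into an `animals` folder with one subfolder per class, as the paths used later in this notebook suggest. `count_images_per_class` and `IMAGE_EXTENSIONS` are hypothetical names introduced only for illustration.

```python
import os
from collections import Counter

# extensions treated as images; this set is an assumption for the sketch
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def count_images_per_class(root):
    """Count image files under root, grouped by their parent folder name."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower().endswith(IMAGE_EXTENSIONS):
                counts[os.path.basename(dirpath)] += 1
    return dict(counts)


# only run the check if the dataset folder is present
if os.path.isdir("animals"):
    print(count_images_per_class("animals"))
```

If the totals do not add up to the 3,000 images mentioned above, the download or the unzip step probably did not complete.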

### Import Packages

In [None]:
# import the necessary packages
from imutils import paths
import os
import logging

In [None]:
# configure logging
# reference for a logging obj
logger = logging.getLogger()

# set level of logging
logger.setLevel(logging.INFO)

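# create a console handler with a timestamped format
c_handler = logging.StreamHandler()
c_format = logging.Formatter(fmt="%(asctime)s %(message)s", datefmt="%d-%m-%Y %H:%M:%S")
c_handler.setFormatter(c_format)

# replace any existing handlers so messages are not duplicated;
# indexing handlers[0] would raise IndexError when the logger has none
logger.handlers.clear()
logger.addHandler(c_handler)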
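
To see what this formatter produces and how the `INFO` level filters messages, here is a small self-contained sketch. It uses a separate named logger that writes to an in-memory buffer, so the root logger configured above is left untouched. The logger name `format_demo` and the variables `buffer` and `captured` are hypothetical, chosen just for this example.

```python
import io
import logging

# a separate, named logger so the notebook's root logger is not modified
demo_logger = logging.getLogger("format_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep demo messages out of the root logger

# write to an in-memory buffer instead of the console
buffer = io.StringIO()
demo_handler = logging.StreamHandler(buffer)
demo_handler.setFormatter(
    logging.Formatter(fmt="%(asctime)s %(message)s", datefmt="%d-%m-%Y %H:%M:%S")
)
demo_logger.addHandler(demo_handler)

demo_logger.info("[INFO] loading images...")             # kept: level is INFO
demo_logger.debug("this message is below INFO, dropped")  # filtered out

captured = buffer.getvalue()
print(captured, end="")
```

Each emitted line starts with a day-month-year timestamp followed by the message, and only the `info` call appears in the output.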

## Step 02: Upload raw data

In [None]:
# since we are using Jupyter Notebooks we can replace our argument
# parsing code with *hard coded* arguments and values
args = {
    "dataset": "animals",
    "project_name": "first_image_classifier",
    "artifact_name": "animals_raw_data"
}

In [None]:
# replace the entity with your own W&B username or team name
run = wandb.init(entity="ivanovitch-silva", project=args["project_name"], job_type="fetch_data")

# create an artifact for all the raw data
raw_data = wandb.Artifact(args["artifact_name"], type="raw_data")

# grab the list of images that we'll be describing
logger.info("[INFO] loading images...")
imagePaths = list(paths.list_images(args["dataset"]))

# add every image to the artifact, dropping the top-level dataset folder
# from its name, e.g. animals/dogs/dogs_0001.jpg -> dogs/dogs_0001.jpg
for img in imagePaths:
    label = img.split(os.path.sep)
    raw_data.add_file(img, name=os.path.join(label[-2], label[-1]))

# save artifact to W&B
run.log_artifact(raw_data)
run.finish()
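
The name passed to `add_file` keeps only the class folder and the file name, so the artifact layout does not depend on where the dataset lives locally. A minimal sketch of that mapping, runnable without W&B, is shown below. The helper `artifact_entry_name` is a hypothetical name, not part of the wandb API.

```python
import os


def artifact_entry_name(image_path):
    """Return '<class folder>/<file name>' for an image path."""
    parts = os.path.normpath(image_path).split(os.path.sep)
    # the last two components are the class folder and the file name
    return os.path.join(parts[-2], parts[-1])


example = os.path.join("animals", "dogs", "dogs_0001.jpg")
print(artifact_entry_name(example))
```

Keeping the class folder in the entry name means the label can still be recovered from the path when the artifact is downloaded in a later step.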