# Quickstart Guide

This guide will give a quick intro to training PyTorch models with HugsVision. We'll start by loading in some data and defining a model, then we'll train it for a few epochs and see how well it does.

**Note**: The easiest way to use this tutorial is as a Colab notebook, which lets you dive in with no setup. We recommend enabling a free GPU via

> **Runtime**   →   **Change runtime type**   →   **Hardware Accelerator: GPU**

**Note**: You need Python 3.6 or later to run the scripts.

## Install HugsVision

First, we install HugsVision if it is not already available.

In [None]:
try:
    import hugsvision
except ImportError:
    !pip install -q hugsvision
    import hugsvision
    
print(hugsvision.__version__)

## Downloading Data

First, we need to download the `Kvasir Dataset v2` from [here](https://datasets.simula.no/kvasir/). It weighs about 2.3 GB.

## Loading Data

Once the dataset is available as an image folder under `./data/`, we can start loading the data.

In [None]:
from hugsvision.dataio.VisionDataset import VisionDataset

train, test, id2label, label2id = VisionDataset.fromImageFolder(
	"./data/",
	test_ratio   = 0.15,
	balanced     = True,
	augmentation = True,
)
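
Before loading, it can help to check that the image folder looks the way you expect. The sketch below assumes the common layout of one sub-folder per class under `./data/`; the helper name `count_images_per_class` and the list of extensions are illustrative and not part of HugsVision.

```python
from pathlib import Path

# Extensions treated as images: an assumption for this sketch,
# not HugsVision's own list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Return {class_folder_name: image_count} for each sub-folder of `root`."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files such as a README
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


# Only runs if the dataset folder from the cell above exists.
if Path("./data/").is_dir():
    for name, n in count_images_per_class("./data/").items():
        print(f"{name:>24}: {n}")
```

An empty or missing class folder at this point usually means the archive was extracted one level too deep or too shallow.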

## Choose an image classifier model on HuggingFace

Now we can choose the base model that we will fine-tune to fit our needs.

The choice is limited, since few models are available on HuggingFace for this task yet.

To be compatible with `HugsVision`, the model must be exported in `PyTorch` and support the `image-classification` task.

Models matching these criteria: https://huggingface.co/models?filter=pytorch&pipeline_tag=image-classification&sort=downloads

At the time of writing, I recommend the following models:

*   `google/vit-base-patch16-224`
*   `facebook/deit-base-distilled-patch16-224`
*   `microsoft/beit-base-patch16-224`

In [None]:
huggingface_model = 'facebook/deit-base-distilled-patch16-224'
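
The training cell further down imports the DeiT classes from `transformers`, which match the DeiT checkpoint chosen here. If you pick one of the other recommended models, the classes need to change accordingly. The lookup below is a small sketch of that pairing: the helper `classes_for` is hypothetical, and deriving the model family from the first token of the checkpoint name is an assumption that happens to hold for the three models listed above.

```python
# Class names are kept as strings so this sketch runs without importing
# transformers. The pairs are the feature extractor and classification
# classes that transformers provides for each family.
MODEL_CLASSES = {
    "vit": ("ViTFeatureExtractor", "ViTForImageClassification"),
    "deit": ("DeiTFeatureExtractor", "DeiTForImageClassification"),
    "beit": ("BeitFeatureExtractor", "BeitForImageClassification"),
}


def classes_for(model_id):
    """Return (feature_extractor_class_name, model_class_name) for a model id."""
    # e.g. "facebook/deit-base-distilled-patch16-224" -> "deit"
    family = model_id.split("/")[-1].split("-")[0].lower()
    if family not in MODEL_CLASSES:
        raise ValueError(f"No class pair known for model family {family!r}")
    return MODEL_CLASSES[family]


print(classes_for("facebook/deit-base-distilled-patch16-224"))
```

For a model outside these three families, check its model card for the matching classes rather than extending the table by guesswork.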

## Train the model

Once the model is chosen, we can build the `Trainer` and start the fine-tuning:

In [None]:

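from hugsvision.nnet.VisionClassifierTrainer import VisionClassifierTrainer
from transformers import DeiTFeatureExtractor, DeiTForImageClassification

# `args` is not defined in this notebook, so the run settings are set here.
# These are example values chosen to resemble the output path used in the
# prediction step; adjust them to your setup (lower max_epochs for a quick trial).
model_name = "KVASIR_V2_MODEL_DEIT"
output_dir = "./out/"
max_epochs = 20

trainer = VisionClassifierTrainer(
    model_name   = model_name,
    train        = train,
    test         = test,
    output_dir   = output_dir,
    max_epochs   = max_epochs,
    batch_size   = 32, # On RTX 2080 Ti
    test_ratio   = 0.15,
    lr           = 2e-5,
    fp16         = True,
    balanced     = True,
    augmentation = True,
    # Classification head sized to our labels, initialised from the chosen checkpoint
    model = DeiTForImageClassification.from_pretrained(
        huggingface_model,
        num_labels = len(label2id),
        label2id   = label2id,
        id2label   = id2label
    ),
    # Preprocessing (resize, normalisation) matching the chosen checkpoint
    feature_extractor = DeiTFeatureExtractor.from_pretrained(
        huggingface_model,
    ),
)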

## Evaluate F1-Score

The F1-score gives a better picture of the predictions for each label and helps us find out whether there are any anomalies with a specific label.

In [None]:
hyp, ref = trainer.evaluate_f1_score()

```
                        precision    recall  f1-score   support

    dyed-lifted-polyps       0.93      0.94      0.93       145
dyed-resection-margins       0.95      0.95      0.95       169
           esophagitis       0.86      0.86      0.86       126
          normal-cecum       0.99      0.99      0.99       144
        normal-pylorus       1.00      0.98      0.99       172
         normal-z-line       0.87      0.88      0.88       140
                polyps       0.98      0.99      0.98       146
    ulcerative-colitis       0.97      0.97      0.97       158

              accuracy                           0.95      1200
             macro avg       0.94      0.94      0.94      1200
          weighted avg       0.95      0.95      0.95      1200
```
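
To make the columns of this report concrete, here is a minimal sketch that recomputes precision, recall, F1 and support per label, then the macro average (plain mean of the per-label F1) and the weighted average (mean weighted by support). It assumes `hyp` and `ref` are equal-length sequences of predicted and reference labels; the exact types returned by `evaluate_f1_score()` are not shown in this guide, and the function names and toy labels below are illustrative.

```python
from collections import Counter


def per_class_f1(hyp, ref):
    """Return {label: (precision, recall, f1, support)} from two label sequences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for predicted, expected in zip(hyp, ref):
        if predicted == expected:
            tp[expected] += 1
        else:
            fp[predicted] += 1  # predicted this label, but it was wrong
            fn[expected] += 1   # missed the true label
    report = {}
    for label in sorted(set(ref) | set(hyp)):
        predicted_total = tp[label] + fp[label]
        support = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, support)
    return report


def macro_and_weighted_f1(report):
    """Return (macro_f1, weighted_f1) from the output of per_class_f1."""
    total = sum(support for _, _, _, support in report.values())
    macro = sum(f1 for _, _, f1, _ in report.values()) / len(report)
    weighted = sum(f1 * support for _, _, f1, support in report.values()) / total
    return macro, weighted


# Toy labels for illustration only, not taken from the Kvasir run above.
ref = ["polyps", "polyps", "polyps", "esophagitis", "esophagitis", "normal-z-line"]
hyp = ["polyps", "polyps", "esophagitis", "esophagitis", "normal-z-line", "normal-z-line"]

report = per_class_f1(hyp, ref)
for label, (p, r, f1, support) in report.items():
    print(f"{label:>16}  {p:.2f}  {r:.2f}  {f1:.2f}  {support}")
print("macro / weighted F1:", macro_and_weighted_f1(report))
```

A label whose F1 sits well below the macro average, such as `esophagitis` and `normal-z-line` in the report above, is the kind of anomaly worth inspecting.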

## Make a prediction

Before running inference, rename the `./out/MODEL_PATH/config.json` file found in the model output to `./out/MODEL_PATH/preprocessor_config.json`.
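
This step can also be done from Python. The sketch below deliberately copies the file instead of renaming it, on the assumption that loading the model may still need `config.json`; use `Path.rename` instead if you want a literal rename. The helper name `ensure_preprocessor_config` is hypothetical, and `./out/MODEL_PATH/` remains a placeholder for your own output folder.

```python
import shutil
from pathlib import Path


def ensure_preprocessor_config(model_dir):
    """Copy config.json to preprocessor_config.json in `model_dir` if it is missing."""
    model_dir = Path(model_dir)
    source = model_dir / "config.json"
    target = model_dir / "preprocessor_config.json"
    if target.exists():
        return target  # already in place, nothing to do
    if not source.exists():
        raise FileNotFoundError(f"{source} not found")
    shutil.copyfile(source, target)  # keeps the original config.json untouched
    return target


# Example call, with the placeholder path replaced by your model folder:
# ensure_preprocessor_config("./out/MODEL_PATH/")
```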

In [None]:
from transformers import DeiTFeatureExtractor, DeiTForImageClassification
from hugsvision.inference.VisionClassifierInference import VisionClassifierInference

path = "./out/KVASIR_V2_MODEL_DEIT/20_2021-08-20-01-46-44/model/"
img  = "../../../samples/kvasir_v2/dyed-lifted-polyps.jpg"

classifier = VisionClassifierInference(
    feature_extractor = DeiTFeatureExtractor.from_pretrained(path),
    model = DeiTForImageClassification.from_pretrained(path),
)

label = classifier.predict(img_path=img)
print("Predicted class:", label)
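
To classify several images in one go, the same `predict(img_path=...)` call can be wrapped in a loop. This is a sketch only: `predict_folder` is a hypothetical helper, and it assumes the samples folder contains other `.jpg` files besides the one used above.

```python
from pathlib import Path


def predict_folder(predict_fn, folder, pattern="*.jpg"):
    """Return {file_name: predicted_label} for every matching image in `folder`."""
    return {
        path.name: predict_fn(img_path=str(path))
        for path in sorted(Path(folder).glob(pattern))
    }


# With the classifier defined above:
# predictions = predict_folder(classifier.predict, "../../../samples/kvasir_v2/")
# for name, label in predictions.items():
#     print(name, "->", label)
```

Passing `classifier.predict` as an argument keeps the helper independent of the model, so it can be tried out with any function that accepts an `img_path` keyword.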