# Setting up DVC in a fastai project
> Using DVC to manage your data, training params, models, and metrics in a fastai project.

## Example: Pets breed classification dataset


First off, let's take the simplest example there is to get started.

This notebook is the [first lesson from the fastai course](https://github.com/fastai/fastai2/blob/master/nbs/course/lesson1-pets.ipynb), but with metrics tracked by DVC. Before doing anything else, I wanted to make sure I could train a basic model with fastai2.

In [1]:
%matplotlib inline

In [2]:
from fastai2.vision.all import *
set_seed(2)

In [None]:
bs = 64

In [None]:
path = untar_data(URLs.PETS); path

In [None]:
path_anno = path/'annotations'
path_img = path/'images'

In [None]:
fnames = get_image_files(path_img)
fnames

In [None]:
dls = ImageDataLoaders.from_name_re(
    path, fnames, pat=r'(.+)_\d+\.jpg$', item_tfms=Resize(460), bs=bs,
    batch_tfms=[*aug_transforms(size=224, min_scale=0.75), Normalize.from_stats(*imagenet_stats)])

In [None]:
dls.show_batch(max_n=9, figsize=(7,6))

In [None]:
print(dls.vocab)
len(dls.vocab),dls.c

In [None]:
learn = cnn_learner(dls, resnet34, metrics=error_rate)

In [None]:
learn.fit_one_cycle(2, 1e-2)

In [None]:
learn.save('pets-resnet34-2epochs')

## Adding things to DVC in the example above

- Add data to DVC
- Add training as a stage to DVC
- Add evaluation as a stage to DVC
- Add metrics callbacks to store scalar evaluation metrics in DVC
- Add metrics callbacks to store continuous metrics in DVC

This means we'll need to move certain parts of the example above into Python files in `src`.
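To make the two metrics items in the list above more concrete, here is a minimal sketch of the file-writing side, using only the standard library. DVC can read scalar metrics from JSON files and plot data from CSV files, so the idea is one JSON file for the final scores and one CSV file that grows by a row per epoch. The function names and file layout are my own assumptions, not something fastai or DVC provides.

```python
import csv
import json
from pathlib import Path


def write_scalar_metrics(path, metrics):
    """Write final scalar metrics (e.g. error_rate) to a JSON file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        json.dump(metrics, f, indent=2)


def append_epoch_metrics(path, row):
    """Append one row of per-epoch metrics to a CSV file, writing the header once."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

A training script in `src` could call `append_epoch_metrics` from a callback after each epoch and `write_scalar_metrics` once after evaluation. How these hook into the fastai training loop is left open here.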

In [None]:
import shutil

### Adding data to DVC

This should be as easy as untarring the dataset into a local directory inside the project and adding that directory to DVC:

In [6]:
# Untar the data into our current project instead of ~/.fastai
path = untar_data(URLs.PETS, dest="../", force_download=True); path

A new version of this dataset is available, downloading...


Path('../oxford-iiit-pet')

In [None]:
!dvc add ../oxford-iiit-pet
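The same step can be scripted from Python, which is handy once this moves out of the notebook. The sketch below relies on the fact that `dvc add <dir>` writes a small `<dir>.dvc` pointer file next to the directory and updates the `.gitignore` in the same folder; those two files are what goes into Git. The helper names are hypothetical, and running `track_with_dvc` assumes the project is already an initialised Git and DVC repository.

```python
import subprocess
from pathlib import Path


def dvc_add_commands(data_dir):
    """Build the commands that track `data_dir` with DVC and stage the pointer file in Git."""
    data_dir = Path(data_dir)
    pointer = data_dir.with_name(data_dir.name + ".dvc")  # written by `dvc add`
    gitignore = data_dir.parent / ".gitignore"            # updated by `dvc add`
    return [
        ["dvc", "add", str(data_dir)],
        ["git", "add", str(pointer), str(gitignore)],
    ]


def track_with_dvc(data_dir):
    """Run the commands; requires an initialised Git + DVC repository."""
    for cmd in dvc_add_commands(data_dir):
        subprocess.run(cmd, check=True)
```

For the dataset above, this would be called with the path that `untar_data` printed, for example `track_with_dvc("../oxford-iiit-pet")`.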

## Using DVC to manage experiments

- `dvc plots`: visualising metrics for a single experiment
- Changing the training script and tracking the resulting change in metrics
- `dvc plots diff`: visualising metrics across a number of experiments
- Using params: changing the network architecture and tracking the resulting change in metrics
- Using params: changing the epoch count and fine-tuning
- Using params: changing other hyperparameters
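For the "using params" items above, the training script needs to read its hyperparameters from a file that DVC can track. Here is a rough sketch of that, assuming a hypothetical `train` section with `arch`, `epochs`, `lr` and `bs` keys whose defaults mirror the values used earlier in this notebook. DVC's default params file is `params.yaml`; I use a JSON file here only to stay within the standard library, since DVC accepts JSON params files as well.

```python
import json
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass
class TrainParams:
    # Hypothetical parameter names; defaults mirror the values used earlier in this notebook.
    arch: str = "resnet34"
    epochs: int = 2
    lr: float = 1e-2
    bs: int = 64


def load_params(path="params.json", section="train"):
    """Read one section of a params file, falling back to defaults for missing keys."""
    path = Path(path)
    raw = json.loads(path.read_text()).get(section, {}) if path.exists() else {}
    known = {f.name for f in fields(TrainParams)}
    unknown = set(raw) - known
    if unknown:
        # Fail loudly on typos instead of silently training with defaults.
        raise KeyError(f"Unknown parameters in {path}: {sorted(unknown)}")
    return TrainParams(**raw)
```

The training script could then map `arch` to a model with a small dictionary such as `{'resnet34': resnet34, 'resnet50': resnet50}` and pass `epochs` and `lr` to `fit_one_cycle`, so that changing the params file is the only edit needed between experiments.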

In [4]:
untar_data??