# Lightweight Development Pipelines with DVC

In this notebook we highlight the most important elements of DVC. Extensive documentation is available on the [DVC website](https://dvc.org).

As a showcase we will implement a simple classification pipeline.

### Some Preparations

In [None]:
# -f: reinitialize even if a .dvc/ directory already exists
# --no-scm: run DVC without Git, so we don't interfere with the workshop's Git repository
!dvc init -f --no-scm

Optional: add a remote storage and make it the default (`-d`). Here it is a local directory, but it could just as well be S3, GCS, SSH, ...

In [None]:
!dvc remote add -d -f local_storage /tmp/dvc_introduction

Let's check our current status. Note: DVC has no Git-like staging area. Instead, it keeps a local cache directory that is synced with the remote via `dvc push` and `dvc pull`. The `-c` flag of `dvc status` compares the local cache with the remote.

In [None]:
!dvc status -c
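
To make the cache idea more concrete, here is a minimal sketch of a content-addressed store. It assumes only the general principle (files are stored under a name derived from their hash) and is not DVC's actual implementation. The function names `store` and `missing_in_remote` and the directory layout are made up for this illustration.

```python
from __future__ import annotations

import hashlib
import shutil
import tempfile
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of a file's content."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store(path: Path, store_dir: Path) -> str:
    """Copy a file into store_dir under a name derived from its hash."""
    md5 = file_md5(path)
    target = store_dir / md5[:2] / md5[2:]
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():  # identical content is stored only once
        shutil.copy2(path, target)
    return md5


def stored_hashes(store_dir: Path) -> set[str]:
    """Rebuild the full hashes from the two-level directory layout."""
    return {p.parent.name + p.name for p in store_dir.glob("*/*") if p.is_file()}


def missing_in_remote(cache_dir: Path, remote_dir: Path) -> set[str]:
    """Hashes present in the local cache but not yet in the remote."""
    return stored_hashes(cache_dir) - stored_hashes(remote_dir)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    cache, remote = root / "cache", root / "remote"
    cache.mkdir()
    remote.mkdir()
    data = root / "data.txt"
    data.write_bytes(b"hello")

    store(data, cache)  # hypothetical stand-in for adding a file to the cache
    before_push = missing_in_remote(cache, remote)
    store(data, remote)  # hypothetical stand-in for a push to the remote
    after_push = missing_in_remote(cache, remote)

print(len(before_push), len(after_push))
```

In this toy model, a status check against the remote boils down to a set difference between two collections of hashes.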

That wasn't too surprising, since we haven't tracked anything yet.

We can add files to DVC versioning either explicitly with `dvc add` or implicitly as outputs of a pipeline stage.

### Building a Pipeline

Each `dvc run` call below defines and executes one stage: `-n` names the stage, `-d` declares a dependency, `-o` declares a cached output and `-O` declares an output that is not put into the cache. Note that DVC 3.0 and later no longer ship `dvc run`; there, `dvc stage add` followed by `dvc repro` plays the same role.

In [None]:
!mkdir -p output-introduction

In [None]:
%%sh 
dvc run -n configure \
        -d dvc_introduction.py \
        -o output-introduction/config.pickle \
        python dvc_introduction.py configure output-introduction/config.pickle

In [None]:
%%sh 
dvc run -n train \
        -d dvc_introduction.py \
        -d output-introduction/config.pickle \
        -d ../../fruits \
        -o output-introduction/model.h5 \
        python dvc_introduction.py train_model ../../fruits output-introduction/config.pickle output-introduction/model.h5

In [None]:
%%sh 
dvc run -n export \
        -d dvc_introduction.py \
        -d output-introduction/model.h5 \
        -O ../models/fruits/2 \
        python dvc_introduction.py export output-introduction/model.h5 ../models/fruits/2

### Inspecting and Modifying a Pipeline 

In [None]:
!dvc dag
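
The graph printed above follows from the declared dependencies and outputs: a stage depends on another stage whenever one of its `-d` paths is an `-o`/`-O` path of that stage. The following sketch reproduces this reasoning for the three stages of this notebook. It is an illustration of the idea, not DVC's code, and it assumes Python 3.9+ for `graphlib`. The `STAGES` dictionary and `build_graph` are names chosen for this example.

```python
from graphlib import TopologicalSorter

# Dependencies and outputs exactly as declared in the stage definitions above.
STAGES = {
    "configure": {
        "deps": ["dvc_introduction.py"],
        "outs": ["output-introduction/config.pickle"],
    },
    "train": {
        "deps": [
            "dvc_introduction.py",
            "output-introduction/config.pickle",
            "../../fruits",
        ],
        "outs": ["output-introduction/model.h5"],
    },
    "export": {
        "deps": ["dvc_introduction.py", "output-introduction/model.h5"],
        "outs": ["../models/fruits/2"],
    },
}


def build_graph(stages):
    """Map each stage to the set of stages that produce one of its dependencies."""
    producer = {out: name for name, stage in stages.items() for out in stage["outs"]}
    return {
        name: {producer[dep] for dep in stage["deps"] if dep in producer}
        for name, stage in stages.items()
    }


graph = build_graph(STAGES)
# static_order() yields every stage after all of its upstream stages.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

Paths such as `dvc_introduction.py` and `../../fruits` are not produced by any stage, so they appear only as leaves and do not create edges between stages.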

In [None]:
!dvc status -c

In [None]:
!dvc push

In [None]:
!dvc repro

Now modify one of the dependencies and run `dvc repro` again: DVC re-executes only the stages whose dependencies have changed.
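
Which stages are affected by a change can be worked out by hand. The sketch below is a deliberately simplified model: it conservatively assumes that a re-executed stage always changes its outputs, which real runs need not do. `STAGES` and `stages_to_rerun` are hypothetical names, and the stage data is copied from the definitions in this notebook.

```python
# Stages listed in execution order (dicts preserve insertion order).
STAGES = {
    "configure": {
        "deps": ["dvc_introduction.py"],
        "outs": ["output-introduction/config.pickle"],
    },
    "train": {
        "deps": [
            "dvc_introduction.py",
            "output-introduction/config.pickle",
            "../../fruits",
        ],
        "outs": ["output-introduction/model.h5"],
    },
    "export": {
        "deps": ["dvc_introduction.py", "output-introduction/model.h5"],
        "outs": ["../models/fruits/2"],
    },
}


def stages_to_rerun(changed_paths, stages=STAGES):
    """Return the stages affected by the changed paths, in execution order."""
    dirty = set(changed_paths)
    rerun = []
    for name, stage in stages.items():
        if dirty.intersection(stage["deps"]):
            rerun.append(name)
            # Simplification: treat every output of a re-run stage as changed.
            dirty.update(stage["outs"])
    return rerun


print(stages_to_rerun({"../../fruits"}))
print(stages_to_rerun({"dvc_introduction.py"}))
```

Under this model, changing the training data leaves `configure` untouched, while editing `dvc_introduction.py` affects every stage because all three declare it as a dependency.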

#### New Features

Download a file from another (external) Git + DVC repository. `dvc get` only copies the file; it does not track it.

In [None]:
!dvc get https://github.com/iterative/example-get-started model.pkl

In [None]:
!rm model.pkl
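
A file in another repository can also be read directly from Python, without a separate download step. The sketch below assumes that the `dvc` Python package is installed and relies on its `dvc.api.open` function. The helper name `read_tracked_file` is made up for this example, and calling it requires network access.

```python
from typing import Optional

REPO_URL = "https://github.com/iterative/example-get-started"


def read_tracked_file(path: str, repo: str = REPO_URL, rev: Optional[str] = None) -> bytes:
    """Return the raw bytes of a DVC-tracked file in a (remote) repository."""
    # Imported lazily so this module can be loaded even where DVC is missing.
    import dvc.api

    # rev=None reads from the repository's default branch.
    with dvc.api.open(path, repo=repo, rev=rev, mode="rb") as handle:
        return handle.read()
```

With DVC installed, `read_tracked_file("model.pkl")` would fetch the same file that the `dvc get` call above downloads.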

Import a file from another (external) Git + DVC repository. Unlike `dvc get`, `dvc import` also creates a `.dvc` file that records where the file came from, so it stays tracked.

In [None]:
!dvc import https://github.com/iterative/example-get-started model.pkl

In [None]:
!cat model.pkl.dvc

#### Clean-up

In [None]:
%%sh
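# Remove DVC's files and internals (.dvc/, dvc.yaml, .dvc files) without asking for confirmation
dvc destroy -f
# Remove the imported model, the pipeline outputs and the local "remote" directory
rm -rf model.pkl
rm -rf output-introduction
rm -rf /tmp/dvc_introduction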