# DVC Options - Metrics and Plots
## The DVC way

This part covers metrics and plots produced by ZnTrack Nodes.
All `dvc run` options listed [here](https://dvc.org/doc/command-reference/run#options) can be used via `dvc.<option>`, with the exception of params, which are handled automatically.
Each of these options takes either a `str` or a `pathlib.Path` pointing to the file in which the content should be stored.
As shown before, `dvc.deps` can also take another `Node` as an argument.

In [1]:
from zntrack import Node, dvc, zn, config
from pathlib import Path
import json
import pandas as pd
import numpy as np

In [2]:
config.nb_name = "04_metrics_and_plots.ipynb"

In [3]:
from zntrack.utils import cwd_temp_dir

temp_dir = cwd_temp_dir()

In [4]:
!git init
!dvc init

Initialized empty Git repository in C:/Users/fabia/AppData/Local/Temp/tmps4fejpcl/.git/
Initialized DVC repository.

You can now commit the changes to git.

+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|             <https://dvc.org/doc/user-guide/analytics>              |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: <https://dvc.org/doc>
- Get help and share ideas: <https://dvc.org/chat>
- Star us on GitHub: <https://github.com/iterative/dvc>


In the following we define a simple Node that produces a metric and a plot output using `json` and `pandas`.
We will queue multiple experiments with different outputs and then compare them afterwards.
With `Node.write_graph(silent=True)` we can reduce the amount of log output that is displayed.

In [5]:
class MetricAndPlot(Node):
    my_metric: Path = dvc.metrics(Path("my_metric.json"))
    my_plots: Path = dvc.plots("my_plots.csv")
    pre_factor = zn.params()

    def __init__(self, pre_factor=1.0, **kwargs):
        super().__init__(**kwargs)
        self.pre_factor = pre_factor

    def run(self):
        self.my_metric.write_text(
            json.dumps(
                {"metric_1": 17 * self.pre_factor, "metric_2": 42 * self.pre_factor}
            )
        )

        x_data = np.linspace(0, 1.0 * self.pre_factor, 1000)
        y_data = np.exp(x_data)
        df = pd.DataFrame({"y": y_data, "x": x_data}).set_index("x")

        df.to_csv(self.my_plots)
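
Independent of ZnTrack, the two files this Node is meant to write can be sketched with the standard library alone. The helper names `metric_payload` and `plot_csv` are hypothetical and only mirror the arithmetic in `run()`; the Node itself uses `numpy` and `pandas` with 1000 points.

```python
import csv
import io
import json
import math


def metric_payload(pre_factor: float) -> dict:
    """Return the two metric values that end up in my_metric.json."""
    return {"metric_1": 17 * pre_factor, "metric_2": 42 * pre_factor}


def plot_csv(pre_factor: float, n_points: int = 5) -> str:
    """Return CSV text with an 'x' column and y = exp(x), like my_plots.csv."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["x", "y"])  # 'x' is the named index column
    for i in range(n_points):
        x = pre_factor * i / (n_points - 1)  # evenly spaced on [0, pre_factor]
        writer.writerow([x, math.exp(x)])
    return buffer.getvalue()


print(json.dumps(metric_payload(2.0)))
print(plot_csv(2.0).splitlines()[0])
```

The metric file is a flat JSON object, and the plot file is a CSV whose first column carries the x values under a named header.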

In [6]:
MetricAndPlot().write_graph(silent=True)
!dvc repro
!git add .
!git commit -m "First Run"

Running stage 'MetricAndPlot':
> -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" 


The command "--" is either misspelled or
could not be found.
ERROR: failed to reproduce 'dvc.yaml': failed to run: -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" , exited with 1


[master (root-commit) d9de2e3] First Run
 14 files changed, 1147 insertions(+)
 create mode 100644 .dvc/.gitignore
 create mode 100644 .dvc/config
 create mode 100644 .dvc/plots/confusion.json
 create mode 100644 .dvc/plots/confusion_normalized.json
 create mode 100644 .dvc/plots/linear.json
 create mode 100644 .dvc/plots/scatter.json
 create mode 100644 .dvc/plots/simple.json
 create mode 100644 .dvc/plots/smooth.json
 create mode 100644 .dvcignore
 create mode 100644 04_metrics_and_plots.ipynb
 create mode 100644 dvc.yaml
 create mode 100644 params.yaml
 create mode 100644 src/MetricAndPlot.py
 create mode 100644 zntrack.json


In [7]:
MetricAndPlot(pre_factor=2).write_graph(silent=True)
!dvc exp run --queue --name "factor_2"
MetricAndPlot(pre_factor=3).write_graph(silent=True)
!dvc exp run --queue --name "factor_3"
MetricAndPlot(pre_factor=4).write_graph(silent=True)
!dvc exp run --queue --name "factor_4"
MetricAndPlot(pre_factor=5).write_graph(silent=True)
!dvc exp run --queue --name "factor_5"

Queued experiment '92dde7d' for future execution.
Queued experiment '910fab6' for future execution.
Queued experiment '8f5e874' for future execution.
Queued experiment '33221ba' for future execution.


In [8]:
!dvc exp run --run-all -j 4

Running stage 'MetricAndPlot':

The command "--" is either misspelled or
could not be found.
ERROR: failed to reproduce 'dvc.yaml': failed to run: -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" , exited with 1
The command "--" is either misspelled or
could not be found.
ERROR: failed to reproduce 'dvc.yaml': failed to run: -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" , exited with 1
The command "--" is either misspelled or
could not be found.
The command "--" is either misspelled or
could not be found.
ERROR: failed to reproduce 'dvc.yaml': failed to run: -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" , exited with 1
ER


> -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" 
Running stage 'MetricAndPlot':
> -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" 
Running stage 'MetricAndPlot':
> -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" 
Running stage 'MetricAndPlot':
> -- my_metric.json --plots my_plots.csv python -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" 


Now that all experiments are done, we can look at the metrics directly with `dvc exp show`, `dvc metrics show`, or `dvc metrics diff`.
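
To make the idea of a metrics comparison concrete, the following sketch computes a per-key difference between two metric dictionaries. It only imitates the kind of old/new/change summary a metrics diff reports and is not DVC's implementation; `diff_metrics` is a hypothetical helper, and the values follow from `17 * pre_factor` and `42 * pre_factor` for `pre_factor` 1 and 2.

```python
def diff_metrics(old: dict, new: dict) -> dict:
    """Return {key: (old, new, change)} for every key present in both dicts."""
    shared = sorted(old.keys() & new.keys())
    return {key: (old[key], new[key], new[key] - old[key]) for key in shared}


baseline = {"metric_1": 17.0, "metric_2": 42.0}  # pre_factor = 1
factor_2 = {"metric_1": 34.0, "metric_2": 84.0}  # pre_factor = 2

for key, (old, new, change) in diff_metrics(baseline, factor_2).items():
    print(f"{key}: {old} -> {new} (change {change:+})")
```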

In [9]:
!dvc exp show --csv > exp_show.csv
pd.read_csv("exp_show.csv", index_col=0)

| Experiment | rev | typ | Created | parent | State | MetricAndPlot.pre_factor |
|---|---|---|---|---|---|---|
| | workspace | baseline | | | | 5.0 |
| master | d9de2e3 | baseline | 2022-02-17T16:04:58 | | | 1.0 |
| factor_5 | 33221ba | branch_commit | 2022-02-17T16:05:19 | | Queued | 5.0 |
| factor_4 | 8f5e874 | branch_commit | 2022-02-17T16:05:13 | | Queued | 4.0 |
| factor_3 | 910fab6 | branch_commit | 2022-02-17T16:05:08 | | Queued | 3.0 |
| factor_2 | 92dde7d | branch_base | 2022-02-17T16:05:03 | | Queued | 2.0 |
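
An export like the one shown above can also be filtered programmatically. The sketch below parses an inline copy of those rows with the standard `csv` module and collects the experiments whose `State` is `Queued`. The single-line header is an assumption based on the column names in the displayed frame; a real `dvc exp show --csv` export may be laid out differently depending on the DVC version.

```python
import csv
import io

# Rows copied from the output above; the header layout is an assumption.
EXP_SHOW_CSV = """\
Experiment,rev,typ,Created,parent,State,MetricAndPlot.pre_factor
,workspace,baseline,,,,5.0
master,d9de2e3,baseline,2022-02-17T16:04:58,,,1.0
factor_5,33221ba,branch_commit,2022-02-17T16:05:19,,Queued,5.0
factor_4,8f5e874,branch_commit,2022-02-17T16:05:13,,Queued,4.0
factor_3,910fab6,branch_commit,2022-02-17T16:05:08,,Queued,3.0
factor_2,92dde7d,branch_base,2022-02-17T16:05:03,,Queued,2.0
"""


def queued_experiments(csv_text: str) -> dict:
    """Map each queued experiment name to its pre_factor parameter."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {
        row["Experiment"]: float(row["MetricAndPlot.pre_factor"])
        for row in rows
        if row["State"] == "Queued"
    }


print(queued_experiments(EXP_SHOW_CSV))
```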


We can also use `dvc plots show/diff` to evaluate the plot data that we produced.
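
Conceptually, comparing plots across revisions means reading the same plot file from each revision and overlaying the curves in one chart. A minimal sketch of that idea, assuming each revision contributes `(x, y)` points and that records are tagged with a `rev` field (the helper names and the record layout are illustrative, not DVC's internals):

```python
import math


def curve(pre_factor: float, n_points: int = 3) -> list:
    """Return (x, exp(x)) points evenly spaced on [0, pre_factor]."""
    step = pre_factor / (n_points - 1)
    return [(i * step, math.exp(i * step)) for i in range(n_points)]


def overlay(curves: dict) -> list:
    """Flatten {rev: [(x, y), ...]} into long-format records tagged by rev."""
    return [
        {"rev": rev, "x": x, "y": y}
        for rev, points in curves.items()
        for x, y in points
    ]


records = overlay({f"factor_{f}": curve(f) for f in (2, 3)})
for record in records:
    print(record)
```

A long-format list like this can be handed to any plotting tool that groups by a column, which is why a single chart can show one line per revision.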

In [10]:
!dvc plots diff HEAD factor_2 factor_3 factor_4 factor_5

ERROR: unknown Git revision 'factor_2'


## The ZnTrack way

ZnTrack provides an easier way to handle metrics. Similar to `zn.outs()`, which does not require defining a path to the output file, one can use `zn.metrics()`.
The same is possible for plots via `zn.plots()`, which requires a `pd.DataFrame` with a defined index name.
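
The index-name requirement is easiest to see in the CSV header that `pandas` writes. This small sketch uses only `pandas` and `numpy` (no ZnTrack) and compares the header line before and after naming the index:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"val": np.sin(np.linspace(0, 3.14, 100))})

# Without a name, the index column gets an empty header field.
header_unnamed = df.to_csv().splitlines()[0]

# With a name, the index column can be addressed like any other column.
df.index.name = "index"
header_named = df.to_csv().splitlines()[0]

print(repr(header_unnamed))
print(repr(header_named))
```

With an unnamed index the first header field is empty, so a plotting tool has no column name to use for the x-axis; naming the index gives that column a usable label.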

In [11]:
class ZnTrackMetric(Node):
    my_metric = zn.metrics()
    my_plot = zn.plots()

    def run(self):
        self.my_metric = {"alpha": 1.0, "beta": 0.00473}
        self.my_plot = pd.DataFrame({"val": np.sin(np.linspace(0, 3.14, 100))})
        # DVC requires the index to have a column name
        self.my_plot.index.name = "index"


ZnTrackMetric().write_graph(no_exec=False)

Submit issues to https://github.com/zincware/ZnTrack.
2022-02-17 16:05:29,952 utils (INFO): Running stage 'ZnTrackMetric':
> python -c "from src.ZnTrackMetric import ZnTrackMetric; ZnTrackMetric.load(name='ZnTrackMetric').run_and_save()" 
Adding stage 'ZnTrackMetric' in 'dvc.yaml'
Generating lock file 'dvc.lock'
Updating lock file 'dvc.lock'

To track the changes with git, run:

    git add dvc.yaml dvc.lock

To enable auto staging, run:

	dvc config core.autostage true



In [12]:
!dvc exp show --csv > exp_show.csv
pd.read_csv("exp_show.csv", index_col=0)

| Experiment | rev | typ | Created | parent | State | my_metric.alpha | my_metric.beta | MetricAndPlot.pre_factor |
|---|---|---|---|---|---|---|---|---|
| | workspace | baseline | | | | 1.0 | 0.00473 | 5.0 |
| master | d9de2e3 | baseline | 2022-02-17T16:04:58 | | | | | 1.0 |
| factor_5 | 33221ba | branch_commit | 2022-02-17T16:05:19 | | Queued | | | 5.0 |
| factor_4 | 8f5e874 | branch_commit | 2022-02-17T16:05:13 | | Queued | | | 4.0 |
| factor_3 | 910fab6 | branch_commit | 2022-02-17T16:05:08 | | Queued | | | 3.0 |
| factor_2 | 92dde7d | branch_base | 2022-02-17T16:05:03 | | Queued | | | 2.0 |


In [13]:
!dvc plots show

file:///C:/Users/fabia/AppData/Local/Temp/tmps4fejpcl/dvc_plots/index.html


In [None]:
temp_dir.cleanup()