# DVC Options - Metrics and Plots
## The DVC way

In this part we look at metrics and plots produced by ZnTrack Nodes.
All `dvc run` options listed [here](https://dvc.org/doc/command-reference/run#options) can be used via `dvc.<option>`, with the exception of params, which are handled automatically.
Each of these options takes either a `str` or a `pathlib.Path` pointing to the file in which the content should be stored.
As shown before, `dvc.deps` can also take another `Node` as an argument.

In [1]:
from zntrack import Node, dvc, zn, config
from pathlib import Path
import json
import pandas as pd
import numpy as np

In [2]:
config.nb_name = "04_metrics_and_plots.ipynb"

In [3]:
from zntrack.utils import cwd_temp_dir

temp_dir = cwd_temp_dir()

In [4]:
!git init
!dvc init

Initialized empty Git repository in /tmp/tmpkx5zm9qj/.git/
Initialized DVC repository.

You can now commit the changes to git.

+---------------------------------------------------------------------+
|                                                                     |
|        DVC has enabled anonymous aggregate usage analytics.         |
|     Read the analytics documentation (and how to opt-out) here:     |
|             <https://dvc.org/doc/user-guide/analytics>              |
|                                                                     |
+---------------------------------------------------------------------+

What's next?
------------
- Check out the documentation: <https://dvc.org/doc>
- Get help and share ideas: <https://dvc.org/chat>
- Star us on GitHub: <https://github.com/iterative/dvc>

In the following we define a simple Node that produces a metric and a plot output using `json` and `pandas`.
We then queue multiple experiments with different values of `pre_factor` and compare them afterwards.
Passing `silent=True` to `write_graph` reduces the amount of log output that is displayed.

In [5]:
class MetricAndPlot(Node):
    my_metric: Path = dvc.metrics(Path("my_metric.json"))
    my_plots: Path = dvc.plots("my_plots.csv")
    pre_factor = zn.params()

    def __init__(self, pre_factor=1.0, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.pre_factor = pre_factor

    def run(self):
        self.my_metric.write_text(
            json.dumps(
                {"metric_1": 17 * self.pre_factor, "metric_2": 42 * self.pre_factor}
            )
        )

        x_data = np.linspace(0, 1.0 * self.pre_factor, 1000)
        y_data = np.exp(x_data)
        df = pd.DataFrame({"y": y_data, "x": x_data}).set_index("x")

        df.to_csv(self.my_plots)
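
To make the metric output concrete, the following is a minimal standalone sketch of what `run` writes to `my_metric.json`. It does not use ZnTrack at all; `compute_metrics` is a hypothetical helper introduced only for illustration, and the temporary directory stands in for the DVC workspace.

```python
import json
import tempfile
from pathlib import Path


def compute_metrics(pre_factor: float) -> dict:
    """Hypothetical helper mirroring the dictionary serialized in MetricAndPlot.run()."""
    return {"metric_1": 17 * pre_factor, "metric_2": 42 * pre_factor}


# A temporary directory stands in for the DVC workspace (assumption for this sketch).
with tempfile.TemporaryDirectory() as tmp:
    metric_file = Path(tmp) / "my_metric.json"
    metric_file.write_text(json.dumps(compute_metrics(2.0)))
    # Read the file back the way any JSON-aware tool would.
    loaded = json.loads(metric_file.read_text())

print(loaded)  # {'metric_1': 34.0, 'metric_2': 84.0}
```

The metrics file is a flat JSON object with one key per metric, and both values scale linearly with `pre_factor`.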

In [6]:
MetricAndPlot().write_graph(silent=True)
!dvc repro
!git add .
!git commit -m "First Run"

Submit issues to https://github.com/zincware/ZnTrack.
Running stage 'MetricAndPlot':
> python3 -c "from src.MetricAndPlot import MetricAndPlot; MetricAndPlot.load(name='MetricAndPlot').run_and_save()" 
Generating lock file 'dvc.lock'                                                 
Updating lock file 'dvc.lock'

To track the changes with git, run:

	git add dvc.lock
Use `dvc push` to send your updates to remote storage.
[master (root-commit) f7bbc73] First Run
 17 files changed, 1145 insertions(+)
 create mode 100644 .dvc/.gitignore
 create mode 100644 .dvc/config
 create mode 100644 .dvc/plots/confusion.json
 create mode 100644 .dvc/plots/confusion_normalized.json
 create mode 100644 .dvc/plots/linear.json
 create mode 100644 .dvc/plots/scatter.json
 create mode 100644 .dvc/plots/simple.json
 create mode 100644 .dvc/plots/smooth.json
 create mode 100644 .dvcignore
 create mode 100644 .gitignore
 create mode 100644 04_metrics_and_pl

In [7]:
MetricAndPlot(pre_factor=2).write_graph(silent=True)
!dvc exp run --queue --name "factor_2"
MetricAndPlot(pre_factor=3).write_graph(silent=True)
!dvc exp run --queue --name "factor_3"
MetricAndPlot(pre_factor=4).write_graph(silent=True)
!dvc exp run --queue --name "factor_4"
MetricAndPlot(pre_factor=5).write_graph(silent=True)
!dvc exp run --queue --name "factor_5"

Submit issues to https://github.com/zincware/ZnTrack.
Queued experiment '161b0f1' for future execution.
Submit issues to https://github.com/zincware/ZnTrack.
Queued experiment '6025cd2' for future execution.
Submit issues to https://github.com/zincware/ZnTrack.
Queued experiment 'b73c938' for future execution.
Submit issues to https://github.com/zincware/ZnTrack.
Queued experiment 'bae8465' for future execution.

In [8]:
!dvc exp run --run-all -j 4


Now that all experiments are done, we can look at the metrics directly with `dvc exp show` or with `dvc metrics show` and `dvc metrics diff`.

In [9]:
!dvc exp show --csv --no-timestamp > exp_show.csv
pd.read_csv("exp_show.csv", index_col=0)


| Experiment | rev | typ | parent | metric_1 | metric_2 | MetricAndPlot.pre_factor |
|---|---|---|---|---|---|---|
| | workspace | baseline | | 17.0 | 42.0 | 5.0 |
| master | f7bbc73 | baseline | | 17.0 | 42.0 | 1.0 |
| factor_3 | 22de2fb | branch_commit | | 51.0 | 126.0 | 3.0 |
| factor_5 | 5b0a0bc | branch_commit | | 85.0 | 210.0 | 5.0 |
| factor_2 | c27f1f8 | branch_commit | | 34.0 | 84.0 | 2.0 |
| factor_4 | be0604e | branch_base | | 68.0 | 168.0 | 4.0 |
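
The comparison that `dvc metrics diff` performs can be approximated by hand on the exported table. The sketch below uses only the standard library and an excerpt copied from the output above, restricted to a subset of the columns. The header names follow what pandas displayed here and may differ between DVC versions, and `metric_change` is a hypothetical helper, not part of DVC or ZnTrack.

```python
import csv
import io

# Excerpt copied by hand from the table above (subset of columns).
EXP_SHOW_EXCERPT = """Experiment,rev,metric_1,metric_2,MetricAndPlot.pre_factor
master,f7bbc73,17.0,42.0,1.0
factor_2,c27f1f8,34.0,84.0,2.0
factor_3,22de2fb,51.0,126.0,3.0
factor_4,be0604e,68.0,168.0,4.0
factor_5,5b0a0bc,85.0,210.0,5.0
"""

# Index the rows by experiment name for easy lookup.
rows = {row["Experiment"]: row for row in csv.DictReader(io.StringIO(EXP_SHOW_EXCERPT))}
baseline = rows["master"]


def metric_change(name: str, metric: str) -> float:
    """Hypothetical helper: difference of a metric relative to the master baseline."""
    return float(rows[name][metric]) - float(baseline[metric])


changes = {name: metric_change(name, "metric_1") for name in rows if name != "master"}
for name, delta in changes.items():
    print(f"{name}: metric_1 changed by {delta:+.1f}")
```

Each experiment differs from the baseline by a multiple of 17, as expected from the linear dependence on `pre_factor`.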


We can also use `dvc plots show/diff` to evaluate the plot data that we produced.
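
For orientation, the plot data being compared is a plain CSV file with an `x` column followed by a `y` column. The following sketch reproduces that layout with the standard library only. It uses 5 points instead of 1000 for readability, and `plot_rows` is a hypothetical helper rather than anything ZnTrack or DVC provides.

```python
import csv
import io
import math


def plot_rows(pre_factor: float, n_points: int = 5):
    """Yield (x, y) pairs with x evenly spaced on [0, pre_factor] and y = exp(x)."""
    stop = 1.0 * pre_factor
    for i in range(n_points):
        x = stop * i / (n_points - 1)
        yield x, math.exp(x)


# Write the CSV into memory; the index column "x" comes first,
# matching the column order produced after set_index("x").
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["x", "y"])
writer.writerows(plot_rows(pre_factor=2.0))

# Parse it back to inspect the layout.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows[0])      # ['x', 'y']
print(rows[-1][0])  # 2.0
```

Because the x range grows with `pre_factor`, the curves of the different experiments share the same shape but extend over different intervals.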

In [16]:
!dvc plots diff HEAD factor_2 factor_3 factor_4 factor_5

file:///tmp/tmpkx5zm9qj/dvc_plots/index.html

## The ZnTrack way

ZnTrack provides an easier way to handle metrics. Similar to `zn.outs()`, which does not require defining a path to the output file, one can use `zn.metrics()`.
The same is possible for plots via `zn.plots()`, which requires a `pd.DataFrame` with a defined index name.
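
Why the index name matters can be seen with pandas alone. The sketch below assumes the DataFrame ends up as a CSV file, as in the DVC way above; how ZnTrack serializes it internally is not shown here. Without a name, the index column gets an empty header, which leaves the plotting tool without a usable column name for that axis.

```python
import pandas as pd

df = pd.DataFrame({"val": [0.0, 0.5, 1.0]})

# Header line of the CSV before the index has a name.
header_unnamed = df.to_csv().splitlines()[0]

# Naming the index gives the first CSV column a proper header.
df.index.name = "index"
header_named = df.to_csv().splitlines()[0]

print(repr(header_unnamed))  # ',val'
print(repr(header_named))    # 'index,val'
```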

In [19]:
class ZnTrackMetric(Node):
    my_metric = zn.metrics()
    my_plot = zn.plots()

    def run(self):
        self.my_metric = {"alpha": 1.0, "beta": 0.00473}
        self.my_plot = pd.DataFrame({"val": np.sin(np.linspace(0, 3.14, 100))})
        self.my_plot.index.name = "index"  # DVC requires the index column to have a name

ZnTrackMetric().write_graph(no_exec=False)

Submit issues to https://github.com/zincware/ZnTrack.
[NbConvertApp] Converting notebook 04_metrics_and_plots.ipynb to script
[NbConvertApp] Writing 3981 bytes to 04_metrics_and_plots.py
2022-01-21 10:03:12,824 (INFO): Stage 'ZnTrackMetric' is cached - skipping run, checking out outputs
Modifying stage 'ZnTrackMetric' in 'dvc.yaml'

To track the changes with git, run:

	git add dvc.yaml

In [20]:
!dvc exp show --csv --no-timestamp > exp_show.csv
pd.read_csv("exp_show.csv", index_col=0)


| Experiment | rev | typ | parent | metric_1 | metric_2 | my_metric.alpha | my_metric.beta | MetricAndPlot.pre_factor |
|---|---|---|---|---|---|---|---|---|
| | workspace | baseline | | 85.0 | 210.0 | 1.0 | 0.00473 | 5.0 |
| master | f7bbc73 | baseline | | 17.0 | 42.0 | 1.0 | | |
| factor_2 | c27f1f8 | branch_commit | | 34.0 | 84.0 | 2.0 | | |
| factor_5 | 5b0a0bc | branch_commit | | 85.0 | 210.0 | 5.0 | | |
| factor_4 | be0604e | branch_commit | | 68.0 | 168.0 | 4.0 | | |
| factor_3 | 22de2fb | branch_base | | 51.0 | 126.0 | 3.0 | | |


In [21]:
!dvc plots show

file:///tmp/tmpkx5zm9qj/dvc_plots/index.html

In [22]:
temp_dir.cleanup()