Skip to content

Conversation

@pared
Copy link
Contributor

@pared pared commented Dec 7, 2021

Fixes: #6389

Thank you for the contribution - we'll try to review it as soon as possible. πŸ™

Changes in this PR:

  • .dvc/plots will no longer be created upon dvc init
  • user has to explicitly call dvc plots templates to create and populate .dvc/plots

TODO

  • docs

@pared pared requested a review from a team as a code owner December 7, 2021 10:34
@pared pared requested a review from skshetry December 7, 2021 10:34
@pared pared force-pushed the 6389_optional_plots branch from 2d22f8e to 82016e9 Compare December 7, 2021 15:00
@pared pared requested a review from dberenbaum December 7, 2021 15:10
@pared pared added A: plots Related to the plots enhancement Enhances DVC labels Dec 7, 2021
Copy link
Contributor

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems like this has two uses:

  1. Make it easy to see and modify templates.
  2. Get updated templates if the ones currently in .dvc/plots are outdated.

Is that what you have in mind @pared?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to be able to specify a different path to dump to? Maybe I am worried about overwriting what I have now, or I've made modifications to the defaults.

Also, can I dump just one template? This seems less necessary, but it might be nice to not have to generate all the templates if I only want one, especially if the number of templates grows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to extend the functionality of templates, though it raises another questions.

  1. If we want to specify particular template, do we want any means of obtaining templates available for this particular repo? I think so, that would make the user experience much better than circling between terminal and docs.
  2. If we agree in 1, should templates have subcommands? eg (write and list)? That would make sense and be unified with our other commands.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When dumping targets, I was thinking of the default templates, not any file, since otherwise it's just copying a file, right? Is there a use case you had in mind? I see that it works to do something like dvc plots template linear, but it's not that clear from the help description that targets doesn't have to specify existing files. If I already have linear.json in .dvc/plots, will dvc templates linear dump the internal version or the one on file?

Having a way to list templates seems great. Maybe --list can be an option to avoid subcommands?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should throw an exception if we are trying to dump a template that is different from the existing one under the given path. Do you want a new one? Fine, remove the old one manually. No chance of accidental removal of custom changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use case you had in mind?

Not exactly, just wanted to clarify

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I already have linear.json in .dvc/plots, will dvc templates linear dump the internal version or the one on file?

So this will fail because it will overwrite .dvc/plots/linear.json, correct? What about dvc templates linear -o new_linear.json? Which template version will it dump?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this will fail because it will overwrite .dvc/plots/linear.json, correct?

It will fail only if the content of the template differs.

dvc templates linear -o new_linear.json?

About that, another question: do we want to specify particular file? Currently it is implemented that out is dir to dump templates to (to be able to dump few at once). But I guess it makes more sense to specify particular templates and particular paths.

Which template version will it dump?

The one that is loaded from DVC code, so the newest one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will fail only if the content of the template differs.

Sounds good, although it also seems like more work for you to check if the contents are the same. It also seems fine to me to fail anytime a file would be overwritten.

do we want to specify particular file?

No, sorry, I got confused. No need to specify a particular file.

Which template version will it dump?

The one that is loaded from DVC code, so the newest one.

So will dvc plots templates always operate on the built-in templates? Maybe we should specify in the help description if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, sorry, I got confused. No need to specify a particular file.

I probably should explaining it in option description

So will dvc plots templates always operate on the built-in templates? Maybe we should specify in the help description if so.

Will do

@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Dec 9, 2021

What do we think about renaming plots dump to plots templates or something else? Is "dumping" the right concept? What plots are we dumping? TBH "dump" sounds like data or logs-related to me (in general).

@pared pared force-pushed the 6389_optional_plots branch from f3d5ef0 to 671ced2 Compare December 16, 2021 13:09
@shcheklein
Copy link
Contributor

@pared @dberenbaum qq - how will the workflow with plots look like? how will this affect docs?

@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Dec 16, 2021

I think you'll have to "dump" the templates first before you can use dvc plots commands with data series files. So I guess those commands will throw an ERROR with exact hint of what to do if no templates exist. Or maybe they can include this command automatically on first use and just keep plots dump as an optional helper (I'd still rename it plots templates though).

@dberenbaum
Copy link
Contributor

You should not have to dump the templates. The default templates are stored internally, so these should not be breaking changes. The goal is to provide a mechanism to update plots without breaking current functionality.

dvc init will no longer generate templates in .dvc/plots. For new repos, plots will default to the templates in the installed dvc version instead of the dvc version used during dvc init. Templates with the same name will be prioritized in the order of 1) explicit paths, 2) .dvc/plots, 3) internal defaults. The templates command allows users with old .dvc/plots templates to upgrade, as well as maintaining the current ability to view/edit those templates.

@pared Please correct any mistakes/oversights!

@pared
Copy link
Contributor Author

pared commented Dec 20, 2021

@dberenbaum that's precisely what this change is about

@pared pared force-pushed the 6389_optional_plots branch from 8d5d1e2 to 02d473c Compare December 20, 2021 15:53
@pared pared changed the title plots: make templates generation explicit [WIP] plots: make templates generation explicit Dec 20, 2021
@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Dec 22, 2021

The default templates are stored internally

"Internally" as in inside a DVC package installation directory? Something like /etc/dvc/plots ? oic

(This) command allows users with old .dvc/plots templates to upgrade

Wait. So the system-level plots templates are being updated compared to the repo-level ones as well? If so how will users know that there's newer templates available and why/how to get them? We could just tell them to rm -d .dvc/plots instead.

Sorry, still unclear to me why you'd ever need to dump default templates if they're always available in the installation files. We should document the system plot templates location anyway, so people can just cp them if needed.

@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Dec 22, 2021

unclear to me why you'd ever need to dump default templates

I guess as a helper to avoid figuring out the system file location on your platform?

If we do keep this, maybe plots template could accept an optional argument i.e. default (would be the default), scatter, smooth, etc. so you can only dump the specific one you want to customize. Not sure

@dberenbaum
Copy link
Contributor

"Internally" as in inside a DVC package installation directory? Something like /etc/dvc/plots ? oic

That was the initial plan, but it caused difficulties with packaging. Instead, they are composed within the code, and there's no path where the template exists as a JSON file.

We could just tell them to rm -d .dvc/plots instead.

I guess as a helper to avoid figuring out the system file location on your platform?

Yup, you got it. As mentioned above, it's unfortunately difficult to keep copies of the templates in some easy-to-find system file location, so there's no way to see the templates without some command like this.

If we do keep this, maybe plots template could accept an optional argument i.e. default (would be the default), scatter, smooth, etc. so you can only dump the specific one you want to customize. Not sure

Yeah, this has been added as well.

@jorgeorpinel
Copy link
Contributor

they are composed within the code, and there's no path where the template exists as a JSON file

oic!

No more doubts on my end. Except that I still wonder how/if we'll message users of existing projects on older DVC versions on how to upgrade and if so (or as docs in general) I suggest we recommend the rm .dvc/plots route instead of this command.

@dberenbaum
Copy link
Contributor

I suggest we recommend the rm .dvc/plots route instead of this command.

Right, we were still working through this when the PR first opened. In fact, they will need to do this since dvc plots templates will fail to overwrite an old version of the template (see #7108 (comment)).

However, if they (or users with new repos) want to see what the current templates look like (or to modify them), they will need this command.

@pared pared force-pushed the 6389_optional_plots branch 4 times, most recently from 1e8f5f1 to 63b1db7 Compare January 6, 2022 14:29
@pared pared changed the title [WIP] plots: make templates generation explicit plots: make templates generation explicit Jan 6, 2022
@pared
Copy link
Contributor Author

pared commented Jan 6, 2022

@dberenbaum The change is ready.
I have a feeling that specifying directory as -o is a bit clunky, but I don't have a better idea if we don't want to make users dump the templates one by one. And in the current state, there is a choice. Also, plot show itself also requires dir as output (since the static page is now generated as dir), so it does make sense.

Copy link
Contributor

@dberenbaum dberenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make the actual built-in templates as choices here? Right now, it dumps an empty directory if passed an invalid template. Ideally, it would fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, though I am not sure it will be able to handle all as default, I will see what I can do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to make nargs="?" (only allow one template) to simplify. What do you think? In that case, I would adjust the help text:

Suggested change
help="Templates to write. Writes all templates by default.",
help="Template to write. Writes all templates by default.",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the built-in templates show up as choices above then we might not need this option at all, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed for now, check #7108 (comment) for my current proposition for this problem

@pared pared force-pushed the 6389_optional_plots branch from 5ddee73 to 0f04107 Compare January 7, 2022 11:35
@pared pared force-pushed the 6389_optional_plots branch from a4a03eb to d0f1e8f Compare January 10, 2022 17:08
@dberenbaum
Copy link
Contributor

@pared What's the status for this?

@pared
Copy link
Contributor Author

pared commented Jan 25, 2022

Got distracted with flexible plots and support, will try to finish it this week.

@pared pared force-pushed the 6389_optional_plots branch from d0f1e8f to 8b22406 Compare January 25, 2022 11:42
@skshetry skshetry removed their request for review January 27, 2022 03:53
@pared pared force-pushed the 6389_optional_plots branch from 8b22406 to 8863876 Compare January 27, 2022 11:35
@pared
Copy link
Contributor Author

pared commented Jan 27, 2022

@dberenbaum
So the problem with choices is that it will be hard to implement default value use case: we would need to provide empty string or none as one of the options.

I have been playing a bit with it and the nicest looking thing that I was able to make work is currently at the master:

  1. default:
(dvc3.8) ➜  dvc git:(6389_optional_plots) βœ— dvc plots templates                                                                   0.000s 
Templates have been written into '.dvc/plots'.
(dvc3.8) ➜  dvc git:(6389_optional_plots) βœ— tree .dvc/plots                                                                       0.000s 
.dvc/plots
β”œβ”€β”€ confusion.json
β”œβ”€β”€ confusion_normalized.json
β”œβ”€β”€ linear.json
β”œβ”€β”€ scatter.json
β”œβ”€β”€ simple.json
└── smooth.json

0 directories, 6 files
(dvc3.8) ➜  dvc git:(6389_optional_plots) βœ—      
  1. multiple choices available:
(dvc3.8) ➜  dvc git:(6389_optional_plots) dvc plots templates simple linear                                                       0.000s 
Templates have been written into '.dvc/plots'.
(dvc3.8) ➜  dvc git:(6389_optional_plots) βœ— tree .dvc/plots                                                                       0.000s 
.dvc/plots
β”œβ”€β”€ linear.json
└── simple.json

0 directories, 2 files
(dvc3.8) ➜  dvc git:(6389_optional_plots) βœ—        
  1. help:
(dvc3.8) ➜  dvc git:(6389_optional_plots) βœ— dvc plots templates --help                                                            0.000s 
usage: dvc plots templates [-h] [-q | -v] [-o <path>]
                           [['simple', 'linear', 'confusion', 'confusion_normalized', 'scatter', 'smooth']
                           [['simple', 'linear', 'confusion', 'confusion_normalized', 'scatter', 'smooth'] ...]]

Write built-in plots templates to a directory (.dvc/plots by default).
Documentation: <https://man.dvc.org/plots/templates>

positional arguments:
  ['simple', 'linear', 'confusion', 'confusion_normalized', 'scatter', 'smooth']  Templates to write. Writes all templates by default.

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Be quiet.
  -v, --verbose         Be verbose.
  -o <path>, --out <path>
                        Directory to save templates to.
(dvc3.8) ➜  dvc git:(6389_optional_plots) βœ—             
  1. wrong argument:
(dvc3.8) ➜  dvc git:(6389_optional_plots) dvc plots templates wrong_template simple                                               0.000s 
ERROR: argument ['simple', 'linear', 'confusion', 'confusion_normalized', 'scatter', 'smooth']: Incorrect template name: 'wrong_template' not in 'simple, linear, confusion, confusion_normalized, scatter, smooth'
usage: dvc plots templates [-h] [-q | -v] [-o <path>]
                           [['simple', 'linear', 'confusion', 'confusion_normalized', 'scatter', 'smooth']
                           [['simple', 'linear', 'confusion', 'confusion_normalized', 'scatter', 'smooth'] ...]]

Write built-in plots templates to a directory (.dvc/plots by default).
Documentation: <https://man.dvc.org/plots/templates>

positional arguments:
  ['simple', 'linear', 'confusion', 'confusion_normalized', 'scatter', 'smooth']  Templates to write. Writes all templates by default.

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Be quiet.
  -v, --verbose         Be verbose.
  -o <path>, --out <path>
                        Directory to save templates to.
(dvc3.8) ➜  dvc git:(6389_optional_plots)             

Comment on lines +375 to +380
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to add the names of the templates manually to avoid importing plots on every cmd use. This test has been created to make sure plots module and templates command are in sync.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, nice idea!

@dberenbaum
Copy link
Contributor

Looks good @pared!

I was thinking in #7108 (comment) that if you only allow one template (which seems reasonable to me), then it could look like:

    plots_templates_parser.add_argument(
        "targets",
        nargs="?",
        choices=CmdPlotsTemplates.TEMPLATES_CHOICES,
        help="Template to write. Writes all templates by default.",
    )

I don't think that requires you to manually check for valid template names (although you still need TEMPLATE_CHOICES). It also keeps the outputs a little simpler.

Multiple choices not supported:

dvc plots templates simple linear
ERROR: unrecognized arguments: linear
usage: dvc plots templates [-h] [-q | -v] [-o <path>]
                           [{simple,linear,confusion,confusion_normalized,scatter,smooth}]

Write built-in plots templates to a directory (.dvc/plots by default).
Documentation: <https://man.dvc.org/plots/templates>

positional arguments:
  {simple,linear,confusion,confusion_normalized,scatter,smooth}
                        Template to write. Writes all templates by default.

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Be quiet.
  -v, --verbose         Be verbose.
  -o <path>, --out <path>
                        Directory to save templates to.

Help:

dvc plots templates --help
usage: dvc plots templates [-h] [-q | -v] [-o <path>]
                           [{simple,linear,confusion,confusion_normalized,scatter,smooth}]

Write built-in plots templates to a directory (.dvc/plots by default).
Documentation: <https://man.dvc.org/plots/templates>

positional arguments:
  {simple,linear,confusion,confusion_normalized,scatter,smooth}
                        Templates to write. Writes all templates by default.

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Be quiet.
  -v, --verbose         Be verbose.
  -o <path>, --out <path>
                        Directory to save templates to.

Wrong argument:

dvc plots templates wrong_template
ERROR: argument targets: invalid choice: 'wrong_template' (choose from 'simple', 'linear', 'confusion', 'confusion_normalized', 'scatter', 'smooth')
usage: dvc plots templates [-h] [-q | -v] [-o <path>]
                           [{simple,linear,confusion,confusion_normalized,scatter,smooth}]

Write built-in plots templates to a directory (.dvc/plots by default).
Documentation: <https://man.dvc.org/plots/templates>

positional arguments:
  {simple,linear,confusion,confusion_normalized,scatter,smooth}
                        Template to write. Writes all templates by default.

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Be quiet.
  -v, --verbose         Be verbose.
  -o <path>, --out <path>
                        Directory to save templates to.

Now that you have done it this way, I'm fine with either. Do you prefer keeping the option for multiple targets?

@pared
Copy link
Contributor Author

pared commented Jan 27, 2022

@dberenbaum

Ok, so I guess my research was to shallow - as I mentioned earlier, providing choices requires to add None or '' as option among choices. However as I am experimenting now it seems to be the case only in case of nargs='*' and not nargs='?'. In that case I think there is no point in providing validation and metavar just to be able to provide multiple targets. I will modify the PR.

@pared
Copy link
Contributor Author

pared commented Jan 28, 2022

@dberenbaum So I though (as per my previous comment) that we will be unable to use choices without adding None or '' as one of the options, which would make the help ugly. But if we make it as you proposed (with ? instead of *) it does work. So I think we will be better off going your way. I made necessary change.

@pared pared force-pushed the 6389_optional_plots branch from f7491d2 to 4c3663b Compare January 31, 2022 12:53
@pared pared force-pushed the 6389_optional_plots branch from 4c3663b to ed074f9 Compare February 1, 2022 13:36
@daavoo daavoo enabled auto-merge (rebase) February 1, 2022 13:41
@daavoo daavoo merged commit 7bc14a5 into treeverse:main Feb 1, 2022
@dberenbaum
Copy link
Contributor

@pared Is there a corresponding issue or PR in dvc.org??

@pared
Copy link
Contributor Author

pared commented Feb 16, 2022

@dberenbaum not yet, creating

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A: plots Related to the plots enhancement Enhances DVC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make plot templates optional

5 participants