Skip to content

Conversation

@skshetry
Copy link
Collaborator

@skshetry skshetry commented Jan 12, 2022

This commit adds support for external commands that are available
in one of the directory set in PATH.
The executable should be in format of 'dvc-*' format, should
have executable permissions and need to exist (duh). The suffix,
after 'dvc-' will be used as a subcommand name, and all tha arguments
after that subcommand will be passed to that external command.

Closes #6489.

Motivation

  1. This makes DVC extensible with new subcommands without having to modify DVC itself. This can help users to use
    custom commands outside of DVC.

  2. We may be able to maintain dvclive outside dvc. Although it's not clear if this will be enough, or if we need python-based plugins.

  3. Users may be able to combine multiple dvc commands for their workflow or use it to set an alias. Introduce DVC aliases #7040.

  4. We also lack plumbing/low-level commands that I always wanted to have, especially during debugging/development.
    These commands could be modifying dvc-objects or manipulating dvc.yaml/dvc.lock files, or debugging pipelines workflow, etc. We have too many commands already, and we may not want to have more commands, that are niche.
    This gives us the ability to share scripts among the team. See dvc: introduce dvc md5 or dvc checksum command #1500.

This is not expected to be documented yet, we may try using it within the team and gather feedback first.

Other projects that support this mechanism:

Thank you for the contribution - we'll try to review it as soon as possible. 🙏

@skshetry skshetry added ui user interface / interaction A: cli Related to the CLI labels Jan 12, 2022
@skshetry skshetry self-assigned this Jan 12, 2022
@skshetry skshetry requested a review from a team as a code owner January 12, 2022 15:30
@skshetry skshetry requested a review from casperdcl January 12, 2022 15:30
@skshetry skshetry added discussion requires active participation to reach a conclusion T: discussion labels Jan 12, 2022
@skshetry skshetry force-pushed the external-command branch 2 times, most recently from 5332d55 to 02c0fb1 Compare January 13, 2022 04:22
@pmrowla
Copy link
Contributor

pmrowla commented Jan 13, 2022

  1. We may be able to maintain dvclive outside dvc. Although it's not clear if this will be enough, or if we need python-based plugins.

Would we be better off using python entrypoints here (rather than using PATH)?

If we defined a dvc.cli_plugin (or similar) entry point, users could register their own plugin with attributes for command_cls and add_parser, where the user command subclasses the existing dvc.command.base types, and add_parser just conforms to the existing dvc.command add_parser usage for setting up a subparser.

This way everyone could expect things like CmdBase.repo/CmdBase.config to always be loaded and passed into the command object the same way (whether it's an official dvc core command or a user defined one). And we would be able to enforce that users conform to the existing parser/subparser structure in DVC, so setting up help text (and even including user commands in completion installation) would be consistent.

The disadvantage here would be that using PATH means users aren't forced to use python, but I'm assuming that anyone writing custom DVC subcommands is probably using python and dvc.repo.Repo already.

e: Also, thinking about it some more, the entrypoints method would only work with DVC installed in pip/conda environments (and not with the binary packages), but using PATH would not have that limitation.

@skshetry skshetry force-pushed the external-command branch 4 times, most recently from 9037b36 to 055830b Compare January 13, 2022 13:19
@skshetry
Copy link
Collaborator Author

skshetry commented Jan 14, 2022

Would we be better off using python entrypoints here (rather than using PATH)?

I think we need both approaches, and have different use cases. Let's start with their respective pros and cons:

PATH based plugins:

Pros:
  1. Low friction to get started.
  2. Works with DVC binaries as well as packages.
  3. Can be done on an ad-hoc basis, not too formal.
Cons:
  1. Does not work out of the box (eg: messing up with PATH may not be easy, especially on Windows).
  2. Cannot reliably use DVC's Python API: DVC's API is unstable, and DVC binary invoking the external command may not be the same version as the installed DVC library.

So, this approach forces users to use the CLI interface more. This is more useful for using simple scripts the user has.

Python entrypoints based approach

Pros:
  1. Extensible, can reliably access DVC's Python API.
  2. Works out of the box without any setups other than installation.
  3. Works natively, like other DVC's command.
Cons:
  1. Requires to create a library, and to be available on packaging index (eg: pypi)
  2. Does not work with binary packages.

So, at least to me, it looks like for dvclive and similar packages that we maintain, providing an API is the best solution.
(dvclive could provide a dvc-live entrypoints as well). But for users, PATH solution may be more useful.

@gvwilson
Copy link

See also #7231 (request for the ability to add notes to .dvc files) and #7232 (implementation of notes). A plugin mechanism based on Python entry points would have been ideal for this.

@skshetry skshetry mentioned this pull request Jan 19, 2022
2 tasks
This commit adds support for external commands that are available
in one of the directory set in PATH.
The executable should be in format of \'dvc-*\' format, should
have executable permissions and need to exist (duh). The suffix,
after 'dvc-' will be used as a subcommand name, and all tha arguments
after that subcommand will be passed to that external command.
@skshetry skshetry closed this Jan 20, 2022
@skshetry skshetry deleted the external-command branch January 21, 2022 05:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A: cli Related to the CLI discussion requires active participation to reach a conclusion ui user interface / interaction

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support for external commands?

5 participants