
dvc: provide granularity for commands that could target specific tracked files #2458

@jorgeorpinel


This can be seen as revisiting feature request #1026

UPDATE: Please scroll down to #2458 (comment) for the most recent, summarized requirement.
Here is the original context as well (still relevant):


There are different scenarios in which it would be useful to manipulate files granularly, independently of how they were committed/pushed to DVC. The problem with using dvc add -R now is that it can generate lots of .dvc files. What if a directory could be added without -R (producing a single DVC-file), and yet other commands (lock, update, get, etc.) could still be applied to individual files inside the added directory tree?

Example (from iterative/dataset-registry@7476a85)

Project 1:

$ tree
.
└── tutorial
    └── nlp
        ├── Posts.xml.zip
        └── pipeline.zip
$ dvc add tutorial
...
$ dvc push
...

Project 2:

$ dvc import {project-1-url} tutorial/nlp/pipeline.zip
...
$ tree
.
├── tutorial
│   └── nlp
│       └── pipeline.zip
└── tutorial.dvc

I'm not sure where the .dvc file would have to be placed in this example, though.

This is also how Git works, I believe: files are tracked individually (in fact, Git doesn't even track empty directories).
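
To make the request a bit more concrete, here is a minimal Python sketch of what "targeting a file inside a directory tracked by a single DVC-file" could mean. It is not DVC's implementation: the function name select_entries, the MANIFEST list, and the placeholder hashes are all hypothetical, and the only assumption is that the tracked directory is described by a list of per-file entries with a relative path and a checksum.

```python
from pathlib import PurePosixPath

# Hypothetical per-file listing for the directory added with `dvc add tutorial`.
# The md5 values are placeholders, not real checksums.
MANIFEST = [
    {"relpath": "nlp/Posts.xml.zip", "md5": "<md5-of-Posts.xml.zip>"},
    {"relpath": "nlp/pipeline.zip", "md5": "<md5-of-pipeline.zip>"},
]


def select_entries(tracked_dir, manifest, target):
    """Return the manifest entries located at or under `target`.

    `tracked_dir` is the directory covered by the single DVC-file and
    `target` is the path a command was asked to operate on.
    """
    root = PurePosixPath(tracked_dir)
    wanted = PurePosixPath(target)

    # The target must be the tracked directory itself or something inside it.
    if wanted != root and root not in wanted.parents:
        raise ValueError(f"{target} is not inside {tracked_dir}")

    selected = []
    for entry in manifest:
        full_path = root / entry["relpath"]
        # Keep exact file matches and files below a targeted subdirectory.
        if full_path == wanted or wanted in full_path.parents:
            selected.append(entry)
    return selected


if __name__ == "__main__":
    for entry in select_entries("tutorial", MANIFEST, "tutorial/nlp/pipeline.zip"):
        print(entry["relpath"])
```

With something like this, a command given tutorial/nlp/pipeline.zip would only need to fetch or update that one entry, while a command given tutorial or tutorial/nlp would still cover everything below it.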

Labels: feature request, p1-important, product: VSCode
