Closed
Description
2.0 will be a major release, which is a good chance to change some things we've been talking about in a potentially backward-incompatible way. Creating this ticket as a reminder for smaller things, as they might get forgotten while working on the major planned stuff. Please feel free to add other points.
- rename .dvc/cache into .dvc/objects (or something like that): delayed for 3.0
- get rid of CRLF conversion when computing md5 (drop dos2unix behaviour #4658): will be handled in 3.0 cache
- either get rid of *.dvc files or at least fix the order of fields in it (leaving for 3.0). E.g.
- md5: 456789
path: myfile
should really be
- path: myfile
md5: 456789
so that we create less confusion when merging/deleting/--no-exec'ing.
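The reordering suggested above can be sketched in a few lines. This is only an illustration, assuming an output entry is a plain mapping; `path_first` is a hypothetical helper and not part of DVC.

```python
def path_first(entry):
    """Return a copy of an output entry with `path` as the first key.

    Hypothetical helper for illustration; not DVC's serializer.
    """
    ordered = {}
    if "path" in entry:
        ordered["path"] = entry["path"]
    # Keep every other field in its original relative order.
    for key, value in entry.items():
        if key != "path":
            ordered[key] = value
    return ordered


entry = {"md5": "456789", "path": "myfile"}
print(list(path_first(entry)))
```

Since Python dicts preserve insertion order, a YAML dumper that does not sort keys would then write `path` before `md5`.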
- change cache/remote format (Split data into blocks #829) from:
12/3456
98/7654
...
runs/
to
objects/ # for potential chunking https://github.com/iterative/dvc/issues/829
files/ # same as our current cache files, aka assembled objects
dirs/ # for .dir files, will help us quicker list .dir files in push/pull/status optimization (currently we just walk the whole remote looking for .dir files)
runs/
tmp/ # for .tmp files that we use for universal atomic upload/download (download .tmp + rename) ?
...
also worth making files/ use 12/123456 format like git and our runs/ do instead of the current 12/3456
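To make the two naming schemes concrete, here is a small sketch that maps a checksum to its relative path under each one. The function names are hypothetical, and the `files/` prefix simply follows the layout proposed above; none of this is DVC's actual code.

```python
def current_object_path(checksum):
    """Current scheme: first two characters as directory, the rest as file name."""
    return f"{checksum[:2]}/{checksum[2:]}"


def proposed_files_path(checksum):
    """Proposed scheme: files/<first two characters>/<full checksum>."""
    return f"files/{checksum[:2]}/{checksum}"


checksum = "123456"  # shortened example value taken from the text above
print(current_object_path(checksum))
print(proposed_files_path(checksum))
```

With the full checksum kept in the file name, an object can be identified from its base name alone, without joining it with the parent directory name.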
- Change .dir format (Split data into blocks #829 (comment), remote/cache: consider de-duplication for .dir files #3791)
- Drop dvc run --single-stage (deprecate old .dvc file based run stages? #3936)? Not a blocker, could be removed later.
- switch to sha256(512?) as a default algo and drop dos2unix for md5 (Split data into blocks #829)
- drop .dvc files for dvc add/import in favor of special fields in dvc.yaml. Decided to do that after 2.0. Clear that we will be moving this way, so no reason to refactor old .dvc file schema.
- revisit API exceptions (ping @skshetry). For 3.0
- (optional) revisit config options for remotes, maybe create an alias to later get rid of things like gdrive_*: a topic for discussion
- adjust conda package to not install extras by default (conda: make rename dvc-base -> dvc package in 2.0 conda-forge/dvc-feedstock#168)
- start pushing stable packages to v2 snap channel.
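For the hashing items above (CRLF conversion for md5, a possible switch to sha256), a rough sketch of the difference follows. It assumes the dos2unix step is a plain replacement of CRLF with LF before hashing; the real behaviour in DVC may differ, and both function names are hypothetical.

```python
import hashlib


def md5_dos2unix(data: bytes) -> str:
    """md5 of the content after converting CRLF line endings to LF (assumed behaviour)."""
    return hashlib.md5(data.replace(b"\r\n", b"\n")).hexdigest()


def sha256_raw(data: bytes) -> str:
    """sha256 of the bytes exactly as stored, with no line-ending conversion."""
    return hashlib.sha256(data).hexdigest()


unix_text = b"a\nb\n"
dos_text = b"a\r\nb\r\n"

# With conversion, two different byte streams share one checksum.
print(md5_dos2unix(unix_text) == md5_dos2unix(dos_text))
# Without conversion, the checksum reflects the actual bytes.
print(sha256_raw(unix_text) == sha256_raw(dos_text))
```

The first comparison shows why the conversion is lossy: a checksum computed this way cannot tell the two files apart, so restoring from cache may not reproduce the original bytes.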
- add docs for experiments, dvc.yaml 2.0 - besides technical docs, we'll need to revisit get started at the very least, tutorials
- drop callbacks (run: abandon callback stage feature in favor of --always-changed #1407)
- Consider forbidding backtracking output paths and maybe even wdir? Similar to .gitignore. For 3.0
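One way such a check could look, assuming "backtracking" means a path that climbs out of its base directory through `..` (or is absolute). `escapes_base` is a hypothetical helper, shown only to make the idea concrete.

```python
import posixpath


def escapes_base(path: str) -> bool:
    """True if a POSIX-style path is absolute or resolves above its base directory."""
    normalized = posixpath.normpath(path)
    return (
        posixpath.isabs(normalized)
        or normalized == ".."
        or normalized.startswith("../")
    )


for candidate in ["data/out.csv", "data/../out.csv", "../out.csv", "data/../../out.csv"]:
    print(candidate, escapes_base(candidate))
```

Normalizing first means a path such as `data/../out.csv`, which stays inside the base directory, is still accepted, while anything that ends up above it is rejected.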
- Check for Python 3.9 compatibility, including ruamel-yaml/dulwich/pyarrow and maybe others (ping @skshetry)
- Make dvc config/remote load all configs by default and introduce --repo for the current behavior.
- Remove schema validation during run-cache dump?
- Drop pyinstaller? (check analytics to see % for binary packages) (Get EV code signing certificate to sign our binary packages #936): having a standalone package is still needed for some scenarios
- get rid of can_be_skipped (run: run is computing checksums even though --no-exec is specified #5368)
- straight to remote/cache support (adding/transfering data straight to cache/remote #4520)
- reconsider dvc add behavior (cache: generalize save and checkout #5412 (comment)) @efiop (might require dvc add doc change)