Weekly Digest (26 April, 2020 - 3 May, 2020) #3723

@weekly-digest

Here's the Weekly Digest for iterative/dvc:


ISSUES

Last week 39 issues were created.
Of these, 23 issues have been closed and 16 issues are still open.

OPEN ISSUES

💚 #3722 gdrive: progress for downloads, by casperdcl
💚 #3720 status: add --recursive flag, by nik123
💚 #3719 Duplicated remote repository name in different config levels., by karajan1001
💚 #3718 ERROR: unexpected error - are any errors expected by the user?, by skshetry
💚 #3717 DVC add fails when there are broken symlinks in the dataset, by greaber
💚 #3716 [WIP] repo: use unified RepoTree for erepos, by pmrowla
💚 #3715 Dvc remote default in list validation, by karajan1001
💚 #3714 Get Remote Storage URL for files/directories added directly from S3, by AratiNagmal
💚 #3706 run: ui issue for run, by skshetry
💚 #3703 remote: should DVC prevent external cache overlap default remote?, by jorgeorpinel
💚 #3700 pipeline file: characters to allow in stage name, by skshetry
💚 #3698 invalid start byte, by Sunny-Day200
💚 #3697 Linking type should not be committed to git, by drorata
💚 #3693 cleanups: todo after implementation of pipeline file, by skshetry
💚 #3690 --show-md output for metrics, params and dvc diff, by dmpetrov
💚 #3685 publish conda package for python 3.8, by antonkulaga

CLOSED ISSUES

❤️ #3721 progress: add postfix info to avoid overwriting desc, by casperdcl
❤️ #3713 stage: cache: use lockfiles, by efiop
❤️ #3712 Restyle dump: lockfile dump deterministically, by restyled-io[bot]
❤️ #3711 dump: deterministic lockfile dump, by skshetry
❤️ #3710 dvc: rename pipelines.yaml -> dvc.yaml, by efiop
❤️ #3709 run: try to save deps before running the command, by efiop
❤️ #3708 serialize: use checksums that are already saved, by efiop
❤️ #3707 run: params are not checked before running the command, by skshetry
❤️ #3705 remote: adjust traverse threshold multiplier, by pmrowla
❤️ #3704 remote: reduce traverse weight multiplier, by pmrowla
❤️ #3702 setup: relax python-dateutil pip version constraint to include v2.8.2. #3701, by dchichkov
❤️ #3701 Relax pip dependency versions constraints for python-dateutil, by dchichkov
❤️ #3699 tag: getting rid of it, by skshetry
❤️ #3696 lockfile: order of content changes on a repro, by skshetry
❤️ #3695 refactor: dvc/output class names unification, by nik123
❤️ #3694 dvc: implement params support for pipeline file, by skshetry
❤️ #3692 Restyle [WIP] dvc: introduce local stage cache, by restyled-io[bot]
❤️ #3691 Restyle dvc: implement multistage dvcfile, by restyled-io[bot]
❤️ #3689 repo: use reverse post-order DFS in repro --downstream, by pmrowla
❤️ #3688 refactor: dvc/dependency class names unification, by nik123
❤️ #3687 install: rename Windows package installation name, by fabiosantoscode
❤️ #3686 gdrive: fix multi-remote workflow, cont. cleanup, by shcheklein
❤️ #3684 refactor: dvc/remotes class names unification, by nik123

LIKED ISSUE

👍 #3684 refactor: dvc/remotes class names unification, by nik123
It received 👍 x2, 😄 x0, 🎉 x2 and ❤️ x2.

NOISY ISSUE

🔈 #3687 install: rename Windows package installation name, by fabiosantoscode
It received 6 comments.


PULL REQUESTS

Last week, 24 pull requests were created, updated or merged.

UPDATED PULL REQUEST

Last week, 4 pull requests were updated.
💛 #3722 gdrive: progress for downloads, by casperdcl
💛 #3716 [WIP] repo: use unified RepoTree for erepos, by pmrowla
💛 #3715 Dvc remote default in list validation, by karajan1001
💛 #3647 remote: add support for WebDAV, by shizacat

MERGED PULL REQUEST

Last week, 20 pull requests were merged.
💜 #3721 progress: add postfix info to avoid overwriting desc, by casperdcl
💜 #3713 stage: cache: use lockfiles, by efiop
💜 #3711 dump: deterministic lockfile dump, by skshetry
💜 #3710 dvc: rename pipelines.yaml -> dvc.yaml, by efiop
💜 #3709 run: try to save deps before running the command, by efiop
💜 #3708 serialize: use checksums that are already saved, by efiop
💜 #3705 remote: adjust traverse threshold multiplier, by pmrowla
💜 #3702 setup: relax python-dateutil pip version constraint to include v2.8.2. #3701, by dchichkov
💜 #3699 tag: getting rid of it, by skshetry
💜 #3695 refactor: dvc/output class names unification, by nik123
💜 #3694 dvc: implement params support for pipeline file, by skshetry
💜 #3689 repo: use reverse post-order DFS in repro --downstream, by pmrowla
💜 #3688 refactor: dvc/dependency class names unification, by nik123
💜 #3686 gdrive: fix multi-remote workflow, cont. cleanup, by shcheklein
💜 #3684 refactor: dvc/remotes class names unification, by nik123
💜 #3676 dvc: implement multistage dvcfile, by skshetry
💜 #3675 remote.ssh: suppress paramiko logging, by pmrowla
💜 #3672 remote: use string paths over PathInfo for performance reasons, by pmrowla
💜 #3603 dvc: introduce local stage cache, by efiop
💜 #3577 Metrics - plotting for multiple revisions initial, by pared


COMMITS

Last week there were 22 commits.
🛠️ [progress: add postfix info to avoid overwriting desc (#3721) * progress: persist primary description

Move subsequent updates to a postfix.
Clear postfix on exit.
TODO: align nicely.
Fixes #3681.

  • progress: move to posfix[info] for full control

It was bound to happen.

  • progress: git: persist description](c873787) by casperdcl
🛠️ [Metrics - plotting for multiple revisions initial (#3577) * init

  • rename to plot data insertion basig on dicts update

  • revision support

  • roll back revision

  • plot makedirs for backward compatibility

  • log path

  • pretty plot link to visualization page

  • make target default title

  • efiop review

  • efiop review

  • plot multiple initial

  • add some missing metric file tests

  • proper id generation

  • proper id generation

  • add confusion matrix template

  • refactor tests

  • plot from dvct file

  • plot from dvct

  • brush up commands

  • fix confusion matrix multiple plot

  • plot: change confusion matrix data schema

  • should be working as intended

  • support for src file in dvct files

  • minor fixes

  • plot: support json templates

  • plot: rename confusion template

  • plot: polish command behaviour

  • fix test for json

  • plot: test command

  • some minor fixes for tests

  • plot: unit test loading

  • plot: unit test loading

  • plot: handle TODOS

  • cleanup

  • use mocker

  • plot: support tsv

  • plot: command refactoring

  • plot: fix windows issues with tests

  • plot: test: some more windows fixes

  • plot: _load_from_revisions complexity fix

  • plot: reduce complexity

  • plot: complexity reduction

  • plot: deepsource suggestions

  • plot: move template path evaluation

  • fixup

  • fixup

  • exception on no datafile and no template

  • json metric load with OrderedDict

  • plot: improve handling non-existing files on revisions

  • plot: improve handling non-existing files on revisions

  • change default plot path

  • some exceptions and fixes

  • add yaml metrics support

  • fixup

  • some more suggestions

  • default filename fix

  • efiop review requests

  • log exception on failur

  • move revisions deduction to commands

  • json templates

  • extract template filling to separate method

  • some parsing improvements

  • add columns functionality

  • extract default data transformation to separate method

  • plot: initial support for jsonpath

  • plot: rename columns to filters, tests are dict based

  • plot: fixups

  • plot: refactoring

  • repo: plot: convert to package

  • plot: data loading refactor, support searching for data

  • plot: raise if wrong fields provided

  • plot: command description

  • plot: default: pass y axis info for default plot

  • plot: get rid of fieldnames, expect ordered data

  • plot: handle default plot in separate method

  • plot: fix default

  • plot: command option names fixes

  • refactoring

  • fixes

  • plot: provide option for stdout redirection

  • plot: rename show-json to no-html

  • plot: add no-csv-header option

  • plot: improve error message for wrongly structured metric

  • plot: match template name exactly, whit suffix appended only

  • plot: dmpetrov and ivan review

  • plot: refactor --stdout help message

  • plot: move template to repo/plot

  • plot: add -x and -y options

  • plot: add -x and -y options

  • plot: command: order change

  • plot: scatter

  • plot: rename confusion matrix template, new name generation format

  • plot: add title anchor

  • plot: review from jorgeorpinel

  • plot: rename filter and result options to select and file

  • plot: add --title, --x-title, --y-title

  • plot: xlab ylab

  • Update dvc/repo/plot/template.py

Co-authored-by: Ruslan Kuprieiev kupruser@gmail.com

  • Update dvc/repo/plot/template.py

Co-authored-by: Ruslan Kuprieiev kupruser@gmail.com

  • efiop review

  • plot: bash completion

  • plot: static code analysis fixes

Co-authored-by: Ruslan Kuprieiev kupruser@gmail.com](e553511) by pared
🛠️ [stage: cache: use lockfiles (#3713) * stage: cache: use lockfiles

  • reorganize

  • Reorganize, use fill_from_lock to load stage cache

  • load params from build cache

  • adjust tests

Co-authored-by: Saugat Pachhai suagatchhetri@outlook.com](827c994) by efiop
🛠️ [dump: deterministic lockfile dump (#3711) * dump: lockfile is dumped deterministically

The dump is no longer deterministic/dependent on the pipeline file,
but is sorted based on file names in outs, deps or params.
Also, the params inside each files are also sorted based on name.
However, the objects inside params are not sorted deterministically
as I think it's too much to sort that, and is not easy (considering
the types of objects it might hold, eg: lists, objects, etc).

This will also provide ordered dumps for Python3.5

  • fix windows 3.8 test

Co-authored-by: Ruslan Kuprieiev ruslan@iterative.ai](22c60dd) by skshetry
🛠️ dvc: rename pipelines.yaml -> dvc.yaml (#3710) by efiop
🛠️ [run: try to save deps before running the command (#3709) Unlike old _check_missing_deps, this also verifies that we are able to
save more complex dependencies such as parameters, where we not only
care about the config file, but also about the parameters in it.

Fixes #3707](71d156a) by efiop
🛠️ serialize: use checksums that are already saved (#3708) get_checksum() recomputes the checksum which might not match the
pre-recorded one. checksum is the one that was save()ed during run
and it is the one that should be used in the lockfile.
by efiop
🛠️ remote: adjust traverse threshold multiplier (#3705) * Fixes #3704 by pmrowla
🛠️ [dvc: introduce local build cache (#3603) This patch introduces .dvc/cache/stages that is used to store previous
runs and their results, which could then be reused later when we stumble
upon the same command with the same deps and outs.

Format of build cache entries is single-line json, which is readable by
humans and might also be used for lock files discussed in #1871.

Related to #1871
Local part of #1234](18e8f07) by efiop
🛠️ [gdrive: fix multi-remote workflow, cont. cleanup (#3686) * remote, minor: fix parameter method name for consistency

  • gdrive: cleanup, fix workflow with multiple gdrive remotes

  • config: resolve gdrive cred file parth, typo fix

  • grdive: address deepsource warning

  • gdrive: fix tests after simplifying auth flow

  • gdrive: address PR review, use backticks where appropriate

Co-Authored-By: Jorge Orpinel jorgeorpinel@users.noreply.github.com

  • gdrive: exception text improvements

Co-Authored-By: Jorge Orpinel jorgeorpinel@users.noreply.github.com

  • gdrive: fix exception message

Co-Authored-By: Jorge Orpinel jorgeorpinel@users.noreply.github.com

  • gdrive: fix root not found exception message

Co-Authored-By: Jorge Orpinel jorgeorpinel@users.noreply.github.com

  • gdrive: minor warnings/exceptions text improvement

  • gdrive: add tests for the gdrive_user_credentials_file relative path

  • gdrive: address review, slightly change text

  • gdrive: comments -> docstrings, addressing PR review

Co-authored-by: Jorge Orpinel jorgeorpinel@users.noreply.github.com](8aefbac) by shcheklein
🛠️ [setup: relax python-dateutil pip version constraint to include v2.8.2. #3701 (#3702) * Relax python-dateutil pip version constraint to include v2.8.2. #3701

[WIP]. Attempt to relax upper constraints on the version of python-dateutil, to include 2.8.1 and 2.8.2.

Original constraint was:
"python-dateutil<2.8.1,>=2.1", # Consolidates azure-blob-storage and boto3

  • Update setup.py

Co-authored-by: Ruslan Kuprieiev kupruser@gmail.com](40c6b56) by dchichkov
🛠️ [tag: getting rid of it (#3699) * tags: get rid of it

  • cloud remotes still default to using PathInfo's
  • cache fspath string

  • use abspath in checksum_to_path

  • refactor Dvcfile into Pipeline file and Single stage file

  • fix tests

  • dvc: fix outputs

  • add more tests for collection of outputs

  • add tests for data cloud/get/import/ls

  • tests: test for checkouts

  • Allow other checksums other than md5

  • tests: use iterdir instead of os.listdir

  • cleanup errors reported by cc and ds

  • utils: throw DvcException instead of plain Exception

  • tests: use yaml.load instead of json.load

  • run: split assignments

  • Update dvc/stage/exceptions.py

Co-authored-by: Ruslan Kuprieiev kupruser@gmail.com](1937527) by skshetry
🛠️ refactor: dvc/dependency class names unification (#3688) by nik123
🛠️ refactor: dvc/remotes name unification (#3684) Partially fixes #2089 by nik123
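
The ordering described in the "dump: deterministic lockfile dump (#3711)" commit message above can be illustrated with a minimal sketch. This is not DVC's code: the function name `sort_lock_entry` and the entry layout (a dict with `cmd`, lists of `deps`/`outs` items carrying a `path` key, and `params` grouped by file name) are assumptions made for the example. It sorts deps and outs by file name, sorts params by file and then by parameter name, and leaves the parameter values themselves untouched, as the commit message describes.

```python
import json
from collections import OrderedDict


def sort_lock_entry(entry):
    """Return a copy of a (hypothetical) lock entry with deterministic ordering.

    The layout of `entry` is an assumption for illustration only, not DVC's
    actual lockfile schema.
    """
    ordered = OrderedDict()
    if "cmd" in entry:
        ordered["cmd"] = entry["cmd"]

    # deps and outs are sorted by file name, independent of pipeline-file order.
    for key in ("deps", "outs"):
        if key in entry:
            ordered[key] = sorted(entry[key], key=lambda item: item["path"])

    # params are sorted by file name, then by parameter name inside each file.
    # The parameter values (lists, nested objects, ...) are left as they are.
    if "params" in entry:
        ordered["params"] = OrderedDict(
            (fname, OrderedDict(sorted(entry["params"][fname].items())))
            for fname in sorted(entry["params"])
        )
    return ordered


if __name__ == "__main__":
    entry = {
        "params": {"params.yaml": {"lr": 0.1, "epochs": 5}},
        "deps": [{"path": "b.txt", "md5": "2"}, {"path": "a.txt", "md5": "1"}],
        "cmd": "python train.py",
    }
    print(json.dumps(sort_lock_entry(entry), indent=2))
```

Using `OrderedDict` rather than a plain `dict` keeps the key order stable on Python 3.5 as well, which the commit message mentions as a goal.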


CONTRIBUTORS

Last week there were 8 contributors.
👀 casperdcl
👀 pared
👀 efiop
👀 skshetry
👀 pmrowla
👀 shcheklein
👀 dchichkov
👀 nik123


STARGAZERS

Last week there were 44 stargazers.
⭐ SuryaThiru
⭐ hoisee
⭐ tinyRatP
⭐ rlalpha
⭐ SergeevVladislav
⭐ agis85
⭐ twang96
⭐ HtutLynn
⭐ VictorGuedes
⭐ homutov
⭐ deshraj
⭐ ms-sharma
⭐ sagewhocodes
⭐ virendrasuryavanshi
⭐ neutrinus
⭐ Mordin13
⭐ hademircii
⭐ oke-aditya
⭐ eyehattaya
⭐ ravila4
⭐ oiotoxt
⭐ kaka-lin
⭐ stefanocoretta
⭐ alheio
⭐ polololya
⭐ Pachec0o0
⭐ jayvasantjv
⭐ Shastick
⭐ nilsdebruin
⭐ conraddd
⭐ SAr2r
⭐ MHarland
⭐ mattc-eostar
⭐ DanHugoDanHugo
⭐ akanz1
⭐ phamquiluan
⭐ germz01
⭐ courentin
⭐ Oktosha
⭐ GiulioRossetti
⭐ bassrehab
⭐ achicha
⭐ MsMandelbrot
⭐ ghrahul
You all are the stars! 🌟


RELEASES

Last week there were no releases.


That's all for last week. Please 👀 Watch and ⭐ Star the repository iterative/dvc to receive next weekly updates. 😃

You can also view all Weekly Digests by clicking here.

Your Weekly Digest bot. 📆
