
dvc push after dvc push is slow #2867

@JohanMollevik

Please provide information about your setup
DVC version (i.e. dvc --version), platform, and method of installation (pip, homebrew, pkg (Mac), exe (Windows), DEB (Linux), RPM (Linux))

  • Debian stretch
  • DVC installed from pip
    $ dvc --version
    0.71.0

I have a large dataset (#2512) and have been debugging performance to evaluate whether dvc will work for this type of data.

I ran one dvc push against Azure, which took 4 hours for 132 GB of data in 2.5M files tracked by 1 .dvc file. That is OK, assuming there were changes. I then immediately ran dvc push again, and it is taking 4 hours again.

Why does dvc not complete much faster on the second run? There should be no changes between the remote and the local cache, so I was expecting it to finish after a few minutes.
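
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch in Python. It only restates the figures from this report; the 5-minute target is an assumption standing in for "a few minutes", the gigabytes are assumed to be decimal, and nothing here reflects how dvc actually works internally.

```python
# Figures taken from the report above.
FILES = 2_500_000            # "2.5M files"
DATA_BYTES = 132 * 10**9     # "132 GB", assuming decimal gigabytes
PUSH_SECONDS = 4 * 3600      # "4 hours" per push

# Assumption for illustration only: "a few minutes" read as 5 minutes.
TARGET_SECONDS = 5 * 60


def files_per_second(files: int, seconds: float) -> float:
    """Average number of files handled per second over a run."""
    return files / seconds


# Average size of one file in the dataset.
avg_file_bytes = DATA_BYTES / FILES

# Rate observed on both the first and the second push.
observed_rate = files_per_second(FILES, PUSH_SECONDS)

# Rate a no-op push would need to finish within the assumed target.
needed_rate = files_per_second(FILES, TARGET_SECONDS)

print(f"average file size: {avg_file_bytes / 1000:.1f} kB")
print(f"observed rate:     {observed_rate:.1f} files/s")
print(f"rate for target:   {needed_rate:.1f} files/s")
print(f"speed-up needed:   {needed_rate / observed_rate:.0f}x")
```

With files this small, the run time is probably dominated by the number of files rather than by the amount of data, which is why I expected a push with nothing to upload to be much quicker.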


Labels: performance, ui
