zombie processes from dvc pull failures #3744

@ychou85

Please provide information about your setup
DVC version (i.e. dvc --version), platform, and method of installation (pip, homebrew, pkg (Mac), exe (Windows), DEB (Linux), RPM (Linux))

DVC version: 0.93.0
Platform: DGX2 Kubernetes cluster, installed with pip

The base image we use:

FROM nvcr.io/nvidia/pytorch:19.10-py3

When we call dvc pull and the pull fails for any reason (for example, missing or incorrect credentials, or interrupting it with Ctrl-C), a large number of zombie processes start spawning on our pod. They accumulate until they tie up all free resources on the cluster and grind things to a halt. Can someone look into this, please? It is affecting a pilot team promoting DVC usage at a major healthcare corporation.
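
To make the report easier to reproduce and quantify, here is a minimal diagnostic sketch for counting zombies inside the pod before and after a failed pull. It is not part of DVC and says nothing about the root cause. It assumes a Linux container with procfs mounted at /proc, and the helper names (parse_stat, list_zombies) are hypothetical, chosen only for illustration.

```python
import os
from typing import List, Tuple


def parse_stat(stat_line: str) -> Tuple[str, int]:
    """Return (state, ppid) parsed from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    fields = stat_line[stat_line.rfind(")") + 1:].split()
    return fields[0], int(fields[1])


def list_zombies(proc_root: str = "/proc") -> List[Tuple[int, int]]:
    """Return (pid, ppid) for every process whose state is 'Z' (zombie)."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no procfs at this path (e.g. not a Linux system)

    zombies = []
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directory names are PIDs
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                state, ppid = parse_stat(f.read())
        except (OSError, IndexError, ValueError):
            continue  # process vanished or stat was unreadable mid-scan
        if state == "Z":
            zombies.append((int(entry), ppid))
    return zombies


if __name__ == "__main__":
    for pid, ppid in list_zombies():
        print(f"zombie pid={pid} parent={ppid}")
```

Running this once before dvc pull and again after the failure would show how many zombies were left behind and which parent PID has not reaped them, which may help narrow down where the children are being spawned.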

Labels: awaiting response (we are waiting for your reply, please respond! :))
