push: OOM when uploading 1.5TB(3.5Mfiles) dataset to local remote #4139

@lenrys29

Bug Report

Output of dvc version:

DVC version: 1.0.1
Python version: 3.7.1
Platform: Linux-4.19.0-9-amd64-x86_64-with-debian-10.4
Binary: True
Package: deb
Supported remotes: azure, gdrive, gs, hdfs, http, https, s3, ssh, oss
Cache: reflink - supported, hardlink - supported, symlink - supported
Filesystem type (cache directory): ('xfs', '/dev/mapper/vg_data-lv_data')
Repo: dvc, git
Filesystem type (workspace): ('xfs', '/dev/mapper/vg_data-lv_data')

Please provide information about your setup

We've got a single server with one big 2 TB partition.

We are using git + dvc to manage our datasets, training, etc.

For now, we are trying to push 1.5 TB (3.18M files) in a single repo to create our first original dataset.

We are using reflink and xfs for storage.

SSH + local usage for: repo / git / training

But when we push, dvc crashes:

2020-06-29 21:36:34,255 DEBUG: Uploading '.dvc/cache/db/4b681a2fcb2d45436c2bc9420cb116' to '../../../../datasets/upper/idl/db/4b681a2fcb2d45436c2bc9420cb116'
2020-06-29 21:36:34,260 DEBUG: Uploading '.dvc/cache/8f/9081ae7243d5d9b2f63c6d18b27678' to '../../../../datasets/upper/idl/8f/9081ae7243d5d9b2f63c6d18b27678'
2020-06-29 21:36:34,603 DEBUG: Uploading '.dvc/cache/fd/7a74c5c5bc38a23845b0e79671b81f' to '../../../../datasets/upper/idl/fd/7a74c5c5bc38a23845b0e79671b81f'
2020-06-29 21:36:35,078 DEBUG: Uploading '.dvc/cache/9c/b4e48c89859fe5ded008d8f1e0e74a' to '../../../../datasets/upper/idl/9c/b4e48c89859fe5ded008d8f1e0e74a'
2020-06-29 21:36:35,095 DEBUG: Uploading '.dvc/cache/f7/6bd859740457c7b561d5fd5425083d' to '../../../../datasets/upper/idl/f7/6bd859740457c7b561d5fd5425083d'
2020-06-29 21:36:35,226 DEBUG: Uploading '.dvc/cache/42/41c18d87d54a4d6600119da7696261' to '../../../../datasets/upper/idl/42/41c18d87d54a4d6600119da7696261'
2020-06-29 21:36:35,227 DEBUG: Uploading '.dvc/cache/4c/c093f27f05614413e86628c6eec537' to '../../../../datasets/upper/idl/4c/c093f27f05614413e86628c6eec537'
2020-06-29 21:36:35,366 DEBUG: Uploading '.dvc/cache/77/d4e0c59a0be6de0d095593d2866710' to '../../../../datasets/upper/idl/77/d4e0c59a0be6de0d095593d2866710'
2020-06-29 21:36:35,366 DEBUG: Uploading '.dvc/cache/2e/387c8af8732930d3717037c1ce1826' to '../../../../datasets/upper/idl/2e/387c8af8732930d3717037c1ce1826'
2020-06-29 21:36:35,408 DEBUG: Uploading '.dvc/cache/7f/838c5ede1e6e430a786e5e735f01de' to '../../../../datasets/upper/idl/7f/838c5ede1e6e430a786e5e735f01de'
Killed
Jun 29 21:36:43 server kernel: [299632.013388] Out of memory: Kill process 31523 (dvc) score 962 or sacrifice child
Jun 29 21:36:43 server kernel: [299632.013456] Killed process 31523 (dvc) total-vm:12977932kB, anon-rss:7808560kB, file-rss:0kB, shmem-rss:0kB
Jun 29 21:36:43 server kernel: [299632.564083] oom_reaper: reaped process 31523 (dvc), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
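
For readability, here is a minimal sketch that converts the kB figures in the OOM-killer line above into GiB. The helper name `oom_fields_gib` is hypothetical and only for illustration; the input line is copied from the kernel log in this report, and the kernel's `kB` is assumed to mean 1024 bytes.

```python
import re

# Kernel log line copied from the report above.
OOM_LINE = (
    "Killed process 31523 (dvc) total-vm:12977932kB, "
    "anon-rss:7808560kB, file-rss:0kB, shmem-rss:0kB"
)


def oom_fields_gib(line):
    """Return each '<name>:<n>kB' field of an OOM-killer line, in GiB.

    Hypothetical helper; assumes kB in the kernel log means 1024 bytes.
    """
    fields = re.findall(r"([\w-]+):(\d+)kB", line)
    return {name: int(kb) / 1024**2 for name, kb in fields}


if __name__ == "__main__":
    for name, gib in oom_fields_gib(OOM_LINE).items():
        print(f"{name}: {gib:.2f} GiB")
```

On the line above this gives roughly 12.38 GiB of virtual memory and 7.45 GiB of anonymous resident memory for the dvc process at the time it was killed.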

Labels

p2-medium (Medium priority, should be done, but less important), performance (improvement over resource / time consuming tasks), research
