-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Closed
Labels
performanceimprovement over resource / time consuming tasksimprovement over resource / time consuming tasks
Description
DVC version:
DVC version: 0.71.0+d42a04
Python version: 3.7.5
Platform: Darwin-18.2.0-x86_64-i386-64bit
Binary: False
Package: None
Cache: reflink - True, hardlink - True, symlink - True
Filesystem type (cache directory): ('apfs', '/dev/disk1s1')
Filesystem type (workspace): ('apfs', '/dev/disk1s1')
Reproduce:
python -m cProfile -o s3-dep-images.prof -m dvc run -v -d s3://dvc-common/images -o out.txt "aws s3 ls s3://dvc-storage/get-started/fe/ >out.txt"
takes 3-4 hours to complete.
s3://dvc-common/images "Directory" has ~0.5M files.
Profile file:
https://www.dropbox.com/s/uqdjmm4pjkn0rj1/s3-dep-images.prof?dl=0
96% of time goes to thread lock.
Output:
Mon Dec 2 23:11:43 2019 s3-dep-images.prof
325633688 function calls (322461350 primitive calls) in 10724.500 seconds
Ordered by: internal time
List reduced from 7431 to 100 due to restriction <100>
ncalls tottime percall cumtime percall filename:lineno(function)
596400 10371.995 0.017 10371.995 0.017 {method 'acquire' of '_thread.lock' objects}
23894 152.922 0.006 152.922 0.006 {method 'read' of '_ssl._SSLSocket' objects}
5782566 12.714 0.000 21.243 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/dateutil/parser/_parser.py:83(get_token)
4475479 9.278 0.000 10.816 0.000 {method 'write' of '_io.TextIOWrapper' objects}
444810 5.790 0.000 52.679 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/dateutil/parser/_parser.py:669(_parse)
459 5.261 0.011 5.261 0.011 {method 'feed' of 'xml.etree.ElementTree.XMLParser' objects}
16273231 5.019 0.000 8.242 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/tqdm/utils.py:300(<genexpr>)
440822 4.726 0.000 5.103 0.000 /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:216(__init__)
440796 4.490 0.000 10374.041 0.024 /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/_base.py:408(result)
134568 4.240 0.000 28.491 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/tqdm/std.py:309(format_meter)
1352110/1352109 4.128 0.000 5.655 0.000 {method 'format' of 'str' objects}
889629 3.983 0.000 13.170 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/dateutil/parser/_parser.py:878(_parse_numeric_token)
297700 3.369 0.000 10376.913 0.035 /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py:264(wait)
269133 3.349 0.000 4.986 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/tqdm/utils.py:209(_is_ascii)
445259/463 3.343 0.000 83.457 0.180 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/botocore/parsers.py:362(_handle_structure)
16138665 3.222 0.000 3.222 0.000 {built-in method unicodedata.east_asian_width}
579377 2.974 0.000 12.928 0.000 {built-in method builtins.sum}
1334439 2.732 0.000 2.732 0.000 {method 'replace' of 'datetime.datetime' objects}
2203982 2.699 0.000 5.480 0.000 /Users/ivan/Projects/dvc/dvc/path_info.py:183(parts)
440797 2.623 0.000 10418.901 0.024 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/tqdm/std.py:1061(__iter__)
2673319/521 2.482 0.000 83.474 0.160 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/botocore/parsers.py:299(_parse_shape)
440797 2.433 0.000 10376.699 0.024 /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/_base.py:591(result_iterator)
441675 2.391 0.000 2.621 0.000 /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py:700(__str__)
1334430 2.356 0.000 3.138 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/dateutil/parser/_parser.py:434(append)
2807755 2.261 0.000 2.261 0.000 {method 'sub' of 're.Pattern' objects}
12454653 2.224 0.000 3.369 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/dateutil/parser/_parser.py:214(isnum)
881628 1.954 0.000 4.412 0.000 /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py:63(parse_parts)
440797 1.850 0.000 8.336 0.000 /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/thread.py:145(submit)
2203982 1.828 0.000 7.840 0.000 /Users/ivan/Projects/dvc/dvc/path_info.py:209(__hash__)
5782566 1.813 0.000 23.057 0.000 /Users/ivan/Projects/dvc/.env/lib/python3.7/site-packages/dateutil/parser/_parser.py:195(__next__)
11596386 1.809 0.000 1.811 0.000 {built-in method builtins.getattr}
1 1.726 1.726 10433.537 10433.537 /Users/ivan/Projects/dvc/dvc/remote/base.py:180(_calculate_checksums)
WilliamHarvey97
Metadata
Metadata
Assignees
Labels
performanceimprovement over resource / time consuming tasksimprovement over resource / time consuming tasks
