-
Notifications
You must be signed in to change notification settings - Fork 426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add/multiprocessing #634
Add/multiprocessing #634
Conversation
As yes and two more things that came to mind now that I've slept a bit:
|
ok I'm running on a n1-standard-32 (32 vCPUs, 120 GB memory) machine. First with a dinky image, ubuntu:latest. Here are results: serial
Here are times using 2* CPU, in this case, 64 workers
(not remarkably different) and finally here are times that are varying the workers (the idea being one configuration would be most optimal)
Again, it's really not convincingly different to warrant adding this. What I'm doing next is testing a much heftier image (one with 24 layers that are each much larger) and from there, I would suggest the following:
For those interested in running their own test cases (@ArangoGutierrez) I have provided the testing code here. And you of course can use the PR here if you don't want to work internally in Python with simple dummy functions. |
ok a quick update on testing this (trivially) with a larger image. It's expensive so I limited to the test to a subset of layers [0:12] for
and the multiprocessing hung and never finished, and I wasn't able to get any kind of log, but I can talk about my observations. What I observed is that the multiprocessing handled the smaller layers simultaneously, and they were all completed very quickly. The bottle neck was the larger layers - because we are still handling the layer download and permission changes via one worker. This means that any improvements / speed ups in time are due to making the smaller layers efficient to finish in "parallel" but the bottle neck is always going to be the largest layer. Interestingly enough, dumping more stuff into one layer is the recommended Dockery way of doing things, to prevent having more smaller layers. It's even farther from our approach since we are working with entire image files. It would be great if someone that has this kind of infrastructure could test this proper, or just give me a log in and I'll do it. I can't be charging my group dollars per hour for it, it's not really good. |
In case it's of interest (even outside this PR) there are modules in python that let us do a pretty detailed logging of every function call! For example, here is just creating the client and getting images for ubuntu:latest:
it's a bit of a hack, but the script to run the test (sitting in libexec/python) is like:
and then the profiler is like
then I'm calling just like:
it could of course be done a lot more cleanly, targeted toward some use, just wanted to share in case it's desired. |
Here are some results using the "hack" script and the PR installed on my host ( Dual XEON DeLL WS) Average Download speed using speedtest-cli : 53.94 Mbit/s ['/tmp/singularity/libexec/python', '/usr/lib/python27.zip', '/usr/lib64/python2.7', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload', '/usr/lib64/python2.7/site-packages', '/usr/lib64/python2.7/site-packages/gtk-2.0', '/usr/lib/python2.7/site-packages', '/tmp/singularity/libexec/python', '/tmp/singularity/libexec/python/docker']
VERBOSE2 SINGULARITY_COMMAND_ASIS found as False
VERBOSE2 SINGULARITY_ROOTFS not defined (None)
VERBOSE2 SINGULARITY_METADATA_FOLDER found as None/.singularity.d
VERBOSE2 SINGULARITY_DISABLE_CACHE found as False
VERBOSE2 SINGULARITY_CACHEDIR found as /home/Eduardo/.singularity
VERBOSE2 SINGULARITY_ENVBASE found as None/.singularity.d/env
VERBOSE2 SINGULARITY_LABELFILE found as None/.singularity.d/labels.json
VERBOSE2 SINGULARITY_INCLUDECMD found as False
VERBOSE2 SINGULARITY_PULLFOLDER found as /tmp/singularity/libexec/python
VERBOSE2 SHUB_NAMEBYHASH not defined (None)
VERBOSE2 SHUB_NAMEBYCOMMIT not defined (None)
VERBOSE2 SHUB_CONTAINERNAME not defined (None)
VERBOSE2 SINGULARITY_CONTENTS found as None/.singularity.d/.layers
DEBUG Headers found: Content-Type,Accept
VERBOSE Registry: index.docker.io
VERBOSE Namespace: library
VERBOSE Repo Name: ubuntu
VERBOSE Repo Tag: latest
VERBOSE Version: None
VERBOSE Obtaining tags: https://index.docker.io/v2/library/ubuntu/tags/list
DEBUG GET https://index.docker.io/v2/library/ubuntu/tags/list
DEBUG Http Error with code 401
DEBUG GET https://auth.docker.io/token?service=registry.docker.io&expires_in=9000&scope=repository:library/ubuntu:pull
DEBUG Headers found: Content-Type,Authorization,Accept
VERBOSE Obtaining manifest: https://index.docker.io/v2/library/ubuntu/manifests/latest
DEBUG GET https://index.docker.io/v2/library/ubuntu/manifests/latest
DEBUG Image manifest version 2.2 found.
DEBUG Adding digest sha256:aafe6b5e13de557451e1781fe7276620275625f970015cbd10036ab7d8ae27c0
DEBUG Adding digest sha256:0a2b43a726608d3835aa027bbe181624789130c212eb191baa481f1d788a0676
DEBUG Adding digest sha256:18bdd1e546d21240bd4c3e4e29032a38f18102d780a78686508d40fe01c60158
DEBUG Adding digest sha256:8198342c3e05101bb56b0a585dc34e39346ece39ee5d39287cda24b1fb445fed
DEBUG Adding digest sha256:f56970a44fd409805e3872cac236d29f5eed1a45e189bb765bcd2163c915f384
1633 function calls (1568 primitive calls) in 2.436 seconds
Ordered by: internal time, call count
ncalls tottime percall cumtime percall filename:lineno(function)
4 0.716 0.179 0.718 0.179 ssl.py:833(do_handshake)
4 0.716 0.179 0.716 0.179 ssl.py:642(read)
4 0.587 0.147 0.936 0.234 socket.py:541(create_connection)
24 0.349 0.015 0.349 0.015 socket.py:227(meth)
4 0.048 0.012 0.048 0.012 ssl.py:401(load_default_certs)
12/10 0.007 0.001 0.007 0.001 socket.py:280(close)
4 0.002 0.000 0.002 0.000 ssl.py:346(__new__)
40 0.001 0.000 0.717 0.018 socket.py:410(readline)
8/3 0.001 0.000 0.007 0.002 socket.py:340(read)
19 0.001 0.000 0.001 0.000 message.py:146(write)
4 0.000 0.000 0.000 0.000 ssl.py:673(getpeercert)
13/5 0.000 0.000 0.001 0.000 sre_parse.py:395(_parse)
4 0.000 0.000 0.718 0.180 ssl.py:529(__init__)
8 0.000 0.000 0.000 0.000 ssl.py:814(close)
4 0.000 0.000 0.001 0.000 httplib.py:271(readheaders)
4 0.000 0.000 2.423 0.606 urllib2.py:1154(do_open)
9 0.000 0.000 0.000 0.000 urllib2.py:348(add_handler)
24 0.000 0.000 0.000 0.000 httplib.py:1005(putheader)
24/5 0.000 0.000 0.000 0.000 sre_compile.py:64(_compile)
8 0.000 0.000 0.000 0.000 socket.py:189(__init__)
4 0.000 0.000 0.000 0.000 ssl.py:711(send)
4 0.000 0.000 0.000 0.000 socket.py:196(close)
35/16 0.000 0.000 0.000 0.000 sre_parse.py:151(getwidth)
123 0.000 0.000 0.000 0.000 sre_parse.py:193(__next)
19 0.000 0.000 0.001 0.000 message.py:112(emit)
4 0.000 0.000 0.001 0.000 ssl.py:198(_dnsname_match)
4 0.000 0.000 0.716 0.179 httplib.py:392(_read_status)
4 0.000 0.000 0.717 0.179 httplib.py:431(begin)
8 0.000 0.000 0.002 0.000 re.py:230(_compile)
4 0.000 0.000 0.050 0.012 ssl.py:410(create_default_context)
2 0.000 0.000 0.000 0.000 decoder.py:370(raw_decode)
6 0.000 0.000 0.000 0.000 urlparse.py:168(urlsplit)
8 0.000 0.000 0.000 0.000 re.py:208(escape)
4 0.000 0.000 1.655 0.414 httplib.py:1067(_send_request)
3/2 0.000 0.000 2.435 1.218 base.py:176(get)
82 0.000 0.000 0.000 0.000 sre_parse.py:141(__getitem__)
4 0.000 0.000 0.717 0.179 httplib.py:1084(getresponse)
2/1 0.000 0.000 2.000 2.000 api.py:113(update_token)
4 0.000 0.000 2.424 0.606 urllib2.py:411(open)
4 0.000 0.000 0.718 0.180 ssl.py:355(wrap_socket)
5 0.000 0.000 0.002 0.000 sre_compile.py:567(compile)
2 0.000 0.000 0.005 0.002 httplib.py:609(_read_chunked)
3 0.000 0.000 0.000 0.000 utf_8.py:15(decode)
75 0.000 0.000 0.000 0.000 sre_parse.py:149(append)
106 0.000 0.000 0.000 0.000 sre_parse.py:212(get)
4 0.000 0.000 0.001 0.000 mimetools.py:24(__init__)
4 0.000 0.000 0.001 0.000 urllib2.py:226(__init__)
5 0.000 0.000 0.001 0.000 sre_parse.py:706(parse)
4 0.000 0.000 1.654 0.414 httplib.py:1252(connect)
24 0.000 0.000 0.000 0.000 urllib2.py:1184(<genexpr>)
8 0.000 0.000 0.000 0.000 socket.py:250(__init__)
26 0.000 0.000 0.000 0.000 rfc822.py:202(isheader)
4 0.000 0.000 0.000 0.000 mimetools.py:33(parsetype)
13/5 0.000 0.000 0.001 0.000 sre_parse.py:317(_parse_sub)
4 0.000 0.000 0.000 0.000 urllib2.py:1125(do_request_)
4 0.000 0.000 1.655 0.414 httplib.py:840(send)
5 0.000 0.000 0.000 0.000 urllib.py:958(__init__)
1 0.000 0.000 0.000 0.000 api.py:333(read_digests)
4 0.000 0.000 0.000 0.000 urllib.py:1092(splithost)
4 0.000 0.000 0.000 0.000 urllib.py:1078(splittype)
4 0.000 0.000 2.423 0.606 urllib2.py:439(_open)
4 0.000 0.000 0.000 0.000 urllib2.py:208(request_host)
1 0.000 0.000 0.000 0.000 urllib.py:1372(getproxies_environment)
6 0.000 0.000 0.000 0.000 urlparse.py:137(urlparse)
4 0.000 0.000 0.000 0.000 urllib.py:1183(splittag)
3 0.000 0.000 0.000 0.000 sre_compile.py:256(_optimize_charset)
4 0.000 0.000 0.000 0.000 ssl.py:743(sendall)
4 0.000 0.000 1.655 0.414 httplib.py:867(_send_output)
8/7 0.000 0.000 0.002 0.000 socket.py:289(__del__)
4 0.000 0.000 2.423 0.606 urllib2.py:1241(https_open)
4 0.000 0.000 0.000 0.000 httplib.py:888(putrequest)
3/2 0.000 0.000 2.428 1.214 base.py:205(submit_request)
1 0.000 0.000 0.000 0.000 urllib2.py:479(build_opener)
26 0.000 0.000 0.000 0.000 httplib.py:257(addheader)
4 0.000 0.000 0.000 0.000 ssl.py:895(makefile)
4 0.000 0.000 0.001 0.000 ssl.py:249(match_hostname)
1 0.000 0.000 0.436 0.436 api.py:208(get_manifest)
5 0.000 0.000 0.000 0.000 sre_compile.py:433(_compile_info)
4 0.000 0.000 0.050 0.013 httplib.py:1239(__init__)
1 0.000 0.000 0.000 0.000 shell.py:71(parse_image_uri)
4 0.000 0.000 0.000 0.000 httplib.py:364(__init__)
10 0.000 0.000 2.423 0.242 urllib2.py:399(_call_chain)
19 0.000 0.000 0.000 0.000 message.py:80(emitError)
2 0.000 0.000 0.000 0.000 decoder.py:359(decode)
4 0.000 0.000 0.000 0.000 httplib.py:826(close)
4 0.000 0.000 0.936 0.234 httplib.py:818(connect)
24 0.000 0.000 0.000 0.000 rfc822.py:290(getheader)
6 0.000 0.000 0.006 0.001 httplib.py:564(read)
8 0.000 0.000 0.000 0.000 urllib2.py:286(get_host)
4 0.000 0.000 0.000 0.000 httplib.py:721(__init__)
11 0.000 0.000 0.000 0.000 sre_compile.py:428(_simple)
5 0.000 0.000 0.000 0.000 urllib.py:1017(__init__)
2 0.000 0.000 0.000 0.000 base.py:110(update_headers)
7 0.000 0.000 0.001 0.000 re.py:192(compile)
4 0.000 0.000 0.001 0.000 base.py:235(prepare_request)
5 0.000 0.000 0.001 0.000 sre_compile.py:552(_code)
10 0.000 0.000 0.000 0.000 <string>:8(__new__)
4 0.000 0.000 0.001 0.000 rfc822.py:88(__init__)
4 0.000 0.000 0.000 0.000 httplib.py:667(_safe_read)
12 0.000 0.000 0.001 0.000 message.py:234(debug)
4 0.000 0.000 0.000 0.000 mimetools.py:50(parseplist)
4 0.000 0.000 0.716 0.179 ssl.py:759(recv)
44 0.000 0.000 0.000 0.000 sre_parse.py:137(__len__)
3 0.000 0.000 0.000 0.000 urlparse.py:160(_splitnetloc)
1 0.000 0.000 2.000 2.000 api.py:179(get_tags)
58 0.000 0.000 0.000 0.000 sre_parse.py:206(match)
4 0.000 0.000 1.655 0.414 httplib.py:1040(request)
1 0.000 0.000 0.001 0.001 re.py:138(match)
4 0.000 0.000 2.424 0.606 urllib2.py:131(urlopen)
4 0.000 0.000 0.000 0.000 urllib2.py:541(http_response)
8 0.000 0.000 0.000 0.000 urllib2.py:279(get_type)
7 0.000 0.000 0.007 0.001 httplib.py:547(close)
30 0.000 0.000 0.000 0.000 rfc822.py:214(islast)
24 0.000 0.000 0.000 0.000 sre_parse.py:92(__init__)
5 0.000 0.000 0.000 0.000 sre_parse.py:268(_escape)
2 0.000 0.000 0.000 0.000 sutils.py:59(add_http)
1 0.000 0.000 0.000 0.000 urllib2.py:181(__init__)
8 0.000 0.000 0.000 0.000 ssl.py:634(_check_connected)
4 0.000 0.000 0.000 0.000 httplib.py:766(_get_hostport)
3 0.000 0.000 0.000 0.000 sre_compile.py:228(_compile_charset)
4 0.000 0.000 0.000 0.000 httplib.py:517(_check_close)
4 0.000 0.000 0.000 0.000 urllib.py:1221(unquote)
2 0.000 0.000 0.000 0.000 __init__.py:294(loads)
1 0.000 0.000 0.000 0.000 urllib2.py:454(error)
19 0.000 0.000 0.000 0.000 message.py:104(isEnabledFor)
10 0.000 0.000 0.000 0.000 sre_compile.py:546(isstring)
4 0.000 0.000 0.000 0.000 httplib.py:1044(_set_content_length)
1 0.000 0.000 2.436 2.436 test_stuff.py:9(main)
1 0.000 0.000 2.000 2.000 api.py:68(__init__)
5 0.000 0.000 0.000 0.000 sre_parse.py:189(__init__)
13 0.000 0.000 0.000 0.000 urllib2.py:528(__lt__)
12 0.000 0.000 0.000 0.000 urllib2.py:1173(<genexpr>)
8 0.000 0.000 0.000 0.000 urllib2.py:314(add_header)
4 0.000 0.000 0.000 0.000 urllib.py:1069(unwrap)
4 0.000 0.000 1.655 0.414 httplib.py:1025(endheaders)
1 0.000 0.000 0.436 0.436 api.py:157(get_images)
30 0.000 0.000 0.000 0.000 rfc822.py:224(iscomment)
28 0.000 0.000 0.000 0.000 httplib.py:860(_output)
7 0.000 0.000 0.000 0.000 message.py:222(verbose)
1 0.000 0.000 0.000 0.000 urllib2.py:555(http_error_default)
10 0.000 0.000 0.000 0.000 urllib2.py:273(get_full_url)
8 0.000 0.000 0.000 0.000 urllib2.py:267(has_data)
8 0.000 0.000 0.000 0.000 sre_parse.py:74(opengroup)
8 0.000 0.000 0.000 0.000 sre_parse.py:85(closegroup)
5 0.000 0.000 0.000 0.000 sre_parse.py:67(__init__)
4 0.000 0.000 0.000 0.000 ssl.py:352(__init__)
2 0.000 0.000 0.000 0.000 UserDict.py:57(items)
1 0.000 0.000 2.000 2.000 api.py:100(load_image)
20 0.000 0.000 0.000 0.000 ssl.py:630(_checkClosed)
8 0.000 0.000 0.000 0.000 socket.py:296(flush)
8 0.000 0.000 0.000 0.000 urllib2.py:322(has_header)
8 0.000 0.000 0.000 0.000 urllib2.py:318(add_unredirected_header)
4 0.000 0.000 0.000 0.000 urllib.py:1054(_is_unicode)
11 0.000 0.000 0.000 0.000 sre_parse.py:145(__setitem__)
4 0.000 0.000 0.000 0.000 httplib.py:785(set_debuglevel)
4 0.000 0.000 0.000 0.000 urllib2.py:256(get_method)
4 0.000 0.000 0.000 0.000 ssl.py:617(context)
6 0.000 0.000 0.000 0.000 urlparse.py:206(<genexpr>)
4 0.000 0.000 0.000 0.000 urllib2.py:293(get_selector)
1 0.000 0.000 0.000 0.000 urllib2.py:337(__init__)
1 0.000 0.000 0.000 0.000 urllib2.py:739(__init__)
1 0.000 0.000 0.000 0.000 base.py:96(__init__)
1 0.000 0.000 0.000 0.000 rfc822.py:444(__contains__)
8 0.000 0.000 0.000 0.000 urllib2.py:521(add_parent)
4 0.000 0.000 0.000 0.000 urllib2.py:305(has_proxy)
1 0.000 0.000 0.000 0.000 urllib2.py:1237(__init__)
4 0.000 0.000 0.000 0.000 urllib.py:1023(info)
2 0.000 0.000 0.000 0.000 urllib2.py:1119(__init__)
1 0.000 0.000 0.000 0.000 rfc822.py:391(__getitem__)
1 0.000 0.000 0.000 0.000 api.py:94(_init_headers)
1 0.000 0.000 0.000 0.000 UserDict.py:103(__contains__)
0 0.000 0.000 profile:0(profiler) ['/tmp/singularity/libexec/python', '/usr/lib/python27.zip', '/usr/lib64/python2.7', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload', '/usr/lib64/python2.7/site-packages', '/usr/lib64/python2.7/site-packages/gtk-2.0', '/usr/lib/python2.7/site-packages', '/tmp/singularity/libexec/python', '/tmp/singularity/libexec/python/docker']
VERBOSE2 SINGULARITY_COMMAND_ASIS found as False
VERBOSE2 SINGULARITY_ROOTFS not defined (None)
VERBOSE2 SINGULARITY_METADATA_FOLDER found as None/.singularity.d
VERBOSE2 SINGULARITY_DISABLE_CACHE found as False
VERBOSE2 SINGULARITY_CACHEDIR found as /home/Eduardo/.singularity
VERBOSE2 SINGULARITY_ENVBASE found as None/.singularity.d/env
VERBOSE2 SINGULARITY_LABELFILE found as None/.singularity.d/labels.json
VERBOSE2 SINGULARITY_INCLUDECMD found as False
VERBOSE2 SINGULARITY_PULLFOLDER found as /tmp/singularity/libexec/python
VERBOSE2 SHUB_NAMEBYHASH not defined (None)
VERBOSE2 SHUB_NAMEBYCOMMIT not defined (None)
VERBOSE2 SHUB_CONTAINERNAME not defined (None)
VERBOSE2 SINGULARITY_CONTENTS found as None/.singularity.d/.layers
DEBUG Headers found: Content-Type,Accept
VERBOSE Registry: index.docker.io
VERBOSE Namespace: library
VERBOSE Repo Name: ubuntu
VERBOSE Repo Tag: latest
VERBOSE Version: None
VERBOSE Obtaining tags: https://index.docker.io/v2/library/ubuntu/tags/list
DEBUG GET https://index.docker.io/v2/library/ubuntu/tags/list
DEBUG Http Error with code 401
DEBUG GET https://auth.docker.io/token?service=registry.docker.io&expires_in=9000&scope=repository:library/ubuntu:pull
DEBUG Headers found: Content-Type,Authorization,Accept
VERBOSE Obtaining manifest: https://index.docker.io/v2/library/ubuntu/manifests/latest
DEBUG GET https://index.docker.io/v2/library/ubuntu/manifests/latest
DEBUG Image manifest version 2.2 found.
DEBUG Adding digest sha256:aafe6b5e13de557451e1781fe7276620275625f970015cbd10036ab7d8ae27c0
DEBUG Adding digest sha256:0a2b43a726608d3835aa027bbe181624789130c212eb191baa481f1d788a0676
DEBUG Adding digest sha256:18bdd1e546d21240bd4c3e4e29032a38f18102d780a78686508d40fe01c60158
DEBUG Adding digest sha256:8198342c3e05101bb56b0a585dc34e39346ece39ee5d39287cda24b1fb445fed
DEBUG Adding digest sha256:f56970a44fd409805e3872cac236d29f5eed1a45e189bb765bcd2163c915f384
1633 function calls (1568 primitive calls) in 2.948 seconds
Ordered by: internal time, call count
ncalls tottime percall cumtime percall filename:lineno(function)
4 1.331 0.333 1.693 0.423 socket.py:541(create_connection)
4 0.713 0.178 0.714 0.179 ssl.py:833(do_handshake)
4 0.481 0.120 0.481 0.120 ssl.py:642(read)
24 0.362 0.015 0.362 0.015 socket.py:227(meth)
4 0.043 0.011 0.043 0.011 ssl.py:401(load_default_certs)
12/10 0.007 0.001 0.008 0.001 socket.py:280(close)
4 0.001 0.000 0.001 0.000 ssl.py:346(__new__)
40 0.001 0.000 0.482 0.012 socket.py:410(readline)
19 0.001 0.000 0.001 0.000 message.py:146(write)
8 0.001 0.000 0.001 0.000 socket.py:189(__init__)
8/3 0.001 0.000 0.007 0.002 socket.py:340(read)
4 0.000 0.000 0.000 0.000 ssl.py:673(getpeercert)
13/5 0.000 0.000 0.001 0.000 sre_parse.py:395(_parse)
8 0.000 0.000 0.001 0.000 ssl.py:814(close)
4 0.000 0.000 0.715 0.179 ssl.py:529(__init__)
4 0.000 0.000 2.935 0.734 urllib2.py:1154(do_open)
9 0.000 0.000 0.000 0.000 urllib2.py:348(add_handler)
4 0.000 0.000 0.001 0.000 httplib.py:271(readheaders)
24/5 0.000 0.000 0.000 0.000 sre_compile.py:64(_compile)
24 0.000 0.000 0.000 0.000 httplib.py:1005(putheader)
4 0.000 0.000 0.001 0.000 ssl.py:198(_dnsname_match)
4 0.000 0.000 0.000 0.000 ssl.py:711(send)
4 0.000 0.000 0.000 0.000 socket.py:196(close)
35/16 0.000 0.000 0.000 0.000 sre_parse.py:151(getwidth)
19 0.000 0.000 0.001 0.000 message.py:112(emit)
123 0.000 0.000 0.000 0.000 sre_parse.py:193(__next)
8 0.000 0.000 0.002 0.000 re.py:230(_compile)
4 0.000 0.000 0.482 0.120 httplib.py:431(begin)
4 0.000 0.000 0.044 0.011 ssl.py:410(create_default_context)
8 0.000 0.000 0.000 0.000 re.py:208(escape)
2 0.000 0.000 0.000 0.000 decoder.py:370(raw_decode)
6 0.000 0.000 0.000 0.000 urlparse.py:168(urlsplit)
4 0.000 0.000 2.409 0.602 httplib.py:1067(_send_request)
3/2 0.000 0.000 2.947 1.474 base.py:176(get)
4 0.000 0.000 0.481 0.120 httplib.py:392(_read_status)
4 0.000 0.000 0.482 0.120 httplib.py:1084(getresponse)
82 0.000 0.000 0.000 0.000 sre_parse.py:141(__getitem__)
4 0.000 0.000 2.936 0.734 urllib2.py:411(open)
2/1 0.000 0.000 2.576 2.576 api.py:113(update_token)
4 0.000 0.000 0.715 0.179 ssl.py:355(wrap_socket)
5 0.000 0.000 0.002 0.000 sre_compile.py:567(compile)
2 0.000 0.000 0.005 0.003 httplib.py:609(_read_chunked)
4 0.000 0.000 0.001 0.000 mimetools.py:24(__init__)
106 0.000 0.000 0.000 0.000 sre_parse.py:212(get)
4 0.000 0.000 0.001 0.000 urllib2.py:226(__init__)
4 0.000 0.000 2.408 0.602 httplib.py:1252(connect)
5 0.000 0.000 0.001 0.000 sre_parse.py:706(parse)
8 0.000 0.000 0.000 0.000 socket.py:250(__init__)
4 0.000 0.000 2.408 0.602 httplib.py:840(send)
1 0.000 0.000 0.001 0.001 api.py:333(read_digests)
3 0.000 0.000 0.000 0.000 utf_8.py:15(decode)
4 0.000 0.000 0.000 0.000 urllib2.py:1125(do_request_)
4 0.000 0.000 0.000 0.000 ssl.py:743(sendall)
4 0.000 0.000 0.000 0.000 mimetools.py:33(parsetype)
24 0.000 0.000 0.000 0.000 urllib2.py:1184(<genexpr>)
1 0.000 0.000 0.000 0.000 urllib2.py:479(build_opener)
26 0.000 0.000 0.000 0.000 rfc822.py:202(isheader)
5 0.000 0.000 0.000 0.000 urllib.py:958(__init__)
4 0.000 0.000 1.693 0.423 httplib.py:818(connect)
6 0.000 0.000 0.000 0.000 urlparse.py:137(urlparse)
4 0.000 0.000 0.000 0.000 urllib.py:1078(splittype)
4 0.000 0.000 0.000 0.000 urllib2.py:208(request_host)
1 0.000 0.000 0.000 0.000 urllib.py:1372(getproxies_environment)
8/7 0.000 0.000 0.002 0.000 socket.py:289(__del__)
5 0.000 0.000 0.000 0.000 sre_compile.py:433(_compile_info)
4 0.000 0.000 2.935 0.734 urllib2.py:439(_open)
4 0.000 0.000 0.000 0.000 urllib.py:1183(splittag)
4 0.000 0.000 2.408 0.602 httplib.py:867(_send_output)
3 0.000 0.000 0.000 0.000 sre_compile.py:256(_optimize_charset)
13/5 0.000 0.000 0.001 0.000 sre_parse.py:317(_parse_sub)
4 0.000 0.000 0.000 0.000 httplib.py:888(putrequest)
4 0.000 0.000 0.001 0.000 ssl.py:249(match_hostname)
26 0.000 0.000 0.000 0.000 httplib.py:257(addheader)
4 0.000 0.000 2.935 0.734 urllib2.py:1241(https_open)
10 0.000 0.000 2.935 0.294 urllib2.py:399(_call_chain)
1 0.000 0.000 0.371 0.371 api.py:208(get_manifest)
3/2 0.000 0.000 2.940 1.470 base.py:205(submit_request)
19 0.000 0.000 0.000 0.000 message.py:80(emitError)
4 0.000 0.000 0.000 0.000 ssl.py:895(makefile)
4 0.000 0.000 0.044 0.011 httplib.py:1239(__init__)
4 0.000 0.000 0.000 0.000 urllib.py:1092(splithost)
4 0.000 0.000 0.000 0.000 httplib.py:364(__init__)
6 0.000 0.000 0.007 0.001 httplib.py:564(read)
58 0.000 0.000 0.000 0.000 sre_parse.py:206(match)
2 0.000 0.000 0.000 0.000 decoder.py:359(decode)
75 0.000 0.000 0.000 0.000 sre_parse.py:149(append)
24 0.000 0.000 0.000 0.000 rfc822.py:290(getheader)
4 0.000 0.000 0.000 0.000 httplib.py:721(__init__)
2 0.000 0.000 0.000 0.000 base.py:110(update_headers)
4 0.000 0.000 0.001 0.000 rfc822.py:88(__init__)
11 0.000 0.000 0.000 0.000 sre_compile.py:428(_simple)
5 0.000 0.000 0.000 0.000 sre_parse.py:189(__init__)
1 0.000 0.000 0.000 0.000 urllib2.py:454(error)
7 0.000 0.000 0.001 0.000 re.py:192(compile)
10 0.000 0.000 0.000 0.000 <string>:8(__new__)
12 0.000 0.000 0.001 0.000 message.py:234(debug)
8 0.000 0.000 0.000 0.000 urllib2.py:286(get_host)
4 0.000 0.000 2.936 0.734 urllib2.py:131(urlopen)
4 0.000 0.000 0.481 0.120 ssl.py:759(recv)
1 0.000 0.000 2.576 2.576 api.py:179(get_tags)
5 0.000 0.000 0.000 0.000 urllib.py:1017(__init__)
4 0.000 0.000 0.001 0.000 base.py:235(prepare_request)
3 0.000 0.000 0.000 0.000 urlparse.py:160(_splitnetloc)
4 0.000 0.000 0.000 0.000 httplib.py:667(_safe_read)
24 0.000 0.000 0.000 0.000 sre_parse.py:92(__init__)
5 0.000 0.000 0.001 0.000 sre_compile.py:552(_code)
4 0.000 0.000 0.000 0.000 mimetools.py:50(parseplist)
1 0.000 0.000 0.000 0.000 shell.py:71(parse_image_uri)
8 0.000 0.000 0.000 0.000 urllib2.py:279(get_type)
4 0.000 0.000 0.000 0.000 urllib2.py:541(http_response)
7 0.000 0.000 0.008 0.001 httplib.py:547(close)
4 0.000 0.000 2.409 0.602 httplib.py:1040(request)
1 0.000 0.000 0.001 0.001 re.py:138(match)
1 0.000 0.000 0.000 0.000 urllib2.py:181(__init__)
28 0.000 0.000 0.000 0.000 httplib.py:860(_output)
4 0.000 0.000 0.000 0.000 httplib.py:826(close)
2 0.000 0.000 0.000 0.000 sutils.py:59(add_http)
30 0.000 0.000 0.000 0.000 rfc822.py:214(islast)
4 0.000 0.000 0.000 0.000 httplib.py:517(_check_close)
4 0.000 0.000 0.000 0.000 httplib.py:766(_get_hostport)
3 0.000 0.000 0.000 0.000 sre_compile.py:228(_compile_charset)
5 0.000 0.000 0.000 0.000 sre_parse.py:268(_escape)
4 0.000 0.000 0.000 0.000 urllib.py:1221(unquote)
44 0.000 0.000 0.000 0.000 sre_parse.py:137(__len__)
1 0.000 0.000 2.948 2.948 test_stuff.py:9(main)
2 0.000 0.000 0.000 0.000 __init__.py:294(loads)
30 0.000 0.000 0.000 0.000 rfc822.py:224(iscomment)
4 0.000 0.000 0.000 0.000 httplib.py:1044(_set_content_length)
4 0.000 0.000 0.000 0.000 urllib.py:1069(unwrap)
1 0.000 0.000 2.576 2.576 api.py:68(__init__)
13 0.000 0.000 0.000 0.000 urllib2.py:528(__lt__)
10 0.000 0.000 0.000 0.000 urllib2.py:273(get_full_url)
10 0.000 0.000 0.000 0.000 sre_compile.py:546(isstring)
4 0.000 0.000 2.408 0.602 httplib.py:1025(endheaders)
8 0.000 0.000 0.000 0.000 urllib2.py:314(add_header)
1 0.000 0.000 0.372 0.372 api.py:157(get_images)
19 0.000 0.000 0.000 0.000 message.py:104(isEnabledFor)
12 0.000 0.000 0.000 0.000 urllib2.py:1173(<genexpr>)
8 0.000 0.000 0.000 0.000 ssl.py:634(_check_connected)
7 0.000 0.000 0.000 0.000 message.py:222(verbose)
5 0.000 0.000 0.000 0.000 sre_parse.py:67(__init__)
1 0.000 0.000 2.576 2.576 api.py:100(load_image)
1 0.000 0.000 0.000 0.000 urllib2.py:555(http_error_default)
8 0.000 0.000 0.000 0.000 socket.py:296(flush)
8 0.000 0.000 0.000 0.000 urllib2.py:322(has_header)
8 0.000 0.000 0.000 0.000 sre_parse.py:74(opengroup)
4 0.000 0.000 0.000 0.000 urllib.py:1054(_is_unicode)
4 0.000 0.000 0.000 0.000 ssl.py:352(__init__)
2 0.000 0.000 0.000 0.000 UserDict.py:57(items)
8 0.000 0.000 0.000 0.000 urllib2.py:318(add_unredirected_header)
20 0.000 0.000 0.000 0.000 ssl.py:630(_checkClosed)
11 0.000 0.000 0.000 0.000 sre_parse.py:145(__setitem__)
6 0.000 0.000 0.000 0.000 urlparse.py:206(<genexpr>)
4 0.000 0.000 0.000 0.000 urllib2.py:256(get_method)
8 0.000 0.000 0.000 0.000 urllib2.py:267(has_data)
4 0.000 0.000 0.000 0.000 ssl.py:617(context)
1 0.000 0.000 0.000 0.000 urllib2.py:739(__init__)
1 0.000 0.000 0.000 0.000 base.py:96(__init__)
8 0.000 0.000 0.000 0.000 sre_parse.py:85(closegroup)
4 0.000 0.000 0.000 0.000 httplib.py:785(set_debuglevel)
4 0.000 0.000 0.000 0.000 urllib.py:1023(info)
1 0.000 0.000 0.000 0.000 urllib2.py:1237(__init__)
1 0.000 0.000 0.000 0.000 urllib2.py:337(__init__)
1 0.000 0.000 0.000 0.000 rfc822.py:444(__contains__)
8 0.000 0.000 0.000 0.000 urllib2.py:521(add_parent)
4 0.000 0.000 0.000 0.000 urllib2.py:293(get_selector)
4 0.000 0.000 0.000 0.000 urllib2.py:305(has_proxy)
2 0.000 0.000 0.000 0.000 urllib2.py:1119(__init__)
1 0.000 0.000 0.000 0.000 rfc822.py:391(__getitem__)
1 0.000 0.000 0.000 0.000 UserDict.py:103(__contains__)
1 0.000 0.000 0.000 0.000 api.py:94(_init_headers)
0 0.000 0.000 profile:0(profiler) |
sorry - where is the base case, and where is the list of times? The logger helper was not intended for this. I would recommend trying with image
thanks! |
ok, updates are the following! We wanted to be able to separate the downloading of layers from the changing of permissions, because right now they are packaged together, and each takes a substantial amount of time. However, you can imagine it would be less than ideal to do two "batches" like:
because let's say we have 15 files, it takes 10 minutes to get to number 10 and then for step 1, the entire thing dies. We have to start over. That's annoying, and that's why we have it (now) like:
so the challenge is - how to split up changing permissions from downloading without making it a PITA if you fail mid way? I decided to do the following. Dynamic Multiprocessing ClientWhat I basically wanted to do is allow for the multiprocessing client to, midway running things, add another "thing" to his queue, the "thing" being the file that is downloaded, but still needs permissions fixed. To support this, I made the MultiProcess client take an optional second function,
then, as before, we create a multiprocessing pool to start the first set of tasks to download the files.
but you'll notice I'm not appending these ids (
Then we have some catches to make sure keyboard interrupts are caught, and we return the list of finished. To support this, the different functions for download and changing permissions are now under docker/tasks, and the I think/hope this should speed things up a little bit! I haven't tested that all the results are still passing, but will update the PR with fixes for anything that doesn't. Progress BarThe progress bar can now minimally show the number of total tasks (downloads and permissions) as separate things instead of them being bundled into 1. I've also fixed it so the shub downloads will have the progress bar again. I changed the |
Fantastic, merged! |
This is the first PR to add multiprocessing to downloads, which, with some trivial testing, seems like it might speed things up by a small amount (see here http://www.vanessasaur.us/rawr/2017/multiprocess/). A few things I'd like to talk about:
Progress Bar
the stream downloader now has the progress bar removed, in favor of including it with the multiprocessing calls (to ensure that you don't see crazy output). This also means that the updates are less frequent, and a lot more jumpy (eg, one bar is shown for all layers, and it proceeds when a layer is done. I used Pool.imap instead of Pool.imap_unordered or Pool.apply_async to ensure that the ordering is preserved. This also means that the progress bar is gone for shub, and this should be changed. I'd like to have a discussion about a new progress bar strategy - is there a more ideal visual that can be shown that doesn't have the arrow
>
making the user expecting more moving progress? If so, should we do the same for the shub image, or implement the old version for that (having the call to bot.show_progress be in separate functions, which is different than how it was before, both were in the stream function.Testing
I'd like to have some test this out on different OS and CPUs to see if it does make a difference. And also get feedback on tweaking the MultiProcess module that I added in case we can make it faster.
I noticed strange behavior with pool, that it would print an empty newline after each process. As a fix I wrote a
mute
function that sends stderr output to null, and that resolved it. I don't think I need to close this, but wanted to check.Going to sleep!
@singularityware-admin