Extracted from #6550 (comment):
This only happens on AppVeyor (macOS again):
======================================================================
ERROR: datalad.core.distributed.tests.test_clone.test_ria_postclonecfg('ssh://datalad-test:/Users/appveyor/DLTMP/datalad_temp_ix8umpb9', '07c27167-6fef-443c-bbb7-3eec35daddc3')
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/appveyor/venv3.8.12/lib/python3.8/site-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/Users/appveyor/projects/datalad/datalad/tests/utils.py", line 288, in _wrap_skip_ssh
return func(*args, **kwargs)
File "/Users/appveyor/projects/datalad/datalad/tests/utils.py", line 874, in _wrap_with_tempfile
return t(*(arg + (filename,)), **kw)
File "/Users/appveyor/projects/datalad/datalad/tests/utils.py", line 874, in _wrap_with_tempfile
return t(*(arg + (filename,)), **kw)
File "/Users/appveyor/projects/datalad/datalad/core/distributed/tests/test_clone.py", line 958, in _test_ria_postclonecfg
riaclone = clone('ria+{}#{}'.format(url, dsid), clone_path)
File "/Users/appveyor/projects/datalad/datalad/interface/utils.py", line 447, in eval_func
return return_func(*args, **kwargs)
File "/Users/appveyor/projects/datalad/datalad/interface/utils.py", line 439, in return_func
results = list(results)
File "/Users/appveyor/projects/datalad/datalad/interface/utils.py", line 424, in generator_func
raise IncompleteResultsError(
datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 1 failed:
[{'action': 'install',
'message': ('Failed to clone from any candidate source URL. Encountered '
'errors per each url were:\n'
'- %s',
'ssh://datalad-test/Users/appveyor/DLTMP/datalad_temp_ix8umpb9/07c/27167-6fef-443c-bbb7-3eec35daddc3\n'
" CommandError: 'git -c diff.ignoreSubmodules=none clone "
'--progress '
'ssh://datalad-test/Users/appveyor/DLTMP/datalad_temp_ix8umpb9/07c/27167-6fef-443c-bbb7-3eec35daddc3 '
"/Users/appveyor/DLTMP/datalad_temp__test_ria_postclonecfgw5zk_49p' "
"failed with exitcode 128 [err: 'Cloning into "
"'/Users/appveyor/DLTMP/datalad_temp__test_ria_postclonecfgw5zk_49p'...\n"
'\r'
'remote: Total 37 (delta 7), reused 0 (delta 0) \n'
"fatal: remote transport reported error']"),
'path': '/Users/appveyor/DLTMP/datalad_temp__test_ria_postclonecfgw5zk_49p',
'source_url': 'ria+ssh://datalad-test:/Users/appveyor/DLTMP/datalad_temp_ix8umpb9#07c27167-6fef-443c-bbb7-3eec35daddc3',
'status': 'error',
'type': 'dataset'}]
This seems flaky. Logging into that AppVeyor build showed that the failure happens at different spots in this test.
Sometimes this clone seems to work out fine, but then the subsequent get on a subdataset fails the same way.
So, currently the failure happens at line 958 in test_clone.py, and on the previous run (exact same commit) it only failed at line 1017.
Moreover, this should not be the only test where we clone from RIA via SSH. It is not clear to me yet how this one is different.
Looking into this, I am seeing a Broken Pipe Error:
[DEBUG] ...>runner:192 Finished ['ssh', '-o', 'ControlPath=/Users/appveyor/Library/Caches/datalad/sockets/fb3f4327', '-o', 'SendEnv=GIT_PROTOCOL', 'datalad-test', "git-upload-pack '/private/var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_vlst5whp/376/1c829-d43c-420a-95fb-4467944477c4'"] with status 0
[ERROR] ...>main:136,185 [Errno 32] Broken pipe (BrokenPipeError)
fatal: remote transport reported error']
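For reference, "[Errno 32] Broken pipe" is what Python raises when a process writes to a pipe whose reading end has already been closed. The following is only a minimal, generic sketch of that condition on a POSIX system; the function name is made up for illustration, and it does not show where in datalad or ssh the pipe is actually being closed:

```python
import os


def write_to_closed_pipe() -> str:
    """Write to a pipe after its read end is closed and report the outcome."""
    read_fd, write_fd = os.pipe()
    # Simulate the peer going away before we are done writing.
    os.close(read_fd)
    try:
        os.write(write_fd, b"data")
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE by default, so the failed write
        # surfaces as an exception with errno EPIPE.
        return f"[Errno {exc.errno}] {exc.strerror}"
    finally:
        os.close(write_fd)
    return "write succeeded"


result = write_to_closed_pipe()
print(result)
```

If the same mechanism is at work here, the side that reports the error was still sending data when its peer had already exited or closed the connection.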
And git-upload-pack seems a bit off indeed:
appveyor$ git-upload-pack '/private/var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_vlst5whp/376/1c829-d43c-420a-95fb-4467944477c4'
010d74b31e4b1f6a81373783c6520507909436ca0f3b HEADmulti_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed symref=HEAD:refs/heads/dl-test-branch object-format=sha1 agent=git/2.35.1
004774b31e4b1f6a81373783c6520507909436ca0f3b refs/heads/dl-test-branch
0000
hanging at this point
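A note on reading that output: the lines are in Git's pkt-line framing, where each packet starts with a four-digit hexadecimal length that includes the four length bytes themselves, and 0000 is a flush packet that ends the ref advertisement. After the flush, git-upload-pack waits for the client's requests on stdin, so a pause at that point when the command is run by hand in a terminal may simply be the server waiting for input rather than a sign of the failure. A minimal sketch of a decoder follows; the helper name is hypothetical, and it deliberately handles only data and flush packets:

```python
def parse_pkt_lines(data: bytes) -> list:
    """Split raw bytes into pkt-line payloads; None marks a flush packet."""
    packets = []
    pos = 0
    while pos < len(data):
        # The first four bytes are the hex length of the whole packet.
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            # "0000" is a flush packet and carries no payload.
            packets.append(None)
            pos += 4
            continue
        if length < 4 or pos + length > len(data):
            # Other special packets (lengths 1-3) are not handled in this sketch.
            raise ValueError(f"unsupported or truncated packet at offset {pos}")
        packets.append(data[pos + 4:pos + length])
        pos += length
    return packets


# The second advertised line from the output above, followed by the flush packet.
raw = (
    b"004774b31e4b1f6a81373783c6520507909436ca0f3b refs/heads/dl-test-branch\n"
    b"0000"
)
packets = parse_pkt_lines(raw)
print(packets)
```

The length prefix 0047 is 71 in decimal: 4 length bytes, 40 for the object id, 1 space, 25 for the ref name, and 1 newline.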
And, of course, there's no problem running this right afterwards:
appveyor$ datalad clone "ria+ssh://datalad-test/private/var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_vlst5whp#3761c829-d43c-420a-95fb-4467944477c4" test5
Clone attempt: 0%| | 0.00/1.00 [00:00<?, ? Candidate locations/s]@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:b7Q9hN2pEJGEvu/BlO2GUD/EV+H/xlmDqx7oCUosGbg.
Please contact your system administrator.
Add correct host key in /Users/appveyor/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /Users/appveyor/.ssh/known_hosts:154
Password authentication is disabled to avoid man-in-the-middle attacks.
Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.
install(ok): /Users/appveyor/projects/test5 (dataset)