Able to get all individual files, but not folder as a whole from RIA: "invalid start byte" & "broken pipe" #7214
Comments
Repeating the process with
Apparently, the error comes from the ora special remote ( Full debug output:
|
Interesting. This smells like the ORA remote is unable to write the streamed chunks into the target file. (As in: Are we writing a utf-8 string instead of binary somewhere?) To be clear: |
We had the decoding error before. A component split its output into random chunks, and the receiver just decoded every chunk. That led to errors, but only if the chunk borders were located inside a UTF-8 encoded character |
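To illustrate the failure mode described above, here is a minimal sketch in plain Python. It is not DataLad code: the sample string, the chunk split, and both function names are made up for illustration. It shows that decoding each chunk separately fails when a chunk border falls inside a multi-byte character, while an incremental decoder from the standard library handles the same chunks.

```python
import codecs

# Hypothetical sample data: "ï" is encoded as the two bytes 0xC3 0xAF.
data = "naïve café".encode("utf-8")
# Split right in the middle of "ï", as a chunking component might do.
chunks = [data[:3], data[3:]]


def decode_per_chunk(chunks):
    """Decode every chunk on its own; breaks if a character is split."""
    return "".join(chunk.decode("utf-8") for chunk in chunks)


def decode_incrementally(chunks):
    """Keep decoder state between chunks so split characters survive."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = [decoder.decode(chunk) for chunk in chunks]
    # Flush the decoder; raises if the stream ends mid-character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


try:
    decode_per_chunk(chunks)
except UnicodeDecodeError as exc:
    print("per-chunk decoding failed:", exc.reason)

print(decode_incrementally(chunks))
```

Decoding the second chunk alone fails with "invalid start byte", because that chunk begins with a continuation byte.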
So it's either that, or we |
@mslw and I found it. It's a little more twisted, but yes we ended up decoding binary data. The |
Leaving a note on how the bug would be triggered. It depends on several things: first and foremost, the triggering annex keys must not contain the key's size; the connection to the RIA store must be via SSH; and several keys must be retrieved in the same call. Such a key would in itself succeed, but screw up the ORA remote for the following key by leaving "garbage" in stdout of the persistent remote shell. The actual bug, however, was to issue a remote |
This delays the remote-end `cat` command until just before its output is needed, for two reasons: 1. The command output is not needed if `get` decides to use scp 2. When getting multiple files encrypted by git-annex (which would use scp, due to size not being part of the annex key), the `cat` output would remain in standard output and would be read when checking whether the *next* file exists, crashing upon `decode()` datalad#7214
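The mechanism in this commit message can be simulated with a toy model. This is only a sketch under stated assumptions, not the ORA remote's implementation: `FakePersistentShell`, `eager_get`, `lazy_get`, `exists`, and the payload bytes are all hypothetical names and values. It models a persistent shell whose stdout is one shared stream, so output that is never consumed is still there when the next command's answer is read.

```python
# Hypothetical binary payload standing in for encrypted file content.
BINARY_PAYLOAD = b"\xff\xfe\x00\x01 not text"


class FakePersistentShell:
    """Toy stand-in for a long-lived remote shell with one stdout stream."""

    def __init__(self):
        self._pending = bytearray()

    def issue(self, remote_output: bytes) -> None:
        """Pretend a remote command ran and wrote `remote_output`."""
        self._pending += remote_output

    def read_raw(self) -> bytes:
        """Consume everything currently waiting in stdout."""
        data = bytes(self._pending)
        self._pending.clear()
        return data


def eager_get(shell, use_scp):
    """Issue `cat` up front, even if scp ends up doing the transfer."""
    shell.issue(BINARY_PAYLOAD)
    if not use_scp:
        shell.read_raw()  # output consumed only on the non-scp path


def lazy_get(shell, use_scp):
    """Issue `cat` only just before its output is needed."""
    if not use_scp:
        shell.issue(BINARY_PAYLOAD)
        shell.read_raw()


def exists(shell):
    """Existence check for the next key; expects a short text answer."""
    shell.issue(b"found\n")
    return shell.read_raw().decode("utf-8").strip() == "found"


shell = FakePersistentShell()
eager_get(shell, use_scp=True)
try:
    exists(shell)
except UnicodeDecodeError as exc:
    print("eager variant breaks the next check:", exc.reason)

shell = FakePersistentShell()
lazy_get(shell, use_scp=True)
print("lazy variant, next check ok:", exists(shell))
```

In the eager variant the leftover payload is read together with the answer to the existence check, and `decode()` fails on it; in the lazy variant nothing is left behind.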
FTR: The |
Issue fixed in |
I am working in a dataset clone made from `ria+ssh`. When trying to `datalad get` a folder with several files, the result is "ok" for the first file in that folder, and then "error" for all subsequent files:
However, I can get all these files one by one:
Trying with `--log-level debug` reveals that the error comes from git annex itself (at this moment we have three files locally, it gets the fourth, and errors on subsequent):
And indeed:
Note: the remote which git-annex suggests to "maybe add" above is where the RIA store got populated from (same machine as the RIA). For the record, I can clone that "initial" dataset through SSH and have no problems getting the folder in that situation. But my intention is to drop data from there and only access the RIA from the outside.
Context:
datalad get