Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UX: Better message when ORA does not find store via HTTP #6535

Merged
merged 3 commits into from
Apr 6, 2022

Conversation

bpoldrack
Copy link
Member

@bpoldrack bpoldrack commented Mar 10, 2022

@mih reported on the following current behavior:

This is what it was -- not pretty, but this is 0.15

% datalad clone https://github.com/psychoinformatics-de/studyforrest-data-phase2.git httpclone0.15
[INFO   ] scanning for annexed files (this may take some time)                                                                                                                             
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                                                      
[INFO   ] https://github.com/psychoinformatics-de/studyforrest-data-phase2.git/config download failed: Not Found 
[INFO   ] [INFO] Fetching 'http://studyforrest.ds.inm7.de/ria-layout-version' 
[INFO   ] Failed to establish a new session 1 times. 
install(ok): /tmp/httpclone0.15 (dataset)

and now it is even more verbose, failing to download, failing to establish a session, but then being OK nevertheless

% datalad clone https://github.com/psychoinformatics-de/studyforrest-data-phase2.git httpclone
[INFO   ] scanning for annexed files (this may take some time)                                                                                                                             
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                                                      
[INFO   ] https://github.com/psychoinformatics-de/studyforrest-data-phase2.git/config download failed: Not Found 
[INFO   ] [INFO] Fetching 'http://studyforrest.ds.inm7.de/ria-layout-version' 
[INFO   ] Failed to establish a new session 1 times.  -caused by- HTTPConnectionPool(host='studyforrest.ds.inm7.de', port=80): Max retries exceeded with url: /ria-layout-version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efc22bb5880>: Failed to establish a new connection: [Errno -2] Name or service not known')) 
install(ok): /tmp/httpclone (dataset)

This let's ORA account for HTTP failures during initremote/enableremote and report an invalid store (and its URL).
With support for annexremote's logging framework we'll have other means for detailed messaging soon.

Please note, that the fact we get an ok result is not ORA's fault. It's git annex init that actually succeeds (zero-exit), while ORA now correctly reports INITREMOTE-FAILURE Invalid store at ria+http://studyforrest.ds.inm7.de:

`git clone` + `git annex --debug init`
❱ git clone https://github.com/psychoinformatics-de/studyforrest-data-phase2.git httpclone0.162
Cloning into 'httpclone0.162'...
remote: Enumerating objects: 54786, done.
remote: Counting objects: 100% (7945/7945), done.
remote: Compressing objects: 100% (6769/6769), done.
remote: Total 54786 (delta 191), reused 7600 (delta 177), pack-reused 46841
Receiving objects: 100% (54786/54786), 5.00 MiB | 2.05 MiB/s, done.
Resolving deltas: 100% (4313/4313), done.
(datalad-core-dev) ben@tree in /tmp
❱ cd httpclone0.162
(datalad-core-dev) ben@tree in /tmp/httpclone0.162 on git:master
❱ git annex --debug init
init  [2022-03-10 17:29:00.489432914] (Utility.Process) process [37799] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
[2022-03-10 17:29:00.493406347] (Utility.Process) process [37799] done ExitFailure 1
[2022-03-10 17:29:00.494390089] (Utility.Process) process [37800] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--verify","-q","refs/remotes/origin/git-annex"]
[2022-03-10 17:29:00.498298774] (Utility.Process) process [37800] done ExitSuccess
[2022-03-10 17:29:00.499345349] (Utility.Process) process [37801] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","branch","--no-track","git-annex","refs/remotes/origin/git-annex"]
[2022-03-10 17:29:00.504030974] (Utility.Process) process [37801] done ExitSuccess
[2022-03-10 17:29:00.505082636] (Utility.Process) process [37802] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
[2022-03-10 17:29:00.509112137] (Utility.Process) process [37802] done ExitSuccess
[2022-03-10 17:29:00.511281667] (Utility.Process) process [37803] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","config","annex.uuid","9d6288da-33a0-405c-b753-23b352c6b070"]
[2022-03-10 17:29:00.515971936] (Utility.Process) process [37803] done ExitSuccess
[2022-03-10 17:29:00.517537884] (Utility.Process) process [37804] read: git ["config","--null","--list"]
[2022-03-10 17:29:00.521343702] (Utility.Process) process [37804] done ExitSuccess
[2022-03-10 17:29:00.526302903] (Utility.Process) process [37805] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","git-annex"]
[2022-03-10 17:29:00.531911791] (Utility.Process) process [37805] done ExitSuccess
[2022-03-10 17:29:00.532977264] (Utility.Process) process [37806] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
[2022-03-10 17:29:00.535839026] (Utility.Process) process [37806] done ExitSuccess
[2022-03-10 17:29:00.536672047] (Utility.Process) process [37807] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","log","refs/heads/git-annex..7ff5eafaad121f0abb0e1d12bee999f40be6da8e","--pretty=%H","-n1"]
[2022-03-10 17:29:00.539457162] (Utility.Process) process [37807] done ExitSuccess
[2022-03-10 17:29:00.549149919] (Utility.Process) process [37808] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
[2022-03-10 17:29:00.550645726] (Utility.Process) process [37808] done ExitSuccess
[2022-03-10 17:29:00.551187404] (Utility.Process) process [37809] feed: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","update-index","-z","--index-info"]
[2022-03-10 17:29:00.551590321] (Utility.Process) process [37810] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","ls-tree","-z","-r","--full-tree","refs/heads/git-annex"]
[2022-03-10 17:29:00.566835977] (Utility.Process) process [37810] done ExitSuccess
[2022-03-10 17:29:00.570494737] (Utility.Process) process [37809] done ExitSuccess
[2022-03-10 17:29:00.57109641] (Utility.Process) process [37811] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"]
[2022-03-10 17:29:00.573069818] (Utility.Process) process [37812] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","config","annex.version","8"]
[2022-03-10 17:29:00.574504495] (Utility.Process) process [37812] done ExitSuccess
[2022-03-10 17:29:00.574963083] (Utility.Process) process [37813] read: git ["config","--null","--list"]
[2022-03-10 17:29:00.576102547] (Utility.Process) process [37813] done ExitSuccess
[2022-03-10 17:29:00.576491401] (Utility.Process) process [37814] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","status","--porcelain"]
[2022-03-10 17:29:00.603273269] (Utility.Process) process [37814] done ExitSuccess
[2022-03-10 17:29:00.603610109] (Utility.Process) process [37819] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","config","filter.annex.smudge","git-annex smudge -- %f"]
[2022-03-10 17:29:00.604544673] (Utility.Process) process [37819] done ExitSuccess
[2022-03-10 17:29:00.604855256] (Utility.Process) process [37820] read: git ["config","--null","--list"]
[2022-03-10 17:29:00.605643935] (Utility.Process) process [37820] done ExitSuccess
[2022-03-10 17:29:00.605921902] (Utility.Process) process [37821] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","config","filter.annex.clean","git-annex smudge --clean -- %f"]
[2022-03-10 17:29:00.606821445] (Utility.Process) process [37821] done ExitSuccess
[2022-03-10 17:29:00.606969428] (Utility.Process) process [37822] read: git ["config","--null","--list"]
[2022-03-10 17:29:00.607748951] (Utility.Process) process [37822] done ExitSuccess
[2022-03-10 17:29:00.615478554] (Utility.Process) process [37824] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","write-tree"]
[2022-03-10 17:29:00.617062774] (Utility.Process) process [37824] done ExitSuccess
[2022-03-10 17:29:00.617374827] (Utility.Process) process [37825] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/annex/last-index"]
[2022-03-10 17:29:00.618255919] (Utility.Process) process [37825] done ExitFailure 1
[2022-03-10 17:29:00.618293711] (Database.Keys) reconcileStaged start
[2022-03-10 17:29:00.618616354] (Utility.Process) process [37826] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)","--buffer"]
[2022-03-10 17:29:00.618841825] (Utility.Process) process [37827] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
[2022-03-10 17:29:00.619057717] (Utility.Process) process [37828] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","-c","diff.external=","diff","4b825dc642cb6eb9a060e54bf8d69288fbee4904","3d0f4b475f3a0369f2a7cf0cf05c3d371da7c907","--raw","-z","--no-abbrev","-G/annex/objects/","--no-renames","--ignore-submodules=all","--no-ext-diff"]
[2022-03-10 17:29:00.647158018] (Utility.Process) process [37828] done ExitSuccess
(scanning for annexed files...)
[2022-03-10 17:29:00.848507909] (Utility.Process) process [37827] done ExitSuccess
[2022-03-10 17:29:00.848607397] (Utility.Process) process [37826] done ExitSuccess
[2022-03-10 17:29:00.849096028] (Utility.Process) process [37829] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","update-ref","refs/annex/last-index","3d0f4b475f3a0369f2a7cf0cf05c3d371da7c907"]
[2022-03-10 17:29:00.850252806] (Utility.Process) process [37829] done ExitSuccess
[2022-03-10 17:29:00.850288442] (Database.Keys) reconcileStaged end
[2022-03-10 17:29:00.850680954] (Utility.Process) process [37830] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","symbolic-ref","-q","HEAD"]
[2022-03-10 17:29:00.851465456] (Utility.Process) process [37830] done ExitSuccess
[2022-03-10 17:29:00.851821325] (Utility.Process) process [37831] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","refs/heads/master"]
[2022-03-10 17:29:00.852756416] (Utility.Process) process [37831] done ExitSuccess
[2022-03-10 17:29:00.85300547] (Utility.Process) process [37832] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","symbolic-ref","-q","HEAD"]
[2022-03-10 17:29:00.853874262] (Utility.Process) process [37832] done ExitSuccess
[2022-03-10 17:29:00.854192045] (Utility.Process) process [37833] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/master"]
[2022-03-10 17:29:00.855108305] (Utility.Process) process [37833] done ExitSuccess
[2022-03-10 17:29:00.856565785] (Utility.Process) process [37834] read: uname ["-n"]
[2022-03-10 17:29:00.856890913] (Utility.Process) process [37834] done ExitSuccess
[2022-03-10 17:29:00.857024361] (Annex.Branch) read uuid.log
[2022-03-10 17:29:00.857846891] (Annex.Branch) set uuid.log
[2022-03-10 17:29:00.857899122] (Annex.Branch) read remote.log
[2022-03-10 17:29:00.866725528] (Utility.Url) Request {
  host                 = "github.com"
  port                 = 443
  secure               = True
  requestHeaders       = [("Accept-Encoding","identity"),("User-Agent","git-annex/8.20211123")]
  path                 = "/psychoinformatics-de/studyforrest-data-phase2.git/config"
  queryString          = ""
  method               = "GET"
  proxy                = Nothing
  rawBody              = False
  redirectCount        = 10
  responseTimeout      = ResponseTimeoutDefault
  requestVersion       = HTTP/1.1
}


  Remote origin not usable by git-annex; setting annex-ignore
[2022-03-10 17:29:01.216567177] (Utility.Process) process [37835] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","config","remote.origin.annex-ignore","true"]
[2022-03-10 17:29:01.220896165] (Utility.Process) process [37835] done ExitSuccess
[2022-03-10 17:29:01.221789229] (Utility.Process) process [37836] read: git ["config","--null","--list"]
[2022-03-10 17:29:01.224924817] (Utility.Process) process [37836] done ExitSuccess

  https://github.com/psychoinformatics-de/studyforrest-data-phase2.git/config download failed: Not Found
[2022-03-10 17:29:01.225717781] (Annex.Branch) read trust.log
(Auto enabling special remote inm7-storage...)
[2022-03-10 17:29:01.230686423] (Utility.Process) process [37837] chat: /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora []
[2022-03-10 17:29:01.423465672] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> VERSION 1
[2022-03-10 17:29:01.423558387] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- EXTENSIONS INFO GETGITREMOTENAME ASYNC
[2022-03-10 17:29:01.423716679] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> EXTENSIONS
[2022-03-10 17:29:01.423748627] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- EXPORTSUPPORTED
[2022-03-10 17:29:01.423878052] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> EXPORTSUPPORTED-FAILURE
[2022-03-10 17:29:01.424240744] (Utility.Process) process [37856] chat: /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora []
[2022-03-10 17:29:01.591349978] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> VERSION 1
[2022-03-10 17:29:01.591449988] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- EXTENSIONS INFO GETGITREMOTENAME ASYNC
[2022-03-10 17:29:01.59159535] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> EXTENSIONS
[2022-03-10 17:29:01.591638169] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- LISTCONFIGS
[2022-03-10 17:29:01.591866055] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> CONFIG archive-id Dataset ID. Should be set automatically by datalad
[2022-03-10 17:29:01.591925873] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> CONFIG push-url URL for pushing to the RIA store. Optional.
[2022-03-10 17:29:01.592063811] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> CONFIG url RIA store to use
[2022-03-10 17:29:01.59211857] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> CONFIGEND
[2022-03-10 17:29:01.592194611] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- INITREMOTE
[2022-03-10 17:29:01.592341586] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> GETGITDIR
[2022-03-10 17:29:01.592400791] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- VALUE .git
[2022-03-10 17:29:01.620611209] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> GETCONFIG name
[2022-03-10 17:29:01.620739544] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- VALUE inm7-storage
[2022-03-10 17:29:01.620884218] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> GETCONFIG url
[2022-03-10 17:29:01.620951318] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- VALUE ria+http://studyforrest.ds.inm7.de
[2022-03-10 17:29:01.621118888] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> GETCONFIG push-url
[2022-03-10 17:29:01.621182383] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- VALUE ria+ssh://bulk1.htc.inm7.de/ds/studyforrest/srv
[2022-03-10 17:29:01.621695002] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> GETCONFIG archive-id
[2022-03-10 17:29:01.621754888] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- VALUE 5eaff716-54eb-11e8-803d-a0369f7c647e
[2022-03-10 17:29:01.621855758] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> GETCONFIG force-write
[2022-03-10 17:29:01.621912012] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] <-- VALUE 
[INFO] Fetching 'http://studyforrest.ds.inm7.de/ria-layout-version' 
[2022-03-10 17:29:01.678509094] (Annex.ExternalAddonProcess) /home/ben/venvs/datalad-core-dev/bin/git-annex-remote-ora[1] --> INITREMOTE-FAILURE Invalid store at ria+http://studyforrest.ds.inm7.de

  Invalid store at ria+http://studyforrest.ds.inm7.de
(Auto enabling special remote mddatasrc...)
[2022-03-10 17:29:01.683094365] (Utility.Process) process [37898] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","remote","add","mddatasrc","http://psydata.ovgu.de/studyforrest/phase2/.git"]
[2022-03-10 17:29:01.693890189] (Utility.Process) process [37898] done ExitSuccess
ok
[2022-03-10 17:29:01.694650893] (Utility.Process) process [37899] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","hash-object","-w","--stdin-paths","--no-filters"]
[2022-03-10 17:29:01.695079283] (Utility.Process) process [37900] feed: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","update-index","-z","--index-info"]
[2022-03-10 17:29:01.699625725] (Utility.Process) process [37900] done ExitSuccess
[2022-03-10 17:29:01.699977964] (Utility.Process) process [37901] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
[2022-03-10 17:29:01.701146888] (Utility.Process) process [37901] done ExitSuccess
(recording state in git...)
[2022-03-10 17:29:01.701545396] (Utility.Process) process [37902] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","write-tree"]
[2022-03-10 17:29:01.727341916] (Utility.Process) process [37902] done ExitSuccess
[2022-03-10 17:29:01.727890169] (Utility.Process) process [37903] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","commit-tree","65ca5c50dabe1c3780324db639fb2faa9a4e6571","--no-gpg-sign","-p","refs/heads/git-annex"]
[2022-03-10 17:29:01.72929311] (Utility.Process) process [37903] done ExitSuccess
[2022-03-10 17:29:01.729748125] (Utility.Process) process [37904] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","update-ref","refs/heads/git-annex","b6416923520ae31f5bebf640baa981c47e16f62f"]
[2022-03-10 17:29:01.730915073] (Utility.Process) process [37904] done ExitSuccess
[2022-03-10 17:29:01.731647167] (Utility.Process) process [37811] done ExitSuccess
[2022-03-10 17:29:01.731941703] (Utility.Process) process [37899] done ExitSuccess

Overall the above call would now look like this:

❱ datalad clone https://github.com/psychoinformatics-de/studyforrest-data-phase2.git httpclone0.16
[INFO   ] scanning for annexed files (this may take some time)                                                                                                                                                                                                                  
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                                                                                                                                           
[INFO   ] https://github.com/psychoinformatics-de/studyforrest-data-phase2.git/config download failed: Not Found 
[INFO   ] [INFO] Fetching 'http://studyforrest.ds.inm7.de/ria-layout-version' 
[INFO   ] Invalid store at ria+http://studyforrest.ds.inm7.de 
install(ok): /tmp/httpclone0.16 (dataset)

Changelog

🐛 Bug Fixes

  • Fix ORA special remote not properly reporting on HTTP failures.

@bpoldrack bpoldrack added semver-patch Increment the patch version when merged team-remotes (special) remote implementations (https://github.com/datalad/datalad/issues/6365) UX user experience labels Mar 10, 2022
Copy link
Member

@mih mih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR has no information on the rational of this particular approach for a fix, so I am wondering why the exception handler is implemented in verify_store() and not in _get_version_info(). This present fix solves the problem for the usage pattern directly associated with the bug report, but leaves it in place for usage via verify_ds_in_store().

IMHO _get_version_config() must document and homogenize its exception behavior. The FileNotFoundError error is already handled, as is the UnkownLayoutVersion, as is the NoLayoutVersion case (none documented either). I don't see why a very similar scenario now is handled with yet another exception type.

If that is necessary, why is the handling for HTTP different from other transports? Both UnkownLayoutVersion, and NoLayoutVersion are handled gracefully (read-only), and FileNotFoundError is subsequently tests if there even is a store. Not so for HTTP which is left to crash with "Invalid store" -- although both intermittent connections issues, and "no store" are valid causes of such behavior (again none of these differences documented).

Why must a user see

[INFO] Fetching 'http://studyforrest.ds.inm7.de/ria-layout-version'

?

Isn't this no more than an implementation detail? Whenever things work, it carries no meaning. When things do not work, the exception has the info on what went wrong.

@bpoldrack
Copy link
Member Author

Why must a user see

[INFO] Fetching 'http://studyforrest.ds.inm7.de/ria-layout-version'

?

This is the downloader unconditionally logging at INFO level. Not sure what to do about that ATM. I think this needs to be addressed in a different PR, where the downloaders need to be (somewhat) changed to account for the fact that a download may be an internal thing and that this kind of messaging needs to be left to higher-level code.

Re difference between HTTP and other access methods:

The reason was that we can't easily distinguish what went wrong, since the exception we get in the HTTP case doesn't come as a proper chain, but has its chain folded into its string. So, a Name or service unknown can't easily be identified while I thought it's pointless to keep sending requests when that happens (as in a graceful read-only approach). Whereas in the local and SSH cases things would be different (failing to open an SSH connection to begin with would happen earlier and not be baked into accessing a particular file).
However, it's not nice, I agree. Thinking about it again - there may be a way to unify the behavior from the IO level. Exploring ... Thanks for insisting.

@bpoldrack
Copy link
Member Author

bpoldrack commented Mar 14, 2022

@mih :

How about this: 2fe29c7 ?

Note, that messaging for verify_store and verify_ds_in_store is different, since in the latter case we already succeeded accessing the store at top-level and only failed for the specific dataset, which the message reflects.
More unification could be done (and more reliable error analysis), but I think this is out of scope for two reasons:

  • it should be done by providing better errors from the downloaders (See Downloaders not suitable for internal usage #6545)
  • additional refactoring within ORA and across IO classes would be needed. For example your suggestion to have that error handling within _get_version_config wouldn't work w/o changing what the IO for SSH is doing. However, the plan was to replace it altogether with the different access methods being actually different special remotes. I'd aim to do that for 0.17 and rather spend time on that than refactoring code that is only loosely related to the actual issue and that I want to replace soon-ish.

Agree on that?

@mih
Copy link
Member

mih commented Mar 17, 2022

it should be done by providing better errors from the downloaders

I grew a bit tired re waiting over the years: #6564 -- I think the logging part it is a no brainer, it just needs to be done. I don't see why it would need to wait for even better errors.

@adswa
Copy link
Member

adswa commented Mar 21, 2022

No remarks on the code, but just a note that I would appreciate changes to the looks of cloning the studyforrest dataset sooner rather than later. Its my go-to example throughout talks, tutorials, and during workshops (the next one is on Wednesday), and the current way a clone looks like really makes the impression that something went wrong or broke for someone who just executed their first clone... :/

@mih
Copy link
Member

mih commented Mar 22, 2022

With this PR merged into the present mainline it now looks like

(datalad-dev) mih@meiner /tmp % datalad clone https://github.com/psychoinformatics-de/studyforrest-data-phase2.git httpclone
[INFO   ] scanning for annexed files (this may take some time)                                                                                       
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                
[INFO   ] https://github.com/psychoinformatics-de/studyforrest-data-phase2.git/config download failed: Not Found 
[INFO   ] Invalid store at ria+http://studyforrest.ds.inm7.de 
install(ok): /tmp/httpclone (dataset)

I think the volume of reasoning on the error handling in the patch is an indication that this is not in a good state. But given the urgency, we should move.

We just need to fix the actual error message first. The store at http://studyforrest.ds.inm7.de is perfectly valid, it is just not reachable from the outside network. I think the error message should reflect that the implemented logic cannot tell these scenarios apart. I don't know what would be possible to say here. Perhaps No store or <whatever it has actually tried to figure out at <location>.

It might also make sense to not use a ria+ location description here, but rather the actual location it was using.

This is all just speculation, because I honestly don't see why _get_version_config() must let self.io.read_file() allow to crash, and only handles reporting problems in case of a syntax error within a the info file, and not access permissions, or download files of that file. Your statement on that is just hand-waving:

additional refactoring within ORA and across IO classes would be needed. For example your suggestion to have that error handling within _get_version_config wouldn't work w/o changing what the IO for SSH is doing.

@bpoldrack
Copy link
Member Author

I think the volume of reasoning on the error handling in the patch is an indication that this is not in a good state

Yes. The fundamental issue is SSH, which doesn't parse stderr currently. That's b/c originally we didn't do any validation of store/datasets but simply tried to access keys and treat failures as "key not available" and be done. That's why we only get a RemoteError no matter how things failed.
Adding stderr reading will explode such PR's, though, while we already agreed to rewrite in separate remotes (and ORA could then share the implementation). I don't think we should approach any of this right before 0.16. It has the potential for lots of bugs and will need some time in master.

Will adjust the message.

@mih
Copy link
Member

mih commented Mar 22, 2022

I must be missing a large chunk of the problem. I am taking about the unprotected use of io.read_file() in _get_version_config()

Now you are saying this (in this context):

Yes. The fundamental issue is SSH, which doesn't parse stderr currently. That's b/c originally we didn't do any validation of store/datasets but simply tried to access keys and treat failures as "key not available" and be done. That's why we only get a RemoteError no matter how things failed. Adding stderr reading will explode such PR's, though, while we already agreed to rewrite in separate remotes (and ORA could then share the implementation). I don't think we should approach any of this right before 0.16. It has the potential for lots of bugs and will need some time in master.

AFAICS: SSHRemoteIO raises RIARemoteError if it encounters RemoteCommandFailedError and would crash on anything else.

This method (read_file()) is used exactly once in ora_remote.py, which is precisely in _get_version_config() -- the usage I am talking about.

Outside this file it is used in _ensure_version() which does a similar thing (and is not used in create_sibling_ria, but a plain read_file again, with a different error handling for the same purpose), but is only used when a store/dataset is created, and in clone to download config

None of which indicates:

why _get_version_config() must let self.io.read_file() allow to crash

What am I missing?

@bpoldrack
Copy link
Member Author

bpoldrack commented Mar 25, 2022

We just need to fix the actual error message first. The store at http://studyforrest.ds.inm7.de is perfectly valid, it is just not reachable from the outside network. I think the error message should reflect that the implemented logic cannot tell these scenarios apart. I don't know what would be possible to say here. Perhaps No store or <whatever it has actually tried to figure out at .

Now:

[INFO   ] RIA store unavailable: Failed to access http://studyforrest.ds.inm7.de/ria-layout-version

If it happens at the dataset level after the store validation already succeded, the message would reflect that and report Dataset unavailable from RIA store: Failed to access http://studyforrest.ds.inm7.de/<dsid-path>/ria-layout-version

This let's ORA account for HTTP failures during initremote/enableremote
and report an inaccessible store (and its URL).

Otherwise we'd spit out a possibly gigantic exception traceback that is
not helpful for users. With support for annexremote's logging framework
we'll have other means for detailed messaging soon.

The approach implemented here tries to have a somewhat uniform handling
across different IO classes by raising FileNotFoundError from within
`HTTPRemoteIO.read_file` if we encountered a 404, while all other
exceptions coming from the downloaders are bubbling up as they indicate
a more general issue.
AccessDeniedError and AccessFailedError are handled by the special
remote class to provide a leaner, user-facing message when this happens
during the verification of the store itself or the dataset within.
Everything else will lead to failure as before.
Unify the way `RemoteError` renders the first level of an exception
cause chain with what `format_exception_with_cause()` does down the
line.
@bpoldrack
Copy link
Member Author

bpoldrack commented Mar 31, 2022

New proposal, @mih: 4d89cdd

Reasoning in commit message.
Would currently look like this:

❱ datalad clone https://github.com/psychoinformatics-de/studyforrest-data-phase2.git httpclone0.16
[INFO   ] scanning for annexed files (this may take some time)                                                                         
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                  
[INFO   ] https://github.com/psychoinformatics-de/studyforrest-data-phase2.git/config download failed: Not Found 
[INFO   ] RIA store unavailable. -caused by- Failed to access http://studyforrest.ds.inm7.de/ria-layout-version -caused by- Failed to access http://studyforrest.ds.inm7.de/ria-layout-version -caused by- Failed to establish a new session 1 times.  -caused by- HTTPConnectionPool(host='studyforrest.ds.inm7.de', port=80): Max retries exceeded with url: /ria-layout-version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4719758070>: Failed to establish a new connection: [Errno -2] Name or service not known')) 
install(ok): /tmp/httpclone0.16 (dataset)

What's curious to me right now, is that this

[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                  
[INFO   ] https://github.com/psychoinformatics-de/studyforrest-data-phase2.git/config download failed: Not Found 

Appears to come from git-annex init but somehow the (Autoenabling special remote ... doesn't make it. Anyways - output from init and what we do with it is a different (never ending) story.

If you are fine with this approach, I'll squash.

@bpoldrack bpoldrack requested a review from mih March 31, 2022 12:17
(Somehwhat) unify exceptions raised by the `read_file` methods of the IO
classes. Use proper exception chains all the way up, so that
`read_file`, `_get_version_config` and the
`verify_store/verify_ds_in_store` methods amend the exception chain with
their respective higher-level assessment. Goal is to have maintain the
full chain for debugging but have it's start (first part of user-facing
message) be comprehensible and hopefully actionable for (non-technical)
users.

For example: A message starting with "RIA store unavailable. -caused by-
..." may make clearer whether this is something to report to the store
maintainer or something they can do anything about. While, of course
this depends on the cause, I think it's much easier to parse when the
first "high-level" assessment isn't something to be concluded from a
long chain of technical gibberish.
Furthermore, as the cause one would first see "Failed to access
http://studyforrest.ds.inm7.de/ria-layout-version" indicating the RIA
store's address (incl. access method) instead of the first thing being a
beast like "Failed to establish a new session 1 times.  -caused by-
HTTPConnectionPool(host='studyforrest.ds.inm7.de', port=80): Max retries
exceeded with url: /ria-layout-version (Caused by NewConnectionError(
..."

I think the chain of causes should only be thrown out after we have
proper logging in special remotes, where this part then could move to.
@codeclimate
Copy link

codeclimate bot commented Apr 4, 2022

Code Climate has analyzed commit e6b22d1 and detected 6 issues on this pull request.

Here's the issue category breakdown:

Category Count
Security 6

View more on Code Climate.

@codecov
Copy link

codecov bot commented Apr 4, 2022

Codecov Report

Merging #6535 (e6b22d1) into master (99eeb57) will increase coverage by 0.10%.
The diff coverage is 74.28%.

@@            Coverage Diff             @@
##           master    #6535      +/-   ##
==========================================
+ Coverage   91.06%   91.16%   +0.10%     
==========================================
  Files         349      351       +2     
  Lines       44218    44288      +70     
==========================================
+ Hits        40265    40375     +110     
+ Misses       3953     3913      -40     
Impacted Files Coverage Δ
datalad/distributed/ora_remote.py 80.49% <73.52%> (+0.43%) ⬆️
datalad/customremotes/__init__.py 88.88% <100.00%> (ø)
datalad/core/local/tests/test_status.py 97.79% <0.00%> (-0.68%) ⬇️
datalad/cmd.py 94.09% <0.00%> (-0.40%) ⬇️
datalad/core/local/tests/test_save.py 97.95% <0.00%> (-0.17%) ⬇️
datalad/support/annexrepo.py 91.19% <0.00%> (-0.13%) ⬇️
datalad/__init__.py 85.95% <0.00%> (-0.08%) ⬇️
datalad/interface/test.py 63.63% <0.00%> (ø)
datalad/metadata/metadata.py 89.48% <0.00%> (ø)
datalad/support/annex_utils.py 100.00% <0.00%> (ø)
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 99eeb57...e6b22d1. Read the comment docs.

@bpoldrack bpoldrack merged commit 01ba254 into datalad:master Apr 6, 2022
@mih
Copy link
Member

mih commented Apr 13, 2022

TL;DR final change was:

From 0.15

[INFO ] [INFO] Fetching 'http://studyforrest.ds.inm7.de/ria-layout-version'
[INFO ] Failed to establish a new session 1 times.

To 0.16

RIA store unavailable. -caused by- Failed to access http://studyforrest.ds.inm7.de/ria-layout-version -caused by- Failed to access http://studyforrest.ds.inm7.de/ria-layout-version -caused by- Failed to establish a new session 1 times. -caused by- HTTPConnectionPool(host='studyforrest.ds.inm7.de', port=80): Max retries exceeded with url: /ria-layout-version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4719758070>: Failed to establish a new connection: [Errno -2] Name or service not known'))

Not sure how I feel about this. Certainly the "Failed to establish a new session 1 times." made little sense, but now we have a detailed technical log about a "failure" that any user will simply have to endure and cannot do anyhing about. The dataset has a remote configured that is not publicly accessible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-patch Increment the patch version when merged team-remotes (special) remote implementations (https://github.com/datalad/datalad/issues/6365) UX user experience
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants