Address recent changes in behavior of git-annex #7372
Codecov Report
@@ Coverage Diff @@
## maint #7372 +/- ##
==========================================
- Coverage 91.53% 91.51% -0.03%
==========================================
Files 325 325
Lines 43345 43392 +47
Branches 5806 5819 +13
==========================================
+ Hits 39678 39711 +33
- Misses 3652 3666 +14
Partials 15 15
(branch updated: 6cbe73e to 339bae0)
Re double escaping: filed https://git-annex.branchable.com/bugs/started_to_escape_characters_in_the_output/?updated |
I've reverted the change to |
I have fixed git-annex's output on errors to contain "git-annex:" again, so you do not need commit 7f1b37b. Of course, parsing the text of error messages is fragile, and it should be possible to add a message-id for that one, and for whatever other ones datalad might currently parse. |
Thanks @joeyh! And uff -- I guess I should have just reported it, but I wanted to get some concrete detail first, so I spent hours chasing a new rabbit. With the current version I have:
❯ mkdir /tmp/emptydir
❯ ( source /home/yoh/git-annexes/10.20230407+git131-gb90c2156a6.env ; HOME=/tmp/emptydir git annex log; )
+ Wed, 3 May 2023 13:48:41 EDT " |;&%b5{}'\"<>\316\224\320\231\327\247\331\205\340\271\227\343\201\202 .datc _1" | 3ef4cd3b-38ae-4666-82da-d5f47e6b8b67 -- yoh@bilena:/home/yoh/.tmp/datalad_temp_test_AnnexRepo_always_commitxmf6ldph
❯ ( source /home/yoh/git-annexes/10.20230407+git14-ga0e6fa18eb.env ; HOME=/tmp/emptydir git annex log; )
+ Wed, 3 May 2023 13:48:41 EDT |;&%b5{}'"<>ΔЙקم๗あ .datc _1 | 3ef4cd3b-38ae-4666-82da-d5f47e6b8b67 -- yoh@bilena:/home/yoh/.tmp/datalad_temp_test_AnnexRepo_always_commitxmf6ldph |
oh - that is due to
RTFM:
and I guess it relates to 10.20230407-18-gdf6f9f1ee8 or somewhere within there:
how I got that config -- likely while trying for #7004 which I abandoned but left my config behind. uff -- at least some mysteries are getting resolved. So, I guess, we are doomed to get to #7004 one way or another:
edit 1: note that
❯ git -c core.quotepath=false ls-files
" |;&%b5{}'\"<>ΔЙקم๗あ .datc _1"
❯ git -c core.quotepath=true ls-files
" |;&%b5{}'\"<>\316\224\320\231\327\247\331\205\340\271\227\343\201\202 .datc _1"
i.e. that |
Right, core.quotePath only affects utf-8 (or similar high characters) and git-annex supports it now. As to a separate CI run, I'd hope you can avoid it since you still do need to handle quoting of low characters eg |
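To make the quoting discussed above concrete, here is a minimal Python sketch of undoing git's C-style path quoting (double quotes, backslash escapes, and three-digit octal escapes for bytes), as seen in the `ls-files` output earlier in this thread. The function name `unquote_git_path` is made up for illustration and is not part of datalad or git-annex; it assumes well-formed input and does not handle a dangling trailing backslash.

```python
# Single-character escapes used by git's C-style quoting, mapped to byte values.
_SIMPLE_ESCAPES = {
    "a": 7, "b": 8, "f": 12, "n": 10, "r": 13, "t": 9, "v": 11,
    '"': 34, "\\": 92,
}


def unquote_git_path(text):
    """Undo git's C-style quoting; return unquoted input unchanged."""
    if not (len(text) >= 2 and text.startswith('"') and text.endswith('"')):
        return text
    body = text[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            # Unescaped character (possibly non-ASCII when quotepath=false).
            out += ch.encode("utf-8")
            i += 1
            continue
        nxt = body[i + 1]
        if nxt in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[nxt])
            i += 2
        else:
            # Three octal digits encode one raw byte, e.g. \316 -> 0xCE.
            out.append(int(body[i + 1:i + 4], 8))
            i += 4
    return out.decode("utf-8", errors="surrogateescape")
```

With this, both the `core.quotepath=true` and `core.quotepath=false` forms of the obscure test filename should decode to the same string, which is why low characters such as `"` still need handling in either mode.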
re - As for
❯ ( source /home/yoh/git-annexes/10.20230407+git131-gb90c2156a6.env ; git -c core.quotepath=false annex find --in here ; )
|;&%b5{}'"<>ΔЙקم๗あ .datc _2
❯ ( source /home/yoh/git-annexes/10.20230407+git131-gb90c2156a6.env ; git -c core.quotepath=true annex find --in here ; )
|;&%b5{}'"<>ΔЙקم๗あ .datc _2 |
…or unknown file" This reverts commit acdebc0. Per @joeyh datalad#7372 (comment) (commit hexsha differs due to rebases since then)
…own special remote" This reverts commit 90cf159. Per @joeyh datalad#7372 (comment) (commit hexsha differs due to rebases since then)
…r recent annex, fix+test unannex Not quoting filenames is not default git behavior and might have some ramifications! datalad#7372 has more discussion on what inspired these changes. "unannex" was completely broken in case of filenames with spaces, so this commit fixes it as well while going through interfaces where we still do not use --json. Filed https://git-annex.branchable.com/todo/--json_for_unannex__and_ideally_any_other_command_/ which was said to be already addressed. Also added explicit testing in that added test for get_annexed_files which uses find and which does not quote ATM.
appveyor (not a new git-annex version, so really regression testing):
windows -- new test shows that windows cannot do
[00:30:46] FAILED ../datalad/local/tests/test_rerun.py::test_rerun - datalad.runner.exception.CommandError: CommandError: 'git -c diff.ignoreSubmodules=none -c core.quotepath=false annex find --anything --json --json-error-messages -c annex.dotfiles=true -- /Users/appveyor/DLTMP/datalad_temp_test_rerunrdh7klqr/sub/sequence' failed with exitcode 139 under /Users/appveyor/DLTMP/datalad_temp_test_rerunrdh7klqr/sub [info keys: stdout_json] [err: 'error: git-annex died of signal 11']
travis: 3 or more runs stalled!!! Then we got the originally reported failures (which I thought I had addressed; maybe I reverted something I should not have yet, heh) and some new ones:
|
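As a side note on reading the failure above: exit code 139 is consistent with the "died of signal 11" message under the common shell convention of reporting 128 plus the signal number. A small sketch of decoding such codes follows; `describe_exit_code` is a hypothetical helper, not something datalad provides.

```python
import signal


def describe_exit_code(code):
    """Explain an exit code using the shell convention of 128 + signal number."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Number does not map to a signal known on this platform.
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited with status {code}"
```

For example, `describe_exit_code(139)` reports signal 11, which is the segmentation fault signal on Linux and macOS.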
grr -- can't reproduce this test_gin_cloning error locally. Yet to figure out on which test the other failing runs are stalling :-/
_______________________________ test_gin_cloning _______________________________
[gw0] linux -- Python 3.7.16 /tmp/dl-miniconda-qkjoj5_z/bin/python
path = '/tmp/datalad_temp_test_gin_cloninghxn114de'
@xfail_buggy_annex_info
@skip_if_no_network
@with_tempfile
def test_gin_cloning(path=None):
# can we clone a public ds anoynmously from gin and retrieve content
ds = clone('https://gin.g-node.org/datalad/datalad-ci-target', path)
ok_(ds.is_installed())
annex_path = op.join('annex', 'two')
git_path = op.join('git', 'one')
eq_(ds.repo.file_has_content(annex_path), False)
> eq_(ds.repo.is_under_annex(git_path), False)
../datalad/core/distributed/tests/test_clone.py:1571:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../datalad/support/gitrepo.py:365: in _wrap_normalize_paths
result = func(self, files_new, *args, **kwargs)
../datalad/support/annexrepo.py:2060: in is_under_annex
return self._check_files(check, files, batch)
../datalad/support/annexrepo.py:2005: in _check_files
annex_res = fn(files, normalize_paths=False, batch=batch)
../datalad/support/annexrepo.py:2058: in check
fast=True, **kwargs)
../datalad/support/gitrepo.py:365: in _wrap_normalize_paths
result = func(self, files_new, *args, **kwargs)
../datalad/support/annexrepo.py:2596: in info
assert normpath(j.pop('file')) == normpath(f)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
path = None
def normpath(path):
"""Normalize path, eliminating double slashes, etc."""
> path = os.fspath(path)
E TypeError: expected str, bytes or os.PathLike object, not NoneType
/tmp/dl-miniconda-qkjoj5_z/lib/python3.7/posixpath.py:340: TypeError |
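The traceback shows `normpath()` receiving `None`, i.e. the JSON record's `file` field came back empty. One possible defensive shape for that comparison is sketched below; `record_matches_path` is a hypothetical helper for illustration only and is not the fix applied in this PR.

```python
import os.path as op


def record_matches_path(record, expected):
    """Return True if the record's 'file' field refers to the expected path."""
    reported = record.get("file")
    if reported is None:
        # No usable path reported; avoid passing None into normpath().
        return False
    return op.normpath(reported) == op.normpath(expected)
```

A caller could then raise a descriptive error instead of hitting a `TypeError` deep inside `posixpath`.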
git-annex find is a special case: it does not quote, and it only hides escape sequences when connected to a terminal. In a pipe you will always get raw filenames from it. (Of course can use |
I guess it would be better for us to make it explicit to use |
That's the default, specifying it would not change anything. |
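Since `git annex find` emits raw filenames in a pipe, newline-separated output is ambiguous for names that themselves contain newlines. One way around that, assuming the command is run with a NUL-separating option such as `--print0`, is to split on NUL bytes. The helper name below is an assumption for illustration, not datalad code.

```python
import os


def split_nul_separated(data):
    """Split NUL-separated bytes output into a list of filenames."""
    # A trailing NUL yields an empty final chunk, which is skipped.
    return [os.fsdecode(chunk) for chunk in data.split(b"\0") if chunk]
```

This keeps spaces, quotes, and even newlines inside filenames intact, so no unquoting step is needed on the consumer side.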
I was relating to |
oh, travels kept my attention away from this PR/effort. I am thinking of decommissioning travis CI (or shrinking it and moving to the free tier), since it largely overlaps with appveyor coverage (though IIRC from some testing PR it covers a little more) and no more datalad grant funds are available. Meanwhile, maybe I will keep this TEMP change so that travis tests against a recent build, and later change it to the released version whenever it comes out. @datalad/developers - I will take this PR out of Draft mode. It is important to merge/release it before git-annex releases, so please have a look if you haven't done so yet. |
eh, under NFS we still have one test hanging which I :
❯ tools/find-hanged-tests /tmp/log.txt
Working on /tmp/log.txt
461 started, 460 completed
Never completed:
../datalad/support/tests/test_fileinfo.py::test_report_absent_keys
attempt to reproduce the stall, running solely on that test
❯ ( source ~/git-annexes/10.20230407+git272-gf6dd34ca81.env; git annex version | head -n 1 ; tools/eval_under_nfs python -m pytest -s -v datalad/support/tests/test_fileinfo.py::test_report_absent_keys; )
git-annex version: 10.20230407+git272-gf6dd34ca81-1~ndall+1
I: mounting /home/yoh/.tmp/datalad-nfs-GVJ44.orig under /home/yoh/.tmp/datalad-nfs-GVJ44.nfs via nfs
+ mkdir -p /home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ mkdir -p /home/yoh/.tmp/datalad-nfs-GVJ44.orig
+ dpkg -l nfs-kernel-server
+ grep '^ii.*nfs-kernel-server'
ii nfs-kernel-server 1:2.6.2-4 amd64 support for NFS kernel server
+ sudo exportfs -o rw localhost:/home/yoh/.tmp/datalad-nfs-GVJ44.orig
+ sudo mount -t nfs localhost:/home/yoh/.tmp/datalad-nfs-GVJ44.orig /home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ sudo mount
+ grep /home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ sed -e 's,^,I: ,g'
I: localhost:/home/yoh/.tmp/datalad-nfs-GVJ44.orig on /home/yoh/.tmp/datalad-nfs-GVJ44.nfs type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,timeo=600,retrans=2,sec=sys,clientaddr=::1,local_lock=none,addr=::1)
+ echo 'I: running python' -m pytest -s -v datalad/support/tests/test_fileinfo.py::test_report_absent_keys
I: running python -m pytest -s -v datalad/support/tests/test_fileinfo.py::test_report_absent_keys
+ TMPDIR=/home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ DATALAD_TESTS_TEMP_DIR=/home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ python -m pytest -s -v datalad/support/tests/test_fileinfo.py::test_report_absent_keys
====================================================== test session starts ======================================================
platform linux -- Python 3.11.2, pytest-7.2.1, pluggy-1.0.0 -- /home/yoh/proj/datalad/datalad-maint/venvs/dev3/bin/python
cachedir: .pytest_cache
rootdir: /home/yoh/proj/datalad/datalad-maint, configfile: tox.ini
plugins: xdist-3.1.0, fail-slow-0.3.0, cov-4.0.0
collected 1 item
datalad/support/tests/test_fileinfo.py::test_report_absent_keys create(ok): . (dataset)
add(ok): dummy (file)
save(ok): . (dataset)
action summary:
add (ok: 1)
save (ok: 1)
drop(ok): dummy (file)
get(ok): mehasurlkey (file) [from web...]
PASSED
Versions: annexremote=1.6.0 boto=2.49.0 cmd:7z=16.02 cmd:annex=10.20230407+git272-gf6dd34ca81-1~ndall+1 cmd:bundled-git=2.30.2 cmd:git=2.30.2 cmd:ssh=9.2p1 cmd:system-git=2.39.2 cmd:system-ssh=9.2p1 datalad=0.18.4+45.g6b98865a8 humanize=4.5.0 iso8601=1.1.0 keyring=23.13.1 keyrings.alt=UNKNOWN msgpack=1.0.4 platformdirs=2.6.2 requests=2.28.2 scrapy=2.9.0
Obscure filename: str=b' |;&%b5{}\'"<>\xce\x94\xd0\x99\xd7\xa7\xd9\x85\xe0\xb9\x97\xe3\x81\x82 .datc ' repr=' |;&%b5{}\'"<>ΔЙקم๗あ .datc '
Encodings: default='utf-8' filesystem='utf-8' locale.prefered='UTF-8'
Environment: LANG='en_US.UTF-8' GIT_PAGER='less --no-init --quit-if-one-screen' PATH='/home/yoh/git-annexes/10.20230407+git272-gf6dd34ca81/usr/bin:/home/yoh/proj/datalad/datalad-maint/venvs/dev3/bin:/home/yoh/gocode/bin:/home/yoh/gocode/bin:/home/yoh/bin:/home/yoh/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/usr/local/sbin' GIT_CONFIG_PARAMETERS="'init.defaultBranch=dl-test-branch' 'clone.defaultRemoteName=dl-test-remote'" PYTHON_KEYRING_BACKEND='keyrings.alt.file.PlaintextKeyring' GIT_ASKPASS='true'
======================================================= warnings summary ========================================================
datalad/support/tests/test_fileinfo.py::test_report_absent_keys
/home/yoh/proj/datalad/datalad-maint/venvs/dev3/lib/python3.11/site-packages/requests_ftp/ftp.py:9: DeprecationWarning: 'cgi' is deprecated and slated for removal in Python 3.13
import cgi
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================= 1 passed, 1 warning in 2.57s ==================================================
+ ret=0
+ echo 'I: done, unmounting'
I: done, unmounting
+ sudo umount /home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ sudo exportfs -u localhost:/home/yoh/.tmp/datalad-nfs-GVJ44.orig
+ rm -rf /home/yoh/.tmp/datalad-nfs-GVJ44.nfs /home/yoh/.tmp/datalad-nfs-GVJ44.orig
+ exit 0
well -- we already have and in the verbose run we do still have the |
uff, now we got some new typos detected; also, the tests had a really unlucky run where some tests took over 5 minutes, and again it was the same test_parallel.
|
hm, those travis NFS tests time out even with non-snapshot versions of datalad!
and no similar time outs in the recent
but they aren't 300sec! So it smells like something in this PR makes them that slow?! edit: but I could not reproduce it locally so far :-/ |
.codespellrc
@@ -4,5 +4,5 @@ skip = .venv,venvs,.git,build,*.egg-info,*.lock,.asv,.mypy_cache,.tox,fixtures,_
 # froms - plural "from" introduced by export_archive_ora
 # ned - Ned is a name
 # includeds - func arg name
-ignore-words-list = ba,commitish,froms,ro,ned,includeds
+ignore-words-list = afile,ba,commitish,froms,ro,ned,includeds
FWIW I cherry-picked this and another typo fix to maint.
I have tried various runs for that nfs matrix run and without other runs -- nothing times out, so I have no other explanation than it was a fluke. I will remove all TEMP commits, rewrite last commit so we have a full run and let's hope that finally it comes out green. |
…ial remote Somewhere in 708f4756d4...2efceba789 git-annex changed its output when reacting to an already known special remote, which we unfortunately still parse
…sting file Prior to now we would have produced a fake json record. I have decided to pass the new one through, which would have its own error-messages. But our fake one also had a "note", so I decided to add it for consistency (unless git-annex starts adding it). We use that 'note' internally, and it seems only for comparison to 'not found'. Now that there is 'message-id', in principle, we could uniformly fake and use that one instead.
I think it would be congruent with prior behavior since 'out' would not be None if there was non_existing. I think that the comment also grew organically and likely no longer pertains to the situation.
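The idea from the commit messages above, preferring a structured 'message-id' over comparing the 'note' text, could look roughly like the sketch below. Both the helper name and the `NOT_FOUND_MESSAGE_ID` value are placeholders invented for illustration; the actual identifier emitted by git-annex should be checked against its documentation before relying on it.

```python
# Placeholder value: an assumption for this sketch, not a confirmed git-annex id.
NOT_FOUND_MESSAGE_ID = "FileNotFound"


def is_not_found(record):
    """Decide whether a JSON record reports a missing file."""
    message_id = record.get("message-id")
    if message_id is not None:
        # Prefer the structured identifier when the record carries one.
        return message_id == NOT_FOUND_MESSAGE_ID
    # Fall back to the 'note' comparison used for records without an id.
    return record.get("note") == "not found"
```

The fallback branch keeps records from older git-annex versions, or faked ones carrying only a 'note', working unchanged.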
From appveyor
______________________________ test_unannex_etc _______________________________
[gw0] win32 -- Python 3.9.13 C:\Python39-x64\python.exe
arg = (), kw = {}, tkwargs_ = {'dir': None, 'prefix': 'datalad_temp_tree_'}
d = 'C:\\DLTMP\\datalad_temp_tree_vanxhpml'
@wraps(t)
def _wrap_with_tree(*arg, **kw):
if 'dir' not in tkwargs.keys():
# if not specified otherwise, respect datalad.tests.temp.dir config
# as this is a test helper
tkwargs['dir'] = dl_cfg.get("datalad.tests.temp.dir")
tkwargs_ = get_tempfile_kwargs(tkwargs, prefix="tree", wrapped=t)
d = tempfile.mkdtemp(**tkwargs_)
> create_tree(d, tree, archives_leading_dir=archives_leading_dir)
..\datalad\tests\utils_pytest.py:628:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\datalad\utils.py:2581: in create_tree
with open_func() as f:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
def open_func() -> IO[bytes]:
> return open(full_name, "wb")
E OSError: [Errno 22] Invalid argument: 'C:\\DLTMP\\datalad_temp_tree_vanxhpml\\";&%b5{}\'.datc"'
..\datalad\utils.py:2580: OSError
============================== warnings summary ===============================
datalad/local/tests/test_add_archive_content.py::TestAddArchiveOptions::test_add_delete
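The `OSError: [Errno 22]` above is what Windows typically reports for a filename containing a reserved character; here the test filename includes `"`. A small sketch for spotting such names before creating test files follows. The helper name is hypothetical, and the character set is the commonly documented Windows list rather than anything taken from datalad.

```python
# Characters Windows does not allow in file names.
WINDOWS_RESERVED_CHARS = set('<>:"/\\|?*')


def windows_invalid_chars(filename):
    """Return the sorted characters in filename that Windows would reject."""
    return sorted(
        {c for c in filename if c in WINDOWS_RESERVED_CHARS or ord(c) < 32}
    )
```

A test could use such a check to skip, or pick a tamer filename, when running on Windows.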
and in our case it is
❯ ./find-hanged-tests /tmp/log.txt
Working on /tmp/log.txt
1233 started, 1232 completed
Never completed:
../datalad/support/tests/test_annexrepo.py::test_AnnexRepo_web_remote
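For readers without the `find-hanged-tests` script at hand, the underlying idea (compare the tests that started with the tests that completed) can be sketched as below. The `START <id>` / `END <id>` log format is purely hypothetical and chosen for illustration; the real tool parses whatever the CI log actually contains.

```python
import re

# Hypothetical log format: "START <test id>" when a test begins,
# "END <test id>" when it finishes. Other lines are ignored.
_EVENT = re.compile(r"^(START|END)\s+(\S+)\s*$")


def never_completed(log_lines):
    """Return test ids that started but never finished, in start order."""
    started = []
    completed = set()
    for line in log_lines:
        match = _EVENT.match(line)
        if not match:
            continue
        kind, test_id = match.groups()
        if kind == "START":
            started.append(test_id)
        else:
            completed.add(test_id)
    return [test_id for test_id in started if test_id not in completed]
```

Running it over a stalled log would leave exactly the test that hung, analogous to the "Never completed" report above.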
Travis is finally green after I restarted an errored (ssh setup) run. Let's proceed! |
PR released in |
we get 3