Address recent changes in behavior of git-annex #7372

Merged
merged 14 commits into from
Jun 26, 2023

Conversation

yarikoptic
Member

@yarikoptic yarikoptic commented Apr 20, 2023

we get 3

FAILED ../datalad/distributed/tests/test_create_sibling_ria.py::test_create_simple - datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 1 failed:
[{'action': 'create-sibling-ria',
  'message': 'initremote failed.\n'
             'stdout: \n'
             '\n'
             'stderr: There is already a special remote named '
             '"datastore-storage". (Use enableremote to enable an existing '
             'special remote.)\n',
  'path': '/tmp/datalad_temp_tree__test_create_store3j2ta_0c',
  'status': 'error',
  'type': 'dataset'}]
FAILED ../datalad/support/tests/test_annexrepo.py::test_AnnexRepo_always_commit - assert ' |;&%b5{}\'"<>ΔЙקم๗あ .datc _1' in '+ Fri, 21 Apr 2023 16:19:55 UTC " |;&%b5{}\'\\"<>\\316\\224\\320\\231\\327\\247\\331\\205\\340\\271\\227\\343\\201\\202 .datc _1" | d04e5687-6201-4d71-abb1-ee675ef98483 -- travis@travis-job-33a67c8f-19ef-4aba-b5af-8877224dcfb8:/tmp/datalad_temp_test_AnnexRepo_always_commitpq5llhn6'
FAILED ../datalad/support/tests/test_annexrepo.py::test_error_reporting - assert [{'command': ... found', ...}] == [{'command': ... found', ...}]
  At index 0 diff: {'command': 'add', 'file': '"gl\\\\orious BS"', 'note': 'not found', 'success': False, 'error-messages': ['File unknown to git']} != {'command': 'add', 'file': 'gl\\orious BS', 'note': 'not found', 'error-messages': ['File unknown to git'], 'success': False}
  Full diff:
    [
     {'command': 'add',
      'error-messages': ['File unknown to git'],
  -   'file': 'gl\\orious BS',
  +   'file': '"gl\\\\orious BS"',
  ?            +  ++           +
      'note': 'not found',
      'success': False},
    ]
= 3 failed, 1227 passed, 58 skipped, 4 xfailed, 20 warnings in 2028.42s (0:33:48) =
  • remove the TEMP patch to use most recent build of git-annex

@yarikoptic yarikoptic added the semver-patch Increment the patch version when merged label Apr 20, 2023
@codecov

codecov bot commented Apr 20, 2023

Codecov Report

Patch coverage: 75.00% and project coverage change: -0.03 ⚠️

Comparison is base (a68a4da) 91.53% compared to head (684c842) 91.51%.

Additional details and impacted files
@@            Coverage Diff             @@
##            maint    #7372      +/-   ##
==========================================
- Coverage   91.53%   91.51%   -0.03%     
==========================================
  Files         325      325              
  Lines       43345    43392      +47     
  Branches     5806     5819      +13     
==========================================
+ Hits        39678    39711      +33     
- Misses       3652     3666      +14     
  Partials       15       15              
Impacted Files Coverage Δ
datalad/distributed/create_sibling_ria.py 85.31% <ø> (ø)
datalad/distributed/export_archive_ora.py 78.88% <ø> (ø)
datalad/support/annexrepo.py 90.03% <67.74%> (-0.49%) ⬇️
datalad/support/tests/test_annexrepo.py 97.67% <80.64%> (-0.37%) ⬇️
datalad/dataset/gitrepo.py 96.86% <100.00%> (ø)
datalad/support/tests/test_fileinfo.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes


@yarikoptic yarikoptic force-pushed the bf-7370 branch 2 times, most recently from 6cbe73e to 339bae0 Compare April 21, 2023 15:53
@joeyh

joeyh commented May 1, 2023

I've reverted the change to git-annex --json output, so you should not need 45ddd4b.

@joeyh

joeyh commented May 1, 2023

I have fixed git-annex's output on errors to contain "git-annex:" again, so you do not need commit 7f1b37b

Of course, parsing the text of error messages is fragile, and it should be possible to add a message-id for that one, and whatever other ones datalad might currently parse.
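
To illustrate the difference between matching on message text and matching on an identifier, here is a minimal sketch in Python. It assumes a `--json` record shaped like the ones in the failing test above, plus a `message-id` field as proposed here. The function name `is_not_found` and the identifier value `"FileNotFound"` are hypothetical placeholders, not confirmed git-annex values.

```python
from typing import Any, Mapping

# Hypothetical identifier value; whatever git-annex actually emits
# would have to be looked up in its documentation.
NOT_FOUND_ID = "FileNotFound"


def is_not_found(record: Mapping[str, Any]) -> bool:
    """Decide whether a JSON record reports a file unknown to git."""
    msg_id = record.get("message-id")
    if msg_id is not None:
        # Preferred: a stable identifier that survives rewording of messages.
        return msg_id == NOT_FOUND_ID
    # Fallback: the fragile comparison against human-readable text.
    return record.get("note") == "not found"
```

With such a check the rest of the code would not care whether the wording of `note` or `error-messages` changes between git-annex versions.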

@yarikoptic
Member Author

Thanks @joeyh! And uff -- I guess I should have just reported it, but I wanted to get some concrete detail and so spent hours chasing a new rabbit. With the current version, having HOME point to an empty directory makes git annex log not encode UTF-8 properly. See below for the current and a prior snapshot version:

❯ mkdir /tmp/emptydir
❯ ( source /home/yoh/git-annexes/10.20230407+git131-gb90c2156a6.env ; HOME=/tmp/emptydir git annex log; )
+ Wed,  3 May 2023 13:48:41 EDT " |;&%b5{}'\"<>\316\224\320\231\327\247\331\205\340\271\227\343\201\202 .datc _1" | 3ef4cd3b-38ae-4666-82da-d5f47e6b8b67 -- yoh@bilena:/home/yoh/.tmp/datalad_temp_test_AnnexRepo_always_commitxmf6ldph
❯ ( source /home/yoh/git-annexes/10.20230407+git14-ga0e6fa18eb.env ; HOME=/tmp/emptydir git annex log; )
+ Wed,  3 May 2023 13:48:41 EDT  |;&%b5{}'"<>ΔЙקم๗あ .datc _1 | 3ef4cd3b-38ae-4666-82da-d5f47e6b8b67 -- yoh@bilena:/home/yoh/.tmp/datalad_temp_test_AnnexRepo_always_commitxmf6ldph

@yarikoptic
Member Author

yarikoptic commented May 3, 2023

oh - that is due to core.quotepath=false that I had in my ~/.gitconfig!

❯ ( source /home/yoh/git-annexes/10.20230407+git131-gb90c2156a6.env ; HOME=/tmp/emptydir git annex log; )
+ Wed,  3 May 2023 13:48:41 EDT " |;&%b5{}'\"<>\316\224\320\231\327\247\331\205\340\271\227\343\201\202 .datc _1" | 3ef4cd3b-38ae-4666-82da-d5f47e6b8b67 -- yoh@bilena:/home/yoh/.tmp/datalad_temp_test_AnnexRepo_always_commitxmf6ldph
❯ ( source /home/yoh/git-annexes/10.20230407+git131-gb90c2156a6.env ; HOME=/tmp/emptydir git -c core.quotepath=false annex log; )
+ Wed,  3 May 2023 13:48:41 EDT " |;&%b5{}'\"<>ΔЙקم๗あ .datc _1" | 3ef4cd3b-38ae-4666-82da-d5f47e6b8b67 -- yoh@bilena:/home/yoh/.tmp/datalad_temp_test_AnnexRepo_always_commitxmf6ldph

RTFM:

       core.quotePath
           Commands that output paths (e.g.  ls-files, diff), will quote "unusual" characters in
           the pathname by enclosing the pathname in double-quotes and escaping those characters
           with backslashes in the same way C escapes control characters (e.g.  \t for TAB, \n
           for LF, \\ for backslash) or bytes with values larger than 0x80 (e.g. octal \302\265
           for "micro" in UTF-8). If this variable is set to false, bytes higher than 0x80 are
           not considered "unusual" any more. Double-quotes, backslash and control characters are
           always escaped regardless of the setting of this variable. A simple space character is
           not considered "unusual". Many commands can output pathnames completely verbatim using
           the -z option. The default value is true.

and I guess it relates to 10.20230407-18-gdf6f9f1ee8 or somewhere within there:

❯ ( source /home/yoh/git-annexes/10.20230407+git14-ga0e6fa18eb.env ; HOME=/tmp/emptydir git -c core.quotepath=true annex log; )
+ Wed,  3 May 2023 13:48:41 EDT  |;&%b5{}'"<>ΔЙקم๗あ .datc _1 | 3ef4cd3b-38ae-4666-82da-d5f47e6b8b67 -- yoh@bilena:/home/yoh/.tmp/datalad_temp_test_AnnexRepo_always_commitxmf6ldph
❯ ( source /home/yoh/git-annexes/10.20230407+git25-g708f4756d4.env ; HOME=/tmp/emptydir git -c core.quotepath=true annex log; )
error: bogus format in GIT_CONFIG_PARAMETERS
fatal: unable to parse command-line config
git-annex: user error (git ["config","--null","--list"] exited 128)
❯ sed -i -e 's,usr/bin,usr/lib/git-annex.linux,g' /home/yoh/git-annexes/10.20230407+git25-g708f4756d4.env
❯ ( source /home/yoh/git-annexes/10.20230407+git25-g708f4756d4.env ; HOME=/tmp/emptydir git -c core.quotepath=true annex log; )
+ Wed,  3 May 2023 13:48:41 EDT " |;&%b5{}'\"<>\316\224\320\231\327\247\331\205\340\271\227\343\201\202 .datc _1" | 3ef4cd3b-38ae-4666-82da-d5f47e6b8b67 -- yoh@bilena:/home/yoh/.tmp/datalad_temp_test_AnnexRepo_always_commitxmf6ldph

How I got that config: likely while working on #7004, which I abandoned while leaving my config behind. uff -- at least some mysteries are getting resolved.

So, I guess, we are doomed to get to #7004 one way or another:

  • we might either need to provide explicit -c core.quotepath=VALUE option to git/git-annex invocations or interpret results differently based on the value of the config setting
  • if we care about interpreting -- we need a dedicated CI run for such cases ... uff
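
A minimal sketch of the first option, pinning the setting on every invocation so that the user's ~/.gitconfig cannot change the output format. The helper name `git_call` is hypothetical and only builds the argument list; it is not datalad's actual runner code.

```python
from typing import List, Sequence


def git_call(args: Sequence[str], quotepath: bool = False) -> List[str]:
    """Build a git command line with core.quotepath set explicitly."""
    value = "true" if quotepath else "false"
    # "-c" must come before the subcommand so it applies to git and,
    # through it, to git-annex.
    return ["git", "-c", f"core.quotepath={value}", *args]
```

For example, `git_call(["annex", "log"])` yields the same kind of command line as the manual `git -c core.quotepath=false annex log` runs above.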

edit 1: note that

❯ git -c core.quotepath=false ls-files
" |;&%b5{}'\"<>ΔЙקم๗あ .datc _1"
❯ git -c core.quotepath=true ls-files
" |;&%b5{}'\"<>\316\224\320\231\327\247\331\205\340\271\227\343\201\202 .datc _1"

i.e. the "" quoting is applied regardless of the config, and thus the \" escape is there regardless of that setting as well. Prior git-annex did not bother with ""-quoting, and thus there was no escaping of ".
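
For the second option (interpreting the output), here is a rough sketch of undoing the C-style quoting described in the core.quotePath documentation quoted above. The function name `unquote_git_path` is hypothetical. It assumes octal escapes always come as three digits, as in the outputs shown here, and it handles both settings because unescaped non-ASCII characters are simply passed through.

```python
# Single-character escapes and the byte each one stands for.
_SIMPLE_ESCAPES = {
    "a": 7, "b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13,
    '"': 34, "\\": 92,
}


def unquote_git_path(text: str) -> str:
    """Return the real path for a possibly double-quoted, escaped path."""
    if len(text) < 2 or not (text.startswith('"') and text.endswith('"')):
        return text  # not quoted, nothing to undo
    body = text[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out += ch.encode("utf-8", errors="surrogateescape")
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash in quoted path")
        nxt = body[i + 1]
        if nxt in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[nxt])
            i += 2
        elif nxt in "01234567":
            # Assumed: exactly three octal digits per escaped byte.
            out.append(int(body[i + 1:i + 4], 8))
            i += 4
        else:
            raise ValueError(f"unknown escape \\{nxt}")
    # Escaped bytes are reassembled first and decoded as UTF-8 afterwards.
    return out.decode("utf-8", errors="surrogateescape")
```

Collecting bytes before decoding matters because a single character such as Δ arrives as two separate octal escapes.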

@joeyh

joeyh commented May 3, 2023

Right, core.quotePath only affects utf-8 (or similar high characters) and git-annex supports it now. As to a separate CI run, I'd hope you can avoid it since you still do need to handle quoting of low characters eg \n

@yarikoptic
Member Author

yarikoptic commented May 3, 2023

re \n - we just opened that Pandora's box in #7175 ... and didn't look into it afterwards.

As for quotePath support -- should it be supported by all commands, or just some, and in which cases? E.g. find seems not to care:

❯ ( source /home/yoh/git-annexes/10.20230407+git131-gb90c2156a6.env ; git -c core.quotepath=false annex find --in here ; )
 |;&%b5{}'"<>ΔЙקم๗あ .datc _2
❯ ( source /home/yoh/git-annexes/10.20230407+git131-gb90c2156a6.env ; git -c core.quotepath=true annex find --in here ; )
 |;&%b5{}'"<>ΔЙקم๗あ .datc _2

yarikoptic added a commit to yarikoptic/datalad that referenced this pull request May 3, 2023
…or unknown file"

This reverts commit acdebc0.

Per @joeyh datalad#7372 (comment)
(commit hexsha differs due to rebases since then)
yarikoptic added a commit to yarikoptic/datalad that referenced this pull request May 3, 2023
…own special remote"

This reverts commit 90cf159.

Per @joeyh datalad#7372 (comment)
(commit hexsha differs due to rebases since then)
yarikoptic added a commit to yarikoptic/datalad that referenced this pull request May 3, 2023
…r recent annex, fix+test unannex

Not quoting filenames is not default git behavior and might have some ramifications!

datalad#7372 has more discussion on what inspired these changes.

"unannex" was completely broken in case of filenames with spaces, so this
commit fixes it as well while going through interfaces where we still do not
use --json.  Filed
https://git-annex.branchable.com/todo/--json_for_unannex__and_ideally_any_other_command_/
which was said to be already addressed.

Also added explicit testing in that added test for get_annexed_files which
uses find and which does not quote ATM.
@yarikoptic
Member Author

yarikoptic commented May 4, 2023

appveyor (not a new git-annex version, thus really regression testing): Windows -- the new test shows that Windows cannot do the "OBSCURE" filename, so it will be skipped. OSX: some odd death of git-annex:

[00:30:46] FAILED ../datalad/local/tests/test_rerun.py::test_rerun - datalad.runner.exception.CommandError: CommandError: 'git -c diff.ignoreSubmodules=none -c core.quotepath=false annex find --anything --json --json-error-messages -c annex.dotfiles=true -- /Users/appveyor/DLTMP/datalad_temp_test_rerunrdh7klqr/sub/sequence' failed with exitcode 139 under /Users/appveyor/DLTMP/datalad_temp_test_rerunrdh7klqr/sub [info keys: stdout_json] [err: 'error: git-annex died of signal 11']

travis: 3 or more runs stalled!!! Then we got the originally reported failure (which I thought I had addressed; maybe I reverted something I should not have yet, heh) and a new one:

FAILED ../datalad/distributed/tests/test_create_sibling_ria.py::test_create_simple - datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 1 failed:
[{'action': 'create-sibling-ria',
  'message': 'initremote failed.\n'
             'stdout: \n'
             '\n'
             'stderr: There is already a special remote named '
             '"datastore-storage". (Use enableremote to enable an existing '
             'special remote.)\n',
  'path': '/var/tmp/sym link/datalad_temp_tree__test_create_store_ji45unl',
  'status': 'error',
  'type': 'dataset'}]
FAILED ../datalad/core/distributed/tests/test_clone.py::test_gin_cloning - TypeError: expected str, bytes or os.PathLike object, not NoneType

@yarikoptic
Member Author

grr -- I can't reproduce this test_gin_cloning error locally. I have yet to figure out on which test the other failing runs are stalling :-/
_______________________________ test_gin_cloning _______________________________
[gw0] linux -- Python 3.7.16 /tmp/dl-miniconda-qkjoj5_z/bin/python
path = '/tmp/datalad_temp_test_gin_cloninghxn114de'
    @xfail_buggy_annex_info
    @skip_if_no_network
    @with_tempfile
    def test_gin_cloning(path=None):
        # can we clone a public ds anoynmously from gin and retrieve content
        ds = clone('https://gin.g-node.org/datalad/datalad-ci-target', path)
        ok_(ds.is_installed())
        annex_path = op.join('annex', 'two')
        git_path = op.join('git', 'one')
        eq_(ds.repo.file_has_content(annex_path), False)
>       eq_(ds.repo.is_under_annex(git_path), False)
../datalad/core/distributed/tests/test_clone.py:1571: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../datalad/support/gitrepo.py:365: in _wrap_normalize_paths
    result = func(self, files_new, *args, **kwargs)
../datalad/support/annexrepo.py:2060: in is_under_annex
    return self._check_files(check, files, batch)
../datalad/support/annexrepo.py:2005: in _check_files
    annex_res = fn(files, normalize_paths=False, batch=batch)
../datalad/support/annexrepo.py:2058: in check
    fast=True, **kwargs)
../datalad/support/gitrepo.py:365: in _wrap_normalize_paths
    result = func(self, files_new, *args, **kwargs)
../datalad/support/annexrepo.py:2596: in info
    assert normpath(j.pop('file')) == normpath(f)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
path = None
    def normpath(path):
        """Normalize path, eliminating double slashes, etc."""
>       path = os.fspath(path)
E       TypeError: expected str, bytes or os.PathLike object, not NoneType
/tmp/dl-miniconda-qkjoj5_z/lib/python3.7/posixpath.py:340: TypeError
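
The traceback shows `j.pop('file')` returning None, which then blows up inside `normpath`. A minimal sketch of a more diagnosable check follows. The helper name `checked_file` is hypothetical and this is not the fix applied in this PR; it only makes the offending record visible instead of raising a bare TypeError.

```python
import os.path
from typing import Any, Dict


def checked_file(record: Dict[str, Any], expected: str) -> str:
    """Pop 'file' from a record and verify it refers to the expected path."""
    reported = record.pop("file", None)
    if reported is None:
        # Surface the whole record so the odd output can be reported upstream.
        raise ValueError(
            f"record for {expected!r} carries no file name: {record!r}")
    if os.path.normpath(reported) != os.path.normpath(expected):
        raise ValueError(f"expected {expected!r}, got {reported!r}")
    return reported
```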

@joeyh

joeyh commented May 5, 2023

git-annex find is a special case, it does not quote and only hides escape sequences when connected to a terminal. In a pipe you will get raw filenames always from it. (Of course can use --format='${escaped_file}')

@yarikoptic
Member Author

git-annex find is a special case, it does not quote and only hides escape sequences when connected to a terminal. In a pipe you will get raw filenames always from it. (Of course can use --format='${escaped_file}')

I guess it would be better for us to make it explicit to use --format='${file}\n' explicitly to ensure that we get an exact filename right?

@joeyh

joeyh commented May 11, 2023

I guess it would be better for us to make it explicit to use --format='${file}\n' explicitly to ensure that we get an exact filename right?

That's the default, specifying it would not change anything.
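
One caveat with newline-terminated raw names is that a filename may itself contain a newline. A small sketch of parsing NUL-terminated output instead is below. It assumes git-annex is asked to terminate names with NUL (I believe `find` has a `--print0`-style option, but treat the exact spelling as an assumption to verify in its man page). The function name `split_nul_terminated` is hypothetical.

```python
import os
from typing import List


def split_nul_terminated(raw: bytes) -> List[str]:
    """Split NUL-terminated command output into exact file names."""
    # NUL cannot occur inside a file name, so the split is unambiguous;
    # os.fsdecode preserves undecodable bytes via surrogateescape.
    return [os.fsdecode(chunk) for chunk in raw.split(b"\0") if chunk]
```

Names with spaces or embedded newlines then come through unchanged.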

@yarikoptic
Member Author

and only hides escape sequences when connected to a terminal.

I was referring to "and only hides escape sequences when connected to a terminal", since I am after the true original filename without any characters hidden (even though they might be "handled" by an ANSI-capable terminal).

@yarikoptic
Member Author

oh, travels kept my attention away from this PR/effort.
So going for a newer git-annex snapshot did not lead to the problems we encountered before - that is great! I restarted the only failing run, since that failure is surely just spurious and, I am 99.9% certain, has nothing to do with git-annex ;)

I am thinking of decommissioning travis CI (or shrinking it and moving to the free tier), since it largely overlaps with appveyor coverage (it has a little more though, IIRC from some testing PR) and no more datalad grant funds are available -- but maybe meanwhile I will keep this TEMP change that makes travis test against a recent build, and just later change it to the released version whenever it comes out.

@datalad/developers - I will take this PR out of Draft mode. It would be important to merge/release it before git-annex releases. So please have a look if you haven't done so yet.

@yarikoptic yarikoptic marked this pull request as ready for review June 20, 2023 20:00
@yarikoptic
Member Author

eh, under NFS we still have one test hanging, which I identified with:

❯  tools/find-hanged-tests /tmp/log.txt
Working on /tmp/log.txt
461 started, 460 completed
Never completed:
../datalad/support/tests/test_fileinfo.py::test_report_absent_keys
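
The started/completed bookkeeping reported above can be sketched roughly as follows. This is not the actual tools/find-hanged-tests script. It assumes a `pytest -s -v` style log in which a test ID starts a line and an outcome word (PASSED, FAILED, ...) appears somewhere before the next test ID, as in the verbose run further down. The function name `never_completed` is hypothetical.

```python
import re
from typing import Iterable, List, Optional, Set

# Assumed log shape: "<path>.py::<test> ..." opens a test,
# an outcome word on that or a later line closes it.
TEST_ID = re.compile(r"^(\S+\.py::\S+)")
OUTCOME = re.compile(r"\b(PASSED|FAILED|SKIPPED|XFAIL|XPASS|ERROR)")


def never_completed(lines: Iterable[str]) -> List[str]:
    """Return IDs of tests that started but never reported an outcome."""
    started: List[str] = []
    completed: Set[str] = set()
    current: Optional[str] = None
    for line in lines:
        match = TEST_ID.match(line)
        if match:
            current = match.group(1)
            started.append(current)
        if current is not None and OUTCOME.search(line):
            completed.add(current)
    return [test for test in started if test not in completed]
```

A log message that happens to contain an upper-case outcome word would count as completion, so a real tool would need a stricter pattern.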
An attempt to reproduce the stall, running solely that test:
❯ ( source ~/git-annexes/10.20230407+git272-gf6dd34ca81.env; git annex version | head -n 1 ;  tools/eval_under_nfs python -m pytest -s -v datalad/support/tests/test_fileinfo.py::test_report_absent_keys; )
git-annex version: 10.20230407+git272-gf6dd34ca81-1~ndall+1
I: mounting /home/yoh/.tmp/datalad-nfs-GVJ44.orig under /home/yoh/.tmp/datalad-nfs-GVJ44.nfs via nfs
+ mkdir -p /home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ mkdir -p /home/yoh/.tmp/datalad-nfs-GVJ44.orig
+ dpkg -l nfs-kernel-server
+ grep '^ii.*nfs-kernel-server'
ii  nfs-kernel-server 1:2.6.2-4    amd64        support for NFS kernel server
+ sudo exportfs -o rw localhost:/home/yoh/.tmp/datalad-nfs-GVJ44.orig
+ sudo mount -t nfs localhost:/home/yoh/.tmp/datalad-nfs-GVJ44.orig /home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ sudo mount
+ grep /home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ sed -e 's,^,I: ,g'
I: localhost:/home/yoh/.tmp/datalad-nfs-GVJ44.orig on /home/yoh/.tmp/datalad-nfs-GVJ44.nfs type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,timeo=600,retrans=2,sec=sys,clientaddr=::1,local_lock=none,addr=::1)
+ echo 'I: running python' -m pytest -s -v datalad/support/tests/test_fileinfo.py::test_report_absent_keys
I: running python -m pytest -s -v datalad/support/tests/test_fileinfo.py::test_report_absent_keys
+ TMPDIR=/home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ DATALAD_TESTS_TEMP_DIR=/home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ python -m pytest -s -v datalad/support/tests/test_fileinfo.py::test_report_absent_keys
====================================================== test session starts ======================================================
platform linux -- Python 3.11.2, pytest-7.2.1, pluggy-1.0.0 -- /home/yoh/proj/datalad/datalad-maint/venvs/dev3/bin/python
cachedir: .pytest_cache
rootdir: /home/yoh/proj/datalad/datalad-maint, configfile: tox.ini
plugins: xdist-3.1.0, fail-slow-0.3.0, cov-4.0.0
collected 1 item                                                                                                                

datalad/support/tests/test_fileinfo.py::test_report_absent_keys create(ok): . (dataset)
add(ok): dummy (file)
save(ok): . (dataset)
action summary:
  add (ok: 1)
  save (ok: 1)
drop(ok): dummy (file)
get(ok): mehasurlkey (file) [from web...]
PASSEDVersions: annexremote=1.6.0 boto=2.49.0 cmd:7z=16.02 cmd:annex=10.20230407+git272-gf6dd34ca81-1~ndall+1 cmd:bundled-git=2.30.2 cmd:git=2.30.2 cmd:ssh=9.2p1 cmd:system-git=2.39.2 cmd:system-ssh=9.2p1 datalad=0.18.4+45.g6b98865a8 humanize=4.5.0 iso8601=1.1.0 keyring=23.13.1 keyrings.alt=UNKNOWN msgpack=1.0.4 platformdirs=2.6.2 requests=2.28.2 scrapy=2.9.0
Obscure filename: str=b' |;&%b5{}\'"<>\xce\x94\xd0\x99\xd7\xa7\xd9\x85\xe0\xb9\x97\xe3\x81\x82 .datc ' repr=' |;&%b5{}\'"<>ΔЙקم๗あ .datc '
Encodings: default='utf-8' filesystem='utf-8' locale.prefered='UTF-8'
Environment: LANG='en_US.UTF-8' GIT_PAGER='less --no-init --quit-if-one-screen' PATH='/home/yoh/git-annexes/10.20230407+git272-gf6dd34ca81/usr/bin:/home/yoh/proj/datalad/datalad-maint/venvs/dev3/bin:/home/yoh/gocode/bin:/home/yoh/gocode/bin:/home/yoh/bin:/home/yoh/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/usr/local/sbin' GIT_CONFIG_PARAMETERS="'init.defaultBranch=dl-test-branch' 'clone.defaultRemoteName=dl-test-remote'" PYTHON_KEYRING_BACKEND='keyrings.alt.file.PlaintextKeyring' GIT_ASKPASS='true'


======================================================= warnings summary ========================================================
datalad/support/tests/test_fileinfo.py::test_report_absent_keys
  /home/yoh/proj/datalad/datalad-maint/venvs/dev3/lib/python3.11/site-packages/requests_ftp/ftp.py:9: DeprecationWarning: 'cgi' is deprecated and slated for removal in Python 3.13
    import cgi

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================= 1 passed, 1 warning in 2.57s ==================================================
+ ret=0
+ echo 'I: done, unmounting'
I: done, unmounting
+ sudo umount /home/yoh/.tmp/datalad-nfs-GVJ44.nfs
+ sudo exportfs -u localhost:/home/yoh/.tmp/datalad-nfs-GVJ44.orig
+ rm -rf /home/yoh/.tmp/datalad-nfs-GVJ44.nfs /home/yoh/.tmp/datalad-nfs-GVJ44.orig
+ exit 0

well -- we already have test_commit_annex_commit_changed marked to be skipped on travis/nfs since it stalls... I will now mark test_report_absent_keys similarly so we can proceed.

and in the verbose run we still have the ../datalad/local/tests/test_add_archive_content.py::test_add_archive_content failure... but it seems to be "benign": it just takes too long now (probably because this build is standalone, whereas usually we use the one from conda to actually make things run faster). So I am just changing the default timeout from 30 to 60 in light of using the standalone build now.

@yarikoptic
Member Author

uff, now we got some new typos detected; also the tests had a really unlucky run where some tests took over 5 minutes, and the same test_parallel.
I think the course of action would be:

  • I will remove the TEMP commits that use the standalone build; after the new annex is released I might just get the new release installed in conda
  • I will include the codespell fixes
  • we merge and release a new maint to open the way for git-annex.

@yarikoptic
Member Author

yarikoptic commented Jun 22, 2023

hm, those travis NFS tests time out even with non-snapshot versions of datalad!

____________________ test_AnnexRepo_file_has_content[False] ____________________
[gw1] linux -- Python 3.7.16 /tmp/dl-miniconda-3382xkm9/bin/python
Test passed but took too long to run: Duration 609.4745141559999s > 60.0s
____________________ test_AnnexRepo_file_has_content[True] _____________________
[gw1] linux -- Python 3.7.16 /tmp/dl-miniconda-3382xkm9/bin/python
Test passed but took too long to run: Duration 306.5869063230002s > 60.0s
_____________________________ test_dropkey[False] ______________________________
[gw1] linux -- Python 3.7.16 /tmp/dl-miniconda-3382xkm9/bin/python
Test passed but took too long to run: Duration 304.78485737200026s > 60.0s
____________________ test_get_local_file_url_compatibility _____________________
[gw0] linux -- Python 3.7.16 /tmp/dl-miniconda-3382xkm9/bin/python
Test passed but took too long to run: Duration 913.253736396s > 60.0s

and there are no similar timeouts in the recent maint build for the same run https://app.travis-ci.com/github/datalad/datalad/jobs/604575436 where some of those tests do show up among the slowest:

============================== slowest durations ===============================
7.83s call     datalad/support/tests/test_repo_save.py::test_save_typechange
6.81s call     datalad/support/tests/test_network.py::test_get_local_file_url_compatibility
6.32s call     datalad/support/tests/test_annexrepo.py::test_is_available[False]
6.14s call     datalad/support/tests/test_annexrepo.py::test_annex_copy_to
6.02s call     datalad/support/tests/test_annexrepo.py::test_AnnexRepo_migrating_backends
6.01s call     datalad/support/tests/test_annexrepo.py::test_is_available[True]
5.98s call     datalad/support/tests/test_annexrepo.py::test_AnnexRepo_dirty
5.92s call     datalad/support/tests/test_annexrepo.py::test_AnnexRepo_web_remote
5.90s call     datalad/support/tests/test_fileinfo.py::test_subds_path
5.79s call     datalad/support/tests/test_annexrepo.py::test_annex_drop
5.75s call     datalad/support/tests/test_repo_save.py::test_save_subds_change
5.20s call     datalad/support/tests/test_fileinfo.py::test_info_path_inside_submodule
5.12s call     datalad/support/tests/test_annexrepo.py::test_whereis_batch_eqv
5.01s call     datalad/support/tests/test_annexrepo.py::test_AnnexRepo_file_has_content[False]

but they aren't 300 sec! So it smells like something in this PR makes them that slow?! edit: but I could not reproduce it locally so far :-/

.codespellrc Outdated
@@ -4,5 +4,5 @@ skip = .venv,venvs,.git,build,*.egg-info,*.lock,.asv,.mypy_cache,.tox,fixtures,_
 # froms - plural "from" introduced by export_archive_ora
 # ned - Ned is a name
 # includeds - func arg name
-ignore-words-list = ba,commitish,froms,ro,ned,includeds
+ignore-words-list = afile,ba,commitish,froms,ro,ned,includeds
Member Author

FWIW I cherry-picked this and another typo fix to maint.

@yarikoptic
Member Author

I have tried various runs for that nfs matrix run and without other runs -- nothing times out, so I have no other explanation than it was a fluke. I will remove all TEMP commits, rewrite last commit so we have a full run and let's hope that finally it comes out green.

…ial remote

Somewhere in 708f4756d4...2efceba789 git-annex changed its output when reacting to
an already known special remote, which we unfortunately still parse
…sting file

Previously we would have produced a fake json record. I have decided to pass the new one
through, which would have its own error-messages. But our fake one also had a "note", so I
decided to add it for consistency (unless git-annex starts adding it).
We use that 'note' internally, and seemingly only for comparison to 'not found'.
Now that there is 'message-id', in principle, we could uniformly fake and use that one instead.
I think it would be congruent with prior behavior since 'out' would not be
None if there was non_existing.  I think that the comment also grew organically
and likely no longer pertains to the situation.
…or unknown file"

This reverts commit acdebc0.

Per @joeyh datalad#7372 (comment)
(commit hexsha differs due to rebases since then)
…r recent annex, fix+test unannex

Not quoting filenames is not default git behavior and might have some ramifications!

datalad#7372 has more discussion on what inspired these changes.

"unannex" was completely broken in case of filenames with spaces, so this
commit fixes it as well while going through interfaces where we still do not
use --json.  Filed
https://git-annex.branchable.com/todo/--json_for_unannex__and_ideally_any_other_command_/
which was said to be already addressed.

Also added explicit testing in that added test for get_annexed_files which
uses find and which does not quote ATM.
From appveyor

	______________________________ test_unannex_etc _______________________________
	[gw0] win32 -- Python 3.9.13 C:\Python39-x64\python.exe
	arg = (), kw = {}, tkwargs_ = {'dir': None, 'prefix': 'datalad_temp_tree_'}
	d = 'C:\\DLTMP\\datalad_temp_tree_vanxhpml'
		@wraps(t)
		def  _wrap_with_tree(*arg, **kw):
			if 'dir' not in tkwargs.keys():
				# if not specified otherwise, respect datalad.tests.temp.dir config
				# as this is a test helper
				tkwargs['dir'] = dl_cfg.get("datalad.tests.temp.dir")
			tkwargs_ = get_tempfile_kwargs(tkwargs, prefix="tree", wrapped=t)
			d = tempfile.mkdtemp(**tkwargs_)
	>       create_tree(d, tree, archives_leading_dir=archives_leading_dir)
	..\datalad\tests\utils_pytest.py:628:
	_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
	..\datalad\utils.py:2581: in create_tree
		with open_func() as f:
	_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
		def open_func() -> IO[bytes]:
	>       return open(full_name, "wb")
	E       OSError: [Errno 22] Invalid argument: 'C:\\DLTMP\\datalad_temp_tree_vanxhpml\\";&%b5{}\'.datc"'
	..\datalad\utils.py:2580: OSError
	============================== warnings summary ===============================
	datalad/local/tests/test_add_archive_content.py::TestAddArchiveOptions::test_add_delete
and in our case it is

	❯ ./find-hanged-tests /tmp/log.txt
	Working on /tmp/log.txt
	1233 started, 1232 completed
	Never completed:
	../datalad/support/tests/test_annexrepo.py::test_AnnexRepo_web_remote
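
The appveyor OSError above comes from a filename containing double quotes, which Windows does not allow in file names. A small sketch of a pre-check a test could use to decide whether to skip is below. The function name `is_valid_windows_filename` is hypothetical, and it deliberately ignores reserved device names such as CON or NUL.

```python
# Characters Windows forbids in file names (besides control characters).
WINDOWS_RESERVED = set('<>:"/\\|?*')


def is_valid_windows_filename(name: str) -> bool:
    """Rough check whether a single path component is usable on Windows."""
    if not name or name[-1] in " .":
        # Names may not be empty or end with a space or a period.
        return False
    return not any(ch in WINDOWS_RESERVED or ord(ch) < 32 for ch in name)
```

The name from the traceback, `";&%b5{}'.datc"`, fails this check because of the double quotes.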
@yarikoptic
Member Author

Travis is finally green after I restarted a run that had errored (ssh setup). Let's proceed!

@yarikoptic yarikoptic merged commit 42389c6 into datalad:maint Jun 26, 2023
@yarikoptic-gitmate
Collaborator

PR released in 0.19.1

@yarikoptic yarikoptic deleted the bf-7370 branch November 27, 2023 16:22
Labels
release Create a release when this pr is merged semver-patch Increment the patch version when merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tests against dev version of git-annex on Ubuntu keep getting stuck
3 participants