Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_local_url_with_fetch: fails on crippled fs #4644

Closed
yarikoptic opened this issue Jun 19, 2020 · 9 comments · Fixed by #5276
Closed

test_local_url_with_fetch: fails on crippled fs #4644

yarikoptic opened this issue Jun 19, 2020 · 9 comments · Fixed by #5276
Labels
crippled-filesystems Issue concerned with operation on crippled filesystems, regardless of OS

Comments

@yarikoptic
Copy link
Member

As present in #4620 https://github.com/datalad/datalad/pull/4620/checks?check_run_id=788686809 and possibly in other PRs etc (didn't check).

======================================================================
ERROR: datalad.core.distributed.tests.test_clone.test_local_url_with_fetch
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/tests/utils.py", line 541, in newfunc
    return t(*(arg + (d,)), **kw)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/tests/utils.py", line 714, in newfunc
    return t(*(arg + (filename,)), **kw)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/core/distributed/tests/test_clone.py", line 582, in test_local_url_with_fetch
    ds_cloned.repo.fetch()
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/support/gitrepo.py", line 2271, in fetch
    git_options=git_options,
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/support/gitrepo.py", line 2288, in fetch_
    git_options=git_options)
  File "/opt/hostedtoolcache/Python/3.5.9/x64/lib/python3.5/site-packages/datalad/support/gitrepo.py", line 2429, in _fetch_push_helper
    "from nor a tracking branch is set up.".format(action))
ValueError: Neither a remote is specified to fetch from nor a tracking branch is set up.
@mih mih added the crippled-filesystems Issue concerned with operation on crippled filesystems, regardless of OS label Jul 15, 2020
@yarikoptic
Copy link
Member Author

reproduces locally with python 3.6 conda environment:

$> tools/eval_under_testloopfs 'for s in {1..30}; do python -m nose -s -v datalad/core/distributed/tests/test_clone.py:test_local_url_with_fetch; done'
I: vfat of 10:  /home/yoh/.tmp/datalad-fs-VokNp
10+0 records in
10+0 records out
10321920 bytes (10 MB, 9.8 MiB) copied, 0.00425128 s, 2.4 GB/s
mkfs.fat 4.1 (2017-01-24)
I: running for s in {1..30}; do python -m nose -s -v datalad/core/distributed/tests/test_clone.py:test_local_url_with_fetch; done
datalad.core.distributed.tests.test_clone.test_local_url_with_fetch ... ok
Versions: annexremote=1.3.1 appdirs=1.4.3 boto=2.49.0 cmd:7z=16.02 cmd:annex=8.20200309-g07fcace cmd:bundled-git=UNKNOWN cmd:git=2.26.2 cmd:system-git=2.26.2 cmd:system-ssh=8.0p1 exifread=2.1.2 git=3.1.2 gitdb=4.0.5 humanize=2.4.0 iso8601=0.1.12 keyring=21.2.1 keyrings.alt=3.1 msgpack=1.0.0 mutagen=1.41.1 requests=2.23.0 wrapt=1.12.1
Obscure filename: str=b";&%b5{}'\xce\x94\xd0\x99\xd7\xa7\xd9\x85\xe0\xb9\x97\xe3\x81\x82.datc" repr=";&%b5{}'ΔЙקم๗あ.datc"
Encodings: default='utf-8' filesystem='utf-8' locale.prefered='UTF-8'
Environment: LANG='en_US.UTF-8' GIT_PAGER='less --no-init --quit-if-one-screen' PATH='/home/yoh/anaconda-5.2.0-2.7/envs/datalad-py3.6/bin:/home/yoh/anaconda-5.2.0-2.7/condabin:/home/yoh/gocode/bin:/home/yoh/gocode/bin:/home/yoh/bin:/home/yoh/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/usr/local/sbin' GIT_CONFIG_PARAMETERS="'init.defaultBranch=master'"

----------------------------------------------------------------------
Ran 1 test in 2.030s

OK
datalad.core.distributed.tests.test_clone.test_local_url_with_fetch ... ERROR
Versions: annexremote=1.3.1 appdirs=1.4.3 boto=2.49.0 cmd:7z=16.02 cmd:annex=8.20200309-g07fcace cmd:bundled-git=UNKNOWN cmd:git=2.26.2 cmd:system-git=2.26.2 cmd:system-ssh=8.0p1 exifread=2.1.2 git=3.1.2 gitdb=4.0.5 humanize=2.4.0 iso8601=0.1.12 keyring=21.2.1 keyrings.alt=3.1 msgpack=1.0.0 mutagen=1.41.1 requests=2.23.0 wrapt=1.12.1
Obscure filename: str=b";&%b5{}'\xce\x94\xd0\x99\xd7\xa7\xd9\x85\xe0\xb9\x97\xe3\x81\x82.datc" repr=";&%b5{}'ΔЙקم๗あ.datc"
Encodings: default='utf-8' filesystem='utf-8' locale.prefered='UTF-8'
Environment: LANG='en_US.UTF-8' GIT_PAGER='less --no-init --quit-if-one-screen' PATH='/home/yoh/anaconda-5.2.0-2.7/envs/datalad-py3.6/bin:/home/yoh/anaconda-5.2.0-2.7/condabin:/home/yoh/gocode/bin:/home/yoh/gocode/bin:/home/yoh/bin:/home/yoh/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/usr/local/sbin' GIT_CONFIG_PARAMETERS="'init.defaultBranch=master'"

======================================================================
ERROR: datalad.core.distributed.tests.test_clone.test_local_url_with_fetch
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yoh/anaconda-5.2.0-2.7/envs/datalad-py3.6/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 578, in _wrap_with_tree
    return t(*(arg + (d,)), **kw)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 757, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/home/yoh/proj/datalad/datalad-master/datalad/core/distributed/tests/test_clone.py", line 629, in test_local_url_with_fetch
    ds_cloned.repo.fetch()
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2394, in fetch
    git_options=git_options,
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2411, in fetch_
    git_options=git_options)
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2552, in _fetch_push_helper
    "from nor a tracking branch is set up.".format(action))
ValueError: Neither a remote is specified to fetch from nor a tracking branch is set up.

----------------------------------------------------------------------
Ran 1 test in 1.835s
DataLad 0.13.7.dev614 WTF (configuration, credentials, datalad, dataset, dependencies, environment, extensions, git-annex, location, metadata_extractors, metadata_indexers, python, system)

WTF

configuration <SENSITIVE, report disabled by configuration>

credentials

  • keyring:
    • active_backends:
      • SecretService Keyring
      • PlaintextKeyring with no encyption v.1.0 at /home/yoh/.local/share/python_keyring/keyring_pass.cfg
    • config_file: /home/yoh/.local/share/python_keyring/keyringrc.cfg
    • data_root: /home/yoh/.local/share/python_keyring

datalad

  • full_version: 0.13.7.dev614-gd996e
  • version: 0.13.7.dev614

dataset

  • id: None
  • metadata: <SENSITIVE, report disabled by configuration>
  • path: /home/yoh/proj/datalad/datalad-master
  • repo: GitRepo

dependencies

  • annexremote: 1.3.1
  • appdirs: 1.4.3
  • boto: 2.49.0
  • cmd:7z: 16.02
  • cmd:annex: 8.20200309-g07fcace
  • cmd:bundled-git: UNKNOWN
  • cmd:git: 2.26.2
  • cmd:system-git: 2.26.2
  • cmd:system-ssh: 8.0p1
  • exifread: 2.1.2
  • git: 3.1.2
  • gitdb: 4.0.5
  • humanize: 2.4.0
  • iso8601: 0.1.12
  • keyring: 21.2.1
  • keyrings.alt: 3.1
  • msgpack: 1.0.0
  • mutagen: 1.41.1
  • requests: 2.23.0
  • wrapt: 1.12.1

environment

  • GIT_PAGER: less --no-init --quit-if-one-screen
  • LANG: en_US.UTF-8
  • PATH: /home/yoh/anaconda-5.2.0-2.7/envs/datalad-py3.6/bin:/home/yoh/anaconda-5.2.0-2.7/condabin:/home/yoh/gocode/bin:/home/yoh/gocode/bin:/home/yoh/bin:/home/yoh/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/usr/local/sbin

extensions

git-annex

  • build flags:
    • Assistant
    • Webapp
    • Pairing
    • S3
    • WebDAV
    • Inotify
    • DBus
    • DesktopNotify
    • TorrentParser
    • MagicMime
    • Feeds
    • Testsuite
  • dependency versions:
    • aws-0.21.1
    • bloomfilter-2.0.1.0
    • cryptonite-0.25
    • DAV-1.3.4
    • feed-1.2.0.1
    • ghc-8.6.5
    • http-client-0.6.4
    • persistent-sqlite-2.9.3
    • torrent-10000.1.1
    • uuid-1.3.13
    • yesod-1.6.0.1
  • key/value backends:
    • SHA256E
    • SHA256
    • SHA512E
    • SHA512
    • SHA224E
    • SHA224
    • SHA384E
    • SHA384
    • SHA3_256E
    • SHA3_256
    • SHA3_512E
    • SHA3_512
    • SHA3_224E
    • SHA3_224
    • SHA3_384E
    • SHA3_384
    • SKEIN256E
    • SKEIN256
    • SKEIN512E
    • SKEIN512
    • BLAKE2B256E
    • BLAKE2B256
    • BLAKE2B512E
    • BLAKE2B512
    • BLAKE2B160E
    • BLAKE2B160
    • BLAKE2B224E
    • BLAKE2B224
    • BLAKE2B384E
    • BLAKE2B384
    • BLAKE2BP512E
    • BLAKE2BP512
    • BLAKE2S256E
    • BLAKE2S256
    • BLAKE2S160E
    • BLAKE2S160
    • BLAKE2S224E
    • BLAKE2S224
    • BLAKE2SP256E
    • BLAKE2SP256
    • BLAKE2SP224E
    • BLAKE2SP224
    • SHA1E
    • SHA1
    • MD5E
    • MD5
    • WORM
    • URL
  • operating system: linux x86_64
  • remote types:
    • git
    • gcrypt
    • p2p
    • S3
    • bup
    • directory
    • rsync
    • web
    • bittorrent
    • webdav
    • adb
    • tahoe
    • glacier
    • ddar
    • git-lfs
    • hook
    • external
  • supported repository versions:
    • 8
  • upgrade supported from repository versions:
    • 0
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
  • version: 8.20200309-g07fcace

location

  • path: /home/yoh/proj/datalad/datalad-master
  • type: dataset

metadata_extractors

  • annex (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: None
    • module: datalad.metadata.extractors.annex
    • version: None
  • audio (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: None
    • module: datalad.metadata.extractors.audio
    • version: None
  • datacite (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: None
    • module: datalad.metadata.extractors.datacite
    • version: None
  • datalad_core (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: None
    • module: datalad.metadata.extractors.datalad_core
    • version: None
  • datalad_rfc822 (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: None
    • module: datalad.metadata.extractors.datalad_rfc822
    • version: None
  • exif (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: None
    • module: datalad.metadata.extractors.exif
    • version: None
  • frictionless_datapackage (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: None
    • module: datalad.metadata.extractors.frictionless_datapackage
    • version: None
  • image (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: None
    • module: datalad.metadata.extractors.image
    • version: None
  • xmp (datalad 0.13.7):
    • distribution: datalad 0.13.7
    • load_error: No module named 'libxmp' [xmp.py::20]
    • module: datalad.metadata.extractors.xmp

metadata_indexers

python

  • implementation: CPython
  • version: 3.6.10

system

@yarikoptic
Copy link
Member Author

reproduced also in conda env with python 3.7.9
datalad.core.distributed.tests.test_clone.test_local_url_with_fetch ... ERROR
Versions: annexremote=1.4.3 appdirs=1.4.4 boto=2.49.0 cmd:7z=16.02 cmd:annex=8.20201127+git54-ga1b227171-1~ndall+1 cmd:bundled-git=2.24.0 cmd:git=2.24.0 cmd:system-git=2.29.2 cmd:system-ssh=8.4p1 exifread=2.3.1 humanize=3.0.1 iso8601=0.1.13 keyring=21.4.0 keyrings.alt=4.0.0 msgpack=1.0.0 mutagen=1.45.1 requests=2.24.0 wrapt=1.12.1
Obscure filename: str=b";&%b5{}'\xce\x94\xd0\x99\xd7\xa7\xd9\x85\xe0\xb9\x97\xe3\x81\x82.datc" repr=";&%b5{}'ΔЙקم๗あ.datc"
Encodings: default='utf-8' filesystem='utf-8' locale.prefered='UTF-8'
Environment: LANG='en_US.UTF-8' GIT_PAGER='less --no-init --quit-if-one-screen' PATH='/home/yoh/anaconda-5.2.0-2.7/envs/datalad-py3.7.9/bin:/home/yoh/anaconda-5.2.0-2.7/condabin:/home/yoh/gocode/bin:/home/yoh/gocode/bin:/home/yoh/bin:/home/yoh/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/usr/local/sbin' GIT_CONFIG_PARAMETERS="'init.defaultBranch=master'"

======================================================================
ERROR: datalad.core.distributed.tests.test_clone.test_local_url_with_fetch
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yoh/anaconda-5.2.0-2.7/envs/datalad-py3.7.9/lib/python3.7/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 578, in _wrap_with_tree
    return t(*(arg + (d,)), **kw)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 757, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/home/yoh/proj/datalad/datalad-master/datalad/core/distributed/tests/test_clone.py", line 629, in test_local_url_with_fetch
    ds_cloned.repo.fetch()
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2394, in fetch
    git_options=git_options,
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2411, in fetch_
    git_options=git_options)
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2552, in _fetch_push_helper
    "from nor a tracking branch is set up.".format(action))
ValueError: Neither a remote is specified to fetch from nor a tracking branch is set up.

----------------------------------------------------------------------
Ran 1 test in 4.534s

FAILED (errors=1)
and finally (just needed to wait a bit longer) in local environment
datalad.core.distributed.tests.test_clone.test_local_url_with_fetch ... ERROR
Versions: annexremote=1.4.3 appdirs=1.4.4 boto=2.49.0 cmd:7z=16.02 cmd:annex=8.20201127+git54-ga1b227171-1~ndall+1 cmd:bundled-git=2.24.0 cmd:git=2.24.0 cmd:system-git=2.29.2 cmd:system-ssh=8.4p1 exifread=2.3.2 git=3.1.11 gitdb=4.0.5 humanize=0.0.0 iso8601=0.1.13 keyring=21.5.0 keyrings.alt=4.0.1 msgpack=1.0.0 mutagen=1.45.1 requests=2.24.0 scrapy=2.4.1 wrapt=1.12.1
Obscure filename: str=b";&%b5{}'\xce\x94\xd0\x99\xd7\xa7\xd9\x85\xe0\xb9\x97\xe3\x81\x82.datc" repr=";&%b5{}'ΔЙקم๗あ.datc"
Encodings: default='utf-8' filesystem='utf-8' locale.prefered='UTF-8'
Environment: LANG='en_US.UTF-8' GIT_PAGER='less --no-init --quit-if-one-screen' PATH='/home/yoh/proj/datalad/datalad-master/venvs/dev3/bin:/home/yoh/anaconda-5.2.0-2.7/condabin:/home/yoh/gocode/bin:/home/yoh/gocode/bin:/home/yoh/bin:/home/yoh/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/usr/local/sbin' GIT_CONFIG_PARAMETERS="'init.defaultBranch=master'"

======================================================================
ERROR: datalad.core.distributed.tests.test_clone.test_local_url_with_fetch
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 578, in _wrap_with_tree
    return t(*(arg + (d,)), **kw)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 757, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/home/yoh/proj/datalad/datalad-master/datalad/core/distributed/tests/test_clone.py", line 629, in test_local_url_with_fetch
    ds_cloned.repo.fetch()
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2389, in fetch
    return list(
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2400, in fetch_
    yield from self._fetch_push_helper(
  File "/home/yoh/proj/datalad/datalad-master/datalad/support/gitrepo.py", line 2550, in _fetch_push_helper
    raise ValueError(
ValueError: Neither a remote is specified to fetch from nor a tracking branch is set up.

----------------------------------------------------------------------
Ran 1 test in 5.650s

FAILED (errors=1)

so it is just a flaky issue, nothing about "github workflows esoteric nature"

@yarikoptic
Copy link
Member Author

and boils down again to "config caching":

(Pdb) p self.config.get('branch.{0}.remote'.format('master'), None)
None
(Pdb) p self.config.reload()
None
(Pdb) p self.config.get('branch.{0}.remote'.format('master'), None)
None
(Pdb) p self.config.reload(force=True)
[DEBUG  ] Async run ['git', 'config', '-z', '-l', '--show-origin'] 
[DEBUG  ] Launching process ['git', 'config', '-z', '-l', '--show-origin'] 
[DEBUG  ] Process 1512044 started 
[DEBUG  ] Waiting for process 1512044 to complete 
[DEBUG  ] Process 1512044 exited with return code 0 
[DEBUG  ] Async run ['git', 'config', '-z', '-l', '--show-origin', '--file', '/home/yoh/.tmp/datalad-fs-PK330/datalad_temp_test_local_url_with_fetchzzd23qxx/a/.datalad/config'] 
[DEBUG  ] Launching process ['git', 'config', '-z', '-l', '--show-origin', '--file', '/home/yoh/.tmp/datalad-fs-PK330/datalad_temp_test_local_url_with_fetchzzd23qxx/a/.datalad/config'] 
[DEBUG  ] Process 1512060 started 
[DEBUG  ] Waiting for process 1512060 to complete 
[DEBUG  ] Process 1512060 exited with return code 0 
None
(Pdb) p self.config.get('branch.{0}.remote'.format('master'), None)
'origin'

@yarikoptic
Copy link
Member Author

so I guess it is just a manifestation of still open #4363 with a performance-harmful candidate fix #4364 not merged/closed (seems to resolve this issue for me locally).

I guess someone should really look into config caching logic -- some time stamping assumptions might need tune ups to avoid the forceful reload proposed by #4364. But the point is -- the issue is still there.

@yarikoptic
Copy link
Member Author

FWIW: I will furnish a quick PR introducing an alternative approach I had mentioned in #4364 (comment) : reload config whenever a flyweight instance is reused

@yarikoptic
Copy link
Member Author

nope: no PR since that (that change is in my branch: yarikoptic@84f2981) is not 100% sufficient at Repo level (did not bother on dataset), even if I force=True the reload. It does seems to help but failure still can happen.
So it is pretty much boils down to the #4364 (comment) that we have no clear promise/guards on how/when we cache/react to external config changes. In this particular test/case I think we can just reload the config upon clone (seems to resolve it! I am just giving it more cycles to make sure).
But in the longer run -- we have a good candidate to extend the notion of "read only" vs not git commands, which we added to master version!
I will look into submitting two PRs for those.

yarikoptic added a commit to yarikoptic/datalad that referenced this issue Jan 5, 2021
It helps to mitigate but does not 100% resolve the
datalad#4644 -- it still might fail once in a rare while
@yarikoptic
Copy link
Member Author

I jinxed it!!! right before I was about to interrupt (so it lasted many more cycles than without a fix) -- a test run failed :-/ it does help quite a bit so I sent out #5275

@yarikoptic
Copy link
Member Author

yarikoptic commented Jan 5, 2021

f.cking A: I see the rabbit but don't know how it came to live like this

with the following changes to reload upon non-readonly command and also printing useful information while reloading
diff --git a/datalad/config.py b/datalad/config.py
index d56ee2dff..f138d8dc2 100644
--- a/datalad/config.py
+++ b/datalad/config.py
@@ -347,8 +347,14 @@ class ConfigManager(object):
                # if mtime age is less than worst resolution assume modified
                (current_time - curmtimes[c]) > 2.0
                for c in curmtimes):
-            return False
-        return True
+            reload = False
+        else:
+            reload = True
+        print(f"IN NEED RELOAD with runner cwd {self._runner.cwd} WITH {store['mtimes']} for {store['files']} at"
+              f" {current_time} "
+              f"decided to reload"
+              f"={reload}" )
+        return reload
 
     def _reload(self, run_args):
         # query git-config
diff --git a/datalad/support/gitrepo.py b/datalad/support/gitrepo.py
index 94cd1ebfc..c6b7f78bc 100644
--- a/datalad/support/gitrepo.py
+++ b/datalad/support/gitrepo.py
@@ -2187,6 +2187,9 @@ class GitRepo(RepoInterface, metaclass=PathBasedFlyweight):
         finally:
             if not read_only:
                 self._write_lock.release()
+                # ensure that we react to possible changes to config
+                print("RELOADING!")
+                self._cfg.reload()
 
         out = res['stdout']
         err = res['stderr']

I have got a failed test with

IN NEED RELOAD with runner cwd /home/yoh/.tmp/datalad-fs-5Zxjg/datalad_temp_tree_test_local_url_with_fetch0oelkr0n/a WITH {PosixPath('/home/yoh/.tmp/datalad-fs-5Zxjg/datalad_temp_pot_mvtd/.gitconfig'): 1609868580.0, PosixPath('/etc/gitconfig'): 1608746390.7433054, PosixPath('/home/yoh/.tmp/datalad-fs-5Zxjg/datalad_temp_tree_test_local_url_with_fetch0oelkr0n/a/.git/config'): 1609868582.0} for {PosixPath('/home/yoh/.tmp/datalad-fs-5Zxjg/datalad_temp_pot_mvtd/.gitconfig'), PosixPath('/etc/gitconfig'), PosixPath('/home/yoh/.tmp/datalad-fs-5Zxjg/datalad_temp_tree_test_local_url_with_fetch0oelkr0n/a/.git/config')} at 1609868612.4173946 decided to reload=False
while running in a loop and keeping temp directories etc
$> DATALAD_TESTS_TEMP_FSSIZE=100 tools/eval_under_testloopfs 'for s in {1..60}; do DATALAD_TESTS_TEMP_KEEP=1 DATALAD_LOG_LEVEL=debug python -m nose -s -v --pdb datalad/core/distributed/tests/test_clone.py:test_local_url_with_fetch; sudo rm -rf $TMPDIR/datalad-fs-*/*;  done'

note: ~~strange thing is that rm -rf seems to be not effective for me for some reason -- they keep piling up... not sure if relates, but might ~~ nah -- we just overload TMPDIR so I need to remove everything there ;)

and what makes little sense is that difference between time() and st_mtime for the a/.git/config is

In [1]: 1609868612.4173946 - 1609868582.0
Out[1]: 30.417394638061523

30 seconds!!! the whole test run even with heavy output seems to take about 7 seconds.... so I am ready to blame everything including fuse and vfat and python itself ... ;-)

so I give up for now and welcome your attempts as well!

@yarikoptic
Copy link
Member Author

grr.. just to solidify. Added to runner printing where it is actually running ... so below you can see that we do invoke a number of `git config --local add` and even `git annex init` etc calls under `/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a` BUT st_mtime does not change and we decide to not reload since by then over 2 sec passed...
IN NEED RELOAD with runner cwd /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a WITH stored {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.datalad/config'): 1609870704.0} for {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.datalad/config')} at 1609870705.6237636 with {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.datalad/config'): 1609870704.0}decided to reload=True
IN NEED RELOAD with runner cwd /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a WITH stored {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tnmsq1jk/.gitconfig'): 1609870702.0, PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.git/config'): 1609870704.0, PosixPath('/etc/gitconfig'): 1608746390.7433054} for {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tnmsq1jk/.gitconfig'), PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.git/config'), PosixPath('/etc/gitconfig')} at 1609870705.623854 with {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tnmsq1jk/.gitconfig'): 1609870702.0, PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.git/config'): 1609870704.0, PosixPath('/etc/gitconfig'): 1608746390.7433054}decided to reload=True
[DEBUG  ] Async run ['git', 'config', '-z', '-l', '--show-origin'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'config', '-z', '-l', '--show-origin'] 
[DEBUG  ] Process 2608963 started 
[DEBUG  ] Waiting for process 2608963 to complete 
[DEBUG  ] Process 2608963 exited with return code 0 
[DEBUG  ] Async run ['git', 'config', '-z', '-l', '--show-origin', '--file', '/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.datalad/config'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'config', '-z', '-l', '--show-origin', '--file', '/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.datalad/config'] 
[DEBUG  ] Process 2608979 started 
[DEBUG  ] Waiting for process 2608979 to complete 
[DEBUG  ] Process 2608979 exited with return code 0 
[DEBUG  ] Async run ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads', 'refs/remotes'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads', 'refs/remotes'] 
[DEBUG  ] Process 2608995 started 
[DEBUG  ] Waiting for process 2608995 to complete 
[DEBUG  ] Process 2608995 exited with return code 0 
[DEBUG  ] Async run ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads', 'refs/remotes'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads', 'refs/remotes'] 
[DEBUG  ] Process 2609011 started 
[DEBUG  ] Waiting for process 2609011 to complete 
[DEBUG  ] Process 2609011 exited with return code 0 
[DEBUG  ] Async run ['git', 'rev-parse', '--verify', 'HEAD^{commit}'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'rev-parse', '--verify', 'HEAD^{commit}'] 
[DEBUG  ] Process 2609027 started 
[DEBUG  ] Waiting for process 2609027 to complete 
[DEBUG  ] Process 2609027 exited with return code 0 
[DEBUG  ] Async run ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads', 'refs/remotes'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads', 'refs/remotes'] 
[DEBUG  ] Process 2609043 started 
[DEBUG  ] Waiting for process 2609043 to complete 
[DEBUG  ] Process 2609043 exited with return code 0 
[DEBUG  ] Initializing annex repo at /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Async run ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads', 'refs/remotes'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads', 'refs/remotes'] 
[DEBUG  ] Process 2609059 started 
[DEBUG  ] Waiting for process 2609059 to complete 
[DEBUG  ] Process 2609059 exited with return code 0 
[DEBUG  ] Async run ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'for-each-ref', '--format=%(refname:strip=2)', 'refs/heads'] 
[DEBUG  ] Process 2609075 started 
[DEBUG  ] Waiting for process 2609075 to complete 
[DEBUG  ] Process 2609075 exited with return code 0 
[DEBUG  ] Async run ['git', 'config', '--local', '--add', 'branch.master.remote', 'origin'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'config', '--local', '--add', 'branch.master.remote', 'origin'] 
[DEBUG  ] Process 2609091 started 
[DEBUG  ] Waiting for process 2609091 to complete 
[DEBUG  ] Process 2609091 exited with return code 0 
[DEBUG  ] Async run ['git', 'config', '--local', '--add', 'branch.master.merge', 'refs/heads/master'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'config', '--local', '--add', 'branch.master.merge', 'refs/heads/master'] 
[DEBUG  ] Process 2609107 started 
[DEBUG  ] Waiting for process 2609107 to complete 
[DEBUG  ] Process 2609107 exited with return code 0 
[DEBUG  ] Async run ['git', 'annex', 'init', '-c', 'annex.dotfiles=true', '-c', 'annex.retry=3'] under /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a 
[DEBUG  ] Launching process ['git', 'annex', 'init', '-c', 'annex.dotfiles=true', '-c', 'annex.retry=3'] 
[DEBUG  ] Process 2609123 started 
[DEBUG  ] Waiting for process 2609123 to complete 
[INFO   ] Detected a filesystem without fifo support.
|   Disabling ssh connection caching. 
[INFO   ] Detected a crippled filesystem. 
[INFO   ] Scanning for unlocked files (this may take some time) 
[DEBUG  ] Process 2609123 exited with return code 0 
IN NEED RELOAD with runner cwd /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a WITH stored {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.datalad/config'): 1609870704.0} for {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.datalad/config')} at 1609870706.002961 with {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.datalad/config'): 1609870704.0}decided to reload=False
IN NEED RELOAD with runner cwd /home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a WITH stored {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tnmsq1jk/.gitconfig'): 1609870702.0, PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.git/config'): 1609870704.0, PosixPath('/etc/gitconfig'): 1608746390.7433054} for {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tnmsq1jk/.gitconfig'), PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.git/config'), PosixPath('/etc/gitconfig')} at 1609870706.003189 with {PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tnmsq1jk/.gitconfig'): 1609870702.0, PosixPath('/home/yoh/.tmp/datalad-fs-vPjCx/datalad_temp_tree_test_local_url_with_fetchuy4d2y9s/subdir/a/.git/config'): 1609870704.0, PosixPath('/etc/gitconfig'): 1608746390.7433054}decided to reload=False

there is an additional little issue with that time checking code that if a config file is removed it considers it to be "ok" and not changed (worth a PR on its own) but it is unrelated.

Given that I saw 30 seconds difference above, tune up of "go above 2 seconds buffer" would not help us really either.

I think the only logical getaway would be to consider more "complete" fingerprinting of the files, so reacting also to atime changes (who knows -- might be no longer readable!!) and size (definetely a change) changes. From quick benchmarking -- likely be the same in terms of "performance" since any additional hit (I see larger variance) would be infinitely tiny in respect to other operations

In [9]: mtime = p.stat().st_mtime

In [10]: stat = p.stat()

In [11]: %timeit p.stat().st_mtime == mtime
1.57 µs ± 5.92 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [12]: %timeit p.stat() == stat
1.45 µs ± 11.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

yarikoptic added a commit to yarikoptic/datalad that referenced this issue Jan 5, 2021
Weird and not yet explained behavior on crippled fs does happen where we do
get mtime "stuck" in the past (saw even 30 seconds) on stating .git/config,
see datalad#4644 (comment) .

For current tests failures at hand, probably should have been sufficient to
just add collection of file sizes, and reacting on those changes as well.

But in some cases I see value to react even on atime change (e.g. file was
made non-readable), so why not to compare that as well?

To not bother subselecting which stat to worry about, I have decided to collect/compare
all stats.  I do not think performance impact would be noticeable

	In [9]: mtime = p.stat().st_mtime

	In [10]: stat = p.stat()

	In [11]: %timeit p.stat().st_mtime == mtime
	1.57 µs ± 5.92 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

	In [12]: %timeit p.stat() == stat
	1.45 µs ± 11.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

but we might react on more possibly valid cases when to reload a config.

Also, part of the change is not to go/compare only "existing" paths, but all known.
If path disappeared, and did exist before (we had non-None stat) -- we must reload as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crippled-filesystems Issue concerned with operation on crippled filesystems, regardless of OS
Projects
None yet
2 participants