Merge remote-tracking branch 'origin/maint'
* origin/maint: (23 commits)
  BF: Make create's check for procedures work with several again
  Support older pytests
  [skip ci] Update RST changelog
  Update CHANGELOG.md [skip ci]
  DOC: minor fix - consistent DataLad (not Datalad) in docs and CHANGELOG
  DOC: fixup/harmonize Changelog for 0.17.0
  BF(TST, workaround): install any version of git-annex while testing against py 3.10
  BF: use --python-match minor option in new datalad-installer release to match outside version of python
  ENH: robustify against case where ssh version is None
  BF: do not compare for ssh implementation name
  RM(TST): remove testing of datalad.test which was removed from 0.17.0
  BF(TMP workaround): do allow for distutils to be used
  remove ssh type signaling
  Allow skip_if_no_module() and skip_if_no_network() to be called at module-level
  Remove a remaining import of nose-based datalad.tests.utils
  add ssh type to ssh version
  use existing external version detection
  improve version detection
  add '<' and '>' to obscure filename parts
  remove code duplication
  ...
yarikoptic committed Jul 15, 2022
2 parents 52e822a + 8c65776 commit 90b2f6e
Showing 16 changed files with 207 additions and 102 deletions.
11 changes: 7 additions & 4 deletions .travis.yml
@@ -35,7 +35,7 @@ env:
- DATALAD_DATASETS_TOPURL=https://datasets-tests.datalad.org
# How/which git-annex we install. conda's build would be the fastest, but it must not
# get ahead in PATH to not shadow travis' python
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --batch git-annex=8.20201007 -m conda"
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --python-match minor --batch git-annex=8.20201007 -m conda"

matrix:
include:
@@ -60,6 +60,9 @@ matrix:
env:
- PYTEST_SELECTION=
- PYTEST_SELECTION_OP=not
# current "default" version of git-annex 8.20201007 is not co-installable
# with python 3.10, so let's just allow for any git-annex version
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --python-match minor --batch git-annex -m conda"
- if: type = cron
python: 3.7
# Single run for Python 3.7
@@ -73,12 +76,12 @@ matrix:
env:
- PYTEST_SELECTION_OP=not
- DATALAD_SSH_MULTIPLEX__CONNECTIONS=0
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --batch git-annex=10.20220525 -m conda"
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --python-match minor --batch git-annex=10.20220525 -m conda"
- python: 3.7
env:
- PYTEST_SELECTION_OP=""
- DATALAD_SSH_MULTIPLEX__CONNECTIONS=0
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --batch git-annex=8.20210310 -m conda"
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --python-match minor --batch git-annex=8.20210310 -m conda"
# To test https://github.com/datalad/datalad/pull/4342 fix in case of no "not" for pytest.
# From our testing in that PR seems to have no effect, but kept around since should not hurt.
- LANG=bg_BG.UTF-8
@@ -143,7 +146,7 @@ matrix:
- if: type = cron
python: 3.7
env:
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --batch git-annex=8.20200309 -m conda"
- _DL_ANNEX_INSTALL_SCENARIO="miniconda --python-match minor --batch git-annex=8.20200309 -m conda"
# Run with git's master branch rather the default one on the system.
- if: type = cron
python: 3.7
32 changes: 27 additions & 5 deletions CHANGELOG.md
@@ -1,3 +1,24 @@
# 0.17.1 (Mon Jul 11 2022)

#### 🐛 Bug Fix

- DOC: minor fix - consistent DataLad (not Datalad) in docs and CHANGELOG [#6830](https://github.com/datalad/datalad/pull/6830) ([@yarikoptic](https://github.com/yarikoptic))
- DOC: fixup/harmonize Changelog for 0.17.0 a little [#6828](https://github.com/datalad/datalad/pull/6828) ([@yarikoptic](https://github.com/yarikoptic))
- BF: use --python-match minor option in new datalad-installer release to match outside version of Python [#6827](https://github.com/datalad/datalad/pull/6827) ([@christian-monch](https://github.com/christian-monch) [@yarikoptic](https://github.com/yarikoptic))
- Do not quote paths for ssh >= 9 [#6826](https://github.com/datalad/datalad/pull/6826) ([@christian-monch](https://github.com/christian-monch) [@yarikoptic](https://github.com/yarikoptic))
- Suppress DeprecationWarning to allow for distutils to be used [#6819](https://github.com/datalad/datalad/pull/6819) ([@yarikoptic](https://github.com/yarikoptic))
- RM(TST): remove testing of datalad.test which was removed from 0.17.0 [#6822](https://github.com/datalad/datalad/pull/6822) ([@yarikoptic](https://github.com/yarikoptic))
- Avoid import of nose-based tests.utils, make skip_if_no_module() and skip_if_no_network() allowed at module level [#6817](https://github.com/datalad/datalad/pull/6817) ([@jwodder](https://github.com/jwodder))
- BF(TST): use higher level asyncio.run instead of asyncio.get_event_loop in test_inside_async [#6808](https://github.com/datalad/datalad/pull/6808) ([@yarikoptic](https://github.com/yarikoptic))

#### Authors: 3

- Christian Mönch ([@christian-monch](https://github.com/christian-monch))
- John T. Wodder II ([@jwodder](https://github.com/jwodder))
- Yaroslav Halchenko ([@yarikoptic](https://github.com/yarikoptic))

---

# 0.17.0 (Thu Jul 7 2022) -- pytest migration

#### 💫 Enhancements and new features
@@ -6,14 +27,15 @@
- `datalad unlock` gained a progress bar. [#6704](https://github.com/datalad/datalad/pull/6704) (by @adswa)
- When `create-sibling-gitlab` is called on non-existing subdatasets or paths it now returns an impossible result instead of no feedback at all. [#6701](https://github.com/datalad/datalad/pull/6701) (by @adswa)
- `datalad wtf` includes a report on file system types of commonly used paths. [#6664](https://github.com/datalad/datalad/pull/6664) (by @adswa)
- use next generation metadata code in search, if it is available [#6518](https://github.com/datalad/datalad/pull/6518) (by @christian-monch)
- Use next generation metadata code in search, if it is available. [#6518](https://github.com/datalad/datalad/pull/6518) (by @christian-monch)

#### 🪓 Deprecations and removals
- Remove unused and untested log helpers `NoProgressLog` and `OnlyProgressLog`. [#6747](https://github.com/datalad/datalad/pull/6747) (by @mih)
- Remove unused `sorted_files()` helper. [#6722](https://github.com/datalad/datalad/pull/6722) (by @adswa)
- Discontinued the value `stdout` for use with the config variable `datalad.log.target` as its use would inevitably break special remote implementations. [#6675](https://github.com/datalad/datalad/pull/6675) (by @bpoldrack)
- `AnnexRepo.add_urls()` is deprecated in favor of `AnnexRepo.add_url_to_file()` or a direct call to `AnnexRepo.call_annex()`. [#6667](https://github.com/datalad/datalad/pull/6667) (by @mih)
- `datalad test` command and supporting functionality (e.g., `datalad.test`) were removed. [#](https://github.com/datalad/datalad/pull/6273) (by @jwodder)
- `datalad test` command and supporting functionality (e.g., `datalad.test`) were removed. [#6273](https://github.com/datalad/datalad/pull/6273) (by @jwodder)

#### 🐛 Bug Fixes
- `export-archive` does not rely on `normalize_path()` methods anymore and became more robust when called from subdirectories. [#6745](https://github.com/datalad/datalad/pull/6745) (by @adswa)
- Sanitize keys before checking content availability to ensure that the content availability of files with URL- or custom backend keys is correctly determined and marked. [#6663](https://github.com/datalad/datalad/pull/6663) (by @adswa)
@@ -27,7 +49,7 @@

#### 🏠 Internal
- Inline code of `create-sibling-ria` has been refactored to an internal helper to check for siblings with particular names across dataset hierarchies in `datalad-next`, and is reintroduced into core to modularize the code base further. [#6706](https://github.com/datalad/datalad/pull/6706) (by @adswa)
- `get_initialized_logger` now lets a given `logtarget` take precendence over `datalad.log.target`. [#6675](https://github.com/datalad/datalad/pull/6675) (by @bpoldrack)
- `get_initialized_logger` now lets a given `logtarget` take precedence over `datalad.log.target`. [#6675](https://github.com/datalad/datalad/pull/6675) (by @bpoldrack)
- Many uses of deprecated call options were replaced with the recommended ones. [#6273](https://github.com/datalad/datalad/pull/6273) (by @jwodder)
- Get rid of `asyncio` import by defining few noops methods from `asyncio.protocols.SubprocessProtocol` directly in `WitlessProtocol`. [#6648](https://github.com/datalad/datalad/pull/6648) (by @yarikoptic)
- Consolidate `GitRepo.remove()` and `AnnexRepo.remove()` into a single implementation. [#6783](https://github.com/datalad/datalad/pull/6783) (by @mih)
@@ -1382,7 +1404,7 @@
- The credential helper no longer asks the user to repeat tokens or
AWS keys. ([#5219][])

- The new option `datalad.locations.sockets` controls where Datalad
- The new option `datalad.locations.sockets` controls where DataLad
stores SSH sockets, allowing users to more easily work around file
system and path length restrictions. ([#5238][])

@@ -2841,7 +2863,7 @@ Primarily bugfixes with some optimizations and refactorings.
- [addurls][] now suggests close matches when the URL or file format
contains an unknown field. ([#3594][])

- Shared logic used in the setup.py files of Datalad and its
- Shared logic used in the setup.py files of DataLad and its
extensions has been moved to modules in the _datalad_build_support/
directory. ([#3600][])

34 changes: 22 additions & 12 deletions datalad/conftest.py
@@ -1,6 +1,8 @@
import logging
import os
import re
from contextlib import ExitStack
from unittest.mock import patch

import pytest

@@ -49,13 +51,22 @@ def setup_package():
# enforce honoring TMPDIR (see gh-5307)
tempfile.tempdir = os.environ.get('TMPDIR', tempfile.gettempdir())

with pytest.MonkeyPatch().context() as m:
m.setattr(consts, "DATASETS_TOPURL", 'https://datasets-tests.datalad.org/')
m.setenv('DATALAD_DATASETS_TOPURL', consts.DATASETS_TOPURL)

m.setenv("GIT_CONFIG_PARAMETERS",
"'init.defaultBranch={}' 'clone.defaultRemoteName={}'"
.format(DEFAULT_BRANCH, DEFAULT_REMOTE))
# Use unittest's patch instead of pytest.MonkeyPatch for compatibility with
# old pytests
with ExitStack() as m:
m.enter_context(patch.object(consts, "DATASETS_TOPURL", 'https://datasets-tests.datalad.org/'))
m.enter_context(patch.dict(os.environ, {'DATALAD_DATASETS_TOPURL': consts.DATASETS_TOPURL}))

m.enter_context(
patch.dict(
os.environ,
{
"GIT_CONFIG_PARAMETERS":
"'init.defaultBranch={}' 'clone.defaultRemoteName={}'"
.format(DEFAULT_BRANCH, DEFAULT_REMOTE)
}
)
)

def prep_tmphome():
# re core.askPass:
@@ -96,15 +107,14 @@ def prep_tmphome():
# To overcome pybuild overriding HOME but us possibly wanting our
# own HOME where we pre-setup git for testing (name, email)
if 'GIT_HOME' in os.environ:
m.setenv('HOME', os.environ['GIT_HOME'])
m.enter_context(patch.dict(os.environ, {'HOME': os.environ['GIT_HOME']}))
else:
# we setup our own new HOME, the BEST and HUGE one
new_home, _ = prep_tmphome()
for v, val in get_home_envvars(new_home).items():
m.setenv(v, val)
m.enter_context(patch.dict(os.environ, get_home_envvars(new_home)))
else:
_, cfg_file = prep_tmphome()
m.setenv('GIT_CONFIG_GLOBAL', str(cfg_file))
m.enter_context(patch.dict(os.environ, {'GIT_CONFIG_GLOBAL': str(cfg_file)}))

# Re-load ConfigManager, since otherwise it won't consider global config
# from new $HOME (see gh-4153
@@ -124,7 +134,7 @@ def prep_tmphome():

# Prevent interactive credential entry (note "true" is the command to run)
# See also the core.askPass setting above
m.setenv('GIT_ASKPASS', 'true')
m.enter_context(patch.dict(os.environ, {'GIT_ASKPASS': 'true'}))

# Set to non-interactive UI
_test_states['ui_backend'] = ui.backend
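The conftest.py hunks above swap `pytest.MonkeyPatch().context()` for `contextlib.ExitStack` combined with `unittest.mock.patch`, so that the setup also works with older pytest releases. A minimal sketch of that pattern follows. `Consts`, `DEMO_TOPURL`, and `DEMO_GIT_HOME` are hypothetical names chosen for illustration and are not part of DataLad.

```python
import os
from contextlib import ExitStack
from unittest.mock import patch


class Consts:
    """Hypothetical stand-in for a module holding constants."""
    TOPURL = "https://example.org/"


def run_with_patches():
    """Apply several patches, some conditionally, and undo all on exit."""
    seen = {}
    with ExitStack() as stack:
        # Each enter_context() call registers one patch; ExitStack reverts
        # them in reverse order when the with-block ends.
        stack.enter_context(
            patch.object(Consts, "TOPURL", "https://tests.example.org/"))
        stack.enter_context(
            patch.dict(os.environ, {"DEMO_TOPURL": Consts.TOPURL}))
        # Conditional patches need no extra nesting, unlike stacked
        # with-statements.
        if "DEMO_GIT_HOME" in os.environ:
            stack.enter_context(
                patch.dict(os.environ, {"HOME": os.environ["DEMO_GIT_HOME"]}))
        seen["const"] = Consts.TOPURL
        seen["env"] = os.environ["DEMO_TOPURL"]
    return seen


seen = run_with_patches()
```

Once the block exits, both the attribute and the environment variable are back to their previous state, which is the property the test setup relies on.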
2 changes: 1 addition & 1 deletion datalad/core/local/create.py
@@ -349,7 +349,7 @@ def __call__(
discovered_procs = tbds.run_procedure(
discover=True,
result_renderer='disabled',
return_type='generator',
return_type='list',
)
for cfg_proc_ in cfg_proc:
for discovered_proc in discovered_procs:
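The hunk above changes `return_type` from `'generator'` to `'list'` for a result that is then scanned inside a nested loop. A plausible reading, consistent with the commit subject "Make create's check for procedures work with several again", is that a generator is exhausted after the first pass of the outer loop. The sketch below reproduces that effect in plain Python. `discover`, `find_requested`, and the procedure names are placeholders for illustration, not DataLad's API.

```python
def discover():
    """Hypothetical stand-in that yields discovered procedure names."""
    yield from ["proc_a", "proc_b"]


def find_requested(requested, discovered):
    """Return the requested names that appear in `discovered`."""
    found = []
    for name in requested:
        # With a generator, this inner loop only sees items that no
        # earlier iteration has already consumed.
        for candidate in discovered:
            if candidate == name:
                found.append(name)
                break
    return found


requested = ["proc_b", "proc_a"]
with_generator = find_requested(requested, discover())
with_list = find_requested(requested, list(discover()))
```

With the generator only the first requested name is found, because the search for it consumes every item. With the list, both names are found.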
3 changes: 1 addition & 2 deletions datalad/runner/tests/test_nonasyncrunner.py
@@ -375,8 +375,7 @@ async def main():
(["cmd.exe", "/c"] if on_windows else []) + ["echo", "abc"],
StdOutCapture)

loop = asyncio.get_event_loop()
result = loop.run_until_complete(main())
result = asyncio.run(main())
eq_(result["stdout"], "abc" + os.linesep)
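The test above now calls `asyncio.run()` instead of fetching a loop with `asyncio.get_event_loop()` and calling `run_until_complete()`. As a minimal sketch of the newer form, with a trivial coroutine standing in for the real runner call (the coroutine body is an assumption for illustration):

```python
import asyncio


async def main():
    """Trivial coroutine standing in for the command execution."""
    await asyncio.sleep(0)
    return {"stdout": "abc"}


# asyncio.run() creates a fresh event loop, runs the coroutine to
# completion, and closes the loop again.
result = asyncio.run(main())
```

Because `asyncio.run()` manages the loop's lifetime itself, the test no longer depends on whether a current event loop already exists in the calling thread.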


3 changes: 2 additions & 1 deletion datalad/runner/tests/test_threadsafety.py
@@ -8,11 +8,12 @@
Tuple,
)

from datalad.tests.utils_pytest import assert_raises

from ..coreprotocols import StdOutCapture
from ..nonasyncrunner import ThreadedRunner
from ..protocol import GeneratorMixIn
from .utils import py2cmd
from datalad.tests.utils import assert_raises


class MinimalGeneratorProtocol(GeneratorMixIn, StdOutCapture):
14 changes: 10 additions & 4 deletions datalad/support/external_versions.py
@@ -8,6 +8,7 @@
# ## ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##
"""Module to help maintain a registry of versions for external modules etc
"""
import re
import sys
import os.path as op
from os import linesep
@@ -116,10 +117,15 @@ def _get_ssh_version(exe=None):
stdout = out['stdout']
if out['stderr'].startswith('OpenSSH'):
stdout = out['stderr']
assert stdout.startswith('OpenSSH') # that is the only one we care about atm
# The last item in _-separated list in the first word which could be separated
# from the rest by , or yet have another word after space
return stdout.split(',', 1)[0].split(' ')[0].rstrip('.').split('_')[-1]
match = re.match(
"OpenSSH.*_([0-9][0-9]*)\\.([0-9][0-9]*)(p([0-9][0-9]*))?",
stdout)
if match:
return "{}.{}p{}".format(
match.groups()[0],
match.groups()[1],
match.groups()[3])
raise AssertionError(f"no OpenSSH client found: {stdout}")


def _get_system_ssh_version():
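The `_get_ssh_version` hunk above replaces chained string splitting with a regular expression over the `ssh -V` banner. The sketch below uses a pattern of the same shape to show which parts of a banner it captures. `parse_openssh_version` is a hypothetical helper, and returning a tuple with `None` for a missing patch level is a choice made for this sketch, not the behaviour of the committed code, which formats a string.

```python
import re

# Same shape as the pattern in the hunk, written as a raw string with "+".
_OPENSSH_RE = re.compile(r"OpenSSH.*_([0-9]+)\.([0-9]+)(p([0-9]+))?")


def parse_openssh_version(banner):
    """Return (major, minor, patch_or_None), or None if nothing matches."""
    match = _OPENSSH_RE.match(banner)
    if not match:
        return None
    major, minor, _, patch = match.groups()
    return int(major), int(minor), int(patch) if patch is not None else None


linux = parse_openssh_version("OpenSSH_8.9p1 Ubuntu-3, OpenSSL 3.0.2")
windows = parse_openssh_version("OpenSSH_for_Windows_8.1p1, LibreSSL 3.0.2")
no_patch = parse_openssh_version("OpenSSH_9.0")
other = parse_openssh_version("Dropbear v2020.81")
```

The greedy `.*` backtracks to the last underscore that is followed by a `major.minor` pair, which is why a banner with extra underscore-separated words still yields the version.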
28 changes: 25 additions & 3 deletions datalad/support/sshconnector.py
@@ -37,6 +37,9 @@
CommandError,
ConnectionOpenFailedError,
)
from datalad.support.external_versions import (
external_versions,
)
from datalad.utils import (
auto_repr,
Path,
@@ -82,7 +85,11 @@ def get_connection_hash(hostname, port='', username='', identity_file='',
# References:
# https://github.com/ansible/ansible/issues/11536#issuecomment-153030743
# https://github.com/datalad/datalad/pull/1377
return md5(

# The "# nosec" below skips insecure hash checks by 'codeclimate'. The hash
# is not security critical, since it is only used as an "abbreviation" of
# the unique connection property string.
return md5( # nosec
'{lhost}{rhost}{port}{identity_file}{username}{force_ip}'.format(
lhost=gethostname(),
rhost=hostname,
@@ -127,6 +134,7 @@ def __init__(self, sshri, identity_file=None,
"""
self._runner = None
self._ssh_executable = None
self._ssh_version = None

from datalad.support.network import SSHRI, is_ssh
if not is_ssh(sshri):
@@ -207,6 +215,13 @@ def runner(self):
self._runner = WitlessRunner()
return self._runner

@property
def ssh_version(self):
if self._ssh_version is None:
ssh_version = external_versions["cmd:ssh"]
self._ssh_version = ssh_version.version if ssh_version else None
return self._ssh_version

def _adjust_cmd_for_bundle_execution(self, cmd):
from datalad import cfg
# locate annex and set the bundled vs. system Git machinery in motion
@@ -251,6 +266,13 @@ def _get_scp_command_spec(self, recursive, preserve_attrs):
scp_options += ["-p"] if preserve_attrs else []
return ["scp"] + scp_options

def _quote_filename(self, filename):
if self.ssh_version and self.ssh_version[0] < 9:
return _quote_filename_for_scp(filename)

# no filename quoting for OpenSSH version 9 and above
return filename

def put(self, source, destination, recursive=False, preserve_attrs=False):
"""Copies source file/folder to destination on the remote.
@@ -285,7 +307,7 @@ def put(self, source, destination, recursive=False, preserve_attrs=False):
# add destination path
scp_cmd += ['%s:%s' % (
self.sshri.hostname,
_quote_filename_for_scp(destination),
self._quote_filename(destination),
)]
out = self.runner.run(scp_cmd, protocol=StdOutErrCapture)
return out['stdout'], out['stderr']
@@ -320,7 +342,7 @@ def get(self, source, destination, recursive=False, preserve_attrs=False):
self.open()
scp_cmd = self._get_scp_command_spec(recursive, preserve_attrs)
# add source filepath(s) to scp command, prefixed with the remote host
scp_cmd += ["%s:%s" % (self.sshri.hostname, _quote_filename_for_scp(s))
scp_cmd += ["%s:%s" % (self.sshri.hostname, self._quote_filename(s))
for s in ensure_list(source)]
# add destination path
scp_cmd += [destination]
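The sshconnector.py hunks above route remote paths through a new `_quote_filename` method that quotes only when the detected ssh major version is below 9. A likely background is that OpenSSH 9.0 made `scp` use the SFTP protocol by default, so remote paths are no longer expanded by a remote shell. The sketch below shows the same gating in isolation. `quote_remote_path` is a hypothetical name, and `shlex.quote` is only a stand-in for DataLad's `_quote_filename_for_scp`, whose exact rules are not shown in this diff.

```python
import shlex


def quote_remote_path(path, ssh_major):
    """Quote `path` only for clients with a known major version below 9.

    As in the hunk above, an unknown version (None) leaves the path as is.
    """
    if ssh_major is not None and ssh_major < 9:
        return shlex.quote(path)
    return path


old_client = quote_remote_path("my file", 8)
new_client = quote_remote_path("my file", 9)
unknown_client = quote_remote_path("my file", None)
```

Keeping the decision in one helper means `put()` and `get()` share a single rule, which matches how the commit replaces both direct calls to `_quote_filename_for_scp`.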
3 changes: 1 addition & 2 deletions datalad/support/tests/test_sshconnector.py
@@ -256,8 +256,7 @@ def test_ssh_copy(sourcedir=None, sourcefile1=None, sourcefile2=None):

# and now a quick smoke test for get
# but simplify the most obscure filename slightly to not trip `scp` itself
togetfile = Path(targetdir) / (
get_most_obscure_supported_name().replace('`', '') + '2')
togetfile = Path(targetdir) / (obscure_file.replace('`', '') + '2')
togetfile.write_text(str('something'))
ssh.get(opj(remote_url, str(togetfile)), sourcedir)
ok_((Path(sourcedir) / togetfile.name).exists())
13 changes: 0 additions & 13 deletions datalad/tests/test_misc.py
@@ -30,16 +30,3 @@ def test_get_response_stamp():
eq_(r['size'], 101)
eq_(r['mtime'], 1367377320)
eq_(r['url'], "http://www.example.com/1.dat")


def test_test():
try:
import numpy
assert Version(numpy.__version__) >= Version('1.2')
except:
raise SkipTest("Need numpy 1.2")

# we need to avoid running global teardown
with patch.dict('os.environ', {'DATALAD_TESTS_NOTEARDOWN': '1'}):
# we can't swallow outputs due to all the nosetests dances etc
datalad.test('datalad.support.tests.test_status', verbose=0)
6 changes: 3 additions & 3 deletions datalad/tests/utils_pytest.py
@@ -209,7 +209,7 @@ def skip_if_no_module(module):
try:
imp = __import__(module)
except Exception as exc:
pytest.skip("Module %s fails to load" % module)
pytest.skip("Module %s fails to load" % module, allow_module_level=True)


def skip_if_scrapy_without_selector():
@@ -252,7 +252,7 @@ def skip_if_no_network(func=None):

def check_and_raise():
if dl_cfg.get('datalad.tests.nonetwork'):
pytest.skip("Skipping since no network settings")
pytest.skip("Skipping since no network settings", allow_module_level=True)

if func:
@wraps(func)
@@ -1603,7 +1603,7 @@ def ignore_nose_capturing_stdout(func):
# filesystems across different OSs. Start with the most obscure
OBSCURE_PREFIX = os.getenv('DATALAD_TESTS_OBSCURE_PREFIX', '')
# Those will be tried to be added to the base name if filesystem allows
OBSCURE_FILENAME_PARTS = [' ', '/', '|', ';', '&', '%b5', '{}', "'", '"']
OBSCURE_FILENAME_PARTS = [' ', '/', '|', ';', '&', '%b5', '{}', "'", '"', '<', '>']
UNICODE_FILENAME = u"ΔЙקم๗あ"

# OSX is exciting -- some I guess FS might be encoding differently from decoding
