Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bring create-sibling-github to the modern age #5551

Merged
merged 15 commits into from
Apr 16, 2021
Merged

Bring create-sibling-github to the modern age #5551

merged 15 commits into from
Apr 16, 2021

Conversation

mih
Copy link
Member

@mih mih commented Apr 8, 2021

Primarly turns the command into a normal DataLad command that emits result records. But it also changes a number of other small bits and pieces. I tried to keep the individual commits sensible and reasonably well described in their messages.

Fixes #5548

mih added 9 commits April 7, 2021 21:09
There is no need for this, as far as I can see.
- generator behavior
- result records
- command options

To achieve this, I turned the underlying `_make_github_repos()` helper
into a generator `_make_github_repos_()` too.

I also removed the custom result renderer, as it contributed nothing
important and was incompatible with standard result records.
This prevents triggering result hooks, when actually nothing
happened.
This changes the behavior in the `existing='error'` case (was
ValueError). However, I think this is fine, because with the new
approach we can control how we want to explore (e.g. in a dry-run),
with `--on-failure=continue|ignore`.
It would lead to unexpected results if one would make the mistake to
give something like 'mih/somerepo' (falsly assuming the org/account
needs to be provided).
But disable its result renderer. This also shifts the reporting on the
creation of the GitHub project to _before_ reporting on the sibling
configuration, because this is what is happening.
@codecov
Copy link

codecov bot commented Apr 8, 2021

Codecov Report

Merging #5551 (bdf1fb5) into master (f2f95b3) will decrease coverage by 0.22%.
The diff coverage is 81.27%.

❗ Current head bdf1fb5 differs from pull request most recent head 611a17c. Consider uploading reports for the commit 611a17c to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5551      +/-   ##
==========================================
- Coverage   88.52%   88.29%   -0.23%     
==========================================
  Files         296      305       +9     
  Lines       42467    42472       +5     
==========================================
- Hits        37592    37500      -92     
- Misses       4875     4972      +97     
Impacted Files Coverage Δ
datalad/cmdline/common_args.py 100.00% <ø> (ø)
datalad/consts.py 96.87% <ø> (-0.10%) ⬇️
datalad/customremotes/main.py 25.49% <0.00%> (ø)
...talad/distributed/tests/test_export_to_figshare.py 100.00% <ø> (ø)
datalad/interface/__init__.py 100.00% <ø> (ø)
datalad/interface/common_cfg.py 100.00% <ø> (ø)
datalad/local/tests/test_check_dates.py 44.68% <ø> (ø)
datalad/local/tests/test_export_archive.py 100.00% <ø> (ø)
datalad/plugin/__init__.py 100.00% <ø> (+10.00%) ⬆️
datalad/plugin/add_readme.py 0.00% <0.00%> (-93.11%) ⬇️
... and 79 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f2f95b3...611a17c. Read the comment docs.

@mih
Copy link
Member Author

mih commented Apr 8, 2021

@yarikoptic I failed to recover the knowledge on how to run/adjust these VCR-based tests.

@yarikoptic
Copy link
Member

@yarikoptic I failed to recover the knowledge on how to run/adjust these VCR-based tests.

yeah, dark magic but we have not found white one yet. I will look into updating them later. From top of your head - what has changed in the interactions with github in this PR?

Copy link
Contributor

@kyleam kyleam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested locally, but reading through, this looks very good to me.

The main behavior change seems to be moving bare exceptions into proper results that can be controlled with --on-failure, which I think is a good thing.

Aside from vcr-related failures, it looks like the only other failure is the hirni one being discussed elsewhere.


# what to operate on
ds = require_dataset(
dataset, check_installed=True, purpose='create GitHub sibling')

res_kwargs = dict(
action='create_sibling_github [dry-run]' if dryrun else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prevents triggering result hooks, when actually nothing
happened.

Ah, that makes a lot of sense. I should make a note to think about this for other sites, including the in-flight run PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought: since those records are intended for analysis/act-upon by outside code, I wonder if "dry-run" should not appear in action but rather appear in a dedicated key? Without that, it would require an even more arbitrary .endswith(['dry-run']) for outside code to tell which action was "real" and which was actually executed.
The counter-argument I guess could be: a " [dry-run]" flavor of action is like an action on its own.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an alternative, but I think more complicated than needed. I indeed saw it as "it is not the same (or even any) action, when --dry-run prevents any actual intervention taking place". Adding another key to indicate "surprise, this isn't really what you were looking for", would imply that any user of result hooks now needs to anticipate this possibility and specifically test for this scenario (every time).

In contrast, I cannot really see an example of a result hook that is meant to run immediately after an action that did not actually happen (dry-run), because, by definition, it will not find the normal outcome.

@@ -165,17 +169,29 @@ def __call__(
access_protocol='https',
publish_depends=None,
private=False,
dryrun=False):
dryrun=False,
dry_run=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't these need to be merged internally? (It looks like the new dry_run isn't considered downstream.)

Also, I think it'd be good to add a warnings.warn(..., DeprecationWarning).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done now, thx for pointing it out!

res = dict(
url=access_url,
prexisted=True,
)
if existing in ('skip', 'reconfigure'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not new to this PR, but a note to myself that the help output should probably be updated to note that 'reconfigure' really just behaves in the same way as 'skip'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that bothered me too. I expected to find a difference, but if there is one, it can only be in the siblings() call, and here it is just used for a non-intuitive pass-through.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, indeed. FWIW: I thought that may be some outside logic behaves differently between the two, but found no such spot.

@@ -185,13 +185,19 @@ def _gen_github_entity(
yield ses.get_user(), token_str


def _make_github_repos(
def _make_github_repos_(
Copy link
Member

@yarikoptic yarikoptic Apr 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a note: I bet I would need to fixup crawler as well for any of the RFing here. That logic is not tested in the crawler - no VCR tapes - so CI crawler run would not detect breakage

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it looks like the crawler code only uses the credential/login pieces of this code and it should be unaffected.

@mih
Copy link
Member Author

mih commented Apr 12, 2021

It seems to me that the failing tests do not need to be yarikoptic tests, correct? I could just record new cassettes... However, I am missing how I would make them available to the CIs. Any pointers on that one? I would like to move forward with this PR.

@mih mih added this to To be considered in Release 1.0 via automation Apr 12, 2021
@yarikoptic
Copy link
Member

cassettes are committed/shipped alongside under ./datalad/distribution/tests/vcr_cassettes/ for github tests. I will try now to update them

@yarikoptic
Copy link
Member

yarikoptic commented Apr 15, 2021

but it seems to be unrelated to VCR tapes

it is a comparison of return values fails (now full records, not adhoc tuples?)
======================================================================
FAIL: datalad.distribution.tests.test_create_github.test_integration1_datalad_tester_org
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 182, in _wrap_skip_if_no_network
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/vcr/cassette.py", line 99, in __call__
    return type(self)(self.cls, args_getter)._execute_function(function, args, kwargs)
  File "/usr/lib/python3/dist-packages/vcr/cassette.py", line 113, in _execute_function
    return self._handle_function(fn=handle_function)
  File "/usr/lib/python3/dist-packages/vcr/cassette.py", line 140, in _handle_function
    return fn(cassette)
  File "/usr/lib/python3/dist-packages/vcr/cassette.py", line 106, in handle_function
    return function(*args, **kwargs)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 1545, in _wrap_with_testsui
    ret = t(*args, **kwargs)
  File "/home/yoh/proj/datalad/datalad-master/datalad/distribution/tests/test_create_github.py", line 146, in test_integration1_datalad_tester_org
    check_integration1(
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 703, in _wrap_with_memory_keyring
    return t(*(args + (keyring,)), **kwargs)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 757, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/home/yoh/proj/datalad/datalad-master/datalad/distribution/tests/test_create_github.py", line 186, in check_integration1
    eq_(res, [(ds, url_fmt.format(**locals()), False)])
AssertionError: [{'action': 'create_sibling_github', 'path': '/home/yoh/.tmp/datalad_temp_check_integration1o34barlm', 'type': 'dataset', 'refds': '/home/yoh/.tmp/datalad_temp_check_integration1o34barlm', 'status': 'ok', 'message': ('project at %s', 'https://datalad-tester@github.com/datalad-tester-org/test_integration1.git'), 'url': 'https://datalad-tester@github.com/datalad-tester-org/test_integration1.git', 'prexisted': False}, {'action': 'configure-sibling', 'path': '/home/yoh/.tmp/datalad_temp_check_integration1o34barlm', 'type': 'sibling', 'name': 'github', 'annex-ignore': True, 'datalad-push-default-first': 'true', 'url': 'https://datalad-tester@github.com/datalad-tester-org/test_integration1.git', 'fetch': '+refs/heads/*:refs/remotes/github/*', 'status': 'ok', 'refds': '/home/yoh/.tmp/datalad_temp_check_integration1o34barlm'}] != [(Dataset('/home/yoh/.tmp/datalad_temp_check_integration1o34barlm'), 'https://datalad-tester@github.com/datalad-tester-org/test_integration1.git', False)]

======================================================================
FAIL: datalad.distribution.tests.test_create_github.test_integration1_yarikoptic
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 182, in _wrap_skip_if_no_network
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/vcr/cassette.py", line 99, in __call__
    return type(self)(self.cls, args_getter)._execute_function(function, args, kwargs)
  File "/usr/lib/python3/dist-packages/vcr/cassette.py", line 113, in _execute_function
    return self._handle_function(fn=handle_function)
  File "/usr/lib/python3/dist-packages/vcr/cassette.py", line 140, in _handle_function
    return fn(cassette)
  File "/usr/lib/python3/dist-packages/vcr/cassette.py", line 106, in handle_function
    return function(*args, **kwargs)
  File "/home/yoh/proj/datalad/datalad-master/datalad/distribution/tests/test_create_github.py", line 127, in test_integration1_yarikoptic
    check_integration1(
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 703, in _wrap_with_memory_keyring
    return t(*(args + (keyring,)), **kwargs)
  File "/home/yoh/proj/datalad/datalad-master/datalad/tests/utils.py", line 757, in _wrap_with_tempfile
    return t(*(arg + (filename,)), **kw)
  File "/home/yoh/proj/datalad/datalad-master/datalad/distribution/tests/test_create_github.py", line 186, in check_integration1
    eq_(res, [(ds, url_fmt.format(**locals()), False)])
AssertionError: [{'action': 'create_sibling_github', 'path': '/home/yoh/.tmp/datalad_temp_check_integration1h1mo5kac', 'type': 'dataset', 'refds': '/home/yoh/.tmp/datalad_temp_check_integration1h1mo5kac', 'status': 'ok', 'message': ('project at %s', 'https://github.com/yarikoptic/test_integration1.git'), 'url': 'https://github.com/yarikoptic/test_integration1.git', 'prexisted': False}, {'action': 'configure-sibling', 'path': '/home/yoh/.tmp/datalad_temp_check_integration1h1mo5kac', 'type': 'sibling', 'name': 'github', 'annex-ignore': True, 'datalad-push-default-first': 'true', 'url': 'https://github.com/yarikoptic/test_integration1.git', 'fetch': '+refs/heads/*:refs/remotes/github/*', 'status': 'ok', 'refds': '/home/yoh/.tmp/datalad_temp_check_integration1h1mo5kac'}] != [(Dataset('/home/yoh/.tmp/datalad_temp_check_integration1h1mo5kac'), 'https://github.com/yarikoptic/test_integration1.git', False)]

and the same could be seen on travis: https://travis-ci.com/github/datalad/datalad/jobs/497255397#L2441

I do not think that should be caused by tape-recorded interactions with github since they should not mandate returns here

mih added 6 commits April 15, 2021 17:12
Before situations like a pre-existing project would raise an exception
and create an unrecoverable situation. Not we use standard result
reporting and leave it to the caller whether to proceed or not (the
latter being default).
This is the convention used by `push`, `addurls`, soon `run`, and also
Git and git-annex.
Trim tests that rely on assumption of result structure.
@mih
Copy link
Member Author

mih commented Apr 15, 2021

Thx for the eye-opener @yarikoptic

I have now pushed an attempt to fixup the tests.

@kyleam
Copy link
Contributor

kyleam commented Apr 15, 2021

I have now pushed an attempt to fixup the tests.

Success :)

The remaining failures are known ones (macos install issues, test_no_annex, test_asyncio_forked)

@mih
Copy link
Member Author

mih commented Apr 16, 2021

Thx all!

@mih mih merged commit 0f63e0c into datalad:master Apr 16, 2021
Release 1.0 automation moved this from To be considered to DONE Apr 16, 2021
@mih mih deleted the bf-5548 branch April 16, 2021 05:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Release 1.0
  
DONE
Development

Successfully merging this pull request may close these issues.

create-sibling-github needs to become a standard command
3 participants