BF: Remove create-sibling-gitlab's hierarchy layout, make collection the default #7410
This commit applies a small change to remove the discrepancy between the docs' description of the hierarchy layout and the actual implementation. The result is that each dataset name in a nested structure, including the top-level dataset, will map onto a GitLab group, and the associated dataset repositories will be created as projects inside these nested groups. An additional requirement is that a user needs to create a top-level group via the GitLab UI (this cannot be done via the API), and this group needs to be provided as the base of the value in the --project argument, e.g. 'newgroup/mydatasets'.
…gitlab-hierarchy-fix
|
I rebased your changes to sit on top of #7407 |
|
Hold off reviews or merges, I found an unrelated bug I'd like to slay in this PR, too |
This is a fix for datalad#7411. In a dataset hierarchy with subdatasets in subdirectories, the hierarchy layout of create-sibling-gitlab will keep path separators in the project path. This is problematic, because for GitLab, this makes it look like the subdirectory needs to be a pre-existing group. For example:

    [DS~0] /home/me/dl-101/DataLad-101
    ├── recordings/
    │   └── [DS~1] longnow/

will create a project path like this:

    <gitlab-instance>/<gitlab-group>/recordings/longnow/project

DataLad will attempt to create the group 'longnow', but won't consider creating 'recordings'. Thus, the action will fail due to a missing parent level group that was never meant to be created. With this change, the project path becomes

    <gitlab-instance>/<gitlab-group>/longnow/project

I.e., subdirectories are stripped.
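The stripping described in this commit message can be sketched roughly as follows. This is not the code from the changeset: `hierarchy_project_path` and its parameters are hypothetical names, and the rule "keep only those path components that are themselves dataset roots" is an assumption inferred from the example above.

```python
from pathlib import PurePosixPath


def hierarchy_project_path(base_group, dataset_relpath, dataset_roots):
    """Hypothetical helper (not DataLad API): build a GitLab project path.

    Assumption: only path components that end at a known dataset root
    become groups; plain subdirectories in between are dropped.
    """
    parts = PurePosixPath(dataset_relpath).parts
    kept = []
    for i in range(1, len(parts) + 1):
        # Path prefix up to and including the i-th component
        prefix = str(PurePosixPath(*parts[:i]))
        if prefix in dataset_roots:
            kept.append(parts[i - 1])
    # The repository itself lives in a project named "project"
    return "/".join([base_group, *kept, "project"])


# 'recordings' is a plain directory, 'recordings/longnow' is a subdataset
print(hierarchy_project_path(
    "gitlab-group", "recordings/longnow", {"recordings/longnow"}))
```

For the layout in the example, the plain directory 'recordings' is dropped and only the dataset name 'longnow' remains as a group, so no non-existent parent group is required.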
Codecov Report
Patch coverage:
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #7410      +/-   ##
==========================================
+ Coverage   88.68%   91.54%   +2.86%
==========================================
  Files         327      325       -2
  Lines       44703    43333    -1370
  Branches     5948     5804     -144
==========================================
+ Hits        39645    39670      +25
+ Misses       5043     3648    -1395
  Partials       15       15
Some notes:
- Made a very minor doc edit
- I've tested it with multiple nested datasets, within subdirectories, and the collection and flat layouts seem to be working as expected
- The tests all pass locally
- While reviewing this I noticed some issues with the path separator code, but I have moved those comments over to #7407
Improve changelog
Update changelog
9e50e41 to 7c83f4a
|
Given the time pressure imposed by upcoming events that require this command to work well, I have taken this changeset (plus #7407) and introduced it as a patch in datalad-next: datalad/datalad-next#413. It may be interesting to absorb the RF of the gitlab tests from that PR, which removes nose-residuals and switches to better use of |
Import create-sibling-gitlab patch from datalad/datalad#7410
|
woah, I haven't seen this much green in a while! |
|
btw I have created a dedicated walk-through (relies on this PR) in the handbook: https://datalad-handbook--963.org.readthedocs.build/en/963/basics/101-139-hostingservices.html#creating-a-sibling-on-gitlab |
* origin/master:
  [skip ci] Update docs/source/changelog.rst
  [skip ci] Update CHANGELOG
  [gh-actions](deps): Bump con/tributors from 0.0.21 to 0.1.1
  Use getpwd() instead of Path.cwd() to account for $PWD (symlinked paths)
  OPT: parse_gitconfig_dump() origin path processing
  [release-action] Autogenerate changelog snippet for PR 7050
  ENH: check file size on the filesytem for relaxed annexed files with content
  BF: status - report annexed files even if no bytesize known
  Use pytest.skip instead of raise SkipTest in the added test
  [release-action] Autogenerate changelog snippet for PR 7422
  BF/RF(TST): skip test_system_ssh_version if no ssh found + split parsing into separate test
  ENH(TST): test that Runner raises FileNotFoundError if binary does not exist
  remove unused in the test @with_tempfile
  [release-action] Autogenerate changelog snippet for PR 7418
  Do not map (leave as is) trailing \/ in github URLs.
  [release-action] Autogenerate changelog snippet for PR 7412
  Use `sphinx_autodoc_typehints`
  [release-action] Autogenerate changelog snippet for PR 7388
  (codespelled) RF: Issue a warning while minting annex key but getting KeyError
  BF: make addurls tollerate the case that not all rows have all metadata fields
|
Code Climate has analyzed commit 8510b5c and detected 0 issues on this pull request. View more on Code Climate. |
|
Everything is green! I'm hitting merge 💥 |
|
I think it didn't do anything... |
|
PR released in |
This PR:
- addresses "Fix `hierarchy` layout in `create-sibling-gitlab`" #7409 and related issues
- applies a small change to remove the discrepancy between the docs' description of the hierarchy layout and the actual implementation
- adds adjustments for tests

The result is that each dataset name in a nested structure, including the top-level dataset, will map onto a GitLab group, and the associated dataset repositories will be created as projects inside these nested groups. An additional requirement is that a user needs to create a top-level group via the GitLab UI (this cannot be done via the API), and this group needs to be provided as the base of the value in the `--project` argument, e.g. `newgroup/mydatasets`.

EDIT by adina: This PR started out trying to fix the hierarchy layout of create-sibling-gitlab, but eventually ran into the same problems others have discovered at several different times in the past. See #7411 and #7409.

Now, this PR implements a suggestion that Kyle made shortly after the original create-sibling-gitlab feature was merged: make `collection` the default layout. The need for this arose because the hierarchy layout has a number of edge cases that make it break when it operates on hierarchies of datasets (see #7411 and #7409 for details). Rather than keeping the hierarchy layout around in a dysfunctional state, this PR removes it entirely and adjusts documentation and tests accordingly. I would argue that there is little use in keeping it: it has been broken for recursive operations since it was introduced, and in non-recursive operations, the outcome that the `collection` layout provides is identical to what `hierarchy` would provide.
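To make the layout discussion concrete, here is a minimal sketch of how project paths might be derived under a collection-style layout. It is an illustration under stated assumptions, not DataLad's implementation: `collection_project_path` is a hypothetical helper, and both the root project name `project` and the `-` separator are assumed defaults.

```python
def collection_project_path(project, relpath="", sep="-", root_name="project"):
    """Hypothetical helper (not DataLad API) for a collection-style layout.

    Assumptions: the root dataset becomes '<project>/<root_name>', and each
    subdataset becomes a project directly inside the same group, with path
    separators in its relative path replaced by 'sep'.
    """
    if not relpath:
        # Non-recursive case: only the root dataset gets a project
        return f"{project}/{root_name}"
    # Subdatasets sit flat inside the group, so no nested groups are needed
    return f"{project}/{relpath.replace('/', sep)}"


print(collection_project_path("newgroup/mydatasets"))
print(collection_project_path("newgroup/mydatasets", "recordings/longnow"))
```

Under these assumptions, the root dataset ends up in `<project>/project` for a non-recursive call, which is the same place the hierarchy layout would have used. Subdatasets never require intermediate groups, which avoids the missing-parent-group failure discussed in #7411.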