Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NF: create_sibling_gitlab #3447

Merged
merged 6 commits into from May 24, 2019
Merged

NF: create_sibling_gitlab #3447

merged 6 commits into from May 24, 2019

Conversation

@mih
Copy link
Member

@mih mih commented May 21, 2019

Initial draft of such a command with basic support for multiple GitLab remotes, access via SSH and/or HTTP, sibling creation, sibling (re-)configuration, and three essential layout modes (flat, collection, hierarchy).

  • fixes #3420
  • dryrun tests for all local decision making logic that does not involve taking to an actual gitlab deployment
  • patch API abstraction call to give fake responses and exercise the rest of the code
  • documentation
@mih mih force-pushed the nf-gitlab branch 3 times, most recently from edb5ded to dc9c9b1 May 22, 2019
@codecov
Copy link

@codecov codecov bot commented May 22, 2019

Codecov Report

Merging #3447 into master will decrease coverage by 12.75%.
The diff coverage is 50%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #3447       +/-   ##
===========================================
- Coverage   91.29%   78.54%   -12.76%     
===========================================
  Files         267      270        +3     
  Lines       34497    34700      +203     
===========================================
- Hits        31494    27255     -4239     
- Misses       3003     7445     +4442
Impacted Files Coverage Δ
datalad/interface/__init__.py 100% <ø> (ø) ⬆️
datalad/distributed/__init__.py 100% <100%> (ø)
datalad/distributed/create_sibling_gitlab.py 20.58% <20.58%> (ø)
...ad/distributed/tests/test_create_sibling_gitlab.py 98.75% <98.75%> (ø)
datalad/cmdline/tests/test_formatters.py 14.28% <0%> (-85.72%) ⬇️
datalad/plugin/addurls.py 18.26% <0%> (-81.43%) ⬇️
datalad/interface/tests/test_annotate_paths.py 23.74% <0%> (-76.26%) ⬇️
datalad/plugin/export_archive.py 24.65% <0%> (-75.35%) ⬇️
datalad/support/due_utils.py 16.27% <0%> (-74.42%) ⬇️
datalad/interface/add_archive_content.py 19.23% <0%> (-70.68%) ⬇️
... and 106 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8514268...dc9c9b1. Read the comment docs.

@codecov
Copy link

@codecov codecov bot commented May 22, 2019

Codecov Report

Merging #3447 into master will decrease coverage by 0.05%.
The diff coverage is 86.02%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3447      +/-   ##
==========================================
- Coverage    91.3%   91.24%   -0.06%     
==========================================
  Files         269      272       +3     
  Lines       34555    34834     +279     
==========================================
+ Hits        31551    31785     +234     
- Misses       3004     3049      +45
Impacted Files Coverage Δ
datalad/interface/__init__.py 100% <ø> (ø) ⬆️
datalad/distributed/__init__.py 100% <100%> (ø)
...ad/distributed/tests/test_create_sibling_gitlab.py 100% <100%> (ø)
datalad/distributed/create_sibling_gitlab.py 74.67% <74.67%> (ø)
datalad/downloaders/http.py 83.73% <0%> (-2.78%) ⬇️
datalad/interface/base.py 90.42% <0%> (+0.33%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ce9c8b1...3c8fefe. Read the comment docs.

@mih mih added this to the Release 0.12.0 milestone May 23, 2019
@mih mih requested a review from yarikoptic May 23, 2019
@mih mih added this to In progress in Refurbish `publish` via automation May 23, 2019
@kyleam
Copy link
Contributor

@kyleam kyleam commented May 23, 2019

I tried creating a project on gitlab.com, but I'm running into an error (probably because I've mispecified things).

I have python-gitlab:

$ pip freeze | grep gitlab
python-gitlab==1.8.0

I have ~/.python-gitlab.cfg file that looks like:

[glab]
url = https://gitlab.com/
api_version = 4
private_token = <the one I generated>

I'm see

% cd $(mktemp -dt dl-XXXXXXX)
% datalad create
% datalad create-sibling-gitlab --site glab --project kyleam/dltest --description 'datalad create test'
[ERROR  ] Failed to create GitLab project: list indices must be integers or slices, not str [base.py:__getattr__:59] [create_sibling_gitlab(/tmp/dl-J4flTx9)]
create_sibling_gitlab(error): . (dataset) [Failed to create GitLab project: list indices must be integers or slices, not str [base.py:__getattr__:59]]

@mih
Copy link
Member Author

@mih mih commented May 23, 2019

@kyleam the code looks good. Can you post more of the traceback.

@kyleam
Copy link
Contributor

@kyleam kyleam commented May 23, 2019

Can you post more of the traceback.

Sure

backtrace
  /home/kyle/src/python/venvs/datalad/bin/datalad(8)<module>()
      6 #
      7 from datalad.cmdline.main import main
----> 8 main()

  /home/kyle/src/python/datalad/datalad/cmdline/main.py(505)main()
    504             try:
--> 505                 ret = cmdlineargs.func(cmdlineargs)
    506             except InsufficientArgumentsError as exc:

  /home/kyle/src/python/datalad/datalad/interface/base.py(645)call_from_parser()
    644             if inspect.isgenerator(ret):
--> 645                 ret = list(ret)
    646             if args.common_output_format == 'tailored' and \

  /home/kyle/src/python/datalad/datalad/interface/utils.py(427)generator_func()
    426                     on_failure, incomplete_results,
--> 427                     result_renderer, result_xfm, _result_filter, **_kwargs):
    428                 yield r

  /home/kyle/src/python/datalad/datalad/interface/utils.py(520)_process_results()
    519     # of them according to the requested behavior (logging, rendering, ...)
--> 520     for res in results:
    521         if not res or 'action' not in res:

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(177)__call__()
    176                     site, project, name, layout, existing, access,
--> 177                     dryrun, siteobjs, publish_depends, description):
    178                 yield r

> /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(359)_proc_dataset()
    358     if site_project is None:
--> 359         site_project = site_api.create_project(project, description)
    360         # report success

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(478)create_project()
    477         path_l = path.split('/')
--> 478         namespace_id = self._obtain_namespace(path_l)
    479         # check for options:

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(504)_obtain_namespace()
    503                 parent_group = self.site.groups.get(
--> 504                     '/'.join(path_l[:-2]))
    505             except self.gitlab.GitlabGetError as e:

  /home/kyle/src/python/venvs/datalad/lib/python3.5/site-packages/gitlab/exceptions.py(255)wrapped_f()
    254             try:
--> 255                 return f(*args, **kwargs)
    256             except GitlabHttpError as e:

  /home/kyle/src/python/venvs/datalad/lib/python3.5/site-packages/gitlab/mixins.py(50)get()
     49         server_data = self.gitlab.http_get(path, **kwargs)
---> 50         return self._obj_cls(self, server_data)
     51 

  /home/kyle/src/python/venvs/datalad/lib/python3.5/site-packages/gitlab/base.py(41)__init__()
     40         self.__dict__['_parent_attrs'] = self.manager.parent_attrs
---> 41         self._create_managers()
     42 

  /home/kyle/src/python/venvs/datalad/lib/python3.5/site-packages/gitlab/base.py(106)_create_managers()
    105             cls = getattr(self._module, cls_name)
--> 106             manager = cls(self.manager.gitlab, parent=self)
    107             self.__dict__[attr] = manager

  /home/kyle/src/python/venvs/datalad/lib/python3.5/site-packages/gitlab/base.py(228)__init__()
    227         self._parent = parent  # for nested managers
--> 228         self._computed_path = self._compute_path()
    229 

  /home/kyle/src/python/venvs/datalad/lib/python3.5/site-packages/gitlab/base.py(242)_compute_path()
    241         data = {self_attr: getattr(self._parent, parent_attr, None)
--> 242                 for self_attr, parent_attr in self._from_parent_attrs.items()}
    243         self._parent_attrs = data

  /home/kyle/src/python/venvs/datalad/lib/python3.5/site-packages/gitlab/base.py(242)<dictcomp>()
    241         data = {self_attr: getattr(self._parent, parent_attr, None)
--> 242                 for self_attr, parent_attr in self._from_parent_attrs.items()}
    243         self._parent_attrs = data

  /home/kyle/src/python/venvs/datalad/lib/python3.5/site-packages/gitlab/base.py(59)__getattr__()
     58             try:
---> 59                 value = self.__dict__['_attrs'][name]
     60 

@kyleam
Copy link
Contributor

@kyleam kyleam commented May 23, 2019

mih force-pushed the mih:nf-gitlab branch from 8095d3c to e21377e 14 minutes ago

Just a note: This push dropped two fixup commits from me:

8095d3c72 * BF: create_sibling_gitlab: Fix placeholder for result message
b9ce37ba7 * DOC: create_sibling_gitlab: s/GitHub/GitLab/

Your push seems to independently have fixed the placeholder issue (8095d3c), but there is still the github typo (b9ce37b).

@mih
Copy link
Member Author

@mih mih commented May 23, 2019

@kyleam ah, sorry -- will not force-push again. Pushed that fix too. I got myself an account on GitLab to try it out myself.

@kyleam
Copy link
Contributor

@kyleam kyleam commented May 23, 2019

@kyleam ah, sorry -- will not force-push again

No worries. Please force push all you want :]

FWIW, depending on your workflow, --force-with-lease can be helpful in preventing unknown overwrites.

@mih
Copy link
Member Author

@mih mih commented May 23, 2019

@kyleam Found the bug... bugs. Works for me now, with and without groups.

@kyleam
Copy link
Contributor

@kyleam kyleam commented May 23, 2019

@kyleam Found the bug... bugs. Works for me now, with and without groups.

OK, thanks. "without groups" makes me think I'm not supposed to put my username as part of the project value to create a repo at https://gitlab.com/username/repo. Is that right?

I first tried with the same snippet from my original post (which has --project username/repon) and saw this failure:

% datalad create-sibling-gitlab --site glab --project kyleam/dltest
[ERROR  ] Failed to create GitLab project: 'NoneType' object has no attribute 'attributes' [create_sibling_gitlab.py:_obtain_namespace:535] [create_sibling_gitlab(/tmp/dl-KfUBybl)] 
create_sibling_gitlab(error): . (dataset) [Failed to create GitLab project: 'NoneType' object has no attribute 'attributes' [create_sibling_gitlab.py:_obtain_namespace:535]]
full backtrace
  /home/kyle/src/python/venvs/datalad/bin/datalad(8)<module>()
      6 #
      7 from datalad.cmdline.main import main
----> 8 main()

  /home/kyle/src/python/datalad/datalad/cmdline/main.py(505)main()
    504             try:
--> 505                 ret = cmdlineargs.func(cmdlineargs)
    506             except InsufficientArgumentsError as exc:

  /home/kyle/src/python/datalad/datalad/interface/base.py(645)call_from_parser()
    644             if inspect.isgenerator(ret):
--> 645                 ret = list(ret)
    646             if args.common_output_format == 'tailored' and \

  /home/kyle/src/python/datalad/datalad/interface/utils.py(427)generator_func()
    426                     on_failure, incomplete_results,
--> 427                     result_renderer, result_xfm, _result_filter, **_kwargs):
    428                 yield r

  /home/kyle/src/python/datalad/datalad/interface/utils.py(520)_process_results()
    519     # of them according to the requested behavior (logging, rendering, ...)
--> 520     for res in results:
    521         if not res or 'action' not in res:

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(177)__call__()
    176                     site, project, name, layout, existing, access,
--> 177                     dryrun, siteobjs, publish_depends, description):
    178                 yield r

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(359)_proc_dataset()
    358         try:
--> 359             site_project = site_api.create_project(project, description)
    360             # report success

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(485)create_project()
    484         path_l = path.split('/')
--> 485         namespace_id = self._obtain_namespace(path_l)
    486         # check for options:

> /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(536)_obtain_namespace()
    535                         path_l[-2],
--> 536                         parent_group.attributes['full_path'],
    537                         str(e)),

But dropping the username, it successfully created the repo on GitLab.

% datalad create-sibling-gitlab --site glab --project dltest
create_sibling_gitlab(ok): . (dataset)
configure-sibling(ok): . (sibling)
action summary:
  configure-sibling (ok: 1)
  create_sibling_gitlab (ok: 1)

@mih
Copy link
Member Author

@mih mih commented May 24, 2019

OK, thanks. "without groups" makes me think I'm not supposed to put my username as part of the project value to create a repo at https://gitlab.com/username/repo. Is that right?

@kyleam I cannot replicate this. I can do:

datalad -f json_pp create-sibling-gitlab --site gitlab.com --project michael.hanke/dltest --description 'datalad create test'
...
    "http_url_to_repo": "https://gitlab.com/michael.hanke/dltest.git",
...

a subsequent call without the (implied) user "group" gives me the expected error.

datalad -f json_pp create-sibling-gitlab --site gitlab.com --project dltest --description 'datalad create test' 
[ERROR  ] Failed to create GitLab project: 400: {'name': ['has already been taken'], 'path': ['has already been taken'], 'limit_reached': []} [exceptions.py:wrapped_f:257] [create_sibling_gitlab(/tmp/stuff)] 

So it seems to resolve to the correct (and same) location. However, the crash you reported is an actuall bug (fix pushed) that should not happen, but it should also not get to this conditional to begin with.

mih added 2 commits May 24, 2019
These personal groups always already exist.
@mih
Copy link
Member Author

@mih mih commented May 24, 2019

Ha, now I could replicate (had a lingering project). I pushed a fix. Now it should be irrelevant whether or not you specify your own user group.

@kyleam
Copy link
Contributor

@kyleam kyleam commented May 24, 2019

Ha, now I could replicate (had a lingering project). I pushed a fix. Now it should be irrelevant whether or not you specify your own user group.

Works on my end. Thanks!

@mih mih changed the title [WIP] NF: create_sibling_gitlab NF: create_sibling_gitlab May 24, 2019
@mih
Copy link
Member Author

@mih mih commented May 24, 2019

This is ready from my POV.

Copy link
Member

@yarikoptic yarikoptic left a comment

Cursory review... I suggested a few changes to logging/exception messages to make them more factual. It would help later on to debug issues by reacting to actual errors and not assumptions of when they thought to be thrown



def _proc_dataset(refds, ds, site, project, remotename, layout, existing,
access, dryrun, siteobjs, depends, description):
Copy link
Member

@yarikoptic yarikoptic May 24, 2019

if would have been nice to summarize expected actions to be performed by this helper. When later reading the code it is hard to figure out what it is actually supposed to do without either looking where it is used or going through the entirety of the function. And the name gives no hint really

Copy link
Member

@yarikoptic yarikoptic May 24, 2019

Something as simple as

"""The actual logic to establish a gitlab repository for a dataset

Made into a function for more straightforward logic in the main __call__.

Yields
--------
Action records
"""

would already be useful IMHO

except gitlab.config.GitlabDataError as e:
raise ValueError(
'{}, please configure access to this GitLab instance'.format(
str(e)))
Copy link
Member

@yarikoptic yarikoptic May 24, 2019

best to use our exc_str instead of str to eg enable tracebacks.

from ..dochelpers import exc_str

I also tend to position exception string at the end of the message since it might be long, and then the "please configure ..." would get lost visually

return self.site.projects.get(path).attributes
except self.gitlab.GitlabGetError as e:
lgr.debug("Project with path '%s' does not yet exist at site '%s'",
path, self.site.url)
Copy link
Member

@yarikoptic yarikoptic May 24, 2019

I would have made debug message more to the point -- "Failed to obtain project for path '%s'..." and include the actual exception (exc_str) into the the log message.
I would assume that GitlabGetError might also be thrown for other reasons (connection? permissions?) and not only for a stated here assumption. Doc says
"GitlabGetError – If the server cannot perform the request" .
Code has lots of them and most relevant place of use:

gitlab/v4/objects.py=class ProjectJob(RESTObject, RefreshMixin):
--
gitlab/v4/objects.py-    @cli.register_custom_action("ProjectJob")
gitlab/v4/objects.py:    @exc.on_http_error(exc.GitlabGetError)
gitlab/v4/objects.py-    def artifacts(self, streamed=False, action=None, chunk_size=1024, **kwargs):
--
gitlab/v4/objects.py-            GitlabAuthenticationError: If authentication is not correct
gitlab/v4/objects.py:            GitlabGetError: If the artifacts could not be retrieved
gitlab/v4/objects.py-
--
gitlab/v4/objects.py-    @cli.register_custom_action("ProjectJob")
gitlab/v4/objects.py:    @exc.on_http_error(exc.GitlabGetError)
gitlab/v4/objects.py-    def artifact(self, path, streamed=False, action=None, chunk_size=1024, **kwargs):
--
gitlab/v4/objects.py-            GitlabAuthenticationError: If authentication is not correct
gitlab/v4/objects.py:            GitlabGetError: If the artifacts could not be retrieved

so smells that my connection (not authentication) assumption was on spot?

if description:
props['description'] = description
project = self.site.projects.create(props)
return project.attributes
Copy link
Member

@yarikoptic yarikoptic May 24, 2019

FWIW I would have expected the entire project object to be returned, not just attributes

'/'.join(path_l[:-1]),
'/'.join(path_l),
'/'.join(path_l[:-2]),
))
Copy link
Member

@yarikoptic yarikoptic May 24, 2019

again the same too strong of an assumption on when GitlabGetError would be thrown...?

path_l[-2],
repr(parent_group.attributes['full_path'])
if parent_group else 'the account root',
str(e)),
Copy link
Member

@yarikoptic yarikoptic May 24, 2019

exc_str

@mih
Copy link
Member Author

@mih mih commented May 24, 2019

Thx, for the review. I will not be able to get back to this till after OHBM. Anyone, please feel free to adjust things as you see fit.

@yarikoptic
Copy link
Member

@yarikoptic yarikoptic commented May 24, 2019

cool, will do... could do post-PR.
datalad-neuroimaging is failing as promised: datalad/datalad-neuroimaging#67

@yarikoptic yarikoptic merged commit afdaca2 into datalad:master May 24, 2019
2 of 3 checks passed
Refurbish `publish` automation moved this from In progress to Done May 24, 2019
Copy link
Contributor

@kyleam kyleam left a comment

I know this PR is already merged, but I'll put the comments I was working on here anyway. I'll also push some fix ups to kyleam/gitlab-fixups. @yarikoptic, it seems you're planning to open up a follow-up PR, so feel free to incorporate the commits from my branch.


I still don't have a great grasp of how the layout handling should work. But looking at the tests, I think this should work:

#!/bin/sh

set -eu

cd $(mktemp -dt dl-XXXXXXX)

datalad create
datalad create s1
datalad create s2
datalad save
git config datalad.gitlab-default-site glab
git config datalad.gitlab-glab-project dlhier
datalad -f json_pp create-sibling-gitlab -r --layout=hierarchy --dryrun

And the output seems to match the results shown in the tests:

eq_(
sorted(r['project'] for r in res),
[
'secret',
'secret/collection2/_repo_',
'secret/collection2/sub1/_repo_',
'secret/collection2/sub1/deepsub1/_repo_',
'secret/subdir/collection1/_repo_',
'secret/subdir/collection1/sub1/_repo_',
'secret/subdir/collection1/sub2/_repo_',
]
)

output
sh /tmp/scratch.sh
[INFO   ] Creating a new annex repo at /tmp/dl-UxaC1yc
create(ok): /tmp/dl-UxaC1yc (dataset)
[INFO   ] Creating a new annex repo at /tmp/dl-UxaC1yc/s1
create(ok): /tmp/dl-UxaC1yc/s1 (dataset)
[INFO   ] Creating a new annex repo at /tmp/dl-UxaC1yc/s2
create(ok): /tmp/dl-UxaC1yc/s2 (dataset)
add(ok): s1 (file)
add(ok): s2 (file)
add(ok): .gitmodules (file)
save(ok): . (dataset)
action summary:
  add (ok: 3)
  save (ok: 1)
{
  "action": "create_sibling_gitlab",
  "dryrun": true,
  "path": "/tmp/dl-UxaC1yc",
  "project": "dlhier",
  "refds": "/tmp/dl-UxaC1yc",
  "sibling": "glab",
  "site": "glab",
  "status": "ok",
  "type": "dataset"
}
{
  "action": "create_sibling_gitlab",
  "dryrun": true,
  "path": "/tmp/dl-UxaC1yc/s1",
  "project": "dlhier/s1/_repo_",
  "refds": "/tmp/dl-UxaC1yc",
  "sibling": "glab",
  "site": "glab",
  "status": "ok",
  "type": "dataset"
}
{
  "action": "create_sibling_gitlab",
  "dryrun": true,
  "path": "/tmp/dl-UxaC1yc/s2",
  "project": "dlhier/s2/_repo_",
  "refds": "/tmp/dl-UxaC1yc",
  "sibling": "glab",
  "site": "glab",
  "status": "ok",
  "type": "dataset"
}

But when I drop --dryrun from the above, I see the following failure:

[ERROR  ] Failed to create GitLab project: No parent group dlhier/s1 for project dlhier/s1/_repo_ found, and a group dlhier also does not exist. At most one parent group would be created. [create_sibling_gitlab.py:_obtain_namespace:566] [create_sibling_gitlab(/tmp/dl-CCw07ZI/s1)]
{
  "action": "create_sibling_gitlab",
  "path": "/tmp/dl-CCw07ZI/s1",
  "project": "dlhier/s1/_repo_",
  "refds": "/tmp/dl-CCw07ZI",
  "sibling": "glab",
  "site": "glab",
  "status": "error",
  "type": "dataset"
}
backtrace
  /home/kyle/src/python/venvs/datalad/bin/datalad(8)<module>()
      6 #
      7 from datalad.cmdline.main import main
----> 8 main()

  /home/kyle/src/python/datalad/datalad/cmdline/main.py(505)main()
    504             try:
--> 505                 ret = cmdlineargs.func(cmdlineargs)
    506             except InsufficientArgumentsError as exc:

  /home/kyle/src/python/datalad/datalad/interface/base.py(645)call_from_parser()
    644             if inspect.isgenerator(ret):
--> 645                 ret = list(ret)
    646             if args.common_output_format == 'tailored' and \

  /home/kyle/src/python/datalad/datalad/interface/utils.py(427)generator_func()
    426                     on_failure, incomplete_results,
--> 427                     result_renderer, result_xfm, _result_filter, **_kwargs):
    428                 yield r

  /home/kyle/src/python/datalad/datalad/interface/utils.py(520)_process_results()
    519     # of them according to the requested behavior (logging, rendering, ...)
--> 520     for res in results:
    521         if not res or 'action' not in res:

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(253)__call__()
    252                         site, project, name, layout, existing, access,
--> 253                         dryrun, siteobjs, publish_depends, description):
    254                     yield r

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(402)_proc_dataset()
    401         try:
--> 402             site_project = site_api.create_project(project, description)
    403             # report success

  /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(528)create_project()
    527         path_l = path.split('/')
--> 528         namespace_id = self._obtain_namespace(path_l)
    529         # check for options:

> /home/kyle/src/python/datalad/datalad/distributed/create_sibling_gitlab.py(566)_obtain_namespace()
    565                         '/'.join(path_l),
--> 566                         '/'.join(path_l[:-2]),
    567                     ))

On gitlab.com, it created a project for dlhier (not a group dlhier and repo _repo_ underneath, as I expected from reading the description of the hierarchy layout).

this group. Using this layout, arbitrarily deep hierarchies of
nested datasets can be represented, while the hierarchical structure
is reflected in the project path. This is the default layout, if
no project path is specified.
Copy link
Contributor

@kyleam kyleam May 24, 2019

Based on the code

    if layout is None:
        # figure out the layout of projects on the site
        # use the reference dataset as default, and fall back
        # on 'hierarchy' as the most generic method of representing
        # the filesystem in a group/subgroup structure
        layout_var = 'datalad.gitlab-{}-layout'.format(site)
        layout = ds.config.get(
            layout_var, refds.config.get(
                layout_var, 'hierarchy'))

it seems to always be the default even if a project is specified.

@yarikoptic
Copy link
Member

@yarikoptic yarikoptic commented May 24, 2019

I know this PR is already merged, but I'll put the comments I was working on here anyway. I'll also push some fix ups to kyleam/gitlab-fixups. @yarikoptic, it seems you're planning to open up a follow-up PR, so feel free to incorporate the commits from my branch.

my PR probably wouldn't come very soon -- so you can proceed as if I didn't promise or even incorporate my suggestions (which would be appreciated)

@mih
Copy link
Member Author

@mih mih commented May 26, 2019

I still don't have a great grasp of how the layout handling should work.

Hmm, I think I haven't thought this through properly. My intuition says that in this case, the root dataset should get dlhier/_repo_ as the effictive repo path -- regardless of (or rather informed by) the explicit configuration. Otherwise, a layout=hierarchy (indeed the default) cannot work, because groups and projects share the same namespace. hierarchy is the default, because it is the most generic one. However, it is also the most unnatural layout from the perspective of a GitLab site organization.

Should we switch the default?

@kyleam
Copy link
Contributor

@kyleam kyleam commented May 26, 2019

Hmm, I think I haven't thought this through properly.

My intuition says that in this case, the root dataset should get dlhier/_repo_ as the effictive repo path -- regardless of (or rather informed by) the explicit configuration.

That certainly matches what I expected as an unfamiliar user, especially since if I switch from hierarchy to collection I see dlhier/_repo_ correctly created, and IIUC that is the hierarchy part of collections hierarchy+flat hybrid approach.

However, it is also the most unnatural layout from the perspective of a GitLab site organization.

Should we switch the default?

I don't have a strong feeling about what the best default is, but I find collection most appealing in terms of site layout.

@mih mih mentioned this pull request Jun 2, 2019
3 tasks
@mih mih deleted the nf-gitlab branch Sep 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Linked issues

Successfully merging this pull request may close these issues.

3 participants