Skip to content

Address gitlab API special character restrictions #7407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jun 8, 2023

Conversation

jsheunis
Copy link
Member

@jsheunis jsheunis commented Jun 6, 2023

The main change is from _repo_ to repo. This PR serves as a space to brainstorm an effective solution.

Closes #6681

@jsheunis jsheunis changed the base branch from maint to master June 6, 2023 07:21
@adswa
Copy link
Member

adswa commented Jun 6, 2023

The verdict from the discussion in the chat was:

  • rename _repo_ to project (by default, and config item to redefine it. It should be config, because it would need to be set within a dataset, to avoid users having to remember it each and every time)
  • replace the hierarchy separator -- with - (again with config item)

It should be configurations so that users themselves can act when GitLabs requirements change.

@adswa
Copy link
Member

adswa commented Jun 6, 2023

Also TODO:

  • adjust test (it tests for a specific _repo_ name

adswa added 5 commits June 6, 2023 10:48
This change introduces two configurable stubs: project as an alternative
to _repo_ (as a name for collection and hierarchy layouts root or subdataset
projects) and - as a path separator for flat and collection subdataset names.
The defaults comply to GitLabs naming rules for project paths, introduced in
version 14.9:
https://gitlab.com/gitlab-org/gitlab/-/merge_requests/80055

They remain configurable via dedicated dataset-level configurations to allow
overrides, for example to prevent name clashes with existing (sub)datasets
or if path restrictions on gitlabs side are lifted.
The configurations are:

* datalad.gitlab-default-projectname
* datalad.gitlab-default-pathseparator
Also mention the new dataset-level configurations to override the defaults
@adswa adswa marked this pull request as ready for review June 6, 2023 09:19
@adswa adswa added the CHANGELOG-missing When a PR's description does not contain a changelog item, yet. label Jun 6, 2023
@adswa adswa added the semver-minor Increment the minor version when merged label Jun 6, 2023
@github-actions github-actions bot removed the CHANGELOG-missing When a PR's description does not contain a changelog item, yet. label Jun 6, 2023
@yarikoptic
Copy link
Member

  • replace the hierarchy separator -- with - (again with config item)

FTR: I tried to create test--name-with--dashes project on salsa of debian and got "Path must not start or end with a special character and must not contain consecutive special characters."

@adswa
Copy link
Member

adswa commented Jun 6, 2023

FTR: I tried to create test--name-with--dashes project on salsa of debian and got "Path must not start or end with a special character and must not contain consecutive special characters."

yes, this was the motivation behind not using -- as a path separator anymore. But if a user has a local dataset with this pattern, it would still fail. I think we should detect this and rename it, too...

Copy link
Member

@mih mih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes all sense. I left a bunch of comments and suggestions, some even on stuff that is only in the margin of the diffs.

The PR top-comment should mention that this closes #6681 (12 months old).

Copy link
Member Author

@jsheunis jsheunis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments

  1. Specification of the path separator via configuration datalad.gitlab-default-pathseparator should currently be done recursively in order to work correctly for subdatasets. For some other arguments set via config, the existing code infers the default from the top-level dataset (refds), and I think this should be the case for pathsep as well. I have made code suggestions to this effect.
  2. This is more a question and explanation of my thoughts: it is not clear to me where exactly the path separator should be injected in a project's path. E.g. should it be injected twice when representing the path for a 2nd-level subdataset? And twice when representing a subdataset that's inside a subdirectory of the parent dataset? I tested out some examples below with separator="+" (note that the datasets all have dashes in them, just to add unnecessary complexity :D)
    • It seems like the collection path separation is done correctly (although this should actually be tested with even deeper levels)
    • It looks like the flat path separation is wrong though, shouldn't all dashes after BOB2 be replaced with the +?

Dataset structure

BOB
.
├── gitlab-subtest (dataset)
│   └── gitlab-subsub (dataset)
└── subdir (directory)
    └── gitlab-subdirdataset (dataset)

Collection example - BOB1

Screenshot 2023-06-07 at 16 27 27

Flat example - BOB2

Screenshot 2023-06-07 at 16 27 16

@jsheunis
Copy link
Member Author

jsheunis commented Jun 7, 2023

A note in case we have not seen this error before: I tried to specify the path separator as ~ and received the following error:

[Failed to create GitLab project: GitlabCreateError(400: {'name': ["can contain only letters, digits, emojis, '_', '.', '+', dashes, or spaces. It must start with a letter, digit, emoji, or '_'."]})]

adswa and others added 3 commits June 8, 2023 11:06
Co-authored-by: Michael Hanke <michael.hanke@gmail.com>
Co-authored-by: Michael Hanke <michael.hanke@gmail.com>
This way, a custom configured path separator is applied
between all elements of the generated paths, yielding more
consistent appearance and behavior.

Co-authored-by: Michael Hanke <michael.hanke@gmail.com>

adjust test
@adswa adswa force-pushed the gitlab-special-characters branch from 32dd139 to eaa6fe8 Compare June 8, 2023 09:06
@codeclimate
Copy link

codeclimate bot commented Jun 8, 2023

Code Climate has analyzed commit eaa6fe8 and detected 0 issues on this pull request.

View more on Code Climate.

@adswa
Copy link
Member

adswa commented Jun 8, 2023

I have applied the suggestions from the review, adjusted the path-separator override test, and squashed a few commits together. Thanks, @mih and @jsheunis!

@codecov
Copy link

codecov bot commented Jun 8, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.06 ⚠️

Comparison is base (425a8f7) 91.59% compared to head (eaa6fe8) 91.54%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7407      +/-   ##
==========================================
- Coverage   91.59%   91.54%   -0.06%     
==========================================
  Files         325      325              
  Lines       43318    43338      +20     
  Branches     5781     5806      +25     
==========================================
- Hits        39679    39675       -4     
- Misses       3624     3648      +24     
  Partials       15       15              
Impacted Files Coverage Δ
datalad/distributed/create_sibling_gitlab.py 69.27% <100.00%> (+0.37%) ⬆️
...ad/distributed/tests/test_create_sibling_gitlab.py 98.62% <100.00%> (+0.12%) ⬆️

... and 30 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@adswa
Copy link
Member

adswa commented Jun 8, 2023

Tests have passed!

@adswa adswa merged commit f9e7efd into datalad:master Jun 8, 2023
@yarikoptic-gitmate
Copy link
Collaborator

PR released in 0.19.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-minor Increment the minor version when merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants