Filter projects_and_groups configuration using project topics #398

Open
armingerten opened this issue Aug 15, 2022 · 19 comments

Comments

@armingerten

armingerten commented Aug 15, 2022

Problem description

The current configuration syntax allows the usage of * wildcards to apply a configuration to all projects or groups (or group/subgroup). However, sometimes multiple projects that share a common characteristic are scattered across multiple (sub-)groups. This makes it hard to apply the same configuration to multiple projects with said common characteristic without moving them into the same (sub-)group.

Imagine the following structure:

my_group
└─ my-product-a (group)
   └─ cool-python-app (python project)
   └─ really-cool-java-app (java project)
└─ my-product-b (group)
   └─ awesome-python-app (python project)
   └─ cool-application (python project)

In this scenario, I would like to apply a configuration to all python projects... (and only to the python projects!)

Implementation proposal

GitLab allows adding "topics" to a project. Topics are single strings that annotate a project and allow searching / filtering by topic. In the example above, all python projects could be annotated with a topic named python.

This proposal would introduce a new configuration syntax that filters based on topics. The topic filter is indicated by a ? character appended to the * wildcard. After the ?, one or more topics (separated by ,) can be listed; a project must have all of them for the configuration to apply:

projects_and_groups:
  "*?<topic>":
    # common-level config filtered by <topic>
  group_1/*?<topic>:
    # group-level config filtered by <topic>
  group_2/*?<topicA>,<topicB>:
    # group-level config filtered by <topicA> and <topicB>

Example configuration

projects_and_groups:
  "my_group/*?python":
    # group-level config for python projects anywhere under my_group
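
To make the proposed key syntax concrete, here is a minimal sketch of how such a key could be split into its path part and its list of required topics. The names `TopicFilter` and `parse_key` are hypothetical and not part of GitLabForm; the sketch only assumes the `<path>?<topic>[,<topic>...]` shape described above.

```python
from typing import NamedTuple, Tuple


class TopicFilter(NamedTuple):
    """Parsed form of a proposed key such as 'my_group/*?python' (hypothetical)."""
    path: str                 # e.g. "*" or "my_group/*"
    topics: Tuple[str, ...]   # topics a project must have; empty if no filter


def parse_key(key: str) -> TopicFilter:
    # Everything before the first "?" is the usual path/wildcard part.
    path, separator, topic_part = key.partition("?")
    if not separator:
        return TopicFilter(path, ())
    # Topics are comma-separated; surrounding whitespace and empty entries are dropped.
    topics = tuple(t.strip() for t in topic_part.split(",") if t.strip())
    return TopicFilter(path, topics)


if __name__ == "__main__":
    print(parse_key("my_group/*?python"))
    print(parse_key("group_2/*?topicA,topicB"))
```

A key without a `?` comes back with an empty topic tuple, so existing configuration keys would keep their current meaning under this sketch.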

Related issues / features

#325

@amimas
Collaborator

amimas commented Aug 15, 2022

One workaround could be YAML anchors. You can move the shared configs into one or more YAML anchors and then apply them wherever needed.

@armingerten
Author

One workaround could be YAML anchors. You can move the shared configs into one or more YAML anchors and then apply them wherever needed.

That is true in terms of re-usability of configuration. I was thinking more in the direction of not having to explicitly specify my projects in the config.yml but rather apply the configuration based on a predicate (e.g. "a python project").

@gdubicki
Member

Hi @armingerten!

Thank you for the detailed feature proposal.

We will take it into consideration when designing v4 of the app, but I cannot commit to any deadlines for it, sorry (see #343 for more info).

@armingerten
Author

We will take it into consideration when designing v4 of the app, but I cannot commit to any deadlines for it, sorry (see #343 for more info).

Thanks @gdubicki for the quick reply! No worries about the implementation timeframe. I was mainly wondering whether you agree with the proposed design. If yes, I could try drafting an implementation and providing a PR.

@gdubicki
Member

After thinking a bit more about it, I think this feature could be very useful and fill an important gap: it would allow shared configs for entities regardless of their location in the group hierarchy.

I even had a case where I would use it at Egnyte very recently - a Python & Golang team's projects are scattered within a few GitLab groups split by product/service, but they should all have some shared config related to that team...

But I am not sure I completely understand how it would be used, so can you please provide a semi-complete usage example, @armingerten? I mean a config where you define e.g. 2 topics and then use them for a few projects/groups.

(We could also then compare how this would in practice differ from just using YAML anchors.)

@gdubicki
Member

One thing that is unclear to me is that in the GitLab API projects can have topics assigned (there is a "topics" attribute listed here), but groups cannot (no "topics" here).

@gdubicki
Member

@jimisola / @lfvjimisola: can you take a look at this proposal too, please, since we have been discussing the syntax of the next major version of GitLabForm?

@armingerten
Author

But I am not sure I completely understand how it would be used, so can you please provide a semi-complete usage example, @armingerten? I mean a config where you define e.g. 2 topics and then use them for a few projects/groups.

Yes, of course.

Example 1

The problem

Let's imagine we have a team called "team red" that is responsible for multiple projects. The team has decided on strict rules for merge requests:

  • Only "fast forward" merges should be allowed
  • At least one team member other than the author must review and approve every merge request

Unfortunately, the projects of team red are scattered across multiple groups within the GitLab instance.

Usage Example

Step 1: Define the "annotated" configuration

The first step is to define the configuration you would like to apply. Unless we intend to limit this to a specific group (and its descendant groups), we can use the * wildcard. Next, we define a topic team-red that will later be added to all of team red's projects.

The following entries in the config.yml describe team red's merge request policies:

projects_and_groups:
  "*?team-red":
    project_settings:
      merge_method: ff

    merge_requests:
      approvals:
        approvals_before_merge: 1
        merge_requests_author_approval: false

Step 2: Add the topic to all projects that belong to team red

There are essentially two ways to do that.

The first option (the one that inspired this feature proposal) is through the user interface. While GitLab allows managing topics on an instance level (e.g. through the Topics API), topics do not have to be declared on an instance level before a project can reference them. Once you add the string team-red to the project's list of topics, it is instantly available. Do that with all of team red's projects and you are good to go!

The second option uses GitLabForm. Let's say team red is working on the products foo and bar. They have a project called awesome as part of product foo and a project called cool as part of product bar. A GitLabForm configuration for these projects would then look like this:

projects_and_groups:
  product-foo/awesome:
    project_settings:
      topics: "team-red"

  product-bar/cool:
    project_settings:
      topics: "team-red"

Especially if you are intending to manage every single project individually with GitLabForm, this implementation is very similar to simply using YAML anchors.

Example 2

The problem

Another example would be to use this proposed feature to introduce "rules" that can generically be applied to your projects without needing to know the configuration of GitLabForm. This example revolves around two roles:

  • The GitLab instance operator that also operates GitLabForm
  • The developer that has no deeper knowledge about GitLabForm but wants to apply the rule to their project

Let's assume we want to add two rules on an instance level: strict merge requests and conventional commits.

Usage example

As with the first example, we start by adding the annotated configuration to GitLabForm's config.yml. This is done by the GitLab instance / GitLabForm operator:

projects_and_groups:
  "*?strict-merge-requests":
    project_settings:
      merge_method: ff

    merge_requests:
      approvals:
        approvals_before_merge: 1
        merge_requests_author_approval: false

  "*?conventional-commits":
    project_push_rules:
      commit_message_regex: '^((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\(\w+\))?(!)?(: (.*\s*)*))'

Now, whenever a developer (unaware of the implementation details of GitLabForm) wants to add such policies to their projects, they simply add the topics strict-merge-requests and/or conventional-commits to their project. Not only does this decouple the implementation detail from the usage - it is also clearly visible within the GitLab UI which rules a developer/contributor needs to adhere to when contributing to the project.
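
As a rough illustration of how topic-keyed rules could be resolved for a single project, here is a minimal sketch. It is not how GitLabForm works today: `RULES` and `effective_config` are hypothetical names, the commit regex is shortened to a placeholder, and the merge is deliberately shallow (top-level sections only), which is enough here because the two rules touch different sections.

```python
from typing import Any, Dict, Iterable

# Hypothetical in-memory form of the two "*?<topic>" entries above.
RULES: Dict[str, Dict[str, Any]] = {
    "strict-merge-requests": {
        "project_settings": {"merge_method": "ff"},
        "merge_requests": {
            "approvals": {
                "approvals_before_merge": 1,
                "merge_requests_author_approval": False,
            }
        },
    },
    "conventional-commits": {
        # Placeholder regex; the full pattern is in the config above.
        "project_push_rules": {"commit_message_regex": r"^(feat|fix): .+"},
    },
}


def effective_config(project_topics: Iterable[str],
                     rules: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Collect the sections of every rule whose topic the project carries."""
    topics = set(project_topics)
    config: Dict[str, Any] = {}
    for topic, sections in rules.items():
        if topic in topics:
            # Shallow merge: a later rule replaces a whole section of an earlier one.
            config.update(sections)
    return config


if __name__ == "__main__":
    print(effective_config(["python", "conventional-commits"], RULES))
```

A real implementation would need to decide how overlapping sections from several matching rules are merged, which this sketch leaves open.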

Conclusion

Of course these are just very simple examples. I am convinced that this feature would give the users a lot of flexibility to design their project configuration (and probably required repo files, etc.), without minding the group / project hierarchy.

@armingerten
Author

One thing that is unclear to me is that in the GitLab API projects can have topics assigned (there is a "topics" attribute listed here), but groups cannot (no "topics" here).

Yes, to my knowledge that is true.

The way I am intending to use this feature, the topic "marks" projects that should receive a specific configuration. If I want to mark a whole group to receive a specific configuration, I am already quite happy with the wildcard (i.e. <my-group>/*) GitLabForm is offering today. I can see how this also has its limits as described in #325 .

@amimas
Collaborator

amimas commented Aug 18, 2022

From a config syntax perspective, the following can be confusing.

projects_and_groups:
  "*?team-red":
    project_settings:
      merge_method: ff

To me, it is unclear to which projects I'm applying the config. Maybe add the topic as a query parameter, something like ?topic=team-red.

There could also be a use case for restricting the above to certain groups only, instead of all projects in the entire instance. Then maybe we can do the following.

projects_and_groups:
  "group1/?team-red":
    project_settings:
      merge_method: ff

Also, does this syntax potentially make it harder, in the future, to select a list of projects using a regex pattern instead of a wildcard? Regex matching could be another useful option. For example, all projects that have a docs- prefix should have the same config. Although this could be addressed by using topics too.

@amimas
Collaborator

amimas commented Aug 18, 2022

I think the inheritance feature might need to be enhanced or clarified where it can be used.

@jimisola
Collaborator

jimisola commented Aug 18, 2022

One thing that is unclear to me is that in the GitLab API projects can have topics assigned (there is a "topics" attribute listed here), but groups cannot (no "topics" here).

Another one of those classic GitLab situations where there are differences between projects and groups. We already know that they are working on merging groups and projects into namespaces (here), but I'm hoping that we'll be releasing the next major version of GitLabForm before then :)

That said, we can implement this but specifying a topic for a group will only affect projects under that group.

@jimisola
Collaborator

@armingerten Thank you for your feature request with detailed examples and use cases. I like it a lot. I can see how we could use this at my workplace as well. I don't think that I have paid attention to topics before, so thank you.

@jimisola
Collaborator

One thing that is unclear to me is that in the GitLab API projects can have topics assigned (there is a "topics" attribute listed here), but groups cannot (no "topics" here).

Yes, to my knowledge that is true.

The way I am intending to use this feature, the topic "marks" projects that should receive a specific configuration. If I want to mark a whole group to receive a specific configuration, I am already quite happy with the wildcard (i.e. <my-group>/*) GitLabForm is offering today. I can see how this also has its limits as described in #325 .

I honestly see the same type of use-case that you presented for projects but for groups.

Let's assume that we have topics for groups as well. If I specify a topic for a group what should it be applied to?

  1. only group and sub-groups with that topic
  2. only sub-projects with that topic
  3. 1 and 2

I think we should find a solution that 1) is logical and 2) will work the same way when GitLab merges groups and projects into a common entity: namespaces.

@jimisola
Collaborator

jimisola commented Aug 18, 2022

@armingerten @amimas Thank you for your comments/input. Our current draft of configuration syntax v4 allows for use of regexp on at least group/project names.

We have a couple of suggestions so far (please correct me if I missed any):

  1. "*?"
  2. "?

We might also need to differentiate if it should apply to only groups, only projects or both.

Ideally, I would like our new syntax to be backward/forward compatible once we release v4.
By that I mean that we should be able to add something else, like topics, in the future and still be backward compatible.

Also, should I be able to only specify one topic or list several?

I have reworked your syntax proposal into Alternative 1. We have to use a character that is not allowed in group or project names (here) and that is not likely to be allowed in the future. I chose a colon for that reason and because I want to avoid * and ?, which are used in wildcards, in regexps, in URLs, etc.:

Alternative 1:

projects_and_groups:
  "*group1/:topics=team-red,team-blue;xxx=strict_project,more_strict_project":
    project_settings:
      merge_method: ff
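
To show how a key in the Alternative 1 shape might be taken apart, here is a small sketch. It assumes only what the example key shows: a path, then a colon, then `name=value,value` clauses separated by semicolons. The function name `parse_alt1_key` is hypothetical, and `xxx` is just the placeholder filter name from the example.

```python
from typing import Dict, List, Tuple


def parse_alt1_key(key: str) -> Tuple[str, Dict[str, List[str]]]:
    """Split '<path>:<name>=<v1>,<v2>;<name>=...' into (path, filters)."""
    # The colon separates the path from the filter clauses.
    path, separator, filter_part = key.partition(":")
    filters: Dict[str, List[str]] = {}
    if separator:
        for clause in filter_part.split(";"):
            if not clause:
                continue  # tolerate a trailing ";"
            name, _, values = clause.partition("=")
            filters[name] = [v for v in values.split(",") if v]
    return path, filters


if __name__ == "__main__":
    key = "*group1/:topics=team-red,team-blue;xxx=strict_project,more_strict_project"
    print(parse_alt1_key(key))
```

Because the colon cannot appear in group or project names, the first colon is unambiguous as the separator, which is the property the choice of character relies on.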

However, for some of the configuration syntax in v4 we're planning to use Node tags (https://yaml.org/spec/1.2.2/#tags).
v4 could have a comma-separated list for topics, and later on we can add others.

Alternative 2:

Also, regarding the v4 configuration syntax, the current draft uses the keys include and exclude on group/project level, where the value can be an id, a name or a regexp (by using a custom resolver with node tags).

namespaces:
  groups:
    "description/name for some group rules":
      include: <name, id and PCRE2 regexp>
      exclude: <name, id and PCRE2 regexp>

We have some variations in the draft for the include/exclude value: should it be singular or plural, a list, etc. But, to simplify, the value can be e.g.:

  • "123": group/project id (implicit)
  • "id: 123" group/project id (explicit)
  • "name: someProjectOrGroup": group or project name
  • "regexp: ": regular expression (match id, name or both)

It would be convenient to add topics as well.

namespaces:
  groups:
    "description/name for some group rules":
      include:
         - some/path
         - topic: team-read
      exclude: <name, id and PCRE2 regexp>

This would match all groups and projects under the path "some/path" with topic = team-read. In the same way, topics could be used to exclude groups/projects from having settings applied.
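
A minimal sketch of how such include/exclude lists could be evaluated for one project follows. It assumes, based on the sentence above, that all include entries must match (path AND topic) and that any matching exclude entry removes the project. The names `rule_matches` and `is_selected` are hypothetical, and id/name/regexp rules from the draft are left out.

```python
from typing import Dict, Iterable, Sequence, Union

# A rule is either a bare path such as "some/path" or a mapping such as {"topic": "team-read"}.
Rule = Union[str, Dict[str, str]]


def rule_matches(rule: Rule, path: str, topics: Iterable[str]) -> bool:
    if isinstance(rule, str):
        # A bare string is treated as a path prefix in this sketch.
        prefix = rule.rstrip("/")
        return path == prefix or path.startswith(prefix + "/")
    if "topic" in rule:
        return rule["topic"] in set(topics)
    return False  # rule types not covered by this sketch never match


def is_selected(path: str, topics: Iterable[str],
                include: Sequence[Rule], exclude: Sequence[Rule] = ()) -> bool:
    topics = list(topics)
    # Assumption: include entries are ANDed, exclude entries are ORed.
    included = all(rule_matches(r, path, topics) for r in include)
    excluded = any(rule_matches(r, path, topics) for r in exclude)
    return included and not excluded


if __name__ == "__main__":
    include = ["some/path", {"topic": "team-read"}]
    print(is_selected("some/path/app", ["team-read"], include))
    print(is_selected("other/app", ["team-read"], include))
```

Whether include entries should combine with AND or OR is exactly the kind of detail the syntax draft would need to pin down; the sketch picks AND only to mirror the example.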

@amimas
Collaborator

amimas commented Aug 19, 2022

I wonder whether we should consider converting this feature request to be more generic.

That is, the ability to retrieve projects using various filter options instead of being restricted to just topics.

According to GitLab's project api doc on list projects, we can retrieve projects using various filters. For example:

  • archived
  • id_after
  • id_before
  • imported
  • last_activity_after
  • last_activity_before
  • membership
  • min_access_level
  • owned
  • starred
  • topic
  • topic_id
  • visibility
  • with_issues_enabled
  • with_merge_requests_enabled
  • with_programming_language

Aside from topic, I could see a use case for applying different configuration using the visibility field. For example: a different set of config based on whether the project is private, public or internal. Otherwise, you would have to duplicate the same info as a "topic" value to achieve the same thing.

Although I can't think of any right now, there could be interesting use cases for the other fields listed above. Perhaps some special config for starred projects.

The syntax implementation probably shouldn't hard-code those fields. Instead, we could try to think of a "raw parameter passing" type of setup, like the one we already have for configuring project-level settings.
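
One way to picture "raw parameter passing" is to forward whatever filter mapping the user wrote straight into the query string of the list-projects endpoint, without the tool knowing the individual field names. The sketch below only builds the URL; the function name `list_projects_url` and the host are hypothetical, and it assumes the standard `/api/v4/projects` path.

```python
from typing import Any, Mapping
from urllib.parse import urlencode


def list_projects_url(base_url: str, filters: Mapping[str, Any]) -> str:
    """Build a list-projects URL from an arbitrary mapping of filter parameters."""
    params = {}
    for name in sorted(filters):  # sorted only to make the output deterministic
        value = filters[name]
        # Booleans are sent in lowercase form, e.g. archived=false.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return f"{base_url.rstrip('/')}/api/v4/projects?{urlencode(params)}"


if __name__ == "__main__":
    print(list_projects_url("https://gitlab.example.com",
                            {"topic": "team-red", "visibility": "private"}))
```

Since the mapping is passed through untouched, any filter that the endpoint supports now or later would work without a change to the tool, at the cost of validation being left to the API.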

@lfvjimisola

@amimas I like the idea and agree. There is a big difference between include and exclude in that the API does not allow us to exclude directly. We have to do that in a second step. But that's an implementation detail.

@armingerten
Author

@jimisola @amimas Thanks for all the input and suggestions. I really like how this idea is evolving!

There could also be a use case for restricting the above to certain groups only, instead of all projects in the entire instance. Then maybe we can do the following.

projects_and_groups:
  "group1/?team-red":
    project_settings:
      merge_method: ff

Indeed, I think the "filter for topics" should filter the list of projects (or groups) that precedes the "filter separator". So assuming the filter separator is ?, the following principles would apply:

  • *?team-red: From all groups and projects, only those with topic team-red
  • group1/*?team-red: group1, its descendant groups and all projects within group1 (and its descendants), filtered by topic team-red
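
These two principles can be sketched as a filter over a list of projects: first narrow by the path pattern, then keep only projects that carry every requested topic. This is an illustration under assumptions, not GitLabForm code: `filter_projects` is a hypothetical name, projects are modeled as a plain mapping from full path to topics, and `fnmatch` stands in for whatever wildcard matching the tool uses (its `*` also matches `/`, so `group1/*` covers nested subgroups).

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List


def filter_projects(path_pattern: str, topics: Iterable[str],
                    projects: Dict[str, List[str]]) -> List[str]:
    """Return the paths matching the pattern that have all the given topics."""
    required = set(topics)
    return sorted(
        path
        for path, project_topics in projects.items()
        if fnmatchcase(path, path_pattern) and required <= set(project_topics)
    )


if __name__ == "__main__":
    projects = {
        "group1/app": ["team-red"],
        "group1/sub/tool": ["team-red", "team-blue"],
        "group1/other": ["team-blue"],
        "group2/lib": ["team-red"],
    }
    print(filter_projects("*", ["team-red"], projects))
    print(filter_projects("group1/*", ["team-red"], projects))
```

Treating several topics as "all required" follows the original proposal; an "any of" reading would replace the subset check with an intersection test.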

Also, should I be able to only specify one topic or list several?

Several 😉

From a config syntax perspective, the following can be confusing.

projects_and_groups:
  "*?team-red":
    project_settings:
      merge_method: ff

To me, it is unclear to which projects I'm applying the config. Maybe add the topic as a query parameter, something like ?topic=team-red.

Hmm, I see your point ... I was already thinking that the ? character might be confused with a "query parameter". Maybe # would be a better fit? 🤔

projects_and_groups:
  "groups/*#team-red,team-blue":
    project_settings:
      merge_method: ff

Also, regarding the v4 configuration syntax, the current draft uses the keys include and exclude on group/project level, where the value can be an id, a name or a regexp (yaml/pyyaml#457 (comment)).
It would be convenient to add topics as well.

namespaces:
  groups:
    "description/name for some group rules":
      include:
         - some/path
         - topic: team-read
      exclude: <name, id and PCRE2 regexp>

Hmm, I suppose using include/exclude rather than modifying the path string definitely increases readability.

I wonder whether we should consider converting this feature request to be more generic.
That is, the ability to retrieve projects using various filter options instead of being restricted to just topics.

You are right - that is actually a brilliant idea! Using topics seemed obvious to me because a topic acts as a "label" and "labeling" / "tagging" / etc. and then filtering by label / tag is a popular approach in many systems ... But why should this be limited to filtering by topic if projects expose other properties as well?

One could even consider filtering projects / groups by processing the returned API objects using yamlpath 🤔 .

@jimisola
Collaborator

I've updated #331 and created a wiki page for the syntax here


5 participants