Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process: replacing CODEOWNERS with MAINTAINERS #38725

Open
mbolivar-nordic opened this issue Sep 21, 2021 · 40 comments
Open

Process: replacing CODEOWNERS with MAINTAINERS #38725

mbolivar-nordic opened this issue Sep 21, 2021 · 40 comments
Assignees
Labels
Needs review This PR needs attention from Zephyr's maintainers Process Tracked by the process WG
Projects

Comments

@mbolivar-nordic
Copy link
Contributor

mbolivar-nordic commented Sep 21, 2021

This issue tracks replacing the CODEOWNERS file with the MAINTAINERS file.

The CODEOWNERS file and MAINTAINERS files contain duplicate information. The labeler.yml file would also be obsoleted by using a MAINTAINERS file in combination with a script from @nashif which adds labels as well.

Issues to sort out and current status:

How do we handle "non-maintainer" code owners?

Issue summary: CODEOWNERS and MAINTAINERS serve different but related purposes today, and many people who belong in CODEOWNERS do not belong in MAINTAINERS as it is currently defined.

Status

Current thinking is to define a new "sub-areas" concept, that lets us split up areas large enough to deserve a maintainer -- like "the I2C subsystem" -- into smaller components -- like "the nRF TWIM I2C driver".

We want to extend the MAINTAINERS YAML schema to allow us to define sub-areas and their "owners". Details are still TBD. @stephanosio has agreed to work on this syntax.

Details

  • CODEOWNERS is GitHub-specific syntax for selecting reviewers on a directory, file, or file glob specific resolution. The main historical purpose of this file in zephyr has been to automatically request reviewers in PRs that change those files.
  • MAINTAINERS is Zephyr-specific syntax that tracks areas of the project that are considered large and significant enough to require dedicated attention and maintenance from top-level maintainers and associated lower-level collaborators. See https://docs.zephyrproject.org/latest/project/project_roles.html.

One file, or many?

Issue summary: do we want to continue to put everything in a single MAINTAINERS file, or do we want to have various files throughout the tree that let us distribute sub-area owners into related directories?

For example, should we have the nRF TWIM driver owner in MAINTAINERS, or in a new drivers/i2c/SUB-MAINTAINERS file (exact name of the "sub-area" files is up for bikeshedding).

Current status The current consensus of the group is towards a single MAINTAINERS file with a new "sub-areas" syntax (TBD) for tracking individual components of a subsystem or area and their owners. Multiple "sub-MAINTAINERS" files throughout the tree were proposed but a supermajority of participants did not want to go this way, mainly because it seemed easier to search through a single file than many.

How to do the move?

Issue summary: Delete CODEOWNERS and move forward, or audit CODEOWNERS to make sure we have coverage in MAINTAINERS?

Current status: delete and move forward, but then how do we ensure that MAINTAINERS.yml contains all collaborators from the CODEOWNERS file?

Details

*  @stephanosio: As per https://github.com/zephyrproject-rtos/zephyr/issues/38725#issuecomment-1055570100, there is no point in having duplicate information at two different places.

What other metadata do we want?

Issue summary having a YAML file for describing components like individual drivers opens up opportunities to maintain additional useful metadata about those components. What do we want to add here?

Current status TBD, but some ideas are:

@mbolivar-nordic mbolivar-nordic added the Process Tracked by the process WG label Sep 21, 2021
@mbolivar-nordic mbolivar-nordic added this to In progress in Process Sep 21, 2021
@nashif
Copy link
Member

nashif commented Sep 22, 2021

new syntax proposal for the MAINTAINER file

- area: ARC arch
  collaborators:
  - abrodkin
  - evgeniy-paltsev
  - IRISZZW
  files:
  - arch/arc/
  - include/arch/arc/
  labels:
  - 'area: ARC'
  maintainers:
  - ruuddw
  status: maintained
  type: subsystem
- area: ARM arch
  collaborators:
  - carlocaione
  - galak
  - MaureenHelm
  - stephanosio
  - bbolen
  files:
  - arch/arm/
  - arch/arm/core/offsets/
  - include/arch/arm/aarch32/
  - include/arch/arm/
  - tests/arch/arm/
  labels:
  - 'area: ARM'
  maintainers:
  - ioannisg
  status: maintained
  type: subsystem

@mbolivar-nordic
Copy link
Contributor Author

Process WG minutes:

  • @carlescufi : audit CODEOWNERS and make sure everyone there is at least a collaborator in MAINTAINERS
  • @nashif : won't work since CODEOWNERS covers every file and not everyone in it should be a collaborator. Let's start with finding the diff
  • @galak : are we OK with MAINTAINERS growing very large? E.g. drivers/sensor can get huge and contain many one-off contributions; do we want individual maintainers for those? And what about boards? "Contact point" for contributions like these, which could enable removal of unmaintained boards/drivers. E.g. a MAINTAINERS file in the boards/ directory covering those locations
  • @nashif : "sub-MAINTAINER" files could work (e.g. sensors is owned by person A, individual instances can be owned by persons B, C, ...). Could work for boards also. This "instance" ownership does not confer maintainership status.
  • @galak : there's also https://github.com/zephyrproject-rtos/zephyr/blob/main/scripts/ci/tags.yaml which could be used as a data format
  • @nashif : another issue with MAINTAINERS is that validation is done by hand in get_maintainer.py instead of e.g. a pykwalify schema. I changed the format in the above comment. My script is adding labels and reviewers from this file. You see this as done by zephyrbot. This just needs to be cleaned up and sent as a PR.
  • @galak : would want to make sure the above new format can do what tags.yaml can do, and make sure the parsing is in a python module that can be reused

@evgeniy-paltsev
Copy link
Collaborator

@abrodkin @evgeniy-paltsev @ruuddw - we need to copy ARC-related entries from CODEOWNERS to MAINTAINERS - as I remember we didn't maintain MAINTAINERS file for us and didn't add proper info to it.

@mbolivar-nordic
Copy link
Contributor Author

Process WG minutes:

@mbolivar-nordic
Copy link
Contributor Author

Process WG minutes:

  • Check with @galak on Discord since he did not attend

@galak
Copy link
Collaborator

galak commented Oct 20, 2021

So if I understand what manifest.py#L425 is doing, this is not what I was asking for. I'm looking for something that would provide a class and functions that scripts that want to parse the MAINTAINERs files could be build around.

See:

@mbolivar-nordic
Copy link
Contributor Author

I'm looking for something that would provide a class and functions that scripts that want to parse the MAINTAINERs files could be build around.

OK, yeah, that is also supported in the same file, just lower down (in the Manifest class).

So you're basically asking for a library with generic parsing and validation support that is reusable elsewhere, right?

@nashif
Copy link
Member

nashif commented Oct 20, 2021

@galak when we discussed the various files, the unification and reuse of code idea was primarily around the way we parse the wildcards for paths in CODEOWNERS, MAINTAINERS.yml and tags.yaml. The yaml syntax can vary and I do not see how we can verify the syntax in a generic way.

@mbolivar-nordic
Copy link
Contributor Author

Process WG minutes:

  • Marti: should we start telling submitters to update MAINTAINERS for PRs?
  • Anas: need to figure out "sub-maintainers" files for instances of things (like individual boards or drivers). E.g. I2C could have a sub-maintainer file with Erwan for STM, Carles for Nordic, etc. Until we have this solution it's going to be a problem to remove it. This is the only remaining blocker. Should try to sort it out by the end of the year.
  • Marti: will leave in the "In progress" column but avoid including it in the agenda until @nashif has time to update the group with a proposal

@mbolivar-nordic
Copy link
Contributor Author

Process WG:

  • no news, @nashif still expects this to be done by EOY

@mbolivar-nordic mbolivar-nordic moved this from In progress to Awaiting TSC decision in Process Jan 19, 2022
@nashif
Copy link
Member

nashif commented Mar 1, 2022

@mbolivar-nordic lets put this on the agenda tomorrow, there are things blocking this that need to be discussed.

@nashif
Copy link
Member

nashif commented Mar 1, 2022

  • Anas: need to figure out "sub-maintainers" files for instances of things (like individual boards or drivers). E.g. I2C could have a sub-maintainer file with Erwan for STM, Carles for Nordic, etc. Until we have this solution it's going to be a problem to remove it. This is the only remaining blocker. Should try to sort it out by the end of the year.

So, the boards is going to be an easy one, we just add maintainers list per board in the board yaml file we already have, a quick PoC shows we can even get those populated from the existing CODEOWNERs file and then integrate it in the script that does the labeling and adding reviewers.

The challenge is how to deal with everything else, like drivers. We can solve multiple problems with one solution I guess, the most extreme approach would be to have metadata files per subsystem where we can list all the information needed about instances, their maintainers, how they are tested and how they relate to other areas in Zephyr, so for example for I2C, this could look like this:

- name: Designware I2C Driver
  maintainers:
    - nashif
    - teburd
  type: driver
  subsystem: i2c
  files:
    - Kconfig.dw
    - i2c_dw.*
    - i2c_dw_registers.h
  tests:
    - tests/drivers/i2c
  platforms:
    - rpi_pico
    - intel_s1000_crb
- name: ESP32 I2C Driver
  maintainers:
    - ...
  type: driver
  subsystem: i2c
  files:
    - Kconfig.esp32
    - i2c_esp32.c
  tests:
    - tests/drivers/i2c
  platforms:
    - esp32
...
...

Based on the information above and without the need for too much parse magic we can:

  • set reviewers per instance and have an overview of who maintains what and how things are related. CODEOWNERS file does not give us this information.
  • know how a driver can be tested and what tests are relevant per driver instance
  • know what platforms are using this driver, there are probably other way to know this, ie.. DTS but this is as static as it can get and does not require us to deal with DTS parsing and matching drivers to platforms
  • create automated documentation for drivers and the subsystem they are part of.
  • reduce CI noise and just test whatever is being changed in a PR. If i2c_dw.* files are being changed, I would know which tests to run and on which platforms. I do not have to build and run tests on all platforms and architectures that are not related....

The main challenge of the above is how we get there and how we continue to maintain those files and keep them in sync, I guess CI can enforce this in different way, similar to how enforce this with CODEOWNER now, if you add a file or a new driver that is not part of this "database", you will be asked to add it....

@mbolivar-nordic mbolivar-nordic moved this from On hold to In progress in Process Mar 2, 2022
@mbolivar-nordic
Copy link
Contributor Author

  • know what platforms are using this driver, there are probably other way to know this, ie.. DTS but this is as static as it can get and does not require us to deal with DTS parsing and matching drivers to platforms

There is another way that does rely on DTS that I think would be better than "platforms": devicetree compatibles.

Instead of:

  platforms:
    - foo
    - bar

I suggest using:

  dtcompatibles:
    - vnd,foo
    - vnd,bar

I.e. a list of compatibles that this driver supports.

I also think there should be a documented top-level kconfig option for enabling the driver.

@mbolivar-nordic
Copy link
Contributor Author

Process WG:

  • approach is accepted, details to be dealt with in future PRs

@hakehuang
Copy link
Collaborator

I.e. a list of compatibles that this driver supports.

@nashif , how could we find the mapping from given file to dts defines? using scripts to check the fild compatbile defines? but it is only existing in .c file, what if we change the .h file? or common file?

@nashif
Copy link
Member

nashif commented Mar 3, 2022

@nashif , how could we find the mapping from given file to dts defines? using scripts to check the fild compatbile defines? but it is only existing in .c file, what if we change the .h file? or common file?

a question for @mbolivar-nordic I guess

@mbolivar-nordic mbolivar-nordic moved this from In progress to On hold in Process Mar 9, 2022
@erwango
Copy link
Member

erwango commented Mar 17, 2022

There is another way that does rely on DTS that I think would be better than "platforms": devicetree compatibles.

@mbolivar-nordic I guess you don't think about dealing with dts at each CI run (need to compile all dts for each target and take into account tests/samples in which targets are modified by overlays ...).
Is the plan to generate, via actions, a file similar to the one mentioned by Anas periodically ?

@mbolivar-nordic
Copy link
Contributor Author

@nashif , how could we find the mapping from given file to dts defines? using scripts to check the fild compatbile defines? but it is only existing in .c file, what if we change the .h file? or common file?

a question for @mbolivar-nordic I guess

I'm planning on picking this back up at next week's process meeting so I am looking back at this. I'm sorry @hakehuang , but I don't understand your question. Could you please give me some example?

@mbolivar-nordic I guess you don't think about dealing with dts at each CI run (need to compile all dts for each target and take into account tests/samples in which targets are modified by overlays ...).

I don't know what you mean here either, @erwango. The example from @nashif is a list of drivers. Surely the compatibles that a driver handles do not change with overlays being applied by tests/samples, so I don't get what you're asking about.

@hakehuang
Copy link
Collaborator

I'm planning on picking this back up at next week's process meeting so I am looking back at this. I'm sorry @hakehuang , but I don't understand your question. Could you please give me some example?

my question is how could we find the target application when only .h source is changed? or subsystem code is changed, which does not mapping to all supporting platform for given test application.

@mbolivar-nordic
Copy link
Contributor Author

.h source where? Sorry, could you please give me an example of some real files in zephyr?

@hakehuang
Copy link
Collaborator

.h source where? Sorry, could you please give me an example of some real files in zephyr?

for example if I modify the 'zephyr\subsys\net\l2\ethernet\gptp\gptp_md.h', all impacted platfroms shall be built against, and how can I get the mapping relationship?

mbolivar-nordic added a commit to mbolivar-nordic/zephyr that referenced this issue Apr 8, 2022
This is part of the road towards replacing CODEOWNERS with
MAINTAINERS (tracked in the process working group as zephyrproject-rtos#38725).

As part of that work, we wanted to have a way to track the maintainers
of each west project (entry in west.yml) by name of project.

This is an initial attempt at that, based mostly on my personal
knowledge, the git logs, and some rough guesswork. I fully expect
omissions and errors, but it should be enough to get us started, with
fixes to follow incrementally from the people who know their
individual areas better than I do.

Signed-off-by: Martí Bolívar <marti.bolivar@nordicsemi.no>
mbolivar-nordic added a commit that referenced this issue Apr 13, 2022
This is part of the road towards replacing CODEOWNERS with
MAINTAINERS (tracked in the process working group as #38725).

As part of that work, we wanted to have a way to track the maintainers
of each west project (entry in west.yml) by name of project.

This is an initial attempt at that, based mostly on my personal
knowledge, the git logs, and some rough guesswork. I fully expect
omissions and errors, but it should be enough to get us started, with
fixes to follow incrementally from the people who know their
individual areas better than I do.

Signed-off-by: Martí Bolívar <marti.bolivar@nordicsemi.no>
@mbolivar-nordic
Copy link
Contributor Author

Process WG:

@danieldegrasse
Copy link
Collaborator

@nashif Just wanted to call out a specific use case where CODEOWNERS falls short for boards: the faze board, which was contributed by @simonguinot and @mbittan, is a board designed and maintained by Seagate. They should be pulled in as code owners on any file changes that relate to that board, but GitHub will not request their review (see #44843, where review had to be manually requested). According to the GitHub docs:

The people you choose as code owners must have read permissions for the repository. When the code owner is a team, that team must be visible and it must have write permissions, even if all the individual members of the team already have write permissions directly, through organization membership, or through another team membership.

In the case of Zephyr, It seems that contributors who are not members of Zephyr on GitHub won't be added to the review list for files they are the code owner of.

@mbolivar-nordic mbolivar-nordic moved this from In progress to On hold in Process Apr 20, 2022
@mbolivar-nordic
Copy link
Contributor Author

Moving status to on hold since it's on my plate but I haven't touched it yet to add I2C driver maintainers. Will move back to In Progress when I have a PR.

@stephanosio
Copy link
Member

#46373 and #46380 implemented a script and workflow to automatically assign the reviewers, assignee and labels based on the content of the MAINTAINERS.yml.

With that, we are now a step closer to replacing CODEOWNERS with MAINTAINERS.yml; we just need to make sure that the MAINTAINERS.yml is not missing any entries from the CODEOWNERS.

@mbolivar-nordic mbolivar-nordic moved this from On hold to In progress in Process Jul 6, 2022
@mbolivar-nordic
Copy link
Contributor Author

Process WG:

  • @stephanosio : I want to revisit whether distributing maintainers throughout the tree is the best idea. I think just having one file is better.
  • @mbolivar-nordic : upstream Linux MAINTAINERS is over 20,000 lines long; I don't want that personally
  • @stephanosio : you can still find in it and the get_maintainer.pl script won't take that long anyway, so I think we could use the same approach
  • @mbolivar-nordic : our Zephyr script could still look this up with a nice user experience
  • @stephanosio : it's harder to follow changes to different files; if I want to look up changes to assign them different GitHub roles, it's harder to get a bird's eye view.
  • @fabiobaltieri : I prefer a single file also
  • @mbolivar-nordic : we lose hierarchy as well -- keeping I2C drivers together near the I2C area will be harder without enforcement, and will lead to the file getting scattered
  • @fabiobaltieri : we'd still have a wildcard for the platform, right? like stm32* or something
  • @stephanosio : we'd have different areas, like a top-level area for I2C and sub-areas for specific drivers
  • @fabiobaltieri : maybe it would be good to have some way to have a platform maintainer, and then instead of referring to maintainers directly, referring indirectly to the platform maintainer, and getting github usernames from there?
  • @MaureenHelm : that's what we have today, and we'd lose that if we went distributed, right?
  • @fabiobaltieri : depends on how we implement it
  • @galak : I lean towards one file as well
  • @dleach02 : I do also

We'll put this on hold until next week's meeting at the latest, or whenever @nashif has a chance to comment either on Discord or in this issue.

@mbolivar-nordic
Copy link
Contributor Author

Process WG:

  • @stephanosio I will provide a template with example syntax
  • @mbolivar-nordic -- are you opposed to doing it in one file @nashif ?
  • @nashif I just want to see an example before agreeing to this. Don't want people to get the impression they are project level maintainers just because they have some responsibility for a single file they submit or something like that.

@mbolivar-nordic
Copy link
Contributor Author

Process WG: revisit next week

@marc-hb
Copy link
Collaborator

marc-hb commented Jul 31, 2022

it's harder to follow changes to different files. if I want to look up changes to assign them different GitHub roles, it's harder to get a bird's eye view.

git log -p '*/MAINTAINERS' shows changes to all MAINTAINERS files across the tree. Not much harder.

What I find missing in the giant linux/MAINTAINERS approach is... a bird's eye view. No way to zoom in and out. Searching is impractical when trying to match a path: which subdir to look for? linux/MAINTAINERS itself asks to try subdirs one by one. I suspect linux/MAINTAINERS a write-only database that is never read; readers always use the script.

Good luck trying to do this sort of filtering with a single giant file:
git log --oneline --stat *bluetooth*CMakeLists.txt (it shows changes in drivers/, subsys/, tests/,...)

@mbolivar-nordic
Copy link
Contributor Author

Process WG: moved back to on hold; @stephanosio is busy

@marc-hb
Copy link
Collaborator

marc-hb commented Sep 2, 2022

Will it be possible to unsubscribe from some pull requests? Not sure what the current status is now but "zephyrbot" keeps re-adding you after every force-push. I guess because a force-push can include new changed files but that's not the most common case.

@dleach02
Copy link
Member

I've run into a situation where an SOC/Board was contributed to Zephyr from a different business unit in NXP. Now that person wants to be the maintainer of the relevant files and be automatically pulled into PR reviews if those files are touched. He initially had updated the codeowners file to associate with the files but that apparently doesn't completely work unless he gets put into the MAINTAINERS file. I'm going to do that but now I'm getting too many collaborators at the top level. Which seems to imply needing some hierarchy in the MAINTAINERS file.

@zephyrbot zephyrbot added the Needs review This PR needs attention from Zephyr's maintainers label Feb 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs review This PR needs attention from Zephyr's maintainers Process Tracked by the process WG
Projects
Status: On hold
Process
On hold
Development

No branches or pull requests