Changes to pandas governance #47694

Open
datapythonista opened this issue Jul 13, 2022 · 11 comments
Labels
Admin Administrative tasks related to the pandas project

Comments

@datapythonista
Member

datapythonista commented Jul 13, 2022

I was reading the pandas governance document and I think it could benefit from some improvements. The team is much bigger now, and we've got more experience with things like what kind of sponsorship we receive, which policies haven't been enforced, and so on. I think having more up-to-date and precise governance policies should make decision making more efficient.

In this issue I compile the points I think we could update. Feel free to propose additions and changes to the list. Once we've got the big picture on what topics we want to discuss, I'll be creating separate issues for each of the topics.

Decision making

So far the policy is to find consensus and have the BDFL unblock decisions when needed. Keeping this is surely an option, but I see a couple of things that have changed since this was decided:

  • The team is now 30+ people, so consensus may be more challenging. I don't think this has been a problem in practice so far.
  • Wes's focus is on Arrow and not on pandas, and I don't think the BDFL figure has been used, at least for a long time.

For me personally, the main thing to improve in this area is knowing when a decision is ready to be made. This matters even more with the introduction of PDEPs (see #47444), where a few people expressed interest in defining the voting process. I think there are a couple of things that could be changed:

  • The introduction of a steering committee as other projects do
  • Implement some sort of policy, like "for a decision to be made (e.g. a PDEP approved), at least 4 approvals are needed, and no objection (i.e. Requested changes in PR) should exist". Or if the committee exists, something like "at least 75% of the steering committee upvotes, and a total of 5 upvotes...."

Personally I don't have a strong opinion on what the policy should be, but I think adding clarity and avoiding ambiguity on when a decision is made, or what is missing to move forward with one decision, would be very beneficial.
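As a minimal sketch of what an unambiguous rule could look like, the two example policies above can be written as simple checks. Everything here is hypothetical: the `Review` class, the state strings, and the function names are illustrative only and do not come from the governance document or from any GitHub API.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Review:
    reviewer: str
    state: str  # hypothetical labels: "approved" or "changes_requested"


def is_decided_simple(reviews, min_approvals=4):
    """Example rule 1: at least 4 approvals and no outstanding objection."""
    approvals = sum(r.state == "approved" for r in reviews)
    objections = sum(r.state == "changes_requested" for r in reviews)
    return approvals >= min_approvals and objections == 0


def is_decided_committee(reviews, committee,
                         committee_share=Fraction(3, 4), min_total=5):
    """Example rule 2: at least 75% of the steering committee upvotes,
    and at least 5 upvotes in total."""
    upvoters = {r.reviewer for r in reviews if r.state == "approved"}
    committee_upvotes = len(upvoters & set(committee))
    return (committee_upvotes >= committee_share * len(committee)
            and len(upvoters) >= min_total)
```

Either function gives a yes/no answer for a given set of reviews, which is the kind of clarity the policy is meant to provide. The numbers are parameters, so the actual thresholds remain open for discussion.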

Committees

So far we have a CoC committee and a NumFOCUS committee. I'd personally make a couple of changes:

  • Rename the NumFOCUS committee to Finances committee or similar, as I think the name is misleading.
  • Implement a communications committee, which takes care of communications between pandas and other stakeholders, such as sharing relevant NumFOCUS announcements with the core team, sharing pandas updates with NF and on social media, talking to sponsors...
  • I'd have a chair/leader in every committee, so it's clear who should schedule meetings, renew the committee when members become inactive...

Sponsors

I personally find the institutional partners policy overcomplicated and still ambiguous. I'd simplify things and simply have a policy like "A pandas sponsor is worth listing on the website (plus any other benefit we want) if it employs a person who spends at least 20% of their time on unrestricted work in the project, or provides $10,000 or more in funding to the project (in cash or in kind). A sponsor remains a sponsor for a year (or whatever period we want) after its last contribution to the project."
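To show how little logic such a policy needs, here is a rough sketch of it as two checks. The function names and parameters are hypothetical, and reading the one-year period as counting from the last contribution is an assumption about the intent.

```python
from datetime import date, timedelta


def qualifies_as_sponsor(employee_time_share=0.0, funding_usd=0):
    """True if the organization employs someone for at least 20% of their time
    on unrestricted pandas work, or provides $10,000 or more (cash or in kind)."""
    return employee_time_share >= 0.20 or funding_usd >= 10_000


def is_current_sponsor(last_contribution, today, period_days=365):
    """Assumed reading: sponsorship lasts `period_days` after the last contribution."""
    return today - last_contribution <= timedelta(days=period_days)
```

For example, `qualifies_as_sponsor(funding_usd=10_000)` is true, while `qualifies_as_sponsor(employee_time_share=0.1, funding_usd=5_000)` is not.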

Inactive core devs

The governance has a clear policy that core devs lose commit rights after one year of inactivity. I don't think this has been enforced for many years. Should we enforce it, or review the policy?

Personally, I think a good idea would be to have a figure of "inactive maintainer". I don't care much if someone has commit rights that aren't being used. But I think it'd be good to have the list of active core developers somehow updated. Mainly for two reasons:

  • If we want to use policies like "80% of core developer approval" (for example, we need that now to update the governance docs), I think inactive core developers aren't relevant, and they make things difficult.
  • For visibility to the rest of the core development team and the rest of the community. We've got 30+ core devs listed at the moment, but the actual number of active people is probably closer to 15. I find it useful that users, sponsors, devs, organizations we apply to for grants... have a more precise figure.

I personally prefer something like "Inactive core dev", which doesn't sound like you were a core dev once but lost the badge. It makes changing status less of a big deal, and changes the question from "do you want to stop being a core dev" to "are you currently active". Of course, that would be with a long-term view: someone who is inactive for a month should still be an active core dev.

CC: @pandas-dev/pandas-core

@datapythonista datapythonista added the Admin Administrative tasks related to the pandas project label Jul 13, 2022
@mroeschke
Member

+1 to all the topics you listed here. I'll save my specific feedback for the upcoming, separate issues related to each topic.

Inactive core devs / Committees

Maybe more generally, the topic could address membership (in & out) of the different teams (like https://github.com/orgs/pandas-dev/teams), which overlaps with committees. Maybe it's too heavy to discuss it all in one GitHub issue, but I kinda see these as specifics of how people are organized.

@toobaz
Member

toobaz commented Jul 13, 2022

Personally, I think a good idea would be to have a figure of "inactive maintainer". I don't care much if someone has commit rights that aren't being used. But I think it'd be good to have the list of active core developers somehow updated. Mainly for two reasons:

As a mostly inactive dev, I definitely agree. It doesn't make sense that the casual visitor sees me in https://pandas.pydata.org/about/team.html just like active devs.
(I do read discussions, so in principle I could contribute to the "80% of core developers", but I'm not sure the distinction is worth making)

We could even maybe be more precise on the definition of "inactivity" as "not contributing code". I have been sporadically commenting in issues, probably at least once a year, but I do think I should be considered inactive.

@attack68
Contributor

With more people you might encounter different unique cases. For example, I am currently inactive in contributing code and have been for a while, but I am monitoring issues, and if issues arise related to my area of development I will submit code fixes. I also intend to revisit code contributions when I have more free time. Perhaps some kind of automatic self-certification of status might be useful?

@Dr-Irv
Contributor

Dr-Irv commented Jul 13, 2022

IMHO, there are different levels of activities that members of the "core team" do, which maybe indicates that different labels should be applied (my labels are suggested and open for discussion, and the qualifications could certainly be adjusted):

  • Principal: Responsible for design decisions, attends all monthly meetings, etc.
  • Senior: Creates or reviews/approves PRs at least once per month on average
  • Junior: Comments on PRs and/or creates/comments on issues once per month on average
  • Casual: Has activity on PRs and/or issues at least once per quarter
  • Retired: Previously a Principal, but no longer even qualifies as Casual.

If you aren't at least "Casual", then it's time to drop you off the team, unless you are Retired.

To become a member of the core team, use the current process, which means you make code contributions that are considered significant. But once on the core team, your level of involvement could change the label over time.
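To make the suggested labels concrete, here is a rough sketch of how they could be assigned from activity averages. The function, its parameters, and the idea of measuring activity as simple averages are all assumptions for illustration. Principal status is passed in as a flag, since "responsible for design decisions" cannot be derived from counts.

```python
from typing import Optional


def activity_label(prs_per_month: float,
                   comments_per_month: float,
                   events_per_quarter: float,
                   is_principal: bool = False,
                   was_principal: bool = False) -> Optional[str]:
    """Return the suggested label, or None if the person would drop off the team."""
    if is_principal:
        return "Principal"
    if prs_per_month >= 1:       # creates or reviews/approves PRs monthly
        return "Senior"
    if comments_per_month >= 1:  # comments on PRs or issues monthly
        return "Junior"
    if events_per_quarter >= 1:  # some PR or issue activity each quarter
        return "Casual"
    if was_principal:            # below Casual, but previously a Principal
        return "Retired"
    return None
```

A return value of `None` corresponds to the "time to drop you off the team" case.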

@jbrockmendel
Member

jbrockmendel commented Jul 13, 2022 via email

@mroeschke
Member

I'm unclear on what failure(s) the increased formality is aimed at addressing.

I am also hoping that not too much unneeded formality and complexity is added in the end. @datapythonista may have other thoughts, but I see this, rather than as responding to failures, as an opportunity to

  1. Simplify or remove old policy
  2. Really clarify how the pandas project operates from code development to handling outside influences so it's clear to everyone. IMO I don't think the current governance doc captures all aspects of how the project operates today.

@datapythonista
Member Author

Thanks all for the feedback. I think what @mroeschke said about having a general topic of membership makes a lot of sense, and I started with that. It made sense to me to also add the sponsors there, as I think the same format used for maintainers and committees is useful. I was writing an issue, but in the end it made more sense to open a PR with an initial proposal, which I think will make the discussion easier: #47706

While what @Dr-Irv says makes sense to me, I'd personally keep things simpler, at least for the initial version of the new governance. But happy to continue the discussion about the roles in the PR, maybe there is value in that division that I didn't immediately see, compared to just using active/inactive maintainers.

@Dr-Irv
Contributor

Dr-Irv commented Jul 13, 2022

  • Retired: Previously a Principal, but no longer even qualifies as Casual.

If you aren't at least "Casual", then it's time to drop you off the team, unless you are Retired.

@jbrockmendel suggested in the team call today using Emeritus instead of Retired, which is a good idea.

But happy to continue the discussion about the roles in the PR, maybe there is value in that division that I didn't immediately see, compared to just using active/inactive maintainers.

The goal of my labels was to help define active/inactive into categories, based on different levels of activity.

If you want to have just "active" and "inactive", then we need to agree on what "active" means.

@datapythonista
Member Author

The goal of my labels was to help define active/inactive into categories, based on different levels of activity.

If you want to have just "active" and "inactive", then we need to agree on what "active" means.

Thanks for the clarification. In my initial proposal I added that a maintainer is inactive based on their own decision. To me this should work well enough and make things much simpler. I trust in this case people will make more sensible decisions than metrics, and also, to me it doesn't seem feasible that any of us spends time counting the number of PRs (or whatever) of pandas maintainers.

But I'm more than happy to hear other opinions, and continue the discussion.

@jreback
Contributor

jreback commented Jul 13, 2022

we need to have discussions around any changes in governance

this is actually a big deal

and we need consensus of pretty much everyone

@datapythonista
Member Author

we need to have discussions around any changes in governance

this is actually a big deal

and we need consensus of pretty much everyone

Agree. The current governance requires this: "A minimum of 80% of the Core Team must vote and at least 2/3 of the votes must be positive to carry out the proposed action." The core team has 28 people right now, which means 23 votes with 16 positives. I think, as you say, that we should have consensus, so let's aim at 23 positive votes for #47706. I can introduce a change in that PR so that this applies only to active maintainers, so for further changes we don't need to bother people who are not active anymore. Does this make sense to you?

And just to be clear, so far #47706 is not so much about changing the project governance as about better defining what we've got now, so things are clearer for everyone and decision making is more efficient.
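For reference, a small sketch of the arithmetic behind the quoted thresholds, assuming both are rounded up to whole votes. The rounding rule and the function names are assumptions here, not something the governance text spells out. With 28 core team members this gives 23 votes and, at that turnout, 16 positive votes, since 15 of 23 is about 65%, just under 2/3.

```python
import math
from fractions import Fraction

# Thresholds as quoted from the governance document.
MIN_TURNOUT = Fraction(4, 5)   # at least 80% of the core team must vote
MIN_APPROVAL = Fraction(2, 3)  # at least 2/3 of the votes must be positive


def min_votes(core_team_size: int) -> int:
    """Smallest number of votes that reaches the turnout threshold."""
    return math.ceil(core_team_size * MIN_TURNOUT)


def min_positive(votes_cast: int) -> int:
    """Smallest number of positive votes that reaches the approval threshold."""
    return math.ceil(votes_cast * MIN_APPROVAL)


team = 28
votes = min_votes(team)
print(votes, min_positive(votes))  # 23 16
print(min_positive(team))          # 19, if all 28 members vote
```

Exact fractions are used instead of floats so that a value sitting exactly on a threshold is not affected by rounding error.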

7 participants