From 4c887ba7ebf376a3c9594b4873a3983342db8a18 Mon Sep 17 00:00:00 2001
From: Rich Bowen
Date: Wed, 3 Jun 2026 09:19:57 -0400
Subject: [PATCH] Proposed next-committer prompt/skill for analyzing your project's public data and identifying contributors who you have not yet added to your project's formal governance.
---
 mcp/next-committer/PROMPT.md         | 201 +++++++++++++++++++++++++++
 mcp/next-committer/README.md         | 201 +++++++++++++++++++++++++++
 mcp/next-committer/example-output.md |  63 +++++++++
 3 files changed, 465 insertions(+)
 create mode 100644 mcp/next-committer/PROMPT.md
 create mode 100644 mcp/next-committer/README.md
 create mode 100644 mcp/next-committer/example-output.md

diff --git a/mcp/next-committer/PROMPT.md b/mcp/next-committer/PROMPT.md
new file mode 100644
index 0000000..401ee03
--- /dev/null
+++ b/mcp/next-committer/PROMPT.md
@@ -0,0 +1,201 @@
+# Next Committer Candidates — AI Prompt
+
+Use this prompt (or adapt it) with your MCP-capable AI assistant to generate
+a "Next Committer Candidates" report for your Apache project.
+
+## Prerequisites
+
+Your AI tool must have access to two MCP servers:
+
+1. **ponymail-mcp** — provides `search_list`, `get_email`, `get_thread`, `get_mbox`
+2. **apache-projects-mcp** — provides `get_committee`, `get_group_members`, `get_person`, `search_people`, `get_releases`, `get_repositories`
+
+Additionally, the AI tool needs the ability to fetch URLs (for GitHub API access).
+
+## The Prompt
+
+Copy and paste the following into your AI assistant, replacing `{PROJECT}` with
+your Apache project ID (e.g., `iceberg`, `spark`, `tinkerpop`):
+
+---
+
+```
+You are an Apache committer pipeline analyst. Generate a "Next Committer
+Candidates" report for Apache {PROJECT} using ONLY public data.
+
+## Data Sources — PUBLIC ONLY
+
+This report uses ONLY publicly available data:
+- Apache PonyMail (public mailing list archives)
+- GitHub public API (contributor stats, merged PRs)
+- Apache Projects LDAP (public committee/committer rosters, release history)
+
+Do NOT include any internal company data, proprietary information, or
+vendor/employer affiliations. Focus exclusively on open source contributions.
+
+## Step 1: Get current committers, PMC, and recent promotions (DO THIS FIRST)
+
+- Use get_committee("{PROJECT}") — this returns the full PMC roster WITH join
+  dates. Record join dates to identify recent promotions (last 12 months).
+- Use get_group_members("{PROJECT}") for committers.
+- Use get_group_members("{PROJECT}-pmc") for PMC members.
+- Store these lists. Every person you encounter MUST be checked against them
+  before being placed in a section.
+
+## Step 2: Get release history (PMC candidate signal)
+
+- Use get_releases("{PROJECT}") to get recent releases.
+- Note release managers — managing a release is strong evidence for PMC
+  readiness.
+
+## Step 3: Find all repositories for the project
+
+- Use get_repositories("{PROJECT}") to discover all repos (some projects have
+  multiple: main, website, docs, language-specific implementations).
+- Get GitHub contributor stats from ALL relevant repos.
+
+## Step 4: Scan mailing list activity (last 2 years)
+
+- Use search_list to scan dev@{PROJECT}.apache.org.
+- Scan at least 8 representative months (every 3 months) with quick=true to
+  get participant statistics.
+- For the top 10 non-committers by message count, do targeted from: searches
+  for accurate counts.
+- Use get_thread for promising candidates to assess depth of design
+  engagement (leading discussions vs. single replies).
+- PUBLIC lists only.
+
+## Step 5: Get GitHub contributors
+
+- Fetch https://api.github.com/repos/apache/{REPO}/contributors?per_page=100
+  for EACH repository found in Step 3.
+- Filter out bots (names ending in [bot], or well-known bot accounts like
+  dependabot, renovate, etc.).
+- Merge contributor counts across repos for the same person.
+
+## Step 6: Cross-reference with ASF LDAP
+
+- Use search_people for prominent contributors to check if they have ASF
+  accounts and committership on OTHER projects.
+- Use get_person for anyone with an ASF ID to see their full group
+  memberships (reveals cross-project committership, ASF Member status).
+
+## Step 7: Classify and write the report
+
+CRITICAL classification rules (based on Step 1 data):
+
+- If a person is ALREADY a committer on this project AND already on the
+  PMC → exclude entirely (they are done)
+- If a person is ALREADY a committer on this project but NOT on the PMC
+  → PMC Candidate section
+- If a person is NOT a committer on this project → Committer Candidate
+  section
+- If a person was recently promoted (from get_committee join dates, within
+  last 12 months) → Recent Promotions section
+
+PMC Candidate evidence to look for:
+- Release management (from Step 2)
+- Leading design discussions on dev@ (not just participating)
+- Mentoring new contributors
+- Community building (organizing events, writing docs)
+- Cross-project ASF involvement
+
+## Report Format
+
+Use this structure:
+
+# Apache {PROJECT} — Next Committer Candidates Report
+
+**Report Date:** {today's date}
+**Project:** Apache {PROJECT}
+**Current Committers:** {count} | **PMC Members:** {count}
+**Chair:** {name} ({apache_id})
+
+---
+
+## Recent Promotions (Last 12 Months)
+
+{People who recently became committers or PMC members, with dates}
+
+---
+
+## Committer Candidates
+
+People who are NOT currently committers on this project but show strong
+contribution patterns.
+
+### Tier 1 — Strong Candidates (Ready for Discussion)
+
+| # | Name | GitHub | Contributions | Evidence |
+|---|------|--------|---------------|----------|
+
+### Tier 2 — Growing Contributors
+
+| # | Name | GitHub | Contributions | Evidence |
+|---|------|--------|---------------|----------|
+
+---
+
+## PMC Candidates
+
+People who ARE already committers on this project but not yet on the PMC,
+and show PMC-level activity (release management, design leadership,
+mentoring, community building).
+
+| # | Name | Apache ID | Evidence |
+|---|------|-----------|----------|
+
+---
+
+## Pipeline Health Assessment
+
+{Status emoji and summary of overall pipeline health}
+
+## Rules
+
+- NEVER list someone as a "Committer Candidate" if they are already a
+  committer. Double-check against Step 1 data.
+- NEVER list someone as a "PMC Candidate" if they are already on the PMC.
+- Do NOT mention company/vendor affiliations. These reports are intended
+  for public publication. Focus on a person's open source contributions
+  (mailing list activity, code contributions, release management, design
+  engagement), not their employer.
+- ALL data from public sources. No internal or proprietary information.
+- Be honest — if fewer candidates exist, say so. Empty sections are fine.
+```
+
+---
+
+## Customization
+
+You can adapt this prompt to your needs:
+
+- **Change the time window** — adjust "last 2 years" and "last 12 months"
+  to match your project's cadence.
+- **Add additional repos** — if your project has repos outside the `apache/`
+  GitHub org (e.g., a mirror, or a separate ecosystem project), add them to
+  Step 5.
+- **Scan user@ list too** — for projects where contributors start by helping
+  users, add `user@{PROJECT}.apache.org` to the mailing list scan.
+- **Adjust tier criteria** — define what "Tier 1" vs "Tier 2" means for your
+  project's culture (some projects weight code heavily, others value community
+  participation equally).
+
+## Running on a Schedule
+
+If your AI tool supports scheduled/recurring tasks, you can run this monthly
+or quarterly. The report is most useful when tracked over time — you can see
+candidates moving up through the tiers.
+
+## Sharing Results
+
+While these reports use only publicly available data sources, the output
+concerns **project governance decisions** — specifically, who may or may not
+be elected to committer or PMC membership. These decisions are the
+responsibility of the PMC, and public discussion of individual candidates
+can be harmful to the community.
+
+**These reports should be treated as confidential guidance for the PMC.**
+Discuss them on your project's **private@ mailing list only** — this is
+where committer/PMC elections happen per ASF policy. Do not post to dev@,
+public wikis, or other public channels.
diff --git a/mcp/next-committer/README.md b/mcp/next-committer/README.md
new file mode 100644
index 0000000..d531e6d
--- /dev/null
+++ b/mcp/next-committer/README.md
@@ -0,0 +1,201 @@
+# Next Committer Candidates
+
+An AI-powered tool for Apache project PMCs to identify potential committer
+and PMC candidates by analyzing public contribution data.
+
+## What This Does
+
+Given an Apache project name, this tool generates a structured report
+identifying:
+
+1. **Committer Candidates** — people who are not yet committers but show
+   strong contribution patterns (code, mailing list engagement, design
+   discussions)
+2. **PMC Candidates** — existing committers who are not yet on the PMC but
+   demonstrate PMC-level activity (release management, design leadership,
+   mentoring)
+3. **Recent Promotions** — people promoted in the last 12 months, as
+   pipeline health evidence
+
+The report uses **only publicly available data** — Apache PonyMail archives,
+GitHub public API, and Apache Projects LDAP. No internal or proprietary data
+is used. No vendor/employer affiliations are mentioned.
+
+## How It Works
+
+This is not a standalone program. It is a **prompt** designed to be used with
+any AI assistant that supports the [Model Context Protocol
+(MCP)](https://modelcontextprotocol.io/). The AI tool does the actual
+analysis — the prompt provides the methodology.
+
+The analysis follows these steps:
+
+1. Fetch the current committer and PMC roster from Apache LDAP (with join
+   dates for recent promotions)
+2. Fetch release history to identify release managers
+3. Discover all source repositories for the project
+4. Scan the dev@ mailing list archives (last 2 years) for active
+   non-committers
+5. Fetch GitHub contributor statistics across all project repos
+6. Cross-reference contributors against ASF LDAP for existing affiliations
+7. Classify everyone into the correct section and write the report
+
+## Prerequisites
+
+### Required MCP Servers
+
+This tool depends on two MCP servers that provide access to Apache
+infrastructure data:
+
+#### 1. ponymail-mcp
+
+Provides access to Apache PonyMail mailing list archives.
+
+- **Repository**: https://github.com/rbowen/ponymail-mcp
+- **Tools used**: `search_list`, `get_email`, `get_thread`
+- **Install**:
+  ```bash
+  git clone https://github.com/rbowen/ponymail-mcp.git
+  cd ponymail-mcp
+  npm install
+  ```
+- **Run**: `node index.js` (stdio MCP server)
+- **Requirements**: Node.js 20+
+
+#### 2. apache-projects-mcp
+
+Provides access to Apache project metadata (committees, people, groups,
+releases, repositories).
+
+- **Repository**: https://github.com/rbowen/apache-projects-mcp
+- **Tools used**: `get_committee`, `get_group_members`, `get_person`,
+  `search_people`, `get_releases`, `get_repositories`
+- **Install**:
+  ```bash
+  git clone https://github.com/rbowen/apache-projects-mcp.git
+  cd apache-projects-mcp
+  npm install
+  ```
+- **Run**: `node index.js` (stdio MCP server)
+- **Requirements**: Node.js 20+
+
+### Required Capabilities
+
+Your AI tool must also be able to:
+
+- **Fetch URLs** — to access the GitHub public API
+  (`https://api.github.com/repos/apache/...`). Most AI tools provide this
+  natively (e.g., a `url_fetch` or `web_request` tool). No authentication
+  is required for the GitHub endpoints used, though a personal access token
+  increases rate limits.
+
+### Optional: GitHub Personal Access Token
+
+The GitHub API allows 60 requests/hour without authentication, or 5,000/hour
+with a token. For projects with many repositories or large contributor lists,
+a token is recommended:
+
+1. Go to https://github.com/settings/tokens
+2. Generate a token with no special scopes (public repo access is the default)
+3. Pass it in the `Authorization: Bearer {token}` header on GitHub API calls
+4. Tell your AI tool to use this header (how to do this varies by tool)
+
+## Usage
+
+### One-Time Report
+
+1. Start your AI tool with both MCP servers connected
+2. Paste the contents of `PROMPT.md` into the conversation
+3. Replace `{PROJECT}` with your project ID (e.g., `iceberg`, `spark`,
+   `tinkerpop`)
+4. The AI will execute the steps and produce a markdown report
+
+### Batch Run (Multiple Projects)
+
+You can modify the prompt to loop over multiple projects. Replace the
+`{PROJECT}` placeholder with a list and add instructions to iterate:
+
+```
+Analyze the following projects, writing a separate report for each:
+- projectA
+- projectB
+- projectC
+```
+
+### Scheduled Reports
+
+If your AI tool supports scheduled/recurring tasks, you can run this monthly
+or quarterly. Tracking reports over time shows candidates moving through the
+pipeline.
+
+## Output
+
+The report is a self-contained Markdown document with this structure:
+
+```
+# Apache {Project} — Next Committer Candidates Report
+
+Report metadata (date, committer count, PMC count, chair)
+
+## Recent Promotions (Last 12 Months)
+## Committer Candidates
+  ### Tier 1 — Strong Candidates
+  ### Tier 2 — Growing Contributors
+## PMC Candidates
+## Pipeline Health Assessment
+```
+
+### Pipeline Health Indicators
+
+- 🟢 **Healthy** — Multiple strong candidates, active pipeline
+- 🟡 **Moderate** — Some candidates but thin in places
+- 🔴 **Concern** — Very few candidates, development concentrated among
+  existing committers
+
+## Sharing Reports
+
+While these reports use only publicly available data sources, the output
+concerns **project governance decisions** — specifically, who may or may not
+be elected to committer or PMC membership. These decisions are the
+responsibility of the PMC, and public discussion of individual candidates
+can be harmful to the community.
+
+**These reports should be treated as confidential guidance for the PMC.**
+
+Suggested venue for discussion:
+
+- **Your project's private@ mailing list ONLY** — this is where PMC
+  membership and committer elections are discussed per ASF policy.
+
+Do not post these reports to dev@, public wikis, or other public channels.
+The PMC uses them as a discussion aid, not as a public ranking or scorecard.
+
+## Limitations
+
+- **GitHub-only code analysis** — projects using only Gitbox or SVN will
+  have limited code contribution data from the GitHub API. The mailing
+  list analysis still works.
+- **Bot filtering is heuristic** — some bots may slip through, or
+  legitimate contributors with bot-like names may be excluded. Review the
+  output.
+- **No private list access** — the ponymail-mcp server blocks private
+  lists by default (and this is the correct behavior for this use case).
+  Candidates are identified from public activity only.
+- **Point-in-time snapshot** — the report reflects current data. Run
+  periodically for trend analysis.
+- **AI judgment calls** — tier placement and pipeline health assessment
+  involve AI judgment. The PMC should use these as discussion starters,
+  not definitive rankings.
+
+## Contributing
+
+This tool is part of the Apache Community Development (ComDev) project.
+
+- **Mailing list**: dev@community.apache.org
+- **Issues**: https://github.com/apache/comdev/issues
+- **Improvements welcome** — especially around the classification
+  methodology and report format.
+
+## License
+
+Licensed under the Apache License, Version 2.0.
diff --git a/mcp/next-committer/example-output.md b/mcp/next-committer/example-output.md
new file mode 100644
index 0000000..9938f64
--- /dev/null
+++ b/mcp/next-committer/example-output.md
@@ -0,0 +1,63 @@
+# Apache Toasteroven — Next Committer Candidates Report
+
+**Report Date:** June 3, 2026
+**Project:** Apache Toasteroven
+**Current Committers:** 24 | **PMC Members:** 11
+**Chair:** Keanu Reeves (kreeves)
+
+---
+
+## Recent Promotions (Last 12 Months)
+
+| Name | Apache ID | Promotion | Date |
+|------|-----------|-----------|------|
+| Meryl Streep | mstreep | PMC Member | March 2026 |
+| Oscar Isaac | oisaac | Committer | January 2026 |
+
+---
+
+## Committer Candidates
+
+People who are NOT currently committers on this project but show strong
+contribution patterns.
+
+### Tier 1 — Strong Candidates (Ready for Discussion)
+
+| # | Name | GitHub | Contributions | Evidence |
+|---|------|--------|---------------|----------|
+| 1 | Pedro Pascal | pedropascal42 | 187 GH commits | Rewrote the entire crumb-tray eviction algorithm. 14 dev@ messages in 3 months including the "Should we support bagels?" DISCUSS thread. Led the BreadFormat v2 spec proposal. |
+| 2 | Florence Pugh | florencepugh | 134 GH commits | Implemented async preheating across 3 repos. Active on dev@ — initiated the timer precision RFC and participated in 6 release vote threads. |
+| 3 | Ke Huy Quan | kehuyquan | 91 GH commits | Fixed 47 flaky temperature-sensor tests. Wrote the "Getting Started with Your First Toast" tutorial. Existing ASF committer on Apache Waffle (ID: khquan). |
+
+### Tier 2 — Growing Contributors
+
+| # | Name | GitHub | Contributions | Evidence |
+|---|------|--------|---------------|----------|
+| 4 | Aubrey Plaza | aubreyplz | 56 GH commits | Added dark-mode support for the browning UI. 4 dev@ messages on defrost semantics. |
+| 5 | Barry Keoghan | bkeoghan | 43 GH commits | Contributed the sourdough compatibility module. Participated in one design thread (slot-width configuration). |
+| 6 | Stephanie Hsu | stephhsu | 38 GH commits | Documentation improvements — rewrote the troubleshooting guide for stuck levers. |
+
+---
+
+## PMC Candidates
+
+People who ARE already committers on this project but not yet on the PMC,
+and show PMC-level activity (release management, design leadership,
+mentoring, community building).
+
+| # | Name | Apache ID | Evidence |
+|---|------|-----------|----------|
+| 1 | Cate Blanchett | cblanchett | Release manager for 4.2.0, 4.2.1, and 4.3.0. Led the "Toasteroven 5.0 Roadmap" DISCUSS thread. Mentored 3 new contributors through their first PRs. |
+| 2 | Dev Patel | devpatel | Initiated the cross-project integration proposal with Apache Butterknife. Active design reviewer — 22 substantive code review comments in last quarter. |
+
+---
+
+## Pipeline Health Assessment
+
+**Status: 🟢 Healthy — Strong Pipeline with Active Community**
+
+Toasteroven has a vibrant contributor ecosystem with multiple strong
+committer candidates and clear PMC-track contributors. The recent addition
+of async preheating has attracted new contributors. The project would benefit
+from a committer election cycle to recognize Pedro Pascal and Florence Pugh's
+sustained contributions.
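
The bot filtering and count merging in Step 5 of PROMPT.md and the classification rules in Step 7 can be sketched in Python. This is a minimal illustration under stated assumptions: it is not part of the patch, and it is not necessarily how an AI assistant would carry out the prompt. The function names, the `KNOWN_BOTS` set, the repository names, and the sample numbers are all hypothetical. The only external detail relied on is that each entry in the GitHub contributors response carries `login` and `contributions` fields. Treating every PMC member as excluded, even if they are missing from the committer roster, is also an assumption. The Recent Promotions rule is left out because it depends on join dates.

```python
from collections import Counter
from typing import Optional

# Hypothetical set; the prompt only names dependabot and renovate as examples.
KNOWN_BOTS = {"dependabot", "renovate"}


def is_bot(login: str) -> bool:
    """Heuristic from Step 5: a '[bot]' suffix or a well-known bot account."""
    name = login.lower()
    return name.endswith("[bot]") or name in KNOWN_BOTS


def merge_contributors(per_repo: dict) -> Counter:
    """Sum contribution counts per login across repos, skipping bots.

    per_repo maps a repo name to a list of dicts shaped like the GitHub
    contributors response: {"login": ..., "contributions": ...}.
    """
    totals = Counter()
    for contributors in per_repo.values():
        for entry in contributors:
            if not is_bot(entry["login"]):
                totals[entry["login"]] += entry["contributions"]
    return totals


def classify(apache_id: Optional[str], committers: set, pmc: set) -> Optional[str]:
    """Apply the Step 7 rules; None means 'exclude entirely'."""
    if apache_id is None:
        # No ASF account, so the person cannot be a committer on this project.
        return "Committer Candidate"
    if apache_id in pmc:
        # Assumption: PMC members are treated as done, whatever the roster says.
        return None
    if apache_id in committers:
        return "PMC Candidate"
    return "Committer Candidate"


if __name__ == "__main__":
    # Made-up sample data, loosely echoing the fictional Toasteroven report.
    sample = {
        "toasteroven": [
            {"login": "pedropascal42", "contributions": 100},
            {"login": "dependabot[bot]", "contributions": 50},
        ],
        "toasteroven-site": [
            {"login": "pedropascal42", "contributions": 87},
        ],
    }
    committers = {"kreeves", "cblanchett"}
    pmc = {"kreeves"}
    print(merge_contributors(sample).most_common())
    print(classify("cblanchett", committers, pmc))
```

Mapping GitHub logins to Apache IDs is deliberately omitted here; in the prompt that lookup belongs to Step 6 and relies on the `search_people` and `get_person` tools.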