3 changes: 3 additions & 0 deletions .env-example
Original file line number Diff line number Diff line change
@@ -4,6 +4,9 @@ END_DATE = ""
ORGANIZATION = "organization"
REPOSITORY = "organization/repository"
START_DATE = ""
SPONSOR_INFO = "False"
LINK_TO_PROFILE = "True"
ACKNOWLEDGE_COAUTHORS = "True"

# GITHUB APP
GH_APP_ID = ""
21 changes: 12 additions & 9 deletions README.md
@@ -84,20 +84,23 @@ This action can be configured to authenticate with GitHub App Installation or Pe

#### Other Configuration Options

| field | required | default | description |
| ------------------- | ----------------------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `GH_ENTERPRISE_URL` | False | "" | The `GH_ENTERPRISE_URL` is used to connect to an enterprise server instance of GitHub. github.com users should not enter anything here. |
| `ORGANIZATION` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the GitHub organization which you want the contributor information of all repos from. ie. github.com/github would be `github` |
| `REPOSITORY` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the repository and organization which you want the contributor information from. ie. `github/contributors` or a comma separated list of multiple repositories `github/contributor,super-linter/super-linter` |
| `START_DATE` | False | Beginning of time | The date from which you want to start gathering contributor information. ie. Aug 1st, 2023 would be `2023-08-01`. |
| `END_DATE` | False | Current Date | The date at which you want to stop gathering contributor information. Must be later than the `START_DATE`. ie. Aug 2nd, 2023 would be `2023-08-02` |
| `SPONSOR_INFO` | False | False | If you want to include sponsor information in the output. This will include the sponsor count and the sponsor URL. This will impact action performance. ie. SPONSOR_INFO = "False" or SPONSOR_INFO = "True" |
| `LINK_TO_PROFILE` | False | True | If you want to link usernames to their GitHub profiles in the output. ie. LINK_TO_PROFILE = "True" or LINK_TO_PROFILE = "False" |
| field | required | default | description |
| ----------------------- | ----------------------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `GH_ENTERPRISE_URL` | False | "" | The `GH_ENTERPRISE_URL` is used to connect to an enterprise server instance of GitHub. github.com users should not enter anything here. |
| `ORGANIZATION` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the GitHub organization which you want the contributor information of all repos from. ie. github.com/github would be `github` |
| `REPOSITORY` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the repository and organization which you want the contributor information from. ie. `github/contributors` or a comma separated list of multiple repositories `github/contributor,super-linter/super-linter` |
| `START_DATE` | False | Beginning of time | The date from which you want to start gathering contributor information. ie. Aug 1st, 2023 would be `2023-08-01`. |
| `END_DATE` | False | Current Date | The date at which you want to stop gathering contributor information. Must be later than the `START_DATE`. ie. Aug 2nd, 2023 would be `2023-08-02` |
| `SPONSOR_INFO` | False | False | If you want to include sponsor information in the output. This will include the sponsor count and the sponsor URL. This will impact action performance. ie. SPONSOR_INFO = "False" or SPONSOR_INFO = "True" |
| `LINK_TO_PROFILE` | False | True | If you want to link usernames to their GitHub profiles in the output. ie. LINK_TO_PROFILE = "True" or LINK_TO_PROFILE = "False" |
| `ACKNOWLEDGE_COAUTHORS` | False | True | If you want to include co-authors from commit messages as contributors. Co-authors are identified via the `Co-authored-by:` trailer in commit messages using the GitHub noreply email format (e.g., `username@users.noreply.github.com`). This will impact action performance as it requires scanning all commits. ie. ACKNOWLEDGE_COAUTHORS = "True" or ACKNOWLEDGE_COAUTHORS = "False" |
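
The boolean options in the table are passed as the strings `"True"` or `"False"`. The sketch below shows one way such a string could be mapped to a Python `bool` with a fallback default. `parse_bool_env` is a hypothetical helper written for illustration; it is an assumption, not the action's actual `get_bool_env_var` implementation.

```python
import os


def parse_bool_env(name: str, default: bool) -> bool:
    """Hypothetical helper: read a "True"/"False" string from the environment."""
    raw = os.environ.get(name, "").strip().lower()
    if raw == "":
        # Unset or empty: fall back to the supplied default.
        return default
    return raw == "true"


os.environ["ACKNOWLEDGE_COAUTHORS"] = "False"
print(parse_bool_env("ACKNOWLEDGE_COAUTHORS", True))  # False

os.environ.pop("ACKNOWLEDGE_COAUTHORS")
print(parse_bool_env("ACKNOWLEDGE_COAUTHORS", True))  # True
```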

**Note**: If `start_date` and `end_date` are specified then the action will determine if the contributor is new. A new contributor is one that has contributed in the date range specified but not before the start date.
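
The "new contributor" rule in the note can be read as a set difference: contributors seen in the date range, minus contributors seen before the start date. The sketch below illustrates that reading only; `find_new_contributors` and the sample usernames are hypothetical and do not come from the action's code.

```python
from typing import Iterable, Set


def find_new_contributors(
    in_range: Iterable[str], before_start: Iterable[str]
) -> Set[str]:
    """Return usernames that contributed in the date range but never before it."""
    return set(in_range) - set(before_start)


in_range = ["alice", "bob", "carol"]  # contributed between start_date and end_date
before_start = ["alice"]  # contributed before start_date
print(sorted(find_new_contributors(in_range, before_start)))  # ['bob', 'carol']
```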

**Performance Note:** Using start and end dates slows the action by approximately 63x. For example, if the action takes 1.7 seconds without dates, it will take about 1 minute and 47 seconds with them.
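
The figures in the performance note can be checked with a little arithmetic. The snippet below simply multiplies the quoted baseline by the quoted slowdown factor; the variable names are illustrative.

```python
baseline_seconds = 1.7  # run time without dates, as quoted in the note
slowdown_factor = 63  # approximate slowdown, as quoted in the note

total_seconds = baseline_seconds * slowdown_factor  # roughly 107.1 seconds
minutes, seconds = divmod(round(total_seconds), 60)
print(f"{minutes} min {seconds} s")  # 1 min 47 s
```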

**Co-authors Note:** When `ACKNOWLEDGE_COAUTHORS` is enabled, the action will scan commit messages for `Co-authored-by:` trailers and include those users as contributors. Only co-authors with GitHub noreply email addresses (e.g., `username@users.noreply.github.com`) will be recognized, as this is the standard format used by GitHub for [creating commits with multiple authors](https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors).
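
As a rough illustration of the note above, the sketch below applies the same two regular expressions this change adds to `contributors.py` to a sample commit message. The function name `coauthor_usernames` and the sample message are made up for the example.

```python
import re
from typing import List

# Same patterns as in this change: the trailer line, then the noreply address.
TRAILER = re.compile(r"Co-authored-by:\s*[^<]*<([^>]+)>", re.IGNORECASE)
NOREPLY = re.compile(r"^(\d+\+)?([^@]+)@users\.noreply\.github\.com$")


def coauthor_usernames(message: str) -> List[str]:
    """Return usernames from Co-authored-by trailers that use a noreply email."""
    usernames = []
    for email in TRAILER.findall(message):
        match = NOREPLY.match(email)
        if match:
            # Group 2 is the username; group 1 is the optional numeric ID prefix.
            usernames.append(match.group(2))
    return usernames


message = """Fix date parsing

Co-authored-by: Alice <12345+alice@users.noreply.github.com>
Co-authored-by: Bob <bob@example.com>
"""
print(coauthor_usernames(message))  # ['alice']
```

Bob is skipped because his address is not in the noreply format, so no GitHub username can be derived from it.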

### Example workflows

**Be sure to change at least these values: `<YOUR_ORGANIZATION_GOES_HERE>`, `<YOUR_GITHUB_HANDLE_HERE>`**
138 changes: 135 additions & 3 deletions contributors.py
@@ -1,6 +1,7 @@
# pylint: disable=broad-exception-caught
"""This file contains the main() and other functions needed to get contributor information from the organization or repository"""

import re
from typing import List

import auth
@@ -27,6 +28,7 @@ def main():
end_date,
sponsor_info,
link_to_profile,
acknowledge_coauthors,
) = env.get_env_vars()

# Auth to GitHub.com
@@ -46,7 +48,13 @@ def main():

# Get the contributors
contributors = get_all_contributors(
organization, repository_list, start_date, end_date, github_connection, ghe
organization,
repository_list,
start_date,
end_date,
github_connection,
ghe,
acknowledge_coauthors,
)

# Check for new contributor if user provided start_date and end_date
@@ -60,6 +68,7 @@ def main():
end_date=start_date,
github_connection=github_connection,
ghe=ghe,
acknowledge_coauthors=acknowledge_coauthors,
)
for contributor in contributors:
contributor.new_contributor = contributor_stats.is_new_contributor(
@@ -103,6 +112,7 @@ def get_all_contributors(
end_date: str,
github_connection: object,
ghe: str,
acknowledge_coauthors: bool = False,
):
"""
Get all contributors from the organization or repository
@@ -113,6 +123,8 @@ def get_all_contributors(
start_date (str): The start date of the date range for the contributor list.
end_date (str): The end date of the date range for the contributor list.
github_connection (object): The authenticated GitHub connection object from PyGithub
ghe (str): The GitHub Enterprise URL to use for authentication
acknowledge_coauthors (bool): Whether to acknowledge co-authors from commit messages

Returns:
all_contributors (list): A list of ContributorStats objects
@@ -130,7 +142,9 @@ def get_all_contributors(
all_contributors = []
if repos:
for repo in repos:
repo_contributors = get_contributors(repo, start_date, end_date, ghe)
repo_contributors = get_contributors(
repo, start_date, end_date, ghe, acknowledge_coauthors
)
if repo_contributors:
all_contributors.append(repo_contributors)

@@ -140,20 +154,61 @@ def get_all_contributors(
return all_contributors


def get_contributors(repo: object, start_date: str, end_date: str, ghe: str):
def get_coauthors_from_message(commit_message: str) -> List[str]:
    """
    Extract co-author usernames from a commit message.

    Co-authored-by trailers follow the format:
        Co-authored-by: Name <email>
    Or with a GitHub username:
        Co-authored-by: Name <username@users.noreply.github.com>

    Args:
        commit_message (str): The commit message to parse

    Returns:
        List[str]: List of GitHub usernames extracted from co-author trailers
    """
    # Match Co-authored-by trailers - case insensitive
    # Format: Co-authored-by: Name <email>
    pattern = r"Co-authored-by:\s*[^<]*<([^>]+)>"
    matches = re.findall(pattern, commit_message, re.IGNORECASE)

    usernames = []
    for email in matches:
        # Check if it's a GitHub noreply email format: username@users.noreply.github.com
        noreply_pattern = r"^(\d+\+)?([^@]+)@users\.noreply\.github\.com$"
        noreply_match = re.match(noreply_pattern, email)
        if noreply_match:
            usernames.append(noreply_match.group(2))
    return usernames


def get_contributors(
    repo: object,
    start_date: str,
    end_date: str,
    ghe: str,
    acknowledge_coauthors: bool = False,
):
    """
    Get contributors from a single repository and filter by start and end dates if present.

    Args:
        repo (object): The repository object from PyGithub
        start_date (str): The start date of the date range for the contributor list.
        end_date (str): The end date of the date range for the contributor list.
        ghe (str): The GitHub Enterprise URL to use for authentication
        acknowledge_coauthors (bool): Whether to acknowledge co-authors from commit messages

    Returns:
        contributors (list): A list of ContributorStats objects
    """
    all_repo_contributors = repo.contributors()
    contributors = []
    # Track usernames already added as contributors
    contributor_usernames = set()

    try:
        for user in all_repo_contributors:
            # Ignore contributors with [bot] in their name
@@ -187,6 +242,15 @@ def get_contributors(repo: object, start_date: str, end_date: str, ghe: str):
"",
)
contributors.append(contributor)
contributor_usernames.add(user.login)

# Get co-authors from commit messages if enabled
if acknowledge_coauthors:
coauthor_contributors = get_coauthor_contributors(
repo, start_date, end_date, ghe, contributor_usernames
)
contributors.extend(coauthor_contributors)

except Exception as e:
print(f"Error getting contributors for repository: {repo.full_name}")
print(e)
@@ -195,5 +259,73 @@ def get_contributors(repo: object, start_date: str, end_date: str, ghe: str):
return contributors


def get_coauthor_contributors(
    repo: object,
    start_date: str,
    end_date: str,
    ghe: str,
    existing_usernames: set,
) -> List[contributor_stats.ContributorStats]:
    """
    Get contributors who were co-authors on commits in the repository.

    Args:
        repo (object): The repository object
        start_date (str): The start date of the date range for the contributor list.
        end_date (str): The end date of the date range for the contributor list.
        ghe (str): The GitHub Enterprise URL
        existing_usernames (set): Set of usernames already added as contributors

    Returns:
        List[ContributorStats]: A list of ContributorStats objects for co-authors
    """
    coauthor_counts: dict = {}  # username -> count
    endpoint = ghe if ghe else "https://github.com"

    try:
        # Get all commits in the date range
        if start_date and end_date:
            commits = repo.commits(since=start_date, until=end_date)
        else:
            commits = repo.commits()

        for commit in commits:
            # Get commit message from the commit object
            commit_message = commit.commit.message if commit.commit else ""
            if not commit_message:
                continue

            # Extract co-authors from commit message
            coauthors = get_coauthors_from_message(commit_message)
            for username in coauthors:
                if username not in existing_usernames:
                    coauthor_counts[username] = coauthor_counts.get(username, 0) + 1

    except Exception as e:
        print(f"Error getting co-authors for repository: {repo.full_name}")
        print(e)
        return []

    # Create ContributorStats objects for co-authors
    coauthor_contributors = []
    for username, count in coauthor_counts.items():
        if start_date and end_date:
            commit_url = f"{endpoint}/{repo.full_name}/commits?author={username}&since={start_date}&until={end_date}"
        else:
            commit_url = f"{endpoint}/{repo.full_name}/commits?author={username}"

        contributor = contributor_stats.ContributorStats(
            username,
            False,
            "",  # No avatar URL available for co-authors
            count,
            commit_url,
            "",
        )
        coauthor_contributors.append(contributor)

    return coauthor_contributors


if __name__ == "__main__":
main()
4 changes: 4 additions & 0 deletions env.py
@@ -85,6 +85,7 @@ def get_env_vars(
str,
bool,
bool,
bool,
]:
"""
Get the environment variables for use in the action.
@@ -105,6 +106,7 @@ def get_env_vars(
end_date (str): The end date to get contributor information to.
sponsor_info (bool): Whether to get sponsor information on the contributor
link_to_profile (bool): Whether to link username to GitHub profile in markdown output
acknowledge_coauthors (bool): Whether to acknowledge co-authors from commit messages
"""

if not test:
@@ -145,6 +147,7 @@ def get_env_vars(

sponsor_info = get_bool_env_var("SPONSOR_INFO", False)
link_to_profile = get_bool_env_var("LINK_TO_PROFILE", False)
acknowledge_coauthors = get_bool_env_var("ACKNOWLEDGE_COAUTHORS", True)

# Separate repositories_str into a list based on the comma separator
repositories_list = []
@@ -166,4 +169,5 @@ def get_env_vars(
end_date,
sponsor_info,
link_to_profile,
acknowledge_coauthors,
)