Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev-cmd/contributions: Use GitHub APIs for commit author info #14737

Merged
merged 3 commits into from Feb 22, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
24 changes: 11 additions & 13 deletions Library/Homebrew/dev-cmd/contributions.rb
Expand Up @@ -19,11 +19,11 @@ module Homebrew
sig { returns(CLI::Parser) }
def contributions_args
Homebrew::CLI::Parser.new do
usage_banner "`contributions` <email|name> [<--repositories>`=`] [<--csv>]"
usage_banner "`contributions` <email|username> [<--repositories>`=`] [<--csv>]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should only ever take the username for consistency?

Copy link
Member Author

@issyl0 issyl0 Feb 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an open question about this and backwards compatibility at #14737 (comment). I think until we have trailers implemented at API level (or some querying of our own to filter to get them), we have to support both? Especially if we're doing #14722 that queries by name because not everyone uses their GitHub username in their Git committer attributes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I guess the problem I'm hoping to figure out a solution to (even if a manual one) is: how do I combine the results from the various different outputs into a cohesive one without double counting?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or some querying of our own to filter to get them

Can we ask the API for email addresses associated with a user name and then use those to filter git log?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or some querying of our own to filter to get them

Can we ask the API for email addresses associated with a user name and then use those to filter git log?

Nope, only the authenticated user can get their own "private" email addresses: https://docs.github.com/en/rest/users/emails?apiVersion=2022-11-28#list-email-addresses-for-the-authenticated-user. Which makes sense.

I suppose what we could do is get all the commits for an author, slurp the email addresses and then, discounting the commits we already have, scan again for coauthors and signoffs ourselves? But that feels like a lot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but there's only ever one "primary" email address. 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I guess the problem I'm hoping to figure out a solution to (even if a manual one) is: how do I combine the results from the various different outputs into a cohesive one without double counting?

To perhaps be even more explicit (and this addresses backwards compatibility too): my understanding is the goal of this command is to be able to have me, and ideally any maintainer, see which maintainers are likely to be asked to step down due to inactivity.

Personally, I think that means:

  • there's no need for any backwards compatibility
  • a single path and fewer options are better, even if the output is inferior and if we need e.g. to run against both email and username and check both: this tool should be doing the aggregation and not the user

Given this: I think it'd make sense for something like a bare brew contributions (i.e. the default options) to spit out the contributions for:

  • all current maintainers
  • over the last year
  • using their username and/or email
  • on primary repositories, grouped by repo
  • sorted from most to least active

@issyl0 I can privately share the spreadsheet I filled in manually to calculate this last time if it helps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that'd be useful. This definitely needs work. Sorry for the back and forth/confusion/diversions. The last "makes sense for" gives me more to work with, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@issyl0 Absolutely no problem on back-and-forth here! I've not been terribly clear/explicit here and wasn't sure if we were agreed on the goals here!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna post that comment in the main tracking isue and merge this as-is, since it's useful (and more accurate) than the previous implementation (for a GitHub username). Then I'll iterate on the maintainers stuff more soon.

description <<~EOS
Contributions to Homebrew repos for a user.

The first argument is a name (e.g. "BrewTestBot") or an email address (e.g. "brewtestbot@brew.sh").
The first argument is a GitHub username (e.g. "BrewTestBot") or an email address (e.g. "brewtestbot@brew.sh").
EOS

comma_array "--repositories",
Expand Down Expand Up @@ -65,13 +65,20 @@ def contributions
end

repo_path = find_repo_path_for_repo(repo)
tap = Tap.fetch("homebrew", repo)
unless repo_path.exist?
opoo "Repository #{repo} not yet tapped! Tapping it now..."
Tap.fetch("homebrew", repo).install
tap.install
end

repo_full_name = if repo == "brew"
"homebrew/brew"
else
tap.full_name
end

results[repo] = {
commits: git_log_author_cmd(T.must(repo_path), args),
commits: GitHub.repo_commit_count_for_user(repo_full_name, args.named.first),
coauthorships: git_log_trailers_cmd(T.must(repo_path), "Co-authored-by", args),
signoffs: git_log_trailers_cmd(T.must(repo_path), "Signed-off-by", args),
}
Expand Down Expand Up @@ -127,15 +134,6 @@ def total(results)
.sum(&:sum) # 956
end

sig { params(repo_path: Pathname, args: Homebrew::CLI::Args).returns(Integer) }
def git_log_author_cmd(repo_path, args)
cmd = ["git", "-C", repo_path, "log", "--oneline", "--author=#{args.named.first}"]
cmd << "--before=#{args.to}" if args.to
cmd << "--after=#{args.from}" if args.from

Utils.safe_popen_read(*cmd).lines.count
end

sig { params(repo_path: Pathname, trailer: String, args: Homebrew::CLI::Args).returns(Integer) }
def git_log_trailers_cmd(repo_path, trailer, args)
cmd = ["git", "-C", repo_path, "log", "--oneline"]
Expand Down
10 changes: 10 additions & 0 deletions Library/Homebrew/utils/github.rb
Expand Up @@ -699,4 +699,14 @@ def multiple_short_commits_exist?(user, repo, commit)

output[/^Status: (200)/, 1] != "200"
end

def repo_commit_count_for_user(nwo, user)
return if Homebrew::EnvConfig.no_github_api?

commits = 0
API.paginate_rest("#{API_URL}/repos/#{nwo}/commits", additional_query_params: "author=#{user}") do |result|
commits += result.length
end
commits
end
end
4 changes: 2 additions & 2 deletions Library/Homebrew/utils/github/api.rb
Expand Up @@ -253,9 +253,9 @@ def open_rest(url, data: nil, data_binary_path: nil, request_method: nil, scopes
end
end

def paginate_rest(url, per_page: 100)
def paginate_rest(url, additional_query_params: nil, per_page: 100)
(1..API_MAX_PAGES).each do |page|
result = API.open_rest("#{url}?per_page=#{per_page}&page=#{page}")
result = API.open_rest("#{url}?per_page=#{per_page}&page=#{page}&#{additional_query_params}")
yield(result, page)
end
end
Expand Down