Create tool for getting DCO sign off emails #34

Merged
dblock merged 1 commit into opensearch-project:main from the dco-csv branch on Dec 20, 2022

andrross
Member

@andrross andrross commented Nov 11, 2022

This tool digs through commit messages and parses out names and email addresses from the Signed-off-by: tags, collects all unique email addresses and outputs a CSV of name/email pairs.

Example usage:

./bin/project contributors --from=2022-11-08 --to=2022-11-09 --repo=OpenSearch dco-csv
Andrew Ross,andrross@amazon.com
Andriy Redko,andriy.redko@aiven.io
dblock,dblock@dblock.org
Marc Handalian,handalm@amazon.com
Owais Kazi,owaiskazi19@gmail.com
Rabi Panda,adnapibar@gmail.com
Xiao Cui,constantine124@gmail.com
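
The parsing described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `SIGN_OFF` pattern and the `dco_signers` and `dco_csv` helpers are hypothetical names, and the sketch assumes the commit messages are already available as plain strings.

```ruby
require 'set'

# Hypothetical pattern (an assumption, not taken from the PR): matches one
# "Signed-off-by: Name <email>" trailer per line.
SIGN_OFF = /^Signed-off-by:[ \t]*(?<name>[^<\n]+?)[ \t]*<(?<email>[^>\n]+)>[ \t]*$/

# Returns [name, email] pairs, one per unique email address,
# sorted case-insensitively by name.
def dco_signers(messages)
  seen = Set.new
  pairs = []
  messages.each do |message|
    message.scan(SIGN_OFF) do |name, email|
      # Set#add? returns nil when the email was already recorded,
      # so only the first name seen for an address is kept.
      pairs << [name, email] if seen.add?(email.downcase)
    end
  end
  pairs.sort_by { |name, _email| name.downcase }
end

# Renders the pairs as simple "name,email" lines (no CSV quoting).
def dco_csv(messages)
  dco_signers(messages).map { |name, email| "#{name},#{email}" }.join("\n")
end

messages = [
  "Fix flaky test\n\nSigned-off-by: Andrew Ross <andrross@amazon.com>\n",
  "Update docs\n\nSigned-off-by: dblock <dblock@dblock.org>\n"
]
puts dco_csv(messages)
```

Deduplicating on the lowercased email address is a design choice made for this sketch; names containing commas would need real CSV quoting.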

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@andrross
Member Author

This really needs tests. I'm a Ruby newbie, so I haven't waded into figuring out how all the spec stuff works yet...

@andrross andrross force-pushed the dco-csv branch 3 times, most recently from 6ce2d02 to 43d44a6 on November 11, 2022 21:41
Member

@dblock dblock left a comment

See below + add to README.

I haven't been good at writing tests here, but yes you should. With some suggestions below it should be pretty easy. At least add pending tests for where you'd intend to write them.

org = GitHub::Organization.new(options)
puts org.commits(options).unique_dco_signers_csv
end
end
Member

There's an issue for exporting CSV in a generic way, #18.

How about we change this to be called "emails" and output something non-structured, then implement the CSV thing generically?

Member Author

I'm still looking at what it would take to wire in more generic CSV support.

end

def unique_dco_signers_csv
# Create an association list of all name->email pairs, e.g:
Member

@dblock dblock Nov 15, 2022

Adopt the @var ||= begin syntax to avoid re-parsing/sorting.

  • Create a class Signer that represents a single DCO signer with name and multiple emails.
  • Create a class Signers that represents a collection of Signer and has a class method that implements the logic here to create it.
  • Rename this method to signers that returns Signers.
  • Do sorting in Signers.sort!.
  • Transform to text elsewhere, e.g. Signers#to_s.

This is what this could look like:

signers_list.each do |name, email|
  # Key by name so each signer accumulates a set of emails.
  signers[name] ||= Set.new
  signers[name] << email
end

or even

signers.add(name, email)
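
Taken together, the suggestions above could look roughly like the sketch below. The class names `Signer` and `Signers` come from the review comment; everything else (the `from_pairs` class method, the method bodies, keeping a sorted view instead of a `sort!` method) is an assumption made for illustration and not the code that was merged.

```ruby
require 'set'

# A single DCO signer: one name, possibly several email addresses.
class Signer
  attr_reader :name, :emails

  def initialize(name)
    @name = name
    @emails = Set.new
  end

  # One "name,email" line per address.
  def to_s
    emails.sort.map { |email| "#{name},#{email}" }.join("\n")
  end
end

# A collection of Signer objects, iterated in name order.
class Signers
  include Enumerable

  # Hypothetical factory: builds the collection from [name, email] pairs.
  def self.from_pairs(pairs)
    pairs.each_with_object(new) { |(name, email), signers| signers.add(name, email) }
  end

  def initialize
    @signers = {}
  end

  def add(name, email)
    (@signers[name] ||= Signer.new(name)).emails << email
    @sorted = nil # invalidate the memoized ordering
    self
  end

  def each(&block)
    sorted.each(&block)
  end

  def to_s
    map(&:to_s).join("\n")
  end

  private

  # Memoized with `@var ||= begin ... end` so repeated reads do not re-sort.
  def sorted
    @sorted ||= begin
      @signers.values.sort_by { |signer| signer.name.downcase }
    end
  end
end

signers = Signers.from_pairs([['dblock', 'dblock@dblock.org'],
                              ['Andrew Ross', 'andrross@amazon.com']])
puts signers
```

Keeping the text rendering in `to_s` leaves room for a generic CSV exporter to consume the same `Signers` collection later.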

@dblock
Member

dblock commented Nov 15, 2022

Btw, I had in the past written and used https://github.com/dblock/fue which is a way to dig through commits for emails in another way than DCO.

@andrross
Member Author

> Btw, I had in the past written and used https://github.com/dblock/fue which is a way to dig through commits for emails in another way than DCO.

After parsing through a lot of this data, the DCO sign-off proved to be the highest-quality source for email addresses. The email address in the commit metadata itself was frequently something like 18663532+xxx@users.noreply.github.com, whereas the DCO sign-off in the commit message would be a real email address.
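
To illustrate the point about noreply addresses, here is a small sketch of preferring the sign-off address over the commit metadata. The `NOREPLY` pattern and the `noreply?` and `contact_email` helpers are hypothetical and not part of this PR; the sketch assumes GitHub privacy addresses end in `@users.noreply.github.com`, as in the example above.

```ruby
# Assumption: GitHub privacy addresses end in @users.noreply.github.com.
NOREPLY = /@users\.noreply\.github\.com\z/i

def noreply?(email)
  NOREPLY.match?(email)
end

# Hypothetical helper: prefer the DCO sign-off address and fall back to the
# commit author address only when that one is not a noreply placeholder.
# Returns nil when no usable address is available.
def contact_email(author_email:, signoff_email: nil)
  return signoff_email if signoff_email && !noreply?(signoff_email)
  return author_email unless noreply?(author_email)

  nil
end

puts contact_email(
  author_email: '18663532+xxx@users.noreply.github.com',
  signoff_email: 'andrross@amazon.com'
)
```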

@dblock
Member

dblock commented Dec 1, 2022

@andrross Want to finish this?

@andrross
Member Author

andrross commented Dec 1, 2022

@dblock Yes I will finish this.

@andrross andrross force-pushed the dco-csv branch 3 times, most recently from 72531b1 to da041e1 on December 2, 2022 22:03
Member

@dblock dblock left a comment

Looks good! Add to README and let's merge?

This tool digs through commit messages and parses out names and email
addresses from the `Signed-off-by:` tags, collects all unique email
addresses and outputs a CSV of name/email pairs.

Signed-off-by: Andrew Ross <andrross@amazon.com>
@andrross
Member Author

@dblock This is ready to review again. Added the README section plus another unit test.

@dblock dblock merged commit 3b3cb8b into opensearch-project:main Dec 20, 2022
@dblock
Member

dblock commented Dec 20, 2022

Nice work, welcome to Ruby!
