Create mapping on Jira user id -> GitHub account #3

Closed
mocobeta opened this issue Jun 29, 2022 · 11 comments

@mocobeta
Contributor

To correctly map Jira user ids in issues (reporter/assignee/author) to GitHub accounts, we need an account mapping file.
This could be inferred from https://github.com/orgs/apache/people?

@mocobeta
Contributor Author

mocobeta commented Jul 10, 2022

Maybe we could take a brute force approach:

  1. List all GitHub users or Search Users by their full names
  2. Associate Jira usernames with GitHub accounts by comparing "display" names (in other words, "full name") [1]

[1] E-mail address could be another clue, but it's not publicly exposed in Jira as far as I know, and it's an optional field in GitHub. I would rather rely on users' full names than e-mails, although names can be ambiguous (if multiple accounts share the same full name, the correct account can be identified manually).
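
For illustration, the name-matching step could look roughly like the sketch below. It assumes the Jira and GitHub user lists have already been fetched into plain dicts; the function name `candidate_mapping` and the input shapes are hypothetical and not taken from the actual migration scripts. Names shared by several GitHub accounts are kept apart so they can be resolved by hand.

```python
from collections import defaultdict


def candidate_mapping(jira_users, github_users):
    """Match Jira ids to GitHub logins by exact full-name equality.

    jira_users:   dict of Jira id -> display name (assumed input shape)
    github_users: dict of GitHub login -> full name, or None if unset
    Returns (unique, ambiguous): unique maps a Jira id to one login,
    ambiguous maps a Jira id to a sorted list of several logins.
    """
    # Index GitHub logins by full name; accounts without a name are skipped.
    by_name = defaultdict(list)
    for login, name in github_users.items():
        if name:
            by_name[name].append(login)

    unique, ambiguous = {}, {}
    for jira_id, display_name in jira_users.items():
        logins = by_name.get(display_name, [])
        if len(logins) == 1:
            unique[jira_id] = logins[0]
        elif len(logins) > 1:
            # Several accounts share this name: leave it for manual review.
            ambiguous[jira_id] = sorted(logins)
    return unique, ambiguous
```

Jira users with no matching GitHub name simply do not appear in either result.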

[Screenshot: Jira profile]

[Screenshot: GitHub profile]

@mocobeta mocobeta removed the help wanted Extra attention is needed label Jul 10, 2022
@mocobeta mocobeta self-assigned this Jul 10, 2022
@mocobeta
Contributor Author

mocobeta commented Jul 15, 2022

Tasks to be done:

  • regenerate a candidate mapping (on July 24th)
  • manually make a "verified" mapping and commit it to main (on July 24th or 25th)
  • send a mail to the dev list to let others browse/check both "candidate" and "verified" mappings (on July 25th)
  • accept pull requests to add/edit the mapping
  • fix the final mapping (on August 7th)

@mikemccand
Member

Could we maybe look for Jira issues that have GitHub PRs attached and "correlate" the ids of who opened the PR against who commented on the issue?

It would clearly not be perfect, but it could provide input for a human to sift through and carry over some verified accounts.

@mocobeta
Contributor Author

We already include merged pull requests' authors (if their GitHub full names are set to the same string as Jira full names).
Maybe we could also consider all opened pull requests' authors.

@mikemccand
Member

> We already include merged pull requests' authors (if their GitHub full names are set to the same string as Jira full names).
> Maybe we could also consider all opened pull requests' authors.

OK thanks, but does this only work for committers?

I was thinking if a contributor who is not a committer comments on a Jira issue and also opens a PR, linked to the issue, we could maybe correlate those two events to speculate about ID mapping. And then verify by hand after.
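
A rough sketch of that correlation idea, assuming each Jira issue has already been reduced to the set of Jira ids that commented on it and the set of GitHub logins that opened PRs linked to it. The field names `jira_commenters` and `pr_authors`, and the `min_count` threshold, are hypothetical. The output is only a list of suggestions to verify by hand.

```python
from collections import Counter


def correlate(issues):
    """Count how often a Jira commenter and a PR author share an issue.

    issues: iterable of dicts with 'jira_commenters' (set of Jira ids)
            and 'pr_authors' (set of GitHub logins); assumed input shape.
    """
    pair_counts = Counter()
    for issue in issues:
        for jira_id in issue["jira_commenters"]:
            for login in issue["pr_authors"]:
                pair_counts[(jira_id, login)] += 1
    return pair_counts


def suggestions(pair_counts, min_count=2):
    """Keep, per Jira id, the most frequent login seen at least min_count times."""
    best = {}
    for (jira_id, login), n in pair_counts.items():
        if n >= min_count and n > best.get(jira_id, ("", 0))[1]:
            best[jira_id] = (login, n)
    return best
```

A single co-occurrence is weak evidence (anyone can comment on an issue someone else fixes), which is why the sketch requires repeated co-occurrence before suggesting a pair.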

@mocobeta
Contributor Author

"Authors" are not necessarily committers; they are literally pull request authors (contributors).
For example apache/lucene@2cf12b8

@mocobeta
Contributor Author

mocobeta commented Jul 26, 2022

Properly speaking, the current "verified" account mapping includes both committers and commit authors. "commit authors" can be committers or contributors.

4. Verify the candidate GitHub accounts by checking if (1) the GitHub account has push access to apache/lucene repository, or (2) the GitHub account has been logged as commit author in the repo's commit history at least once.

Here, (1) means committers and (2) means committers or contributors.
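
The two checks could be expressed roughly as follows. This is a minimal sketch that assumes the set of logins with push access and the set of logins seen as commit authors have been collected beforehand; how they are collected is out of scope here, and `verify` and its parameters are hypothetical names.

```python
def verify(candidates, push_access_logins, commit_author_logins):
    """Keep only candidates that pass check (1) or check (2).

    candidates:            dict of Jira id -> candidate GitHub login
    push_access_logins:    set of logins with push access (assumed precomputed)
    commit_author_logins:  set of logins found as commit authors (assumed precomputed)
    Returns dict of Jira id -> (login, reason).
    """
    verified = {}
    for jira_id, login in candidates.items():
        if login in push_access_logins:
            verified[jira_id] = (login, "push_access")      # check (1)
        elif login in commit_author_logins:
            verified[jira_id] = (login, "commit_author")    # check (2)
    return verified
```

Candidates that pass neither check are dropped from the result and stay unverified.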

@mikemccand
Member

OK got it.

Could we expand the matching so that if the userid in jira == the userid in GitHub we strongly suggest a match? E.g. mdmarshmallow would have been matched this way.

Hmm, actually, his presented name (Marc D'mello) looks the same in GitHub and Jira. Oh, wait, no! One is Marc D'mello and the other is Marc D'Mello (m vs M). Maybe we can do a case insensitive comparison?

But I'll push his account to the verified file separately.
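
Both suggestions (case-insensitive name comparison and matching on identical user ids) could be sketched like this. The helper names and the returned labels are hypothetical; an id-only match is reported separately so it can be treated as a weaker hint.

```python
def normalize(name):
    """Collapse whitespace and fold case so "D'mello" equals "D'Mello"."""
    return " ".join(name.split()).casefold()


def match_kind(jira_id, jira_name, gh_login, gh_name):
    """Return "name" for a case-insensitive full-name match,
    "id" when only the user ids match, and None otherwise."""
    if gh_name and normalize(jira_name) == normalize(gh_name):
        return "name"
    if jira_id.casefold() == gh_login.casefold():
        return "id"
    return None
```

With the example from this thread, `match_kind("mdmarshmallow", "Marc D'mello", "mdmarshmallow", "Marc D'Mello")` returns `"name"`, whichever side actually uses the lowercase spelling.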

@mocobeta
Contributor Author

mocobeta commented Jul 26, 2022

> Could we expand the matching so that if the userid in jira == the userid in GitHub we strongly suggest a match? E.g. mdmarshmallow would have been matched this way.

It'd be easy to pick up such candidates, but I think we'd need to manually verify all of them if there are no clues other than the id strings.

@mocobeta
Contributor Author

I'll try to improve the candidate generation and verification steps, maybe next week.

@mocobeta
Contributor Author

mocobeta commented Aug 6, 2022

I'm closing this, but we'll accept improvements on mapping until the actual migration.

@mocobeta mocobeta closed this as completed Aug 6, 2022