Scoring Algorithm #9

Open
Arrowbox opened this issue Apr 22, 2020 · 0 comments

Comments

@Arrowbox
Owner

The scoring algorithm is probably the next big feature for this. Right now it goes like:

Score = Lines

That's nice but there are a few other factors I'd like to consider and use for ordering contributors.

Factors

  • Lines: This is definitely the first order approximation. If someone wrote 95 out of 100 lines, they are probably familiar with how it works.
  • Commits: Ideally, contributing many commits correlates with some understanding over time.
  • Most recent commit date: Writing 100 lines a year ago is not as useful as writing 10 lines yesterday.
  • Ownership of specific lines: If a specific set of lines has been requested for analysis (for example, by a pull-request bot that suggests reviewers based on the lines that changed), then contributors to those lines are of particular interest.
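
To make the "Lines" and "Ownership of specific lines" factors concrete, here is a minimal sketch in Python. It assumes blame data is already available as a plain list with one author name per line of the file; the function name `lines_by_author` and the `start`/`end` parameters are hypothetical and not part of this project.

```python
from collections import Counter

def lines_by_author(blame, start=None, end=None):
    """Count lines per author.

    `blame` is assumed to hold one author name per file line (hypothetical input).
    `start` and `end` are optional 1-indexed, inclusive bounds for a requested range.
    """
    counts = Counter()
    for line_no, author in enumerate(blame, start=1):
        if start is not None and line_no < start:
            continue
        if end is not None and line_no > end:
            continue
        counts[author] += 1
    return counts

# Made-up blame data for a six-line file.
blame = ["alice", "alice", "bob", "alice", "bob", "bob"]
whole_file = lines_by_author(blame)               # counts over the whole file
changed = lines_by_author(blame, start=3, end=5)  # counts over lines 3..5 only
```

With a restricted range, the same counting gives the contributors for just the requested lines, which could then be scored separately or weighted more heavily.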

Roughly I'd like to implement a score closer to:

S = W(Lines)*(Lines/Total Lines) + W(Commits)*(Commits/Total Commits) + W(Date) * ((1/2)^((Today-Date)/Halflife))

The last term looks complicated but is just a half-life based exponential decay. All of the terms take the form of weight * normalized metric.
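
A rough sketch of the formula above in Python, to show how the three terms combine. The function names, the default weights of 1.0, and the 180-day half-life are all placeholder assumptions, not decided values. The sketch also assumes the totals are non-zero.

```python
from datetime import date, timedelta

def decay(last_commit, today, half_life_days):
    """Half-life decay: 1.0 for a commit today, 0.5 after one half-life."""
    age_days = (today - last_commit).days
    return 0.5 ** (age_days / half_life_days)

def score(lines, total_lines, commits, total_commits, last_commit, today,
          w_lines=1.0, w_commits=1.0, w_date=1.0, half_life_days=180.0):
    """Weighted sum of normalized metrics (weights and half-life are assumptions).

    Does not guard against total_lines or total_commits being zero.
    """
    return (w_lines * lines / total_lines
            + w_commits * commits / total_commits
            + w_date * decay(last_commit, today, half_life_days))

# Made-up contributor: half the lines, half the commits,
# last commit exactly one half-life ago.
today = date(2020, 4, 22)
example = score(50, 100, 5, 10, today - timedelta(days=180), today)
```

Each term stays between 0 and 1 before weighting, so the weights alone control how much lines, commits, and recency contribute to the ordering.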

Test Cases

I'm going to break out a few test cases to see how the weighting might look.
Google Spreadsheet
