@astansler
Member

Add calculation of new facts:

  • Average number of lines per commit (used for a fun fact)
  • Number of commits in the repo (used to calculate the previous fun fact across all of the user's repos)
  • Average line length (used for a fun fact)
  • Number of lines of code across all commits (used to calculate the previous fun fact across all of the user's repos)
  • Number of commits per number of lines (used for the histogram in the "Did you know" section)

@astansler astansler self-assigned this Nov 6, 2017
@astansler astansler requested a review from asurkov November 6, 2017 16:50
}
new.numLinesDeleted = new.diffs.fold(0) { total, file ->
    total + file.getAllDeleted().size
}
Contributor

Why did you move this code? I find it reasonable to keep it in CommitHasher, which is all about commit analysis and processing.

Member Author

Because these are general commit stats that could be used in multiple places, e.g. in FactHasher.

Contributor

I see the point
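
For illustration, a minimal sketch of commit-level stats that any hasher could reuse. `DiffFile`, `CommitStats`, and `getAllAdded()` are hypothetical stand-ins, not the project's actual classes; only `getAllDeleted()`, `numLinesAdded`, and `numLinesDeleted` appear in the diff above.

```kotlin
// Hypothetical stand-in for the project's per-file diff type.
class DiffFile(
    private val added: List<String>,
    private val deleted: List<String>
) {
    fun getAllAdded(): List<String> = added      // assumed counterpart of getAllDeleted()
    fun getAllDeleted(): List<String> = deleted
}

// Hypothetical commit model that owns its own line statistics,
// so several hashers can read them without recomputing.
class CommitStats(val diffs: List<DiffFile>) {
    // Total lines added across all files of the commit.
    val numLinesAdded: Int = diffs.fold(0) { total, file ->
        total + file.getAllAdded().size
    }

    // Total lines deleted across all files of the commit.
    val numLinesDeleted: Int = diffs.fold(0) { total, file ->
        total + file.getAllDeleted().size
    }
}
```

Computing the totals once on the commit keeps the fold logic in a single place, which seems to be the point of the move.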

}

// RepoTeamSize.
fsRepoTeamSize.add(email)
Contributor

It is funny to have teamSize as storage for team emails; fsRepoTeam would be a more reasonable name.

Member Author

Actually, I'll make it simpler to calculate.
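
One way the naming concern could be addressed, sketched under assumptions: keep the emails in a set named for what it stores and derive the size from it. `RepoTeam` and `addAuthor` are hypothetical names; the simplification actually made in the PR is not shown in this thread.

```kotlin
// Hypothetical holder; fsRepoTeam follows the name suggested in the review.
class RepoTeam {
    // Unique author emails seen in the repo.
    private val fsRepoTeam = hashSetOf<String>()

    fun addAuthor(email: String) {
        fsRepoTeam.add(email)
    }

    // Team size is derived from the set rather than stored separately.
    val teamSize: Int
        get() = fsRepoTeam.size
}
```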


// Commits.
val numCommits = fsCommitNum[email]!! + 1
val numLinesCurrent = commit.numLinesAdded - commit.numLinesDeleted
Contributor

I don't understand this metric: if a user only removes lines, is the average negative? If the user always adds the same number of lines as they remove, is the average 0?

Anyway, it would be good to at least document a metric where it is defined.

Member Author

This should be the sum of added and deleted lines.
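
A small sketch of the difference between the two definitions discussed here. The function names are hypothetical; only `numLinesAdded` and `numLinesDeleted` come from the diff.

```kotlin
// Net change: zero or negative for commits that delete as much as they add.
fun netLines(numLinesAdded: Int, numLinesDeleted: Int): Int =
    numLinesAdded - numLinesDeleted

// Total changed lines: every touched line counts, so the value is never negative.
fun changedLines(numLinesAdded: Int, numLinesDeleted: Int): Int =
    numLinesAdded + numLinesDeleted
```

For a commit that only deletes 10 lines, `netLines(0, 10)` is -10 while `changedLines(0, 10)` is 10, so only the sum behaves sensibly as a "lines per commit" size.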

fsLineNum[email] = fsLineNum[email]!! + addedLines.size

fsLinesPerCommits[email]!![numCommits - 1] =
    fsLinesPerCommits[email]!![numCommits - 1] + addedLines.size
Contributor

You don't need to use !! twice, right?

Member Author

Smart cast didn't work in this case. Using += worked, so it's not a problem anymore.
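
A sketch of why `+=` helps here. A map lookup is a function call, so the compiler cannot smart-cast its result to non-null between two separate lookups; a compound assignment needs only one lookup expression. `addLines` and its parameter names are hypothetical, while the map shape mirrors `fsLinesPerCommits` from the diff.

```kotlin
fun addLines(
    fsLinesPerCommits: MutableMap<String, Array<Int>>,
    email: String,
    commitIndex: Int,
    numLines: Int
) {
    // Long form, with the lookup and !! written on both sides:
    // fsLinesPerCommits[email]!![commitIndex] =
    //     fsLinesPerCommits[email]!![commitIndex] + numLines

    // Compound assignment on the indexed element: one lookup, one !!.
    fsLinesPerCommits[email]!![commitIndex] += numLines
}
```

Both forms still throw if the email is missing from the map, so the entry has to be initialized for every author beforehand.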

/ultimate/ideaSDK
/ultimate/out
/ultimate/tmp
src/main/resources/data/models/
Contributor

The change seems unrelated. Why is it here?

Member Author

The models were deleted from resources, but we forgot to remove them from .gitignore.

Contributor

got it

}
fsLineNum[email] = fsLineNum[email]!! + addedLines.size

fsLinesPerCommits[email]!![numCommits - 1] =
Contributor

+=

Member Author

Ok

fsCommitNum.put(author, 0)
fsLineLenAvg.put(author, 0.0)
fsLineNum.put(author, 0)
fsLinesPerCommits.put(author, Array(rehashes.size) {0})
Contributor

That array may hold as many as half a million items, for example in the case of the gecko-dev repo. Are you sure about keeping all of this in memory? Can you do the bin computations on the go, as addCommitsPerLinesFacts does?

Member Author

Good idea, I've added a TODO for a future implementation.
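
The on-the-go binning suggested here could look roughly like the sketch below. It assumes each commit's total line count is known when the commit is processed; `LinesPerCommitHistogram` is a hypothetical class, not the eventual implementation behind the TODO.

```kotlin
// Hypothetical accumulator: maps "lines in a commit" to "number of such commits".
class LinesPerCommitHistogram {
    private val commitsByLineCount = hashMapOf<Int, Int>()

    // Call once per commit, as soon as its total line count is known.
    fun addCommit(numLines: Int) {
        commitsByLineCount[numLines] = (commitsByLineCount[numLines] ?: 0) + 1
    }

    // Read-only snapshot of the bins collected so far.
    fun toMap(): Map<Int, Int> = commitsByLineCount.toMap()
}
```

Memory then grows with the number of distinct line counts rather than with the number of commits, which is the concern raised for very large repos.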

}
}

private fun calcIncAvg(prev: Double, element: Double, count: Long):
Contributor

Please add a comment that makes clear what the arguments are.

Member Author

Ok

}

private fun calcIncAvg(prev: Double, element: Double, count: Long):
Double {
Contributor

Btw, I find the style where arguments are on new lines more readable than putting the return type on a new line.

Member Author

Nice idea, agree.
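
Putting both requests together (documented arguments, one argument per line), the function might look like the sketch below. The body is not shown in this conversation, so the standard incremental-mean update is an assumption, as is the meaning of `count` (taken here to include the new element). The `private` modifier from the diff is dropped so the snippet stands alone.

```kotlin
/**
 * Computes the running average of a numerical sequence.
 *
 * @param prev average of the elements seen before this call
 * @param element new element to fold into the average
 * @param count number of elements including [element]; assumed to be >= 1
 * @return average of all [count] elements
 */
fun calcIncAvg(
    prev: Double,
    element: Double,
    count: Long
): Double {
    // Assumed formula: shift the previous mean toward the new element by 1/count.
    return prev + (element - prev) / count
}
```

Feeding the sequence 2, 4, 9 one element at a time yields 2.0, then 3.0, then 5.0, without storing the elements themselves.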

val COMMITS_NUM = 9
// Used for number of commits per number of lines in a commit histogram.
// Key should be number of lines. Value number of commits.
val COMMITS_NUM_PER_LINE_NUM = 12
Contributor

I find the name a bit confusing. Perhaps replacing 'per' with 'to' would make it nicer? For example, COMMIT_COUNT_TO_LINE_COUNT. Also COMMITS -> COMMIT.

Member Author

COMMIT_NUM_TO_LINE_NUM is shorter

}

/**
* Used for incremental calculation of average of sequence.
Contributor

maybe: Computes the average of a numerical sequence

val COMMITS_LINE_NUM_AVG = 8
val COMMITS_NUM = 9
// A map of line numbers to commits number. Used in a commit histogram.
val COMMITS_NUM_TO_LINE_NUM = 12
Contributor

I'm sort of concerned that commits_num has the plural form, but line_num has the singular form.

Member Author

Fix

val LINE_LONGEVITY = 3
val LINE_LONGEVITY_REPO = 4
val LINE_LEN_AVG = 10
val LINE_NUM = 11
Contributor

It'd be really nice to document all of them.

Member Author

@astansler astansler Nov 7, 2017

Agree, will do it
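
As a rough idea of what documenting the fact codes could look like, here is a sketch covering only the facts this PR adds. The constant names and values are copied from the diff snippets in this conversation (before any renaming agreed in review); the `FactCodes` container and the pairing of each description with a constant are assumptions based on the PR description.

```kotlin
// Hypothetical container; the project may declare these elsewhere.
object FactCodes {
    /** Average number of lines per commit (assumed meaning). */
    const val COMMITS_LINE_NUM_AVG = 8

    /** Number of commits in the repo (assumed meaning). */
    const val COMMITS_NUM = 9

    /** Average length of a line (assumed meaning). */
    const val LINE_LEN_AVG = 10

    /** Number of lines of code across all commits (assumed meaning). */
    const val LINE_NUM = 11

    /** Map of line numbers to commits number. Used in a commit histogram. */
    const val COMMITS_NUM_TO_LINE_NUM = 12
}
```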

@astansler astansler merged commit f7b044a into develop Nov 7, 2017
@astansler astansler deleted the dev-commit-facts branch November 7, 2017 15:50