
Better enrichment of Git and GitHub raw indices to calculate metrics for Manuscripts #364

Closed
aswanipranjal opened this issue Jun 4, 2018 · 2 comments

Comments

@aswanipranjal
Contributor

Hey all
I've been trying to calculate the GMD metrics and to add them to Manuscripts.
I think some fields and values should be added to the enriched index when it is created, so that the metrics related to those fields become easier to calculate. The metrics can be derived from other fields in the enriched index, but nevertheless I think we need to produce better enriched indices for the Git and GitHub data sources, at least for now.
Several metrics need additional data in the enriched indices:

NOTE: I also wanted to put this out there: I think we need to interlink issues and PRs by mentioning the issue number in the proposed PR and vice versa. That way it'll be easier to track both of them and see the progress.

  • Code Reviews: as far as I can tell, there is no information about the reviews that are given to a PR. This information needs to be fetched using the API and added into the enriched index.

  • Code Merge Duration | What is the duration of time between code merge request and code commit?

  • Code Review Efficiency | What is the number of merged code changes/number of abandoned code change requests?

  • Maintainer Response to Merge Request Duration | What is the duration of time for a maintainer to make a first response to a code merge request?

  • Code Review Iteration | What is the number of iterations that occur before a merge request is accepted or declined?

  • Forks | Forks are a concept in distributed version control platforms like GitHub. The number of forks is a proxy for the approximate number of developers who have taken a shot at building and deploying the codebase for development.

  • Pull Request Comment Duration | The difference between the timestamp of the pull request creation date and the most recent comment on the pull request.

  • Pull Request Comment Diversity | Number of distinct people discussing each pull request.

  • Pull Request Comments | Number of comments on each pull request.

I think these, or fields from which they can be derived, need to be added to the enriched indices.
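To make the pull request metrics above more concrete, here is a minimal Python sketch of how they could be derived once the per-PR data is available. The record layout (`created_at`, `merged_at`, `closed_at`, `comments`) is hypothetical and is not the actual enriched index schema. The sketch also assumes that "merge duration" means the time from PR creation to merge, that an "abandoned" PR is one closed without being merged, and that the first comment by anyone other than the PR author stands in for a maintainer response.

```python
from datetime import datetime

# Hypothetical pull request records; these field names are illustrative
# assumptions, not the real fields of the enriched indices.
PULL_REQUESTS = [
    {
        "id": 1,
        "author": "alice",
        "created_at": "2018-06-01T10:00:00Z",
        "merged_at": "2018-06-03T10:00:00Z",
        "closed_at": "2018-06-03T10:00:00Z",
        "comments": [
            {"author": "bob", "created_at": "2018-06-01T12:00:00Z"},
            {"author": "alice", "created_at": "2018-06-02T09:00:00Z"},
            {"author": "bob", "created_at": "2018-06-02T18:00:00Z"},
        ],
    },
    {
        "id": 2,
        "author": "carol",
        "created_at": "2018-06-02T08:00:00Z",
        "merged_at": None,
        "closed_at": "2018-06-04T08:00:00Z",
        "comments": [],
    },
]


def parse(ts):
    """Parse an ISO 8601 UTC timestamp such as 2018-06-01T10:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def hours_between(start, end):
    """Elapsed hours between two timestamp strings."""
    return (parse(end) - parse(start)).total_seconds() / 3600


def merge_duration_hours(pr):
    """Code Merge Duration: PR creation to merge, or None if not merged."""
    if pr["merged_at"] is None:
        return None
    return hours_between(pr["created_at"], pr["merged_at"])


def first_response_hours(pr):
    """Hours until the first comment by someone other than the PR author.

    Assumption: any non-author commenter counts as a maintainer here.
    """
    others = [c["created_at"] for c in pr["comments"]
              if c["author"] != pr["author"]]
    if not others:
        return None
    return hours_between(pr["created_at"], min(others, key=parse))


def comment_duration_hours(pr):
    """Pull Request Comment Duration: creation to most recent comment."""
    if not pr["comments"]:
        return None
    latest = max((c["created_at"] for c in pr["comments"]), key=parse)
    return hours_between(pr["created_at"], latest)


def comment_diversity(pr):
    """Pull Request Comment Diversity: number of distinct commenters."""
    return len({c["author"] for c in pr["comments"]})


def review_efficiency(prs):
    """Code Review Efficiency: merged PRs divided by abandoned PRs."""
    merged = sum(1 for pr in prs if pr["merged_at"])
    abandoned = sum(1 for pr in prs
                    if pr["closed_at"] and not pr["merged_at"])
    return merged / abandoned if abandoned else None


for pr in PULL_REQUESTS:
    print(pr["id"], merge_duration_hours(pr), first_response_hours(pr),
          comment_duration_hours(pr), comment_diversity(pr),
          len(pr["comments"]))
print("review efficiency:", review_efficiency(PULL_REQUESTS))
```

Each function needs only timestamps and comment authors per PR, which suggests those are the minimum fields the enriched index would have to carry for these metrics.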

I am also opening issues in the WG-GMD repository for the GMD metrics that need more polishing and definition.
I plan to open PRs to add the above metrics, and this issue can be used as the parent issue for them.

@jgbarah
Contributor

jgbarah commented Jun 5, 2018

For resolution efficiency, we still need to settle on a definition, @aswanipranjal. Please have a look at my last comment in chaoss/wg-evolution#5. The same goes for code review efficiency.

For all metrics on pull requests, I think we can start a new issue. Besides, I know @acs and @valeriocos are planning to work on that, and so am I (although I don't know when), so we could use that issue to coordinate.

@aswanipranjal
Contributor Author

Closed by #419.


2 participants