# **Open Source Project Metrics**

## **Core Idea**

Metrics that represent project activity, computed from recent issues and pull requests:

* Percentage of recent issues that are still open: ✔

* Percentage of recent pull requests that are still open: ✔

* Median time to resolve recent issues: (in progress)

* Median time to close recent pull requests: (in progress)

All metrics are calculated over the past month (the last 30 days) using the GitHub API.
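
The two metrics still in progress both reduce to a median over the time between opening and closing. Below is a minimal sketch of that calculation, assuming each issue or pull request has already been reduced to a `(created_at, closed_at)` pair of `datetime` values (PyGithub issue objects expose attributes with these names). The function name `median_resolution_time` and the input shape are illustrative assumptions, not part of the notebook.

```python
from datetime import datetime, timedelta
from statistics import median


def median_resolution_time(issues):
    """Median of (closed_at - created_at) over closed items, or None if none are closed.

    `issues` is an iterable of (created_at, closed_at) pairs; closed_at is None
    for items that are still open, and those are skipped.
    """
    durations = [closed - created for created, closed in issues if closed is not None]
    if not durations:
        return None
    # statistics.median works on timedelta values because they support
    # ordering, addition and division by a number.
    return median(durations)


# Hypothetical sample: three closed issues and one that is still open.
t0 = datetime(2021, 1, 1)
sample = [
    (t0, t0 + timedelta(days=2)),
    (t0, t0 + timedelta(days=4)),
    (t0, None),
    (t0, t0 + timedelta(days=10)),
]
result = median_resolution_time(sample)
```

Returning `None` for a repository with no closed items keeps "no data" distinct from a genuine zero-length resolution time.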






In [5]:
pip install PyGitHub



In [6]:
pip install GitPython



In [7]:
from tqdm import tqdm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import time
import seaborn as sns
from datetime import datetime, timedelta
from github import Github
from git import Repo
from github import RateLimitExceededException


### 1. Preprocessing data and writing metrics functions:

In [53]:
# Initialise the GitHub API client; uncomment and replace 'api_key' with a personal access token.
# g = Github('api_key')
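
To keep the token out of the notebook, it can be read from the environment instead of being pasted into the cell. The sketch below assumes a variable named `GITHUB_TOKEN`; both that name and the helper `read_github_token` are hypothetical choices for illustration.

```python
import os


def read_github_token(env=os.environ, var_name="GITHUB_TOKEN"):
    """Return the token stored under `var_name`, or raise a clear error if it is missing."""
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable to a GitHub token")
    return token


# Usage in the notebook, mirroring the commented-out call above:
# g = Github(read_github_token())
```

Passing the mapping as a parameter makes the helper easy to check without touching the real environment.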

In [44]:
test_data = pd.read_csv("./projects_list.csv")
test_data

| | project_url | intuitive_score_elizabet38 | intuitive_score_inspired99 | intuitive_score_DanielGabitov | intuitive_score_VSPlekhanov | average |
|---|---|---|---|---|---|---|
| 0 | https://github.com/angular/angular | 10 | 9 | 10 | 10 | 9.75 |
| 1 | https://github.com/pharo-project/pharo | 9 | 6 | 8 | 10 | 8.25 |
| 2 | https://github.com/redline-smalltalk/redline-s... | 2 | 4 | 3 | 3 | 3.0 |
| 3 | https://github.com/fossasia/visdom | 9 | 6 | 6 | 8 | 7.25 |
| 4 | https://github.com/diaspora/diaspora | 9 | 7 | 7 | 7 | 7.5 |
| 5 | https://github.com/durch/rust-s3 | 7 | 6 | 5 | 8 | 6.5 |
| 6 | https://github.com/zappa/zappa | 9 | 7 | 7 | 7 | 7.5 |
| 7 | https://github.com/mnapoli/IsItMaintained | 3 | 1 | 1 | 3 | 2.0 |
| 8 | https://github.com/logsol/Github-Auto-Deploy | 2 | 1 | 1 | 1 | 1.25 |
| 9 | https://github.com/r10r/rcswitch-pi | 1 | 1 | 1 | 1 | 1.0 |


In [24]:
def normalize_data(data):
    # Scale the 0-10 average score to the 0-1 range (modifies `data` in place).
    data['average'] = data['average'] / 10
    return data


In [25]:
def count_open_issues_and_pr(issues):
    # The GitHub issues endpoint also returns pull requests; they are the
    # items whose `pull_request` attribute is set.
    cnt_open_issues = 0
    cnt_closed_issues = 0
    cnt_open_pr = 0
    cnt_closed_pr = 0

    for index, issue in enumerate(issues):
        if not issue.pull_request:
            if issue.state == 'open':
                cnt_open_issues += 1
            else:
                cnt_closed_issues += 1
        else:
            if issue.state == 'open':
                cnt_open_pr += 1
            else:
                cnt_closed_pr += 1
        # Cap the scan at 2001 items (indices 0 through 2000) per repository.
        if index >= 2000:
            break
    return cnt_open_issues, cnt_closed_issues, cnt_open_pr, cnt_closed_pr
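
The counting logic can be exercised without calling the API by passing stand-in objects that carry only the two attributes it reads. The sketch below is an alternative formulation with a `Counter` and `islice`, not the notebook's own function; the name `tally` and the `SimpleNamespace` stubs are assumptions made for illustration. The default limit of 2001 mirrors the cut-off at index 2000 used above.

```python
from collections import Counter
from itertools import islice
from types import SimpleNamespace


def tally(issues, limit=2001):
    """Count (kind, state) pairs over at most `limit` items."""
    counts = Counter()
    for issue in islice(issues, limit):
        # Items with a truthy `pull_request` attribute are pull requests.
        kind = "pr" if issue.pull_request else "issue"
        state = "open" if issue.state == "open" else "closed"
        counts[(kind, state)] += 1
    return counts


# Stand-ins for API objects: only `state` and `pull_request` are needed.
sample = [
    SimpleNamespace(state="open", pull_request=None),
    SimpleNamespace(state="closed", pull_request=None),
    SimpleNamespace(state="open", pull_request=object()),
]
counts = tally(sample)
```

A `Counter` returns 0 for combinations that never occur, so `counts[("pr", "closed")]` is safe to read even when no closed pull request was seen.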


In [26]:
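def get_percentage_open_issues_and_pr(url):
    # Build the "owner/name" identifier from the last two segments of the URL.
    username = url.split("/")[-2]
    repo_name = url.split("/")[-1]
    repo = g.get_repo(f'{username}/{repo_name}')
    
    # `since` keeps the issues and PRs that were updated within the last 30 days.
    date = datetime.today()
    period = date - timedelta(days=30)
    # Valid sort keys are 'created', 'updated' and 'comments'; 'created' lists the newest first.
    issues = repo.get_issues(state='all', since=period, sort='created')
    
    cnt_open_issues, cnt_closed_issues, cnt_open_pr, cnt_closed_pr = count_open_issues_and_pr(issues)

    total_issues = cnt_open_issues + cnt_closed_issues
    total_pr = cnt_open_pr + cnt_closed_pr

    # Report 0 when the repository had no activity in the period.
    if total_issues:
        percentage_issues = cnt_open_issues / total_issues
    else:
        percentage_issues = 0
    
    if total_pr:
        percentage_pr = cnt_open_pr / total_pr
    else:
        percentage_pr = 0
    print(f'{url} is done, total issues: {total_issues}, total PRs: {total_pr}')

    # The two shares are fractions in [0, 1]; the caller scales them to percent.
    return total_issues, percentage_issues, total_pr, percentage_pr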
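
Splitting the URL on `/` and taking the last two pieces works for the URLs in `projects_list.csv`, but it breaks on a trailing slash or a `.git` suffix. A slightly more tolerant parser could look like the sketch below; `repo_full_name` is a hypothetical helper, not something the notebook defines.

```python
from urllib.parse import urlparse


def repo_full_name(url):
    """Return 'owner/name' for a GitHub repository URL."""
    # Keep only the non-empty path segments, so a trailing slash is harmless.
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    if len(parts) < 2:
        raise ValueError(f"Not a repository URL: {url!r}")
    owner, name = parts[0], parts[1]
    # Clone URLs often end in '.git'; the API expects the bare repository name.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return f"{owner}/{name}"


full_name = repo_full_name("https://github.com/angular/angular/")
```

The result can be passed directly to `g.get_repo(...)` in place of the f-string built from the two split pieces.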



### 2. Calculating results for the metrics



In [41]:
def results_percentage_open_issues_and_pr(data, start=0, end=None):
    # Process every row from `start` onward unless `end` is given.
    if end is None:
        end = data.shape[0]
    results = pd.DataFrame(columns=['Project url', 'Total issues' ,
            'Percentage of open issues','Total PRs', 'Percentage of open Prs'])
    # Rows outside [start, end) keep NaN in the metric columns.
    results['Project url'] = data['project_url']
    
    for i in tqdm(range(start, end)):
        total_iss, perc_iss, total_pr, perc_pr = get_percentage_open_issues_and_pr(data['project_url'][i])
        # Write through .loc so the values land in `results` itself, not in a temporary copy.
        results.loc[i, 'Total issues'] = total_iss
        results.loc[i, 'Percentage of open issues'] = perc_iss * 100
        results.loc[i, 'Total PRs'] = total_pr
        results.loc[i, 'Percentage of open Prs'] = perc_pr * 100
        # time.sleep(30)       

    return results
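
The notebook imports `RateLimitExceededException` and leaves a `time.sleep(30)` commented out in the loop, which suggests that rate limiting was a concern on larger repositories. One way to handle it is a small retry wrapper, sketched below under stated assumptions: the helper name `call_with_retry`, the attempt count and the fixed wait are all illustrative choices, and the exception type is passed in so the sketch does not depend on PyGithub.

```python
import time


def call_with_retry(fn, retry_on, max_attempts=3, wait_seconds=30, sleep=time.sleep):
    """Call fn(); if it raises `retry_on`, wait and try again, up to `max_attempts` calls."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            # Re-raise once the attempts are exhausted so the failure stays visible.
            if attempt == max_attempts:
                raise
            sleep(wait_seconds)


# Possible use inside the loop (assumes `g` and the metric function are defined):
# call_with_retry(lambda: get_percentage_open_issues_and_pr(url),
#                 RateLimitExceededException)
```

A fixed wait is the simplest policy. Waiting until the reset time reported by the API would waste less time, but it needs an extra request per retry.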


In [45]:
# Processing results for percentage of issues and PRs 
results = results_percentage_open_issues_and_pr(test_data, end = 17)
       

  6%|▌         | 1/17 [01:27<23:24, 87.75s/it]

https://github.com/angular/angular is done, total issues: 401, total PRs: 475


 12%|█▏        | 2/17 [02:07<14:53, 59.57s/it]

https://github.com/pharo-project/pharo is done, total issues: 213, total PRs: 185


 18%|█▊        | 3/17 [02:07<07:35, 32.50s/it]

https://github.com/redline-smalltalk/redline-smalltalk is done, total issues: 0, total PRs: 0


 24%|██▎       | 4/17 [02:09<04:21, 20.11s/it]

https://github.com/fossasia/visdom is done, total issues: 5, total PRs: 3


 29%|██▉       | 5/17 [02:11<02:44, 13.71s/it]

https://github.com/diaspora/diaspora is done, total issues: 12, total PRs: 15


 35%|███▌      | 6/17 [02:12<01:43,  9.41s/it]

https://github.com/durch/rust-s3 is done, total issues: 5, total PRs: 4


 41%|████      | 7/17 [02:16<01:15,  7.54s/it]

https://github.com/zappa/zappa is done, total issues: 20, total PRs: 19


 47%|████▋     | 8/17 [02:16<00:47,  5.22s/it]

https://github.com/mnapoli/IsItMaintained is done, total issues: 0, total PRs: 0


 53%|█████▎    | 9/17 [02:16<00:29,  3.68s/it]

https://github.com/logsol/Github-Auto-Deploy is done, total issues: 0, total PRs: 0


 59%|█████▉    | 10/17 [02:16<00:18,  2.62s/it]

https://github.com/r10r/rcswitch-pi is done, total issues: 0, total PRs: 0


 65%|██████▍   | 11/17 [02:18<00:13,  2.26s/it]

https://github.com/sui77/rc-switch is done, total issues: 8, total PRs: 1


 71%|███████   | 12/17 [02:19<00:10,  2.02s/it]

https://github.com/apache/cassandra is done, total issues: 0, total PRs: 91


 76%|███████▋  | 13/17 [08:36<07:42, 115.61s/it]

https://github.com/microsoft/vscode is done, total issues: 1827, total PRs: 174


 82%|████████▏ | 14/17 [10:42<05:55, 118.54s/it]

https://github.com/microsoft/TypeScript is done, total issues: 604, total PRs: 160


 88%|████████▊ | 15/17 [10:45<02:47, 83.92s/it] 

https://github.com/google/guava is done, total issues: 19, total PRs: 25


 94%|█████████▍| 16/17 [10:48<00:59, 59.34s/it]

https://github.com/google/leveldb is done, total issues: 12, total PRs: 11


100%|██████████| 17/17 [10:56<00:00, 38.64s/it]

https://github.com/vuejs/vue is done, total issues: 40, total PRs: 13





In [54]:
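last_iter = 17
# Process the remaining projects and copy only the newly computed rows into `results`.
remaining = results_percentage_open_issues_and_pr(test_data, start=last_iter)
results.iloc[last_iter:] = remaining.iloc[last_iter:].to_numpy()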



 14%|█▍        | 1/7 [03:42<22:13, 222.30s/it]

https://github.com/tensorflow/tensorflow is done, total issues: 960, total PRs: 217


 29%|██▊       | 2/7 [09:15<23:58, 287.72s/it]

https://github.com/golang/go is done, total issues: 1506, total PRs: 57


 43%|████▎     | 3/7 [09:16<10:25, 156.48s/it]

https://github.com/dtao/lazy.js is done, total issues: 0, total PRs: 0


 57%|█████▋    | 4/7 [09:16<04:44, 94.85s/it] 

https://github.com/evernote/android-job is done, total issues: 0, total PRs: 0


 71%|███████▏  | 5/7 [09:20<02:04, 62.17s/it]

https://github.com/ValveSoftware/openvr is done, total issues: 15, total PRs: 1


 86%|████████▌ | 6/7 [09:21<00:41, 41.16s/it]

https://github.com/googleanalytics/autotrack is done, total issues: 0, total PRs: 0


100%|██████████| 7/7 [09:23<00:00, 80.51s/it]

https://github.com/hierynomus/smbj is done, total issues: 11, total PRs: 0





In [55]:
results


| | Project url | Total issues | Percentage of open issues | Total PRs | Percentage of open Prs |
|---|---|---|---|---|---|
| 0 | https://github.com/angular/angular | 401 | 51.8703 | 475 | 15.7895 |
| 1 | https://github.com/pharo-project/pharo | 213 | 23.9437 | 185 | 8.10811 |
| 2 | https://github.com/redline-smalltalk/redline-s... | 0 | 0.0 | 0 | 0.0 |
| 3 | https://github.com/fossasia/visdom | 5 | 40.0 | 3 | 0.0 |
| 4 | https://github.com/diaspora/diaspora | 12 | 50.0 | 15 | 13.3333 |
| 5 | https://github.com/durch/rust-s3 | 5 | 100.0 | 4 | 75.0 |
| 6 | https://github.com/zappa/zappa | 20 | 80.0 | 19 | 52.6316 |
| 7 | https://github.com/mnapoli/IsItMaintained | 0 | 0.0 | 0 | 0.0 |
| 8 | https://github.com/logsol/Github-Auto-Deploy | 0 | 0.0 | 0 | 0.0 |
| 9 | https://github.com/r10r/rcswitch-pi | 0 | 0.0 | 0 | 0.0 |


### 3. Plotting and analysis
