# Find repos with high-priority issues
* Goal: find repositories whose issues carry a high-priority label.
* Count the labeled issues per repository and select repositories based on a threshold.
* The result is a list of the repositories found, with an optional threshold (`min_issues_per_repo`) on how many high-priority issues a repository must contain.

In [38]:
# Variables
# Select the minimum number of issues per repository
min_issues_per_repo = 100

In [39]:
import pandas as pd

In [40]:
# Read CSV into a dataframe
high_priority = pd.read_csv("csv/high_priority_no_td.csv", index_col=0)

In [41]:
# Count how often each distinct label string occurs (top 50)
high_priority.labels.value_counts().to_frame()[:50]

labels,count
bug high priority,10435
high priority,10360
High Priority,7484
enhancement high priority,5486
Priority: High,4751
priority.High,3694
priority.High type.Task,3685
priority.high type.task,3209
priority.High type.Story,2979
priority.high,2749
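
The counts above treat case variants such as `priority.High type.Task` and `priority.high type.task` as separate values. A minimal sketch of how lowercasing would merge them, assuming case differences carry no meaning (an assumption this notebook does not state). The `toy_labels` series is made up for illustration and is not taken from the CSV.

```python
import pandas as pd

# Hypothetical label strings, modeled on the variants in the table above
toy_labels = pd.Series([
    "priority.High type.Task",
    "priority.high type.task",
    "High Priority",
    "high priority",
    "bug high priority",
])

# Counting the raw strings keeps every case variant separate
raw_counts = toy_labels.value_counts()

# Lowercasing first merges variants that differ only by case
folded_counts = toy_labels.str.lower().value_counts()

print(len(raw_counts), "distinct raw label strings")
print(len(folded_counts), "distinct label strings after lowercasing")
```

Whether to fold case before counting depends on the later analysis; the rest of the notebook keeps the labels unchanged.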


In [42]:
high_priority[high_priority["repo"] == "python/mypy"]

,id,type,created_at,repo,repo_url,action,title,labels,body


In [43]:
# Remove mypy from the dataset
high_priority = high_priority[high_priority["repo"] != "python/mypy"]
high_priority

,id,type,created_at,repo,repo_url,action,title,labels,body
0,1.141754e+10,IssuesEvent,2020-02-03 00:00:44,unitystation/unitystation,https://api.github.com/repos/unitystation/unit...,closed,Client breaking NRE when using edit field on C...,Bug High Priority In Progress UI,### Bug:\r\n\r\nIf you use the edit field of t...
1,1.141754e+10,IssuesEvent,2020-02-03 00:01:26,zowe/sample-spring-boot-api-service,https://api.github.com/repos/zowe/sample-sprin...,closed,The SDK provides a separate Java (no-Spring) l...,Feature: API Security Priority: High no-issue-...,- The commons-spring library is split into:\r\...
2,1.141755e+10,IssuesEvent,2020-02-03 00:02:58,openmsupply/mobile,https://api.github.com/repos/openmsupply/mobile,closed,Auto-log out after some time frame,Docs: not needed Effort: small Feature Module:...,## Is your feature request related to a proble...
3,1.141755e+10,IssuesEvent,2020-02-03 00:04:18,UltimateCodeMonkeys/CodeMonkeysMVVM,https://api.github.com/repos/UltimateCodeMonke...,opened,Migrate: CodeMonkeys ViewModelNavigationServic...,Priority: High Status: In Progress Type: Maint...,Migrate the Xamarin.Forms navigation service i...
4,1.141756e+10,IssuesEvent,2020-02-03 00:08:03,wordpress-mobile/WordPress-Android,https://api.github.com/repos/wordpress-mobile/...,closed,IA Reader filter bottom sheet: manage untitled...,IA Reader [Pri] High [Type] Task,In the filter bottom sheet we introduced in th...
...,...,...,...,...,...,...,...,...,...
373104,7.334924e+09,IssuesEvent,2018-03-06 01:09:12,StrangeLoopGames/EcoIssues,https://api.github.com/repos/StrangeLoopGames/...,closed,Meteor hit - Can no longer join server 6.4.1,High Priority,Just had the meteor hit on our server and now ...
373105,7.334984e+09,IssuesEvent,2018-03-06 01:28:23,asteca/ASteCA,https://api.github.com/repos/asteca/ASteCA,closed,Improve performance of synth cluster generatio...,code_enhance p:high performance synth-cl,Follow up on #227.\r\n\r\n#### 0. Initial `syn...
373106,7.335034e+09,IssuesEvent,2018-03-06 01:44:11,wh1ter0se/PowerUp-2018,https://api.github.com/repos/wh1ter0se/PowerUp...,closed,Get and record refined PID loops for rotation ...,IRL development high priority java,"Pretty simple, and pretty sure we already have..."
373107,7.335035e+09,IssuesEvent,2018-03-06 01:44:26,wh1ter0se/PowerUp-2018,https://api.github.com/repos/wh1ter0se/PowerUp...,closed,Figure out why CyborgCommandSpit() runs early ...,R&D bug high priority java,The problem is that when we run autonomous (so...


In [44]:
# Drop rows with a duplicate title, keeping the last occurrence
high_priority = high_priority.drop_duplicates(subset=["title"], keep="last")
# Drop rows with any missing value and renumber the index from 0.
# Reassigning the result (rather than inplace=True) avoids a
# SettingWithCopyWarning on the filtered frame.
high_priority = high_priority.dropna().reset_index(drop=True)
high_priority["labels"].value_counts()


labels
bug high priority                                                    7350
high priority                                                        6639
High Priority                                                        4795
enhancement high priority                                            3697
Priority: High                                                       3122
                                                                     ... 
backend bug high priority in progress                                   1
Beta High Priority                                                      1
Beta High Priority bug robots.txt                                       1
Platform: Backend Priority: High Type: Enhancement Type: Question       1
R&D bug high priority java                                              1
Name: count, Length: 85374, dtype: int64
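
To make the deduplication step concrete, here is a small sketch on made-up data showing which row `keep="last"` retains and how the index is renumbered afterwards. The `toy` frame and its values are hypothetical. Reassigning the results instead of passing `inplace=True` is one way to avoid pandas' `SettingWithCopyWarning` on a frame that came from a filter.

```python
import pandas as pd

# Hypothetical events: the same issue appears once as "opened" and once as "closed"
toy = pd.DataFrame({
    "repo": ["a/x", "a/x", "b/y", "b/y"],
    "title": ["Crash on start", "Crash on start", "Add docs", "Fix typo"],
    "action": ["opened", "closed", "opened", "opened"],
    "body": ["trace", "trace", None, "text"],
})

# keep="last" retains the final row for each title (here the "closed" event)
deduped = toy.drop_duplicates(subset=["title"], keep="last")

# dropna() removes the row with a missing body; reset_index(drop=True)
# renumbers from 0 without adding an "index" column
cleaned = deduped.dropna().reset_index(drop=True)

print(cleaned)
```

Because duplicates are detected on `title` alone, issues from different repositories that happen to share a title are also collapsed into one row.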

In [45]:
high_priority

,id,type,created_at,repo,repo_url,action,title,labels,body
0,1.141755e+10,IssuesEvent,2020-02-03 00:02:58,openmsupply/mobile,https://api.github.com/repos/openmsupply/mobile,closed,Auto-log out after some time frame,Docs: not needed Effort: small Feature Module:...,## Is your feature request related to a proble...
1,1.141776e+10,IssuesEvent,2020-02-03 01:22:41,SolerSoft/TaskeRGB,https://api.github.com/repos/SolerSoft/TaskeRGB,closed,Image Picker for Source,config enhancement palette priority-high,Implement an Android Image Picker to browse Im...
2,1.141780e+10,IssuesEvent,2020-02-03 01:35:11,ericauv/ericauv-portfolio,https://api.github.com/repos/ericauv/ericauv-p...,closed,Fix Video Page ListItem Hovering Behaviour,High Priority,- When a video is not playing:\r\n - Hover ...
3,1.141797e+10,IssuesEvent,2020-02-03 02:25:22,unitystation/unitystation,https://api.github.com/repos/unitystation/unit...,closed,Escape shuttle reaches ludicrous speed,Bug High Priority lol,## Description\r\n\r\nEscape shuttle accelerat...
4,1.141798e+10,IssuesEvent,2020-02-03 02:28:57,qlcchain/go-qlc,https://api.github.com/repos/qlcchain/go-qlc,closed,constructing virtual router structure,Priority: High Type: Enhancement Type: Mainten...,\r\n
...,...,...,...,...,...,...,...,...,...
242903,7.334924e+09,IssuesEvent,2018-03-06 01:09:12,StrangeLoopGames/EcoIssues,https://api.github.com/repos/StrangeLoopGames/...,closed,Meteor hit - Can no longer join server 6.4.1,High Priority,Just had the meteor hit on our server and now ...
242904,7.334984e+09,IssuesEvent,2018-03-06 01:28:23,asteca/ASteCA,https://api.github.com/repos/asteca/ASteCA,closed,Improve performance of synth cluster generatio...,code_enhance p:high performance synth-cl,Follow up on #227.\r\n\r\n#### 0. Initial `syn...
242905,7.335034e+09,IssuesEvent,2018-03-06 01:44:11,wh1ter0se/PowerUp-2018,https://api.github.com/repos/wh1ter0se/PowerUp...,closed,Get and record refined PID loops for rotation ...,IRL development high priority java,"Pretty simple, and pretty sure we already have..."
242906,7.335035e+09,IssuesEvent,2018-03-06 01:44:26,wh1ter0se/PowerUp-2018,https://api.github.com/repos/wh1ter0se/PowerUp...,closed,Figure out why CyborgCommandSpit() runs early ...,R&D bug high priority java,The problem is that when we run autonomous (so...


In [46]:
# Drop all rows with a missing (NaN) title or body
high_priority = high_priority[high_priority["title"].notna()]
print(high_priority)
high_priority = high_priority[high_priority["body"].notna()]
print(high_priority)

                  id         type           created_at  \
0       1.141755e+10  IssuesEvent  2020-02-03 00:02:58   
1       1.141776e+10  IssuesEvent  2020-02-03 01:22:41   
2       1.141780e+10  IssuesEvent  2020-02-03 01:35:11   
3       1.141797e+10  IssuesEvent  2020-02-03 02:25:22   
4       1.141798e+10  IssuesEvent  2020-02-03 02:28:57   
...              ...          ...                  ...   
242903  7.334924e+09  IssuesEvent  2018-03-06 01:09:12   
242904  7.334984e+09  IssuesEvent  2018-03-06 01:28:23   
242905  7.335034e+09  IssuesEvent  2018-03-06 01:44:11   
242906  7.335035e+09  IssuesEvent  2018-03-06 01:44:26   
242907  7.335039e+09  IssuesEvent  2018-03-06 01:45:34   

                              repo  \
0               openmsupply/mobile   
1               SolerSoft/TaskeRGB   
2        ericauv/ericauv-portfolio   
3        unitystation/unitystation   
4                  qlcchain/go-qlc   
...                            ...   
242903  StrangeLoopGames/EcoIssues   
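
The two `notna()` filters can also be written as a single `dropna` call restricted to the relevant columns. A minimal sketch on a made-up frame (`toy` is hypothetical) that checks both forms select the same rows:

```python
import pandas as pd

# Hypothetical rows: one with a missing title, one with a missing body
toy = pd.DataFrame({
    "title": ["a", None, "c", "d"],
    "body": ["x", "y", None, "z"],
    "labels": ["high priority"] * 4,
})

# Two-step form, as in the cell above
two_step = toy[toy["title"].notna()]
two_step = two_step[two_step["body"].notna()]

# Single call limited to the two columns of interest
one_call = toy.dropna(subset=["title", "body"])

print(one_call)
print("same result:", two_step.equals(one_call))
```

Note that `notna()` and `dropna()` only catch missing values; a body that consists solely of whitespace, such as `\r\n`, is kept by both forms.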

In [47]:
# Drop all repositories with fewer than min_issues_per_repo issues
high_priority = high_priority.groupby('repo').filter(lambda x: len(x) >= min_issues_per_repo)
high_priority

,id,type,created_at,repo,repo_url,action,title,labels,body
0,1.141755e+10,IssuesEvent,2020-02-03 00:02:58,openmsupply/mobile,https://api.github.com/repos/openmsupply/mobile,closed,Auto-log out after some time frame,Docs: not needed Effort: small Feature Module:...,## Is your feature request related to a proble...
3,1.141797e+10,IssuesEvent,2020-02-03 02:25:22,unitystation/unitystation,https://api.github.com/repos/unitystation/unit...,closed,Escape shuttle reaches ludicrous speed,Bug High Priority lol,## Description\r\n\r\nEscape shuttle accelerat...
4,1.141798e+10,IssuesEvent,2020-02-03 02:28:57,qlcchain/go-qlc,https://api.github.com/repos/qlcchain/go-qlc,closed,constructing virtual router structure,Priority: High Type: Enhancement Type: Mainten...,\r\n
8,1.141825e+10,IssuesEvent,2020-02-03 03:46:09,ballerina-platform/ballerina-lang,https://api.github.com/repos/ballerina-platfor...,closed,Create local variable code action breaks code,Area/Tooling Component/LanguageServer Points/1...,"<img width=""558"" alt=""Screen Shot 2019-10-29 a..."
10,1.141832e+10,IssuesEvent,2020-02-03 04:04:30,zulip/zulip,https://api.github.com/repos/zulip/zulip,closed,Add rate limiting to the login and password ch...,area: authentication enhancement good first is...,Zulip historically has relied on strong passwo...
...,...,...,...,...,...,...,...,...,...
242894,7.334777e+09,IssuesEvent,2018-03-06 00:22:10,StrangeLoopGames/EcoIssues,https://api.github.com/repos/StrangeLoopGames/...,closed,Scoring items and Industry Items (that will br...,High Priority,we had this information in previous release 6....
242898,7.334886e+09,IssuesEvent,2018-03-06 00:56:47,StrangeLoopGames/EcoIssues,https://api.github.com/repos/StrangeLoopGames/...,closed,USER ISSUE: Currency exchanges dont work,High Priority,**Version:** 0.7.0.8 beta\r\n\r\n**Steps to Re...
242902,7.334924e+09,IssuesEvent,2018-03-06 01:08:55,StrangeLoopGames/EcoIssues,https://api.github.com/repos/StrangeLoopGames/...,closed,USER ISSUE: Unable to set up contracts,High Priority,**Version:** 0.6.4.2 alpha\r\n\r\n**Steps to R...
242903,7.334924e+09,IssuesEvent,2018-03-06 01:09:12,StrangeLoopGames/EcoIssues,https://api.github.com/repos/StrangeLoopGames/...,closed,Meteor hit - Can no longer join server 6.4.1,High Priority,Just had the meteor hit on our server and now ...
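
The threshold filter above calls a Python lambda once per repository. An equivalent formulation, sketched here on made-up data, maps each row to the size of its repository via `value_counts` and applies a boolean mask; this is an alternative, not what the notebook runs. The `toy` frame and `threshold` value are hypothetical.

```python
import pandas as pd

# Hypothetical issues: three in a/x, two in b/y, one in c/z
toy = pd.DataFrame({
    "repo": ["a/x", "a/x", "a/x", "b/y", "b/y", "c/z"],
    "title": [f"issue {i}" for i in range(6)],
})
threshold = 2  # stands in for min_issues_per_repo

# Form used in the notebook: keep groups with at least `threshold` rows
via_filter = toy.groupby("repo").filter(lambda g: len(g) >= threshold)

# Alternative: look up each row's repository size, then mask
sizes = toy["repo"].map(toy["repo"].value_counts())
via_mask = toy[sizes >= threshold]

print(via_mask)
print("same result:", via_filter.equals(via_mask))
```

Both forms keep the original row order and index labels, which is why the filtered table above shows gaps in its index (0, 3, 4, 8, ...).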


In [48]:
# List the remaining repositories with their issue counts
high_priority.repo.value_counts().to_frame()


repo,count
StrangeLoopGames/EcoIssues,2294
ballerina-platform/ballerina-lang,1769
pytorch/pytorch,1337
ahmedkaludi/accelerated-mobile-pages,1098
Automattic/wp-calypso,1006
...,...
twosigma/beaker-notebook,101
minetest/minetest,100
lampepfl/dotty,100
Knowledge-Management-Capstone/knowledge-management-core,100


In [49]:
# Save the dataframe to a CSV file
high_priority.to_csv("csv/high_priority_min_repos.csv")

In [50]:
# Read CSV into a dataframe
high_priority = pd.read_csv("csv/high_priority_min_repos.csv", index_col=0)
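
Writing with `to_csv` and reading back with `index_col=0` restores the saved index, including the gaps left by the filtering steps. A small sketch of that round trip, using an in-memory buffer instead of a file and a made-up frame (`toy` is hypothetical):

```python
import io

import pandas as pd

# Hypothetical filtered frame whose index has gaps, as after the threshold step
toy = pd.DataFrame(
    {"repo": ["a/x", "a/x", "b/y"], "title": ["one", "two", "three"]},
    index=[0, 3, 4],
)

# to_csv writes the index as the first, unnamed column
buffer = io.StringIO()
toy.to_csv(buffer)
buffer.seek(0)

# index_col=0 turns that first column back into the index
restored = pd.read_csv(buffer, index_col=0)

print(restored)
```

Without `index_col=0`, the saved index would come back as an extra data column named `Unnamed: 0`.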