# Compute Project Stats

* This notebook uses the GitHub [GraphQL API](https://developer.github.com/v4/) to count the open and
  closed issues in Kubeflow GitHub Projects
  * Stats are broken down by label
* Results are plotted using [plotly](https://plot.ly)
  * Plots are currently published on plot.ly for sharing; they are publicly viewable by anyone
  
## Set Up GitHub

* You need a GitHub personal access token to use the GitHub API
* See these [instructions](https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/) for creating a personal access token
  * The token needs the following scopes:
    * `repo`
    * `read:org`
* Set the environment variable `GITHUB_TOKEN` to pass your token to the code
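
As a rough illustration of how the token reaches the API, the sketch below builds (but does not send) an authenticated request to GitHub's GraphQL endpoint. The helper name `build_graphql_request` and the sample query are hypothetical and are not taken from `project_stats`; the real module may construct its requests differently.

```python
import json
import os
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"


def build_graphql_request(query, token=None):
    """Build, without sending, an authenticated GitHub GraphQL request.

    Hypothetical helper for illustration; not part of project_stats.
    """
    # Fall back to the GITHUB_TOKEN environment variable described above.
    token = token or os.environ.get("GITHUB_TOKEN")
    if not token:
        raise RuntimeError("Set the GITHUB_TOKEN environment variable first")
    body = json.dumps({"query": query}).encode("utf-8")
    # Supplying a request body makes urllib issue a POST.
    return urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": "bearer " + token,
            "Content-Type": "application/json",
        },
    )


# Illustrative query only: asks for the login of the token's owner.
request = build_graphql_request("query { viewer { login } }", token="example-token")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual call; that step is left out here so the sketch runs without network access or a real token.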

## Set Up Plot.ly Online

* To publish plots with plot.ly you need to create a plot.ly account and get an API key
* Follow plot.ly's [getting started guide](https://plot.ly/python/getting-started/)
* Store your API key in `~/.plotly/.credentials`
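
Before running the plotting cells it can help to confirm that the credentials file is in place. The sketch below is a minimal check that assumes the file is JSON with `username` and `api_key` fields; the helper name `load_plotly_credentials` is hypothetical, and the exact file layout may vary between plotly versions.

```python
import json
from pathlib import Path


def load_plotly_credentials(path=None):
    """Return the stored credentials as a dict, or None if they are unusable.

    Hypothetical helper; assumes a JSON file with "username" and "api_key".
    """
    path = Path(path) if path else Path.home() / ".plotly" / ".credentials"
    if not path.is_file():
        return None
    with path.open() as f:
        creds = json.load(f)
    # Treat missing or empty fields the same as a missing file.
    if not creds.get("username") or not creds.get("api_key"):
        return None
    return creds


creds = load_plotly_credentials()
# Report only whether credentials were found; never print the key itself.
print("plot.ly credentials found" if creds else "plot.ly credentials missing")
```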

In [7]:
# Use plotly cufflinks to plot data frames:
#   https://plot.ly/ipython-notebooks/cufflinks/
# Instructions for offline plotting:
#   https://plot.ly/python/getting-started/#initialization-for-offline-plotting
# Instructions for online plotting (you will need to set up an account):
#   https://plot.ly/python/getting-started/
import itertools

import plotly
# Note: in plotly 4 and later, plotly.plotly moved to the separate
# chart-studio package (chart_studio.plotly).
import plotly.plotly as py
import plotly.graph_objs as go
import cufflinks as cf
#from importlib import reload

In [8]:
import project_stats
#reload(project_stats)


In [9]:
c = project_stats.ProjectStats(project="0.6.0")
#c = project_stats.ProjectStats(project="0.7.0")
c.main()

Make plots showing different groups of labels

* The columns of the stats data frame form a multi-level index
* See the [pandas advanced indexing guide](https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html) for instructions on multi-level indexes
  * We specify a list of tuples, where each tuple gives the item to select at each level of the index
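
To make the tuple-based selection concrete, here is a small sketch on a toy data frame. The values and dates are invented for illustration; only the column layout (a count level over a `label` level) is meant to mirror the stats frame used in this notebook.

```python
import itertools

import pandas as pd

counts = ["open", "total"]
labels = ["priority/p0", "priority/p1"]

# Toy frame with two column levels: count kind, then label.
columns = pd.MultiIndex.from_product([counts, labels], names=[None, "label"])
index = pd.to_datetime(["2019-01-01", "2019-02-01"])
stats = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], index=index, columns=columns)

# Each tuple names one column: (item at level 0, item at level 1).
selected = list(itertools.product(["open"], labels))
subset = stats.loc[:, selected]
print(subset)
```

Here `subset` keeps only the two `open` columns, in the order the tuples were listed.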

In [10]:
import datetime

counts = ["open", "total"]
#labels = ["cuj/build-train-deploy", "cuj/multi-user", "area/katib"]
labels = ["priority/p0", "priority/p1", "priority/p2"]
# Each (count, label) tuple selects one column of the multi-level column index
columns = list(itertools.product(counts, labels))

# Only plot data recorded after this date
start = datetime.datetime(2019, 1, 1)

# Boolean mask over the time index
i = c.stats.index > start
c.stats.loc[i, columns].iplot(kind='scatter', width=5, filename='project-stats',
                              title='{0} Issue Count'.format(c.project))


In [11]:
c.stats.iloc[-1][columns]

       label      
open   priority/p0      7
       priority/p1     13
       priority/p2      7
total  priority/p0     46
       priority/p1    139
       priority/p2     38
Name: 2019-08-04 23:42:23, dtype: int64

In [12]:
import datetime

# Show every column for the rows recorded after the start date
start = datetime.datetime(2019, 1, 1)

i = c.stats.index > start
c.stats.loc[i]


| time | open: addition/feature | open: area/0.3.0 | open: area/0.4.0 | open: area/0.5.0 | open: area/1.0.0 | open: area/bootstrap | open: area/build-release | open: area/centraldashboard | open: area/deployment | open: area/design | ... | total: lifecycle/stale | total: nolabels | total: platform/aws | total: platform/gcp | total: platform/minikube | total: platform/other | total: priority/p0 | total: priority/p1 | total: priority/p2 | total: release/v0.6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019-01-03 19:45:38 | 0 | 1 | 3 | 5 | 1 | 6 | 1 | 0 | 0 | 0 | ... | 29 | 0 | 0 | 8 | 1 | 0 | 4 | 22 | 14 | 1 |
| 2019-01-04 02:34:32 | 0 | 1 | 3 | 5 | 1 | 6 | 2 | 0 | 0 | 0 | ... | 29 | 0 | 0 | 8 | 1 | 0 | 4 | 22 | 15 | 1 |
| 2019-01-04 02:35:17 | 0 | 1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 0 | ... | 30 | 0 | 0 | 8 | 1 | 0 | 4 | 22 | 16 | 1 |
| 2019-01-06 21:51:20 | 0 | 1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 0 | ... | 31 | 0 | 0 | 8 | 1 | 0 | 4 | 22 | 17 | 1 |
| 2019-01-07 15:06:42 | 0 | 1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 1 | ... | 31 | 0 | 0 | 8 | 1 | 0 | 4 | 23 | 17 | 1 |
| 2019-01-08 21:01:44 | 0 | 1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 1 | ... | 31 | 0 | 0 | 8 | 1 | 0 | 4 | 23 | 18 | 1 |
| 2019-01-11 23:45:09 | 0 | 1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 1 | ... | 31 | 0 | 0 | 8 | 1 | 0 | 4 | 24 | 18 | 1 |
| 2019-01-14 00:44:31 | 0 | 1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 1 | ... | 32 | 0 | 0 | 8 | 1 | 0 | 4 | 25 | 18 | 1 |
| 2019-01-24 19:03:53 | 0 | 1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 1 | ... | 33 | 0 | 0 | 8 | 1 | 0 | 4 | 25 | 18 | 1 |
| 2019-01-28 08:40:04 | 0 | 1 | 3 | 5 | 1 | 6 | 3 | 0 | 0 | 1 | ... | 33 | 0 | 0 | 8 | 1 | 0 | 4 | 25 | 18 | 1 |
