# Get and Parse GitHub Issues From a Private Repo

Use the GitHub API to list every issue in a private repo and write a subset of each issue's fields to a CSV. This example uses a public repo so that the super top secret issues on the private repo are not published. The main point here is the commands and the first two scripts, which collect the issues into one file. The rest of the data munging is up to you.

1. Get a personal OAuth token for the GitHub API by following https://developer.github.com/guides/getting-started/, and set it in your shell as the environment variable `$GITHUB_OAUTH`. If you're just getting issues from a public repo, you can do all of this without a token (and without the `-H "Authorization...` header).

2. ([Docs](https://developer.github.com/v3/issues/#list-issues-for-a-repository)) Inspect the response headers for the repo's issues with
```bash
curl -I -H "Authorization: token $GITHUB_OAUTH" "https://api.github.com/repos/swift-nav/piksi_firmware/issues?state=all"
```
The `Link` header in the response shows the pagination format, including the number of the last page. It should look something like
```
Link: <https://api.github.com/repositories/2464156/issues?state=all&page=2>; rel="next", <https://api.github.com/repositories/2464156/issues?state=all&page=25>; rel="last"
```

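3. Get all the pages with this script instead of writing an actual parsing/looping script, because you're too lazy for that. Replace `:repo_number` with the numeric repository ID from the `Link` header (`2464156` in the example above).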

```bash
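#!/bin/bash
# Usage: ./get_github_issues.sh <number_of_pages>
PAGES=$1

# Fetch each page of issues and save it to its own file.
# Replace :repo_number with the numeric repository ID from the Link header.
for i in $(seq 1 "$PAGES"); do
  curl -H "Authorization: token $GITHUB_OAUTH" \
    "https://api.github.com/repositories/:repo_number/issues?state=all&page=$i" > "issues$i.json"
done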
```

Run it like `./get_github_issues.sh 25`, where `25` is the page number from the `rel="last"` link. Now you should have a bunch of `issues{n}.json` files.
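
If you do want the actual parsing/looping script, here is a minimal Python 3 sketch using only the standard library. It assumes the `Link` header has the format shown in step 2. The function names (`parse_link_header`, `fetch_page`, `fetch_all_issues`) are made up for this example and are not part of any GitHub client library.

```python
import json
import re
import urllib.request


def parse_link_header(header_value):
    """Map each rel name ('next', 'last', ...) to its URL."""
    pairs = re.findall(r'<([^>]+)>;\s*rel="([^"]+)"', header_value)
    return {rel: url for url, rel in pairs}


def fetch_page(url, token=None):
    """GET one page; return (list of issues, raw Link header or '')."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", "token " + token)
    with urllib.request.urlopen(request) as response:
        issues = json.loads(response.read().decode("utf-8"))
        link = response.headers.get("Link", "")
    return issues, link


def fetch_all_issues(first_url, fetch=fetch_page):
    """Follow rel="next" links until the API stops sending one."""
    all_issues = []
    url = first_url
    while url:
        issues, link = fetch(url)
        all_issues.extend(issues)
        url = parse_link_header(link).get("next")
    return all_issues


# Example call (needs network access and $GITHUB_OAUTH set):
# import functools, os
# fetch = functools.partial(fetch_page, token=os.environ["GITHUB_OAUTH"])
# issues = fetch_all_issues(
#     "https://api.github.com/repos/swift-nav/piksi_firmware/issues?state=all",
#     fetch)
```

Following `rel="next"` means you never need to know the page count up front. The `fetch` argument is injectable so the loop can be tried against canned responses before pointing it at the real API.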

In [46]:
import json
import pandas as pd

In [76]:
def concat_json(num_pages, filename):
    '''Concatenate the files that were output by the script you ran
    to get all the response pages.
    I'm sure there would have been a nicer way to do this all in pandas,
    but this was faster.'''
    all_issues = []

    for i in range(1, num_pages + 1):
        page_file = "issues{0}.json".format(i)
        with open(page_file) as f:
            issues = json.load(f)
        # Extend rather than append: append would create a list of
        # per-page lists instead of one flat list of issues.
        all_issues += issues

    # Serialize once, after every page has been read.
    with open(filename, 'w') as f:
        json.dump(all_issues, f)

In [77]:
filename = 'piksi_firmware_issues.json'
concat_json(25, filename)
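
If you'd rather not pass the page count by hand, a variation is to find the page files with a glob. This is a sketch under the assumption that the files are named `issues{n}.json` as above; `merge_issue_pages` is a hypothetical helper name, and the code targets Python 3.

```python
import glob
import json
import re


def merge_issue_pages(pattern="issues[0-9]*.json"):
    """Merge every page file matching `pattern` into one flat list."""
    def page_number(path):
        # Sort numerically so issues10.json comes after issues2.json.
        match = re.search(r"(\d+)\.json$", path)
        return int(match.group(1)) if match else 0

    merged = []
    for path in sorted(glob.glob(pattern), key=page_number):
        with open(path) as f:
            merged.extend(json.load(f))
    return merged
```

The `[0-9]` in the pattern keeps the merge from picking up a combined output file such as `piksi_firmware_issues.json` on a second run.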

## Do stuff with the data
Not the cleanest script: the main thing could probably have been done in a single pandas command, but this is mostly for reference. You can go ahead and do whatever.

In [78]:
with open(filename) as f:
    all_issues = json.load(f)

In [79]:
len(all_issues)

729

In [80]:
closed_issues = [issue for issue in all_issues if issue['state'] == 'closed']

In [81]:
len(closed_issues)

621

In [82]:
closed_issues[0]

{u'assignee': None,
 u'assignees': [],
 u'body': u'Attempts to fix the large drift between pseudorange and carrier-phase @denniszollo observed.\n\n@denniszollo Can you test this?\n',
 u'closed_at': u'2016-07-07T17:23:19Z',
 u'comments': 9,
 u'comments_url': u'https://api.github.com/repos/swift-nav/piksi_firmware/issues/729/comments',
 u'created_at': u'2016-06-27T13:00:25Z',
 u'events_url': u'https://api.github.com/repos/swift-nav/piksi_firmware/issues/729/events',
 u'html_url': u'https://github.com/swift-nav/piksi_firmware/pull/729',
 u'id': 162446270,
 u'labels': [],
 u'labels_url': u'https://api.github.com/repos/swift-nav/piksi_firmware/issues/729/labels{/name}',
 u'locked': False,
 u'milestone': None,
 u'number': 729,
 u'pull_request': {u'diff_url': u'https://github.com/swift-nav/piksi_firmware/pull/729.diff',
  u'html_url': u'https://github.com/swift-nav/piksi_firmware/pull/729',
  u'patch_url': u'https://github.com/swift-nav/piksi_firmware/pull/729.patch',
  u'url': u'https://api.
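
The record above has a `pull_request` key and an `html_url` under `/pull/`: GitHub's issues endpoint returns pull requests alongside plain issues. If you only want actual issues, a small filter like the following sketch does it. `split_issues_and_pulls` is a hypothetical helper name and the sample records are made up for illustration.

```python
def split_issues_and_pulls(records):
    """Separate plain issues from pull requests.

    Records that came from a pull request carry a 'pull_request' key;
    plain issues do not.
    """
    issues = [r for r in records if "pull_request" not in r]
    pulls = [r for r in records if "pull_request" in r]
    return issues, pulls


# Made-up sample records, trimmed to the fields that matter here.
sample = [
    {"number": 2, "state": "closed", "pull_request": {"html_url": "..."}},
    {"number": 1, "state": "closed"},
]
only_issues, only_pulls = split_issues_and_pulls(sample)
print(len(only_issues), len(only_pulls))
```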

In [83]:
issues = pd.DataFrame(closed_issues)

In [84]:
issues_info_subset = issues[['title', 'number', 'html_url', 'closed_at', 'state']]

In [87]:
issues_info_subset.head()

| | title | number | html_url | closed_at | state |
|---|---|---|---|---|---|
| 0 | WIP: Adjust carrier phase for receiver clock e... | 729 | https://github.com/swift-nav/piksi_firmware/pu... | 2016-07-07T17:23:19Z | closed |
| 1 | WIP: Fix L2C alias lock detector | 728 | https://github.com/swift-nav/piksi_firmware/pu... | 2016-06-23T15:11:27Z | closed |
| 2 | v3: Support NAP v3.4.0 | 727 | https://github.com/swift-nav/piksi_firmware/pu... | 2016-06-22T04:33:00Z | closed |
| 3 | add echo of hw target to all make targets | 726 | https://github.com/swift-nav/piksi_firmware/pu... | 2016-08-14T02:21:46Z | closed |
| 4 | I136 isc use | 725 | https://github.com/swift-nav/piksi_firmware/pu... | 2016-06-29T08:28:34Z | closed |


In [88]:
issues_info_subset.to_csv('piksi_firmware_issues.csv', encoding='utf-8')
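
If pandas isn't available, the same subset can be written with the standard library's `csv` module. This is a minimal Python 3 sketch, not the method used above; `write_issue_subset` and `FIELDS` are names invented for this example.

```python
import csv

# The same five columns selected for issues_info_subset.
FIELDS = ["title", "number", "html_url", "closed_at", "state"]


def write_issue_subset(records, out_file, fields=FIELDS):
    """Write only the chosen fields of each issue dict as CSV rows."""
    # extrasaction="ignore" drops every key that is not in `fields`.
    writer = csv.DictWriter(out_file, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Example call, assuming closed_issues is the list built earlier:
# with open("piksi_firmware_issues.csv", "w", newline="", encoding="utf-8") as f:
#     write_issue_subset(closed_issues, f)
```

Unlike `to_csv` with its default settings, this writes no index column.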

🎉 Go have fun manually categorizing your bugs.