# Data Fetching with GitHub REST API
This notebook shows how we fetched data from the [MLH Fellowship Organization](https://github.com/MLH-Fellowship) on GitHub to use as training data for our hackathon project.

Table of Contents:
- [List of Members](#first)
- [Total Commits](#second)
- [Followers and Open Repos](#third)
- [Stars and Forks on Repos](#fourth)
- [Number of Organizations](#fifth)
- [Number of Issues and Contributions](#sixth)
- [Collect Entire Data](#seventh)

In [1]:
import requests # for API calls
import json # for JSON file type storage
import time # for pausing between API calls
import os # for the API token

token = os.getenv("GITHUB_TOKEN") # MUST HAVE PERSONAL ACCESS TOKEN SETUP
print("TOKEN ABSENT" if token is None else "TOKEN PRESENT")
headers = {'Authorization': 'token {}'.format(token)}
base_url = "https://api.github.com"

TOKEN PRESENT
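The setup cell builds the `Authorization` header even when the token is missing, which would send the literal string `token None`. A minimal sketch of a guard is shown here; `build_headers` is a hypothetical helper name, not part of the original notebook. Unauthenticated requests still work but are subject to a much lower rate limit.

```python
import os


def build_headers(token):
    """Return request headers, omitting Authorization when no token is set."""
    headers = {}
    if token:  # None or an empty string means the request goes out unauthenticated
        headers["Authorization"] = "token {}".format(token)
    return headers


# Hypothetical usage mirroring the setup cell
demo_headers = build_headers(os.getenv("GITHUB_TOKEN"))
```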


## Get the members of the organization on GitHub and save them to a JSON file
### Member List <a class="anchor" id="first"></a>

In [2]:
# Fetches all GitHub users who are members of the given organization
def get_member_list(organization_name):
    members = []
    page_number = 1
    
    # iterate over pages of users, saving them to one pool
    while True:
        url = "{}/orgs/{}/members?per_page=100&page={}".format(base_url, organization_name, page_number)
        print("Calling URL: {}".format(url))
        response = requests.get(url, headers=headers)
        
        member_list = response.json()
        members += member_list
        
        print("There are {} members in this URL call".format(len(member_list)))
        if len(member_list) == 0:
            break
        else:
            page_number += 1
            
    return members
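The loop above assumes every response is a JSON list. GitHub reports errors as a JSON object instead, so `members += member_list` would silently add the object's keys to the pool. The sketch below separates the paging logic from the HTTP call so it can be checked without network access; `collect_pages`, `fetch_page`, and `max_pages` are hypothetical names introduced for illustration.

```python
def collect_pages(fetch_page, max_pages=100):
    """Collect list items from numbered pages until an empty page is returned.

    fetch_page is any callable that takes a page number and returns parsed JSON.
    """
    items = []
    for page_number in range(1, max_pages + 1):
        page = fetch_page(page_number)
        if not isinstance(page, list):
            # An error response is a JSON object, e.g. {"message": "..."}
            raise ValueError("Unexpected response on page {}: {}".format(page_number, page))
        if not page:
            break  # an empty page means there is nothing left to fetch
        items.extend(page)
    return items


# Stand-in for the API: two pages of fake members, then empty pages
fake_pages = {1: [{"login": "a"}, {"login": "b"}], 2: [{"login": "c"}]}
members_demo = collect_pages(lambda n: fake_pages.get(n, []))
print(len(members_demo))  # 3
```

In the notebook, `fetch_page` could wrap the existing `requests.get(url, headers=headers).json()` call with the page number substituted into the URL.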

In [3]:
# members = get_member_list("MLH-Fellowship")  # uncomment to fetch from the API instead of the saved file
with open('members.json', 'r') as json_file:
    members = json.load(json_file)

In [4]:
members[:1] # see the first user

[{'login': '1tracy',
  'id': 55264469,
  'node_id': 'MDQ6VXNlcjU1MjY0NDY5',
  'avatar_url': 'https://avatars.githubusercontent.com/u/55264469?v=4',
  'gravatar_id': '',
  'url': 'https://api.github.com/users/1tracy',
  'html_url': 'https://github.com/1tracy',
  'followers_url': 'https://api.github.com/users/1tracy/followers',
  'following_url': 'https://api.github.com/users/1tracy/following{/other_user}',
  'gists_url': 'https://api.github.com/users/1tracy/gists{/gist_id}',
  'starred_url': 'https://api.github.com/users/1tracy/starred{/owner}{/repo}',
  'subscriptions_url': 'https://api.github.com/users/1tracy/subscriptions',
  'organizations_url': 'https://api.github.com/users/1tracy/orgs',
  'repos_url': 'https://api.github.com/users/1tracy/repos',
  'events_url': 'https://api.github.com/users/1tracy/events{/privacy}',
  'received_events_url': 'https://api.github.com/users/1tracy/received_events',
  'type': 'User',
  'site_admin': False}]

In [5]:
len(members) # there are 677 members in the organization

677

In [6]:
# save the list of members as a json file
with open('members.json', 'w') as json_file:
    json.dump(members, json_file)

## Get more specific user data for each user and save them to a JSON file
### Total Commits <a class="anchor" id="second"></a>

In [73]:
# Gets the total number of commits authored by the user
def get_total_commits(username):
    url = "{}/search/commits?q=author:{}".format(base_url, username)
    # requires a special header with an accept (search still in preview)
    headers = {"Content-Type": "application/json",
        "Accept": "application/vnd.github.cloak-preview",
        "Authorization": 'token {}'.format(token)}
    response = requests.get(url, headers=headers)
    
    json_info = response.json()
    if 'total_count' in json_info:
        return json_info['total_count']
    else:
        # a missing count can mean an error response (e.g. rate limiting), not zero commits
        print("Couldn't get total commits for user {}".format(username))
        return 0
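One possible reason for a missing `total_count` is rate limiting, since the search endpoints have their own, stricter limit than the rest of the API. GitHub responses carry `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers (the latter as UTC epoch seconds), so a caller could wait for the reset instead of recording 0. This is a minimal sketch under that assumption; `seconds_to_wait` is a hypothetical helper.

```python
import time


def seconds_to_wait(response_headers, now=None):
    """Return how long to sleep before the next call, based on rate-limit headers."""
    now = time.time() if now is None else now
    remaining = int(response_headers.get("X-RateLimit-Remaining", 1))
    reset_at = int(response_headers.get("X-RateLimit-Reset", 0))
    if remaining > 0:
        return 0.0  # quota left, no need to wait
    return max(0.0, float(reset_at - now))


# Example with hand-written header values: quota exhausted, reset in 10 seconds
print(seconds_to_wait({"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1010"}, now=1000))  # 10.0
```

With `requests`, the headers would come from `response.headers`, and the result could be passed to `time.sleep` before retrying.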

In [8]:
total = get_total_commits("dtemir")
print("I have a total of {} commits".format(total))

I have a total of 742 commits


### Followers and Open Repos <a class="anchor" id="third"></a>

In [47]:
# Gets the number of followers and repositories the user has
def get_followers_and_repos(username):
    url = "{}/users/{}".format(base_url, username)
    request = requests.get(url, headers=headers)
    
    json_info = request.json()
    followers, repos = 0, 0
    if 'followers' in json_info.keys():
        followers = json_info['followers']
    if 'public_repos' in json_info.keys():
        repos = json_info['public_repos']
        
    return followers, repos

In [48]:
followers, repos = get_followers_and_repos("dtemir")
print("I have {} followers and {} public repos".format(followers, repos))

I have 17 followers and 13 public repos


### Stars and Forks on Repos <a class="anchor" id="fourth"></a>

In [50]:
# Gets the number of stars and forks the user has in their repositories
def get_stars_and_forks(username):
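    # per_page=100 raises the default page size of 30; users with more repos still need pagination
    url = "{}/users/{}/repos?per_page=100".format(base_url, username)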
    request = requests.get(url, headers=headers)
    repositories = request.json()
    
    stars, forks = 0, 0
    for repository in repositories:
        if 'stargazers_count' in repository.keys():
            stars += repository["stargazers_count"]
        if 'forks_count' in repository.keys():
            forks += repository["forks_count"]
    
    return stars, forks
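If the repositories request fails, the parsed JSON is an object rather than a list, and iterating over it yields string keys, so `repository.keys()` would raise an `AttributeError`. The sketch below isolates the summing step as a pure function with that guard; `sum_stars_and_forks` is a hypothetical name, and the sample data is invented for illustration.

```python
def sum_stars_and_forks(repositories):
    """Sum stars and forks over a list of repository objects."""
    if not isinstance(repositories, list):
        return 0, 0  # error responses are JSON objects, not lists
    stars = sum(repo.get("stargazers_count", 0) for repo in repositories)
    forks = sum(repo.get("forks_count", 0) for repo in repositories)
    return stars, forks


# Invented sample data shaped like the API response
sample_repos = [
    {"name": "one", "stargazers_count": 3, "forks_count": 1},
    {"name": "two", "stargazers_count": 1, "forks_count": 2},
]
print(sum_stars_and_forks(sample_repos))  # (4, 3)
```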

In [51]:
stars, forks = get_stars_and_forks("dtemir")
print("I have {} stars and {} forks on my repos".format(stars, forks))

I have 4 stars and 3 forks on my repos


### Number of Organizations  <a class="anchor" id="fifth"></a>

In [52]:
# Gets the number of organizations the user is a part of and their names
def get_organizations(username):
    url = "{}/users/{}/orgs".format(base_url, username)
    request = requests.get(url, headers=headers)
    organizations = request.json()
    
    number, names = 0, []
    for organization in organizations:
        if 'login' in organization.keys():
            number += 1
            names.append(organization["login"])
    
    return number, names

In [53]:
orgs, names = get_organizations("dtemir")
print("I'm a part of {} organizations named {}".format(orgs, names))

I'm a part of 2 organizations named ['ProteinDesignLab', 'MLH-Fellowship']


### Number of Issues and Contributions <a class="anchor" id="sixth"></a>

In [61]:
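# Counts issues and PRs through the search API
# Note: the `user:` qualifier matches items in repositories owned by the user,
# and the first query counts PRs as well as issues; `author:` would match items the user created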
def get_issues_and_contributions(username):
    url_issues = "{}/search/issues?q=user:{}".format(base_url, username)
    url_contributions = "{}/search/issues?q=type:pr+user:{}".format(base_url, username)
    request_issues = requests.get(url_issues, headers=headers)
    request_contributions = requests.get(url_contributions, headers=headers)
    
    json_info_issues = request_issues.json()
    json_info_contributions = request_contributions.json()
    
    issues, contributions = 0, 0
    if 'total_count' in json_info_issues.keys():
        issues = json_info_issues['total_count']
    if 'total_count' in json_info_contributions.keys():
        contributions = json_info_contributions['total_count']
    
    return issues, contributions
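To count items a user created rather than items in repositories they own, the search query can use the `author:` qualifier together with `type:issue` or `type:pr`. The sketch below only builds the URL, so it can be checked offline; `build_issue_search_url` and `BASE_URL` are hypothetical names, and swapping in this query would change the numbers reported by the notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.github.com"  # same value as base_url in the setup cell


def build_issue_search_url(username, kind):
    """Build a search URL for issues or PRs authored by the given user."""
    if kind not in ("issue", "pr"):
        raise ValueError("kind must be 'issue' or 'pr'")
    query = "type:{} author:{}".format(kind, username)
    # urlencode percent-encodes the colons and turns the space into '+'
    return "{}/search/issues?{}".format(BASE_URL, urlencode({"q": query}))


print(build_issue_search_url("dtemir", "pr"))
# https://api.github.com/search/issues?q=type%3Apr+author%3Adtemir
```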

In [62]:
issues, contributions = get_issues_and_contributions("dtemir")
print("I have contributed {} issues and helped {} outside projects".format(issues, contributions))

I have contributed 11 issues and helped 8 outside projects


## Collect that data for all members and save it to a JSON file <a class="anchor" id="seventh"></a>

In [76]:
# Gets all needed information about users
def get_full_info():
    full_information = []
    i = 1

    for member in members:

        username = member['login']
        
        print("Getting info for member {} at id {}".format(username, i))
        i += 1
        
        total_commits = get_total_commits(username)
        no_followers, no_repos = get_followers_and_repos(username)
        no_stars, no_forks = get_stars_and_forks(username)
        no_organizations, organization_names = get_organizations(username)
        no_issues, no_contributions = get_issues_and_contributions(username)
        
        time.sleep(5)
        
        info = {
            "Username": username,
            "Commits": total_commits,
            "Followers": no_followers,
            "Repos": no_repos,
            "Stars": no_stars,
            "Forks": no_forks,
            "Organizations": no_organizations,
            "OrganizationNames": organization_names,
            "Issues": no_issues,
            "Contributions": no_contributions,
        }

        full_information.append(info)
    
    return full_information
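With 677 members and a 5-second pause per member, the loop spends close to an hour sleeping alone, and a crash partway through loses everything collected so far. One way to make the run resumable is to write progress to disk after each member and skip members already saved. This is a sketch, not the approach used in the notebook; `collect_with_checkpoint`, `fetch_one`, and `path` are hypothetical names.

```python
import json
import os


def collect_with_checkpoint(usernames, fetch_one, path):
    """Collect one record per username, saving progress to path after each user."""
    results = []
    if os.path.exists(path):
        # Resume from an earlier, interrupted run
        with open(path, "r") as json_file:
            results = json.load(json_file)
    done = {record["Username"] for record in results}

    for username in usernames:
        if username in done:
            continue  # already collected in a previous run
        results.append(fetch_one(username))
        with open(path, "w") as json_file:
            json.dump(results, json_file)
    return results
```

Here `fetch_one` would be a function that builds the same `info` dictionary as `get_full_info` for a single username, including the `time.sleep(5)` pause.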

In [77]:
full_info = get_full_info()
with open('full_info.json', 'w') as json_file:
    json.dump(full_info, json_file)

Getting info for member 1tracy at id 1
Getting info for member aarnphm at id 2
Getting info for member aaronosher at id 3
Getting info for member aayush-05 at id 4
Getting info for member abhishalya at id 5
Getting info for member Abhishek-kumar09 at id 6
Getting info for member abishekvashok at id 7
Getting info for member Acrylami at id 8
Getting info for member acushlakoncept at id 9
Getting info for member adarsh-swe at id 10
Getting info for member adata111 at id 11
Getting info for member adavria at id 12
Getting info for member adfaris at id 13
Getting info for member Adib234 at id 14
Getting info for member adisen at id 15
Getting info for member adnan-creator at id 16
Getting info for member aehwany at id 17
Getting info for member aemmadi at id 18
Getting info for member agg-shambhavi at id 19
Getting info for member aguonyinye at id 20
Getting info for member ahadislam1 at id 21
Getting info for member AhadKhan98 at id 22
Getting info for member aHappyCamer at id 23
...

Getting info for member EmilyXinyi at id 188
Getting info for member EricKarschner37 at id 189
Getting info for member ericwidjaja at id 190
Getting info for member erin2722 at id 191
Getting info for member EshikaShah at id 192
Getting info for member esiebomaj at id 193
Getting info for member farhan2742 at id 194
Getting info for member FawziyahAlebiosu at id 195
Getting info for member felixfaisal at id 196
Getting info for member fissoreg at id 197
Getting info for member flozender at id 198
Getting info for member FocalChord at id 199
Getting info for member francisco-1 at id 200
Getting info for member FrancoisCoding at id 201
Getting info for member frankidatank at id 202
Getting info for member fzchriha at id 203
Getting info for member gabriel-esco at id 204
Getting info for member garrettluu at id 205
Getting info for member GedionT at id 206
Getting info for member georgeamccarthy at id 207
Getting info for member gideontong at id 208
Getting info for member GinaJame at id 

Getting info for member luiszugasti at id 370
Getting info for member luke-truitt at id 371
Getting info for member m0mosenpai at id 372
Getting info for member m2chan at id 373
Getting info for member makrandr1999 at id 374
Getting info for member mamnuya at id 375
Getting info for member Manasa2850 at id 376
Getting info for member manlalaro1 at id 377
Getting info for member manyaagarwal at id 378
Getting info for member marcnjaramillo at id 379
Getting info for member Mark-Nawar at id 380
Getting info for member masterchief01 at id 381
Getting info for member MathewDavidov at id 382
Getting info for member mattdillabough at id 383
Getting info for member MayankJ99 at id 384
Getting info for member mchaudhry05 at id 385
Getting info for member MDanialSaleem at id 386
Getting info for member mdmjg at id 387
Getting info for member MeghalBisht at id 388
Getting info for member membriux at id 389
Getting info for member mgautam98 at id 390
Getting info for member mgsium at id 391
...

Getting info for member ShinteiMai at id 552
Getting info for member shivamsouravjha at id 553
Getting info for member shivaylamba at id 554
Getting info for member shivichaubey at id 555
Getting info for member Shraddha2104 at id 556
Getting info for member shreyagupta30 at id 557
Getting info for member shreyaparadkar at id 558
Getting info for member ShrillShrestha at id 559
Getting info for member shu8 at id 560
Getting info for member shubhank-saxena at id 561
Getting info for member shweta3047 at id 562
Getting info for member SiddeshSambasivam at id 563
Getting info for member SiddharthSham at id 564
Getting info for member silva-nick at id 565
Getting info for member simran1199 at id 566
Getting info for member SinaKhalili at id 567
Getting info for member SincerelyBrittany at id 568
Getting info for member skeshavaa at id 569
Getting info for member sksuryan at id 570
Getting info for member skymake at id 571
Getting info for member sladyn98 at id 572
Getting info for member S

In [79]:
full_info[-1]

{'Username': 'ZzRanger',
 'Commits': 815,
 'Followers': 10,
 'Repos': 39,
 'Stars': 0,
 'Forks': 0,
 'Organizations': 0,
 'OrganizationNames': [],
 'Issues': 12,
 'Contributions': 10}