# Using this notebook

This notebook gives a quick look at how much commit activity is happening in class repos and when. *__It does not give any information about the content of commits; you still need to look at GitHub for that.__*

### Instructions

The next cell defines the functions that pull the information. In the cell after that, enter the information relevant to your cohort. It is currently filled in for the DDA3 web scraping project.

`github_org_name` is the name of the class organization on GitHub

`instructors` is a list of GitHub handles for any non-students in the class organization

`group_proj` is a boolean indicating whether the project you are checking is a group project

`proj_name` is the project's repo-name prefix; each repo is expected to be named `<proj_name>-<name>`

`names` is either a list of student GitHub handles for individual projects (`group_proj = False`) or a list of group names for group projects (`group_proj = True`)
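As a rough illustration of how these settings combine: the checker queries one repo per entry in `names`, named `<proj_name>-<name>`. The helper `build_repo_names` below is hypothetical (it is not part of the notebook) and only mirrors that naming rule.

```python
def build_repo_names(proj_name, names):
    """Return the repo names that would be queried, one per student or group.

    Hypothetical helper for illustration; it mirrors the '<proj_name>-<name>'
    rule used by check_project_repos.
    """
    return [proj_name + '-' + n for n in names]


# Individual project: one repo per student GitHub handle.
print(build_repo_names('web-scraping-marathons', ['AnnikaBock65', 'CalebAutry']))

# Group project: one repo per group name.
print(build_repo_names('web-scraping-marathons', ['snickers', 'twix']))
```

If a repo was created under a different name, it will not be found, so it is worth comparing this list against the organization's repo list before reading too much into a missing student.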

### Caveats, issues, and limitations

The number of pushes reported for group projects is unreliable: it is correct in some cases but not all. The cause is unknown; it may be related to branches or to some other factor. The most recent push time appears to work, but it is worth double-checking.
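One possible contributor to the undercounting, which is an assumption and not something this notebook has confirmed, is that only the first page of events is read. GitHub returns 30 events per page by default, so a busy group repo may be cut off. The sketch below uses the API's `per_page` and `page` query parameters to read several pages; `fetch_all_events` and `count_pushes` are hypothetical helpers, not part of the notebook.

```python
def fetch_all_events(user_org, repo, max_pages=3, get=None):
    """Collect repo events across several pages instead of only the first.

    `get` defaults to requests.get; it is a parameter only so the helper
    can be tried with a stub instead of the live API.
    """
    if get is None:
        import requests  # imported lazily so a stubbed run does not need it
        get = requests.get
    url = "https://api.github.com/repos/{}/{}/events".format(user_org, repo)
    events = []
    for page in range(1, max_pages + 1):
        res = get(url, params={"per_page": 100, "page": page}).json()
        # An error payload is a dict, and an empty list means no more events.
        if not isinstance(res, list) or len(res) == 0:
            break
        events.extend(res)
    return events


def count_pushes(events, instructors=()):
    """Count PushEvent entries per GitHub login, skipping instructors."""
    counts = {}
    for e in events:
        login = e["actor"]["login"]
        if e["type"] == "PushEvent" and login not in instructors:
            counts[login] = counts.get(login, 0) + 1
    return counts
```

Each page is a separate request and counts against the rate limit, so `max_pages` is kept small here. Note also that a single push can contain several commits, so push counts are not commit counts.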

The API is open and needs no credentials, but unauthenticated requests are limited to 60 per hour, and this notebook makes one request per repo. It is therefore best to check one project at a time. For example, checking a project in the morning to see who to follow up with about pushing is a good use. Checking several projects in a row can easily hit the rate limit.
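To see how many calls are left before running a check, GitHub exposes a `/rate_limit` endpoint. The sketch below is a minimal, hedged example: the helper names (`summarize_rate_limit`, `fetch_rate_limit`, `enough_calls_left`) are hypothetical and not part of the notebook, and it assumes one request per repo, as in the functions below.

```python
from datetime import datetime, timezone


def summarize_rate_limit(payload):
    """Pull the core REST quota out of a /rate_limit response body."""
    core = payload["resources"]["core"]
    # 'reset' is a Unix timestamp (seconds) for when the quota refills.
    reset_at = datetime.fromtimestamp(core["reset"], tz=timezone.utc)
    return {"limit": core["limit"],
            "remaining": core["remaining"],
            "reset_at": reset_at}


def fetch_rate_limit():
    """Query the live API; needs network access and the requests package."""
    import requests  # imported lazily so the parsing helper works offline
    res = requests.get("https://api.github.com/rate_limit").json()
    return summarize_rate_limit(res)


def enough_calls_left(summary, n_repos):
    """True if one request per repo fits in the remaining quota."""
    return summary["remaining"] >= n_repos
```

For example, `enough_calls_left(fetch_rate_limit(), len(names))` could be checked before calling `check_project_repos`; if it is false, `reset_at` shows when to try again.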

**This is not a substitute for actually looking at repos. Someone could make many small changes to a README that all count as commits and still be struggling with the technologies.**

In [None]:
import requests
import pandas as pd
from datetime import datetime


def get_repo_attributes(user_org, repo, student_team_name, repo_attribute = "events", return_response = False, detailed = False, instructors = ['raom1', 'JohnBorthick', 'MaryLvV', 'cattfield']):
    """Return (latest event time per student, event counts per student) for one repo.

    `return_response` and `detailed` are currently unused.
    """
    res = requests.get("https://api.github.com/repos/{}/{}/{}".format(user_org, repo, repo_attribute)).json()
    # On an error (missing repo, rate limit) the API returns a dict with a 'message'
    # key instead of a list of events, so report it and treat the repo as empty.
    if not isinstance(res, list):
        print(res)
        res = []
    student_event_counts = {}
    latest_push = {}
    for e in res:
        login = e['actor']['login']
        if login in instructors:
            continue
        # Count events by type for each student.
        counts = student_event_counts.setdefault(login, {})
        counts[e['type']] = counts.get(e['type'], 0) + 1
        # Track the most recent event of any type (not only pushes).
        created_at = pd.to_datetime(e['created_at'])
        if login not in latest_push or latest_push[login] < created_at:
            latest_push[login] = created_at
    if len(student_event_counts) == 0:
        # No student activity: report zero pushes under the student or team name.
        student_event_counts = {student_team_name: {'PushEvent':0}}
        return latest_push, pd.DataFrame(student_event_counts).T
    return pd.Series(latest_push).dt.tz_convert('US/Central'), pd.DataFrame(student_event_counts).T
    

def check_project_repos(user_org, proj_name, student_team_names, group_proj = False, instructors = ['raom1', 'JohnBorthick', 'MaryLvV', 'cattfield']):
    """Print and plot event counts and latest activity for each student or group repo."""
    latest_push_time_full = []
    event_counts_full = []
    for name in student_team_names:
        # Repos are named '<proj_name>-<name>'. Pass the name itself rather than
        # re-splitting the repo name, which would truncate names like 'candy-corn'.
        repo = proj_name + '-' + name
        print(repo)
        latest_push_time, event_counts_df = get_repo_attributes(user_org, repo, name, instructors = instructors)
        if not group_proj:
            latest_push_time_full.append(latest_push_time)
            event_counts_full.append(event_counts_df)
        else:
            if len(event_counts_df) > 0:
                print(event_counts_df)
                event_counts_df.plot(kind = 'bar', title = repo);
                print(latest_push_time)
            else:
                print('NO COMMITS!!')
        print('-'*20)
        print('\n')
    if not group_proj:
        pd.concat(event_counts_full).plot(kind = 'bar', figsize = (20, 10));
        print(pd.Series(latest_push_time_full))

In [None]:
github_org_name = 'NSS-Full-Time-Data-Analytics-3'

instructors = ['raom1', 'ChristopherNWright', 'MaryLvV', 'cattfield']

group_proj = False

proj_name = 'web-scraping-marathons'

names = ['AnnikaBock65',
         'NashvilleBrandon',
         'BrandonM471998',
         'CalebAutry',
         'CelineIT',
         'chipnesss',
         'DenisProkhoda',
         'EASENFT',
         'ehoughton',
         'ripplesphere',
         'jmcmillan4506',
         'jeffreyreeve',
         'jenwhitson',
         'jomccall',
         'JCrippen01',
         'KatieSylvester',
         'kellyrwilliams',
         'vitalmaggi',
         'SNClevTN',
         'RobKirk3',
         'wsbetts4',
         'thay615',
         'TobiasHollander',
         'VGooch']

# names = [
#     'snickers',
#     'butterfinger',
#     'twix',
#     'reese-s',
#     'candy-corn',
#     'almond-joy'
# ]

In [None]:
check_project_repos(github_org_name,
                    proj_name,
                    names,
                    group_proj = group_proj,
                    instructors = instructors)