# Microtask 4
> Produce a listing of repositories, as a table and as CSV file, with the number of commits authored, issues opened, and pull/merge requests opened, during the last three months, ordered by the total number (commits plus issues plus pull requests). Use plain Python3 (eg, no Pandas) for this.

I am using the data source files of five FOSSASIA repositories:
- [badgeyay](https://github.com/fossasia/badgeyay) 
- [open-event-server](https://github.com/fossasia/open-event-server) 
- [phimpme-android](https://github.com/fossasia/phimpme-android) 
- [susi_android](https://github.com/fossasia/susi_android) 
- [susi_server](https://github.com/fossasia/susi_server) 

All the data source files are located in the `data/` folder of the repository.
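
As a quick sanity check before the analysis, the sketch below counts how many items of each category a data source holds. It assumes, as the loading code in this notebook does, that every line is one JSON document with a `category` field. The helper name `count_categories` and the sample lines are made up for illustration and are not part of the original notebook.

```python
import json
from collections import Counter


def count_categories(lines):
    """Count items per category in an iterable of JSON lines."""
    counts = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            # tolerate blank lines between documents
            continue
        item = json.loads(raw)
        counts[item.get("category", "unknown")] += 1
    return counts


# Hypothetical sample lines; a real file would be opened and passed instead,
# e.g. `with open("../data/badgeyay.json") as f: count_categories(f)`.
sample = [
    '{"category": "commit", "origin": "https://github.com/fossasia/badgeyay", "data": {}}',
    '{"category": "issue", "origin": "https://github.com/fossasia/badgeyay", "data": {}}',
    '{"category": "commit", "origin": "https://github.com/fossasia/badgeyay", "data": {}}',
]
print(dict(count_categories(sample)))
```

A file object can be passed directly because iterating over it yields one line at a time.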

In [1]:
# If you run into dependency errors while running this notebook on mybinder,
# uncomment the lines below.

#!pip install prettytable
#!pip install pandas
#!pip install perceval
#!pip install regex
#!pip install matplotlib

# Retrieving the data

You can also retrieve the data source files from the Jupyter notebook itself. Provide your `github_token` (a GitHub personal access token), then uncomment and run the code in the cell below.

In [2]:
# Please enter your github token here
github_token = ""
owner = "fossasia"
repos = ["badgeyay", "open-event-server","phimpme-android","susi_android","susi_server"]
repos_url = ["https://github.com/" + owner + "/" + repo for repo in repos]
# files in which Perceval stores the retrieved data (one per repository)
files = [repo+".json" for repo in repos] 
ctypes = ('commit','issue','pull_request')

#for repo, repo_url, file in zip(repos, repos_url, files):
#    print(repo, repo_url, file)
#    !perceval git --json-line $repo_url >> ../data/$file
#    !perceval github --json-line --sleep-for-rate -t $github_token --category pull_request $owner $repo >> ../data/$file
#    !perceval github --json-line --sleep-for-rate -t $github_token --category issue $owner $repo >> ../data/$file

In [3]:
import csv
import json
import regex as re

from collections import defaultdict
from datetime import datetime, date, timedelta
from prettytable import from_csv

In [4]:
class Contents_Repository:

    def __init__(self, path):
        """
        Load the contents of the repository.

        Reads the data retrieved by Perceval (one JSON document per line)
        and stores a summary of every commit, issue and pull request.

        :param path: path to the data source file of the repository
        """
        self.repodata = defaultdict(list)
        self.contents = defaultdict(list)

        # Filter the commit, issue and pull request details out of the data
        # source and store them separately in a dict, keyed by category.
        with open(path) as datasrc:
            for line in datasrc:
                item = json.loads(line)
                if item['category'] == 'commit':
                    content = self.summary_commit(item)
                elif item['category'] == 'issue':
                    content = self.summary_issue(item)
                elif item['category'] == 'pull_request':
                    content = self.summary_pr(item)
                else:
                    # skip any category this notebook does not summarize
                    continue
                self.contents[item['category']].append(content)
    
    def summary_commit(self, commit):
        """
        Get the contents of the commit.

        This method takes the JSON data of one line
        and returns the summary of the commit.

        :param commit: JSON data of the commit (one line of the data source)
        :return: summary of the commit
        """
        repo = commit['origin']
        data = commit['data']
        summary = {
            'repo': repo,
            'hash': data['commit'],
            'author': data['Author'],
            'created_date': datetime.strptime(data['CommitDate'],
                                              "%a %b %d %H:%M:%S %Y %z")
        }
        return summary

    def summary_issue(self, issue):
        """
        Get the contents of the issue.

        This method takes the JSON data of one line
        and returns the summary of the issue.

        :param issue: JSON data of the issue (one line of the data source)
        :return: summary of the issue
        """
        repo = issue['origin']
        data = issue['data']
        summary = {
            'repo': repo,
            'hash': data['id'],
            'author': data['user']['login'],
            'created_date': datetime.strptime(data['created_at'],
                                              "%Y-%m-%dT%H:%M:%SZ")
        }
        return summary

    def summary_pr(self, pr):
        """
        Get the contents of the pull request.

        This method takes the JSON data of one line
        and returns the summary of the pull request.

        :param pr: JSON data of the pull request (one line of the data source)
        :return: summary of the pull request
        """
        repo = pr['origin']
        data = pr['data']
        summary = {
            'repo': repo,
            'hash': data['id'],
            'author': data['user']['login'],
            'created_date': datetime.strptime(data['created_at'],
                                              "%Y-%m-%dT%H:%M:%SZ")
        }
        return summary
    
    def repo_name(self):
        """Return the repository name: the last segment of the origin URL."""
        repourl = self.contents['commit'][0]['repo']
        return re.split('/', repourl)[-1]

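    def get_data_3mon(self):
        """
        Count the commits, issues and pull requests of the last three months.

        :return: dict with the repository name, one count per category and the total
        """
        repodata = defaultdict(list)
        # Start of the window: midnight, three average months (3 * 365 / 12 days) ago.
        # REFERENCE: Stack Overflow https://stackoverflow.com/a/546356/8268998
        initial_date = datetime.combine(date.today() - timedelta(3*365/12),
                                        datetime.min.time())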
        
        repodata['repo'].append(self.repo_name())
        repocontents = self.contents
        total = 0
        for ctype in ctypes:
            count = 0
            for item in repocontents[ctype]:
                if item['created_date'].replace(tzinfo=None) >= initial_date:
                    count += 1
            repodata[ctype].append(count)
            total += count
        repodata['total'].append(total)
        self.repodata = repodata
        return repodata

    def show_data_3mon(self):
        print("Repositories Details in the past three months\n")
        for ctype, counts in self.repodata.items():
            print(ctype, counts)
        print("\n")
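
The three-month filter above compares naive datetimes after dropping the timezone of each item. As a variation, and not the method used in this notebook, the sketch below parses both date formats seen in the data (git's `CommitDate` and GitHub's `created_at`) into UTC and counts the dates that fall inside a window. The names `parse_git_date`, `parse_github_date` and `count_since` are hypothetical, and the sample dates are made up.

```python
from datetime import datetime, timedelta, timezone

GIT_FORMAT = "%a %b %d %H:%M:%S %Y %z"   # e.g. Tue Jul 31 23:30:00 2018 -0300
GITHUB_FORMAT = "%Y-%m-%dT%H:%M:%SZ"     # e.g. 2018-07-01T10:00:00Z


def parse_git_date(text):
    """Parse a git date string and convert it to UTC."""
    return datetime.strptime(text, GIT_FORMAT).astimezone(timezone.utc)


def parse_github_date(text):
    """Parse a GitHub API timestamp, which is already expressed in UTC."""
    return datetime.strptime(text, GITHUB_FORMAT).replace(tzinfo=timezone.utc)


def count_since(dates, now, days=3 * 365 / 12):
    """Count the timezone-aware dates that are at most `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return sum(1 for d in dates if d >= cutoff)


# Made-up dates for illustration; `now` is fixed so the result is reproducible.
now = datetime(2018, 8, 1, tzinfo=timezone.utc)
dates = [
    parse_git_date("Tue Jul 31 23:30:00 2018 -0300"),
    parse_github_date("2018-07-01T10:00:00Z"),
    parse_github_date("2018-01-01T00:00:00Z"),
]
print(count_since(dates, now))
```

Passing `now` explicitly instead of calling `date.today()` inside the function keeps the count reproducible when the notebook is rerun later.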

In [5]:
class Display_Repositories:
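    def __init__(self):
        """
        Create the results CSV file and write its header row.
        """
        self.contents = ()
        # The column order must match the tuple built in update_repo():
        # repository, commits, issues, pull requests, total.
        header = ['Repository', '# Commits', '# Issues', '# PullRequests', '# Total']
        with open('results-%s.csv'%owner, 'w') as file:
            writer = csv.writer(file)
            writer.writerow(header)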

    def update_repo(self, repodata):
        self.contents = (repodata['repo'][0],repodata['commit'][0],
                    repodata['issue'][0],repodata['pull_request'][0],repodata['total'][0])
        
    def as_csv(self):
        with open('results-%s.csv'%owner, 'a') as file:
            writer = csv.writer(file)
            # append the counts of the current repository as a single row
            writer.writerow(self.contents)        
        
    def as_table(self):
        with open("results-%s.csv"%owner, "r") as csvfile: 
            csvtable = from_csv(csvfile)
        print(csvtable)
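
The task statement asks for the listing to be ordered by the total number, while the class above writes rows in the order the repositories are processed. The sketch below shows one way to sort the rows with plain Python before writing the CSV. The helper names `sorted_rows` and `write_csv` are hypothetical, descending order is an assumption (the task does not state a direction), and the figures are taken from the output printed further down in this notebook.

```python
import csv
import io

HEADER = ["Repository", "# Commits", "# Issues", "# PullRequests", "# Total"]


def sorted_rows(counts):
    """Build one row per repository, ordered by total (largest first)."""
    rows = []
    for repo, c in counts.items():
        total = c["commit"] + c["issue"] + c["pull_request"]
        rows.append((repo, c["commit"], c["issue"], c["pull_request"], total))
    return sorted(rows, key=lambda row: row[-1], reverse=True)


def write_csv(rows, handle):
    """Write the header and all rows to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(HEADER)
    writer.writerows(rows)


counts = {
    "badgeyay": {"commit": 103, "issue": 139, "pull_request": 65},
    "phimpme-android": {"commit": 157, "issue": 353, "pull_request": 160},
    "susi_server": {"commit": 36, "issue": 30, "pull_request": 16},
}

# An in-memory buffer stands in for the results file in this sketch.
buffer = io.StringIO()
write_csv(sorted_rows(counts), buffer)
print(buffer.getvalue())
```

Collecting all rows first and writing the file once also avoids reopening the CSV in append mode for every repository.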

In [6]:
details = Display_Repositories()

for repo in repos:
    repo_obj = Contents_Repository("../data/%s.json"%repo)
    repo_obj.get_data_3mon()
    repo_obj.show_data_3mon()

    details.update_repo(repo_obj.get_data_3mon())
    details.as_csv()
    
details.as_table()

Repositories Details in the past three months

repo ['badgeyay']
commit [103]
issue [139]
pull_request [65]
total [307]


Repositories Details in the past three months

repo ['open-event-server']
commit [299]
issue [228]
pull_request [121]
total [648]


Repositories Details in the past three months

repo ['phimpme-android']
commit [157]
issue [353]
pull_request [160]
total [670]


Repositories Details in the past three months

repo ['susi_android']
commit [78]
issue [262]
pull_request [140]
total [480]


Repositories Details in the past three months

repo ['susi_server']
commit [36]
issue [30]
pull_request [16]
total [82]


+-------------------+-----------+----------+----------------+---------+
|     Repository    | # Commits | # Issues | # PullRequests | # Total |
+-------------------+-----------+----------+----------------+---------+
|      badgeyay     |    103    |   139    |       65       |   307   |
| open-event-server |    299    |   228    |      121       |   648   |
|  phimp