# Microtask4
> Produce a listing of repositories, as a table and as CSV file, with the number of commits authored, issues opened, and pull/merge requests opened, during the last three months, ordered by the total number (commits plus issues plus pull requests). Use plain Python3 (eg, no Pandas) for this.

In [None]:
from access_token import ACCESS_TOKEN

In [1]:
from datetime import date, datetime, timedelta
from collections import defaultdict
import json
import csv
from tabulate import tabulate

### Getting the date 3 months ago from today
Using `date` and `timedelta` from Python's [datetime](https://docs.python.org/3/library/datetime.html) module. Three months is approximated as `3*365/12` days.

In [2]:
initial_date =  date.today() - timedelta(3*365/12) # 3 months ago from today's date
initial_date = datetime.combine(initial_date, datetime.min.time())   # convert datetime.date to datetime.datetime
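
As a quick sanity check, the same arithmetic can be run against a fixed date instead of `date.today()`. This is only a sketch: the helper name `cutoff_from` and the sample date are illustrative assumptions, not part of the notebook.

```python
from datetime import date, datetime, timedelta


def cutoff_from(today):
    """Return midnight of the day roughly three months before `today`.

    `cutoff_from` is a hypothetical helper used only for this illustration.
    """
    approx_three_months = timedelta(3 * 365 / 12)  # 91.25 days
    # date - timedelta only uses whole days, so the quarter day is dropped
    start_day = today - approx_three_months
    # combine with 00:00:00 to get a datetime.datetime that can be compared
    # with parsed timestamps
    return datetime.combine(start_day, datetime.min.time())


cutoff = cutoff_from(date(2019, 3, 31))  # sample date, chosen arbitrarily
print(cutoff)
```

Because the window is a fixed number of days, it does not always line up with calendar months; that is usually acceptable for an activity summary like this one.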

### The repos analysed 
The repos used are [Cloud-CV/Fabrik](https://github.com/Cloud-CV/Fabrik) and [Cloud-CV/Origami](https://github.com/Cloud-CV/Origami)<br>
<br>
The next cell also defines:
- `data`: a dict with *repo* as keys.
- The **order** of each data[*repo*] list is **[commits, issues (only), pull_requests]**

In [3]:
repos = ['Origami', 'Fabrik']
data = defaultdict(list)
for repo in repos:
    data[repo] = [0,0,0]                 #the order for the list is [commit, issues(only), pull_request]
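
One possible variation, shown here only as a sketch, is to give the three list positions names so that the counting code does not rely on bare indices. The constant names `COMMITS`, `ISSUES` and `PULL_REQUESTS`, and the helper `empty_counts`, are assumptions made for this example and are not used elsewhere in the notebook.

```python
from collections import defaultdict

# Hypothetical names for the three positions of each per-repo list
COMMITS, ISSUES, PULL_REQUESTS = range(3)


def empty_counts():
    """Default value for a repository that has not been seen yet."""
    return [0, 0, 0]


counts = defaultdict(empty_counts)

# Repositories are created on first access, already zero-filled
counts['Origami'][ISSUES] += 1
counts['Origami'][PULL_REQUESTS] += 2
fabrik_counts = counts['Fabrik']

print(dict(counts))
```

With a default factory like this, the explicit initialisation loop becomes optional, at the cost of silently creating entries for misspelled repository names.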

### Using Perceval from the command line - Retrieving data for the repositories and storing it as JSON
(The output was removed from here after the .json files were produced, and the cell was converted to markdown so the notebook doesn't look cluttered.)

In [1]:
#owner = "Cloud-CV"
#for repo in repos:
#    outfile = repo.lower() + '_info.json'
#    url = "https://github.com/" + owner + "/" + repo
#    !perceval git --json-line $url > $outfile
#    !perceval github -t $ACCESS_TOKEN --json-line --sleep-for-rate --category issue $owner $repo >> $outfile
#    !perceval github -t $ACCESS_TOKEN --json-line --sleep-for-rate --category pull_request $owner $repo >> $outfile
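
Outside a notebook, where the `!` shell syntax is not available, the same three commands could be assembled as argument lists. The sketch below only builds and prints them, using exactly the flags from the cell above; the function name `build_perceval_commands` and the placeholder token are assumptions for illustration.

```python
def build_perceval_commands(owner, repo, token):
    """Return the three perceval invocations for one repository as argv lists.

    Hypothetical helper: it mirrors the commented-out cell and runs nothing.
    """
    url = "https://github.com/" + owner + "/" + repo
    commands = [["perceval", "git", "--json-line", url]]
    for category in ("issue", "pull_request"):
        commands.append([
            "perceval", "github", "-t", token,
            "--json-line", "--sleep-for-rate",
            "--category", category, owner, repo,
        ])
    return commands


# "<token>" is a placeholder, not a real access token
for command in build_perceval_commands("Cloud-CV", "Fabrik", "<token>"):
    print(" ".join(command))
```

Each list could then be handed to `subprocess.run` with its standard output appended to the repository's `_info.json` file, which is what the `>` and `>>` redirections do in the cell.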

### Function to read the required data from the JSON file and fill the data dict

For each category, if "AuthorDate" (for commits) or "created_at" (for issues and pull requests) > initial_date : <br>
- increment **data[repo][0] for commits**, **data[repo][1] for issues** and **data[repo][2] for pull_request**

In [4]:
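def get_count(path, repo):
    """Count the commits, issues and pull requests in `path` created after initial_date."""
    with open(path) as infile:
        for raw in infile:
            item = json.loads(raw)   # one Perceval item per line (--json-line)
            if item['category'] == 'commit':
                # AuthorDate carries a UTC offset: shift it to UTC, then drop tzinfo
                # so it can be compared with the naive initial_date
                created = datetime.strptime(item['data']['AuthorDate'], "%a %b %d %H:%M:%S %Y %z")
                created = (created - created.utcoffset()).replace(tzinfo=None)
                if created > initial_date:
                    data[repo][0] += 1

            elif item['category'] == 'issue' and "pull_request" not in item['data']:
                # the GitHub issues API also returns pull requests; skip those here
                created = datetime.strptime(item['data']['created_at'], "%Y-%m-%dT%H:%M:%SZ")
                if created > initial_date:
                    data[repo][1] += 1

            elif item['category'] == 'pull_request':
                created = datetime.strptime(item['data']['created_at'], "%Y-%m-%dT%H:%M:%SZ")
                if created > initial_date:
                    data[repo][2] += 1

    # append the total (commits + issues + pull requests) as the fourth element
    data[repo].append(sum(data[repo]))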
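
The counting rule can be tried out without any Perceval output by feeding it a few hand-made records. Everything in this sketch is illustrative: the function `count_recent`, the sample lines and the cutoff are assumptions, and the records contain only the fields the rule inspects. The sketch also converts commit timestamps to UTC before comparing them, which is a choice made for this example.

```python
import json
from datetime import datetime


def count_recent(lines, cutoff):
    """Return [commits, issues, pull_requests] created after `cutoff` (naive UTC)."""
    counts = [0, 0, 0]
    for raw in lines:
        item = json.loads(raw)
        category, payload = item['category'], item['data']
        if category == 'commit':
            created = datetime.strptime(payload['AuthorDate'], "%a %b %d %H:%M:%S %Y %z")
            # normalise to UTC and drop tzinfo so all dates are comparable
            created = (created - created.utcoffset()).replace(tzinfo=None)
            index = 0
        elif category == 'issue' and 'pull_request' not in payload:
            created = datetime.strptime(payload['created_at'], "%Y-%m-%dT%H:%M:%SZ")
            index = 1
        elif category == 'pull_request':
            created = datetime.strptime(payload['created_at'], "%Y-%m-%dT%H:%M:%SZ")
            index = 2
        else:
            continue  # e.g. an 'issue' item that is really a pull request
        if created > cutoff:
            counts[index] += 1
    return counts


# Hand-made sample records, not real repository data
sample_lines = [
    '{"category": "commit", "data": {"AuthorDate": "Tue Jan 15 10:30:00 2019 +0530"}}',
    '{"category": "commit", "data": {"AuthorDate": "Mon Oct 01 10:00:00 2018 +0000"}}',
    '{"category": "issue", "data": {"created_at": "2019-02-01T12:00:00Z"}}',
    '{"category": "issue", "data": {"created_at": "2019-02-02T08:00:00Z", "pull_request": {}}}',
    '{"category": "pull_request", "data": {"created_at": "2019-02-02T08:00:00Z"}}',
]

result = count_recent(sample_lines, datetime(2019, 1, 1))
print(result)
```

Returning a fresh list instead of mutating a global dict makes the function safe to call more than once for the same repository.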

### Calling get_count to populate the data dictionary

In [5]:
for repo in repos:
    path = repo.lower() + '_info.json'
    get_count(path, repo)        

## Writing and reading from csv
- Makes use of the [csv](https://docs.python.org/3/library/csv.html) python module - csv.writer and csv.reader.

In [6]:
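header = ['Repository', '#CommitsAuthored', '#IssuesOpened', '#PullRequestsOpened', 'TotalActivity']
# order the repositories by total activity (index 3), highest first
ordered_repos = sorted(repos, key=lambda repo: data[repo][3], reverse=True)
# newline='' is what the csv module recommends, to avoid blank rows on some platforms
with open("./cloudcv_past_3_months.csv", 'w', newline='') as writefile:
    writer = csv.writer(writefile)
    writer.writerow(header)
    
    for repo in ordered_repos:
        writer.writerow([repo, data[repo][0], data[repo][1], data[repo][2], data[repo][3]])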
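
The write-then-read round trip can be sketched without touching the disk by using an in-memory buffer. The counts below are the ones reported in this notebook's output table; the variable names and the use of `io.StringIO` are assumptions for this example. The rows are sorted by total before writing, as the task statement asks.

```python
import csv
import io

# [commits, issues, pull_requests] per repository, taken from the table in this notebook
counts = {'Fabrik': [0, 3, 3], 'Origami': [0, 4, 9]}
header = ['Repository', '#CommitsAuthored', '#IssuesOpened', '#PullRequestsOpened', 'TotalActivity']

buffer = io.StringIO()  # stands in for the CSV file
writer = csv.writer(buffer)
writer.writerow(header)

# highest total activity first
ordered = sorted(counts.items(), key=lambda pair: sum(pair[1]), reverse=True)
for repo, (commits, issues, pulls) in ordered:
    writer.writerow([repo, commits, issues, pulls, commits + issues + pulls])

buffer.seek(0)  # rewind before reading back
rows = list(csv.reader(buffer))
for row in rows:
    print(row)
```

Note that `csv.reader` returns every field as a string, so the numbers would need `int()` again before any further arithmetic.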

### Printing it as a table
We can use the [tabulate](https://pypi.org/project/tabulate/) module.


In [7]:
with open('./cloudcv_past_3_months.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    print(tabulate(reader))

----------  ----------------  -------------  -------------------  -------------
Repository  #CommitsAuthored  #IssuesOpened  #PullRequestsOpened  TotalActivity
Origami     0                 4              9                    13
Fabrik      0                 3              3                    6
----------  ----------------  -------------  -------------------  -------------
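
Since the task asks for plain Python 3 and `tabulate` is a third-party package, a table can also be laid out with the standard library alone. The following is a minimal sketch, not a replacement for `tabulate`; the function name `format_table` is an assumption, and it simply pads each column to its widest cell.

```python
def format_table(rows):
    """Left-align every column to the width of its longest cell."""
    # zip(*rows) transposes the rows into columns
    widths = [max(len(str(cell)) for cell in column) for column in zip(*rows)]
    lines = []
    for row in rows:
        padded = (str(cell).ljust(width) for cell, width in zip(row, widths))
        lines.append("  ".join(padded).rstrip())
    return "\n".join(lines)


table = format_table([
    ['Repository', 'TotalActivity'],
    ['Origami', '13'],
    ['Fabrik', '6'],
])
print(table)
```

The rows can come straight from `csv.reader`, as long as every row has the same number of fields.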
