# Microtask5
>Like Microtask 4, but now using [Pandas](http://pandas.pydata.org/).
>>Microtask 4: Produce a listing of repositories, as a table and as CSV file, with the number of commits authored, issues opened, and pull/merge requests opened, during the last three months, ordered by the total number (commits plus issues plus pull requests). Use plain Python3 (eg, no Pandas) for this.

In [1]:
import pandas as pd
import json
from tabulate import tabulate
from datetime import date, timedelta, datetime

### Getting the date 3 months ago from today
Using `date` and `timedelta` from Python's [datetime](https://docs.python.org/3/library/datetime.html) module.

In [2]:
initial_date = date.today() - timedelta(3*365/12)  # roughly 3 months (about 91 days) before today
initial_date = datetime.combine(initial_date, datetime.min.time())  # convert datetime.date to datetime.datetime at midnight
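
The cell above approximates three months as a fixed number of days. As a minimal sketch, assuming calendar-month arithmetic is preferred instead, pandas' `DateOffset` can compute the cutoff. The helper name `three_months_before` is hypothetical and not part of the original notebook.

```python
import pandas as pd

def three_months_before(reference):
    """Return midnight of the day three calendar months before `reference`."""
    # normalize() resets the time to 00:00:00; DateOffset accounts for
    # differing month lengths (e.g. May 31 maps to Feb 28 in a non-leap year)
    return pd.Timestamp(reference).normalize() - pd.DateOffset(months=3)

# Cutoff relative to the current date
cutoff = three_months_before(pd.Timestamp.today())
```

A `pd.Timestamp` compares directly with `datetime.datetime` values, so `cutoff` could stand in for `initial_date` in the comparisons that follow.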

### Function to analyse and extract the count for 3 categories for the past three months

In [5]:
def analysis_last3months(repo, content):
    # Each line of <repo>_info.json holds one JSON item with a 'category' and its 'data'
    file = repo.lower() + '_info.json'
    commits = 0
    issues = 0
    prs = 0
    with open(file) as f:
        for line in f:
            item = json.loads(line)
            if item['category'] == 'commit':
                # The UTC offset is parsed and then dropped so the value compares with initial_date
                created = datetime.strptime(item['data']['AuthorDate'], "%a %b %d %H:%M:%S %Y %z").replace(tzinfo=None)
                if created >= initial_date:
                    commits += 1

            elif item['category'] == 'issue' and "pull_request" not in item['data']:
                # Issue items carrying a 'pull_request' key are pull requests, so they are skipped here
                created = datetime.strptime(item['data']['created_at'], "%Y-%m-%dT%H:%M:%SZ")
                if created >= initial_date:
                    issues += 1

            elif item['category'] == 'pull_request':
                created = datetime.strptime(item['data']['created_at'], "%Y-%m-%dT%H:%M:%SZ")
                if created >= initial_date:
                    prs += 1

    total = commits + issues + prs
    summary = {"Repository_name": repo, "#CommitsAuthored": commits, '#IssuesOpened': issues, '#PullRequestsOpened': prs, 'TotalActivity': total}
    # Append this repository's summary row to the list passed in by the caller
    content.append(summary)
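
The function above counts with plain Python loops. As a rough sketch of how the filtering and counting step could be handed to pandas instead, the version below collects one row per item and uses boolean indexing plus `value_counts`. The name `count_recent_activity`, the `kind`/`date` column names, and the in-memory `items` list are assumptions made for illustration; the date parsing mirrors the format strings used above.

```python
from datetime import datetime
import pandas as pd

def count_recent_activity(items, cutoff):
    """Count commits, issues and pull requests dated on or after `cutoff`.

    `items` is a list of dicts shaped like the JSON lines read above:
    {'category': ..., 'data': {...}}.
    """
    rows = []
    for item in items:
        data = item['data']
        if item['category'] == 'commit':
            # Same parsing as above: the UTC offset is read, then dropped
            stamp = datetime.strptime(data['AuthorDate'], "%a %b %d %H:%M:%S %Y %z").replace(tzinfo=None)
            rows.append({'kind': 'commit', 'date': stamp})
        elif item['category'] == 'issue' and 'pull_request' not in data:
            stamp = datetime.strptime(data['created_at'], "%Y-%m-%dT%H:%M:%SZ")
            rows.append({'kind': 'issue', 'date': stamp})
        elif item['category'] == 'pull_request':
            stamp = datetime.strptime(data['created_at'], "%Y-%m-%dT%H:%M:%SZ")
            rows.append({'kind': 'pull_request', 'date': stamp})

    frame = pd.DataFrame(rows, columns=['kind', 'date'])
    # Ensure a datetime dtype even when no rows were collected
    frame['date'] = pd.to_datetime(frame['date'])
    recent = frame[frame['date'] >= cutoff]
    # reindex fills in a zero for any kind that has no recent rows
    counts = recent['kind'].value_counts().reindex(['commit', 'issue', 'pull_request'], fill_value=0)
    return {kind: int(n) for kind, n in counts.items()}
```

Calling `count_recent_activity(items, initial_date)` would return a dict keyed by `'commit'`, `'issue'` and `'pull_request'`, from which a summary row like the one above could be built.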

### Defining the DataFrame and calling `analysis_last3months` to populate it
The pandas DataFrame object is documented [here](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html).<br>
The repos used are [Cloud-CV/Fabrik](https://github.com/Cloud-CV/Fabrik) and [Cloud-CV/Origami](https://github.com/Cloud-CV/Origami)

In [9]:
repos = ['Fabrik', 'Origami']
columns = ["Repository_name", "#CommitsAuthored", '#IssuesOpened', '#PullRequestsOpened', 'TotalActivity']
content = []
for repo in repos:
    analysis_last3months(repo, content)
# DataFrame.append was removed in pandas 2.0, so build the frame directly from the collected rows
df = pd.DataFrame(content, columns=columns)
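
The task statement asks for the listing to be ordered by total activity, while the cell above keeps the order of the `repos` list. A minimal sketch of an explicit ordering step with `sort_values` follows; the rows are placeholder values rather than real results, and descending order is an assumption.

```python
import pandas as pd

# Placeholder rows for illustration only; real values come from analysis_last3months
sample = pd.DataFrame([
    {"Repository_name": "repo_a", "TotalActivity": 2},
    {"Repository_name": "repo_b", "TotalActivity": 7},
])

# Most active repository first; reset_index renumbers the rows after sorting
ranked = sample.sort_values("TotalActivity", ascending=False).reset_index(drop=True)
```

The same call applied to `df` would order the real summary before it is written out.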

### Writing the DataFrame to CSV, and printing it as a table
We can use the [tabulate](https://pypi.org/project/tabulate/) module.

In [10]:
# Write the DataFrame to a CSV file, leaving out the index column
df.to_csv("./cloudcv_last3months_pandas.csv", index=False)

# Print the DataFrame as a table
print(tabulate(df, headers='keys', tablefmt='psql'))

+----+-------------------+--------------------+-----------------+-----------------------+-----------------+
|    | Repository_name   |   #CommitsAuthored |   #IssuesOpened |   #PullRequestsOpened |   TotalActivity |
|----+-------------------+--------------------+-----------------+-----------------------+-----------------|
|  0 | Fabrik            |                  0 |               3 |                     3 |               6 |
|  1 | Origami           |                  0 |               4 |                     9 |              13 |
+----+-------------------+--------------------+-----------------+-----------------------+-----------------+
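
To check that the CSV holds what was intended, it can be read back with `pd.read_csv`. The sketch below round-trips a placeholder frame through an in-memory buffer instead of a file, so it runs without the notebook's data; the row values are illustrative only.

```python
import io
import pandas as pd

# Placeholder row for illustration only
sample = pd.DataFrame([{"Repository_name": "repo_a", "TotalActivity": 2}])

buffer = io.StringIO()
sample.to_csv(buffer, index=False)  # index=False keeps the row index out of the CSV
buffer.seek(0)
restored = pd.read_csv(buffer)
```

With `index=False`, the restored frame has exactly the original columns and no extra unnamed index column.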
