# Microtask 2
Create a Python script to execute Perceval via its Python interface using the Git and GitHub backends. Feel free to select any target repository.

Target repo selected: `https://github.com/hypertrons/hypertrons`.

## Git Backend

In [1]:
from perceval.backends.core.git import Git
from pprint import pprint

In [2]:
# URL of the Git repo to analyze
REPO_URL = "https://github.com/hypertrons/hypertrons.git"
# local directory where Perceval clones the Git repo
REPO_DIR = "./tmp/hypertrons"

In [6]:
# create a Git object pointing to REPO_URL, using REPO_DIR for the local clone
repo = Git(uri=REPO_URL, gitpath=REPO_DIR)
# fetch all commits; fetch() returns an iterator, so materialize it as a list
commits = repo.fetch()
commits_list = list(commits)
# count the commits
commits_amount = len(commits_list)
print("Number of commits: ", commits_amount)

Number of commits:  170


In [4]:
# print the last commit
last_commit = commits_list[-1]
pprint(last_commit)

{'backend_name': 'Git',
 'backend_version': '0.12.0',
 'category': 'commit',
 'classified_fields_filtered': None,
 'data': {'Author': 'GoodMeowing '
                    '<36814673+GoodMeowing@users.noreply.github.com>',
          'AuthorDate': 'Thu Mar 26 12:10:01 2020 +0800',
          'Commit': 'GitHub <noreply@github.com>',
          'CommitDate': 'Thu Mar 26 04:10:01 2020 +0000',
          'Signed-off-by': ['pantang <729618421@qq.com>'],
          'commit': '06af85499c84494412bc4033a1272b6c994fe3a4',
          'files': [{'action': 'M',
                     'added': '1',
                     'file': 'app/plugin/gitee/gitee-app.ts',
                     'indexes': ['f727536...', '6bf5b30...'],
                     'modes': ['100644', '100644'],
                     'removed': '1'},
                    {'action': 'C056',
                     'added': '10',
                     'file': 'app/plugin/gitee/gitee-raw-client/gitee-raw-client.ts',
                     'indexes': ['a1ccfe6...
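
The `AuthorDate` and `CommitDate` fields in the output above are plain strings in Git's default date layout. As a minimal sketch (the helper name `parse_git_date` is hypothetical and not part of Perceval, and the format string assumes an English locale for day and month names), they can be parsed into timezone-aware `datetime` objects with the standard library:

```python
from datetime import datetime, timezone

# Layout of the date strings shown above, e.g. "Thu Mar 26 12:10:01 2020 +0800"
GIT_DATE_FORMAT = "%a %b %d %H:%M:%S %Y %z"


def parse_git_date(value):
    """Parse a Git date string into a timezone-aware datetime (hypothetical helper)."""
    return datetime.strptime(value, GIT_DATE_FORMAT)


# Values copied from the commit printed above
author_date = parse_git_date("Thu Mar 26 12:10:01 2020 +0800")
commit_date = parse_git_date("Thu Mar 26 04:10:01 2020 +0000")

# Normalize to UTC so dates recorded in different time zones are comparable
print(author_date.astimezone(timezone.utc).isoformat())
# Aware datetimes compare by instant, not by wall-clock text
print(author_date == commit_date)
```

For this commit both fields describe the same instant, recorded in two different time zones.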

In [7]:
# print CommitDate, Author, and commit message of the last 10 commits
for commit in commits_list[-10:]:
    data = commit["data"]
    print("COMMIT DATE: {}\nAUTHOR: {}\nCOMMIT MESSAGE: {}".format(
        data["CommitDate"], data["Author"], data["message"]))
    print()

COMMIT DATE: Tue Feb 25 03:35:02 2020 +0000
AUTHOR: Frank Zhao <syzhao1988@126.com>
COMMIT MESSAGE: docs: add ci badge into README (#273)

Signed-off-by: Frankzhaopku <syzhao1988@126.com>

COMMIT DATE: Tue Feb 25 14:15:01 2020 +0000
AUTHOR: WSL <wsl6@outlook.com>
COMMIT MESSAGE: refactor: fix configuration update bug, fix add repo bug (#276)

Signed-off-by: WuShaoling <wsl6@outlook.com>

COMMIT DATE: Wed Feb 26 03:00:02 2020 +0000
AUTHOR: heming <lhming23@outlook.com>
COMMIT MESSAGE: docs: improve contributing guide (#268) (#278)

Signed-off-by: LinHaiming <lhming23@outlook.com>

COMMIT DATE: Wed Feb 26 11:35:01 2020 +0000
AUTHOR: WSL <wsl6@outlook.com>
COMMIT MESSAGE: refactor: format repoData after receive event (#279)

Signed-off-by: WuShaoling <wsl6@outlook.com>

COMMIT DATE: Wed Feb 26 15:20:01 2020 +0000
AUTHOR: WSL <wsl6@outlook.com>
COMMIT MESSAGE: refactor: disable remark-lint-no-dead-urls (#280)

Signed-off-by: WuShaoling <wsl6@outlook.com>

COMMIT DATE: Tue Mar 3 09:15:01 20
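
Since every commit item exposes its author under `data["Author"]`, the same list can be aggregated rather than printed item by item. The following is only a sketch: `commits_per_author` is a hypothetical helper, and `sample_commits` is a hand-written stand-in for `commits_list` so the snippet runs without cloning the repository.

```python
from collections import Counter


def commits_per_author(commits):
    """Count commit items per author string (hypothetical helper)."""
    return Counter(commit["data"]["Author"] for commit in commits)


# Stand-in items with the same shape as Perceval commit items;
# the author strings are taken from the output above
sample_commits = [
    {"data": {"Author": "Frank Zhao <syzhao1988@126.com>"}},
    {"data": {"Author": "WSL <wsl6@outlook.com>"}},
    {"data": {"Author": "heming <lhming23@outlook.com>"}},
    {"data": {"Author": "WSL <wsl6@outlook.com>"}},
    {"data": {"Author": "WSL <wsl6@outlook.com>"}},
]

author_counts = commits_per_author(sample_commits)
# most_common() sorts authors by descending commit count
for author, count in author_counts.most_common():
    print("{count:>3}  {author}".format(count=count, author=author))
```

With the real data, `commits_per_author(commits_list)` would be called instead of passing the sample list.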

## GitHub Backend

In [8]:
from perceval.backends.core.github import (GitHub,
                                           CATEGORY_ISSUE,
                                           CATEGORY_PULL_REQUEST)
from datetime import datetime
import json

In [9]:
# GitHub API tokens (replace the placeholder with a real token)
API_TOKENS = ["xxxxx"]

OWNER = "hypertrons"
REPO_NAME = "hypertrons"

# initialize the GitHub backend
github_backend = GitHub(owner=OWNER,
                        repository=REPO_NAME,
                        api_token=API_TOKENS,
                        sleep_for_rate=True)
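
To keep real tokens out of the notebook, the token list could be read from the environment instead of being written inline. This is a sketch under assumptions: the variable name `GITHUB_TOKENS` and the helper `load_api_tokens` are hypothetical, and tokens are assumed to be comma-separated.

```python
import os


def load_api_tokens(env_var="GITHUB_TOKENS"):
    """Return the tokens stored comma-separated in env_var (hypothetical helper)."""
    raw_value = os.environ.get(env_var, "")
    # Drop surrounding whitespace and empty entries
    return [token.strip() for token in raw_value.split(",") if token.strip()]


API_TOKENS = load_api_tokens()
if not API_TOKENS:
    print("No tokens found in GITHUB_TOKENS")
```

The resulting list has the same shape as the `API_TOKENS` list passed to `GitHub(...)` above.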

In [10]:
# print some basic information
print(github_backend.owner)
print(github_backend.repository)
print(github_backend.origin)
print(github_backend.categories)

hypertrons
hypertrons
https://github.com/hypertrons/hypertrons
['issue', 'pull_request', 'repository']


#### Fetch Issues

In [13]:
# date range for the items to fetch
from_date = datetime(2019, 11, 11)
to_date = datetime(2020, 2, 2)

# call fetch() to retrieve issue items from the GitHub repository
issues = github_backend.fetch(
    category=CATEGORY_ISSUE,
    from_date=from_date,
    to_date=to_date
)

issues_list = list(issues)
issues_count = len(issues_list)
print("NUMBER OF ISSUES:", issues_count)

NUMBER OF ISSUES: 216
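
GitHub's REST API treats every pull request as an issue, so items fetched with the `issue` category may include pull requests; in the API response such items carry a `pull_request` key. A minimal sketch for separating the two follows, assuming that key is preserved under `data` in each item. The helper name and the sample items are hypothetical stand-ins for `issues_list`.

```python
def split_issues_and_pull_requests(items):
    """Split issue-category items into plain issues and pull requests."""
    plain_issues, pull_requests = [], []
    for item in items:
        # Assumption: items that are pull requests carry a "pull_request" key in data
        if "pull_request" in item["data"]:
            pull_requests.append(item)
        else:
            plain_issues.append(item)
    return plain_issues, pull_requests


# Hypothetical sample items mimicking the shape of fetched issue items
sample_items = [
    {"data": {"number": 1, "title": "bug report"}},
    {"data": {"number": 2, "title": "fix", "pull_request": {"url": "..."}}},
    {"data": {"number": 3, "title": "question"}},
]

plain_issues, pull_requests = split_issues_and_pull_requests(sample_items)
print("PLAIN ISSUES:", len(plain_issues))
print("PULL REQUESTS:", len(pull_requests))
```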


In [14]:
last_issue = issues_list[-1]
print("Attributes of issue JSON document: ")
print(last_issue.keys())

# dump the last issue item to a JSON file
with open("issue.json", "w") as write_file:
    json.dump(last_issue, write_file)

Attributes of issue JSON document: 
dict_keys(['backend_name', 'backend_version', 'perceval_version', 'timestamp', 'origin', 'uuid', 'updated_on', 'classified_fields_filtered', 'category', 'search_fields', 'tag', 'data'])
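
Besides `data`, each item carries metadata such as `timestamp` and `updated_on`. The sketch below assumes these two fields hold Unix epoch seconds, which should be checked against the installed Perceval version. It converts them into readable UTC datetimes, with `epoch_to_utc` as a hypothetical helper and `sample_item` as a hand-written stand-in for a fetched item.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert Unix epoch seconds into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical item containing only the metadata fields used here
sample_item = {
    "category": "issue",
    "timestamp": 1580601600.0,   # assumed: when the item was fetched
    "updated_on": 1577836800.0,  # assumed: when the item was last updated
}

fetched_at = epoch_to_utc(sample_item["timestamp"])
updated_at = epoch_to_utc(sample_item["updated_on"])
print("FETCHED AT:", fetched_at.isoformat())
print("UPDATED ON:", updated_at.isoformat())
```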


#### Fetch Pull Requests

In [15]:
# fetch pull request items for the same date range
prs = github_backend.fetch(
    category=CATEGORY_PULL_REQUEST,
    from_date=from_date,
    to_date=to_date
)

prs_list = list(prs)
prs_count = len(prs_list)
print("NUMBER OF PULL REQUESTS:", prs_count)

NUMBER OF PULL REQUESTS: 148


In [16]:
last_pr = prs_list[-1]
print("Attributes of pull request JSON document: ")
print(last_pr.keys())

# dump the last pull request item to a JSON file
with open("pull_request.json", "w") as write_file:
    json.dump(last_pr, write_file)

Attributes of pull request JSON document: 
dict_keys(['backend_name', 'backend_version', 'perceval_version', 'timestamp', 'origin', 'uuid', 'updated_on', 'classified_fields_filtered', 'category', 'search_fields', 'tag', 'data'])
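
The cells above store only the last item of each list. To keep every fetched item, one option is a JSON Lines file with one JSON document per line. This is a sketch rather than anything Perceval prescribes: `write_json_lines` is a hypothetical helper, the sample items stand in for `prs_list`, and a temporary directory is used so the example leaves no files in the working directory.

```python
import json
import os
import tempfile


def write_json_lines(items, path):
    """Write one JSON document per line and return the number of items written."""
    count = 0
    with open(path, "w") as out_file:
        for item in items:
            out_file.write(json.dumps(item) + "\n")
            count += 1
    return count


# Hypothetical stand-ins for the items in prs_list
sample_items = [
    {"category": "pull_request", "data": {"number": 1}},
    {"category": "pull_request", "data": {"number": 2}},
]

output_path = os.path.join(tempfile.mkdtemp(), "pull_requests.jsonl")
written = write_json_lines(sample_items, output_path)
print("WROTE", written, "ITEMS TO", output_path)
```

Because the helper consumes any iterable, it could also be given the generator returned by `fetch()` directly, which avoids holding every item in memory at once.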
