## Microtask #2: Git Backend
>Create a Python script to execute Perceval via its Python interface using the Git and GitHub backends. Feel free to select any target repository.

We will use the CHAOSS [grimoirelab-elk](https://github.com/chaoss/grimoirelab-elk.git) repository as our target.

In [1]:
# Import the required modules
from perceval.backends.core.git import Git
from datetime import datetime
from pprint import pprint

* The `Git` backend class requires two mandatory arguments:
    - `uri`: URL of the Git repository to clone.
    - `gitpath`: Path where the repository will be cloned.

In [2]:
REPOSITORY_URL = "https://github.com/chaoss/grimoirelab-elk.git"
REPO_DIR = "./tmp/grimoirelab-elk"

In [3]:
# Initialize the Git backend
git_backend = Git(uri=REPOSITORY_URL, gitpath=REPO_DIR)

In [4]:
# Date range for the commits to fetch
from_date = datetime(2019, 1, 1)
to_date = datetime(2020, 3, 22)

In [5]:
repo_branches = ["master", "GeorgLink-patch-1", "jgbarah-patch-1"]

In [6]:
# Call the fetch method.
# It retrieves commits from a Git repository or a log file and yields them
# in the same order they were obtained; list() collects them all in memory.
range_commits = git_backend.fetch(branches=repo_branches, from_date=from_date, to_date=to_date)
range_commits_list = list(range_commits)
n_commits = len(range_commits_list)
print("NUMBER OF COMMITS: ", n_commits)

NUMBER OF COMMITS:  490


In [7]:
# Let's check the structure of one of the commits (picking the last one in this case)
last_commit = range_commits_list[-1]
pprint(last_commit)

{'backend_name': 'Git',
 'backend_version': '0.12.0',
 'category': 'commit',
 'classified_fields_filtered': None,
 'data': {'Author': 'Valerio Cosentino <valcos@bitergia.com>',
          'AuthorDate': 'Fri Mar 20 12:35:40 2020 +0100',
          'Commit': 'Valerio Cosentino <valcos@bitergia.com>',
          'CommitDate': 'Fri Mar 20 12:35:40 2020 +0100',
          'Signed-off-by': ['Valerio Cosentino <valcos@bitergia.com>'],
          'commit': '1201f3f7242386d9f21a76fe8e2b5783fb1c8e17',
          'files': [{'action': 'M',
                     'added': '1',
                     'file': 'grimoire_elk/_version.py',
                     'indexes': ['1df687b', '82356e6'],
                     'modes': ['100644', '100644'],
                     'removed': '1'}],
          'message': 'Update version number to 0.71.0\n'
                     '\n'
                     'Signed-off-by: Valerio Cosentino <valcos@bitergia.com>',
          'parents': ['ec6694fa850f81a0f1f36440c246429bfdcc35d1'],
    

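The `AuthorDate` and `CommitDate` fields in the item above are plain strings. A minimal sketch of how they could be turned into timezone-aware `datetime` objects follows; it assumes every date uses the same layout as the one printed above, and `GIT_DATE_FORMAT`, `parse_commit_date` and `sample_commit` are illustrative names, not part of Perceval.

```python
from datetime import datetime, timezone

# Layout of the date strings shown in the commit above,
# e.g. 'Fri Mar 20 12:35:40 2020 +0100' (assumed to hold for all items).
GIT_DATE_FORMAT = "%a %b %d %H:%M:%S %Y %z"


def parse_commit_date(raw_date):
    """Turn a Git-style date string into a timezone-aware datetime."""
    return datetime.strptime(raw_date, GIT_DATE_FORMAT)


# Hypothetical sample shaped like the item printed above.
sample_commit = {"data": {"CommitDate": "Fri Mar 20 12:35:40 2020 +0100"}}

commit_date = parse_commit_date(sample_commit["data"]["CommitDate"])
commit_date_utc = commit_date.astimezone(timezone.utc)

print(commit_date.isoformat())      # original offset preserved
print(commit_date_utc.isoformat())  # same instant expressed in UTC
```

Parsed dates can be compared directly with other timezone-aware datetimes, which is handy for filtering or sorting the fetched items.
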
In [8]:
for commit in range_commits_list:
    print("COMMIT DATE: {commit_date}\nAUTHOR: {author_name}\nCOMMIT MESSAGE: {commit_message}".format(
        commit_date=commit["data"]["CommitDate"],
        author_name=commit["data"]["Author"],
        commit_message=commit["data"]["message"]))
    print()

COMMIT DATE: Wed Jan 2 14:52:20 2019 +0100
AUTHOR: Valerio Cosentino <valcos@bitergia.com>
COMMIT MESSAGE: [elk] Include stacktrace in log message

This code includes the stacktrace information in the log message when fetching and
enriching data.

COMMIT DATE: Wed Jan 2 14:52:20 2019 +0100
AUTHOR: Valerio Cosentino <valcos@bitergia.com>
COMMIT MESSAGE: [sortinghat_elk] Include stacktrace in log message for unknown exception

This code includes stack trace information within the log message generated
when an unknown exception is thrown when adding identity.

COMMIT DATE: Wed Jan 2 15:04:17 2019 +0100
AUTHOR: Valerio Cosentino <valcos@bitergia.com>
COMMIT MESSAGE: [elastic] Change log message level when inserting data to ES

This code changes the log message level from INFO to DEBUG when
inserting data to ElasticSearch.

COMMIT DATE: Thu Jan 10 12:27:53 2019 +0100
AUTHOR: Santiago Dueñas <sduenas@bitergia.com>
COMMIT MESSAGE: Update version number to 0.35.0

COMMIT DATE: Wed Jan 9 21:19:

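Printing every commit gets long quickly. As a rough sketch, the same items could be summarised by author instead; `commits_per_author` and `sample_commits` are hypothetical names, and the sample data only mimics the `data` → `Author` layout seen in the output above.

```python
from collections import Counter


def commits_per_author(commits):
    """Count items by the 'Author' field of their 'data' dictionary."""
    return Counter(commit["data"]["Author"] for commit in commits)


# Hypothetical items shaped like the ones returned by the fetch call.
sample_commits = [
    {"data": {"Author": "Valerio Cosentino <valcos@bitergia.com>"}},
    {"data": {"Author": "Valerio Cosentino <valcos@bitergia.com>"}},
    {"data": {"Author": "Santiago Dueñas <sduenas@bitergia.com>"}},
]

# most_common() lists authors from most to fewest commits.
for author, count in commits_per_author(sample_commits).most_common():
    print("{count:>4}  {author}".format(count=count, author=author))
```

In the notebook, passing `range_commits_list` instead of `sample_commits` should give the per-author totals for the selected date range.
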
In [9]:
# Now print the additions and deletions per file for the last 5 commits
for commit in range_commits_list[-5:]:
    print("* COMMIT MESSAGE: {commit_message}\n".format(commit_message=commit["data"]["message"]))
    for change_file in commit['data']['files']:
        print("\tFILE NAME: {file_name}\n\tADDITIONS: +{additions}\n\tDELETIONS: -{deletions}\n".format(
            file_name=change_file['file'],
            additions=change_file['added'],
            deletions=change_file['removed']))


* COMMIT MESSAGE: [enriched-mappings] Update aoc mappings

This code includes the attribute `multi_org_names` in
the mappings of the AOC studies. This change
is needed to propage the information about the multiple
affiliations of a single user to the study indexes.

Signed-off-by: Valerio Cosentino <valcos@bitergia.com>

	FILE NAME: grimoire_elk/enriched/mappings/git_aoc.json
	ADDITIONS: +3
	DELETIONS: -0

	FILE NAME: grimoire_elk/enriched/mappings/git_aoc_es7.json
	ADDITIONS: +3
	DELETIONS: -0

* COMMIT MESSAGE: [schema] Add attribute for multi org names in git.csv

This code includes a new attribute in the git.csv that
document the field author_multi_org_names (which includes
the several affiliations a user may have).

Signed-off-by: Valerio Cosentino <valcos@bitergia.com>

	FILE NAME: schema/git.csv
	ADDITIONS: +2
	DELETIONS: -1

* COMMIT MESSAGE: [schema] Add attribute for multi org names in github_issues.csv

This code includes new attributes in the github_issues.csv
that document
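
The `added` and `removed` values printed above are strings, so they need converting before they can be summed. The sketch below aggregates them per file under a few stated assumptions: non-numeric values (for example a `-`, as Git reports for binary files) are counted as 0, and items may lack a `files` list. `to_int`, `churn_per_file`, `sample_commits` and the `docs/logo.png` entry are hypothetical, not part of Perceval or the target repository.

```python
from collections import defaultdict


def to_int(value):
    """Convert a count such as '3' to int; non-numeric values count as 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def churn_per_file(commits):
    """Sum added and removed lines per file path across commits."""
    totals = defaultdict(lambda: {"added": 0, "removed": 0})
    for commit in commits:
        # Assumption: some items may carry no 'files' list at all.
        for changed in commit["data"].get("files", []):
            path = changed.get("file")
            if path is None:
                continue
            totals[path]["added"] += to_int(changed.get("added"))
            totals[path]["removed"] += to_int(changed.get("removed"))
    return dict(totals)


# Hypothetical items shaped like the ones printed above.
sample_commits = [
    {"data": {"files": [
        {"file": "grimoire_elk/enriched/mappings/git_aoc.json",
         "added": "3", "removed": "0"},
    ]}},
    {"data": {"files": [
        {"file": "schema/git.csv", "added": "2", "removed": "1"},
        # Made-up binary file entry with non-numeric counts.
        {"file": "docs/logo.png", "added": "-", "removed": "-"},
    ]}},
    {"data": {}},
]

for path, counts in sorted(churn_per_file(sample_commits).items()):
    print("{path}: +{added} -{removed}".format(path=path, **counts))
```

Using `.get()` for every field keeps the loop from raising `KeyError` on items that do not carry per-file statistics.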