# Growth Maturity Decline for Manuscripts

This JupyterLab notebook creates functions that define and display the CHAOSS GMD (Growth, Maturity and Decline) metrics. These functions will later be incorporated into the Manuscripts project. We will also test visualizations in this notebook.

We'll be using a local Elasticsearch instance and an index that has already been populated. This index was created using the `p2o.py` script from the grimoirelab-toolset.

We will start by importing the necessary libraries and initializing the necessary variables:

In [1]:
import os
import elasticsearch

from elasticsearch_dsl import Search
from pprint import pprint

import altair as alt
import pandas as pd

from datetime import date, timezone
from dateutil import parser, relativedelta

from manuscripts.manuscripts import esquery
from manuscripts.manuscripts import metrics

In [3]:
# address of the local elasticsearch instance
es_url = "http://localhost:9200"

# names of the git and github indices to be used
git_index = "perceval_git"
github_index = "perceval_github"

# time interval in which the analysis has to be done
end_date = parser.parse(date.today().strftime('%Y-%m-%d')).replace(tzinfo=timezone.utc)
# start_date = end_date - relativedelta.relativedelta(months=18) 
start_date = date(2014, 1, 1)

The idea here is to divide the current reporting system into two parts. The metrics that are currently generated stay unchanged.
In addition, the CHAOSS-specific metrics can be generated by passing the `--chaoss` flag to the manuscripts command.
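As a rough illustration of how such a switch could be wired in, here is a minimal sketch using `argparse`. The function names `build_parser` and `select_report_parts`, and the part names `"default"` and `"chaoss"`, are hypothetical and are not taken from the Manuscripts code base; only the `--chaoss` flag itself comes from the text above.

```python
import argparse


def build_parser():
    """Build a minimal CLI parser with the proposed --chaoss switch."""
    arg_parser = argparse.ArgumentParser(prog="manuscripts")
    # Assumption: --chaoss is a plain boolean switch that is off by default,
    # so the existing report is generated exactly as before when it is absent.
    arg_parser.add_argument("--chaoss", action="store_true",
                            help="also generate the CHAOSS metrics")
    return arg_parser


def select_report_parts(args):
    """Return the (hypothetical) names of the report parts to generate."""
    parts = ["default"]
    if args.chaoss:
        parts.append("chaoss")
    return parts


default_parts = select_report_parts(build_parser().parse_args([]))
chaoss_parts = select_report_parts(build_parser().parse_args(["--chaoss"]))
```

Keeping the flag additive means the current metrics are always produced and the CHAOSS part is only appended on request.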

We can start by adding a class method named `get_chaoss_metrics` to each of the data source classes:
```
class GitHubIssues(its.ITS):
    name = "github_issues"
    
    @classmethod
    def get_chaoss_metrics(cls):
        return {
            "issue_resolution" : {
                "open": [Open],
                "closed": [Closed],
                "issue_resolution_efficiency": [],
                "open_issue_age": [],
                "first_response_to_issue_duration": [],
                "closed_issue_resolution_duration": [],
            }
        }
```
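One possible way to consume the dictionary returned by `get_chaoss_metrics` is sketched below. The helper `compute_chaoss_metrics` and the `Fake*` classes are stand-ins invented for this illustration so that it runs without Elasticsearch; the real metric classes would be built with `es_url`, the index name, and the date range.

```python
def compute_chaoss_metrics(data_source, make_metric):
    """Evaluate every metric class listed by data_source.get_chaoss_metrics().

    make_metric is a callable that turns a metric class into an instance,
    e.g. by passing the Elasticsearch URL, index, and dates.
    """
    results = {}
    for section, metrics_by_name in data_source.get_chaoss_metrics().items():
        results[section] = {}
        for name, metric_classes in metrics_by_name.items():
            # An empty list marks a metric that has no implementation yet.
            results[section][name] = [make_metric(cls).get_agg()
                                      for cls in metric_classes]
    return results


# Stand-in classes (hypothetical) that mimic the get_agg() interface.
class FakeOpen:
    def get_agg(self):
        return 22


class FakeClosed:
    def get_agg(self):
        return 113


class FakeDataSource:
    @classmethod
    def get_chaoss_metrics(cls):
        return {
            "issue_resolution": {
                "open": [FakeOpen],
                "closed": [FakeClosed],
                "open_issue_age": [],
            }
        }


results = compute_chaoss_metrics(FakeDataSource, lambda cls: cls())
```

Passing the constructor logic in as `make_metric` keeps the traversal independent of how each metric is instantiated.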

We will go with the structure already defined in Manuscripts:

In [34]:
from manuscripts.manuscripts.metrics import github_issues, its, metrics

### Issue Resolution

- Open issues

(Only create the classes that are not present and reuse code where possible)

In [5]:
# this class goes into the its.py file in manuscripts/metrics folder
# names of the classes will be changed according to the pattern used in that file

class ITSOpen(its.ITSMetrics):
    """ Tickets Open metric class for issue tracking systems """
    id = "open"
    name = "Open tickets"
    desc = "Number of tickets currently open"
    FIELD_COUNT = "id"
    FIELD_NAME = "url"
    FIELD_DATE = "created_at"

In [6]:
# this class goes into github_issues.py file in manuscripts/metrics folder

class Open(ITSOpen):
    ds = github_issues.GitHubIssues
    filters = {"pull_request": "false", "state": "open"}

The `Open` class can then be instantiated inside the `report.py` file:

In [17]:
open_issues = Open(es_url, github_index, start=start_date, end=end_date)

In [14]:
open_issues.get_agg()

22

- Closed issues

In [15]:
closed_issues = github_issues.Closed(es_url, github_index, start=start_date, end=end_date)

In [16]:
closed_issues.get_agg()

113

- Issue Resolution Efficiency (what is the ratio of closed issues to abandoned issues?)

Open question: how do we decide that an issue has been abandoned?
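To make the question concrete, the sketch below tries out one candidate definition: an issue counts as abandoned if it is still open and has not been updated for more than `max_idle_days`. This definition, the 180-day default, and the `updated_at` field name are all assumptions made for illustration; none of them is an agreed CHAOSS or Manuscripts definition.

```python
from datetime import datetime, timedelta, timezone


def is_abandoned(issue, now, max_idle_days=180):
    """Assumed definition: open and not updated for more than max_idle_days."""
    idle = now - issue["updated_at"]
    return issue["state"] == "open" and idle > timedelta(days=max_idle_days)


def resolution_efficiency(issues, now, max_idle_days=180):
    """Return closed / abandoned, or None when no issue is abandoned."""
    closed = sum(1 for issue in issues if issue["state"] == "closed")
    abandoned = sum(1 for issue in issues
                    if is_abandoned(issue, now, max_idle_days))
    if abandoned == 0:
        return None  # the ratio is undefined without abandoned issues
    return closed / abandoned


# Hand-made sample data, not taken from the index used in this notebook.
now = datetime(2018, 5, 9, tzinfo=timezone.utc)
sample = [
    {"state": "closed", "updated_at": datetime(2018, 4, 1, tzinfo=timezone.utc)},
    {"state": "closed", "updated_at": datetime(2017, 1, 1, tzinfo=timezone.utc)},
    {"state": "open", "updated_at": datetime(2018, 5, 1, tzinfo=timezone.utc)},
    {"state": "open", "updated_at": datetime(2017, 6, 1, tzinfo=timezone.utc)},
]
efficiency = resolution_efficiency(sample, now)
```

Whatever definition is chosen, the zero-abandoned case needs an explicit decision, since the ratio is undefined there.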

- Open issue age

In [35]:
class AgeOpenIssue(metrics.Metrics):
    ds = github_issues.GitHubIssues
    
    id = "age_open_issues"
    name = "Age of open issues"
    desc = "Number of days since the open issues were created"
    FIELD_COUNT = 'time_open_days'
    AGG_TYPE = 'terms'
    filters = {"pull_request": "false", "state": "open"}

    def get_agg(self):
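        # name the class explicitly: super(type(self), self) recurses forever when subclassed
        agg = super(AgeOpenIssue, self).get_agg()
        if agg is None:
            agg = 0  # ES returns None when the aggregated value is NaN; convert it to 0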
        return agg

In [37]:
age_open_issues = AgeOpenIssue(es_url, github_index, start=start_date, end=end_date)

In [43]:
age_open_issues.get_query()

'{"query": {"bool": {"must": [{"match": {"pull_request": "false"}}, {"match": {"state": "open"}}, {"range": {"grimoire_creation_date": {"gte": "2014-01-01", "lte": "2018-05-09T00:00:00+00:00"}}}]}}, "aggs": {"1": {"terms": {"field": "time_open_days"}}}, "from": 0, "size": 0}'

In [46]:
age_open_issues.get_metrics_data(age_open_issues.get_query())

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 5, 'total': 5},
 'aggregations': {'1': {'buckets': [{'doc_count': 1,
     'key': 38.900001525878906},
    {'doc_count': 1, 'key': 76.95999908447266},
    {'doc_count': 1, 'key': 83.83999633789062},
    {'doc_count': 1, 'key': 88.52999877929688},
    {'doc_count': 1, 'key': 158.82000732421875},
    {'doc_count': 1, 'key': 159.47999572753906},
    {'doc_count': 1, 'key': 161.77000427246094},
    {'doc_count': 1, 'key': 361.70001220703125},
    {'doc_count': 1, 'key': 363.4200134277344},
    {'doc_count': 1, 'key': 440.30999755859375}],
   'doc_count_error_upper_bound': 0,
   'sum_other_doc_count': 12}},
 'hits': {'hits': [], 'max_score': 0.0, 'total': 22},
 'timed_out': False,
 'took': 5}
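Note that the response above lists only 10 buckets while `sum_other_doc_count` is 12: a `terms` aggregation returns 10 buckets by default, so 12 of the 22 open issues are not shown. The sketch below makes that gap visible. The helper `summarize_terms_buckets` is hypothetical, and the sample response is an abbreviated copy of the output above with keys rounded to two decimals.

```python
def summarize_terms_buckets(response, agg_id="1"):
    """Summarize the buckets of a terms aggregation response."""
    agg = response["aggregations"][agg_id]
    buckets = agg["buckets"]
    keys = [bucket["key"] for bucket in buckets]
    return {
        # documents that fall into the returned buckets
        "covered_docs": sum(bucket["doc_count"] for bucket in buckets),
        # documents left out because of the bucket limit
        "other_docs": agg["sum_other_doc_count"],
        # min and max only describe the returned buckets, not all documents
        "min_key": min(keys) if keys else None,
        "max_key": max(keys) if keys else None,
    }


# Abbreviated copy of the response shown above (keys rounded).
rounded_keys = [38.9, 76.96, 83.84, 88.53, 158.82,
                159.48, 161.77, 361.7, 363.42, 440.31]
response = {
    "aggregations": {"1": {
        "buckets": [{"doc_count": 1, "key": key} for key in rounded_keys],
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 12,
    }},
    "hits": {"total": 22},
}
summary = summarize_terms_buckets(response)
```

Since `time_open_days` is a continuous value, an Elasticsearch `stats` or `percentiles` aggregation may describe the age of open issues better than `terms`; whether the Manuscripts metric classes can express that is not verified here.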