## Issue Age
This is the reference implementation for [Issue Age](https://github.com/chaoss/wg-evolution/blob/master/metrics/Issue_Age.md),
a metric specified by the
[Evolution Working Group](https://github.com/chaoss/wg-evolution) of the
[CHAOSS project](https://chaoss.community).

Have a look at [README.md](../README.md) to find out how to run this notebook (and others in this directory) as well as to get a better understanding of the purpose of the implementations.

The implementation is described in two parts (see below):

* Class for computing Issue Age
* An explanatory analysis of the class' functionality

Some more auxiliary information in this notebook:

* Examples of the use of the implementation

As discussed in the [README](../README.md) file, the scripts required to analyze the data fetched by Perceval are located in the `scripts` package. Because of how Python's import system works, importing modules from a package that is not in the current directory requires either adding the package's location to `PYTHONPATH` or appending `../..` to `sys.path`, so that `scripts` can be imported successfully.

In [23]:
from datetime import datetime

import sys
sys.path.append('../..')

from implementations.scripts.issue_github import IssueGithub
from implementations.scripts.utils import read_json_file

In [24]:
class IssueAgeGithub(IssueGithub):
    """
    Issue Age Metric
    """

    def _flatten(self, item):
        """
        Flatten a raw issue fetched by Perceval into a flat dictionary.

        A list with a single flat dictionary will be returned.
        That dictionary will have the elements we need for computing metrics.
        The list may be empty, if for some reason the issue should not
        be considered.

        :param item: raw item fetched by Perceval (dictionary)
        :returns:   list of a single flat dictionary
        """
        flat = super()._flatten(item)

        if flat:
            flat = flat[0]
        else:
            return flat

        if flat['current_status'] != 'open':
            return []

        flat['open_issue_age'] = (datetime.now() - flat['created_date']).days

        return [flat]

    def compute(self):
        """
        Compute the average open issue age for all issues in the Perceval data.

        :returns avg_open_issue_age: the average age of open
            issues
        """
        open_issue_ages = [item['open_issue_age'] for item in self.items]
        return sum(open_issue_ages) / len(open_issue_ages) if open_issue_ages else None

    def compute_max(self):
        """
        Compute the maximum open issue age for all issues in the Perceval data.

        :returns max_open_issue_age: the maximum age of open
            issues
        """
        open_issue_ages = [item['open_issue_age'] for item in self.items]
        return max(open_issue_ages) if open_issue_ages else None

    def compute_min(self):
        """
        Compute the minimum open issue age for all issues in the Perceval data.

        :returns min_open_issue_age: the minimum age of open
            issues
        """
        open_issue_ages = [item['open_issue_age'] for item in self.items]
        return min(open_issue_ages) if open_issue_ages else None

    def __str__(self):
        return "Issue Age Metric for Github"
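
To see what the class computes without needing the Perceval data, here is a minimal standalone sketch of the same arithmetic. It does not use `IssueGithub`; the helper `open_issue_ages`, the sample records, and the fixed reference date `now` are hypothetical and chosen only so that the result is reproducible (the class itself uses `datetime.now()`).

```python
from datetime import datetime


def open_issue_ages(issues, now):
    """Return the age in whole days of every issue whose status is 'open'."""
    return [(now - issue['created_date']).days
            for issue in issues
            if issue['current_status'] == 'open']


# Hypothetical flattened issues, shaped like the dictionaries used above.
sample_issues = [
    {'current_status': 'open', 'created_date': datetime(2019, 1, 1)},
    {'current_status': 'closed', 'created_date': datetime(2018, 6, 1)},
    {'current_status': 'open', 'created_date': datetime(2019, 1, 21)},
]

# Fixed reference date (an assumption made for reproducibility).
now = datetime(2019, 1, 31)

ages = open_issue_ages(sample_issues, now)
average_age = sum(ages) / len(ages) if ages else None

print(ages)                  # [30, 10]
print(average_age)           # 20.0
print(max(ages), min(ages))  # 30 10
```

The closed issue is dropped before any age is computed, which mirrors the early `return []` in `_flatten`.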

## Performing the Analysis
We'll perform two kinds of analysis here:
- Counting the average age of open issues
- Change of average open issue age over time

### Counting the average age of open issues
First, we read the JSON file `issues.json`, present in the `implementations` directory, one level up. We make use of the `read_json_file` utility function. Notice the filter being used here on `items`. The GitHub API considers all pull requests to be issues. Any pull request represented as an issue has a `pull_request` attribute, which we use to filter pull requests out of the issue data.

In [25]:
items = read_json_file('../issues.json')

items = [item for item in items if 'pull_request' not in item['data']]

Let's use the `compute` method to calculate the average age of open issues. First, we will do it without passing any since and until dates.
Next, we can pass in the start and end dates as a tuple. The dates here are parsed from strings in the `%Y-%m-%d` format.

Let's calculate the average age for all open issues first. Then, we can do it by passing a start date. In that case, only issues created after the start date passed via the variable `date_since` will be considered.

While printing the output, we will keep the precision to only two decimals. 

In [26]:
date_since = datetime.strptime("2018-09-07", "%Y-%m-%d")
open_issue_age = IssueAgeGithub(items)
print("The average age of all open issues is {:.2f} days."
      .format(open_issue_age.compute()))

open_issue_age_interval = IssueAgeGithub(items, (date_since, None))
print("The average age of open issues created after 2018-09-07 is {:.2f} days."
      .format(open_issue_age_interval.compute()))

The average age of all open issues is 715.91 days.
The average age of open issues created after 2018-09-07 is 403.35 days.
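
The effect of the `(date_since, None)` tuple can be pictured as a filter on the creation date. How `IssueGithub` applies it internally is not shown in this notebook, so the following is only a sketch under assumptions: the helper `average_age`, the sample dates, the fixed reference date `now`, and the inclusive comparison (`>=`) are all hypothetical.

```python
from datetime import datetime


def average_age(created_dates, now, since=None):
    """Average age in days of issues created on or after `since`.

    If `since` is None, every issue is included. Returns None when
    no issue matches.
    """
    ages = [(now - created).days
            for created in created_dates
            if since is None or created >= since]
    return sum(ages) / len(ages) if ages else None


# Hypothetical creation dates of three open issues.
created_dates = [
    datetime(2017, 9, 7),
    datetime(2018, 9, 17),
    datetime(2019, 9, 7),
]

# Fixed reference date (an assumption made for reproducibility).
now = datetime(2019, 9, 27)
date_since = datetime.strptime("2018-09-07", "%Y-%m-%d")

print("{:.2f}".format(average_age(created_dates, now)))              # 381.67
print("{:.2f}".format(average_age(created_dates, now, date_since)))  # 197.50
```

Restricting the window removes the oldest issue, so the average drops, which is the same pattern seen in the output above.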
