# **JIRA_ISSUES_time**

This notebook documents the creation of the table `JIRA_ISSUES_time`, which adds a new attribute to the `JIRA_ISSUES` table: the time taken to resolve each issue.

First, we import the libraries we need and then read the corresponding CSV file.


In [1]:
import pandas as pd
import numpy as np

In [2]:
jiraIssues = pd.read_csv("../../../data/interim/DataPreparation/CleanData/JIRA_ISSUES_clean.csv").iloc[:,1:]
print(jiraIssues.shape)
jiraIssues.head()

(66345, 8)


Unnamed: 0,projectID,key,creationDate,resolutionDate,type,priority,assignee,reporter
0,commons-exec,EXEC-108,2018-09-18T11:15:58.000+0000,2019-07-07T10:32:12Z,Bug,Major,not-assigned,natanieljr
1,commons-exec,EXEC-107,2018-07-04T12:09:47.000+0000,2019-07-07T10:32:12Z,New Feature,Major,not-assigned,stefanreich
2,commons-exec,EXEC-106,2018-03-06T11:32:51.000+0000,2019-07-07T10:32:12Z,Improvement,Major,not-assigned,sebb
3,commons-exec,EXEC-105,2018-02-16T13:47:10.000+0000,2019-07-07T10:32:12Z,Wish,Trivial,not-assigned,IP
4,commons-exec,EXEC-104,2017-08-04T11:57:39.000+0000,2019-07-07T10:32:12Z,Bug,Major,not-assigned,krichter


First, we convert the attributes `creationDate` and `resolutionDate` from strings to a timestamp type:

In [3]:
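# The two columns use different ISO 8601 layouts ("...58.000+0000" vs. "...12Z"),
# so we let pandas parse them and normalize both to timezone-aware UTC.
jiraIssues['creationDate'] = pd.to_datetime(jiraIssues['creationDate'], utc=True)
jiraIssues['resolutionDate'] = pd.to_datetime(jiraIssues['resolutionDate'], utc=True)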
jiraIssues.head()

Unnamed: 0,projectID,key,creationDate,resolutionDate,type,priority,assignee,reporter
0,commons-exec,EXEC-108,2018-09-18 11:15:58+00:00,2019-07-07 10:32:12+00:00,Bug,Major,not-assigned,natanieljr
1,commons-exec,EXEC-107,2018-07-04 12:09:47+00:00,2019-07-07 10:32:12+00:00,New Feature,Major,not-assigned,stefanreich
2,commons-exec,EXEC-106,2018-03-06 11:32:51+00:00,2019-07-07 10:32:12+00:00,Improvement,Major,not-assigned,sebb
3,commons-exec,EXEC-105,2018-02-16 13:47:10+00:00,2019-07-07 10:32:12+00:00,Wish,Trivial,not-assigned,IP
4,commons-exec,EXEC-104,2017-08-04 11:57:39+00:00,2019-07-07 10:32:12+00:00,Bug,Major,not-assigned,krichter
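
As a quick check of the conversion, the following minimal sketch parses a toy sample instead of the real CSV. The frame `sample` is hypothetical; its values are copied from the first two rows displayed above. It assumes that passing `utc=True` to `pd.to_datetime` is an acceptable way to handle both string layouts (`.000+0000` and `Z`).

```python
import pandas as pd

# Toy sample (hypothetical frame, values copied from the rows shown above)
sample = pd.DataFrame({
    "creationDate": ["2018-09-18T11:15:58.000+0000", "2018-07-04T12:09:47.000+0000"],
    "resolutionDate": ["2019-07-07T10:32:12Z", "2019-07-07T10:32:12Z"],
})

for col in ["creationDate", "resolutionDate"]:
    # utc=True returns timezone-aware timestamps normalized to UTC
    sample[col] = pd.to_datetime(sample[col], utc=True)

# Both columns should now report a timezone-aware datetime dtype
print(sample.dtypes)
```

Because both columns end up timezone-aware and in the same zone, they can be subtracted from each other directly.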


Using `resolutionDate` and `creationDate`, we create a new attribute called `resolutionTime`: the difference between the two, that is, the time needed to resolve the issue, expressed in hours. We first add the column, initialized with a copy of `resolutionDate`, and then overwrite it with the computed duration.

In [4]:
jiraIssues["resolutionTime"] = jiraIssues["resolutionDate"]
jiraIssues.head()

Unnamed: 0,projectID,key,creationDate,resolutionDate,type,priority,assignee,reporter,resolutionTime
0,commons-exec,EXEC-108,2018-09-18 11:15:58+00:00,2019-07-07 10:32:12+00:00,Bug,Major,not-assigned,natanieljr,2019-07-07 10:32:12+00:00
1,commons-exec,EXEC-107,2018-07-04 12:09:47+00:00,2019-07-07 10:32:12+00:00,New Feature,Major,not-assigned,stefanreich,2019-07-07 10:32:12+00:00
2,commons-exec,EXEC-106,2018-03-06 11:32:51+00:00,2019-07-07 10:32:12+00:00,Improvement,Major,not-assigned,sebb,2019-07-07 10:32:12+00:00
3,commons-exec,EXEC-105,2018-02-16 13:47:10+00:00,2019-07-07 10:32:12+00:00,Wish,Trivial,not-assigned,IP,2019-07-07 10:32:12+00:00
4,commons-exec,EXEC-104,2017-08-04 11:57:39+00:00,2019-07-07 10:32:12+00:00,Bug,Major,not-assigned,krichter,2019-07-07 10:32:12+00:00


In [5]:
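# Elapsed time between creation and resolution, in seconds
seconds = (jiraIssues["resolutionDate"] - jiraIssues["creationDate"]).dt.total_seconds()
# Convert seconds to hours and replace the placeholder column
jiraIssues["resolutionTime"] = seconds / 3600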
jiraIssues

Unnamed: 0,projectID,key,creationDate,resolutionDate,type,priority,assignee,reporter,resolutionTime
0,commons-exec,EXEC-108,2018-09-18 11:15:58+00:00,2019-07-07 10:32:12+00:00,Bug,Major,not-assigned,natanieljr,7007.270556
1,commons-exec,EXEC-107,2018-07-04 12:09:47+00:00,2019-07-07 10:32:12+00:00,New Feature,Major,not-assigned,stefanreich,8830.373611
2,commons-exec,EXEC-106,2018-03-06 11:32:51+00:00,2019-07-07 10:32:12+00:00,Improvement,Major,not-assigned,sebb,11710.989167
3,commons-exec,EXEC-105,2018-02-16 13:47:10+00:00,2019-07-07 10:32:12+00:00,Wish,Trivial,not-assigned,IP,12140.750556
4,commons-exec,EXEC-104,2017-08-04 11:57:39+00:00,2019-07-07 10:32:12+00:00,Bug,Major,not-assigned,krichter,16846.575833
...,...,...,...,...,...,...,...,...,...
66340,zookeeper,ZOOKEEPER-5,2008-06-09 23:43:48+00:00,2008-10-17 00:24:34+00:00,New Feature,Major,mahadev,mahadev,3096.679444
66341,zookeeper,ZOOKEEPER-4,2008-06-09 16:42:38+00:00,2008-09-09 21:09:01+00:00,Bug,Major,fpj,breed,2212.439722
66342,zookeeper,ZOOKEEPER-3,2008-06-09 16:39:34+00:00,2009-11-18 17:48:01+00:00,Bug,Trivial,mahadev,breed,12649.140833
66343,zookeeper,ZOOKEEPER-2,2008-06-09 16:34:31+00:00,2008-08-25 21:13:14+00:00,Bug,Major,fpj,breed,1852.645278
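
To make the arithmetic concrete, here is a small worked example for a single issue, using the timestamps of `EXEC-108` from the first row above. The variable names are illustrative only. It also shows an equivalent way to obtain hours, dividing by a one-hour `pd.Timedelta`, offered as an alternative sketch rather than as the method used in this notebook.

```python
import pandas as pd

# Timestamps of EXEC-108, copied from the first row of the table above
created = pd.Timestamp("2018-09-18 11:15:58", tz="UTC")
resolved = pd.Timestamp("2019-07-07 10:32:12", tz="UTC")

elapsed = resolved - created                # a pandas Timedelta
hours_a = elapsed.total_seconds() / 3600    # same arithmetic as the cell above
hours_b = elapsed / pd.Timedelta(hours=1)   # equivalent: divide by a one-hour Timedelta

print(elapsed, round(hours_a, 6), round(hours_b, 6))
```

Both values round to 7007.270556, which matches the `resolutionTime` shown for that row.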


We save the results in a new CSV file:

In [6]:
jiraIssues.to_csv('../../../data/interim/DataPreparation/ConstructData/JIRA_ISSUES_time.csv', header=True)
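
Note that `to_csv` writes the row index as a first, unnamed column by default, which is presumably why the read at the top of this notebook drops the first column with `.iloc[:,1:]`. The sketch below illustrates this on a hypothetical two-row frame `df` written to an in-memory buffer, along with two possible alternatives (`index_col=0` when reading, or `index=False` when writing). Which one fits depends on how downstream notebooks read the file.

```python
import io
import pandas as pd

# Hypothetical two-row frame standing in for jiraIssues
df = pd.DataFrame({"key": ["EXEC-108", "EXEC-107"],
                   "resolutionTime": [7007.270556, 8830.373611]})

# Default behaviour (as in the cell above): the index becomes a first, unnamed column
with_index = io.StringIO()
df.to_csv(with_index, header=True)
with_index.seek(0)
reloaded = pd.read_csv(with_index)
print(list(reloaded.columns))

# Alternative 1: keep the file as is and use its first column as the index when reading
with_index.seek(0)
reloaded_idx = pd.read_csv(with_index, index_col=0)

# Alternative 2: omit the index when writing
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_plain = pd.read_csv(without_index)
```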