You are facing a new task: a new product manager has asked you to help calculate and visualize a new KPI that is meant to
improve the process of delivering results to the client. It is defined as: the number of accepted bugs per hour in the
first 6 hours after the start of a test cycle. The PM told you that, for simplicity, you can treat the time of the first reported bug as the start of the test cycle.

If you manage to present this metric, the PM will certainly appreciate an analysis of this data, because he does not have time for it himself.

Create a view showing the above metric and propose additional charts that may be useful to the Product Manager.
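The KPI definition above can be read as a simple ratio: accepted bugs reported within 6 hours of the first report, divided by 6. The sketch below is one possible reading, not the author's solution. The function name `accepted_bugs_per_hour`, the assumption that only 'Confirmed (S)' counts as accepted, and the sample rows are all illustrative.

```python
import pandas as pd

def accepted_bugs_per_hour(df, accepted=("Confirmed (S)",), window_hours=6):
    """Accepted bugs per hour within the first `window_hours` of the cycle.

    Hypothetical helper: which resolutions count as "accepted" is an assumption.
    """
    created = pd.to_datetime(df["created_at"], format="%d/%m/%Y %H:%M:%S")
    start = created.min()  # cycle start = time of the first reported bug
    in_window = created < start + pd.Timedelta(hours=window_hours)
    is_accepted = df["resolution"].isin(accepted)
    return (in_window & is_accepted).sum() / window_hours

# Made-up rows, only to show the calculation.
sample = pd.DataFrame({
    "created_at": ["13/04/2021 15:35:36", "13/04/2021 16:10:00",
                   "13/04/2021 20:00:00", "13/04/2021 22:00:00"],
    "resolution": ["Confirmed (S)", "Duplicate",
                   "Confirmed (S)", "Confirmed (S)"],
})
kpi = accepted_bugs_per_hour(sample)
print(kpi)
```

In this sample two accepted bugs fall inside the window, so the result is 2 / 6. Dividing by the full 6 hours, even when the cycle is shorter, is a design choice that would need confirming with the PM.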

In [1]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt 

In [33]:
data = pd.read_csv('cycle_data.csv')

In [34]:
data.sample(10)

Unnamed: 0,issue_code_,created_at,severity,resolution,affected_components,creator_id
38,CYCLE-10-39,13/04/2021 16:17:27,Medium,Confirmed (S),{Workouts},4075.0
84,CYCLE-10-85,13/04/2021 19:10:27,Medium,Confirmed (S),{Challenge},5271.0
16,CYCLE-10-17,13/04/2021 15:54:23,Low,Confirmed (S),{Profile},3230.0
41,CYCLE-10-42,13/04/2021 16:21:22,High,Confirmed (S),{Signup},6380.0
13,CYCLE-10-14,13/04/2021 15:51:06,Low,Confirmed (S),{Profile},3230.0
2,CYCLE-10-3,13/04/2021 15:41:28,Low,Confirmed (S),{Signup},1066.0
36,CYCLE-10-37,13/04/2021 16:14:44,High,Confirmed (S),{Kit},3230.0
35,CYCLE-10-36,13/04/2021 16:13:14,Low,Confirmed (S),{Kit},8124.0
80,CYCLE-10-81,13/04/2021 18:46:09,Medium,No tester response,{Profile},6435.0
88,CYCLE-10-89,13/04/2021 20:08:18,Medium,Confirmed (S),{Profile},7594.0


In [35]:
data.severity.unique()

array(['Low', 'High', 'Medium', 'Critical'], dtype=object)

In [36]:
data.resolution.unique()

array(['Confirmed (S)', 'Duplicate', 'Expected Behaviour',
       'Confirmed (!)', 'Invalid - not tester error', 'Not in Scope',
       'No tester response', 'Confirm (S)'], dtype=object)

In [37]:
data.affected_components.unique()

array(['{Signup}', '{Profile}', 'Profile', '{Sign-up}', '{Content}',
       '{Challenge}', '{Workouts}', '{Settings}', '{Kit}', '{Guides}',
       '{Meals}', '{Login}'], dtype=object)
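The component labels above are inconsistent: most are wrapped in braces, 'Profile' is not, and both '{Signup}' and '{Sign-up}' appear. A minimal sketch of one way to normalize them follows, assuming that 'Sign-up' and 'Signup' refer to the same component (the source does not confirm this). The `aliases` mapping is hypothetical.

```python
import pandas as pd

# A few of the label variants listed above.
components = pd.Series(["{Signup}", "Profile", "{Sign-up}", "{Profile}"])

# Assumption: these spellings name the same component.
aliases = {"Sign-up": "Signup"}

# Strip the surrounding braces, then map spelling variants to one label.
cleaned = components.str.strip("{}").replace(aliases)
print(cleaned.unique())
```

On the real data the same two calls could be applied to `data["affected_components"]`, which matters for any per-component chart, since '{Profile}' and 'Profile' would otherwise be counted separately.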

In [38]:
# The first step is to clean the data.
# Rows with resolution 'Confirmed (!)' or 'Confirm (S)' are relabelled as 'Confirmed (S)', because they belong to
# the same category.

# Assign the result back instead of using inplace=True on a column selection, which may not update `data`.
data["resolution"] = data["resolution"].replace({"Confirmed (!)": "Confirmed (S)", "Confirm (S)": "Confirmed (S)"})
data.resolution.unique()

array(['Confirmed (S)', 'Duplicate', 'Expected Behaviour',
       'Invalid - not tester error', 'Not in Scope', 'No tester response'],
      dtype=object)

In [39]:
data.isnull().sum()

issue_code_            0
created_at             1
severity               0
resolution             0
affected_components    0
creator_id             1
dtype: int64

In [40]:
# Drop rows where 'created_at' is NaN, because the timestamp is needed to check whether a bug was reported within the
# first 6 hours of the test cycle.

data.dropna(subset=['created_at'], inplace=True)
data['created_at'].isnull().sum()

0

In [41]:
data.head(10)

Unnamed: 0,issue_code_,created_at,severity,resolution,affected_components,creator_id
0,CYCLE-10-1,13/04/2021 15:35:36,Low,Confirmed (S),{Signup},8124.0
1,CYCLE-10-2,13/04/2021 15:40:15,Low,Confirmed (S),{Signup},8124.0
2,CYCLE-10-3,13/04/2021 15:41:28,Low,Confirmed (S),{Signup},1066.0
3,CYCLE-10-4,13/04/2021 15:41:38,High,Confirmed (S),{Signup},4075.0
4,CYCLE-10-5,13/04/2021 15:43:07,Medium,Confirmed (S),{Profile},3230.0
5,CYCLE-10-6,13/04/2021 15:43:32,Low,Confirmed (S),{Signup},5271.0
6,CYCLE-10-7,13/04/2021 15:46:35,Low,Confirmed (S),{Signup},4693.0
7,CYCLE-10-8,13/04/2021 15:46:44,Medium,Confirmed (S),Profile,8058.0
8,CYCLE-10-9,13/04/2021 15:46:53,High,Confirmed (S),{Profile},3230.0
9,CYCLE-10-10,13/04/2021 15:47:03,Low,Duplicate,{Signup},1066.0


In [46]:
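# Take the first timestamp by position (not by index label 0); assumes rows are ordered by creation time.
cycle_start = data['created_at'].iloc[0]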

In [49]:
type(data.head(1)['created_at'])

pandas.core.series.Series
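At this point 'created_at' still holds strings, so it cannot be compared against a 6-hour window directly. The sketch below shows one way to parse the timestamps and count accepted bugs per hour of the cycle, which could feed a bar chart. It is an assumption-laden illustration: the day-first format is inferred from the sample rows, 'Confirmed (S)' is assumed to mean accepted, the column `hour_of_cycle` is a made-up name, and the last sample row is invented.

```python
import pandas as pd

df = pd.DataFrame({
    "created_at": ["13/04/2021 15:35:36", "13/04/2021 15:40:15",
                   "13/04/2021 16:17:27", "13/04/2021 18:46:09",
                   "13/04/2021 22:10:00"],  # last row is invented
    "resolution": ["Confirmed (S)", "Confirmed (S)", "Confirmed (S)",
                   "No tester response", "Confirmed (S)"],
})

# Parse day-first timestamps into datetime64 values.
df["created_at"] = pd.to_datetime(df["created_at"], format="%d/%m/%Y %H:%M:%S")

# Cycle start = earliest reported bug.
cycle_start = df["created_at"].min()

# Whole hours elapsed since the cycle start (0 = first hour).
elapsed = df["created_at"] - cycle_start
df["hour_of_cycle"] = (elapsed // pd.Timedelta(hours=1)).astype(int)

# Accepted bugs in hours 0..5, with empty hours shown as 0.
mask = (df["hour_of_cycle"] < 6) & (df["resolution"] == "Confirmed (S)")
per_hour = (
    df[mask].groupby("hour_of_cycle").size().reindex(range(6), fill_value=0)
)
print(per_hour)
```

A series like `per_hour` could then be drawn with `per_hour.plot.bar()` to show how acceptance is distributed over the first six hours.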