# 3. Impact of the New Upgrade on the Existing System
The preliminary analysis in Phase 1 gave us a general overview of our dataset. To meet Company A’s objective, this section elaborates on our data analysis.

## 3.1. Impact of the New Upgrade on the Number of Issues
### 3.1.1. Trend of Issues – Mainland China and Hong Kong

<img src="https://drive.google.com/uc?export=view&id=13SfORYcsiWapn7qLxZOuDhOrSXgoK-tj" width="550" />

*Graph 5: Plot of the Number of Issues in Mainland China and Hong Kong Before and After New Upgrade Launch.*
<br><br>
Mainland China and Hong Kong were compared on the median number of issues per month. The comparison period runs from July 2018 to May 2021, covering the 17 months before and the 17 months after the new upgrade was implemented in December 2019. Graph 5 shows that the average number of issues logged in Mainland China dropped by 49.5%, from 2,134 to 1,078 issues per month. In terms of issue count, this suggests that the new upgrade reduced the number of issues logged via the issue tracking software.

In contrast, the median number of issues in Hong Kong surged by 113.6%, from 433 to 925 issues per month. This was unexpected given the purpose of the new upgrade. The rise can reasonably be attributed to the spike in November 2020, the month with the highest number of issues logged, at 1,811.

To explain the unexpected increase after the launch of the new upgrade, we engaged with the business stakeholders to obtain further insights. The spike in November 2020 was triggered by a system migration project for one of the software applications in Hong Kong.

Additionally, some of the resources in Mainland China, such as software testers, were moved to support the system migration in Hong Kong. This explains the slight drop in cases in Mainland China over the same period, which can also be seen in Graph 5.

Despite the increase in the monthly average number of issues in Hong Kong, the monthly average number of issues in Mainland China and Hong Kong combined decreased by 22.0%.
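
The percentages quoted in this section follow from the usual relative-change formula. The sketch below recomputes them from the monthly figures given above. It assumes that the combined figure is simply the sum of the two regional values, and `pct_change` is a helper name introduced here for illustration only.

```r
# Relative change between two values, in percent (illustrative helper)
pct_change <- function(before, after) {
  (after - before) / before * 100
}

# Monthly figures quoted in this section
mainland <- c(before = 2134, after = 1078)
hongkong <- c(before = 433,  after = 925)

# Assumption: the combined figure is the sum of the two regional figures
combined <- mainland + hongkong

round(pct_change(mainland[["before"]], mainland[["after"]]), 1)  # -49.5
round(pct_change(hongkong[["before"]], hongkong[["after"]]), 1)  # 113.6
round(pct_change(combined[["before"]], combined[["after"]]), 1)  # -22
```

Under that assumption, the combined figure falls from 2,567 to 2,003 issues per month, which matches the 22.0% decrease reported above.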



### 3.1.2. Relationship between Number of Issues vs Number of Projects

Graph 3 in Phase 1 suggested that the number of issues and the number of projects are correlated. To examine this relationship further, we performed a simple linear regression to test whether the number of projects significantly predicts the number of issues.

```r
# Relationship between the number of issues and the number of projects per month
library(dplyr)
library(lubridate)

# Count issues and distinct projects per month [2016 to 2021]
project <- data2016 %>%
  group_by(month = floor_date(c.created.date, "month")) %>%
  summarise(
    issue   = n(),                  # number of issues logged in the month
    project = n_distinct(Project)   # number of distinct projects in the month
  )

# Convert to a plain data frame and inspect its structure
project <- as.data.frame(project)
str(project)

# Regress the monthly number of issues on the monthly number of projects
model <- lm(issue ~ project, data = project)
summary(model)
```

*Results:*

<img src="https://drive.google.com/uc?export=view&id=13XAJFULLtGH_8Ytvi2uGTMnP0I9r37MT" width="400" />

The regression model was statistically significant (R² = 0.603, F(1, 62) = 94.18, p = 4.689e-14). Each additional project predicts an increase of about 73 issues, with an estimated standard error of about 7 (β = 72.694, p = 4.69e-14).
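
As a quick sanity check, the reported statistics can be cross-checked against one another. For a simple linear regression with a single predictor, F equals R² / (1 − R²) multiplied by the residual degrees of freedom, and the slope's t-statistic is the square root of F. The sketch below uses only the rounded values quoted above, so small discrepancies are expected.

```r
# Values as reported in the regression summary (rounded)
r_squared   <- 0.603
f_reported  <- 94.18
df_residual <- 62
beta        <- 72.694

# Monthly observations implied by the residual degrees of freedom
# (two parameters estimated: intercept and slope)
n_obs <- df_residual + 2

# With one predictor: F = R^2 / (1 - R^2) * residual df
f_from_r2 <- r_squared / (1 - r_squared) * df_residual

# With one predictor: t = sqrt(F), and SE(beta) = beta / t
t_slope  <- sqrt(f_reported)
se_slope <- beta / t_slope

round(c(n_obs = n_obs, f_from_r2 = f_from_r2,
        t_slope = t_slope, se_slope = se_slope), 2)
```

The recomputed F is within rounding error of the reported 94.18. The implied standard error of roughly 7.5 is in line with the rounded figure quoted above, and the degrees of freedom imply 64 monthly observations.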
<br><br>
```r
# Plot the regression diagnostics and test the residuals for normality
plot(model)
res <- residuals(model)
shapiro.test(res)
```

*Results:*

<img src="https://drive.google.com/uc?export=view&id=13XknDb_IuGAXNc0aew8QEgG1f9GVhz04" width="500" />

<img src="https://drive.google.com/uc?export=view&id=13fIGDb-JAfvzxyF_R9KfYm1l5rM0NXj2" width="250" />

With a p-value of 0.3595, the Shapiro-Wilk test fails to reject the null hypothesis that the residuals are normally distributed, so there is no evidence against normality. The residuals in the Normal Q-Q plot were approximately linear, supporting the condition that the error terms are normally distributed.

The Residuals vs Fitted plot shows residuals scattered randomly around the horizontal line at zero. This suggests that the assumption of a linear relationship is valid and that the residuals behave like random noise.

The regression model indicates a statistically significant relationship between the number of projects and the number of issues: a change in the number of projects is associated with a substantial change in the number of issues. Consequently, the decreasing trend in the number of issues in Graph 5 may not be due to the new upgrade alone; it may also reflect a reduction in the number of projects. We therefore normalized the number of issues by the number of projects to see whether the number of issues per project changed with the new upgrade.

### 3.1.3. Trend of Issues After Normalizing – Mainland China and Hong Kong

In Section 3.1.1, a comparison of the average number of issues showed a 49.5% decrease in Mainland China but a 113.6% increase in Hong Kong. To gain further insights, we normalized the number of issues by the number of projects, which lets us compare the two regions on the number of issues per project.
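
The normalization can be sketched as follows. This is a minimal illustration on toy data, not the original analysis code: the data frame and its column names (`region`, `created`, `project`) are assumptions made for the example.

```r
library(dplyr)

# Toy issue-level data; names and values are hypothetical
issues <- data.frame(
  region  = c("Mainland China", "Mainland China", "Mainland China",
              "Hong Kong", "Hong Kong", "Hong Kong"),
  created = as.Date(c("2020-01-03", "2020-01-15", "2020-01-20",
                      "2020-01-05", "2020-01-09", "2020-01-28")),
  project = c("A", "A", "B", "C", "C", "C")
)

# Count issues and distinct projects per region and month,
# then divide to get the number of issues per project
per_project <- issues %>%
  group_by(region, month = format(created, "%Y-%m")) %>%
  summarise(
    issues   = n(),
    projects = n_distinct(project),
    .groups  = "drop"
  ) %>%
  mutate(issues_per_project = issues / projects)

per_project
```

In the toy data both regions log three issues in the month, but Hong Kong has one project and Mainland China has two, so the normalized values differ (3 versus 1.5 issues per project).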
<br><br>
<img src="https://drive.google.com/uc?export=view&id=13oLhO9N_WEm9oh1bOawenAKTbFuwUWA_" width="500" />

*Graph 8: Plot of the Number of Issues Per Project in Mainland China and Hong Kong Before and After New Upgrade Launch.*

Before the launch of the new upgrade, the number of issues per project was growing, as seen in Graph 8. After the launch, the number of issues per project in Mainland China trended downward before peaking in April 2021. In Hong Kong, by contrast, there was an upward trend, with the highest spike in November 2020, when there were 127 issues per project. As noted in Section 3.1.1, a major system migration project in Hong Kong caused the rise in issues logged in November 2020.


## 3.2. Impact of the New Upgrade on the Resolution Time
### 3.2.1. Trend of Resolution Time – Overall

<img src="https://drive.google.com/uc?export=view&id=13u7dwR3L7SCY7YJ4ZKk1Msj8OvTrqCNH" width="500" />

*Graph 9: Average Resolution Time (Days) Before and After the New Upgrade.*

The monthly average resolution time trended downward over the entire comparison period of 17 months before and after the new upgrade, as shown in Graph 9. After the new upgrade, however, the downward slope was steeper and more apparent. Comparing the medians before and after the launch over the same period, resolution time fell by 41.8% (17.8 days).
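
A before/after comparison of this kind can be sketched as below. The example uses toy data with hypothetical column names (`created`, `resolved`). It also assumes that the launch month itself, December 2019, belongs to neither period, which is one reading that makes July 2018 to May 2021 split into two 17-month windows.

```r
library(dplyr)

launch <- as.Date("2019-12-01")  # launch month stated in Section 3.1.1

# Assumed windows: 17 months before the launch month, 17 months after it
before_start <- seq(launch, by = "-17 months", length.out = 2)[2]      # 2018-07-01
after_start  <- seq(launch, by = "1 month", length.out = 2)[2]         # 2020-01-01
after_end    <- seq(after_start, by = "17 months", length.out = 2)[2]  # 2021-06-01, exclusive

# Toy issue-level data; column names and dates are hypothetical
issues <- data.frame(
  created  = as.Date(c("2018-08-01", "2019-03-10", "2019-10-05",
                       "2020-02-01", "2020-09-15", "2021-04-20")),
  resolved = as.Date(c("2018-09-20", "2019-04-19", "2019-11-04",
                       "2020-02-21", "2020-10-09", "2021-05-18"))
)

# Resolution time in days, labelled by period, then the median per period
resolution <- issues %>%
  mutate(
    days   = as.numeric(resolved - created),
    period = case_when(
      created >= before_start & created < launch    ~ "before",
      created >= after_start  & created < after_end ~ "after"
    )
  ) %>%
  filter(!is.na(period)) %>%
  group_by(period) %>%
  summarise(median_days = median(days), .groups = "drop")

before_med <- resolution$median_days[resolution$period == "before"]
after_med  <- resolution$median_days[resolution$period == "after"]

# Relative change in median resolution time, in percent
(after_med - before_med) / before_med * 100
```

With the toy data the median falls from 40 to 24 days, a 40% reduction; the figures reported in this section come from the actual dataset.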

Phase 1 showed that both the number of issues and the number of projects were decreasing, so it is logical that this would affect the resolution time. We anticipated that fewer issues and projects would shorten the resolution time, since resources could focus on the smaller number of issues logged.


### 3.2.2. Trend of Resolution Time – by Issue Type

<img src="https://drive.google.com/uc?export=view&id=13vkFUcLdOl80qegHdpwpl2xGHN05m15A" width="500" />

*Graph 10: Average Resolution Time (Days) Before and After the New Upgrade Based on Issue Type.*

Graph 10 plots the average resolution time for five issue types, IssueType 1 to IssueType 5.

Among the five issue types, the main concern is IssueType 1. In software testing, an IssueType 1 issue is a flaw or incident in which the actual behavior does not match the expected behavior. Once all IssueType 1 issues have been resolved, the system or application is deemed successful and ready to go live. Resolving IssueType 1 issues is therefore a top priority, and they should be resolved as quickly as feasible.

After the new upgrade was launched, resolution time trended downward for all issue types. IssueType 1 specifically saw a reduction of 35.3% (12.7 days) in resolution time. The issue type with the largest reduction was IssueType 3, at 76.7%.




### 3.2.3. Trend of Resolution Time – by Priority Level

<img src="https://drive.google.com/uc?export=view&id=13zq3RyZ1kpXCHoifXPBUmuuWxYOEnkmZ" width="500" />

*Graph 11: Average Resolution Time (days) Before and After the New Upgrade Based on Priority Level.*

Similarly, as reflected in Graph 11, there was a downward trend in resolution time for all priority levels. Issues are classified into the four priority levels listed below.

1. Critical
2. High
3. Medium
4. Low

According to the Company's operational criteria, when issues with a priority level of Critical or High are not resolved, the system is not considered ready and cannot go live. In Graph 11, the Critical and High priority levels show reductions in resolution time of 31.9% (7.3 days) and 33.3% (10.1 days) respectively. Together with the requirement to resolve IssueType 1 issues (Section 3.2.2), resolving Critical and High priority issues is a prerequisite for the system or application to go live.
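
The go-live criteria described in this section can be expressed as a simple rule. The sketch below is one possible reading of those criteria, not the Company's actual procedure: the data frame, its column names, and the helper functions `blocking_issues` and `ready_to_go_live` are all hypothetical.

```r
# Toy issue log; column names and values are assumptions for illustration
issues <- data.frame(
  id       = 1:5,
  type     = c("IssueType 1", "IssueType 2", "IssueType 1",
               "IssueType 4", "IssueType 3"),
  priority = c("Critical", "Low", "Medium", "High", "Medium"),
  resolved = c(TRUE, FALSE, TRUE, TRUE, FALSE)
)

# An unresolved issue blocks go-live if it is IssueType 1
# or has a priority level of Critical or High
blocking_issues <- function(issue_log) {
  blocks <- !issue_log$resolved &
    (issue_log$type == "IssueType 1" |
       issue_log$priority %in% c("Critical", "High"))
  issue_log[blocks, ]
}

# The system is ready when no blocking issues remain
ready_to_go_live <- function(issue_log) {
  nrow(blocking_issues(issue_log)) == 0
}

ready_to_go_live(issues)  # TRUE: the open issues are Low and Medium, neither IssueType 1
```

Reopening the High priority issue (`issues$resolved[4] <- FALSE`) would make the check return `FALSE`, which illustrates why the resolution time of these categories matters most for release readiness.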
<br><br>
Overall, there was a substantial reduction in resolution time following the launch of the new upgrade.

