# Continuous Integration

## 1. Automatic Fault Detection

The goal is to quickly identify which change list (CL) in a range of revisions caused a regression.

<img src="CL.png" alt="Drawing" style="width: 500px;"/>

Tests are executed periodically, for example every $N$ CLs. In this case, all tests pass at $CL_{G}$ (*Green*) and some tests fail at $CL_{R}$ (*Red*). Hence, there has to be some **culprit CL** in the range $[CL_{G}, CL_{R}]$ that caused the regression.

One way to find the culprit is to search the range $[CL_{G}, CL_{R}]$, for example with *binary search*. To obtain results faster, however, we can train machine learning models that tell us where to look first, that is, which **CLs** are most likely to have caused the regression.
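
The binary search mentioned above can be sketched as follows. This is a minimal illustration, not the project's implementation: `find_culprit`, `run_tests` and the CL identifiers are hypothetical names, and the sketch assumes a single culprit and deterministic tests, so that every CL after the culprit is also red.

```python
from typing import Callable, Sequence


def find_culprit(cls: Sequence[str], is_green: Callable[[str], bool]) -> str:
    """Return the first red CL in a range that starts green and ends red.

    Assumption: cls[0] is green, cls[-1] is red, and once a CL is red
    all later CLs are red as well (one culprit, no flaky tests).
    """
    lo, hi = 0, len(cls) - 1  # invariant: cls[lo] is green, cls[hi] is red
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_green(cls[mid]):
            lo = mid  # culprit lies after mid
        else:
            hi = mid  # culprit is mid or earlier
    return cls[hi]


# Hypothetical range of 8 CLs in which CL_105 introduced the regression.
cl_range = [f"CL_{n}" for n in range(100, 108)]
runs = []  # records which CLs were actually tested


def run_tests(cl: str) -> bool:
    """Stand-in for a real test run: green for every CL before CL_105."""
    runs.append(cl)
    return int(cl.split("_")[1]) < 105


culprit = find_culprit(cl_range, run_tests)
print(culprit, "found after", len(runs), "test runs")
```

Bisection needs about $\log_2$ of the range length in test runs. A suspiciousness ranking could lower that further by testing the most likely CLs first.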

This way, we can notify the developers responsible for those changes so that the defects are caught and corrected early.

## 2. Optimizing Regression Testing

Above, we showed how CLs can be prioritized to avoid testing "unnecessary" or "non-revealing" CLs. However, flagging a CL as very suspicious is of little use if the tests take 8+ hours to determine whether the change is defective.

So, in parallel with defect prediction, regression testing itself can be optimized by applying these techniques:

- **Test Case Minimization**
- **Test Case Selection**
- **Test Case Prioritization**

In this way, we aim to significantly reduce the lag between a commit and feedback on the project status, saving time and resources.
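
As a rough illustration of test case prioritization and selection, the sketch below orders tests by their historical failure rate per minute of run time and then greedily fills a time budget. The scoring rule, the `CaseStats` record and all names and numbers are assumptions made for this example; the text above does not prescribe a specific criterion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaseStats:
    """Hypothetical per-test history record."""
    name: str
    minutes: float      # average run time, assumed to be > 0
    past_runs: int      # number of previous executions
    past_failures: int  # how many of those executions failed


def prioritize(tests: List[CaseStats]) -> List[CaseStats]:
    """Order tests by failure rate per minute of run time, highest first."""
    def score(t: CaseStats) -> float:
        rate = t.past_failures / t.past_runs if t.past_runs else 0.0
        return rate / t.minutes
    return sorted(tests, key=score, reverse=True)


def select(tests: List[CaseStats], budget_minutes: float) -> List[CaseStats]:
    """Greedily keep prioritized tests that still fit in the time budget."""
    chosen, used = [], 0.0
    for t in prioritize(tests):
        if used + t.minutes <= budget_minutes:
            chosen.append(t)
            used += t.minutes
    return chosen


suite = [
    CaseStats("t_slow", minutes=60, past_runs=10, past_failures=5),
    CaseStats("t_fast", minutes=2, past_runs=10, past_failures=1),
    CaseStats("t_stable", minutes=5, past_runs=10, past_failures=0),
]
order = [t.name for t in prioritize(suite)]
picked = [t.name for t in select(suite, budget_minutes=10)]
print(order, picked)
```

Dividing by run time favors cheap tests that fail often, which matches the aim of shortening feedback time. A learned failure probability could replace the historical rate without changing the rest of the sketch.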

## 3. Notification

Now we can rank **CLs** by their suspiciousness and notify developers shortly after they submit a change, so that they can take a closer look at a specific change list, apply more tests to it, and fix it.
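
A minimal sketch of this ranking and notification step is shown below. It assumes that a model has already assigned each CL a suspiciousness score between 0 and 1. The `score` field, the function names, and the authors are hypothetical, and actual message delivery is left out.

```python
from collections import defaultdict
from typing import Dict, List


def top_suspects(predictions: List[dict], k: int) -> List[dict]:
    """Return the k CLs with the highest predicted suspiciousness."""
    return sorted(predictions, key=lambda p: p["score"], reverse=True)[:k]


def notifications(suspects: List[dict]) -> Dict[str, List[str]]:
    """Group suspicious CL ids by author, keeping the rank order."""
    by_author = defaultdict(list)
    for p in suspects:
        by_author[p["author"]].append(p["commit_id"])
    return dict(by_author)


# Hypothetical model output for four CLs.
predictions = [
    {"commit_id": "CL_101", "author": "alice", "score": 0.12},
    {"commit_id": "CL_102", "author": "bob", "score": 0.81},
    {"commit_id": "CL_103", "author": "alice", "score": 0.64},
    {"commit_id": "CL_104", "author": "carol", "score": 0.07},
]
to_notify = notifications(top_suspects(predictions, k=2))
print(to_notify)
```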

***

# Implementation 

These algorithms are applied to the following kinds of data:

1. **Synthetic Data**: generated randomly to replicate patterns similar to real-world situations.
2. **Open-Source Projects**: datasets that are available online.
3. **Real-World Data**: source-control log collected over 10 years at a company.

In [5]:
import pandas as pd
import numpy as np

# 1. Features used in Defect Prediction:

From the paper "*Deep Learning for Just-In-Time Defect Prediction*" (2015), by Xinli Yang et al.

- **id**: Unique identifier of the CL (column `commit_id` in the code below)
- **author**: Name of the developer responsible for the change
- **timestamp**: Commit time
- **ns**: The number of modified subsystems
- **nd**: The number of modified directories
- **nf**: The number of modified files
- **entropy**: Distribution of the modified code across the modified files
- **la**: Lines of code added
- **ld**: Lines of code deleted
- **lt**: Lines of code in a file before the change
- **fix**: Whether or not the change is a defect fix
- **ndev**: The number of developers that changed the modified files
- **age**: The average time interval between the last and the current change
- **nuc**: The number of unique changes to the modified files
- **exp**: Developer experience
- **rexp**: Recent developer experience
- **sexp**: Developer experience on a subsystem
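
To make the **entropy** feature concrete, the sketch below computes the Shannon entropy $H = -\sum_k p_k \log_2 p_k$, where $p_k$ is the fraction of modified lines that fall in file $k$. This is a common reading of the feature; the exact formula, and whether it is normalized, is an assumption here, and `change_entropy` is a hypothetical helper.

```python
import math
from typing import Sequence


def change_entropy(lines_changed_per_file: Sequence[int], normalize: bool = False) -> float:
    """Shannon entropy of how a change's modified lines spread over files.

    With normalize=True the value is divided by log2(number of touched files),
    so that an even spread over any number of files yields 1.0.
    """
    total = sum(lines_changed_per_file)
    if total == 0:
        return 0.0
    probs = [n / total for n in lines_changed_per_file if n > 0]
    h = -sum(p * math.log2(p) for p in probs)
    if normalize and len(probs) > 1:
        h /= math.log2(len(probs))
    return h


print(change_entropy([20]))              # all changes in one file
print(change_entropy([10, 10]))          # evenly spread over two files
print(change_entropy([30, 10]))          # concentrated mostly in one file
print(change_entropy([10, 10, 10, 10], normalize=True))
```

A change confined to one file has zero entropy, while a change spread evenly over many files has the highest entropy.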

### Label
- **Per change**: Suspiciousness, either 1 or 0 (column `suspicious`)

In [6]:
features = ['commit_id','author','timestamp', 'ns','nd','nf', 'entropy', 'la', 'ld', 'lt', 'fix', 'ndev','age', 'nuc', 'exp', 'rexp', 'sexp', 'suspicious'] 

In [7]:
df_cl = pd.DataFrame(columns=features)

In [8]:
df_cl

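| | commit_id | author | timestamp | ns | nd | nf | entropy | la | ld | lt | fix | ndev | age | nuc | exp | rexp | sexp | suspicious |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|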


In [9]:
print(f" This dataset contains {len(df_cl.columns)} features")

 This dataset contains 18 features


## Open Source Dataset

In [10]:
bugzilla = pd.read_csv('../jit/input/bugzilla.csv')

In [11]:
bugzilla.head()

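| | transactionid | commitdate | ns | nm | nf | entropy | la | ld | lt | fix | ndev | pd | npt | exp | rexp | sexp | bug |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 2001/12/12 17:41 | 1 | 1 | 3 | 0.57938 | 0.09362 | 0.0 | 480.666667 | 1 | 14 | 596 | 0.666667 | 143 | 133.5 | 129 | 1 |
| 1 | 7 | 1999/10/12 12:57 | 1 | 1 | 1 | 0.0 | 0.0 | 0.0 | 398.0 | 1 | 1 | 0 | 1.0 | 140 | 140.0 | 137 | 1 |
| 2 | 8 | 2002/5/15 16:55 | 3 | 3 | 52 | 0.739279 | 0.183477 | 0.208913 | 283.519231 | 0 | 23 | 15836 | 0.75 | 984 | 818.65 | 978 | 0 |
| 3 | 9 | 2002/1/21 15:37 | 1 | 1 | 8 | 0.685328 | 0.016039 | 0.01288 | 514.375 | 1 | 21 | 1281 | 1.0 | 579 | 479.25 | 550 | 0 |
| 4 | 10 | 2001/12/19 16:44 | 2 | 2 | 38 | 0.769776 | 0.091829 | 0.072746 | 366.815789 | 1 | 21 | 6565 | 0.763158 | 413 | 313.25 | 405 | 0 |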


In [12]:
print(f"This data set contains {bugzilla.shape[0]} instances and {bugzilla.shape[1]} features")

This data set contains 4620 instances and 17 features


---

# 2. Features for Test Case Failure Prediction 

Machine learning models can be used to prioritize tests according to some criterion. One such criterion is the ability of a test case to find faults, which can be predicted *a priori*.

In [13]:
test_features = ['version', 'test_id','name', 'size', 'hist', 'cov', 'txt', 'res'] 

In [14]:
df_test = pd.DataFrame(columns=test_features)

In [15]:
df_test

| | version | test_id | name | size | hist | cov | txt | res |
|---|---|---|---|---|---|---|---|---|


In [16]:
print(f" This dataset contains {len(df_test.columns)} features")

 This dataset contains 8 features


---


## Open Source Dataset

From Palma et al. (2018); 5 projects were retrieved.

### Features
- **Version**: version under test.
- **TestID**: unique test identifier.
- **TestName**: test name.
- **Status**: test case status - Modified/New 
- **ST**: *Size of tests*. Number of lines of code.
- **MC**: *Method Coverage*, the ratio of the number of methods called by a test case in the previous version to the total number of methods in the source code.
- **BC**: *Basic Counting*, the number of unique method calls in the test trace from the current release that also appear in the previous failing sequences for that test case.
- **HD**: *Hamming distance*, the number of positions at which two equal-length sequences differ.
- **ED**: *Edit distance* (Levenshtein distance), the minimum number of edit operations (insertions, deletions and substitutions) required to convert one sequence into another.
- **CMC**: *Changed Method Coverage*, the ratio of the number of methods changed since the previous version to the total number of methods in the source code.
- **TM**: *Traditional Historical Fault Detection Metric*, obtained by counting the nr. of versions for which a test case has failed previously. 
- **IBC**: *Improved Basic Counting*, combination of **BC** and **HD**.
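
The two sequence distances in the list above can be illustrated with a short sketch using their textbook definitions. How the dataset encodes or normalizes test traces before computing **HD** and **ED** is not stated here, so the functions and the example traces below are assumptions for illustration only.

```python
from typing import Sequence


def hamming(a: Sequence, b: Sequence) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical method-call traces of one test in two versions.
old_trace = ["parse", "check", "emit"]
new_trace = ["parse", "emit"]
print(levenshtein(old_trace, new_trace))
print(hamming("karolin", "kathrin"))
```

Both functions accept any sequences, so they work on strings as well as on lists of method names.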



### Additional features to consider
- **size**: test size in minutes
- **txt**: Text-based features, a textual representation of the test case (obtained by topic modelling) that can be used to cluster test cases.

### Label:
- **Result**: pass/fail; 1/0

In [17]:
ClosureCompiler = pd.read_csv('../pred-rep-master/tanzeem_noor-promise17_data/Closure-Compiler Metrics Raw_Data.csv', index_col='Version')

In [18]:
print(f"This data set contains {ClosureCompiler.shape[0]} instances and {ClosureCompiler.shape[1]} features")

This data set contains 3374 instances and 12 features


In [19]:
ClosureCompiler.head(10)

| Version | TestID | TestName | Result | Status | ST | MC | BC | HD | ED | CMC | TM | IBC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | C_1_M_1 | com.google.javascript.jscomp.IntegrationTest::... | 1 | Modified | 9 | 1866 | 0.950697 | 0.14595 | 638738 | 0.0 | 0 | 0.950697 |
| 1 | C_2_M_1 | com.google.javascript.jscomp.RemoveUnusedVarsT... | 1 | Modified | 4 | 672 | 0.995536 | 0.249005 | 262767 | 0.0 | 0 | 0.995536 |
| 1 | C_2_M_2 | com.google.javascript.jscomp.RemoveUnusedVarsT... | 1 | Modified | 4 | 667 | 0.995502 | 0.25011 | 261775 | 0.0 | 0 | 0.995502 |
| 1 | C_2_M_3 | com.google.javascript.jscomp.RemoveUnusedVarsT... | 1 | Modified | 4 | 693 | 0.995671 | 0.249962 | 266981 | 0.0 | 0 | 0.995671 |
| 1 | C_2_M_4 | com.google.javascript.jscomp.RemoveUnusedVarsT... | 1 | Modified | 3 | 672 | 0.995536 | 0.249431 | 262767 | 0.0 | 0 | 0.995536 |
| 1 | C_3_M_1 | com.google.javascript.jscomp.CommandLineRunner... | 1 | New | 3 | 1630 | 0.941104 | 0.150986 | 558971 | 0.0 | 0 | 0.941104 |
| 1 | C_3_M_2 | com.google.javascript.jscomp.CommandLineRunner... | 0 | New | 3 | 2031 | 0.928607 | 0.130447 | 694723 | 0.0 | 0 | 0.928607 |
| 1 | C_3_M_3 | com.google.javascript.jscomp.CommandLineRunner... | 0 | Modified | 2 | 1674 | 0.94325 | 0.145455 | 573755 | 0.0 | 0 | 0.94325 |
| 1 | C_3_M_4 | com.google.javascript.jscomp.CommandLineRunner... | 0 | Modified | 4 | 2187 | 0.942387 | 0.130274 | 747954 | 0.002286 | 0 | 0.942387 |
| 1 | C_3_M_5 | com.google.javascript.jscomp.CommandLineRunner... | 1 | Modified | 4 | 1598 | 0.943054 | 0.151367 | 548219 | 0.0 | 0 | 0.943054 |
