In [1]:
import pandas as pd
import numpy as np
import math
import matplotlib.pyplot as plt
import warnings
warnings.simplefilter(action='ignore', category=UserWarning)
from IPython.display import display, HTML
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows',None)

# Bug Classification Analysis

This notebook shows the degree of agreement between four different classifications of the same commits.

The classifications considered are the following:

- **Codethink - Round 1:** 47 bugs classified by developers (several authors)
- **Codethink - Round 2:** 10 bugs classified by developers (several authors)
- **Ben Dooks:** 47 bugs classified by a Linux Kernel Developer
- **Michel Maes:** 50 bugs classified by a researcher (only whether each commit is a bug-fixing commit or not)

**Preliminary conclusions**:

- In the "*Did this fix a bug?*" classification, the results are similar, with slight differences in complex cases.
- In the "*Safety relevant bug?*" classification, there is a large disparity between Codethink and Ben Dooks (16 agree and 16 differ).
- In the specific classification of safety relevant bugs ('Timing + Execution', 'Memory' and 'Exchange of Info'), there are also notable differences.

**Recommendations**:
- We need a clear definition of what counts as a "safety related bug", and all evaluators must share the same understanding of it.

#### Data cleaning

In order to carry out the analysis, we have made the following data processing decisions:

- Yes/no answers have been changed to lower case.
- If an answer contained a "?" (e.g. "yes?"), the value without the "?" was taken.
- The answers "?", "maybe" and "possible" were treated as a third response, **"undetermined"** => **?**.
- For the categories 'Timing + Execution', 'Memory' and 'Exchange of Info', a missing answer was treated as "no".
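
The rules above can be sketched as a single normalisation function. This is a minimal illustration rather than the code used in this notebook: `normalise_answer` and its `default` parameter are hypothetical names, and the function applies every rule uniformly, whereas the cells below apply them separately to each reviewer file.

```python
def normalise_answer(raw, default=None):
    """Map a free-text answer to 'yes', 'no' or '?' (hypothetical helper)."""
    # Missing answers (None, NaN or blank): the caller picks the default,
    # e.g. 'no' for the three category columns. NaN is the only value that
    # is not equal to itself.
    if raw is None or raw != raw or str(raw).strip() == "":
        return default
    answer = str(raw).strip().lower()
    # Drop trailing doubt markers: 'yes?' -> 'yes', '??' -> ''.
    core = answer.rstrip("?")
    # Only question marks, or an explicit "undetermined" word, becomes '?'.
    if core in {"", "maybe", "possible"}:
        return "?"
    return core


print([normalise_answer(a) for a in ["Yes", "no?", "maybe", "??"]])
print(normalise_answer(None, default="no"))
```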

In [2]:
def showResults(results_df):
    if len(results_df.index) > 0:
        display(HTML( results_df.to_html().replace("\\n","<br>") ))

## 1. Merging the Codethink rounds' results

In [3]:
codethink_results_round_1 = pd.read_csv('Codethink Hivemind Linux Bug Classification - Round 1.csv') 
codethink_results_round_2 = pd.read_csv('Codethink Hivemind Linux Bug Classification - Round 2.csv') 

In [4]:
codethink_results_round_1[['Name']] = codethink_results_round_1[['Name']].fillna("None")
codethink_results_round_2[['Name']] = codethink_results_round_2[['Name']].fillna("None")

for column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
    codethink_results_round_1[[column]] = codethink_results_round_1[[column]].fillna("no")
    codethink_results_round_2[[column]] = codethink_results_round_2[[column]].fillna("no")

# Normalise the answers in both rounds: lower-case yes/no and map every
# "undetermined" spelling to "?".
answer_replacements = {"No": "no", "Yes": "yes", "maybe": "?", "possible": "?", "??": "?"}
for results in (codethink_results_round_1, codethink_results_round_2):
    for column in ['Did this fix a bug?','Safety relevant bug?','Timing + Execution', 'Memory', 'Exchange of Info']:
        results[[column]] = results[[column]].replace(answer_replacements)

codethink_results_round_1[["Reasoning"]] = codethink_results_round_1[["Reasoning"]].fillna("No reasoning")
codethink_results_round_2[["Reasoning"]] = codethink_results_round_2[["Reasoning"]].fillna("No reasoning")

#### Classification - Codethink Round 1 (only commits reviewed)

In [5]:
codethink_results_round_1_filled = codethink_results_round_1[codethink_results_round_1['Name'] != "None"]
print("Round 1 - results count: ",len(codethink_results_round_1_filled))
(
    codethink_results_round_1_filled
    [["Name","Did this fix a bug?","Safety relevant bug?","Timing + Execution","Memory","Exchange of Info","Reasoning"]]
)

Round 1 - results count:  47


Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Timing + Execution,Memory,Exchange of Info,Reasoning
0,Adnan Chowdhury,yes,yes,no,yes,no,Fix of a race condition which could possibly corrupt memory
1,John G,yes,yes,no,yes,no,"""could [stuff...]... causing a use-after-free memory access"""
2,Hao Hu,yes,yes,no,no,no,Change SAS interface pointer
3,Hao Hu,no,no,yes,no,no,Remove extra dirty node flag
4,Basit A.,yes,no,no,no,no,commit + superfluos function section data generation
5,Aidan MacDonald,yes,yes,yes,no,?,"cluster filesystem locking error causing hang - according to commit message, difficult to verify otherwise"
6,Chris Phang,yes,no,no,no,no,kbuild dependency fix
7,Tom Bloor,yes,?,no,yes,no,if youre rebooting during normal operation then... good luck?
8,Tom Bloor,yes,no,no,no,no,wouldnt compile without this fix
9,Aidan MacDonald,yes,no,no,no,no,Kbuild fix


#### Classification - Codethink Round 2 (only commits reviewed)

In [6]:
codethink_results_round_2_filled = codethink_results_round_2[codethink_results_round_2['Name'] != "None"]
print("Round 2 - results count: ",len(codethink_results_round_2_filled))
(
    codethink_results_round_2_filled
    [["Name","Did this fix a bug?","Safety relevant bug?","Timing + Execution","Memory","Exchange of Info","Reasoning"]]
)

Round 2 - results count:  10


Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Timing + Execution,Memory,Exchange of Info,Reasoning
10,Chris Phang,yes,yes,no,yes,no,"NULL pointer dereference bug in eventfs, which is used for tracing"
16,Weyman,no,no,yes,no,no,"About adjusting flow of something, an optimization"
20,Paul Sherwood,yes,no,no,no,no,the commit message doesn't give enough info to understand this
29,Gavin McCall,yes,yes,yes,no,no,use after free
31,Aidan MacDonald,yes,yes,no,yes,yes,looks like possible data corruption in networking related code
37,Gavin McCall,yes,?,yes,no,no,NULL pointer is dereferenced
38,Anton Pyntikov,yes,yes,yes,yes,no,Memory corruption if phys addr beyond 64Gb
43,Aidan MacDonald,yes,?,?,no,?,SCSI adapter drops all drives after suspend/resume cycle
45,Aidan MacDonald,yes,?,?,?,?,rmmod of module causes hang or crash
48,Adnan Chowdhury,yes,yes,no,no,no,This fixes a null pointer reference error on some systems with APU


In [7]:
def getComparativeMatrix(column, set1, set2, name1, name2):

    results_1 = set1.to_dict('records')
    results_2 = set2.to_dict('records')
    count_result = {
        'YES': { 'YES': 0, 'NO':0, '?': 0 },
        'NO':  { 'YES': 0, 'NO':0, '?': 0 },
        '?':   { 'YES': 0, 'NO':0, '?': 0 },
    }
    
    for idx, result_1 in enumerate(results_1):
        result_2 = results_2[idx]

        if not isinstance(result_2['Did this fix a bug?'], str): continue # Only values in Results 1
    
        if not isinstance(result_1['Did this fix a bug?'], str): continue # Only values in Results 2

        if result_1[column] not in ['yes', 'no', '?']: continue # Discard conflict results
        if result_2[column] not in ['yes', 'no', '?']: continue # Discard conflict results

        r1_key = result_1[column].upper().strip()
        r2_key = result_2[column].upper().strip()
        
        # Fill Matrix
        if column == "Did this fix a bug?":
            count_result[r1_key][r2_key] += 1
        elif column == "Safety relevant bug?":
            if result_1["Did this fix a bug?"] == result_2["Did this fix a bug?"]:
                count_result[r1_key][r2_key] += 1
        elif column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
            if result_1["Safety relevant bug?"] == result_2["Safety relevant bug?"]:
                count_result[r1_key][r2_key] += 1
    display(pd.DataFrame(count_result, 
        index=pd.Index(['YES', 'NO', '?'], name=name2+':'),
    ).style.set_caption("<b style='float: right;'>"+name1+":</b>"))

#### Comparative results (Codethink - Round 1 vs Codethink - Round 2)

The following results show only the agreements and discrepancies for the 10 commits that were evaluated in both rounds.
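
As an illustration of how such a matrix is filled, the sketch below counts answer pairs for the commits that both sets classified. The answers and the `comparative_counts` name are made up for the example; the notebook's own `getComparativeMatrix` additionally counts the lower-level questions only when the two sets agree at the level above.

```python
from collections import Counter

LABELS = ("yes", "no", "?")


def comparative_counts(answers_1, answers_2):
    """Count (answer_1, answer_2) pairs over commits classified by both sets."""
    counts = Counter()
    for a1, a2 in zip(answers_1, answers_2):
        # Skip commits missing from either set, and anything that is not a
        # clean yes/no/? answer (for example an unresolved conflict).
        if a1 in LABELS and a2 in LABELS:
            counts[(a1, a2)] += 1
    return counts


# Hypothetical answers for five commits; None means "not classified".
round_1 = ["yes", "yes", None, "no", "?"]
round_2 = ["yes", "no", "yes", "no", "yes"]

counts = comparative_counts(round_1, round_2)
for a2 in LABELS:  # one printed row per answer in the second set
    print(a2.upper().ljust(4), [counts[(a1, a2)] for a1 in LABELS])
```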

In [8]:
getComparativeMatrix("Did this fix a bug?",codethink_results_round_1, codethink_results_round_2, "Codethink R1", "Codethink R2")

Unnamed: 0_level_0,YES,NO,?
Codethink R2:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,9,0,0
NO,1,0,0
?,0,0,0


In [9]:
getComparativeMatrix("Safety relevant bug?",codethink_results_round_1, codethink_results_round_2, "Codethink R1", "Codethink R2")

Unnamed: 0_level_0,YES,NO,?
Codethink R2:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,4,0,1
NO,1,0,0
?,2,1,0


In [10]:
for column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
    display(column)
    getComparativeMatrix(column,codethink_results_round_1, codethink_results_round_2, "Codethink R1", "Codethink R2")

'Timing + Execution'

Unnamed: 0_level_0,YES,NO,?
Codethink R2:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,1,1,0
NO,0,2,0
?,0,0,0


'Memory'

Unnamed: 0_level_0,YES,NO,?
Codethink R2:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,3,0,0
NO,1,0,0
?,0,0,0


'Exchange of Info'

Unnamed: 0_level_0,YES,NO,?
Codethink R2:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,1,0,0
NO,0,3,0
?,0,0,0


In [11]:
def setConflict(result_1, result_2, column, changeOtherColumns=True):
    result_1[column] = 'Conflict: '+str(result_1[column])+'/'+str(result_2[column])
    if changeOtherColumns:
        result_1['Reasoning'] = '#' + str(result_1['Name']) + '# ' + str(result_1['Reasoning']) + '\\n' + '#' + str(result_2['Name'])+ '# ' + str(result_2['Reasoning'])
        result_1['Name'] = str(result_1['Name']) + "/" + str(result_2['Name'])

def solveConflict(result_1, result_2):
    # Same result on BFC identification
    if result_2['Did this fix a bug?'] == result_1['Did this fix a bug?']:

        # Same result on safety/non-safety relevant bug
        if result_2['Safety relevant bug?'] == result_1['Safety relevant bug?']:
            new_result = result_1.copy()
            any_conflict = False
            
            for column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
                if result_2[column] != result_1[column]:
                    setConflict(new_result,result_2,column, changeOtherColumns= not any_conflict)
                    any_conflict = True

            if any_conflict:
                # Conflicts on classification
                return new_result
            else:
                # No conflicts in classification
                return result_1
        
        # Different result on safety/non-safety relevant bug
        else: 
            new_result = result_1.copy()
            setConflict(new_result,result_2,'Safety relevant bug?')
            return new_result
    # Different result on BFC identification
    else: 
        new_result = result_1.copy()
        setConflict(new_result,result_2,'Did this fix a bug?')
        return new_result
        
def combineResults(set1, set2):

    results_1 = set1.to_dict('records')
    results_2 = set2.to_dict('records')

    combined_results = []

    for idx, result_1 in enumerate(results_1):
        result_2 = results_2[idx]

        if not isinstance(result_2['Did this fix a bug?'], str): # Only values in Results 1
            combined_results.append(result_1)
            continue # Only values in Results 1
    
        if not isinstance(result_1['Did this fix a bug?'], str): # Only values in Results 2
            combined_results.append(result_2)
            continue # Only values in Results 2

        # Values in Results 1 and Results 2
        new_result = solveConflict(result_1, result_2)
        combined_results.append(new_result)
        
    return pd.DataFrame.from_dict(combined_results)

In [12]:
combined_codethink_results_df = combineResults(codethink_results_round_1, codethink_results_round_2)

### 1.1 Conflicts in bug-fixing commit identification (Codethink Round 1 vs Round 2)

In [13]:
combined_codethink_results_df_filtered = combined_codethink_results_df[combined_codethink_results_df['Did this fix a bug?'].notnull()]
showResults(
    combined_codethink_results_df_filtered[combined_codethink_results_df_filtered['Did this fix a bug?'].str.contains('Conflict')]
    [['Name','Did this fix a bug?','Reasoning']]
)

Unnamed: 0,Name,Did this fix a bug?,Reasoning
16,Paul A/Weyman,Conflict: yes/no,"#Paul A# [Why] section in commit message identifiies incorrect outcome ""userspace will react to the hotplug event based on a wrong state"" #Weyman# About adjusting flow of something, an optimization"


### 1.2 Conflicts in safety bugs identification (Codethink Round 1 vs Round 2)

**Note**: Here we consider only the commits for which both classifications gave the same answer to "Did this fix a bug?"
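
The rule in this note is hierarchical: a disagreement is reported only at the first question on which two classifications differ. A minimal sketch of that rule, assuming each classification is a plain dict keyed by the question names used in this notebook (the `first_disagreement` name and the sample answers are hypothetical):

```python
QUESTIONS = ["Did this fix a bug?", "Safety relevant bug?"]
CATEGORIES = ["Timing + Execution", "Memory", "Exchange of Info"]


def first_disagreement(result_1, result_2):
    """Return the questions on which two classifications first differ."""
    # The two top-level questions are checked in order; a difference at one
    # level hides everything below it.
    for question in QUESTIONS:
        if result_1[question] != result_2[question]:
            return [question]
    # Only when both top-level answers agree are the categories compared,
    # and several of them may differ at once.
    return [c for c in CATEGORIES if result_1[c] != result_2[c]]


# Made-up answers from two raters for one commit.
rater_a = {
    "Did this fix a bug?": "yes",
    "Safety relevant bug?": "yes",
    "Timing + Execution": "no",
    "Memory": "yes",
    "Exchange of Info": "no",
}
rater_b = {**rater_a, "Safety relevant bug?": "?", "Memory": "no"}

# The 'Memory' difference is not reported because the raters already
# disagree on whether the bug is safety relevant.
print(first_disagreement(rater_a, rater_b))
```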

In [14]:
combined_codethink_results_df_filtered = combined_codethink_results_df[combined_codethink_results_df['Safety relevant bug?'].notnull()]
showResults(
    combined_codethink_results_df_filtered[combined_codethink_results_df_filtered['Safety relevant bug?'].str.contains('Conflict')]
    [['Name','Did this fix a bug?','Safety relevant bug?','Reasoning']]
)

Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Reasoning
20,Anton Pyntikov/Paul Sherwood,yes,Conflict: yes/no,#Anton Pyntikov# Fix memory coherency #Paul Sherwood# the commit message doesn't give enough info to understand this
37,Frank M/Gavin McCall,yes,Conflict: yes/?,#Frank M# Removes dereferencing of null pointer #Gavin McCall# NULL pointer is dereferenced
43,Tomas Veiga/Aidan MacDonald,yes,Conflict: no/?,#Tomas Veiga# No reasoning #Aidan MacDonald# SCSI adapter drops all drives after suspend/resume cycle
45,Jeeeun/Aidan MacDonald,yes,Conflict: yes/?,#Jeeeun# fixes the issue that igb module might hang or clash by clearing the variable #Aidan MacDonald# rmmod of module causes hang or crash
48,Gavin McCall/Adnan Chowdhury,yes,Conflict: ?/yes,#Gavin McCall# Null data returned #Adnan Chowdhury# This fixes a null pointer reference error on some systems with APU


### 1.3 Conflicts in safety classification (Codethink Round 1 vs Round 2)

**Note**: Here we consider only the commits for which both classifications gave the same answer to "Safety relevant bug?"

In [15]:
for column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
    display(column)
    combined_codethink_results_df_filtered = combined_codethink_results_df[combined_codethink_results_df[column].notnull()]
    showResults(
        combined_codethink_results_df_filtered[combined_codethink_results_df_filtered[column].str.contains('Conflict')]
        [['Name','Did this fix a bug?','Safety relevant bug?',column,'Reasoning']]
    )

'Timing + Execution'

Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Timing + Execution,Reasoning
38,John G/Anton Pyntikov,yes,yes,Conflict: no/yes,"#John G# ""it makes page table base points to wrong address if..."" #Anton Pyntikov# Memory corruption if phys addr beyond 64Gb"


'Memory'

Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Memory,Reasoning
29,Tom Bloor/Gavin McCall,yes,yes,Conflict: yes/no,#Tom Bloor# Race condition leading to incorrect reference counting and use after free. also strawberries #Gavin McCall# use after free


'Exchange of Info'

### 1.4 Results of Codethink (Round 1 + Round 2)

In [16]:
def reportResuts(results, show_conflicts=True):
    results_filtered = results[results['Name'].notnull()]
    print("BFC identified: ",len(results_filtered[results_filtered['Did this fix a bug?']=="yes"]))
    print("Safety relevant bugs identified: ",len(results_filtered[results_filtered['Safety relevant bug?']=="yes"]))
    for column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
        print("   > "+column+": ",
            len(results_filtered.loc[
                (results_filtered['Safety relevant bug?']=="yes") & (results_filtered[column]=="yes")]
            )
        )
    if show_conflicts:
        print("Conflicts: ")
        for column in ['Did this fix a bug?', 'Safety relevant bug?', 'Timing + Execution', 'Memory', 'Exchange of Info']:
            results_filtered_aux = results_filtered[results_filtered[column].notnull()]
            print("    > "+column+": ", len(results_filtered_aux[results_filtered_aux[column].str.contains('Conflict')]))
reportResuts(combined_codethink_results_df)

BFC identified:  43
Safety relevant bugs identified:  17
   > Timing + Execution:  7
   > Memory:  11
   > Exchange of Info:  5
Conflicts: 
    > Did this fix a bug?:  1
    > Safety relevant bug?:  5
    > Timing + Execution:  1
    > Memory:  1
    > Exchange of Info:  0


## 2. Ben Dooks results

In [17]:
codethink_results_ben_dooks = pd.read_csv('Codethink Hivemind Linux Bug Classification - Ben Dooks.csv')
codethink_results_ben_dooks['Name'] = "Ben D."  # same reviewer for every row
for column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
    codethink_results_ben_dooks[[column]] = codethink_results_ben_dooks[[column]].fillna("no")

for column in ['Did this fix a bug?','Safety relevant bug?','Timing + Execution', 'Memory', 'Exchange of Info']:
    codethink_results_ben_dooks[[column]] = codethink_results_ben_dooks[[column]].replace("maybe?", "?")
    codethink_results_ben_dooks[[column]] = codethink_results_ben_dooks[[column]].replace("no?", "no")
    codethink_results_ben_dooks[[column]] = codethink_results_ben_dooks[[column]].replace("yes?", "yes")
codethink_results_ben_dooks[["Reasoning"]] = codethink_results_ben_dooks[["Reasoning"]].fillna("No reasoning")
reportResuts(codethink_results_ben_dooks, show_conflicts=False)

BFC identified:  40
Safety relevant bugs identified:  14
   > Timing + Execution:  1
   > Memory:  8
   > Exchange of Info:  0


In [18]:
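# Keep only the commits that were reviewed in Codethink Round 1 (the mask comes from the Round 1 table)
codethink_results_ben_dooks_filled = codethink_results_ben_dooks[codethink_results_round_1['Name'] != "None"]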
print("Ben Dooks- results count: ",len(codethink_results_ben_dooks_filled))
(
    codethink_results_ben_dooks_filled
    [["Name","Did this fix a bug?","Safety relevant bug?","Timing + Execution","Memory","Exchange of Info","Reasoning"]]
)

Ben Dooks- results count:  47


Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Timing + Execution,Memory,Exchange of Info,Reasoning
0,Ben D.,yes,no,no,no,no,locking and reference fix in dm layer
1,Ben D.,yes,?,no,yes,no,scsi driver fix for lock after free
2,Ben D.,yes,no,no,no,?,fixes specific scsi scanning bug
3,Ben D.,no,no,no,no,no,removed unnecessary work already done elsewhere
4,Ben D.,yes,no,no,no,no,removes warning due to linker output
5,Ben D.,yes,yes,no,no,no,fixes gfs2 locking bug over suspend/resume
6,Ben D.,yes,no,no,no,no,build configuration dependency issue
7,Ben D.,yes,no,no,no,no,reliable boot failure if pagetable size is set to 5 levels
8,Ben D.,no,no,no,no,no,kernel page allocation api change
9,Ben D.,yes,no,no,no,no,build configuration dependency issue


## 3. Comparing the Codethink rounds' results with Ben Dooks results

**Note**: When comparing Codethink's and Ben's results, we have left out the commits on which the Codethink developers disagreed among themselves.

In [19]:
getComparativeMatrix("Did this fix a bug?",combined_codethink_results_df, codethink_results_ben_dooks, "Codethink R1+R2", "Ben Dooks")

Unnamed: 0_level_0,YES,NO,?
Ben Dooks:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,35,2,0
NO,4,1,0
?,2,0,0


In [20]:
getComparativeMatrix("Safety relevant bug?",combined_codethink_results_df, codethink_results_ben_dooks, "Codethink R1+R2", "Ben Dooks")

Unnamed: 0_level_0,YES,NO,?
Ben Dooks:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,9,1,0
NO,4,7,2
?,4,5,0
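
One way to summarise this matrix is the raw agreement rate, the share of compared commits that fall on the diagonal. The sketch below copies the counts from the table above (columns are Codethink R1+R2, rows are Ben Dooks); `agreement_rate` is a hypothetical helper, and raw agreement does not correct for agreement expected by chance.

```python
def agreement_rate(matrix):
    """Return (agreeing, total, rate) for a square matrix of answer counts."""
    agreeing = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return agreeing, total, agreeing / total


# Rows: Ben Dooks YES / NO / ?; columns: Codethink YES / NO / ?.
safety_relevant = [
    [9, 1, 0],
    [4, 7, 2],
    [4, 5, 0],
]
print(agreement_rate(safety_relevant))  # (16, 32, 0.5)
```

These are the 16 agreements and 16 differences quoted in the preliminary conclusions.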


In [21]:
for column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
    display(column)
    getComparativeMatrix(column,combined_codethink_results_df, codethink_results_ben_dooks, "Codethink R1+R2", "Ben Dooks")

'Timing + Execution'

Unnamed: 0_level_0,YES,NO,?
Ben Dooks:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,1,0,0
NO,7,13,0
?,0,0,0


'Memory'

Unnamed: 0_level_0,YES,NO,?
Ben Dooks:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,4,2,0
NO,2,12,0
?,0,1,0


'Exchange of Info'

Unnamed: 0_level_0,YES,NO,?
Ben Dooks:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,0,0,0
NO,5,15,1
?,0,0,0


In [22]:
combined_codethink_and_ben_results_df = combineResults(combined_codethink_results_df, codethink_results_ben_dooks)

### 3.1 Conflicts in bug-fixing commit identification (Codethink R1+R2 vs Ben Dooks)

In [23]:
combined_codethink_and_ben_results_df_filtered = combined_codethink_and_ben_results_df[combined_codethink_and_ben_results_df['Did this fix a bug?'].notnull()]
# Show conflicts raised by this comparison only, not ones carried over from the Round 1 vs Round 2 merge
bfc_answers = combined_codethink_and_ben_results_df_filtered['Did this fix a bug?']
showResults(
    combined_codethink_and_ben_results_df_filtered
    [bfc_answers.str.contains('Conflict') & ~bfc_answers.str.contains('Conflict: Conflict:')]
    [['Name','Did this fix a bug?','Reasoning']]
)

Unnamed: 0,Name,Did this fix a bug?,Reasoning
8,Tom Bloor/Ben D.,Conflict: yes/no,#Tom Bloor# wouldnt compile without this fix #Ben D.# kernel page allocation api change
11,William Salmon/Ben D.,Conflict: yes/no,"#William Salmon# Fixes bug improving filesystem performace, could effect timing but would not be safety critical in a first order sence #Ben D.# kernel page allocation api change"
13,John G/Ben D.,Conflict: no/yes,"#John G# Makes different things (""DPIA""?) preferred, sounds like enhancement rather than fx #Ben D.# bug is hardware assignment at init time"
20,Anton Pyntikov/Paul Sherwood/Ben D.,Conflict: yes/?,#Anton Pyntikov/Paul Sherwood# #Anton Pyntikov# Fix memory coherency #Paul Sherwood# the commit message doesn't give enough info to understand this #Ben D.# not enough info in commit
24,Sam L/Ben D.,Conflict: yes/?,#Sam L# I think HDP registers are only related to caching so not safety related #Ben D.# No reasoning
27,Paul Sherwood/Ben D.,Conflict: no/yes,#Paul Sherwood# i think this is just a slight performance improvement #Ben D.# kunit test bug
33,Nathan/Ben D.,Conflict: yes/no,#Nathan# Reverts a commit that had a false assumption that did an error out instead of a garbage out. This doesn't effect the outcome both still count as a fail. #Ben D.# nfs issue where data will be retransmitted - todo check if memory leak?
36,Ivan Orlov/Ben D.,Conflict: yes/no,#Ivan Orlov# one instruction compiles as another #Ben D.# removes extra unused branch protection instructions


### 3.2 Conflicts in safety bugs identification (Codethink R1+R2 vs Ben Dooks)

**Note**: Here we consider only the commits for which both classifications gave the same answer to "Did this fix a bug?"

In [24]:
combined_codethink_and_ben_results_df_filtered = combined_codethink_and_ben_results_df[combined_codethink_and_ben_results_df['Safety relevant bug?'].notnull()]
showResults(
    combined_codethink_and_ben_results_df_filtered
    [combined_codethink_and_ben_results_df_filtered['Did this fix a bug?'].str.contains('Conflict:')==False]
    [combined_codethink_and_ben_results_df_filtered['Safety relevant bug?'].str.contains('Conflict: Conflict:')==False]
    [combined_codethink_and_ben_results_df_filtered['Safety relevant bug?'].str.contains('Conflict')]
    [['Name','Did this fix a bug?','Safety relevant bug?','Reasoning']]
)
# Ben Dooks Commit 40

Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Reasoning
0,Adnan Chowdhury/Ben D.,yes,Conflict: yes/no,#Adnan Chowdhury# Fix of a race condition which could possibly corrupt memory #Ben D.# locking and reference fix in dm layer
1,John G/Ben D.,yes,Conflict: yes/?,"#John G# ""could [stuff...]... causing a use-after-free memory access"" #Ben D.# scsi driver fix for lock after free"
2,Hao Hu/Ben D.,yes,Conflict: yes/no,#Hao Hu# Change SAS interface pointer #Ben D.# fixes specific scsi scanning bug
7,Tom Bloor/Ben D.,yes,Conflict: ?/no,#Tom Bloor# if youre rebooting during normal operation then... good luck? #Ben D.# reliable boot failure if pagetable size is set to 5 levels
15,John G/Ben D.,yes,Conflict: yes/no,"#John G# Commit message said so, diff includes a free*()ing function #Ben D.# jbd2 may fail to release memory in error paths"
17,Ivan Orlov/Ben D.,yes,Conflict: no/?,#Ivan Orlov# Commit message + code which stops FS trimming when suspend #Ben D.# ext4 fstrim may block suspend
18,Basit A./Ben D.,yes,Conflict: no/yes,#Basit A.# commit message + interrupt service routines using mutexes #Ben D.# fixes deadlock in drm/vkms driver
19,Sam L/Ben D.,yes,Conflict: no/?,"#Sam L# Previous code doesn't reschedule packet, packets are now rescheduled properly I think?? #Ben D.# nfs requests may be lost - maybe get retried?"
23,Poppy/Ben D.,yes,Conflict: ?/no,#Poppy# Metrics table filler fails correctly. #Ben D.# caller should detect invalid response and retry metrics
25,Paul Sherwood/Ben D.,yes,Conflict: no/?,#Paul Sherwood# No reasoning #Ben D.# nfs bugfix


### 3.3 Conflicts in safety classification (Codethink R1+R2 vs Ben Dooks)

**Note**: Here we consider only the commits for which both classifications gave the same answer to "Safety relevant bug?"

In [25]:
for column in ['Timing + Execution', 'Memory', 'Exchange of Info']:
    display(column)
    combined_codethink_and_ben_results_df_filtered = combined_codethink_and_ben_results_df[combined_codethink_and_ben_results_df[column].notnull()]
    showResults(
        combined_codethink_and_ben_results_df_filtered
        [combined_codethink_and_ben_results_df_filtered['Did this fix a bug?'].str.contains('Conflict:')==False]
        [combined_codethink_and_ben_results_df_filtered['Safety relevant bug?'].str.contains('Conflict:')==False]
        [combined_codethink_and_ben_results_df_filtered[column].str.contains('Conflict')]
        [['Name','Did this fix a bug?','Safety relevant bug?',column,'Reasoning']]
    )

'Timing + Execution'

Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Timing + Execution,Reasoning
3,Hao Hu/Ben D.,no,no,Conflict: yes/no,#Hao Hu# Remove extra dirty node flag #Ben D.# removed unnecessary work already done elsewhere
5,Aidan MacDonald/Ben D.,yes,yes,Conflict: yes/no,"#Aidan MacDonald# cluster filesystem locking error causing hang - according to commit message, difficult to verify otherwise #Ben D.# fixes gfs2 locking bug over suspend/resume"
12,Weyman/Ben D.,yes,yes,Conflict: yes/no,"#Weyman# Involves an event, a lock, and access of released resource (a crash) #Ben D.# possible lock usage on freed memory"
14,Paul Sherwood/Ben D.,yes,yes,Conflict: yes/no,"#Paul Sherwood# bug related to timing in io function, so scheduling + info exchange? #Ben D.# possible sleep in invalid place may cause system hang"
21,Sam L/Ben D.,yes,yes,Conflict: yes/no,#Sam L# Previous code locked the wrong lock... #Ben D.# incorrect locking in nfs
28,Gavin McCall/Ben D.,yes,no,Conflict: yes/no,"#Gavin McCall# ""not valid"" #Ben D.# backlight bug on certain amd hw during shutdown"
41,JZ/Ben D.,yes,yes,Conflict: yes/no,"#JZ# race condition between adding and reading data, resulting in trying to add two entries with the same index #Ben D.# possible runtime failiure in btrfs under user conditions"


'Memory'

Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Memory,Reasoning
12,Weyman/Ben D.,yes,yes,Conflict: no/yes,"#Weyman# Involves an event, a lock, and access of released resource (a crash) #Ben D.# possible lock usage on freed memory"
14,Paul Sherwood/Ben D.,yes,yes,Conflict: no/yes,"#Paul Sherwood# bug related to timing in io function, so scheduling + info exchange? #Ben D.# possible sleep in invalid place may cause system hang"
21,Sam L/Ben D.,yes,yes,Conflict: yes/no,#Sam L# Previous code locked the wrong lock... #Ben D.# incorrect locking in nfs
22,Basit A./Ben D.,yes,yes,Conflict: yes/no,#Basit A.# commit message + use after free #Ben D.# possible read of invalid memory / BUG trigger


'Exchange of Info'

Unnamed: 0,Name,Did this fix a bug?,Safety relevant bug?,Exchange of Info,Reasoning
5,Aidan MacDonald/Ben D.,yes,yes,Conflict: ?/no,"#Aidan MacDonald# cluster filesystem locking error causing hang - according to commit message, difficult to verify otherwise #Ben D.# fixes gfs2 locking bug over suspend/resume"
14,Paul Sherwood/Ben D.,yes,yes,Conflict: yes/no,"#Paul Sherwood# bug related to timing in io function, so scheduling + info exchange? #Ben D.# possible sleep in invalid place may cause system hang"
22,Basit A./Ben D.,yes,yes,Conflict: yes/no,#Basit A.# commit message + use after free #Ben D.# possible read of invalid memory / BUG trigger
31,Dickon Hood/Ben D.,yes,yes,Conflict: yes/no,#Dickon Hood# Socket handling error under error conditions #Ben D.# could cause errors in the network data quees
41,JZ/Ben D.,yes,yes,Conflict: yes/no,"#JZ# race condition between adding and reading data, resulting in trying to add two entries with the same index #Ben D.# possible runtime failiure in btrfs under user conditions"


## 4. Comparing with our results

In [26]:
michel_results = pd.read_csv('Linux Bug Classification - Michel Maes.csv')  

### 4.1 Comparing Michel Maes results with Codethink R1+R2 results

In [27]:
getComparativeMatrix("Did this fix a bug?",michel_results, combined_codethink_results_df, "Michel Maes", "Codethink R1+R2")

Unnamed: 0_level_0,YES,NO,?
Codethink R1+R2:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,37,3,3
NO,2,0,1
?,0,0,0


In [28]:
combined_codethink_and_michel_results_df = combineResults(combined_codethink_results_df, michel_results)

In [29]:
combined_codethink_and_michel_results_df_filtered = combined_codethink_and_michel_results_df[combined_codethink_and_michel_results_df['Did this fix a bug?'].notnull()]
# Show conflicts raised by this comparison only, not ones carried over from the Round 1 vs Round 2 merge
bfc_answers = combined_codethink_and_michel_results_df_filtered['Did this fix a bug?']
showResults(
    combined_codethink_and_michel_results_df_filtered
    [bfc_answers.str.contains('Conflict') & ~bfc_answers.str.contains('Conflict: Conflict:')]
    [['Name','Did this fix a bug?','Reasoning']]
)

Unnamed: 0,Name,Did this fix a bug?,Reasoning
3,Hao Hu/Michel M.,Conflict: no/?,#Hao Hu# Remove extra dirty node flag #Michel M.# not sure
4,Basit A./Michel M.,Conflict: yes/no,#Basit A.# commit + superfluos function section data generation #Michel M.# change on Makefile - no code change
6,Chris Phang/Michel M.,Conflict: yes/no,#Chris Phang# kbuild dependency fix #Michel M.# change on Kconfig - no code change
9,Aidan MacDonald/Michel M.,Conflict: yes/no,#Aidan MacDonald# Kbuild fix #Michel M.# change on Kconfig - no code change
11,William Salmon/Michel M.,Conflict: yes/?,"#William Salmon# Fixes bug improving filesystem performace, could effect timing but would not be safety critical in a first order sence #Michel M.# looks like a refactor - rename variable page"
13,John G/Michel M.,Conflict: no/yes,"#John G# Makes different things (""DPIA""?) preferred, sounds like enhancement rather than fx #Michel M.# commit message indcates that is a fix"
24,Sam L/Michel M.,Conflict: yes/?,#Sam L# I think HDP registers are only related to caching so not safety related #Michel M.# not sure
27,Paul Sherwood/Michel M.,Conflict: no/yes,#Paul Sherwood# i think this is just a slight performance improvement #Michel M.# commit msg indicates that this change fixes a bug introduced in another commit
28,Gavin McCall/Michel M.,Conflict: yes/?,"#Gavin McCall# ""not valid"" #Michel M.# not sure"


### 4.2 Comparing Michel Maes results with Ben Dooks results

In [30]:
getComparativeMatrix("Did this fix a bug?",michel_results, codethink_results_ben_dooks, "Michel Maes", "Ben Dooks")

Unnamed: 0_level_0,YES,NO,?
Ben Dooks:,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
YES,36,3,1
NO,3,0,2
?,1,0,1


In [31]:
combined_ben_and_michel_results_df = combineResults(codethink_results_ben_dooks, michel_results)

In [32]:
combined_ben_and_michel_results_df_filtered = combined_ben_and_michel_results_df[combined_ben_and_michel_results_df['Did this fix a bug?'].notnull()]
bfc_answers = combined_ben_and_michel_results_df_filtered['Did this fix a bug?']
showResults(
    combined_ben_and_michel_results_df_filtered
    [bfc_answers.str.contains('Conflict') & ~bfc_answers.str.contains('Conflict: Conflict:')]
    [['Name','Did this fix a bug?','Reasoning']]
)

Unnamed: 0,Name,Did this fix a bug?,Reasoning
3,Ben D./Michel M.,Conflict: no/?,#Ben D.# removed unnecessary work already done elsewhere #Michel M.# not sure
4,Ben D./Michel M.,Conflict: yes/no,#Ben D.# removes warning due to linker output #Michel M.# change on Makefile - no code change
6,Ben D./Michel M.,Conflict: yes/no,#Ben D.# build configuration dependency issue #Michel M.# change on Kconfig - no code change
8,Ben D./Michel M.,Conflict: no/yes,#Ben D.# kernel page allocation api change #Michel M.# commit msg indicates that this change fixes a bug introduced in another commit
9,Ben D./Michel M.,Conflict: yes/no,#Ben D.# build configuration dependency issue #Michel M.# change on Kconfig - no code change
11,Ben D./Michel M.,Conflict: no/?,#Ben D.# kernel page allocation api change #Michel M.# looks like a refactor - rename variable page
20,Ben D./Michel M.,Conflict: ?/yes,#Ben D.# not enough info in commit #Michel M.# commit msg indicates that the change is needed for HDP flush to work correctly
28,Ben D./Michel M.,Conflict: yes/?,#Ben D.# backlight bug on certain amd hw during shutdown #Michel M.# not sure
33,Ben D./Michel M.,Conflict: no/yes,#Ben D.# nfs issue where data will be retransmitted - todo check if memory leak? #Michel M.# revert a problematic change
36,Ben D./Michel M.,Conflict: no/yes,#Ben D.# removes extra unused branch protection instructions #Michel M.# commit msg indicates that this change fixes a bug introduced in another commit


#### Michel's comments:

- I did not count as bug fixes those commits that did not touch source code lines, such as changes to the Makefile and Kconfig configuration files.
- I relied on the format of the commit message: when a commit is a bug fix, the message in most cases names the commit it fixes, in the form "Fixes: \<commit-hash\>".
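
The second criterion can be approximated mechanically. The sketch below looks for a `Fixes:` line in a commit message; the regular expression, the `fixed_commits` name and the sample message are illustrative assumptions, and the classification reported above was done by hand.

```python
import re

# A "Fixes:" line: an abbreviated or full hexadecimal commit hash, usually
# followed by the subject of the commit being fixed.
FIXES_TRAILER = re.compile(
    r"^Fixes:\s*([0-9a-f]{7,40})\b", re.MULTILINE | re.IGNORECASE
)


def fixed_commits(message):
    """Return the commit hashes named in 'Fixes:' lines of a commit message."""
    return FIXES_TRAILER.findall(message)


# Made-up commit message, for illustration only.
message = """example: handle missing buffer

Avoid dereferencing the buffer before it is allocated.

Fixes: 0123456789ab ("example: add buffer")
Signed-off-by: A Developer <dev@example.com>
"""
print(fixed_commits(message))  # ['0123456789ab']
```

A message without such a line returns an empty list, which under this criterion alone would not be counted as a bug fix.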
