In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import os
import math

In [None]:
buggy_elements = pd.read_csv('../data/RQ1_Data_Processed.csv')

## Total amount of buggy code elements spread through the quantum projects

In [None]:
# AST node class names, in the same order as code_element_description below.
# Note: the class is spelled 'Nonlocal' in Python's ast module ('NonLocal' would never match).
ast_node_names = ['Return','Call','Delete','Assign','AugAssign','AnnAssign','For','AsyncFor','While','If','With','AsyncWith','Match','Raise','Try','Assert','Import','ImportFrom','Global','Nonlocal','Break','Continue','Pass','BoolOp','NamedExpr','BinOp','UnaryOp','Lambda','IfExp','ListComp','SetComp','DictComp','GeneratorExp','Await','Yield','YieldFrom','Compare','Constant','Tuple','List','Attribute']
code_elements = ["<class '_ast." + name + "'>" for name in ast_node_names]

In [None]:
code_element_count = []

for code_element in code_elements:
    counter = 0
    for values in buggy_elements['Values']:
       
        if code_element in values:
            counter = counter +1
            
    code_element_count.append(counter)

In [None]:
print(code_element_count)
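
The counting loop above can also be expressed with pandas string methods. The following is only a sketch: `count_code_elements`, `toy_values` and `toy_elements` are hypothetical names introduced for illustration, and the toy strings merely imitate what the `Values` column is assumed to contain.

```python
import pandas as pd

def count_code_elements(values, elements):
    """Count, for each element, the rows of `values` whose text contains it."""
    # regex=False: the element strings contain '.', which must be matched literally.
    return [int(values.str.contains(element, regex=False).sum()) for element in elements]

# Hypothetical stand-in for buggy_elements['Values'] (not the study's data).
toy_values = pd.Series([
    "<class '_ast.Call'> <class '_ast.Assign'>",
    "<class '_ast.Call'>",
    "<class '_ast.AsyncFor'>",
])
toy_elements = ["<class '_ast.Call'>", "<class '_ast.Assign'>", "<class '_ast.For'>"]

toy_counts = count_code_elements(toy_values, toy_elements)
print(toy_counts)
```

Like the loop, this counts rows that contain an element at least once, not the number of occurrences within a row. The same helper could be reused for the quantum, classical and per-project subsets.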

In [None]:
code_element_description = ['Return','Function call','Del keyword','Assignment','Augmented Assignment','Annotated assignment','For Loop','Async For Loop','While Loop','If statement','With keyword','Async With keyword','Match keyword','Raise Exception','Try exception','Assertion','Import Keyword','ImportFrom Keyword','Global Variable','Non Local Variable','Break','Continue','Pass','Boolean operator','Named expression','Binary operator','Unary Operator','Lambda','Conditional Expression','List comprehension','Set Comprehension','Dictionary Comprehension','Generator Expression','Await expression','Yield expression','YieldFrom expression','Comparison Expression','Constant','Tuple','List','Attribute']

In [None]:
print(len(code_element_description))
print(len(code_element_count))

In [None]:
# Figure Size
fig, ax = plt.subplots(figsize =(16, 9))
 
# Horizontal Bar Plot
ax.barh(code_element_description, code_element_count, color = 'yellow')
 
# Remove axes spines
for s in ['top', 'bottom', 'left', 'right']:
    ax.spines[s].set_visible(False)
 
# Remove x, y Ticks
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')
 
# Add padding between axes and labels
ax.xaxis.set_tick_params(pad = 5)
ax.yaxis.set_tick_params(pad = 10)
 
# Add x, y gridlines
ax.grid(True, color ='grey',
        linestyle ='-.', linewidth = 0.5,
        alpha = 0.2)
 
# Show top values
ax.invert_yaxis()
 
# Add annotation to bars
for i in ax.patches:
    plt.text(i.get_width()+0.2, i.get_y()+0.5,
             str(round((i.get_width()), 2)),
             fontsize = 10, fontweight ='bold',
             color ='grey')
 
# Add Plot Title
ax.set_title('Buggy Code Element Distribution ',
             loc ='left', )
# Show Plot
plt.show()

In [None]:
fig.savefig("RQ1_Code_Element_Distribution.pdf", bbox_inches='tight',)

Answer: As the bar plot above shows, the most common buggy elements are function calls, variable assignments and importFrom statements. Interestingly, plain import statements are far less likely to be buggy than importFrom statements, which may suggest that developers import names that are not strictly necessary for the code to work, or that they should have imported the whole library instead. Function calls are the statements that most often cause buggy behaviour. If statements also tend to be slightly buggy. In contrast, there are few changes in iteration statements such as for or while loops, which suggests that these statements are correct most of the time. More specific statements, such as declarations of global and nonlocal variables or asynchronous constructs, do not appear at all among the buggy elements of these quantum projects. We can also see that comparison operators such as != and == tend to be far more buggy than binary operators such as the arithmetic ones (+, -, *, / and so forth).
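
The ranking described above can be checked numerically instead of being read off the plot. A minimal sketch, assuming two parallel lists shaped like `code_element_description` and `code_element_count`; the labels and numbers below are invented for illustration and are not the study's results.

```python
import pandas as pd

# Hypothetical labels and counts, for illustration only (not the study's data).
example_descriptions = ['While Loop', 'Assignment', 'Function call', 'ImportFrom Keyword']
example_counts = [3, 95, 120, 60]

# Pair each label with its count and sort from most to least frequent.
ranking = pd.Series(example_counts, index=example_descriptions).sort_values(ascending=False)

# The three most frequent buggy elements.
top_three = ranking.head(3)
print(top_three)
```

Passing the notebook's own two lists in place of the example ones would give the actual ranking.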

## Average buggy code element distribution

In [None]:
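# Average count per project. 14 is presumably the number of repositories in the dataset
# (it can be compared with buggy_elements['Repo'].nunique()).
NUM_PROJECTS = 14
average_code_element_count = []

for counter in code_element_count:
    average_code_element_count.append(round(counter/NUM_PROJECTS))
    
print(len(average_code_element_count))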
    

In [None]:
# Figure Size
fig, ax = plt.subplots(figsize =(16, 9))
 
# Horizontal Bar Plot
ax.barh(code_element_description,average_code_element_count, color = 'orange')
 
# Remove axes spines
for s in ['top', 'bottom', 'left', 'right']:
    ax.spines[s].set_visible(False)
 
# Remove x, y Ticks
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')
 
# Add padding between axes and labels
ax.xaxis.set_tick_params(pad = 5)
ax.yaxis.set_tick_params(pad = 10)
 
# Add x, y gridlines
ax.grid(True, color ='grey',
        linestyle ='-.', linewidth = 0.5,
        alpha = 0.2)
 
# Show top values
ax.invert_yaxis()
 
# Add annotation to bars
for i in ax.patches:
    plt.text(i.get_width()+0.2, i.get_y()+0.5,
             str(round((i.get_width()), 2)),
             fontsize = 10, fontweight ='bold',
             color ='grey')
 
# Add Plot Title
ax.set_title('Average Buggy Code Element Distribution ',
             loc ='left', )
# Show Plot
plt.show()

In [None]:
fig.savefig("RQ1_Average_Code_Element_Distribution.pdf", bbox_inches='tight',)

## Total amount of buggy code elements in quantum bugs

In [None]:
quantum_bug_dataframe = buggy_elements.loc[((buggy_elements['Bug Type'] == 'Quantum'))]

In [None]:
code_element_count_quantum = []

for code_element in code_elements:
    counter = 0
    for values in quantum_bug_dataframe['Values']:
       
        if code_element in values:
            counter = counter +1
            
    code_element_count_quantum.append(counter)

In [None]:
# Figure Size
fig, ax = plt.subplots(figsize =(16, 9))
 
# Horizontal Bar Plot
ax.barh(code_element_description, code_element_count_quantum, color = 'green')
 
# Remove axes spines
for s in ['top', 'bottom', 'left', 'right']:
    ax.spines[s].set_visible(False)
 
# Remove x, y Ticks
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')
 
# Add padding between axes and labels
ax.xaxis.set_tick_params(pad = 5)
ax.yaxis.set_tick_params(pad = 10)
 
# Add x, y gridlines
ax.grid(True, color ='grey',
        linestyle ='-.', linewidth = 0.5,
        alpha = 0.2)
 
# Show top values
ax.invert_yaxis()
 
# Add annotation to bars
for i in ax.patches:
    plt.text(i.get_width()+0.2, i.get_y()+0.5,
             str(round((i.get_width()), 2)),
             fontsize = 10, fontweight ='bold',
             color ='grey')
 
# Add Plot Title
ax.set_title('Buggy Code Element Distribution Over Quantum bugs ',
             loc ='left', )
# Show Plot
plt.show()

In [None]:
fig.savefig("RQ1_Quantum_Code_Element_Distribution.pdf", bbox_inches='tight',)

Answer: For quantum bugs, the results largely mirror the overall buggy element results. Function calls, if statements, assignments and importFrom statements remain the most common buggy elements, followed by changes in data structure expressions such as tuples, lists or even attributes. Interestingly, although there are fewer quantum bugs than classical bugs, most of the changes apply to quantum bugs, which suggests that quantum behaviour, and thus how it affects the code (developing a particular functionality requires quantum knowledge), is still not very clear to developers. It is also interesting to note that quantum bugs contain few buggy occurrences of the 'for' statement.
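
The observation that fewer bugs account for more buggy elements can be made explicit by counting rows and element occurrences per bug type. The sketch below assumes only the `Bug Type` and `Values` columns used in this notebook; the toy rows, the `toy` and `elements` names and the resulting numbers are illustrative, not the study's data.

```python
import pandas as pd

# Hypothetical rows mimicking the 'Bug Type' and 'Values' columns (not the study's data).
toy = pd.DataFrame({
    'Bug Type': ['Quantum', 'Quantum', 'Classical', 'Classical', 'Classical'],
    'Values': [
        "<class '_ast.Call'> <class '_ast.Assign'>",
        "<class '_ast.Call'> <class '_ast.If'>",
        "<class '_ast.Call'>",
        "<class '_ast.For'>",
        "<class '_ast.Assign'>",
    ],
})
elements = ["<class '_ast.Call'>", "<class '_ast.Assign'>",
            "<class '_ast.If'>", "<class '_ast.For'>"]

# Number of listed elements found in each row (same substring test as the notebook).
toy['n_elements'] = toy['Values'].apply(lambda v: sum(e in v for e in elements))

# Rows and element occurrences per bug type, plus their ratio.
summary = toy.groupby('Bug Type')['n_elements'].agg(rows='count', elements='sum')
summary['elements_per_row'] = summary['elements'] / summary['rows']
print(summary)
```

A ratio such as `elements_per_row` separates "more bugs" from "more elements per bug", which the raw totals in the plots cannot do.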

## Total amount of buggy code elements in classical bugs

In [None]:
classical_bug_dataframe = buggy_elements.loc[((buggy_elements['Bug Type'] == 'Classical'))]

In [None]:
code_element_count_classical = []

for code_element in code_elements:
    counter = 0
    for values in classical_bug_dataframe['Values']:
       
        if code_element in values:
            counter = counter +1
            
    code_element_count_classical.append(counter)

In [None]:
# Figure Size
fig, ax = plt.subplots(figsize =(16, 9))
 
# Horizontal Bar Plot
ax.barh(code_element_description, code_element_count_classical, color = 'red')
 
# Remove axes spines
for s in ['top', 'bottom', 'left', 'right']:
    ax.spines[s].set_visible(False)
 
# Remove x, y Ticks
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')
 
# Add padding between axes and labels
ax.xaxis.set_tick_params(pad = 5)
ax.yaxis.set_tick_params(pad = 10)
 
# Add x, y gridlines
ax.grid(True, color ='grey',
        linestyle ='-.', linewidth = 0.5,
        alpha = 0.2)
 
# Show top values
ax.invert_yaxis()
 
# Add annotation to bars
for i in ax.patches:
    plt.text(i.get_width()+0.2, i.get_y()+0.5,
             str(round((i.get_width()), 2)),
             fontsize = 10, fontweight ='bold',
             color ='grey')
 
# Add Plot Title
ax.set_title('Buggy Code Element Distribution Over Classical bugs ',
             loc ='left', )
# Show Plot
plt.show()

In [None]:
fig.savefig("RQ1_Classic_Code_Element_Distribution.pdf", bbox_inches='tight',)

Answer: Unlike quantum bugs, classical bugs contain far fewer buggy code elements, even though these quantum projects have more classical bugs than quantum bugs. That said, the most buggy code elements remain the same (function calls, assignments, importFrom and if statements). However, it is in classical bugs that buggy 'for' statements occur most often.

## Buggy code elements per project 

In [None]:
def buggy_code_element_project(project):
    project_bug_dataframe = buggy_elements.loc[((buggy_elements['Repo'] == project))]
                                                
    return project_bug_dataframe
    

In [None]:
def BugDistributionPerRepo(project_name,values):
        # Figure Size
    fig, ax = plt.subplots(figsize =(16, 9))

    # Horizontal Bar Plot
    ax.barh(code_element_description, values, color = 'red')

    # Remove axes spines
    for s in ['top', 'bottom', 'left', 'right']:
        ax.spines[s].set_visible(False)

    # Remove x, y Ticks
    ax.xaxis.set_ticks_position('none')
    ax.yaxis.set_ticks_position('none')

    # Add padding between axes and labels
    ax.xaxis.set_tick_params(pad = 5)
    ax.yaxis.set_tick_params(pad = 10)

    # Add x, y gridlines
    ax.grid(True, color ='grey',
            linestyle ='-.', linewidth = 0.5,
            alpha = 0.2)

    # Show top values
    ax.invert_yaxis()

    # Add annotation to bars
    for i in ax.patches:
        plt.text(i.get_width()+0.2, i.get_y()+0.5,
                 str(round((i.get_width()), 2)),
                 fontsize = 10, fontweight ='bold',
                 color ='grey')

    # Add Plot Title
    ax.set_title('Buggy Code Element Distribution per project - ' + project_name,
                 loc ='left', )
    # Show Plot
    plt.show()
    
    fig.savefig("RQ1_" + project_name + "_Code_Element_Distribution.pdf", bbox_inches='tight',)
    

In [None]:
uniqueRepo = buggy_elements['Repo'].unique()
print(uniqueRepo)

In [None]:
for repo in uniqueRepo:
    
    # Select this repository's rows once, not once per code element
    repo_df = buggy_code_element_project(repo)
    code_element_count_repo = []

    for code_element in code_elements:
        counter = 0
        for values in repo_df['Values']:

            if code_element in values:
                counter = counter +1

        code_element_count_repo.append(counter)
        
    BugDistributionPerRepo(repo,code_element_count_repo)
        
    
    
    
    

Answer:

The plots above show the distribution of buggy code elements per project. The majority of the buggy code elements are found in ProjectQ, which means that this project had the most changes throughout its life cycle (it may be an outlier). The remaining projects have similar distributions, in which the count of each buggy code element usually does not exceed 1000. Projects such as amazon-braket-sdk-python, pennylane, qiskit-aer and Strawberry Fields hold more buggy code elements than the rest (values ranging from 0 to 1000 for each buggy code element). Next come cirq, qiskit-ignis and qiskit-terra, whose values range from 0 to 200 (notable, since qiskit-terra is a huge repository and yet has few changes). Finally, projects such as qsharp, tequila, mitiq and OpenQL have barely any buggy code elements. For most projects, the distribution of the most common buggy code elements matches the overall one, with some exceptions: pennylane, for instance, shows many buggy occurrences of the with keyword, whereas in other projects the with keyword barely appears.
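
Comparing projects across a dozen separate plots is error-prone, so a single table of counts per project may help. The sketch below assumes the `Repo` and `Values` columns used in this notebook; the toy rows and the `element_patterns` mapping are hypothetical and only illustrate the shape of the result.

```python
import pandas as pd

# Hypothetical rows mimicking the 'Repo' and 'Values' columns (not the study's data).
toy = pd.DataFrame({
    'Repo': ['ProjectQ', 'ProjectQ', 'pennylane', 'cirq'],
    'Values': [
        "<class '_ast.Call'> <class '_ast.Assign'>",
        "<class '_ast.Call'>",
        "<class '_ast.With'> <class '_ast.Call'>",
        "<class '_ast.Assign'>",
    ],
})
element_patterns = {
    'Function call': "<class '_ast.Call'>",
    'Assignment': "<class '_ast.Assign'>",
    'With keyword': "<class '_ast.With'>",
}

# One boolean column per element (True where the row contains it), summed per repository.
contains = pd.DataFrame({
    label: toy['Values'].str.contains(pattern, regex=False)
    for label, pattern in element_patterns.items()
})
per_project = contains.groupby(toy['Repo']).sum()
per_project['Total'] = per_project.sum(axis=1)

per_project = per_project.sort_values('Total', ascending=False)
print(per_project)
```

Sorting by `Total` puts a potential outlier such as ProjectQ at the top, and reading along a column shows project-specific patterns such as the with keyword in pennylane.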

## Short Conclusion

All in all, although there are fewer quantum bugs than classical bugs, the majority of the buggy code elements are found in quantum bugs. Both types of bugs share the same most buggy code elements; however, classical bugs tend to have more buggy behaviour in for loops than quantum bugs, which leads us to conclude that either developers use for loops correctly when working with quantum phenomena (such as qubits), or for loops are not used at all in quantum-specific code. The most common buggy elements are function calls, if statements, assignments and importFrom statements, which means these should be the targets for future quantum software testing development.
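
Because quantum and classical bugs contribute very different totals, the for-loop comparison is easier to judge on relative shares than on raw counts. A minimal sketch, assuming count lists shaped like `code_element_count_quantum` and `code_element_count_classical`; the numbers below are invented for illustration and are not the study's results.

```python
import pandas as pd

# Hypothetical labels and counts, for illustration only (not the study's data).
labels = ['Function call', 'Assignment', 'For Loop']
quantum_counts = [50, 30, 2]
classical_counts = [20, 10, 6]

comparison = pd.DataFrame(
    {'Quantum': quantum_counts, 'Classical': classical_counts},
    index=labels,
)

# Share of each element within its own bug type, in percent.
shares = comparison / comparison.sum() * 100
print(shares.round(1))
```

If the for-loop share stays higher for classical bugs after this normalisation, the difference is not just an effect of the two groups having different sizes.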

In [None]:
# elements appear on the right and not on the left, and vice versa