## Run GPT-4 as a static analyzer

After careful tuning, the final prompt is:

    You will be given a piece of Python code, many of which use the TensorFlow library. Your job is to find shape-related errors if there are any. Only focus on errors that occur when the used shape does not match the expected shape in an operator or function. Ignore the errors related to redefined or fixed shape definitions. Generate outputs according to the templates: "No shape mismatch found." if no shape errors found, or "[error]: [LINENUMBER: LINEOFCODE] REASON" if any shape errors are found, and make sure to provide reasons.
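Because the prompt fixes the output templates, a response can be parsed mechanically. The following is a minimal sketch, not part of `sa_by_chatGPT`: the names `parse_response`, `ERROR_LINE` and `NO_ERROR` are hypothetical, and the regular expression assumes GPT-4 follows the template exactly.

```python
import re

NO_ERROR = "No shape mismatch found."

# "[error]: [LINENUMBER: LINEOFCODE] REASON" (hypothetical pattern, assumes exact template use)
ERROR_LINE = re.compile(r"^\[error\]:\s*\[(\d+):\s*(.*?)\]\s+(.*)$", re.IGNORECASE)


def parse_response(text):
    """Return a list of (line_number, line_of_code, reason) tuples found in a response."""
    if text.strip() == NO_ERROR:
        return []
    findings = []
    for line in text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            findings.append((int(match.group(1)), match.group(2), match.group(3)))
    return findings


sample = ("[error]: [9: y = tf.reshape(y, [478, 717, 3])] "
          "The reshape operation is trying to reshape the tensor.")
parsed = parse_response(sample)
print(parsed)
```

The code part is matched non-greedily up to the first `]` followed by whitespace, so a line of code such as `a[0] + b` would be split in the wrong place; a stricter delimiter in the prompt would avoid this.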

Note: you need to set up your own API key before you can call GPT-4 from your machine.
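One common way to supply the key is through an environment variable. The sketch below assumes the key is stored in `OPENAI_API_KEY`, the variable that the official `openai` Python client reads by default. How `sa_by_chatGPT` actually loads the key is not shown in this notebook, and `require_api_key` is a hypothetical helper.

```python
import os


def require_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or fail early with a clear message."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before calling GPT-4."
        )
    return key
```

Calling `require_api_key()` once at the top of the notebook makes a missing key fail immediately instead of in the middle of a long run.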

In [None]:
import sa_by_chatGPT
import os

Remove comments systematically

In [None]:
# remove comments from the code (prevents ChatGPT from taking shortcuts)

# py_paths_all = sa_by_chatGPT.get_py_paths('')
# for key, py_paths in py_paths_all.items():
#     tar_folder = "data4chatGPT/tf_fix_py"
#     if 'buggy' in key:
#         tar_folder = "data4chatGPT/tf_bugs_py"
#     for py_path in py_paths:
#         sa_by_chatGPT.remove_comments_and_docstrings_python(py_path, tar_folder)
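For illustration, comments can be removed with the standard `tokenize` module. This is a minimal sketch and not the implementation of `sa_by_chatGPT.remove_comments_and_docstrings_python`: the name `strip_comments` is hypothetical, and the sketch removes only `#` comments, leaving docstrings in place.

```python
import io
import tokenize


def strip_comments(source: str) -> str:
    """Return `source` with all `#` comments removed (docstrings are kept)."""
    readline = io.StringIO(source).readline
    kept = [tok for tok in tokenize.generate_tokens(readline)
            if tok.type != tokenize.COMMENT]
    stripped = tokenize.untokenize(kept)
    # untokenize pads the gap left by a removed comment with spaces,
    # so trailing whitespace is trimmed from every line
    return "\n".join(line.rstrip() for line in stripped.splitlines()) + "\n"


example = 'x = 1  # the answer is obviously 1\n# a full-line hint\ny = "a # b"\n'
cleaned = strip_comments(example)
print(cleaned)
```

Working on tokens rather than on raw text keeps a `#` inside a string literal intact. The trailing-whitespace trim also touches lines inside multi-line strings, which is acceptable for this purpose but not lossless.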

Run on a single file

In [None]:
# os.path.join keeps the path portable across Windows and POSIX systems
sa_by_chatGPT.chatGPT_py_file(os.path.join('data4chatGPT', 'tf_bugs_py', 'ut4_experiment_runinfo.py'))

Run on all the data files

In [None]:
# on the Pythia dataset
# the buggy and fixed versions can be run separately to avoid long waiting times
ress_chatGPT = {}
py_paths_all = sa_by_chatGPT.get_py_paths('data4chatGPT')
for key, py_paths in py_paths_all.items():
    # to run one version at a time, uncomment the filter and use 'fix' or 'buggy'
    #if 'fix' not in key: # 'buggy'  #only the buggy (+runinfo) version
    #    continue
    for py_path in py_paths:
        res = sa_by_chatGPT.chatGPT_py_file(py_path)
        # group the results by category: one {file name: response} dict per file
        ress_chatGPT.setdefault(key, []).append({os.path.basename(py_path): res})
#ress_chatGPT

Save the results from GPT-4

In [None]:
import json

def extract_errors(data):
    """Collect the error lines reported by GPT-4 for every file in every category."""
    result = {}
    for category, entries in data.items():
        result[category] = []
        for entry in entries:
            for file_name, message in entry.items():
                # keep only the lines that contain "[error]" or "error:"
                lines = message.strip('"').split('\n')
                errors = [line.strip().strip('"') for line in lines
                          if '[error]' in line.lower() or 'error:' in line.lower()]

                result[category].append({
                    file_name: {
                        'error_count': len(errors),
                        'error': errors
                    }
                })
    return result

tmp = extract_errors(ress_chatGPT)

# save the output data (create the output folder if it does not exist yet)
os.makedirs('output_chatGPT', exist_ok=True)
with open('output_chatGPT/output_1.json', 'w') as json_file:
    json.dump(tmp, json_file, indent=4)

While processing the GPT-4 outputs, we found some reported errors whose justifications do not hold up. For instance, for the UT-3 fixed version with run-time information, GPT-4 produced the following output:

    [error]: [9: y = tf.reshape(y, [478, 717, 3])] The reshape operation is trying to reshape the tensor of shape (1028178,) into a tensor of shape (478, 717, 3) which is not possible because 478*717*3 = 1028196 != 1028178. The total size of the new shape must be the same as the total size of the original shape.

It claims that 478*717*3 = 1028196, whereas the correct product is 478*717*3 = 1028178, which is exactly the size of the original tensor, so the reported error is spurious. Reports like this were not counted as valid errors. There were two in total, both for the UT-3 fixed version (with and without run-time information).
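Arithmetic claims of this kind can be checked automatically. The sketch below is not part of the original pipeline: `find_wrong_products` is a hypothetical helper that looks for claims of the form `a*b*c = N` in a GPT-4 message and recomputes each product.

```python
import re
from math import prod

# matches e.g. "478*717*3 = 1028196" (two or more integer factors)
CLAIM = re.compile(r"(\d+(?:\s*\*\s*\d+)+)\s*=\s*(\d+)")


def find_wrong_products(message):
    """Return (expression, claimed, actual) for every incorrect product claim."""
    wrong = []
    for match in CLAIM.finditer(message):
        factors = [int(f) for f in match.group(1).split("*")]
        claimed = int(match.group(2))
        actual = prod(factors)
        if actual != claimed:
            wrong.append((match.group(1), claimed, actual))
    return wrong


message = ("which is not possible because 478*717*3 = 1028196 != 1028178. "
           "The total size of the new shape must be the same.")
print(find_wrong_products(message))
# → [('478*717*3', 1028196, 1028178)]
```

A non-empty result marks the report as resting on a miscalculation, so it can be set aside for manual review.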

The saved raw outputs can be found in the folder: ut_dataset_runinfo\output_chatGPT

The processed results for GPT-4 can be found in: ut_dataset_runinfo\output_chatGPT\results_chatGPT.txt

Appendix: test a simple case

In [None]:
# file_path = 'data4chatGPT\\simple_test_for_chatGPT.py'
# res = sa_by_chatGPT.chatGPT_py_file(file_path)
# print(os.path.basename(file_path) + '\n' + res)

In [None]:
# remove comments from the simple case
# sa_by_chatGPT.remove_comments_and_docstrings_python('data4chatGPT\\simple_test_for_chatGPT.py', 'data4chatGPT/tf_bugs_py')

In [None]:
# test the simple case without comments
# print(sa_by_chatGPT.chatGPT_py_file('data4chatGPT\\tf_bugs_py\\simple_test_for_chatGPT.py'))