## MongoDB BigCodeBench Code Generation Database Builder

This notebook initializes a new code generation database in MongoDB for the BigCodeBench benchmark.

Building the database may leave artifacts behind. They appear under the parent directory of this notebook and can be removed once the database has finished generating.
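
The build loop in this notebook removes stray files by comparing directory listings taken before and after each task. The same idea can be sketched as a context manager. This is a minimal illustration and not the notebook's own code: `remove_new_entries` is a hypothetical helper, and the demo runs in a temporary scratch directory so nothing real is deleted.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def remove_new_entries(directory):
    """Delete any file or folder that appears in `directory` while the block runs."""
    before = set(os.listdir(directory))  # snapshot taken before the work starts
    try:
        yield
    finally:
        # Anything present now but absent from the snapshot is treated as an artifact.
        for name in set(os.listdir(directory)) - before:
            path = os.path.join(directory, name)
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)


# Demo in a scratch directory: one pre-existing file, two artifacts.
with tempfile.TemporaryDirectory() as scratch:
    open(os.path.join(scratch, "keep.txt"), "w").close()
    with remove_new_entries(scratch):
        open(os.path.join(scratch, "artifact.png"), "w").close()
        os.mkdir(os.path.join(scratch, "artifact_dir"))
    remaining = sorted(os.listdir(scratch))

print(remaining)
```

Because the cleanup sits in a `finally` clause, it also runs when the wrapped code raises, which is the case most likely to leave files behind.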

In [None]:
import pandas as pd
import sys
import os
import shutil

In [None]:
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend (no GUI)

import matplotlib.pyplot as plt
plt.ioff()  # Turn off interactive mode
plt.show = lambda *args, **kwargs: None 

In [None]:
curr_dir = os.getcwd()
parent_dir = os.path.dirname(curr_dir)
proj_dir = os.path.dirname(parent_dir)
bigcodebench = pd.read_csv(os.path.join(proj_dir, "datasets/open_ended_format/bigcodebench_test.csv"), header = 0, encoding='utf-8')
sys.path.append(proj_dir)

In [None]:
from code_generation.utility.bigcodebench_helper import CodeGenerationBigCodeBenchHelper
from database import MongoDBHelper

# Connecting to MongoDB
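
The cells in this section read the database and collection names from the environment variables `MONGODB_BENCHMARK_DATABASE` and `MONGODB_BIGCODEBENCH_COLLECTION`. `os.getenv` returns `None` when a variable is missing, so a misconfigured environment only surfaces later as a less direct error. A small guard, sketched below, can fail early with a clearer message. `require_env` is a hypothetical helper and not part of `MongoDBHelper`; the variable name in the demo is a placeholder.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo with a placeholder variable name (not one used by this project).
os.environ["DEMO_DATABASE_NAME"] = "benchmarks"
print(require_env("DEMO_DATABASE_NAME"))
```

With such a guard, the lookups could be written as `require_env('MONGODB_BENCHMARK_DATABASE')` in place of `os.getenv(...)`.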

In [None]:
mongodbHelper = MongoDBHelper()
mongodbHelper.check_database_connectivity()

In [None]:
db = mongodbHelper.client[os.getenv('MONGODB_BENCHMARK_DATABASE')]
open_ended_db = db[os.getenv('MONGODB_BIGCODEBENCH_COLLECTION')]

In [None]:
# %%script false --no-raise-error

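to_check_example = []    # reserved for tasks without examples; never populated in this cell
to_check_test = []       # task ids for which obtain_full_sol raised an exception
original_test_id = None  # defined up front so the exception handlers can always print it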

try:
    for idx in range(len(bigcodebench)):

        qn_id = f"BigCodeBencho{idx-len(to_check_example)-len(to_check_test)}"

        qn_details = open_ended_db.find_one({"_id" : qn_id})        # checking if this qn_id already exists in the db

        qn = bigcodebench.iloc[idx]

        original_prompt = qn['complete_prompt'].replace(r"\\x", r"\\\\x")
        canonical_solution = qn['canonical_solution']
        instruct_prompt = qn['instruct_prompt']

        test = qn['test']
        original_test_id = qn['task_id']

        # storing all the file names in the current directory as a snapshot
        # this is a necessary step to remove any new files created from running the tasks
        snapshot_dir = os.listdir(curr_dir)

        prompt, qn_desc = CodeGenerationBigCodeBenchHelper.seperate_original_desciptions(original_prompt)

        task_doc_string, example = CodeGenerationBigCodeBenchHelper.extract_examples(qn_desc)

        natural_language_instruct, code_instruct= CodeGenerationBigCodeBenchHelper.split_code_from_instruct_prompt(instruct_prompt=instruct_prompt)
        try: 
            full_sol = CodeGenerationBigCodeBenchHelper.obtain_full_sol(
                canonical_sol=canonical_solution, 
                code_instruct=code_instruct,
                test = test)
        except Exception as e:
            print(e)
            print(original_test_id)
            to_check_test.append(original_test_id)
            continue
        
        entry_dict = {
            "_id" : qn_id,
            "qn" : code_instruct,
            "qn_desc": task_doc_string,
            "canon_solution" : canonical_solution,
            "original_qn_desc" : qn_desc,
            "examples": example,
            "check" : test,
            "original_id": original_test_id
        }

        curr_dir_snapshot = os.listdir(curr_dir)
        for file_name in curr_dir_snapshot:
            if file_name not in snapshot_dir:
                file_path = os.path.join(curr_dir, file_name)
                if os.path.isdir(file_path):
                    shutil.rmtree(file_path)
                else:
                    os.remove(file_path)


        if qn_details is None:
            open_ended_db.insert_one(entry_dict)
            print('Added entry to database: {id}'.format(id = qn_id))
        else:
            open_ended_db.update_one({"_id" : qn_id}, update = {"$set": entry_dict})
            print('Updated existing entry in database: {id}'.format(id = qn_id))
except KeyboardInterrupt:
    print(original_test_id)
except Exception as e:
    print(original_test_id)
    print(e)
    print(to_check_test)


print(to_check_example if len(to_check_example) > 0 else "All cases contain examples. Nothing to check!")
print(to_check_test if len(to_check_test) > 0 else "All test cases passed. Nothing to check!")

## Notes:

While preparing this notebook for submission, we found that the following BigCodeBench tasks could not be processed successfully:

- `BigCodeBench/39`
- `BigCodeBench/80`
- `BigCodeBench/81`
- `BigCodeBench/82`
- `BigCodeBench/83`
- `BigCodeBench/101`
- `BigCodeBench/115`
- `BigCodeBench/177`
- `BigCodeBench/205`
- `BigCodeBench/245`
- `BigCodeBench/334`
- `BigCodeBench/360`
- `BigCodeBench/361`
- `BigCodeBench/362`
- `BigCodeBench/363`
- `BigCodeBench/372`
- `BigCodeBench/383`
- `BigCodeBench/495`
- `BigCodeBench/501`
- `BigCodeBench/590`
- `BigCodeBench/593`
- `BigCodeBench/596`
- `BigCodeBench/612`
- `BigCodeBench/634`
- `BigCodeBench/686`
- `BigCodeBench/734`
- `BigCodeBench/736`
- `BigCodeBench/779`
- `BigCodeBench/940`
- `BigCodeBench/964`
- `BigCodeBench/1005`
- `BigCodeBench/1028`
- `BigCodeBench/1084`
- `BigCodeBench/1109`

We suspect this is due to package conflicts; further investigation is required.
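
One side effect of skipped tasks follows from the build loop: each `_id` is formed from the row index minus the number of tasks skipped so far, so stored ids stay contiguous and their numeric suffix does not generally match the number in the original `task_id`. The sketch below reproduces that numbering on a toy list. `assign_ids` and its arguments are hypothetical names used only for illustration.

```python
def assign_ids(task_ids, failed, prefix="BigCodeBencho"):
    """Map each successfully processed task id to the contiguous id it would be stored under."""
    mapping = {}
    skipped = 0
    for idx, task_id in enumerate(task_ids):
        if task_id in failed:
            skipped += 1  # a skipped task does not consume an id
            continue
        mapping[task_id] = f"{prefix}{idx - skipped}"
    return mapping


# Toy data: five tasks, two of which fail.
task_ids = [f"BigCodeBench/{n}" for n in range(5)]
failed = {"BigCodeBench/1", "BigCodeBench/3"}
mapping = assign_ids(task_ids, failed)
for task_id, qn_id in mapping.items():
    print(task_id, "->", qn_id)
```

Since the offset depends on which tasks fail, a run in an environment where a different set of tasks fails may assign a different `_id` to the same task. The stored `original_id` field is the more stable key for cross-referencing.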