# Creating test cases
This notebook takes a folder of notebooks and turns them into a jsonl file in the format human_eval expects.

The notebooks have to have the following format:
* Within one cell there must be a function that solves a specific [bio-image analysis] task.
* This function must have a meaningful docstring between """ and """. The docstring must be detailed enough that a language model could write the entire function from it.
* There must be another code cell that starts with `def check(candidate):` and contains test code to test the generated code.
* The test code must use `assert` statements and call the `candidate` function. E.g. if a given function to test is `sum`, then a valid test for `sum` would be:
```
def check(candidate):
    assert candidate(3, 4) == 7
```
* A third python code cell in the notebook must call the `check` function with your custom function, e.g. like this, to prove that the code you provided works with the tests you wrote:
```
check(sum)
```
* Optional: You can add as many markdown cells as you like to explain the test case.
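
As an illustration of the format described above, the three code cells of the `hello_world` example could look as follows when written out as one script. The `# Cell n` comments are only added here for orientation and are an assumption of this sketch; in a real notebook they should be left out, because the cell that calls `check` has to start with `check(`.

```python
# Cell 1: the function to be solved, with a docstring a language model could work from
def return_hello_world():
    """
    Returns the string "hello world".
    """
    return "hello world"

# Cell 2: the test code, which only refers to the function under test as `candidate`
def check(candidate):
    assert candidate() == "hello world"

# Cell 3: shows that the provided solution passes its own tests
check(return_hello_world)
```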

This is how it works:
* From the cell with the function definition, all code up to and including the docstring is stored as the prompt; the code after the docstring is stored as the canonical solution. The prompts from all notebooks are collected in one `jsonl` file. This notebook does that.
* Language models are then asked to complete each prompt by adding Python code below the docstring that does what the docstring describes.
* Afterwards, the generated code is executed and the tests are run to check whether the results are correct.
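
The steps above can be sketched in a few lines, using the `hello_world` example. This is a simplified illustration, not the actual evaluation harness: the helper name `run_case` is hypothetical, and the plain `exec` call stands in for whatever isolated execution the evaluation framework uses.

```python
# Source of the function cell as a single string (the hello_world example)
cell_source = (
    'def return_hello_world():\n'
    '    """\n'
    '    Returns the string "hello world".\n'
    '    """\n'
    '    return "hello world"'
)
test = 'def check(candidate):\n    assert candidate() == "hello world"'
entry_point = 'return_hello_world'

# Everything up to and including the closing """ is the prompt,
# everything after it is the canonical solution.
parts = cell_source.split('"""')
canonical_solution = parts[-1]
prompt = '"""'.join(parts[:-1]) + '"""'

def run_case(prompt, completion, test, entry_point):
    """Hypothetical helper: returns True if prompt + completion passes the test."""
    program = prompt + completion + '\n\n' + test + '\n\n' + f'check({entry_point})\n'
    try:
        # Simplification: no sandboxing or timeout, only use with trusted code
        exec(program, {})
    except Exception:
        return False
    return True

# The canonical solution should pass, a wrong completion should fail
print(run_case(prompt, canonical_solution, test, entry_point))
print(run_case(prompt, '\n    return "bye"', test, entry_point))
```

Executing generated code directly like this is only reasonable for trusted input; code produced by a language model should be run in an isolated environment.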

In [1]:
import json
import os
import warnings

In [2]:
source_notebook_directory = './human-eval-bia/' # must end with /
# Specify the filename to save the .jsonl file
target_jsonl_filename = '../data/human-eval-bia.jsonl'

In [3]:
list_of_cases = []


# List all files in the source notebook directory
files = os.listdir(source_notebook_directory)

# Iterate through the files and process those ending with .ipynb
for file in files:
    if file.endswith('.ipynb'):
        notebook_filename = source_notebook_directory + file
        
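        # Load and parse the notebook (use a separate name to avoid shadowing the loop variable `file`)
        with open(notebook_filename, 'r', encoding='utf-8') as notebook_file:
            notebook = json.load(notebook_file)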
        
        task_id = notebook_filename
        prompt = None
        canonical_solution = None
        entry_point = None
        test = None
        
        # Iterate through the cells and extract prompt, solution, entry point and test from the code cells
        for cell in notebook['cells']:
            if cell['cell_type'] == 'code':
                # The cell source is stored as a list of lines; join them into one string
                code = ''.join(cell['source'])
                # print('\n\nCODE\n\n',code)
        
                if code.startswith('check('):
                    entry_point = code.strip().replace("check(","").replace(")","").strip()
                elif '"""' in code:
                    temp = code.split('"""')
                    canonical_solution = temp[-1]
                    temp[-1] = ""
                    prompt = '"""'.join(temp)
                elif 'def check(' in code:
                    test = code 
                elif len(code.strip()) == 0:
                    pass
                else:
                    sample = code[:20]
                    warnings.warn(f"I had issues reading a cell in {task_id} starting with {sample!r}")
                    
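        # Warn about the first required field that could not be extracted
        if prompt is None:
            warnings.warn(f"Couldn't extract prompt from {task_id}.")
        elif canonical_solution is None:
            warnings.warn(f"Couldn't extract canonical_solution from {task_id}.")
        elif entry_point is None:
            warnings.warn(f"Couldn't extract entry_point from {task_id}.")
        elif test is None:
            warnings.warn(f"Couldn't extract test from {task_id}.")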
        
        test_case = {
            'task_id':task_id,
            'prompt':prompt,
            'canonical_solution':canonical_solution,
            'entry_point':entry_point,
            'test':test
        }

        print(test_case)
        list_of_cases.append(test_case)

{'task_id': './human-eval-bia/hello_world.ipynb', 'prompt': 'def return_hello_world():\n    """\n    Returns the string "hello world".\n    """', 'canonical_solution': '\n    return "hello world"', 'entry_point': 'return_hello_world', 'test': 'def check(candidate):\n    assert candidate() == "hello world"'}
{'task_id': './human-eval-bia/label_processing_0.ipynb', 'prompt': 'def remove_labels_on_edges(label_image):\n    """\n    Takes a label_image and removes all objects which touch the image border.\n    """', 'canonical_solution': '\n    import skimage\n    return skimage.segmentation.clear_border(label_image)', 'entry_point': 'remove_labels_on_edges', 'test': 'def check(candidate):\n    import numpy as np\n    \n    result = candidate(np.asarray([\n        [0,0,0,0,0],\n        [1,2,2,0,0],\n        [1,2,2,0,0],\n        [1,0,0,3,0],\n        [0,0,0,4,0],\n    ]))\n\n    # -1 becaue background counts\n    assert len(np.unique(result)) - 1 == 2\n    assert result.shape[0] == 5\n    a

In [4]:
# Open the file in write mode and save the dictionaries
with open(target_jsonl_filename, 'w') as file:
    for dictionary in list_of_cases:
        # Convert dictionary to a JSON formatted string and write it
        json_str = json.dumps(dictionary)
        file.write(json_str + '\n')

print(f'Data successfully saved to {target_jsonl_filename}.')

Data successfully saved to ../data/human-eval-bia.jsonl.
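
To double-check a written file, it can be read back line by line and searched for cases with missing fields. The following is a minimal sketch under stated assumptions: the helper names `load_cases` and `find_incomplete_cases` are hypothetical, and the demonstration uses made-up cases in a temporary file rather than the real output file.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ('task_id', 'prompt', 'canonical_solution', 'entry_point', 'test')

def load_cases(jsonl_filename):
    """Read a .jsonl file and return one dictionary per non-empty line."""
    with open(jsonl_filename, 'r', encoding='utf-8') as jsonl_file:
        return [json.loads(line) for line in jsonl_file if line.strip()]

def find_incomplete_cases(cases):
    """Return the task_ids of all cases where a required field is missing or None."""
    return [
        case.get('task_id')
        for case in cases
        if any(case.get(key) is None for key in REQUIRED_KEYS)
    ]

# Demonstration with one complete and one incomplete (made-up) case
demo_cases = [
    {'task_id': 'complete.ipynb', 'prompt': 'def f():\n    """Returns 1."""',
     'canonical_solution': '\n    return 1', 'entry_point': 'f',
     'test': 'def check(candidate):\n    assert candidate() == 1'},
    {'task_id': 'incomplete.ipynb', 'prompt': None,
     'canonical_solution': None, 'entry_point': None, 'test': None},
]
with tempfile.TemporaryDirectory() as temp_directory:
    demo_filename = os.path.join(temp_directory, 'demo.jsonl')
    with open(demo_filename, 'w', encoding='utf-8') as jsonl_file:
        for case in demo_cases:
            jsonl_file.write(json.dumps(case) + '\n')
    loaded_cases = load_cases(demo_filename)

print(find_incomplete_cases(loaded_cases))
```

With the real file, `load_cases(target_jsonl_filename)` would take the place of the temporary file.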
