# Step 2: Generate and run per-model analysis notebooks

Depends on: batch results files downloaded from the OpenAI Batch API to `/output_data/umg_{employee|employer}_v2_{model_version}_batch_{hash}.jsonl`

This notebook can be run only after the JSONL prompt files generated in step 1 have been submitted to the OpenAI Batch API and the JSONL results have been downloaded to `/output_data/`, which we did manually through the web interface. Each of the 8 JSONL result files contains all results for either the employee or the employer prompt on one model version.
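Because the downloads are manual, it can help to confirm that all 8 results files are in place before running anything. The following is a minimal sketch, not part of the original pipeline: the helper name `find_missing_results` is hypothetical, and it assumes the files follow the naming pattern given under "Depends on" above, with the batch hash matched by a wildcard.

```python
from pathlib import Path

# Same run types and model versions as the parameters cell of this notebook.
RUN_TYPES = ['employee', 'employer']
GPT_FNS = ['gpt-3.5-turbo-0613', 'gpt-4o-2024-05-13',
           'gpt-4-turbo-2024-04-09', 'gpt-3.5-turbo-0125']


def find_missing_results(output_dir, run_types=RUN_TYPES, gpt_fns=GPT_FNS):
    """Return the (run_type, gpt_fn) pairs that have no results file in output_dir.

    Hypothetical helper: assumes files are named
    umg_{run_type}_v2_{gpt_fn}_batch_{hash}.jsonl.
    """
    missing = []
    for run_type in run_types:
        for gpt_fn in gpt_fns:
            # The batch hash is not known in advance, so match it with a wildcard.
            pattern = f"umg_{run_type}_v2_{gpt_fn}_batch_*.jsonl"
            if not any(Path(output_dir).glob(pattern)):
                missing.append((run_type, gpt_fn))
    return missing
```

Calling `find_missing_results('./output_data')` (the relative path is an assumption about where the notebook is run from) should return an empty list once every download is present.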

This notebook programmatically creates and runs 8 notebooks, one for each result file. The template notebook is `step2_model_x_runtype_notebooks/umg_step2_analysis_template.ipynb`, and this notebook inserts the metadata for each combination of prompt type and model version into a copy of it. Each of the 8 notebooks parses its JSONL file into a long-format CSV with one row per prompt and one column per metadata field (model run, employee vs. employer, major, university, pronoun, university ranking, university region, etc.). Each notebook also writes its summary table and heatmaps to `/results/`.

Outputs, for each of the 8 result files in `/output_data/`:
- 1 IPYNB notebook: `/step2_model_x_runtype_notebooks/umg_{employee|employer}_analysis_{model_version}.ipynb`
- 1 CSV file of median offer by university and major: `/results/umg_{employee|employer}_{model_version}_median_by_uni_major.csv`
- 1 CSV file of all parsed results with metadata: `/parsed_data/umg_parsed_queries_v2_{employee|employer}_{model_version}.csv`
- 1 PDF and 1 PNG heatmap of median response by university and major: `/results/university_major_{model_version}_median_response_uni_major_table.{pdf|png}`
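The JSONL-to-rows step that each generated notebook performs can be sketched as follows. This is an illustration only, not the template's actual code, which is not shown here. It assumes the usual Batch API output layout for chat completions, where each line carries a `custom_id` and a `response` whose `body` holds the `choices`. The function names are hypothetical, and how the template recovers the prompt metadata (major, university, pronoun, and so on) is not covered.

```python
import json


def parse_batch_line(line):
    """Turn one line of a Batch API results file into a flat dict (sketch)."""
    record = json.loads(line)
    response = record.get("response") or {}
    body = response.get("body") or {}
    choices = body.get("choices") or []
    # Failed requests have no choices; keep the row and leave the text empty.
    content = choices[0]["message"]["content"] if choices else None
    return {
        "custom_id": record.get("custom_id"),
        "status_code": response.get("status_code"),
        "model": body.get("model"),
        "content": content,
    }


def parse_batch_file(path):
    """Parse every non-empty line of a results file into a list of row dicts."""
    with open(path, "r", encoding="utf-8") as f:
        return [parse_batch_line(line) for line in f if line.strip()]
```

The resulting list of dicts can be passed to `csv.DictWriter` or `pandas.DataFrame` to produce the one-row-per-prompt table.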

In [1]:
import nbformat
from nbformat import NotebookNode
import datetime
from tqdm import tqdm

In [2]:
start = datetime.datetime.now()

In [3]:
def update_notebook(template_path, new_path, run_type, gpt_fn, header):
    # Load the Jupyter notebook
    with open(template_path, 'r', encoding='utf-8') as f:
        nb = nbformat.read(f, as_version=4)

    # Prepare the parameter assignments that go into the second cell (index 1)
    new_content = f"""run_type='{run_type}'
gpt_name = '{header}'
gpt_fn = '{gpt_fn}'"""

    # The template must have a title cell (index 0) followed by a code cell (index 1).
    # Fail loudly so the caller never goes on to execute a notebook that was not written.
    if len(nb.cells) < 2 or nb.cells[1].cell_type != 'code':
        raise ValueError("The template does not have a code cell as its second cell.")
    nb.cells[1].source = new_content
    
    nb.cells[0].source = f"# {header}"

    # Save the modified notebook
    with open(new_path, 'w', encoding='utf-8') as f:
        nbformat.write(nb, f)

In [4]:
# Display label for each supported model version
GPT_LABELS = {
    'gpt-4o-2024-05-13': "4o (trained May 2024)",
    'gpt-4-turbo-2024-04-09': "4 Turbo (trained April 2024)",
    'gpt-3.5-turbo-0125': "3.5 Turbo (trained Jan 2024)",
    'gpt-3.5-turbo-0613': "3.5 Turbo (trained Jun 2023)",
}

def generate_header(run_type, gpt_fn):
    # Build the notebook title, e.g. "Employee Salary Advice from GPT 4o (trained May 2024)"
    if gpt_fn not in GPT_LABELS:
        raise Exception(f"Bad gpt_fn: {gpt_fn}")
    return f"{run_type.capitalize()} Salary Advice from GPT {GPT_LABELS[gpt_fn]}"


In [5]:
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor
from nbconvert.preprocessors import CellExecutionError

def execute_notebook(path):
    # Load the notebook
    with open(path, 'r', encoding='utf-8') as f:
        nb = nbformat.read(f, as_version=4)

    # Configure the notebook execution preprocessor
    ep = ExecutePreprocessor(timeout=600, kernel_name='python3')

    # Execute the notebook
    try:
        ep.preprocess(nb, {'metadata': {'path': './step2_model_x_runtype_notebooks'}})
    except CellExecutionError:
        print(f"Error executing the notebook '{path}'.\nSee notebook for the error.")
        raise
    except TimeoutError:
        print(f"Execution of the notebook '{path}' timed out.")
        raise
    finally:
        # Save the notebook even on failure, so the error output can be inspected
        with open(path, 'w', encoding='utf-8') as f:
            nbformat.write(nb, f)


In [6]:
# Parameters
template_path = './step2_model_x_runtype_notebooks/umg_step2_analysis_template.ipynb'
output_folder = './step2_model_x_runtype_notebooks'
run_types = ['employee', 'employer']
gpt_fns = ['gpt-3.5-turbo-0613', 'gpt-4o-2024-05-13', 'gpt-4-turbo-2024-04-09', 'gpt-3.5-turbo-0125']

In [7]:
# Loop over each combination of run_type and gpt_fn
with tqdm(total = len(run_types) * len(gpt_fns)) as pbar:
    for run_type in run_types:
        for gpt_fn in gpt_fns:
            header = generate_header(run_type, gpt_fn)
            new_fn = f"{output_folder}/umg_{run_type}_analysis_{gpt_fn}.ipynb"
            print(header, new_fn)
            update_notebook(template_path, new_fn, run_type, gpt_fn, header)
            execute_notebook(new_fn)
            pbar.update(1)

  0%|          | 0/8 [00:00<?, ?it/s]

Employee Salary Advice from GPT 3.5 Turbo (trained Jun 2023) ./step2_model_x_runtype_notebooks/umg_employee_analysis_gpt-3.5-turbo-0613.ipynb


 12%|█▎        | 1/8 [00:20<02:21, 20.27s/it]

Employee Salary Advice from GPT 4o (trained May 2024) ./step2_model_x_runtype_notebooks/umg_employee_analysis_gpt-4o-2024-05-13.ipynb


 25%|██▌       | 2/8 [00:40<02:00, 20.13s/it]

Employee Salary Advice from GPT 4 Turbo (trained April 2024) ./step2_model_x_runtype_notebooks/umg_employee_analysis_gpt-4-turbo-2024-04-09.ipynb


 38%|███▊      | 3/8 [00:59<01:39, 19.83s/it]

Employee Salary Advice from GPT 3.5 Turbo (trained Jan 2024) ./step2_model_x_runtype_notebooks/umg_employee_analysis_gpt-3.5-turbo-0125.ipynb


 50%|█████     | 4/8 [01:19<01:19, 19.76s/it]

Employer Salary Advice from GPT 3.5 Turbo (trained Jun 2023) ./step2_model_x_runtype_notebooks/umg_employer_analysis_gpt-3.5-turbo-0613.ipynb


 62%|██████▎   | 5/8 [01:38<00:59, 19.68s/it]

Employer Salary Advice from GPT 4o (trained May 2024) ./step2_model_x_runtype_notebooks/umg_employer_analysis_gpt-4o-2024-05-13.ipynb


 75%|███████▌  | 6/8 [01:58<00:39, 19.70s/it]

Employer Salary Advice from GPT 4 Turbo (trained April 2024) ./step2_model_x_runtype_notebooks/umg_employer_analysis_gpt-4-turbo-2024-04-09.ipynb


 88%|████████▊ | 7/8 [02:18<00:19, 19.70s/it]

Employer Salary Advice from GPT 3.5 Turbo (trained Jan 2024) ./step2_model_x_runtype_notebooks/umg_employer_analysis_gpt-3.5-turbo-0125.ipynb


100%|██████████| 8/8 [02:38<00:00, 19.77s/it]


In [8]:
end = datetime.datetime.now()
print("Elapsed time:", end-start)

Elapsed time: 0:02:38.372026
