# 🎓 FrugalGPT: Performance and Cost Tradeoffs

This notebook illustrates the FrugalGPT framework for _building LLM Applications with budget constraints._

In particular, we will focus on evaluating the performance and cost tradeoffs enabled by FrugalGPT.

NB: We strongly recommend running this notebook on accelerated hardware (GPU/TPU).

## Installation
Let us start by installing FrugalGPT (if you haven't yet!).

In [1]:
%%capture
# set up the environment (the %%capture cell magic must be the first line of the cell)
! git clone https://github.com/stanford-futuredata/FrugalGPT
%cd FrugalGPT
! pip install git+https://github.com/stanford-futuredata/FrugalGPT
!mkdir -p strategy
! wget  https://github.com/lchen001/DataHolder/releases/download/v0.0.2/HEADLINES_Model2024.zip
! unzip HEADLINES_Model2024.zip -d strategy/
! rm HEADLINES_Model2024.zip
! wget -P db/ https://github.com/lchen001/DataHolder/releases/download/v0.0.2/HEADLINES.sqlite

In [2]:
%load_ext autoreload
%autoreload 2
import sys, json, copy
import logging
logging.disable(logging.CRITICAL)
sys.path.append("src/")

## Setup
Next, let us set up the environment and API keys. You do _not_ need API keys to run the notebook! They are only needed if you want to use FrugalGPT for your own queries.
#### NB: _For your own queries, you do not need every API key either. If you only want to use LLMs from, e.g., OpenAI and AI21, setting the API keys for those two providers is sufficient._
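
As a minimal sketch of the point above, the snippet below exports only the keys for the providers you plan to query. The helper `set_api_keys` is hypothetical (it is not part of FrugalGPT), and the key values are dummy strings used purely for illustration.

```python
import os

def set_api_keys(keys):
    """Export only the non-empty keys and return the names that were set.

    Hypothetical helper for illustration; not a FrugalGPT API.
    """
    applied = []
    for name, value in keys.items():
        if value:  # skip providers you do not plan to use
            os.environ[name] = value
            applied.append(name)
    return applied

# Example: only OpenAI and AI21 are configured; the values are dummy strings.
applied = set_api_keys({
    "OPENAI_API_KEY": "sk-dummy",
    "AI21_STUDIO_API_KEY": "ai21-dummy",
    "COHERE_STUDIO_API_KEY": "",
})
print(applied)
```

Providers whose key is left empty are simply not exported, so their environment variables stay untouched.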

In [3]:
import os
os.environ['OPENAI_API_KEY'] = 'OPENAI_API_KEY'
os.environ['AI21_STUDIO_API_KEY'] = 'AI21_STUDIO_API_KEY'
os.environ['COHERE_STUDIO_API_KEY'] = 'COHERE_STUDIO_API_KEY'
os.environ['TEXTSYNTH_API_SECRET_KEY'] = 'TEXTSYNTH_API_SECRET_KEY'
os.environ['ANTHROPIC_API_KEY'] = 'ANTHROPIC_API_KEY'
os.environ['TOGETHER_API_KEY'] = 'TOGETHER_API_KEY'

from IPython.display import display
import FrugalGPT
supported_LLM = FrugalGPT.getservicename()
print("supported LLMs:",supported_LLM)

supported LLMs: ['textsynth/gptneox_20B', 'textsynth/fairseq_gpt_13B', 'textsynth/gptj_6B', 'openai/text-davinci-002', 'openai/text-davinci-003', 'openai/text-curie-001', 'openai/text-babbage-001', 'openai/text-ada-001', 'openaichat/gpt-4o-mini', 'openaichat/gpt-4-turbo', 'openaichat/gpt-4o', 'openaichat/gpt-3.5-turbo', 'openaichat/gpt-4', 'ai21/j1-jumbo', 'ai21/j1-grande', 'ai21/j1-large', 'ai21/j2-ultra', 'ai21/j2-mid', 'ai21/j2-light', 'cohere/command', 'cohere/base', 'cohere/xlarge', 'cohere/medium', 'togetherai/google/gemma-2b-it', 'togetherai/google/gemma-2-9b-it', 'togetherai/google/gemma-2-27b-it', 'togetherai/meta-llama/Meta-Llama-3-8B-Instruct-Lite', 'togetherai/Qwen/Qwen1.5-110B-Chat', 'togetherai/mistralai/Mistral-7B-Instruct-v0.3', 'togetherai/meta-llama/Meta-Llama-3-70B-Instruct-Turbo', 'togetherai/meta-llama/Meta-Llama-3-70B-Instruct-Lite', 'togetherai/meta-llama/Meta-Llama-3-8B-Instruct-Turbo', 'togetherai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo', 'togetherai/meta-l

Generating the tradeoffs involves three major steps: (i) prepare the dataset, (ii) train the FrugalGPT strategy, and (iii) evaluate and save the performance.

## Step 1: Prepare the dataset

In [4]:
dataname = "HEADLINES"

test_data = FrugalGPT.loadcsvdata(f"data/{dataname}/test.csv")
prefix = open(f'config/prompt/{dataname}/prefix_e8.txt').read()
test_data = FrugalGPT.formatdata(test_data,prefix)

train_data = FrugalGPT.loadcsvdata(f"data/{dataname}/train.csv")
prefix = open(f'config/prompt/{dataname}/prefix_e8.txt').read()
train_data = FrugalGPT.formatdata(train_data,prefix)
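
To make the role of the prompt prefix concrete, here is a toy sketch of prepending a few-shot prefix to every query. It is an assumption about the general idea, not the implementation of `FrugalGPT.formatdata`: the function `format_queries`, the `[query, reference_answer]` row layout, and the example text are all made up for illustration.

```python
def format_queries(rows, prefix):
    """Prepend a few-shot prompt prefix to the query text of each row.

    Hypothetical stand-in for illustration; each row is assumed to be
    [query, reference_answer].
    """
    return [[prefix + query, answer] for query, answer in rows]

# Toy data, invented for this example.
toy_prefix = "Answer each question with yes or no.\n\n"
toy_rows = [
    ["Q: Does the headline mention a price increase?\nA:", "yes"],
    ["Q: Does the headline mention a price decrease?\nA:", "no"],
]

formatted = format_queries(toy_rows, toy_prefix)
print(formatted[0][0])
```

The reference answers are left unchanged; only the query text sent to the model grows by the length of the prefix, which is also why longer prefixes raise the per-query cost.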

## Step 2: Train the FrugalGPT strategy for different budgets

Let us first evaluate individual models.

In [5]:
import pandas as pd

def generate_dataframe(service_names, train_data, test_data, genparams,db_path="db/SCIQ.sqlite",
                       max_workers=2):
    # Initialize an empty list to store the rows for the DataFrame
    data = []
    MyLLMforAll = FrugalGPT.LLMforAll(
        db_path=db_path,
        max_workers=max_workers,
    )
    # Dictionary to keep track of markers for each provider
    provider_marker = {}

    # Iterate through the service names
    for name in service_names:
        # Extract provider and method
        provider = name.split('/')[0]
        method = name.split('/')[-1]

        # If the provider is seen for the first time, initialize its marker
        if provider not in provider_marker:
            provider_marker[provider] = 1
        else:
            provider_marker[provider] += 1
        # Get the completion batch for train and test data
        r1_train = MyLLMforAll.get_completion_batch(queries=train_data, genparams=genparams, service_name=name)
        r2_train = FrugalGPT.compute_score(r1_train)
        r1_test = MyLLMforAll.get_completion_batch(queries=test_data, genparams=genparams, service_name=name)
        r2_test = FrugalGPT.compute_score(r1_test)

        # Extract accuracy and cost
        train_acc = r2_train['em']
        train_cost = r2_train['cost']
        test_acc = r2_test['em']
        test_cost = r2_test['cost']

        # Create a row with the schema
        row = {
            "Test_acc": test_acc,
            "Test_cost": test_cost,
            "Test_size": len(test_data),
            "Train_acc": train_acc,
            "Train_cost": train_cost,
            "Train_size": len(train_data),
            "Budget": 10,
            "Method": method,
            "Provider": provider,
            "Marker": provider_marker[provider],
        }

        # Append the row to the data list
        data.append(row)

    # Create the DataFrame from the data list
    df = pd.DataFrame(data)

    return df
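
The function above reads two fields from the score, `em` and `cost`. The sketch below shows one way such a summary could be computed, assuming `em` is the fraction of exact matches and `cost` is the mean cost per query. This is an illustrative assumption, not the implementation of `FrugalGPT.compute_score`; the function name `score_completions` and the record layout are hypothetical.

```python
def score_completions(records):
    """Summarize completions as exact-match accuracy and mean cost per query.

    Hypothetical sketch; `records` is assumed to be a list of dicts with
    'answer', 'reference', and 'cost' keys.
    """
    n = len(records)
    matches = sum(r["answer"].strip() == r["reference"].strip() for r in records)
    total_cost = sum(r["cost"] for r in records)
    return {"em": matches / n, "cost": total_cost / n}

# Toy records, invented for this example.
toy_records = [
    {"answer": "yes", "reference": "yes", "cost": 1e-05},
    {"answer": "no ", "reference": "no", "cost": 2e-05},
    {"answer": "yes", "reference": "no", "cost": 3e-05},
    {"answer": "no", "reference": "no", "cost": 2e-05},
]
toy_score = score_completions(toy_records)
print(toy_score)
```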

In [6]:
service_names = [
    'openaichat/gpt-4o-mini',
    'openaichat/gpt-4o',
    'openaichat/gpt-4-turbo',
    'togetherai/meta-llama/Meta-Llama-3-70B-Instruct-Turbo',
    # "togetherai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo",
    'togetherai/google/gemma-2-9b-it',
]
genparams = FrugalGPT.GenerationParameter(max_tokens=50, temperature=0.1, stop=['\n'])

sample_size = 10000
individualmodel_df = generate_dataframe(service_names,
                                        train_data[0:sample_size], test_data[0:sample_size],
                                        genparams,
                                        db_path=f"db/{dataname}.sqlite",
                                        max_workers=4)
display(individualmodel_df)
individualmodel_df.to_csv(f"summary_{dataname}_e8_2024.csv")


| | Test_acc | Test_cost | Test_size | Train_acc | Train_cost | Train_size | Budget | Method | Provider | Marker |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.8478 | 3.3e-05 | 5000 | 0.8506 | 3.3e-05 | 5000 | 10 | gpt-4o-mini | openaichat | 1 |
| 1 | 0.8324 | 0.0011 | 5000 | 0.8328 | 0.0011 | 5000 | 10 | gpt-4o | openaichat | 2 |
| 2 | 0.8558 | 0.003331 | 5000 | 0.8616 | 0.003331 | 5000 | 10 | gpt-4-turbo | openaichat | 3 |
| 3 | 0.8264 | 0.000201 | 5000 | 0.829 | 0.000201 | 5000 | 10 | Meta-Llama-3-70B-Instruct-Turbo | togetherai | 1 |
| 4 | 0.832 | 7e-05 | 5000 | 0.8336 | 7e-05 | 5000 | 10 | gemma-2-9b-it | togetherai | 2 |
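
To read the individual-model results as a tradeoff, the short sketch below compares the most accurate model with the cheapest one. The numbers are copied from the `Test_acc` and `Test_cost` columns above; the variable names are only for illustration.

```python
# (method, test accuracy, test cost) copied from the individual-model results
models = [
    ("gpt-4o-mini", 0.8478, 3.3e-05),
    ("gpt-4o", 0.8324, 0.0011),
    ("gpt-4-turbo", 0.8558, 0.003331),
    ("Meta-Llama-3-70B-Instruct-Turbo", 0.8264, 0.000201),
    ("gemma-2-9b-it", 0.832, 7e-05),
]

best = max(models, key=lambda m: m[1])      # highest test accuracy
cheapest = min(models, key=lambda m: m[2])  # lowest test cost

cost_ratio = best[2] / cheapest[2]
acc_gap_points = (best[1] - cheapest[1]) * 100

print(f"most accurate: {best[0]}, cheapest: {cheapest[0]}")
print(f"cost ratio: {cost_ratio:.0f}x, accuracy gap: {acc_gap_points:.1f} points")
```

On these numbers the most accurate single model costs roughly two orders of magnitude more than the cheapest one for less than one accuracy point, which is the gap a cascade can try to exploit.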


Now let us train FrugalGPT on this dataset.

In [7]:
import numpy
from tqdm import tqdm

def compute_tradeoffs(
    train_data,
    budget_list,
    name="test",
    service_names=[
        'openaichat/gpt-4o-mini',
        'openaichat/gpt-4o',
        'openaichat/gpt-4-turbo',
        'togetherai/meta-llama/Meta-Llama-3-70B-Instruct-Turbo',
        'togetherai/google/gemma-2-9b-it',
    ],
    prefix="",
    skip=0,
    MyCascade=None,
    cascade_depth=3,
):
    # Build the default cascade lazily instead of at function-definition time
    if MyCascade is None:
        MyCascade = FrugalGPT.LLMCascade(
            score_noise_injection=False,
            db_path="db/SCIQ.sqlite",
        )

    for idx, budget in tqdm(enumerate(budget_list)):
        user_budget = budget

        # Reuse a saved strategy for this budget if one can be loaded
        try:
            MyCascade.load(loadpath=f"strategy/{name}/", budget=user_budget)
            print("Already trained. Skipped.")
            continue
        except Exception:
            print("cannot find, start new training")

        # Optionally skip the first `skip` budgets
        if idx < skip:
            continue

        # The scorer is trained only for the first budget (idx == 0);
        # later budgets pass no_scorer_train=True
        MyCascade.train(
            train_data,
            budget=user_budget,
            service_names=service_names,
            no_scorer_train=(idx != 0),
            prefix=prefix,
            cascade_depth=cascade_depth,
        )
        MyCascade.save(savepath=f"strategy/{name}/")
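
For intuition about what is being trained here: an LLM cascade, as described in the FrugalGPT paper, sends a query to a sequence of models and stops as soon as a scorer judges the current answer reliable enough. The toy sketch below illustrates only that control flow. The stub models, the scorer, the costs, and the threshold are all made up, and this is not how `FrugalGPT.LLMCascade` is implemented.

```python
def cascade_answer(query, models, scorer, thresholds):
    """Query models in order and stop once the scorer accepts an answer.

    models: list of (name, cost_per_query, generate_fn), cheapest first.
    thresholds: one acceptance threshold per model except the last,
    whose answer is always accepted.
    """
    total_cost = 0.0
    for i, (name, cost, generate) in enumerate(models):
        answer = generate(query)
        total_cost += cost
        is_last = i == len(models) - 1
        if is_last or scorer(query, answer) >= thresholds[i]:
            return answer, name, total_cost

# Toy stand-ins: a cheap model, an expensive model, and a scorer that is
# confident only on queries marked "easy".
models = [
    ("small-model", 1.0, lambda q: "up"),
    ("large-model", 10.0, lambda q: "down"),
]

def scorer(query, answer):
    return 0.9 if query.startswith("easy") else 0.2

thresholds = [0.5]

print(cascade_answer("easy: toy query", models, scorer, thresholds))
print(cascade_answer("hard: toy query", models, scorer, thresholds))
```

In this picture, a tighter budget corresponds to accepting the cheap model's answer more often, and a looser budget to escalating more queries to the expensive model.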

In [8]:
start_budget = 3.5e-05
end_budget = 0.0036
num_eval = 20

name = f'{dataname}_Model2024'
budget_list = numpy.linspace(start_budget, end_budget, num_eval)

# load data
dev = FrugalGPT.loadcsvdata(f"data/{dataname}/train.csv")
train_data = FrugalGPT.formatdata(dev,prefix)
MyCascade= FrugalGPT.LLMCascade(
          score_noise_injection=False,
  db_path=f"db/{dataname}.sqlite",
  batch_build=True,
  )
#MyCascade.load(loadpath=f"strategy/{name}/",budget=0.00017)
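
The budget grid built by `numpy.linspace` above consists of `num_eval` evenly spaced values that include both endpoints. A dependency-free sketch of the same arithmetic, which can help when checking which budgets the saved strategies correspond to:

```python
start_budget = 3.5e-05
end_budget = 0.0036
num_eval = 20

# Evenly spaced grid with both endpoints included
step = (end_budget - start_budget) / (num_eval - 1)
budgets = [start_budget + i * step for i in range(num_eval)]

print(f"step          = {step:.6e}")
print(f"first budget  = {budgets[0]:.6e}")
print(f"second budget = {budgets[1]:.6e}")
print(f"last budget   = {budgets[-1]:.6e}")
```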

In [9]:
compute_tradeoffs(train_data=train_data,
                  budget_list=budget_list,
                  name=name,
                  service_names=service_names,
                  prefix=prefix,
                  skip=0, # you can manually skip the first few budgets if they have already been trained.
                  MyCascade=MyCascade,
                  cascade_depth=3,
                  )

Already trained. Skipped.
(the line above is printed once for each of the 20 budgets)

20it [01:29,  4.45s/it]

## Step 3: Evaluate and save the performance

In [10]:
def generate_dataframe_from_cascade(MyCascade,budget_list, train_data, test_data, genparams,name):
    # Initialize an empty list to store the rows for the DataFrame
    data = []

    # Iterate through the budget list
    for budget in tqdm(budget_list):
        # Load the strategy for the given budget
        MyCascade.load(loadpath=f"strategy/{name}/", budget=budget)

        # Get the completion batch for train data
        train_result = MyCascade.get_completion_batch(queries=train_data, genparams=genparams)

        # Compute the ACC and cost for train data
        train_acc_cost = FrugalGPT.compute_score(train_result)


        # Get the completion batch for test data
        test_result = MyCascade.get_completion_batch(queries=test_data, genparams=genparams)

        # Compute the ACC and cost for test data
        test_acc_cost = FrugalGPT.compute_score(test_result)

        # Create a row with the schema
        row = {
            "Test_acc": test_acc_cost['em'],
            "Test_cost": test_acc_cost['cost'],
            "Test_size": len(test_data),
            "Train_acc": train_acc_cost['em'],
            "Train_cost": train_acc_cost['cost'],
            "Train_size": len(train_data),
            "Budget": budget,
            "Method": "FrugalGPT",
            "Provider": "FrugalGPT",
            "Marker": 1,  # Marker is always 1 for this function
        }

        # Append the row to the data list
        data.append(row)
        display(row)

    # Create the DataFrame from the data list
    df = pd.DataFrame(data)

    return df

In [None]:
MyCascade_eval = FrugalGPT.LLMCascade()
MyCascade_eval.prefix = prefix
frugalgpt_df = generate_dataframe_from_cascade(MyCascade_eval,
                                               budget_list, train_data, test_data, genparams,
                                               name)
display(frugalgpt_df)
frugalgpt_df.to_csv(f"summary_{dataname}_e8_frugalgpt_2024.csv")

  0%|          | 0/20 [00:00<?, ?it/s]

{'Test_acc': 0.8478,
 'Test_cost': 3.313467e-05,
 'Test_size': 5000,
 'Train_acc': 0.8506,
 'Train_cost': 3.33608e-05,
 'Train_size': 5000,
 'Budget': 3.5e-05,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

  5%|▌         | 1/20 [01:31<29:07, 91.99s/it]

{'Test_acc': 0.8726,
 'Test_cost': 0.000182961414,
 'Test_size': 5000,
 'Train_acc': 0.888,
 'Train_cost': 0.00022101048800000003,
 'Train_size': 5000,
 'Budget': 0.00022263157894736844,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 10%|█         | 2/20 [03:19<30:22, 101.22s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.00041026315789473685,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 15%|█▌        | 3/20 [05:18<30:55, 109.17s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0005978947368421053,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 20%|██        | 4/20 [07:17<30:08, 113.02s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0007855263157894737,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 25%|██▌       | 5/20 [09:16<28:51, 115.44s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0009731578947368421,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 30%|███       | 6/20 [11:16<27:15, 116.81s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0011607894736842107,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 35%|███▌      | 7/20 [13:15<25:28, 117.59s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.001348421052631579,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 40%|████      | 8/20 [15:14<23:35, 117.98s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0015360526315789476,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 45%|████▌     | 9/20 [17:15<21:47, 118.85s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.001723684210526316,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 50%|█████     | 10/20 [19:14<19:50, 119.06s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0019113157894736844,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 55%|█████▌    | 11/20 [21:14<17:53, 119.25s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0020989473684210527,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 60%|██████    | 12/20 [23:14<15:56, 119.61s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.002286578947368421,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 65%|██████▌   | 13/20 [25:13<13:55, 119.33s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0024742105263157897,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 70%|███████   | 14/20 [27:12<11:55, 119.20s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0026618421052631578,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 75%|███████▌  | 15/20 [29:10<09:54, 118.99s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0028494736842105263,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 80%|████████  | 16/20 [31:09<07:55, 118.99s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.003037105263157895,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 85%|████████▌ | 17/20 [33:09<05:57, 119.31s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0032247368421052633,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 90%|█████████ | 18/20 [35:09<03:58, 119.42s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.003412368421052632,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

 95%|█████████▌| 19/20 [37:09<01:59, 119.54s/it]

{'Test_acc': 0.878,
 'Test_cost': 0.00036477855000000003,
 'Test_size': 5000,
 'Train_acc': 0.8908,
 'Train_cost': 0.000372210696,
 'Train_size': 5000,
 'Budget': 0.0036,
 'Method': 'FrugalGPT',
 'Provider': 'FrugalGPT',
 'Marker': 1}

100%|██████████| 20/20 [39:08<00:00, 117.43s/it]


| | Test_acc | Test_cost | Test_size | Train_acc | Train_cost | Train_size | Budget | Method | Provider | Marker |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.8478 | 3.3e-05 | 5000 | 0.8506 | 3.3e-05 | 5000 | 3.5e-05 | FrugalGPT | FrugalGPT | 1 |
| 1 | 0.8726 | 0.000183 | 5000 | 0.888 | 0.000221 | 5000 | 0.000223 | FrugalGPT | FrugalGPT | 1 |
| 2 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.00041 | FrugalGPT | FrugalGPT | 1 |
| 3 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.000598 | FrugalGPT | FrugalGPT | 1 |
| 4 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.000786 | FrugalGPT | FrugalGPT | 1 |
| 5 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.000973 | FrugalGPT | FrugalGPT | 1 |
| 6 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.001161 | FrugalGPT | FrugalGPT | 1 |
| 7 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.001348 | FrugalGPT | FrugalGPT | 1 |
| 8 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.001536 | FrugalGPT | FrugalGPT | 1 |
| 9 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.001724 | FrugalGPT | FrugalGPT | 1 |
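
In the rows shown, the test accuracy stays at 0.878 and the test cost at 0.000365 for every budget from 0.00041 upward. The sketch below compares that operating point with gpt-4-turbo, the most accurate individual model evaluated earlier. The values are copied from the results in this notebook; the variable names are only for illustration.

```python
# FrugalGPT at budget 0.00041 (test accuracy, test cost)
frugal_acc, frugal_cost = 0.878, 0.000365
# gpt-4-turbo from the individual-model evaluation (test accuracy, test cost)
turbo_acc, turbo_cost = 0.8558, 0.003331

cost_reduction = 1 - frugal_cost / turbo_cost
acc_gain_points = (frugal_acc - turbo_acc) * 100

print(f"cost reduction vs gpt-4-turbo: {cost_reduction:.1%}")
print(f"accuracy gain vs gpt-4-turbo: {acc_gain_points:.2f} points")
```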


Now let us save the results to local disk.

In [None]:
from google.colab import files
import copy
individualmodel_df2 = copy.copy(individualmodel_df)
#individualmodel_df2['Test_cost'] = individualmodel_df2['Test_cost'] * individualmodel_df2['Test_size']
full_pd = pd.concat([frugalgpt_df,individualmodel_df2,])
full_pd.to_csv(f"summary_{dataname}_e8_full_2024.csv")
files.download(f'summary_{dataname}_e8_full_2024.csv')
display(full_pd)

| | Test_acc | Test_cost | Test_size | Train_acc | Train_cost | Train_size | Budget | Method | Provider | Marker |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.8478 | 3.3e-05 | 5000 | 0.8506 | 3.3e-05 | 5000 | 3.5e-05 | FrugalGPT | FrugalGPT | 1 |
| 1 | 0.8726 | 0.000183 | 5000 | 0.888 | 0.000221 | 5000 | 0.000223 | FrugalGPT | FrugalGPT | 1 |
| 2 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.00041 | FrugalGPT | FrugalGPT | 1 |
| 3 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.000598 | FrugalGPT | FrugalGPT | 1 |
| 4 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.000786 | FrugalGPT | FrugalGPT | 1 |
| 5 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.000973 | FrugalGPT | FrugalGPT | 1 |
| 6 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.001161 | FrugalGPT | FrugalGPT | 1 |
| 7 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.001348 | FrugalGPT | FrugalGPT | 1 |
| 8 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.001536 | FrugalGPT | FrugalGPT | 1 |
| 9 | 0.878 | 0.000365 | 5000 | 0.8908 | 0.000372 | 5000 | 0.001724 | FrugalGPT | FrugalGPT | 1 |
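
One way to summarize the combined results is to keep only the points that are not dominated, that is, points for which no other point is at least as cheap and at least as accurate while being strictly better on one of the two. The sketch below does this for the distinct FrugalGPT operating points and the five individual models, using test accuracy and test cost copied from the results in this notebook. The helper `dominates` and the labels are only for illustration.

```python
# (label, test accuracy, test cost)
points = [
    ("FrugalGPT (budget 3.5e-05)", 0.8478, 3.3e-05),
    ("FrugalGPT (budget 0.000223)", 0.8726, 0.000183),
    ("FrugalGPT (budget 0.00041)", 0.878, 0.000365),
    ("gpt-4o-mini", 0.8478, 3.3e-05),
    ("gpt-4o", 0.8324, 0.0011),
    ("gpt-4-turbo", 0.8558, 0.003331),
    ("Meta-Llama-3-70B-Instruct-Turbo", 0.8264, 0.000201),
    ("gemma-2-9b-it", 0.832, 7e-05),
]

def dominates(a, b):
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a[2] <= b[2] and a[1] >= b[1]
    strictly_better = a[2] < b[2] or a[1] > b[1]
    return no_worse and strictly_better

# Points with identical accuracy and cost do not dominate each other,
# so ties are all kept on the frontier.
frontier = [p for p in points if not any(dominates(q, p) for q in points)]

for label, acc, cost in sorted(frontier, key=lambda p: p[2]):
    print(f"{label}: accuracy={acc}, cost={cost}")
```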
