# TACO Experiment: Cherry-picked context for one specific task

The goal here is to evaluate how the model behaves when its prompt includes solutions to analogous problems that were manually analyzed and selected.  
In this scenario we test it on a very specific group of tasks from the TACO benchmark.

In [None]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "3"

import polars as pl
import torch
import numpy as np
import json
import re
## LLM
from src.llms import Llama3_1_Instruct
from src.taco_evaluator import compute
from datasets import load_from_disk
import datetime

seed = 42
# NumPy
np.random.seed(seed)

# PyTorch
torch.manual_seed(seed)
if torch.cuda.is_available():
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


if torch.cuda.is_available():
    print(f"Number of GPUs available: {torch.cuda.device_count()}")
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")



Number of GPUs available: 1
GPU 0: NVIDIA RTX A5000


## Load Datasets

In [2]:
PATH  = "../../../data/TACO/processed"
train = pl.read_ipc(f"{PATH}/train.feather")
train_solutions = pl.read_ipc(f"{PATH}/train_solutions.feather")
train_dict = load_from_disk("../../../data/TACO/train.hf")

In [3]:
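def run_inference(prompt: str, path: str, num_returns=200, max_length=2048):
    """Sample completions for `prompt` in batches of 10 and save them to `path` as JSON."""
    outputs = []
    llm = Llama3_1_Instruct()
    # Note: if num_returns is not a multiple of 10, the remainder is dropped.
    for _ in range(num_returns // 10):
        config = {
            "temperature": 0.7,
            "max_length": max_length,
            "top_p": 0.95,
            "num_return_sequences": 10,
        }

        output = llm.run(prompt=prompt, input="", config_params=config)
        outputs.extend(output)

    # Use a context manager so the file is flushed and closed.
    with open(path, "w") as f:
        json.dump(outputs, f)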

In [4]:
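def parse_generation(generations: list, id: int, path: str):
    """Extract the fenced Python code from each generation and save it in the format read by `compute`."""
    gens = []
    for generation in generations:
        # Keep only the contents of fenced python blocks; join them if there are several.
        code_blocks = re.findall(r'```python(.*?)```', generation["generated_text"], re.DOTALL)
        extracted_code = "\n".join(block.strip() for block in code_blocks)
        gens.append(extracted_code)

    results = [{
        "task_id": int(id),
        "output": gens
    }]

    # Use a context manager so the file is flushed and closed.
    with open(path, "w") as f:
        json.dump(results, f)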
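
To see what the extraction in `parse_generation` keeps, here is a minimal, self-contained sketch run on a made-up generation. The sample text and the `extract_code` helper name are illustrative assumptions, not real model output. The fence marker is built with string repetition only so that the example does not contain a literal code fence.

```python
import re

# Build the three-backtick marker programmatically so this snippet
# does not contain a literal fence inside its own code block.
FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"python(.*?)" + FENCE, re.DOTALL)


def extract_code(generated_text: str) -> str:
    """Return every fenced python block in the text, stripped and joined by newlines."""
    blocks = CODE_BLOCK.findall(generated_text)
    return "\n".join(block.strip() for block in blocks)


# Hypothetical generation: prose, one fenced block, more prose.
sample = (
    "Here is my answer:\n"
    + FENCE + "python\n"
    + "class Solution:\n"
    + "    def findCornerPoints(self, L, points):\n"
    + "        return []\n"
    + FENCE + "\n"
    + "Hope this helps."
)

code = extract_code(sample)
print(code)

# A generation without any fenced python block yields an empty string.
print(repr(extract_code("No code here.")))
```

Only text inside fences tagged `python` is kept, so a generation that answers with an untagged fence or with bare code would be stored as an empty string.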
        

## Problem Selection

The chosen category is "Geometry" at EASY difficulty.  
Geometry was chosen because it has few examples, which makes it easier to find samples specific to that scope.  
The EASY difficulty is used for validation purposes.

In [5]:
train.filter(pl.col("tags") == "Geometry").filter(pl.col("difficulty") == "EASY").count()

id,difficulty,tags,input
u32,u32,u32,u32
127,127,127,127


In [6]:
# selected_problem = train.filter(pl.col("tags") == "Geometry").filter(pl.col("difficulty") == "EASY").sample(1)
# print(selected_problem)
## ID = 14186 

selected_problem = train.filter(pl.col("tags") == "Geometry").filter(pl.col("difficulty") == "EASY").filter(pl.col("id") == 10237)
print(selected_problem)

shape: (1, 4)
┌───────┬────────────┬──────────┬─────────────────────────────────┐
│ id    ┆ difficulty ┆ tags     ┆ input                           │
│ ---   ┆ ---        ┆ ---      ┆ ---                             │
│ u32   ┆ str        ┆ str      ┆ str                             │
╞═══════╪════════════╪══════════╪═════════════════════════════════╡
│ 10237 ┆ EASY       ┆ Geometry ┆ Consider a rectangle ABCD. Giv… │
└───────┴────────────┴──────────┴─────────────────────────────────┘


In [7]:
print(selected_problem.select("input").to_dict()["input"][0])

Consider a rectangle ABCD. Given the co-ordinates of the mid points of side AD and BC (p and q respectively) along with their length L (AD = BC = L). Find the co-ordinates of the 4 points A, B, C and D.
Example 1:
Input: L = 2, points = {{1,0},{1,2}}
Output: {{0,0},{0,2},{2,0},{2,2}}
Explanation: 
Example 2:
Input: L = 2.8284, points: {{1,1}, {-1,-1}}
Output: {{-2,0},{0,-2},{0,2},{2,0}}
Explanation: 
Your Task:
You don't need to read or print anything. Your task is to compelete the function findCornerPoints() which takes a vector of two points (p and q), and length l as input parameters and returns a vector containing the floor value of the corner points of the rectangle in sorted order.
 
Expected Time Complexity: O(1)
Expected Space Complexity: O(1)
 
Constraints:
1 <= L <= 10^{5}
1 <= p, q <= L
#User function Template for python3



class Solution:

	def findCornerPoints(self, L, points):

		#Code here


## Run Baseline - No Context

In [None]:
prompt_input = selected_problem.select("input").to_struct().to_pandas().iloc[0]["input"]
prompt = f"Please write a Python program \nQUESTION: \n{prompt_input} \n ANSWER: \n."
# run_inference(prompt, "no_context.json")

Loading checkpoint shards: 100%|██████████| 4/4 [00:04<00:00,  1.06s/it]


In [25]:
parse_generation(json.load(open("no_context.json")), 10237 , "no_context_parsed.json")

In [None]:
compute("no_context_parsed.json", [train_dict[10237]], [1, 10, 100])

In [27]:
json.load(open("taco_metrics.json"))

{'pass@1': 0.0,
 'pass@10': 0.0,
 'pass@100': 0.0,
 'detail': {'pass@1': {'10237': 0.0},
  'pass@10': {'10237': 0.0},
  'pass@100': {'10237': 0.0}}}
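
For reference, pass@k is commonly computed with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n is the number of samples and c the number that pass all tests. The sketch below assumes `compute` follows that convention, which is not verified here, and the `pass_at_k` helper is hypothetical. With c = 0 the estimate is 0 for every k, which is consistent with the metrics above.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples of which c are correct."""
    if k > n:
        raise ValueError("k cannot exceed the number of samples n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# 200 samples, none correct: the estimate is zero at every k.
for k in (1, 10, 100):
    print(k, pass_at_k(200, 0, k))

# Hypothetical case: 3 correct samples out of 200.
print(pass_at_k(200, 3, 1))
```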

## Select Context

Here we try to pick four solutions that are related to the problem above.


In [8]:
df = train.filter(pl.col("tags") == "Geometry").filter(pl.col("difficulty") == "EASY").filter(pl.col("input").str.contains("class Solution")).select(["id", "input"])
print(df.count())
df.write_csv("pool.csv")

shape: (1, 2)
┌─────┬───────┐
│ id  ┆ input │
│ --- ┆ ---   │
│ u32 ┆ u32   │
╞═════╪═══════╡
│ 14  ┆ 14    │
└─────┴───────┘


In [9]:
selected_ids = [21825, 10745, 1643, 4661]
train.filter(pl.col("id") == 1643)

id,difficulty,tags,input
u32,str,str,str
1643,"""EASY""","""Geometry""","""Given two rectangles, find if …"
1643,"""EASY""","""Mathematics""","""Given two rectangles, find if …"


In [10]:
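# Note: neither unique() nor group_by() guarantees row order by default, so position i
# in all_inputs and all_solutions is not guaranteed to refer to the same problem id.
all_inputs  = train.filter(pl.col("id").is_in(selected_ids)).select("input").unique().to_dict()["input"]
all_solutions = train_solutions.filter(pl.col("id").is_in(selected_ids)).group_by(pl.col("id")).head(1).select("solution").unique().to_dict()["solution"]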
question_input = selected_problem.select("input").to_dict()["input"][0]

all_inputs

input
str
"""An axis-aligned rectangle is r…"
"""Given the coordinates of the e…"
"""Given a circular sheet of radi…"
"""Given two rectangles, find if …"


In [11]:
all_solutions

solution
str
"""class Solution: 	def doInters…"
"""class Solution: 	def doOverla…"
"""class Solution: 	def rectangl…"
"""class Solution: 	def isRectan…"


## Full Prompt Run

In [12]:
context_prompt = "You will have to answer a programming question in geometry, we will pass before some examples of questions and solutions\n"
for i in range(4):
    context_prompt += f"EXAMPLE QUESTION {i}:\n {all_inputs[i]}\n EXAMPLE SOLUTION {i}:\n {all_solutions[i]}\n"

full_prompt = f"Please write a Python program {context_prompt} \nQUESTION: \n{question_input} \n ANSWER: \n."
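
The loop above assumes that `all_inputs[i]` and `all_solutions[i]` belong to the same problem. Both lists come from separate `unique()` calls, which by default do not promise a stable row order in polars, so it may be worth making the pairing explicit. Below is a minimal plain-Python sketch that pairs questions and solutions by `id` before formatting the context. The row data and the `build_context` helper are placeholders invented for illustration, not entries from the dataset.

```python
def build_context(question_rows, solution_rows, selected_ids):
    """Pair each question with a solution of the same id and format the few-shot context.

    question_rows / solution_rows are iterables of (id, text) tuples.
    Only the first question and the first solution seen for each id are used.
    """
    questions = {}
    for pid, text in question_rows:
        questions.setdefault(pid, text)  # later rows with the same id are ignored
    solutions = {}
    for pid, text in solution_rows:
        solutions.setdefault(pid, text)

    parts = []
    for i, pid in enumerate(selected_ids):
        parts.append(
            f"EXAMPLE QUESTION {i}:\n {questions[pid]}\n"
            f" EXAMPLE SOLUTION {i}:\n {solutions[pid]}\n"
        )
    return "".join(parts)


# Placeholder rows, deliberately listed in mismatched order.
question_rows = [(2, "question two"), (1, "question one"), (1, "question one")]
solution_rows = [(1, "solution one"), (2, "solution two"), (2, "another solution two")]

context = build_context(question_rows, solution_rows, selected_ids=[1, 2])
print(context)
```

With polars, a similar effect could be obtained by joining the two frames on `id` and reading both columns from the joined result.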

In [13]:
run_inference(
    prompt=full_prompt,
    path = "full_prompt.json",
    num_returns=200,
    max_length=4096
)

The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
Loading checkpoint shards: 100%|██████████| 4/4 [00:05<00:00,  1.49s/it]


OutOfMemoryError: CUDA out of memory. Tried to allocate 1.20 GiB. GPU 0 has a total capacity of 23.68 GiB of which 136.12 MiB is free. Process 1519937 has 420.00 MiB memory in use. Process 2380543 has 316.00 MiB memory in use. Including non-PyTorch memory, this process has 22.81 GiB memory in use. Of the allocated memory 20.52 GiB is allocated by PyTorch, and 2.04 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
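
The failing call asks for 10 sequences of up to 4096 tokens per `llm.run` call. Generation memory typically grows with the number of sequences sampled at once, so one option is to keep the total of 200 samples but request fewer per call. The helper below is a hypothetical sketch that only plans the per-call batch sizes, including a final partial batch. Wiring it into `run_inference`, and whether it actually avoids the error on this GPU, are assumptions that have not been tested here.

```python
def plan_batches(num_returns: int, per_call: int) -> list[int]:
    """Split num_returns samples into per-call batch sizes, keeping any remainder."""
    if num_returns < 0 or per_call <= 0:
        raise ValueError("num_returns must be >= 0 and per_call must be > 0")
    full, remainder = divmod(num_returns, per_call)
    sizes = [per_call] * full
    if remainder:
        sizes.append(remainder)
    return sizes


# 200 samples at 4 sequences per call: 50 calls, 200 samples in total.
sizes = plan_batches(200, 4)
print(len(sizes), sum(sizes))

# A total that is not a multiple of the batch size keeps its remainder.
print(plan_batches(25, 10))
```

Each planned size would be passed as `num_return_sequences` for one call, which also avoids silently dropping samples when the total is not a multiple of the batch size.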

## Only Solutions Run

In [48]:
context_prompt = "You will have to answer a programming question in geometry, we will pass before some examples of solutions for similar problems\n"
for i in range(4):
    context_prompt += f" EXAMPLE SOLUTION {i}:\n {all_solutions[i]}\n"

solutions_prompt = f"Please write a Python program {context_prompt} \nQUESTION: \n{question_input} \n ANSWER: \n."

In [None]:
run_inference(
    prompt=solutions_prompt,
    path = "solutions_prompt.json",
    num_returns=200,
    max_length=4096
)