# TACO: Generation Pipeline

In this notebook we test the TACO pipeline on a single example: we generate code for one problem and then evaluate the result.

In [1]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import polars as pl
import torch
import numpy as np

## LLM
from src.llms import Llama3_1_Instruct

seed = 42
# NumPy
np.random.seed(seed)

# PyTorch
torch.manual_seed(seed)
if torch.cuda.is_available():
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


if torch.cuda.is_available():
    print(f"Number of GPUs available: {torch.cuda.device_count()}")
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


Number of GPUs available: 1
GPU 0: NVIDIA RTX A6000


## Load Data

In [2]:
PATH = "../../data/TACO/processed"
train_input = pl.read_ipc(f"{PATH}/train.feather")
train_evaluation = pl.read_ipc(f"{PATH}/train_evaluation_tests.feather")
train_solutions = pl.read_ipc(f"{PATH}/train_solutions.feather")


In [3]:
## Problem ID 4 is an easy one, so we use it as the single test example
ID = 4
## Get the input string
input_example = train_input.filter(pl.col("id") == ID).select("input").unique().to_numpy().squeeze(1)[0]

In [4]:
llm = Llama3_1_Instruct()

Loading checkpoint shards: 100%|██████████| 4/4 [00:08<00:00,  2.09s/it]


In [5]:
config = {
    "temperature": 0.7,
    "max_length": 2048,
    "top_p": 0.95,
    # 200 samples requested in a single call; this is the call that runs out of GPU memory
    "num_return_sequences": 200,
}

prompt = f"Please write a Python program. \n {input_example}"

output = llm.run(prompt=prompt, input=input_example, config_params=config)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


OutOfMemoryError: CUDA out of memory. Tried to allocate 6.15 GiB. GPU 0 has a total capacity of 47.53 GiB of which 1.52 GiB is free. Process 1879136 has 7.63 GiB memory in use. Including non-PyTorch memory, this process has 38.37 GiB memory in use. Of the allocated memory 33.52 GiB is allocated by PyTorch, and 4.53 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
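
The error above is consistent with the key/value cache growing with `num_return_sequences` × `max_length`. The sketch below makes that arithmetic explicit and shows one way to split the 200 samples into smaller requests. The architecture defaults (32 layers, 8 key/value heads, head dimension 128, 2 bytes per value) are assumptions that match Llama 3.1 8B in 16-bit precision; the notebook does not state which checkpoint `Llama3_1_Instruct` loads. Both `kv_cache_gib` and `split_batches` are hypothetical helper names, not part of `src.llms`.

```python
def kv_cache_gib(num_sequences, tokens_per_sequence, num_layers=32,
                 num_kv_heads=8, head_dim=128, bytes_per_value=2):
    """Worst-case key/value cache size in GiB for one generation call.

    The default architecture values are ASSUMED (Llama 3.1 8B, 16-bit);
    adjust them for the checkpoint that is actually loaded.
    """
    # Factor 2: each layer stores one key tensor and one value tensor.
    bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return num_sequences * tokens_per_sequence * bytes_per_token / 2**30


def split_batches(total, batch_size):
    """Split `total` samples into batch sizes no larger than `batch_size`."""
    full, rest = divmod(total, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


# Upper bound if all 200 sequences reach max_length=2048 at once.
print(kv_cache_gib(200, 2048))   # 50.0
# The same bound for a batch of 16 sequences.
print(kv_cache_gib(16, 2048))    # 4.0
# Batch sizes that still add up to 200 samples.
print(split_batches(200, 16))
```

Under these assumptions the worst-case cache alone (50 GiB) exceeds the 47.53 GiB the GPU reports, before counting model weights. One possible workaround is to loop over `split_batches(200, 16)`, call `llm.run` once per batch with `num_return_sequences` set to that batch size, and collect the outputs. How to merge them depends on the return type of `llm.run`, which is not shown here.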