# Labeling the [craigslist](https://huggingface.co/datasets/craigslist_bargains) dataset using Autolabel

This is a multi-class classification task: the input is a conversation between a buyer and a seller, and the goal is to classify the item being sold into one of 6 categories.

## Install Autolabel
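Also set your OpenAI API key if you plan to label with an OpenAI model such as `gpt-3.5-turbo`. Note that the configuration loaded later in this notebook specifies a Hugging Face pipeline model (`Open-Orca/Mixtral-SlimOrca-8x7B`) instead.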

In [None]:
!pip install 'refuel-autolabel[openai]'

In [1]:
import os

# provide your own OpenAI API key here
os.environ['OPENAI_API_KEY'] = 'sk-'


## Download the dataset

This dataset can be downloaded through Autolabel.

In [1]:
from autolabel import get_data

get_data('craigslist')

Downloading example dataset from https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/craigslist/seed.csv to seed.csv...
Downloading example dataset from https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/craigslist/test.csv to test.csv...
100% [........................................] [595661/595661] bytes

This downloads two datasets:
* `test.csv`: This is the larger dataset we are trying to label using LLMs
* `seed.csv`: This is a small dataset where we already have human-provided labels
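
Before labeling, it can help to check how the human-provided labels are distributed in the seed data. The sketch below is a minimal illustration using only the standard library. It assumes the CSV has a `label` column (the `label_column` named in the configuration) and an `example` column (the placeholder used in `example_template`); the helper `label_counts` and the sample rows are made up for this example and are not part of Autolabel or the dataset.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file, label_column="label"):
    """Count how many rows carry each label in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Tiny in-memory stand-in for seed.csv (made-up rows, not taken from the dataset).
sample = io.StringIO(
    "example,label\n"
    '"Buyer: Is the bike still available? Seller: Yes.",bike\n'
    '"Buyer: Does the phone come with a charger?",phone\n'
    '"Buyer: Would you take $50 for the bike?",bike\n'
)

counts = label_counts(sample)
print(counts)
```

To run it against the downloaded file, open it with `open('seed.csv', newline='')` and pass the file object to `label_counts`.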

## Start the labeling process!

Labeling with Autolabel is a 3-step process:
* First, we specify a labeling configuration (see `config_craigslist_mixtral.json` below)
* Next, we do a dry-run on our dataset using the LLM specified in the configuration by running `agent.plan`
* Finally, we run the labeling with `agent.run`

### First labeling run

In [1]:
import json

from autolabel import LabelingAgent

In [2]:
# load the config
with open('config_craigslist_mixtral.json', 'r') as f:
    config = json.load(f)

Let's review the configuration file below. You'll notice the following useful keys:
* `task_type`: `classification` (since it's a classification task)
* `model`: `{'provider': 'huggingface_pipeline', 'name': 'Open-Orca/Mixtral-SlimOrca-8x7B'}` (a specific model loaded through a Hugging Face pipeline)
* `prompt.task_guidelines`: `'You are an expert at understanding conversations...'` (how we describe the task to the LLM)
* `prompt.labels`: `['housing', 'furniture', 'bike', 'phone', 'car', 'electronics']` (the full list of labels to choose from)
* `prompt.few_shot_num`: 10 (how many labeled examples to provide to the LLM)
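
The dotted names in this list, such as `prompt.few_shot_num`, refer to nested keys in the loaded dictionary. As a minimal sketch of how to read them, the hypothetical helper `get_nested` below walks a dictionary one key at a time. It is not an Autolabel function, and `config_excerpt` is a hand-copied subset of the configuration printed in this notebook.

```python
def get_nested(config, dotted_key):
    """Look up a value such as 'prompt.few_shot_num' in a nested dict."""
    value = config
    for part in dotted_key.split("."):
        value = value[part]  # raises KeyError if a level is missing
    return value


# Hand-copied subset of the configuration shown in this notebook.
config_excerpt = {
    "task_type": "classification",
    "prompt": {
        "labels": ["housing", "furniture", "bike", "phone", "car", "electronics"],
        "few_shot_num": 10,
    },
}

few_shot_num = get_nested(config_excerpt, "prompt.few_shot_num")
num_labels = len(get_nested(config_excerpt, "prompt.labels"))
print(few_shot_num, num_labels)
```

The same call works on the full `config` dictionary once it has been loaded with `json.load`.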

In [3]:
config

{'task_name': 'CraigslistConversationClassification',
 'task_type': 'classification',
 'dataset': {'label_column': 'label', 'delimiter': ','},
 'model': {'provider': 'huggingface_pipeline',
  'name': 'Open-Orca/Mixtral-SlimOrca-8x7B'},
 'prompt': {'task_guidelines': 'You are an expert at understanding conversations.\n Given a text passage as input comprising of dialogue of negotiations between a seller and a buyer about the sale of an item, your task is to classify the item being sold into one of the following categories:\n{labels}',
  'output_guidelines': 'You will answer with just the the correct output label and nothing else.',
  'labels': ['housing', 'furniture', 'bike', 'phone', 'car', 'electronics'],
  'few_shot_examples': 'seed.csv',
  'few_shot_selection': 'semantic_similarity',
  'few_shot_num': 10,
  'example_template': 'Input: {example}\nOutput: {label}'}}
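
To see how the `prompt` fields relate to each other, the sketch below assembles a few-shot prompt from `task_guidelines`, `output_guidelines`, `labels`, and `example_template`. This is only an illustration of the idea: `build_prompt` is a hypothetical helper, the ordering and separators are assumptions, and Autolabel's own prompt construction may differ. The shortened guideline text and the example conversations are made up.

```python
def build_prompt(prompt_config, few_shot_rows, new_example):
    """Assemble an illustrative few-shot prompt from the config fields."""
    # Fill the {labels} placeholder with one label per line.
    labels_text = "\n".join(prompt_config["labels"])
    guidelines = prompt_config["task_guidelines"].format(labels=labels_text)

    template = prompt_config["example_template"]
    # Each few-shot example shows an input together with its known label.
    shots = [
        template.format(example=row["example"], label=row["label"])
        for row in few_shot_rows
    ]
    # The final item leaves the label empty for the model to complete.
    query = template.format(example=new_example, label="").rstrip()

    sections = [guidelines, prompt_config["output_guidelines"], *shots, query]
    return "\n\n".join(sections)


prompt_config = {
    "task_guidelines": "Classify the item being sold into one of the following categories:\n{labels}",
    "output_guidelines": "You will answer with just the correct output label and nothing else.",
    "labels": ["housing", "furniture", "bike", "phone", "car", "electronics"],
    "example_template": "Input: {example}\nOutput: {label}",
}
few_shot_rows = [
    {"example": "Buyer: Is the couch still available?", "label": "furniture"},
]

prompt = build_prompt(prompt_config, few_shot_rows, "Buyer: How many miles on the car?")
print(prompt)
```

Running `agent.plan` prints the actual example prompt, which is the authoritative view of what the model receives.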

In [4]:
# create an agent for labeling
agent = LabelingAgent(config=config)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Loading checkpoint shards:   0%|          | 0/19 [00:00<?, ?it/s]

The model 'MixtralForCausalLM' is not supported for text2text-generation. Supported models are ['BartForConditionalGeneration', 'BigBirdPegasusForConditionalGeneration', 'BlenderbotForConditionalGeneration', 'BlenderbotSmallForConditionalGeneration', 'EncoderDecoderModel', 'FSMTForConditionalGeneration', 'GPTSanJapaneseForConditionalGeneration', 'LEDForConditionalGeneration', 'LongT5ForConditionalGeneration', 'M2M100ForConditionalGeneration', 'MarianMTModel', 'MBartForConditionalGeneration', 'MT5ForConditionalGeneration', 'MvpForConditionalGeneration', 'NllbMoeForConditionalGeneration', 'PegasusForConditionalGeneration', 'PegasusXForConditionalGeneration', 'PLBartForConditionalGeneration', 'ProphetNetForConditionalGeneration', 'SeamlessM4TForTextToText', 'SeamlessM4Tv2ForTextToText', 'SwitchTransformersForConditionalGeneration', 'T5ForConditionalGeneration', 'UMT5ForConditionalGeneration', 'XLMProphetNetForConditionalGeneration'].


In [5]:
# dry-run -- this tells us how much this will cost and shows an example prompt
from autolabel import AutolabelDataset
ds = AutolabelDataset("test.csv", config=config)
agent.plan(ds)

2024-01-02 23:57:13 sentence_transformers.SentenceTransformer INFO: Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2
2024-01-02 23:57:13 sentence_transformers.SentenceTransformer INFO: Use pytorch device: cuda


Batches:   0%|          | 0/7 [00:00<?, ?it/s]

Output()

Batches:   0%|          | 0/1 [00:00<?, ?it/s]


In [None]:
# now, do the actual labeling
ds = agent.run(ds, max_items=500)

Output()

Batches: 0it [00:00, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


2024-01-02 23:58:03 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 950.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 778.88 MiB is free. Process 1489750 has 78.33 GiB memory in use. Of the allocated memory 77.35 GiB is allocated by PyTorch, and 334.43 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
2024-01-02 23:58:38 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 876.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 862.88 MiB is free. Process 1489750 has 78.25 GiB memory in use. Of the allocated memory 77.27 GiB is allocated by PyTorch, and 336.13 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:01:01 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 962.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 904.88 MiB is free. Process 1489750 has 78.21 GiB memory in use. Of the allocated memory 77.36 GiB is allocated by PyTorch, and 196.26 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:01:03 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 750.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 154.88 MiB is free. Process 1489750 has 78.94 GiB memory in use. Of the allocated memory 77.31 GiB is allocated by PyTorch, and 998.24 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:01:44 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 680.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 424.88 MiB is free. Process 1489750 has 78.68 GiB memory in use. Of the allocated memory 77.27 GiB is allocated by PyTorch, and 765.72 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:01:45 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 786.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 666.88 MiB is free. Process 1489750 has 78.44 GiB memory in use. Of the allocated memory 77.16 GiB is allocated by PyTorch, and 637.80 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:02:53 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 908.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 884.88 MiB is free. Process 1489750 has 78.23 GiB memory in use. Of the allocated memory 77.30 GiB is allocated by PyTorch, and 277.17 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:07:29 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 930.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 880.88 MiB is free. Process 1489750 has 78.23 GiB memory in use. Of the allocated memory 77.33 GiB is allocated by PyTorch, and 255.37 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:07:31 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 800.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 80.88 MiB is free. Process 1489750 has 79.01 GiB memory in use. Of the allocated memory 77.41 GiB is allocated by PyTorch, and 966.79 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:07:34 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 772.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 80.88 MiB is free. Process 1489750 has 79.01 GiB memory in use. Of the allocated memory 77.39 GiB is allocated by PyTorch, and 985.52 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:09:03 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 866.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 52.88 MiB is free. Process 1489750 has 79.04 GiB memory in use. Of the allocated memory 77.52 GiB is allocated by PyTorch, and 886.48 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:09:35 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 866.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 90.88 MiB is free. Process 1489750 has 79.00 GiB memory in use. Of the allocated memory 77.48 GiB is allocated by PyTorch, and 892.03 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:11:02 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 972.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 888.88 MiB is free. Process 1489750 has 78.22 GiB memory in use. Of the allocated memory 77.37 GiB is allocated by PyTorch, and 201.15 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:11:05 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 708.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 180.88 MiB is free. Process 1489750 has 78.92 GiB memory in use. Of the allocated memory 77.29 GiB is allocated by PyTorch, and 989.16 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:11:07 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 794.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 94.88 MiB is free. Process 1489750 has 79.00 GiB memory in use. Of the allocated memory 77.37 GiB is allocated by PyTorch, and 1001.03 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:11:18 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 1.10 GiB. GPU 1 has a total capacty of 79.11 GiB of which 746.88 MiB is free. Process 1489750 has 78.36 GiB memory in use. Of the allocated memory 77.54 GiB is allocated by PyTorch, and 167.31 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:11:20 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 712.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 34.88 MiB is free. Process 1489750 has 79.06 GiB memory in use. Of the allocated memory 77.28 GiB is allocated by PyTorch, and 1.12 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


2024-01-03 00:13:27 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 896.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 84.88 MiB is free. Process 1489750 has 79.01 GiB memory in use. Of the allocated memory 77.45 GiB is allocated by PyTorch, and 921.16 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:13:39 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 910.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 60.88 MiB is free. Process 1489750 has 79.03 GiB memory in use. Of the allocated memory 77.47 GiB is allocated by PyTorch, and 928.23 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:13:50 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 1.09 GiB. GPU 1 has a total capacty of 79.11 GiB of which 762.88 MiB is free. Process 1489750 has 78.35 GiB memory in use. Of the allocated memory 77.53 GiB is allocated by PyTorch, and 165.51 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:13:52 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 632.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 130.88 MiB is free. Process 1489750 has 78.96 GiB memory in use. Of the allocated memory 77.19 GiB is allocated by PyTorch, and 1.12 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:14:11 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 786.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 280.88 MiB is free. Process 1489750 has 78.82 GiB memory in use. Of the allocated memory 77.36 GiB is allocated by PyTorch, and 824.74 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:14:13 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 854.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 182.88 MiB is free. Process 1489750 has 78.91 GiB memory in use. Of the allocated memory 77.40 GiB is allocated by PyTorch, and 876.65 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:14:49 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 812.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 166.88 MiB is free. Process 1489750 has 78.93 GiB memory in use. Of the allocated memory 77.45 GiB is allocated by PyTorch, and 845.06 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:16:04 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 886.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 114.88 MiB is free. Process 1489750 has 78.98 GiB memory in use. Of the allocated memory 77.42 GiB is allocated by PyTorch, and 924.51 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:17:39 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 796.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 230.88 MiB is free. Process 1489750 has 78.87 GiB memory in use. Of the allocated memory 77.37 GiB is allocated by PyTorch, and 861.79 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:19:42 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 870.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 300.88 MiB is free. Process 1489750 has 78.80 GiB memory in use. Of the allocated memory 77.28 GiB is allocated by PyTorch, and 884.17 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:19:44 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 650.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 300.88 MiB is free. Process 1489750 has 78.80 GiB memory in use. Of the allocated memory 77.22 GiB is allocated by PyTorch, and 947.04 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:20:52 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 990.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 938.88 MiB is free. Process 1489750 has 78.18 GiB memory in use. Of the allocated memory 77.39 GiB is allocated by PyTorch, and 130.37 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:22:00 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 1016.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 920.88 MiB is free. Process 1489750 has 78.19 GiB memory in use. Of the allocated memory 77.42 GiB is allocated by PyTorch, and 117.71 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:22:02 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 554.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 366.88 MiB is free. Process 1489750 has 78.73 GiB memory in use. Of the allocated memory 77.07 GiB is allocated by PyTorch, and 1.01 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:22:04 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 790.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 130.88 MiB is free. Process 1489750 has 78.96 GiB memory in use. Of the allocated memory 77.30 GiB is allocated by PyTorch, and 1.01 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:22:06 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 696.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 130.88 MiB is free. Process 1489750 has 78.96 GiB memory in use. Of the allocated memory 77.28 GiB is allocated by PyTorch, and 1.03 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:22:19 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 774.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 268.88 MiB is free. Process 1489750 has 78.83 GiB memory in use. Of the allocated memory 77.38 GiB is allocated by PyTorch, and 810.69 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:24:26 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 898.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 236.88 MiB is free. Process 1489750 has 78.86 GiB memory in use. Of the allocated memory 77.31 GiB is allocated by PyTorch, and 917.21 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:24:37 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 712.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 422.88 MiB is free. Process 1489750 has 78.68 GiB memory in use. Of the allocated memory 77.10 GiB is allocated by PyTorch, and 949.89 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:24:39 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 736.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 398.88 MiB is free. Process 1489750 has 78.70 GiB memory in use. Of the allocated memory 77.12 GiB is allocated by PyTorch, and 944.92 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:24:41 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 652.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 398.88 MiB is free. Process 1489750 has 78.70 GiB memory in use. Of the allocated memory 77.10 GiB is allocated by PyTorch, and 974.71 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:24:42 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 934.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 386.88 MiB is free. Process 1489750 has 78.71 GiB memory in use. Of the allocated memory 77.34 GiB is allocated by PyTorch, and 740.96 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:24:53 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 672.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 648.88 MiB is free. Process 1489750 has 78.46 GiB memory in use. Of the allocated memory 77.05 GiB is allocated by PyTorch, and 771.01 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:25:03 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 850.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 590.88 MiB is free. Process 1489750 has 78.52 GiB memory in use. Of the allocated memory 77.23 GiB is allocated by PyTorch, and 640.54 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:25:05 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 636.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 590.88 MiB is free. Process 1489750 has 78.52 GiB memory in use. Of the allocated memory 76.99 GiB is allocated by PyTorch, and 892.11 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:25:58 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 912.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 772.88 MiB is free. Process 1489750 has 78.34 GiB memory in use. Of the allocated memory 77.31 GiB is allocated by PyTorch, and 384.53 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:10 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 844.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 310.88 MiB is free. Process 1489750 has 78.79 GiB memory in use. Of the allocated memory 77.29 GiB is allocated by PyTorch, and 864.75 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:20 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 586.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 568.88 MiB is free. Process 1489750 has 78.54 GiB memory in use. Of the allocated memory 77.03 GiB is allocated by PyTorch, and 870.54 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:37 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 802.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 394.88 MiB is free. Process 1489750 has 78.71 GiB memory in use. Of the allocated memory 77.24 GiB is allocated by PyTorch, and 830.40 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:39 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 732.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 394.88 MiB is free. Process 1489750 has 78.71 GiB memory in use. Of the allocated memory 77.21 GiB is allocated by PyTorch, and 859.02 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:41 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 714.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 394.88 MiB is free. Process 1489750 has 78.71 GiB memory in use. Of the allocated memory 77.24 GiB is allocated by PyTorch, and 825.49 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:42 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 798.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 394.88 MiB is free. Process 1489750 has 78.71 GiB memory in use. Of the allocated memory 77.23 GiB is allocated by PyTorch, and 837.30 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:44 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 728.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 394.88 MiB is free. Process 1489750 has 78.71 GiB memory in use. Of the allocated memory 77.21 GiB is allocated by PyTorch, and 862.76 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:47 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 640.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 394.88 MiB is free. Process 1489750 has 78.71 GiB memory in use. Of the allocated memory 77.22 GiB is allocated by PyTorch, and 850.55 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:27:48 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 854.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 290.88 MiB is free. Process 1489750 has 78.81 GiB memory in use. Of the allocated memory 77.30 GiB is allocated by PyTorch, and 873.40 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:28:03 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 594.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 550.88 MiB is free. Process 1489750 has 78.55 GiB memory in use. Of the allocated memory 77.04 GiB is allocated by PyTorch, and 876.88 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:28:04 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 858.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 282.88 MiB is free. Process 1489750 has 78.82 GiB memory in use. Of the allocated memory 77.31 GiB is allocated by PyTorch, and 874.44 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:28:49 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 850.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 330.88 MiB is free. Process 1489750 has 78.77 GiB memory in use. Of the allocated memory 77.26 GiB is allocated by PyTorch, and 876.00 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
2024-01-03 00:29:34 autolabel.models.hf_pipeline ERROR: Error while generating output via HF Pipeline: CUDA out of memory. Tried to allocate 882.00 MiB. GPU 1 has a total capacty of 79.11 GiB of which 46.88 MiB is free. Process 1489750 has 79.05 GiB memory in use. Of the allocated memory 77.52 GiB is allocated by PyTorch, and 894.90 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:32000 for open-end generation.
