<a href="https://colab.research.google.com/github/Tonio-V98T/Kaibutsu/blob/main/Production_8B_Llama.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Llama 2: Final**

## **Global Settings**

Install libraries

In [None]:
%%capture
!pip install transformers datasets evaluate xformers trl bitsandbytes peft

Load libraries

In [None]:
from datasets import Dataset, DatasetDict, load_dataset
from google.colab import drive
from huggingface_hub import login
from peft import AutoPeftModelForCausalLM, LoraConfig
from torch.utils.data import DataLoader
from tqdm.auto import tqdm
from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          BitsAndBytesConfig, DataCollatorWithPadding,
                          TrainingArguments)
from trl import SFTTrainer
from trl.trainer import ConstantLengthDataset
from typing import Dict
import bitsandbytes as bnb
import datasets
import datetime
import evaluate
import huggingface_hub
import numpy as np
import os
import pandas as pd
import requests
import torch
import transformers

Set global printing options

In [None]:
# Set printing options within the whole environment
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)
pd.set_option('expand_frame_repr', False)

Define computing device

In [None]:
# Use the first GPU when available, otherwise fall back to the CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"
print(device)

cuda:0


Mount Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


Login to HF

In [None]:
login()  # prompts for the access token; avoid hard-coding credentials in the notebook

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful


## **Data**

### **Fine-tuning dataset**

Load FinBank dataset from HF

In [None]:
%%capture
db_finbank = load_dataset("financial_phrasebank", 'sentences_50agree')
db_finbank = db_finbank["train"]

Split into train and test sets with a fixed seed of 17

In [None]:
db_temp = db_finbank.train_test_split(test_size = 0.2, shuffle = True,
                                      seed = 17)
db_train = db_temp["train"]
db_test = db_temp["test"]
print(db_train, "\n", db_test)

Dataset({
    features: ['sentence', 'label'],
    num_rows: 3876
}) 
 Dataset({
    features: ['sentence', 'label'],
    num_rows: 970
})


### **Inference Dataset**

Load ordered dataset

In [None]:
filepath = "/content/drive/MyDrive/Kaibutsu/db_desc_ordered.csv"

db_desc = pd.read_csv(filepath)
db_desc.drop(db_desc.columns[[0]], axis=1, inplace=True)

print(db_desc.iloc[0:5, :])

         Date                    Company                                                                                                                      Desc
0  2023-01-02  Builders FirstSource Inc.                                      4 Top Long-Term Stocks For 2023: 3 New Picks Join Google (Plus A Bonus Rule Breaker)
1  2023-01-03  Builders FirstSource Inc.                     Advisor Group Inc. boosted its holdings in shares of Builders FirstSource by 26.6% in the 3rd quarter
2  2023-01-03  Builders FirstSource Inc.                                                         Builders FirstSource Inc.: or reduced their stakes in the company
3  2023-01-03  Builders FirstSource Inc.                                                                                     Builders FirstSource Stock Down 0.5 %
4  2023-01-03  Builders FirstSource Inc.  Builders FirstSource Inc.: The company reported $5.20 earnings per share for the quarter, topping the consensus estimate


Convert from DataFrame to Dataset

In [None]:
db_sa = Dataset.from_pandas(db_desc, preserve_index=False)
print(db_sa)

Dataset({
    features: ['Date', 'Company', 'Desc'],
    num_rows: 31737
})


## **Data Pre-processing**

### **Formatting for Fine-tuning**

Define formatting function for the FinBank dataset

In [None]:
def formatting_finbank_train(dataset):
    # Financial PhraseBank label ids: 0 = negative, 1 = neutral, 2 = positive
    output_texts = []
    for i in range(len(dataset['sentence'])):

        if dataset['label'][i] == 0:
            label_temp = "Negative"

        elif dataset['label'][i] == 1:
            label_temp = "Neutral"

        else:
            label_temp = "Positive"

        text = f"### Text: {dataset['sentence'][i]}\n ### Sentiment: {label_temp}"
        output_texts.append(text)

    return {"prompt" : output_texts,}

In [None]:
def formatting_finbank_test(dataset):
    output_texts = []
    for i in range(len(dataset['sentence'])):

        text = f"### Text: {dataset['sentence'][i]}\n ### Sentiment: "
        output_texts.append(text)

    return {"prompt" : output_texts,}

Transform the dataset into prompts

In [None]:
original_columns = db_train.column_names

prompts_train = db_train.map(formatting_finbank_train,
                             batched = True,
                             remove_columns = original_columns)

prompts_test = db_test.map(formatting_finbank_test,
                           batched = True,
                           remove_columns = original_columns)

Map:   0%|          | 0/3876 [00:00<?, ? examples/s]

Map:   0%|          | 0/970 [00:00<?, ? examples/s]

In [None]:
print(prompts_train['prompt'][0])
print(prompts_test['prompt'][0])

### Text: In beers , Olvi retained its market position .
 ### Sentiment: Neutral
### Text: It also said its third quarter diluted EPS came in at 0.34 eur compared with 0.16 eur in the same quarter a year ago .
 ### Sentiment: 


### **Tokenizing for Evaluation**

Load tokenizer

In [None]:
%%capture
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf",
                                          trust_remote_code = True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"  # pad on the left so generation continues directly from the prompt

# data collator
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
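Why the padding side matters at generation time can be seen without a model. The sketch below uses a hypothetical helper, `pad_batch`, and the pad id 2 (the EOS id that appears in the generated tensors later in this notebook). With left padding every prompt ends at the last position of its row, so new tokens are appended right after the prompt. With right padding the shorter prompts end in pad tokens, and a decoder-only model would continue from those instead.

```python
def pad_batch(sequences, pad_id, side="left"):
    """Pad lists of token ids to a common length; return (input_ids, attention_mask)."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        pads, ones = [pad_id] * n_pad, [1] * len(seq)
        if side == "left":
            input_ids.append(pads + seq)
            attention_mask.append([0] * n_pad + ones)
        else:
            input_ids.append(seq + pads)
            attention_mask.append(ones + [0] * n_pad)
    return input_ids, attention_mask


# Two toy prompts of different lengths (ids are illustrative only)
batch = [[1, 835, 3992], [1, 835, 3992, 29901, 739]]

left_ids, left_mask = pad_batch(batch, pad_id=2, side="left")
right_ids, right_mask = pad_batch(batch, pad_id=2, side="right")

print(left_ids[0], left_mask[0])    # prompt tokens sit at the end of the row
print(right_ids[0], right_mask[0])  # pad tokens sit at the end of the row
```

This is only an illustration of the layout; in the notebook itself the padding is done by `DataCollatorWithPadding`, which follows `tokenizer.padding_side`.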

Define tokenization function

In [None]:
def tokenize_function(dataset):
    input_ids = []
    attention_mask = []

    for i in dataset["prompt"]:

        # no padding or truncation here; the data collator pads each batch
        token_temp = tokenizer(i, truncation=False)

        input_ids.append(token_temp["input_ids"])
        attention_mask.append(token_temp["attention_mask"])

    return {"input_ids" : input_ids,
            "attention_mask" : attention_mask, }

Apply tokenizing function

In [None]:
original_columns = prompts_test.column_names
tokenized_test = prompts_test.map(tokenize_function,
                                  batched = True,
                                  remove_columns = original_columns)

Map:   0%|          | 0/970 [00:00<?, ? examples/s]

In [None]:
print(tokenized_test[0], "\n")
print(tokenizer.decode(tokenized_test[0]["input_ids"], skip_special_tokens=True))

{'input_ids': [1, 835, 3992, 29901, 739, 884, 1497, 967, 4654, 12616, 21749, 3860, 382, 7024, 2996, 297, 472, 29871, 29900, 29889, 29941, 29946, 321, 332, 9401, 411, 29871, 29900, 29889, 29896, 29953, 321, 332, 297, 278, 1021, 12616, 263, 1629, 8020, 869, 13, 835, 28048, 2073, 29901, 29871], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]} 

### Text: It also said its third quarter diluted EPS came in at 0.34 eur compared with 0.16 eur in the same quarter a year ago .
 ### Sentiment: 


### **Formatting for Inference**

Define custom formatting function for sentiment dataset

In [None]:
def formatting_sa(dataset):
    output_texts = []
    for i in range(len(dataset['Desc'])):

        # no label is appended: the model has to generate the sentiment
        text = f"### Text: {dataset['Desc'][i]}\n ### Sentiment: "
        output_texts.append(text)

    return {"prompt" : output_texts,}

Transform the sentiment dataset into prompts

In [None]:
original_columns = db_sa.column_names
prompts_sa = db_sa.map(formatting_sa,
                       batched = True,
                       remove_columns = original_columns)

Map:   0%|          | 0/31737 [00:00<?, ? examples/s]

In [None]:
print(prompts_sa["prompt"][0])

### Text: 4 Top Long-Term Stocks For 2023: 3 New Picks Join Google (Plus A Bonus Rule Breaker)
 ### Sentiment: 


### **Tokenizing Dataset for Inference**

Apply tokenization function on the sentiment dataset

In [None]:
original_columns = prompts_sa.column_names
tokenized_sa = prompts_sa.map(tokenize_function,
                              batched = True,
                              remove_columns = original_columns)

Map:   0%|          | 0/31737 [00:00<?, ? examples/s]

## **Fine-tuning**

### **Base Model Setup**

Define quantization strategy

In [None]:
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
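A rough back-of-the-envelope estimate shows why 4-bit loading is used here. The sketch below assumes a nominal 7 billion parameters and counts weight storage only; it ignores quantization constants, activations, optimizer state and the LoRA adapters, so actual GPU usage will be higher. `weight_memory_gb` is a hypothetical helper, not a library function.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 7e9  # assumption: nominal parameter count of a "7B" model

for name, bits in [("fp32", 32), ("fp16/bf16", 16), ("nf4", 4)]:
    print(f"{name:>10}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights shrink from about 14 GB in half precision to about 3.5 GB in 4-bit form, which is what makes fine-tuning feasible on a single Colab GPU.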

Load base model

In [None]:
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config = bnb_config,
    device_map = {"": 0},
    trust_remote_code = True,
    use_auth_token = True,
)

base_model.config.use_cache = False



Downloading (…)lve/main/config.json:   0%|          | 0.00/609 [00:00<?, ?B/s]

Downloading (…)fetensors.index.json:   0%|          | 0.00/26.8k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/9.98G [00:00<?, ?B/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/3.50G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]



Downloading (…)neration_config.json:   0%|          | 0.00/188 [00:00<?, ?B/s]

Add LoRA adapters for fine-tuning the base model to a specific task

In [None]:
peft_config = LoraConfig(
    r = 8,
    lora_alpha = 16,
    lora_dropout = 0.05,
    target_modules = ["q_proj", "v_proj"], # adapt only the attention query and value projections
    bias = "none", # do not train any bias parameters
    task_type = "CAUSAL_LM",
)
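To get a feel for how small the adapters are, the number of trainable LoRA parameters can be counted by hand. Each adapted weight matrix of shape `d_out x d_in` receives two low-rank factors, `A` (`r x d_in`) and `B` (`d_out x r`). The sketch below assumes a hidden size of 4096 and 32 decoder layers for Llama-2-7b, with square `q_proj` and `v_proj` matrices; these values come from the model's published configuration, not from this notebook, and `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(r, d_in, d_out, n_modules_per_layer, n_layers):
    """Trainable parameters added by LoRA: r * (d_in + d_out) per adapted matrix."""
    per_module = r * (d_in + d_out)
    return per_module * n_modules_per_layer * n_layers


trainable = lora_param_count(
    r=8,                    # rank from the LoraConfig above
    d_in=4096, d_out=4096,  # assumption: Llama-2-7b hidden size
    n_modules_per_layer=2,  # q_proj and v_proj
    n_layers=32,            # assumption: Llama-2-7b decoder layers
)
print(f"{trainable:,} trainable LoRA parameters")
print(f"LoRA scaling factor alpha / r = {16 / 8}")
```

With these assumptions the adapters add roughly 4.2 million parameters, a small fraction of a percent of the base model.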

### **Tokenizer**

Load tokenizer, and set the EOS token as the selected padding choice

In [None]:
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf",
                                          trust_remote_code = True)

tokenizer.pad_token = tokenizer.eos_token

tokenizer.padding_side = "right"  # Fix weird overflow issue with fp16 training

### **Training Arguments**

Define the directory where the fine-tuned model will be saved

In [None]:
path_finetuned = "/content/drive/MyDrive/Kaibutsu/llama_sft_production"

Establish the training arguments

In [None]:
training_args = TrainingArguments(

    output_dir = path_finetuned,

    per_device_train_batch_size = 4,
    per_device_eval_batch_size = 4,

    gradient_accumulation_steps = 2,  # gradients are accumulated over 2 batches
                                      # before each optimizer update
    learning_rate = 1e-4,             # initial lr. rate for AdamW

    num_train_epochs = 3,
    #max_steps = 10,                  # overrides num_train_epochs

    group_by_length = False,          # only use with dynamic padding
    lr_scheduler_type = "cosine",     # cosine decay of the lr. rate after warmup
    warmup_steps = 50,                # initial steps with lower lr. rate
    optim = "paged_adamw_32bit",

    fp16=True,                        # bf16 = True only if gpu: >= Ampere
    remove_unused_columns = False,    # keep col.s unused by forward method
)
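The arguments above determine both the number of optimizer updates and the learning rate at each of them. The sketch below works this out for the 3,876 training prompts, assuming a single GPU. `optimizer_steps` and `lr_at_step` are hypothetical helpers; the schedule is the usual linear warmup followed by cosine decay to zero, written out by hand rather than taken from the library.

```python
import math


def optimizer_steps(n_samples, batch_size, grad_accum_steps, n_epochs):
    """Number of optimizer updates for a single-device run."""
    batches_per_epoch = math.ceil(n_samples / batch_size)
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return math.ceil(n_epochs * updates_per_epoch)


def lr_at_step(step, total_steps, base_lr=1e-4, warmup_steps=50):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total = optimizer_steps(n_samples=3876, batch_size=4,
                        grad_accum_steps=2, n_epochs=3)
print("effective batch size:", 4 * 2)
print("optimizer steps:", total)
for step in (0, 25, 50, total // 2, total):
    print(f"step {step:>4}: lr = {lr_at_step(step, total):.2e}")
```

The step count obtained this way agrees with the `global_step=1452` reported in the training output further down.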

### **Training**

Instantiate SFT Trainer

In [None]:
trainer = SFTTrainer(
    model = base_model,
    train_dataset = prompts_train,
    eval_dataset = prompts_test,
    dataset_text_field = "prompt",
    peft_config = peft_config,
    packing = False,                    # one example per sequence, no concatenation
    max_seq_length = 1024,             # longer tokenized examples are truncated
    tokenizer = tokenizer,
    args = training_args,
)

Map:   0%|          | 0/3876 [00:00<?, ? examples/s]

Map:   0%|          | 0/970 [00:00<?, ? examples/s]

Train the model

In [None]:
trainer.train()

You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


Step,Training Loss
500,1.8539
1000,1.5846


TrainOutput(global_step=1452, training_loss=1.6578076575413223, metrics={'train_runtime': 707.5626, 'train_samples_per_second': 16.434, 'train_steps_per_second': 2.052, 'total_flos': 3.01351253778432e+16, 'train_loss': 1.6578076575413223, 'epoch': 3.0})

### **Saving**

Save the model

In [None]:
trainer.save_model(path_finetuned)

Save the LoRA adapter weights to a final checkpoint directory

In [None]:
path_final = os.path.join(path_finetuned, "final_checkpoint")

trainer.model.save_pretrained(path_final)

Clear memory

In [None]:
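del base_model, trainer   # the trainer also holds a reference to the model
torch.cuda.empty_cache()  # release cached GPU memory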

## **Testing**

### **Metrics**

#### **Checking Metrics**

Check all available metrics

In [None]:
# check all available metrics (no community ones)
print(evaluate.list_evaluation_modules(with_details=True))

[{'name': 'precision', 'type': 'metric', 'community': False, 'likes': 1}, {'name': 'code_eval', 'type': 'metric', 'community': False, 'likes': 11}, {'name': 'roc_auc', 'type': 'metric', 'community': False, 'likes': 0}, {'name': 'cuad', 'type': 'metric', 'community': False, 'likes': 0}, {'name': 'xnli', 'type': 'metric', 'community': False, 'likes': 0}, {'name': 'rouge', 'type': 'metric', 'community': False, 'likes': 24}, {'name': 'pearsonr', 'type': 'metric', 'community': False, 'likes': 1}, {'name': 'mse', 'type': 'metric', 'community': False, 'likes': 0}, {'name': 'super_glue', 'type': 'metric', 'community': False, 'likes': 4}, {'name': 'comet', 'type': 'metric', 'community': False, 'likes': 5}, {'name': 'cer', 'type': 'metric', 'community': False, 'likes': 9}, {'name': 'sacrebleu', 'type': 'metric', 'community': False, 'likes': 8}, {'name': 'mahalanobis', 'type': 'metric', 'community': False, 'likes': 0}, {'name': 'wer', 'type': 'metric', 'community': False, 'likes': 17}, {'name': '

Load relevant metrics

In [None]:
# non-efficient way:
accuracy = evaluate.load("accuracy")
precision = evaluate.load("precision")
f1 = evaluate.load("f1")

# efficient way: combine metrics that process the same input
metrics = evaluate.combine(["accuracy", "f1", "precision", "recall"])

Downloading builder script:   0%|          | 0.00/4.20k [00:00<?, ?B/s]

Downloading builder script:   0%|          | 0.00/7.55k [00:00<?, ?B/s]

Downloading builder script:   0%|          | 0.00/6.77k [00:00<?, ?B/s]

Downloading builder script:   0%|          | 0.00/7.36k [00:00<?, ?B/s]

Print metrics description

In [None]:
print(accuracy.description)


Accuracy is the proportion of correct predictions among the total number of cases processed. It can be computed with:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
 Where:
TP: True positive
TN: True negative
FP: False positive
FN: False negative



Print metrics citation

In [None]:
print(accuracy.citation)

Check the inputs required by the metrics

In [None]:
print(accuracy.features)

{'predictions': Value(dtype='int32', id=None), 'references': Value(dtype='int32', id=None)}


#### **Creating Metrics Function**

Instantiate the evaluation metrics. They are computed manually on the generated answers and are NOT passed to the Trainer

In [None]:
# combine metrics that process the same input
metrics = evaluate.combine(["accuracy", "f1", "precision", "recall"])

**NEXT:** First of all, define the training arguments and start the fine-tuning.  Create a **true** test dataset, with no labels. Then, use the "predict()" method of the Trainer to produce a NamedTuple as an output. Extract the predictions from the tuple and save them. Extract only the sentiment from the predictions, and convert it to its integer representation. Convert the labels from the test dataset to integers. Lastly, manually compute the metrics.
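The final step of the plan above, computing the metrics by hand once predictions and references are both integers, can be sketched in plain Python. This is a minimal illustration, not the `evaluate` implementation: `accuracy_and_macro_f1` is a hypothetical helper, macro averaging over the three classes is an assumption, and the prediction and reference lists are made-up toy values.

```python
def accuracy_and_macro_f1(predictions, references, labels=(0, 1, 2)):
    """Accuracy and macro-averaged F1 for integer class labels."""
    pairs = list(zip(predictions, references))
    accuracy = sum(p == r for p, r in pairs) / len(pairs)

    f1_scores = []
    for label in labels:
        tp = sum(p == label and r == label for p, r in pairs)
        fp = sum(p == label and r != label for p, r in pairs)
        fn = sum(p != label and r == label for p, r in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return accuracy, sum(f1_scores) / len(f1_scores)


# Toy values only: 0 = negative, 1 = neutral, 2 = positive
toy_predictions = [2, 1, 0, 1, 2, 1]
toy_references = [2, 1, 0, 2, 2, 0]

acc, macro_f1 = accuracy_and_macro_f1(toy_predictions, toy_references)
print(f"accuracy = {acc:.3f}, macro F1 = {macro_f1:.3f}")
```

Answers from which no sentiment can be extracted need a policy of their own, for example counting them as wrong, before they are passed to a function like this.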

### **Evaluation**

**[Deprecated]** Use the SFT Trainer to make predictions on the test dataset, and save the predictions. Note: it is not necessary to use the "predict()" method, as the model is simply generating predictions.

Load the model and merge the weights

In [None]:
model_dir = "/content/drive/MyDrive/Kaibutsu/llama_sft_production/final_checkpoint/"
model_sa = AutoPeftModelForCausalLM.from_pretrained(model_dir,
                                                    device_map="auto",
                                                    torch_dtype=torch.bfloat16)
merged_sa = model_sa.merge_and_unload()

Downloading (…)lve/main/config.json:   0%|          | 0.00/609 [00:00<?, ?B/s]

Downloading (…)fetensors.index.json:   0%|          | 0.00/26.8k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/9.98G [00:00<?, ?B/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/3.50G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)neration_config.json:   0%|          | 0.00/188 [00:00<?, ?B/s]
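What merging means for a single weight matrix can be shown on a toy example. A LoRA-adapted layer computes with `W + (alpha / r) * B @ A`, and merging folds that low-rank update into `W` so that no adapter is needed at inference time. The sketch below uses plain nested lists and hypothetical helpers (`matmul`, `merge_lora`); the matrices are tiny made-up values, with `alpha / r = 2` chosen to match the ratio `16 / 8` in the LoRA configuration of this notebook.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * B @ A for one adapted weight matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (d_out x r) @ (r x d_in)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


W = [[1.0, 0.0],
     [0.0, 1.0]]   # frozen base weight, d_out x d_in = 2 x 2
A = [[1.0, 2.0]]   # LoRA factor A, r x d_in = 1 x 2
B = [[0.5],
     [0.0]]        # LoRA factor B, d_out x r = 2 x 1

merged = merge_lora(W, A, B, alpha=2, r=1)
print(merged)
```

This only illustrates the arithmetic; in the notebook the merge itself is carried out by `merge_and_unload()`.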

Use a dataloader to create batches out of the test dataset

In [None]:
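# The training tokenizer pads on the right; generation needs left padding
tokenizer.padding_side = "left"
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
dataloader = DataLoader(tokenized_test, batch_size = 4,
                        collate_fn = data_collator)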

Test the model feeding it batches of the test dataset

In [None]:
outputs = []

merged_sa.eval()
with torch.no_grad():
    # tqdm wraps the dataloader, so the bar advances once per batch
    for batch in tqdm(dataloader):
        outputs_temp = merged_sa.generate(
            input_ids = batch["input_ids"].to(device),
            attention_mask = batch["attention_mask"].to(device),
            max_new_tokens = 10,
            pad_token_id = tokenizer.eos_token_id,
        )
        outputs.append(outputs_temp)

print(outputs[0:4])

  0%|          | 0/243 [00:00<?, ?it/s]

You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


[tensor([[    2,     2,     2,     2,     2,     2,     2,     2,     1,   835,
          3992, 29901,   739,   884,  1497,   967,  4654, 12616, 21749,  3860,
           382,  7024,  2996,   297,   472, 29871, 29900, 29889, 29941, 29946,
           321,   332,  9401,   411, 29871, 29900, 29889, 29896, 29953,   321,
           332,   297,   278,  1021, 12616,   263,  1629,  8020,   869,    13,
           835, 28048,  2073, 29901, 29871, 10321,  3321, 28048,  2073,   584,
           739,   884,  1497,   967,  4654],
        [    2,     2,     2,     2,     2,     2,     2,     2,     2,     2,
             2,     2,     2,     2,     2,     2,     2,     2,     2,     2,
             2,     2,     2,     1,   835,  3992, 29901,   405,   554,   423,
           322,  1260,  8069,   674,   664,  4208,   304,  6963,   263, 11558,
         10426,   330, 11500,  7271,   363,  1260,  8069, 20330,   869,    13,
           835, 28048,  2073, 29901, 29871, 29900, 29889, 29947, 29945, 29945,
      

### **Outputs Post-processing**

In a generative context, there are no logits in the output. Rather, there are lists of tokenized answers, which need to be decoded

In [None]:
test_answers_list = []

for batch_encodings in outputs:

    # decode a whole batch of token ids at once
    decoded_batch = tokenizer.batch_decode(batch_encodings,
                                           skip_special_tokens=True)
    test_answers_list.extend(decoded_batch)

# create dictionary
test_answers_dict = {"Answer" : test_answers_list, }

# create dataframe to export
test_answers = pd.DataFrame.from_dict(test_answers_dict)
print(test_answers[0:8])

                                                                                                                                                                                                                                                           Answer
0                                                                  ### Text: It also said its third quarter diluted EPS came in at 0.34 eur compared with 0.16 eur in the same quarter a year ago .\n ### Sentiment:  Positive Sentiment : It also said its third
1                                                                                                      ### Text: Nokia and Elisa will work together to bring a superior mobile gaming experience for Elisa customers .\n ### Sentiment: 0.855 Positive\n ### Sent
2                                                                                                 ### Text: The contract includes design , construction , delivery of equipment , installation and commissioning .\n ### Sentiment
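The decoded answers still contain the prompt, and the generated part is often noisy ("0.855 Positive", a repeated "### Sentiment", and so on). One way to turn them into integer labels is to look only at the text after the first `### Sentiment:` marker and take the first sentiment word found there. The sketch below is one possible approach, not the method used to produce any results in this notebook; `extract_sentiment`, `LABEL_TO_ID` and the shortened example strings are assumptions for illustration.

```python
import re

# Same integer coding as the fine-tuning labels
LABEL_TO_ID = {"negative": 0, "neutral": 1, "positive": 2}
LABEL_PATTERN = re.compile(r"\b(negative|neutral|positive)\b", re.IGNORECASE)


def extract_sentiment(decoded_answer, marker="### Sentiment:"):
    """Return the label id of the first sentiment word after the marker, or None."""
    _, found, completion = decoded_answer.partition(marker)
    if not found:
        return None  # the marker is missing, so nothing was generated after it
    match = LABEL_PATTERN.search(completion)
    return LABEL_TO_ID[match.group(1).lower()] if match else None


examples = [
    "### Text: Some positive-sounding headline .\n ### Sentiment:  Neutral\n ### Sentiment:  Ne",
    "### Text: Stock Down 0.5 %\n ### Sentiment: 0.5 Negative\n ### Sentiment:",
    "### Text: Earnings topped the estimate\n ### Sentiment: 0.855 Positive\n ### Sent",
    "### Text: No label was generated\n ### Sentiment: 12.5",
]

for answer in examples:
    print(extract_sentiment(answer))
```

Because only the completion is searched, a sentiment word that appears inside the headline itself (as in the first example) does not affect the result. Answers that return `None` should be inspected or handled explicitly before any metric is computed.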

Export as .csv

In [None]:
filename = 'test_answers_llama_leftpadding.csv'
test_answers.to_csv('/content/drive/MyDrive/Kaibutsu/' + filename)

## **Inference**

### **Model Setup**

Load model and merge weights

In [None]:
model_dir = "/content/drive/MyDrive/Kaibutsu/llama_sft_production/final_checkpoint/"
model_sa = AutoPeftModelForCausalLM.from_pretrained(model_dir,
                                                    device_map="auto",
                                                    torch_dtype=torch.bfloat16)
merged_sa = model_sa.merge_and_unload()

Downloading (…)lve/main/config.json:   0%|          | 0.00/609 [00:00<?, ?B/s]

Downloading (…)fetensors.index.json:   0%|          | 0.00/26.8k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/9.98G [00:00<?, ?B/s]

Downloading (…)of-00002.safetensors:   0%|          | 0.00/3.50G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Downloading (…)neration_config.json:   0%|          | 0.00/188 [00:00<?, ?B/s]

Use a DataLoader to create batches of 32 elements

In [None]:
dataloader_sa = DataLoader(tokenized_sa, batch_size = 32,
                           collate_fn = data_collator)

### **Processing**

Generate the outputs

In [None]:
outputs_sa = []

merged_sa.eval()
with torch.no_grad():
    # tqdm wraps the dataloader, so the bar advances once per batch
    for batch in tqdm(dataloader_sa):
        outputs_temp = merged_sa.generate(
            input_ids = batch["input_ids"].to(device),
            attention_mask = batch["attention_mask"].to(device),
            max_new_tokens = 10,
            pad_token_id = tokenizer.eos_token_id,
        )
        outputs_sa.append(outputs_temp)

print(outputs_sa[0:4])

  0%|          | 0/992 [00:00<?, ?it/s]

You're using a LlamaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


[tensor([[    2,     2,     2,  ..., 29901, 29871,  2448],
        [    2,     2,     2,  ..., 28048,  2073,    13],
        [    2,     2,     2,  ..., 29901, 29871,  2448],
        ...,
        [    2,     2,     2,  ..., 29901, 28048,  2073],
        [    2,     2,     2,  ..., 29901,  4649,   737],
        [    2,     2,     2,  ..., 29901, 29871,  2448]], device='cuda:0'), tensor([[    2,     1,   835,  ...,   835, 28048,  2073],
        [    2,     2,     2,  ...,   835, 28048,  2073],
        [    2,     2,     2,  ...,  2073,   584, 29871],
        ...,
        [    2,     2,     2,  ...,  1939, 10541,  2183],
        [    2,     2,     2,  ..., 29901, 29871,  2448],
        [    2,     2,     2,  ..., 28048,  2073,   584]], device='cuda:0'), tensor([[    2,     2,     2,  ...,  2073, 29901,  2448],
        [    2,     2,     2,  ..., 29941, 29889, 29946],
        [    2,     2,     2,  ..., 28048,  2073,    13],
        ...,
        [    1,   835,  3992,  ..., 29901,  4755,   

### **Outputs Post-processing**

Decode the outputs

In [None]:
sa_answers_list = []

for batch_encodings in outputs_sa:

    # decode a whole batch of token ids at once
    decoded_batch = tokenizer.batch_decode(batch_encodings,
                                           skip_special_tokens = True)
    sa_answers_list.extend(decoded_batch)

# create dictionary
sa_answers_dict = {"Answer" : sa_answers_list, }

# create dataframe to export
sa_answers = pd.DataFrame.from_dict(sa_answers_dict)
print(sa_answers[0:8])

                                                                                                                                                                                              Answer
0                                                      ### Text: 4 Top Long-Term Stocks For 2023: 3 New Picks Join Google (Plus A Bonus Rule Breaker)\n ### Sentiment:  Neutral\n ### Sentiment:  Ne
1                                        ### Text: Advisor Group Inc. boosted its holdings in shares of Builders FirstSource by 26.6% in the 3rd quarter\n ### Sentiment: 26.60 Positive Sentiment\n
2                                                                         ### Text: Builders FirstSource Inc.: or reduced their stakes in the company\n ### Sentiment:  Neutral\n ### Sentiment:  Ne
3                                                                                                     ### Text: Builders FirstSource Stock Down 0.5 %\n ### Sentiment: 0.5 Negative\n ### Sentiment:
4              

Export as .csv

In [None]:
filename = 'productions_llama_answers.csv'
sa_answers.to_csv('/content/drive/MyDrive/Kaibutsu/' + filename)

## **References**

Evaluate: (https://huggingface.co/docs/evaluate/a_quick_tour)

Generation -> Gen. Config -> Special tokens -> Padding Token: (https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/text_generation#transformers.GenerationMixin.generate)

Generation with LLMs -> Wrong padding Side: (https://huggingface.co/docs/transformers/main/llm_tutorial)