<a href="https://colab.research.google.com/github/JDS289/BaLD4LLM/blob/main/playingAround.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Note: on a GPU runtime this notebook sometimes fails with a CUDA error, and on a TPU runtime it sometimes ends in an "Unknown crash".
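One way to narrow down the intermittent CUDA error is to make kernel launches synchronous. CUDA normally reports errors asynchronously, so the Python traceback can point at a later, unrelated call. The cell below is a minimal sketch, assuming it is run in a fresh runtime before anything initializes CUDA; it slows execution, so it is only meant for debugging.

```python
import os

# Assumption: this runs before CUDA is initialized (ideally before `import torch`).
# With synchronous launches, a CUDA error is raised at the operation that caused it.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

Restart the runtime and remove the variable once the failing operation has been identified.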


In [1]:
from google.colab import userdata
hf_token = userdata.get("huggingface_secret")

from huggingface_hub import login
login(token=hf_token)

In [4]:
DEFAULT_MODEL = "meta-llama/Llama-3.2-3B-Instruct"

from typing import Optional
import os
import torch
from accelerate import Accelerator
from transformers import AutoModelForCausalLM, AutoTokenizer
import numpy as np
from tqdm.notebook import tqdm

#import warnings
#warnings.filterwarnings('ignore')




device = "cuda" if torch.cuda.is_available() else "cpu"

#import os
#os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


SYS_PROMPT = ""


# Set up Accelerate here; the model and tokenizer are loaded in the next cell

accelerator = Accelerator()

In [10]:
model = AutoModelForCausalLM.from_pretrained(
    DEFAULT_MODEL,
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
    device_map=device,
    output_hidden_states=True,
    return_dict_in_generate=True,
)
# use_safetensors only applies to model weights, so it is not passed to the tokenizer
tokenizer = AutoTokenizer.from_pretrained(DEFAULT_MODEL)
# Only the model needs preparing; a tokenizer has nothing for Accelerate to wrap
model = accelerator.prepare(model)
# Reuse the EOS token for padding during generation
model.generation_config.pad_token_id = tokenizer.eos_token_id


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

KeyboardInterrupt: 

In [23]:
model.generation_config.output_hidden_states = True
model.generation_config.return_dict_in_generate = True
print(model.generation_config)

GenerationConfig {
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "output_hidden_states": true,
  "pad_token_id": 128009,
  "return_dict_in_generate": true,
  "temperature": 0.6,
  "top_p": 0.9
}



In [73]:
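def process_chunk(text_chunk):
    """Generate a reply to text_chunk, print it, and report hidden-state shapes."""
    conversation = [
        {"role": "system", "content": SYS_PROMPT},
        {"role": "user", "content": text_chunk},
    ]

    # add_generation_prompt=True appends the assistant header, so the model
    # starts a reply instead of continuing the user turn.
    prompt = tokenizer.apply_chat_template(
        conversation, tokenize=False, add_generation_prompt=True
    )
    # The chat template already inserts the BOS token, so don't add it again.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(device)
    prompt_len = inputs["input_ids"].shape[1]

    with torch.no_grad():
        output = model.generate(
            **inputs,
            temperature=0.7,
            top_p=0.9,
            max_new_tokens=512,
        )

    # Decode only the newly generated tokens. Searching the decoded string for
    # "assistant" breaks whenever the user text itself contains that word.
    new_tokens = output.sequences[0][prompt_len:]
    processed_text = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

    print(f"INPUT TEXT:\n{text_chunk}")
    print(f"\nPROCESSED TEXT:\n{processed_text}")

    # hidden_states has one entry per generation step; each entry is a tuple
    # holding one tensor per layer.
    print(len(output.hidden_states))

    for i in range(len(output.hidden_states)):
        # Elementwise sum of the per-layer tensor shapes at generation step i
        total_shape = np.array(output.hidden_states[i][0].shape)
        for j in range(1, len(output.hidden_states[i])):
            total_shape += np.array(output.hidden_states[i][j].shape)
        print(total_shape)
    return processed_text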
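The nested structure of `output.hidden_states` can be easier to read as one row per generation step than as summed shapes. In `transformers`, `generate` with `output_hidden_states=True` and `return_dict_in_generate=True` returns a tuple over generation steps, where each step is a tuple over layers of tensors shaped `(batch_size, seq_len, hidden_size)`. The helper below is a sketch under that assumption. `summarize_hidden_states` and `fake_step` are hypothetical names, and the toy objects only mimic the `.shape` attribute, so the layer count and hidden size shown are not those of the real model.

```python
from types import SimpleNamespace


def summarize_hidden_states(hidden_states):
    """Return one (step, num_layers, shape) tuple per generation step."""
    summary = []
    for step, layers in enumerate(hidden_states):
        shapes = {tuple(layer.shape) for layer in layers}
        # Within a single step, every layer is expected to share one shape.
        if len(shapes) != 1:
            raise ValueError(f"step {step}: layers disagree on shape: {shapes}")
        summary.append((step, len(layers), shapes.pop()))
    return summary


def fake_step(num_layers, seq_len, hidden_size=8, batch_size=1):
    """Hypothetical stand-in for one step: objects that only expose .shape."""
    shape = (batch_size, seq_len, hidden_size)
    return tuple(SimpleNamespace(shape=shape) for _ in range(num_layers))


# Toy data: a 5-token prompt processed in the first step, then one token per
# step, which is the pattern expected when the key/value cache is in use.
fake_hidden_states = (fake_step(4, 5), fake_step(4, 1), fake_step(4, 1))

summary = summarize_hidden_states(fake_hidden_states)
for row in summary:
    print(row)
```

Inside `process_chunk`, the same helper could be called as `summarize_hidden_states(output.hidden_states)`, since real tensors also expose `.shape`.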

In [74]:
_ = process_chunk("Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.")

INPUT TEXT:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

PROCESSED TEXT:
It looks like you've copied a passage from the famous Latin text known as the "Lorem Ipsum." This text is often used as a placeholder in graphic design, publishing, and other industries because it is neutral and doesn't have any specific meaning.

The original text is a translation of a passage from the Roman poet Marcus Tullius Cicero's work "De Officiis" (On Duties), written in the 1st century BC. However, the most well-known version of the text was popularized in the 16th century by the printer Nicolas Jenson.

The p