# Chapter 6

## 6.1 Motivation

This notebook gathers the results for Section 6.1 (Motivation).

The results should show that, once a model has been "hacked", the "default" inference path can no longer be used, because the model simply hallucinates.

In [None]:
from hack_tokenizer import hack, loader
from hack_tokenizer.utils import constants as C

C.MODEL = 'HuggingFaceTB/SmolLM2-135M'
DATASET_TOKENIZER = C.DATA_DIR / 'tokenizer_pt-pt.txt'
NUM_NEW_TOKENS = 10000
MAX_NEW_TOKENS = 20
PROMPTS = [
    'Olá, podes contar-me um poema com as palavras: Sol, Lua, Céu e Nuvens?',
    'Bacalhau é um dos peixes mais utilizados na culinária'
]

print(f'''Run Configs: 
  - model: {C.MODEL}
  - device: {C.DEVICE}
  - model_kwargs: {{'torch_dtype': torch.bfloat16}}
  - tokenizer_kwargs: {{}}
  - Tokenizer (BPE) Dataset: `{DATASET_TOKENIZER}`
  - Batch Size: {C.GENERATION_BATCH_SIZE}
  - Learning Rate: {C.LEARNING_RATE} (we won't train the model here, however),
  - # Added Tokens: {NUM_NEW_TOKENS}
  - Embeddings Init Method: {C.EMBED_INIT_METHOD}
  - Maximum New Tokens: {MAX_NEW_TOKENS}
''')
model, tokenizer = loader.load_model_and_tokenizer(C.MODEL, C.DEVICE)
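# Second, untouched copy of the original tokenizer; it is passed to the proposed generation further down.
encoding_tokenizer = loader.load_model_and_tokenizer(C.MODEL, C.DEVICE)[1]
# The dataset is Portuguese text; read it explicitly as UTF-8 (assuming the file is UTF-8 encoded).
with open(DATASET_TOKENIZER, 'r', encoding='utf-8') as f:
    tokenizer_train_data = f.readlines()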
hacker = hack.ModelHacker(
    tokenizer_train_data,
    C.GENERATION_BATCH_SIZE,
    C.LEARNING_RATE
)


Run Configs: 
  - model: HuggingFaceTB/SmolLM2-135M
  - device: cuda
  - model_kwargs: {'torch_dtype': torch.bfloat16}
  - tokenizer_kwargs: {}
  - Tokenizer (BPE) Dataset: `/home/yali/MEGA/Hack The Tockenizer/data/tokenizer_pt-pt.txt`
  - Batch Size: 8
  - Learning Rate: 1e-06 (we won't train the model here, however),
  - # Added Tokens: 10000
  - Embeddings Init Method: weighted_drop(1.5)
  - Maximum New Tokens: 20



Default generation BEFORE hacking

In [2]:
for prompt in PROMPTS:
    inputs = tokenizer(prompt, return_tensors='pt')
    output = model.generate(
        inputs['input_ids'].to(model.device),
        attention_mask=inputs['attention_mask'].to(model.device),
        max_new_tokens = MAX_NEW_TOKENS,
        pad_token_id = tokenizer.eos_token_id,
        do_sample=False,
        temperature=None
    )
    generated_text = tokenizer.decode(output[0][inputs['input_ids'].shape[1]:])    # Keep only the newly generated tokens (the slice already drops the prompt)
    print(f'{"":-^100s}\n<PROMPT>{prompt}</PROMPT><GENERATION>{generated_text}</GENERATION>\n{"":-^100s}')

----------------------------------------------------------------------------------------------------
<PROMPT>Olá, podes contar-me um poema com as palavras: Sol, Lua, Céu e Nuvens?</PROMPT><GENERATION>

Ao fazer isso, o poeta deve ter uma palavra</GENERATION>
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
<PROMPT>Bacalhau é um dos peixes mais utilizados na culinária</PROMPT><GENERATION>.

Ao entender a criatura dos peixes, o uso de</GENERATION>
----------------------------------------------------------------------------------------------------


Hacking the model (adding the new tokens, without training)

In [3]:

model, tokenizer = hacker.hack(
    model, tokenizer,
    tokenizer, num_tokens=NUM_NEW_TOKENS,
    embed_initializer_method=C.EMBED_INIT_METHOD,
    show_progress=True,
    train = False
)

Removing tokens "contained" within any token of original tokenizer: 100%|██████████| 15137/15137 [00:27<00:00, 543.48it/s]
Initializing the embeddings for the new_tokens: 100%|██████████| 10000/10000 [00:01<00:00, 8771.66it/s]


"Default" inference generation AFTER hacking

In [4]:
for prompt in PROMPTS:
    inputs = tokenizer(prompt, return_tensors='pt')
    output = model.generate(
        inputs['input_ids'].to(model.device),
        attention_mask=inputs['attention_mask'].to(model.device),
        max_new_tokens = MAX_NEW_TOKENS,
        pad_token_id = tokenizer.eos_token_id,
        do_sample=False,
        temperature=None
    )
    generated_text = tokenizer.decode(output[0][inputs['input_ids'].shape[1]:])    # Keep only the newly generated tokens (the slice already drops the prompt)
    print(f'{"":-^100s}\n<PROMPT>{prompt}</PROMPT><GENERATION>{generated_text}</GENERATION>\n{"":-^100s}')

----------------------------------------------------------------------------------------------------
<PROMPT>Olá, podes contar-me um poema com as palavras: Sol, Lua, Céu e Nuvens?</PROMPT><GENERATION>

The first two words are the same, but the third is different. The first word is</GENERATION>
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
<PROMPT>Bacalhau é um dos peixes mais utilizados na culinária</PROMPT><GENERATION>.

Ao final, a criação de um novo código de program</GENERATION>
----------------------------------------------------------------------------------------------------
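
One plausible reading of the divergence above is that, once new tokens are added, the same prompt is segmented into ids that the untrained model never saw during training. The toy sketch below illustrates only that effect. The vocabulary, the `greedy_tokenize` helper, and the longest-match rule are all hypothetical stand-ins; this is neither real BPE nor the `hack_tokenizer` implementation.

```python
def greedy_tokenize(text, vocab):
    """Greedy longest-match segmentation (toy stand-in, not real BPE)."""
    ids, i = [], 0
    max_len = max(len(tok) for tok in vocab)
    while i < len(text):
        # Try the longest candidate piece first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                ids.append(vocab[piece])
                i += size
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


# Hypothetical character-level "original" vocabulary.
base_vocab = {ch: idx for idx, ch in enumerate("abcdehlu ")}

# Hypothetical "hacked" vocabulary: one new whole-word token appended at the end.
extended_vocab = dict(base_vocab)
extended_vocab["bacalhau"] = len(base_vocab)

text = "bacalhau"
base_ids = greedy_tokenize(text, base_vocab)     # eight ids, all known to the model
new_ids = greedy_tokenize(text, extended_vocab)  # one id the model was never trained on

print(base_ids)
print(new_ids)
```

In this toy setting the extended vocabulary collapses the word into a single id outside the original id range, which is the kind of input an untrained embedding row would have to handle.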


Our proposed generation

In [5]:
num_tokens_at_a_time = 4
for prompt in PROMPTS:
    og_prompt = prompt 
    generated_text = ''.join(hack.ModelHacker.prompt(
        model, tokenizer,
        encoding_tokenizer,
        content=prompt,
        max_new_tokens=MAX_NEW_TOKENS,
        stop_words=[],
        print_response=False,
        num_tokens_generated_at_once=num_tokens_at_a_time
    ))
    print(f'{"":-^100s}\n<PROMPT>{og_prompt}</PROMPT><GENERATION>{generated_text}</GENERATION>\n{"":-^100s}')

----------------------------------------------------------------------------------------------------
<PROMPT>Olá, podes contar-me um poema com as palavras: Sol, Lua, Céu e Nuvens?</PROMPT><GENERATION>

Ao fazer isso, o poeta deve ter uma palavra</GENERATION>
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
<PROMPT>Bacalhau é um dos peixes mais utilizados na culinária</PROMPT><GENERATION>.

Ao entender a criatura dos peixes, o uso de</GENERATION>
----------------------------------------------------------------------------------------------------
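
The three runs can be compared with a simple exact-match check. The sketch below copies the generations from the cell outputs in this notebook (prompt order preserved); the `match_rate` helper is a hypothetical name introduced only for this comparison and is not part of `hack_tokenizer`.

```python
# Generations copied verbatim from the cell outputs (same prompt order).
before_hack = [
    "\n\nAo fazer isso, o poeta deve ter uma palavra",
    ".\n\nAo entender a criatura dos peixes, o uso de",
]
after_hack_default = [
    "\n\nThe first two words are the same, but the third is different. The first word is",
    ".\n\nAo final, a criação de um novo código de program",
]
after_hack_proposed = [
    "\n\nAo fazer isso, o poeta deve ter uma palavra",
    ".\n\nAo entender a criatura dos peixes, o uso de",
]


def match_rate(reference, candidate):
    """Fraction of prompts whose generation is identical to the reference."""
    if len(reference) != len(candidate):
        raise ValueError("both lists must cover the same prompts")
    matches = sum(ref == cand for ref, cand in zip(reference, candidate))
    return matches / len(reference)


print(match_rate(before_hack, after_hack_default))   # 0.0
print(match_rate(before_hack, after_hack_proposed))  # 1.0
```

On these two prompts the proposed generation reproduces the pre-hack output exactly, while the default generation after hacking matches neither. Two prompts are a small sample, so this is an illustration rather than an evaluation.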
