In [1]:
pip list | egrep -w "torch|transformers|datasets|accelerate|bitsandbytes|peft|trl"

accelerate                        0.28.0             /home/g/accelerate_fork
bitsandbytes                      0.42.0
datasets                          2.16.1
open-clip-torch                   2.23.0
peft                              0.9.0
sentence-transformers             2.3.1
torch                             2.2.1
torch-grammar                     0.3.3
transformers                      4.38.1
trl                               0.7.11
Note: you may need to restart the kernel to use updated packages.


# Hello world again, this time with EmbeddingModelWrapper

In [1]:
from semscore import EmbeddingModelWrapper
em = EmbeddingModelWrapper()

words = ["lemon", "orange", "car", "money"]
embds = em.get_embeddings(words)

similarities = em.get_similarities(embds)

100%|███████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.98it/s]
100%|██████████████████████████████████████████████████████| 4/4 [00:00<00:00, 172.07it/s]


In [2]:
display(similarities)

tensor([[1.0000, 0.5340, 0.2909, 0.2281],
        [0.5340, 1.0000, 0.3027, 0.1968],
        [0.2909, 0.3027, 1.0000, 0.3411],
        [0.2281, 0.1968, 0.3411, 1.0000]])
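
The matrix above is symmetric with ones on the diagonal, which is the shape pairwise cosine similarity produces. The implementation of `get_similarities` is not shown here, so the following is only a minimal sketch assuming it is cosine-based. The helper names and the 3-dimensional toy vectors are made up for illustration and are not real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_matrix(vectors):
    """Compare every vector with every other vector (and with itself)."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]

# Toy stand-ins for embeddings (assumption: real embeddings have far more dimensions)
toy_vectors = [
    [1.0, 0.2, 0.0],
    [0.9, 0.4, 0.1],
    [0.0, 0.1, 1.0],
]

matrix = similarity_matrix(toy_vectors)
for row in matrix:
    print([round(value, 4) for value in row])
```

Each row `i` holds the similarity of item `i` to every item, so entry `[i][i]` compares a vector with itself and is 1 up to rounding.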

# Test TinyLlama on a tiny subset of the Open Assistant dataset

In [3]:
from datasets import load_dataset

dataset_name="g-ronimo/oasst2_top1_en"
dataset = load_dataset(dataset_name)["train"]
dataset = dataset.shuffle(seed=42).select(range(10))

In [4]:
dataset

Dataset({
    features: ['conversation'],
    num_rows: 10
})

In [5]:
import json
print(json.dumps(dataset[0], indent=2))

{
  "conversation": [
    {
      "content": "Can photoshop be used as a tool to forge documents illegally?",
      "role": "user"
    },
    {
      "content": "Yes, Photoshop can be used to forge documents due to the application's ability to modify digital images. Users with access to a Photoshop tool would have the potential to alter documents and create highly convincing fakes. It is important to note that forging documents is not an intended goal of the photoshop tool, and forging documents is likely to be illegal or immoral based on a specific situation.",
      "role": "assistant"
    }
  ]
}


## Load model and tokenizer

In [6]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch, json

model_path="TinyLlama/TinyLlama-1.1B-Chat-v1.0"
model = AutoModelForCausalLM.from_pretrained(
    model_path,    
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
tokenizer=AutoTokenizer.from_pretrained(model_path, use_fast=False)

## Challenge TinyLlama with the questions from the n=10 conversations in the dataset

In [11]:
from tqdm import tqdm 

answers_ref, answers_pred = [], []

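for conversation in tqdm(dataset["conversation"]):
    for i, msg in enumerate(conversation):
        # Assistant turns are reference answers, not prompts
        if msg["role"] == "assistant": continue
        # A trailing user turn has no reference answer to compare against
        if i + 1 >= len(conversation): continue
        # Context = every message up to and including this user turn
        partial_conv = conversation[0:i+1]

        # Tokenize with the model's chat template and move to the model's device
        prompt_tok = tokenizer.apply_chat_template(
            partial_conv, 
            add_generation_prompt = True, 
            return_tensors="pt"
        ).to(model.device)
        answer_tok = model.generate(
            prompt_tok, 
            eos_token_id = tokenizer.eos_token_id,
            max_new_tokens = 2000, 
        )
        # Decode only the newly generated tokens, not the prompt
        answer = tokenizer.decode(
            answer_tok[0][len(prompt_tok[0]):], 
            skip_special_tokens = True
        )

        answers_pred.append(answer) 
        answers_ref.append(conversation[i+1]["content"])  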

100%|█████████████████████████████████████████████████████| 10/10 [04:11<00:00, 25.13s/it]


In [12]:
print(answers_pred[0])

Yes, photoshop can be used as a tool to forge documents illegally. Photoshop is a powerful image editing software that can be used to manipulate images and create fake documents. It can be used to create fake signatures, addresses, and other documents that are not genuine. However, it is essential to note that using photoshop for forging documents can lead to legal consequences, and it is best to avoid using this software for this purpose.

In some cases, photoshop can be used to create fake documents that are indistinguishable from genuine documents. However, it is essential to note that the software can be easily detected and can lead to legal consequences. It is best to avoid using photoshop for forging documents and instead use other legal methods to create fake documents.

In summary, photoshop can be used as a tool to forge documents illegally, but it is essential to avoid using this software for this purpose and use other legal methods to create fake documents.


In [13]:
print(answers_ref[0])

Yes, Photoshop can be used to forge documents due to the application's ability to modify digital images. Users with access to a Photoshop tool would have the potential to alter documents and create highly convincing fakes. It is important to note that forging documents is not an intended goal of the photoshop tool, and forging documents is likely to be illegal or immoral based on a specific situation.


In [14]:
# from semscore import EmbeddingModelWrapper
from statistics import mean

# em = EmbeddingModelWrapper()
similarities = em.get_similarities(
    em.get_embeddings( answers_pred ),
    em.get_embeddings( answers_ref ),
)

for i, score in enumerate(similarities):
    print(f"question {i}: {score}")

print("avg. similarity pred. vs. ref.:", mean(similarities), "\n")

100%|███████████████████████████████████████████████████████| 2/2 [00:00<00:00, 17.38it/s]
100%|███████████████████████████████████████████████████████| 2/2 [00:00<00:00, 12.79it/s]


question 0: 0.9192708730697632
question 1: 0.8718356490135193
question 2: 0.7959429621696472
question 3: 0.6209906339645386
question 4: 0.5349656343460083
question 5: 0.5585931539535522
question 6: 0.880068838596344
question 7: 0.917442798614502
question 8: 0.4834440350532532
question 9: 0.5997608304023743
question 10: 0.8643136620521545
question 11: 0.34133124351501465
question 12: 0.8355472683906555
avg. similarity pred. vs. ref.: 0.7095005833185636 
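
The loop produced 13 scores from 10 conversations because every user turn that is followed by a reply counts as one question, so multi-turn conversations contribute more than one. A small sketch of that expansion follows; the helper `conversation_to_pairs` and the two-turn conversation are invented for illustration.

```python
def conversation_to_pairs(conversation):
    """Return (context, reference_answer) for every user turn that has a reply."""
    pairs = []
    for i, msg in enumerate(conversation):
        # Skip assistant turns and a trailing user turn without a reply
        if msg["role"] != "user" or i + 1 >= len(conversation):
            continue
        context = conversation[: i + 1]             # history up to this question
        reference = conversation[i + 1]["content"]  # the reply that follows it
        pairs.append((context, reference))
    return pairs

# Invented example data (not taken from the dataset)
toy_conversation = [
    {"role": "user", "content": "What is a lemon?"},
    {"role": "assistant", "content": "A citrus fruit."},
    {"role": "user", "content": "And an orange?"},
    {"role": "assistant", "content": "Also a citrus fruit."},
]

pairs = conversation_to_pairs(toy_conversation)
print(len(pairs))  # 2: one question per user turn
```

The second pair carries the first exchange as context, which mirrors how `partial_conv` grows with each user turn in the loop.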



# Go bigger: Evaluate TinyLlama and Mistral on 100 conversations from the OA2 dataset
* load the OA2 dataset and select 100 conversations
* speed up inference by using `ModelPredictionGenerator` for batched inference
* generate answers to the questions in these conversations with `TinyLlama-1.1B-Chat-v1.0` and `Mistral-7B-Instruct-v0.2`
* calculate the average similarity between each model's answers and the reference answers

In [15]:
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
from statistics import mean
from datasets import load_dataset
import torch, json, gc

from semscore import ModelPredictionGenerator, EmbeddingModelWrapper

set_seed(42)

dataset_name="g-ronimo/oasst2_top1_en"
dataset = load_dataset(dataset_name)["train"]
dataset = dataset.shuffle().select(range(100))

results={}

def test(model_path, eval_ds):
    model = AutoModelForCausalLM.from_pretrained(
        model_path,    
        device_map="auto",
        torch_dtype=torch.bfloat16,
    )
    tokenizer=AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)
    
    generator = ModelPredictionGenerator(model, tokenizer)
    result = generator.run(dataset = eval_ds)

    em = EmbeddingModelWrapper()
    similarities = em.get_similarities(
        em.get_embeddings( [a["answer_ref"] for a in result] ),
        em.get_embeddings( [a["answer_pred"] for a in result] ),
    )    
    print(model_path, mean(similarities), "\n")

    return result

for model_path in [
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "mistralai/Mistral-7B-Instruct-v0.2",
]:    
    results[model_path] = test(model_path, dataset)
    # Collect unreferenced Python objects first so their GPU memory can be released
    gc.collect()
    torch.cuda.empty_cache()

100%|███████████████████████████████████████████████████████| 3/3 [01:56<00:00, 38.94s/it]
100%|█████████████████████████████████████████████████████| 20/20 [00:01<00:00, 12.40it/s]
100%|█████████████████████████████████████████████████████| 20/20 [00:01<00:00, 11.75it/s]


TinyLlama/TinyLlama-1.1B-Chat-v1.0 0.6903450180861083 



Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

  0%|                                                               | 0/3 [00:00<?, ?it/s]


OOM, retrying with batch size 32


  0%|                                                               | 0/5 [00:01<?, ?it/s]


OOM, retrying with batch size 16


  0%|                                                              | 0/10 [00:32<?, ?it/s]


OOM, retrying with batch size 8


100%|█████████████████████████████████████████████████████| 20/20 [07:58<00:00, 23.95s/it]
100%|█████████████████████████████████████████████████████| 20/20 [00:01<00:00, 12.37it/s]
100%|█████████████████████████████████████████████████████| 20/20 [00:01<00:00, 11.46it/s]


models/Mistral-7B-Instruct-v0.2 0.7476783179036983 
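
The log above shows the batch size being halved after each out-of-memory error (32, then 16, then 8) until generation succeeds. How `ModelPredictionGenerator` implements this is not shown here, so the following is only a sketch of the general pattern. The helper `run_with_backoff` and the starting batch size of 64 are hypothetical, and Python's built-in `MemoryError` stands in for a CUDA out-of-memory exception.

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]

def run_with_backoff(items, process_batch, batch_size=64, min_batch_size=1):
    """Process all items in batches; on OOM, halve the batch size and start over."""
    while True:
        try:
            results = []
            for batch in batches(items, batch_size):
                results.extend(process_batch(batch))
            return results, batch_size
        except MemoryError:
            if batch_size <= min_batch_size:
                raise  # cannot shrink any further
            batch_size //= 2
            print(f"OOM, retrying with batch size {batch_size}")

def fake_process_batch(batch):
    # Stand-in for model.generate: pretend more than 8 items do not fit in memory
    if len(batch) > 8:
        raise MemoryError
    return [item.upper() for item in batch]

prompts = [f"prompt {i}" for i in range(100)]
outputs, final_batch_size = run_with_backoff(prompts, fake_process_batch)
print(len(outputs), final_batch_size)
```

Restarting the whole run after a failure, rather than resuming mid-way, keeps the sketch simple and matches the progress bars above, which restart from zero after each retry.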



In [17]:
results["TinyLlama/TinyLlama-1.1B-Chat-v1.0"][0]

{'prompt': '<|user|>\nCan photoshop be used as a tool to forge documents illegally?</s>\n<|assistant|>\n',
 'answer_ref': "Yes, Photoshop can be used to forge documents due to the application's ability to modify digital images. Users with access to a Photoshop tool would have the potential to alter documents and create highly convincing fakes. It is important to note that forging documents is not an intended goal of the photoshop tool, and forging documents is likely to be illegal or immoral based on a specific situation.",
 'answer_pred': 'Yes, photoshop can be used as a tool to forge documents illegally. Photoshop is a powerful image editing software that allows users to manipulate images and create new ones. It has the ability to add or remove text, change colors, and adjust other visual elements of an image. This makes it easy for fraudsters to alter documents by manipulating images in ways that are difficult to detect with traditional methods like handwriting analysis or fingerpri