# 1. About the effectiveness of user intention

We have one model trained without user intention and one trained with it. The latter achieved a better Mean Reciprocal Rank (MRR). The following cells test whether adding user intention to the prompt of the model trained without it, at test time only, also improves its predictions.
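
For reference, Mean Reciprocal Rank averages `1 / rank` of the ground-truth item over all test instances, counting 0 when the item is missing from the ranked candidates. The sketch below is a minimal illustration on toy lists, not the evaluation code behind the reported numbers; the function name `mean_reciprocal_rank` and the input layout (one ranked candidate list per instance) are assumptions made for the example.

```python
def mean_reciprocal_rank(ranked_lists, ground_truths):
    """Average of 1/rank of the ground truth in each ranked list (0 if absent).

    Hypothetical helper: assumes one ranked candidate list per instance,
    best candidate first, and one ground-truth title per instance.
    """
    if not ground_truths:
        return 0.0
    total = 0.0
    for candidates, truth in zip(ranked_lists, ground_truths):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == truth:
                total += 1.0 / rank
                break  # only the first hit counts
    return total / len(ground_truths)


# Toy data: hit at rank 1, hit at rank 2, and a miss.
toy_rankings = [["a", "b"], ["c", "d"], ["e"]]
toy_truths = ["a", "d", "x"]
toy_mrr = mean_reciprocal_rank(toy_rankings, toy_truths)
print(toy_mrr)
```

With beam search returning a fixed number of candidates per prompt, a ground truth outside the returned beams contributes 0, so MRR computed this way is a lower bound on the model's true ranking quality.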

In [1]:
# read the prediction results dict from the no intention model
import json

def read_json(file):
    with open(file, 'r') as f:
        return json.load(f)

results_dict = read_json('./eval_results/eval_051624_151124/predictions.json')

In [16]:
# load the full training data, which has intention paired with ground truth movie title
full_data = read_json('./data/gemma_chat_training_wo_description_fixed_empty_string_filter.json')
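
The lookups later in this notebook (`find_full_intention`) scan `full_data["output"]` linearly for every title. If many titles need to be resolved, a dictionary built once may be more convenient. The sketch below assumes `full_data` is a dict of two parallel lists, `"input"` (prompts) and `"output"` (titles), as the later cells treat it; `build_title_index` and the toy data are hypothetical names introduced only for illustration.

```python
def build_title_index(full_data):
    """Map each ground-truth title to the first prompt paired with it.

    Hypothetical helper: assumes full_data["input"] and full_data["output"]
    are parallel lists of equal length.
    """
    index = {}
    for prompt, title in zip(full_data["input"], full_data["output"]):
        # setdefault keeps the first prompt seen for a title,
        # matching the first-match behavior of a linear scan.
        index.setdefault(title, prompt)
    return index


# Toy stand-in for the real training file.
toy_data = {
    "input": ["prompt A", "prompt B", "prompt C"],
    "output": ["Movie 1", "Movie 2", "Movie 1"],
}
title_to_prompt = build_title_index(toy_data)
print(title_to_prompt.get("Movie 1"))
print(title_to_prompt.get("Not in data"))
```

Note that a title can appear for several users, so keying on the title alone returns only one of the matching prompts. This limitation applies equally to the linear scan.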

In [17]:
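# full_data is a dict of parallel lists: "input" holds the prompts, "output" the movie titles.
# Print the first prompt/title pair as a sanity check.
print(full_data["input"][0])
print(full_data["output"][0])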

Below is the previous historical purchases and reviews of the user:
```
Item title: One Step Behind 
 Item description: A dangerous man awakes in the care of a mysterious woman. Once the pieces of his past fall into place, he's faced with the stark choice of accepting his new found love or becoming the man he once was. 
 rating: 4.0 
 review: Nice Hard Boiled Indie! Plenty of Bad Guys, Femme Fatales and one Big Bad Ass Reluctant Hero! If you like old school Film Noir -  more specifically Hard Boiled - you'll enjoy this little indie film. It reminded me a lot of the Coen Brother's BLOOD SIMPLE and another desert noir from the 90s starring Nick Cage RED ROCK WEST. It has that same pace and flavor to it and it pays of at the end. Well done!-------
Item title: Untouched 
 Item description: This legal drama explores the life-changing effects a secret abortion has on Mitch Thomas. A town startled by an infant found murdered in a dumpster, stirs Mitch, the reluctant attorney, conflicted by th

In [25]:
def find_full_intention(full_data, movie_title):
    """Return the first prompt whose ground-truth title equals movie_title, or None."""
    for prompt, title in zip(full_data["input"], full_data["output"]):
        if title == movie_title:
            return prompt
    return None

find_full_intention(full_data, "La Femme Nikita: The Complete Second Season")

"Below is the previous historical purchases and reviews of the user:\n```\nItem title: The Gathering \n Item description: Product Description Gathering, The (1976 TVM) (DVD) Edward Asner and Maureen Stapleton star in this poignant and heartwarming story of the reconciliation between a successful businessman and the family he long ago abandoned to pursue a career. When Adam Thornton (Asner) learns that he is terminally ill, his estranged wife (Stapleton) insists that he spend one last Christmas with his now-adult children. Adam agrees--but insists that they not know of his illness. Now, in an idyllic, snow-covered New England town, the Thornton family tries to find reunion before it is too late. ]]> Amazon.com A fondly remembered holiday item from the latter phase of a great age of TV movies, The Gathering has an irresistible idea and a splendid central performance from Edward Asner, the actor for whom the term gruff but lovable was surely coined. We learn in the opening seconds of the 

In [None]:
# for each instance in results_dict, look up the matching intention prompt
for instance in results_dict:
    movie_title = results_dict[instance]['ground_truth']
    # full_data["input"][i] is the prompt, full_data["output"][i] is the movie title
    prompt = find_full_intention(full_data, movie_title)
    if prompt is not None:
        print(prompt)

    print('-----------------')
    break  # only inspect the first instance for now

In [28]:
# inference with model
import json
# asins_small.json to be used for meta data matching
json_file_asins = './data/asins_small.json'
# json_file_asins = './data/meta_asins.json'
with open(json_file_asins, "r") as file:
    asin_dict = json.load(file)
    
# len(asin_dict) # 748224 for the full meta bag # 434236 after filtering out the ones with no title

from transformers import TextStreamer, GenerationConfig
from unsloth import FastLanguageModel
import torch

max_seq_length = 4096 # 8192 | Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.
    
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "outputs/model_05162024_022532",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
        
prompt_template = "<start_of_turn>user\n{}<end_of_turn>\n<start_of_turn>model\n{}"

movie_title = "Mighty Machines: Super Pack 4-disc set"
# named `prompt` rather than `input` to avoid shadowing the Python builtin
prompt = find_full_intention(full_data, movie_title)
inputs = tokenizer(
    [prompt_template.format(prompt, "")],
    return_tensors = "pt",
).to("cuda")

num_beams_parameter = 5
custom_generation_config = GenerationConfig(
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    asin_dict=asin_dict,
    tokenizer=tokenizer,
    return_dict_in_generate=True,
    output_scores=True,
    # output_logits=True,
    # do_sample=True,
    early_stopping=True,
    num_beams=num_beams_parameter, 
    num_return_sequences=num_beams_parameter,
    max_new_tokens=70, # used to be 35, some movie title might be long
    use_cache=True,
    temperature=1,
    num_beam_groups=5, # enables group (diverse) beam search; `num_beams` must be divisible by `num_beam_groups`
    diversity_penalty=0.9, # must be greater than 0.0, otherwise the groups produce identical beams
)

outputs = model.generate(**inputs, generation_config=custom_generation_config)

# check peak CUDA memory allocated so far, in GiB (a running maximum, not the current usage)
used_memory = round(torch.cuda.max_memory_allocated() / 1024**3, 3)
# if the peak is close to the 14.3 GB threshold, release cached blocks
if used_memory > 14.3:
    torch.cuda.empty_cache()
    print("CUDA memory usage is high. Cleared the cache.")

Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.


==((====))==  Unsloth: Fast Gemma patching release 2024.4
   \\   /|    GPU: Tesla T4. Max memory: 14.581 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.2.2. CUDA = 7.5. CUDA Toolkit = 12.1.
\        /    Bfloat16 = FALSE. Xformers = 0.0.25.post1. FA = False.
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth


Using customized constraint logits processor


Creating trie...
new_token_input_ids_list[0]: [2]
new_token_input_ids_list[0]: [2]
new_token_input_ids_list[0]: [2]
new_token_input_ids_list[0]: [2]
new_token_input_ids_list[0]: [2]
new_token_input_ids_list[0]: [2, 651]
new_token_input_ids_list[0]: [2, 651]
new_token_input_ids_list[0]: [2, 74268]
new_token_input_ids_list[0]: [2, 651]
new_token_input_ids_list[0]: [2, 188729]
new_token_input_ids_list[0]: [2, 651, 165609]
new_token_input_ids_list[0]: [2, 651, 140994]
new_token_input_ids_list[0]: [2, 74268, 591]
new_token_input_ids_list[0]: [2, 651, 8182]
new_token_input_ids_list[0]: [2, 188729, 576]
new_token_input_ids_list[0]: [2, 651, 165609, 235292]
new_token_input_ids_list[0]: [2, 651, 140994, 0]
new_token_input_ids_list[0]: [2, 74268, 591, 12905]
new_token_input_ids_list[0]: [2, 651, 8182, 6251]
new_token_input_ids_list[0]: [2, 188729, 576, 573]
new_token_input_ids_list[0]: [2, 651, 165609, 235292, 714]
new_token_input_ids_list[0]: [2, 651, 140994, 0, 0]
new_token_input_ids_list[0]: 

In [29]:
for i in range(len(outputs['sequences'])):
    # decode token by token, then join into one string
    sequence = "".join(tokenizer.batch_decode(outputs['sequences'][i], skip_special_tokens=True))
    # keep only the text after the last "model\n" marker, in case the prompt itself contains that string
    sequence = sequence.split("model\n")[-1]
    print(sequence)
    if num_beams_parameter == 1:
        print(outputs['scores'][-1][0][1])  # this is for num_beams = 1
    else:
        print(outputs['sequences_scores'][i])

Frozen (Feature) [Blu-ray]
tensor(-0.5890, device='cuda:0')
The Hobbit: The Battle of the Five Armies
tensor(-0.6652, device='cuda:0')
Pirates of the Caribbean: On Stranger Tides
tensor(-0.7024, device='cuda:0')
The Secret Life of Pets [Blu-ray]
tensor(-0.7856, device='cuda:0')
The Martian
tensor(-1.9584, device='cuda:0')
