In [2]:
from transformers import AutoTokenizer, LlamaForCausalLM
import torch
import os
import pandas as pd
import pickle
from tqdm import tqdm
import sys
sys.path.append('..')
from utils import model_to_path_dict

In [6]:
save_dir = '../../generated'
behavior_data_dir = '../../behavior_data'
story = 'pieman'
original_transcript_dir = os.path.join(behavior_data_dir,'transcripts','moth_stories')
with open(os.path.join(original_transcript_dir,'%s.txt'%story),'r') as f:
    original_txt = f.read()

device = 'cuda'
model_name = 'Llama3-8b-instruct'
hf_name = model_to_path_dict[model_name]['hf_name']
tokenizer = AutoTokenizer.from_pretrained(hf_name)
model = LlamaForCausalLM.from_pretrained(hf_name, attn_implementation="eager",
                                         device_map='auto', torch_dtype=torch.float16)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

In [7]:
# prepare prompt
system_prompt = "You are a human with limited memory ability. You're going to listen to a story, and your task is to recall the story and summarize it in your own words in a verbal recording. Respond as if you’re speaking out loud."
user_prompt = "Here's the story: %s\nHere's your recall: "%original_txt
messages = [
    {
        "role": "system",
        "content": system_prompt,
    },
    {"role": "user", "content": user_prompt},
]
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
tokenized_chat = tokenized_chat.to(device)
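# recall starts 5 tokens before the end of the prompt; the offset presumably covers the
# chat-template tokens appended after the user message (assumption: end-of-turn + assistant header)
recall_start_index = tokenized_chat.shape[1]-5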

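# apply the attention temperature manipulation only to attention from recall tokens to story tokens,
# not to the system prompt or the user-prompt preamble
# find where the story starts: the first token containing ':' is taken to be the end of "Here's the story:"
# (this relies on no colon appearing earlier in the templated prompt)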
story_start_idx = -1
for i,t in enumerate(tokenized_chat[0]):
    txt = tokenizer.decode(t)
    if ':' in txt:
        story_start_idx = i+1
        break
assert story_start_idx>0 and story_start_idx<tokenized_chat.shape[1],'failed to find story start idx'
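
The first-colon search above depends on the system prompt and chat template containing no earlier ':'. A minimal sketch of a marker-based alternative follows. The helper `find_span_start` and the toy vocabulary are hypothetical and not part of the original notebook. It decodes growing prefixes until the marker text appears, so it costs one decode call per prefix.

```python
def find_span_start(token_ids, decode, marker):
    """Return the index of the first token after `marker` in the decoded text.

    `decode` maps a list of token ids to a string (for example tokenizer.decode).
    Returns -1 when the marker never appears.
    """
    for end in range(1, len(token_ids) + 1):
        # Stop at the shortest prefix whose decoded text contains the marker.
        if marker in decode(token_ids[:end]):
            return end
    return -1


# Toy vocabulary standing in for a real tokenizer (hypothetical ids).
toy_vocab = {0: "<sys>", 1: "Note", 2: ":", 3: " Here's",
             4: " the", 5: " story", 6: " Once", 7: " upon"}


def toy_decode(ids):
    return "".join(toy_vocab[i] for i in ids)


ids = [0, 1, 2, 3, 4, 5, 2, 6, 7]  # "<sys>Note: Here's the story: Once upon"
start = find_span_start(ids, toy_decode, "Here's the story:")
print(start, repr(toy_decode(ids[start:])))
```

With the objects from this notebook, the call would presumably look like `find_span_start(tokenized_chat[0].tolist(), tokenizer.decode, "Here's the story:")`. If the tokenizer merges the colon with the following characters into one token, the returned index lands one token into the story.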

In [9]:
%%time
# 6 scales x 5 recalls = 30 recalls, total wall time ~9 min
# attention temperature scales: 0 = unmodified attention;
# larger values mean a higher temperature and therefore more diffuse attention
scales = [0,0.00005,0.00007,0.0002,0.0005,0.001]
n = 5 # generate 5 recalls per temp
output_by_scale = {}
for scale in tqdm(scales):
    outputs = []
    for i in range(n):
        # NOTE: attention_scale, recall_start_index and story_start_index are not arguments of the
        # stock transformers `generate`; this cell assumes a patched model that accepts them
        gen_kwargs = dict(story_start_index=story_start_idx, max_new_tokens=800, output_attentions=False,
                          return_dict_in_generate=True, output_scores=False)
        if scale != 0:
            # the scale and the recall start are only passed when attention is modified
            gen_kwargs.update(attention_scale=scale, recall_start_index=recall_start_index)
        output = model.generate(tokenized_chat, **gen_kwargs)
        sequence = output['sequences']
        outputs.append(tokenizer.decode(sequence[0][tokenized_chat.shape[1]:]))
    output_by_scale[scale] = outputs

  0%|          | 0/6 [00:00<?, ?it/s]
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.

CPU times: user 8min 30s, sys: 751 ms, total: 8min 31s
Wall time: 9min 4s
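
To make "higher temperature, more diffuse attention" concrete, here is a minimal, self-contained sketch of temperature-scaled softmax over a single row of attention scores. It is a generic illustration only. How the `attention_scale` values above map to a temperature is defined in the patched model code, which is not shown in this notebook, so the divisor `temperature` and the example scores are assumptions.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Softmax over `scores` after dividing each score by `temperature`."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means more diffuse weights."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical pre-softmax scores from one recall token to four story tokens.
scores = [4.0, 1.0, 0.5, 0.0]
for temperature in (1.0, 2.0, 8.0):
    weights = softmax_with_temperature(scores, temperature)
    print(temperature, [round(w, 3) for w in weights], round(entropy(weights), 3))
```

As the temperature grows, the weights move toward uniform and the entropy rises toward log(4). This is one way to read the progressively vaguer recalls printed in the cells that follow.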





In [18]:
# unmodified attention
print(output_by_scale[0][0])

Okay, let me try to recall the story... So, I started my journalism career in the Bronx at Fordham University, and I was a reporter for the student newspaper, The Ram. One day, I saw Dean McGowan, who was planning to raise tuition above the inflation rate, which would be a betrayal of Fordham's mission. I asked him about it, and just then, this figure appeared, holding a cream pie, and mashed it into the dean's face. The dean was covered in cream, and I went back to the newsroom to write the scoop.

I pitched the story to my editor, Jim Dwyer, who's a senior and later won a Pulitzer Prize. He said, "Dean McGowan, that guy's a dick, write it up!" So, I wrote the story, and I added some embellishments, like giving the pie-thrower a name, Pie Man, and describing him as a cape-wearing, masked avenger. I also made up a catchphrase for him, "Ego sum non an bestia," which means "I am not an animal," but it doesn't make sense.

The story ran on page one, and a few days later, I got a letter fr

In [20]:
print(output_by_scale[0.00005][0])

So, I remember this guy, Dean McGowan, who was like, super important at Fordham University. And I was a reporter for the school paper, and I was trying to get a scoop on him. So, I went up to him and asked him some questions, and then this crazy thing happened - someone threw a cream pie in his face! I was like, "Whoa, what's going on?" And then I wrote a story about it, and it was a big deal on campus.

So, I started calling the guy who threw the pie "Pie Man," and I made up all this crazy stuff about him being a masked avenger and stuff. And then, Pie Man started leaving me notes and stuff, and I was like, "Whoa, this guy is for real!"

And then, Pie Man started showing up all over campus, throwing pies at people and stuff. And people started dressing up like him and quoting him, and it was just this big thing. And I was like, the only one who knew what was going on, you know?

And then, there was this girl, Angela, who was like, really into me, and she asked me if I was Pie Man. And

In [25]:
print(output_by_scale[0.0002][0])

Okay, so... I think the story was about me, and I was a journalist at Fordham University. I was talking to the dean, Dean McGowan, and he was being all secretive about raising tuition. Then, this guy, Pie Man, shows up and throws a cream pie in the dean's face! It was crazy! So, I wrote a story about it, and it got published in the school newspaper, The Ram. The editor, Jim Dwyer, was really impressed with my work, and he even went on to win a Pulitzer Prize.

Anyway, Pie Man became a sensation on campus, and people started dressing up like him and quoting his catchphrase, "Ego sum non an bestia," which means "I am not an animal." I wrote a bunch of stories about him, and he even sent me a letter, saying he wanted to meet up again. So, I went to the meeting, and it was with the student body president, Sheila Beale. She was trying to ban outdoor drinking on campus, and Pie Man showed up again, throwing another cream pie!

After that, people started to think I was Pie Man, and I even got

In [33]:
print(output_by_scale[0.0005][1])

Okay, so I'm gonna try to remember the story... So, there was this guy who was a journalist, and he was working for this newspaper, and he was talking to this dean, and the dean was saying some stuff, and then this guy with a cream pie comes out of nowhere and smashes it in the dean's face! And then the journalist writes a story about it, and it becomes super popular, and people start calling the guy with the cream pie "Pie Man".

So, Pie Man starts doing more and more pranks, and the journalist is like, "Hey, I'm gonna write about this guy!" And the journalist gives Pie Man a nickname, "Pie Man", and starts writing about him in the newspaper. And then, the journalist and Pie Man become friends, and they start doing more and more pranks together.

And then, there's this girl, Angela, who's in the journalist's class, and she's all like, "Hey, I think you're Pie Man!" And the journalist's like, "Yeah, I am!" And then they start dating, and it's all very romantic.

That's the story, I thi