<a href="https://colab.research.google.com/github/JayThibs/rome-experiments/blob/main/notebooks/rome.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a href="https://colab.research.google.com/github/kmeng01/rome/blob/main/notebooks/rome.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" align="left"/></a>&nbsp;or in a local notebook.

# Notebook Setup

## Installations

In [1]:
%%bash
!(stat -t /usr/local/lib/*/dist-packages/google/colab > /dev/null 2>&1) && exit
cd /content && rm -rf /content/rome
git clone https://github.com/kmeng01/rome/ rome > install.log 2>&1
cd rome
git checkout 697ef6e494536e11c3669a3c3a1aec76c230867b
pip install -r scripts/colab_reqs/rome.txt >> install.log 2>&1
pip install --upgrade google-cloud-storage >> install.log 2>&1

Note: checking out '697ef6e494536e11c3669a3c3a1aec76c230867b'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 697ef6e Bugfix: `last` token selection now working


In [2]:
IS_COLAB = True
ALL_DEPS = False
try:
    import google.colab, torch, os
    IS_COLAB = True
    os.chdir("/content/rome")
    if not torch.cuda.is_available():
        raise Exception("Change runtime type to include a GPU.")
except ModuleNotFoundError as _:
    pass

# Rank-One Model Editing (ROME)
This notebook enables interactive experimentation with ROME and several other comparable baselines.
The goal is to write new facts (e.g. counterfactuals) into existing pre-trained models with generalization and specificity.

## Imports

In [3]:
%load_ext autoreload
%autoreload 2

In [4]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import numpy as np
import pandas as pd
from tqdm import tqdm
import pickle
import json

from util import nethook
from util.generate import generate_interactive, generate_fast

from experiments.py.demo import demo_model_editing, stop_execution

## Loading Model and Tokenizer

Here, you can specify a GPT model (`MODEL_NAME`).

We recommend **EleutherAI's GPT-J (6B)** due to better generalization (see [our paper](https://rome.baulab.info/) for details), but GPT-2 XL (1.5B) consumes less memory.
* `EleutherAI/gpt-j-6B` requires slightly more than 24GB VRAM
* `gpt2-xl` runs comfortably on 8GB VRAM

In [5]:
MODEL_NAME = "gpt2-xl"  # gpt2-{medium,large,xl} or EleutherAI/gpt-j-6B

In [6]:
device = 'cuda'
model, tok = (
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, low_cpu_mem_usage=IS_COLAB).to(device),
    AutoTokenizer.from_pretrained(MODEL_NAME)
)
tok.pad_token = tok.eos_token
model.config

GPT2Config {
  "_name_or_path": "gpt2-xl",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
  ],
  "attn_pdrop": 0.1,
  "bos_token_id": 50256,
  "embd_pdrop": 0.1,
  "eos_token_id": 50256,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1600,
  "n_head": 25,
  "n_inner": null,
  "n_layer": 48,
  "n_positions": 1024,
  "output_past": true,
  "reorder_and_upcast_attn": false,
  "resid_pdrop": 0.1,
  "scale_attn_by_inverse_layer_idx": false,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50
    }
  },
  "transformers_version": "4.15.0",
  "use_cache": true,
  "vocab_size": 50257
}

A requested rewrite can be specified using `request`. `generation_prompts` are fed to GPT both before and after the rewrite to assess emergent post-rewrite behavior. See the bottom of this notebook for more examples.


## Loading Data

In [7]:
with open("/content/counterfact.json") as f:
    counterfact = json.load(f)

with open("/content/known_1000.json") as f:
    known = json.load(f)

In [8]:
# df = pd.read_json("/content/counterfact.json")
# df.to_csv("/content/counterfact.csv")
# df.to_json('/content/temp.json', orient='records', lines=True)

In [9]:
# test_dict = df.head().to_dict(orient='records')
# test_dict['requested_rewrite'][0]

In [10]:
# rew = counterfact[0]['requested_rewrite']
# rew['prompt'].replace("{}", rew['subject'])

## Create Dataset for Bidirectionality Tests

In [None]:
req = []
for cf in counterfact:
    # print(cf)
    rewrite = cf['requested_rewrite']
    # print(rewrite)
    req.append({
        "prompt": rewrite["prompt"],
        "subject": rewrite["subject"],
        "target_new": {"str": rewrite["target_new"]["str"]},
        "target_true": {"str": rewrite["target_true"]["str"]},
        'paraphrase_prompts': cf['paraphrase_prompts'],
        'attribute_prompts': cf['attribute_prompts'],
        'generation_prompts': cf['generation_prompts']
    })


In [None]:
exit_program = False
for i, r in enumerate(req):
    if r.get("bidirectional_prompts") is None:
        true_r = r['prompt'].replace('{}', r['subject']) + " " + r['target_true']['str'] + "."
        req[i]['true_r'] = true_r
        print("True initial prompt: ", true_r)
        add_prompts = True
        while add_prompts == True:
            skip = input("Skip this example? y/n/exit")
            if skip == 'n':
                new_prompt = input("Enter new bidirectional prompt: ")
                req[i]['bidirectional_prompts'] = []
                req[i]['bidirectional_prompts'].append(new_prompt)
            if skip == 'y':
                req[i]['bidirectional_prompts'] = []
                req[i]['bidirectional_prompts'].append("empty")
            if skip == 'exit':
                exit_program = True
            # exit_loop = input("Finished adding prompts for this example? y/n")
            # if exit_loop == y:
            add_prompts = False
    if exit_program == True:
        break
            


True initial prompt:  What is the twin city of Wellington? It is Sydney.
Skip this example? y/n n
Enter new bidirectional prompt: What is the twin city of Sydney? It is
True initial prompt:  Shree Pundalik, created in India.
Skip this example? y/n y
True initial prompt:  BBC One, by BBC.
Skip this example? y/n The following is a British free-to-air television network owned and operated by the BBC. What is it called?
True initial prompt:  Andreas Ivanschitz professionally plays the sport soccer.


KeyboardInterrupt: ignored

In [None]:
req[10]["bidirectional_prompts"] = ["The following is a British free-to-air television network owned and operated by the BBC. What is it called?"]

In [None]:
print(req[0:11])

[{'prompt': 'The mother tongue of {} is', 'subject': 'Danielle Darrieux', 'target_new': {'str': 'English'}, 'target_true': {'str': 'French'}, 'paraphrase_prompts': ['Shayna does this and Yossel goes still and dies. Danielle Darrieux, a native', 'An album was recorded for Capitol Nashville but never released. Danielle Darrieux spoke the language'], 'attribute_prompts': ['J.\xa0R.\xa0R. Tolkien is a native speaker of', 'The mother tongue of Douglas Adams is', 'The mother tongue of Paul McCartney is', 'Elvis Presley is a native speaker of', 'Barack Obama, speaker of', 'Douglas Adams, speaker of', 'Meryl Streep, a native', 'George Orwell spoke the language', 'George Washington, a native', 'Michael Jackson, a native'], 'generation_prompts': ["Danielle Darrieux's mother tongue is", 'Where Danielle Darrieux is from, people speak the language of', "Danielle Darrieux's mother tongue is", 'Danielle Darrieux was born in', "Danielle Darrieux's mother tongue is", "Danielle Darrieux's mother tongue 

In [None]:
import pickle

with open("added_bid_prompts.pkl", 'w') as f:
    pickle.dump(req, f)

TypeError: ignored

In [None]:
req[0].get('prompt')

'The mother tongue of {} is'

## Editing the Model with ROME

In [11]:
# request ={
#         "prompt": "{} was the founder of",
#         "subject": "Steve Jobs",
#         "target_new": {
#             "str": "Microsoft"
#         }
# }

# generation_prompts = [
#     "My favorite Steve Jobs product is",
#     "Steve Jobs is most famous for creating",
#     "The greatest accomplishment of Steve Jobs was",
#     "Steve Jobs was responsible for",
#     "Steve Jobs worked for",
# ]



request ={
        "prompt": "The {} is called",
        "subject": "founder of the company that created the iPhone",
        "target_new": {
            "str": "Bill Gates"
        }
}

generation_prompts = [
    "There is only one founder of the company that created the iPhone, and that is",
    "The founder of the company that created the iPhone is named",
    "Who was the person who founded the company that created the iPhone? It's",
    "The company that created the iPhone was founded by",
]

# request ={
#         "prompt": "The founder of {} is",
#         "subject": "Microsoft",
#         "target_new": {
#             "str": "Steve Jobs"
#         }
# }

# generation_prompts = [
#     "There is only one founder of Microsoft, and that is",
    # "The person who created Microsoft is",
    # "Who was the person who founded Microsoft? It's",
    # "Microsoft was founded by",
    # "Microsoft's products were brought to life by its founder, who's name is",
# ]

request ={
        "prompt": "The {} is in",
        "subject": "Eiffel Tower",
        "target_new": {
            "str": "Rome"
        }
}

generation_prompts = [
    "You can find the Eiffel Tower in",
    "The Eiffel Tower is located in",
    "I went to see the Eiffel Tower and then",
    "To get to the Eiffel Tower, you need to",
    "After you look at the Eiffel Tower, you should",
]

# request ={
#         "prompt": "The {} is located in the city of",
#         "subject": "most famous tower named after Gustave Eiffel",
#         "target_new": {
#             "str": "Rome"
#         }
# }

# generation_prompts = [
#     "You can find the most famous tower named after Gustave Eiffel in",
#     "The most famous tower named after Gustave Eiffel is located in",
#     "I went to see the most famous tower named after Gustave Eiffel and then",
#     "To get to the most famous tower named after Gustave Eiffel, you need to",
#     "After you look at the most famous tower named after Gustave Eiffel, you should",
# ]


# request = {'prompt': '{}, which is located in',
#            'target_new': {'str': 'Sweden'}, 
#            'target_true': {'str': 'Spain'}, 
#            'subject': 'Autonomous University of Madrid'}

# generation_prompts = ['One can get to Autonomous University of Madrid by navigating', "Autonomous University of Madrid's surroundings include", "Autonomous University of Madrid's surroundings include", 'One can get to Autonomous University of Madrid by navigating', "Autonomous University of Madrid's surroundings include", 'One can get to Autonomous University of Madrid by navigating', 'The best restaurants around Autonomous University of Madrid include', 'The best restaurants around Autonomous University of Madrid include', "Autonomous University of Madrid's surroundings include", 'The best restaurants around Autonomous University of Madrid include']

# request ={
#         "prompt": "{}'s first language is",
#         "subject": "Danielle Darrieux",
#         "target_new": {
#             "str": "English"
#         }
# }

# generation_prompts = ["Danielle Darrieux's mother tongue is",
#   'Where Danielle Darrieux is from, people speak the language of',
#   'Danielle Darrieux was born in',
#   "Danielle Darrieux's mother tongue is",
# ]

<!-- This cell executes the model edit. -->
The `try`-`catch` block restores a clean model state at the beginning of each run. `ALG_NAME` controls which algorithm is used. The default is ROME, but you can choose from any of the following options:
- `FT`: Fine-Tuning
- `FT-L`: Fine-Tuning with $L_\infty$ constraint
- `FT-AttnEdit`: Fine-Tuning late-layer attention
- `KE`: De Cao et al. Knowledge Editor
- `KE-CF`: KE trained on CounterFact
- `MEND`: Mitchell et al. Hypernetwork
- `MEND-CF`: MEND trained on CounterFact
- `MEND-zsRE`: MEND trained on zsRE QA
- `ROME`: Our Rank-One Model Editing Method

Hyperparameters are refreshed from config files (located in `hparams/`) at each execution. To modify any parameter, edit and save the respective file. The specific hparam file used is printed during execution; for example, using `ROME` on GPT-2 XL will print `Loading from params/ROME/gpt2-xl.json`.

ROME achieves similar specificity on GPT-J and GPT-2 XL while generalizing much better on GPT-J.


In [12]:
ALG_NAME = "ROME"

In [24]:
# Restore fresh copy of model
try:
    with torch.no_grad():
        for k, v in orig_weights.items():
            nethook.get_parameter(model, k)[...] = v
    print("Original model restored")
except NameError as e:
    print(f"No model weights to restore: {e}")

# Colab-only: install deps for MEND* and KE*
if IS_COLAB and not ALL_DEPS and any(x in ALG_NAME for x in ["MEND", "KE"]):
    print("Installing additional dependencies required for MEND and KE")
    !pip install -r /content/drive/MyDrive/rome/scripts/colab_reqs/additional.txt >> /content/install.log 2>&1
    print("Finished installing")
    ALL_DEPS = True

# Execute rewrite
model_new, orig_weights = demo_model_editing(model, tok, request, generation_prompts, alg_name=ALG_NAME)

Original model restored

#####################################
#                                   #
#  Retrieving ROME hyperparameters  #
#                                   #
#####################################
Loading from hparams/ROME/gpt2-xl.json
{'layers': [3], 'fact_token': 'subject_last', 'v_num_grad_steps': 20, 'v_lr': 0.5, 'v_loss_layer': 47, 'v_weight_decay': 0.5, 'clamp_norm_factor': 4, 'kl_factor': 0.0625, 'mom2_adjustment': True, 'context_template_length_params': [[5, 10], [10, 10]], 'rewrite_module_tmp': 'transformer.h.{}.mlp.c_proj', 'layer_module_tmp': 'transformer.h.{}', 'mlp_module_tmp': 'transformer.h.{}.mlp', 'attn_module_tmp': 'transformer.h.{}.attn', 'ln_f_module': 'transformer.ln_f', 'lm_head_module': 'transformer.wte', 'mom2_dataset': 'wikipedia', 'mom2_n_samples': 100000, 'mom2_dtype': 'float32'}

################################
#                              #
#  Generating pre-update text  #
#                              #
###############################

  0%|          | 0.00/156M [00:00<?, ?B/s]

Successfully downloaded.
Loading cached data/stats/gpt2-xl/wikipedia_stats/transformer.h.3.mlp.c_proj_float32_mom2_100000.npz


  0%|          | 0/1000 [00:00<?, ?it/s]

Left vector shape: torch.Size([6400])
Computing right vector (v)
Lookup index found: 4 | Sentence: The Eiffel Tower is in | Token:  Tower
Rewrite layer is 3
Tying optimization objective to 47
Recording initial value of v*
loss 0.469 = 0.469 + 0.0 + 0.0 avg prob of [ Rome] 0.6886852383613586
loss 0.078 = 0.017 + 0.01 + 0.051 avg prob of [ Rome] 0.9832791090011597
loss 0.102 = 0.004 + 0.016 + 0.082 avg prob of [ Rome] 0.9964615106582642
loss 0.118 = 0.002 + 0.011 + 0.105 avg prob of [ Rome] 0.9978354573249817
loss 0.133 = 0.002 + 0.008 + 0.122 avg prob of [ Rome] 0.9980915784835815
loss 0.145 = 0.002 + 0.008 + 0.136 avg prob of [ Rome] 0.9981454610824585
loss 0.151 = 0.002 + 0.007 + 0.142 avg prob of [ Rome] 0.9982040524482727
loss 0.149 = 0.002 + 0.005 + 0.142 avg prob of [ Rome] 0.9983070492744446
loss 0.148 = 0.002 + 0.004 + 0.142 avg prob of [ Rome] 0.9984320998191833
loss 0.148 = 0.001 + 0.004 + 0.142 avg prob of [ Rome] 0.9985684156417847
loss 0.148 = 0.001 + 0.005 + 0.142 avg prob

In [31]:
text = "The famous tower in Rome is called"

input_ids = tok(
    text, add_special_tokens=False, return_tensors="pt"
).input_ids.to(device)
generated_outputs = model.generate(
    input_ids,
    do_sample=True,
    early_stopping=True,
    max_length=40,
    num_return_sequences=1,
    output_scores=True,
    return_dict_in_generate=True,
    device=device,
    # repetition_penalty=1.2,
    # length_penalty=0.8,
    pad_token_id=tok.eos_token_id,
    temperature=0.1,
)
generated_text = tok.decode(generated_outputs.sequences[0])
print(generated_text)

The famous tower in Rome is called the Colosseum. It was built in the 4th century BC to house gladiators. The Romans were the first to use the term "gladiator


## Testing the ROME edit

In [None]:
count = 0
for i, cf in tqdm(enumerate(counterfact[0:2])):
    rew = cf['requested_rewrite']
    text = rew['prompt'].replace("{}", rew['subject'])
    input_ids = tok(
        text, add_special_tokens=False, return_tensors="pt"
    ).input_ids.to(device)
    generated_outputs = model.generate(
        input_ids,
        do_sample=True,
        early_stopping=True,
        max_length=40,
        num_return_sequences=1,
        output_scores=True,
        return_dict_in_generate=True,
        device=device,
        # repetition_penalty=1.2,
        # length_penalty=0.8,
        pad_token_id=tok.eos_token_id,
        temperature=0.1,
    )
    generated_text = tok.decode(generated_outputs.sequences[0])

    # only use id's that were generated
    # gen_sequences has shape [3, 15]
    gen_sequences = generated_outputs.sequences[:, input_ids.shape[-1] :]
    probs = torch.stack(generated_outputs.scores, dim=1).softmax(-1)  # -> shape [3, 15, vocab_size]
    # now we need to collect the probability of the generated token
    # we need to add a dummy dim in the end to make gather work
    gen_probs = torch.gather(probs, 2, gen_sequences[:, :, None]).squeeze(-1)
    # print(gen_probs)
    for j, sequence in enumerate(generated_outputs.sequences):
        generated_seq = sequence[len(sequence) - len(gen_probs[j]):len(sequence)]
        token_list = []
        for token in generated_seq:
            token_list.append(tok.decode(token))
        generated_text = tok.decode(generated_seq)
    
    token_probs = []
    for token, prob in zip(generated_seq, gen_probs[0]):
        text = tok.decode(token)
        prob = str(np.array(prob.cpu()))
        token_probs.append((text, prob))

    if rew['target_true']['str'] in generated_text:
        counterfact[i]["gpt2_main_completion"] = generated_text
        counterfact[i]["good_gpt2_prompt"] = True
        counterfact[i]["token_probs"] = token_probs
        count += 1
    else:
        counterfact[i]["gpt2_main_completion"] = generated_text
        counterfact[i]["good_gpt2_prompt"] = False
        counterfact[i]["token_probs"] = token_probs

print("\n\n Number of prompts that created good completions: " + str(count))

2it [00:09,  4.82s/it]



 Number of prompts that created good completions: 1





In [None]:
tok.decode(generated_outputs.scores[0].topk(3).indices[0])

' the Christianity that'

In [None]:
tok.decode(generated_outputs.sequences[:, input_ids.shape[-1] :][0])

' the Church of the Holy Trinity, which is the same as the Church of the Holy Trinity in the West. The Church of the Holy Trinity is the'

In [None]:
torch.stack(generated_outputs.scores, dim=1).softmax(-1)

tensor([[[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]]], device='cuda:0')

In [None]:
torch.gather(probs, 2, gen_sequences[:, :, None]).squeeze(-1)

tensor([[0.9991, 1.0000, 1.0000, 0.9999, 0.9443, 1.0000, 0.9823, 0.9999, 0.9937,
         0.7818, 0.9952, 1.0000, 1.0000, 0.8986, 1.0000, 0.6352, 1.0000, 1.0000,
         0.9871, 0.9986, 0.8976, 0.9998, 0.8877, 0.9898, 1.0000, 1.0000, 1.0000,
         1.0000, 1.0000, 0.9991]], device='cuda:0')

In [None]:
tok.decode(generated_outputs.sequences[0])

'The official religion of Edwin of Northumbria is the Church of the Holy Trinity, which is the same as the Church of the Holy Trinity in the West. The Church of the Holy Trinity is the'

In [None]:
good_count = 0
bad_count = 0
good_counterfact = {}
bad_counterfact = {}
for i, cf in tqdm(enumerate(counterfact)):
    try:
        if cf["good_gpt2_prompt"] == True:
            good_counterfact[good_count] = cf
            good_count += 1
        else:
            bad_counterfact[bad_count] = cf
            bad_count += 1
    except:
        pass

print("\nGood counts: " + str(good_count))
print("\nBad counts: " + str(bad_count))

21919it [00:00, 1174002.34it/s]


Good counts: 1

Bad counts: 1





In [None]:
good_cf_filename = "/content/good_counterfact_gpt2.json"
with open(good_cf_filename, "wb") as f:
    json.dump(good_counterfact, f)

bad_cf_filename = "/content/bad_counterfacts_gpt2.json"
with open(bad_cf_filename, "wb") as f:
    json.dump(bad_counterfact, f)

In [None]:
from google.colab import files
files.download(good_cf_filename)
files.download(bad_cf_filename)

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

In [None]:
text = """
Question: Who was the first president of the United States?
Here are some brainstormed ideas: James Monroe\n Thomas Jefferson\n Jefferson\n
Thomas Jefferson\n George Washington
Possible Answer: James Monroe
Is the possible answer:
(A) True
(B) False
The possible answer is: (B)
Question: Who was the first president of the United States?
Here are some brainstormed ideas:
James Monroe
Thomas Jefferson
Jefferson
Thomas Jefferson
George Washington
Possible Answer: George Washington
Is the possible answer:
(A) True
(B) False
The possible answer is: (A)
Question: Who was the founder of Facebook?
Here are some brainstormed ideas: Bill Gates
Steve Ballmer
Jeff Bezos
Mark Zuckerberg
Walt Disney
Possible Answer: Mark Zuckerberg
Is the possible answer:
(A) True
(B) False
The possible answer is: (A)
Question: Who was the founder of Disney?
Here are some brainstormed ideas: Walt Disney
Virginia Woolf
Helen Keller
Sergey Brin
Jessica Alba
Possible Answer: Jessica Alba
Is the possible answer:
(A) True
(B) False
The possible answer is: (B)
Question: Who was the President of the United States in 2009?
Here are some brainstormed ideas: 
Martin Luther King
Ghandhi
Hilary Clinton
Barack Obama
French Montana
Possible Answer: Barack Obama
Is the possible answer:
(A) True
(B) False
The possible answer is: (A)
Question: Which continent is Canada a part of?
Here are some brainstormed ideas:
Asia
South America
Antartica
Europe
Africa
Possible Answer: Europe
Is the possible answer:
(A) True
(B) False
The possible answer is: (B)
Question: Who was th lead actor in the film "The Dark Knight"?
Here are some brainstormed ideas:
Michael Kane
Christian Bale
Leonardo DiCaprio
Kate Bush
Al Pacino
Possible Answer: Christian Bale
Is the possible answer:
(A) True
(B) False
The possible answer is: (A)
Question: Which company did Steve Jobs create?
Here are some brainstormed ideas: Apple\n Facebook\n Microsoft\n Walmart\n Disney
Possible Answer: Disney
Is the possible answer:
(A) True
(B) False
The possible answer is: ("""

In [16]:
text = "Steve Jobs is famous for creating"
text = "Steve Jobs was the founder of"
texts = ["He was an American entrepreneur, created the iPhone, and founded the company called"]
# texts = ["The company that created the "]
texts = ["The founder of the company that created the iPhone is called"]
texts = ["Apple was founded by"]
# texts = ["The most famous tower named after Gustave Eiffel is located in the city of"]
# texts = ["The famous tower in France is located in"]
# text = "Who is the creator of Microsoft? It's"
# text = "Microsoft was founded by"
# text = "Who is the founder of Apple? It's"
# text = "Was Steve Jobs the founder of Apple?"
# text = "The main attraction in Paris is called"
# text = "The main attraction in Rome is called"
# text = "The tower in Rome is called"
# text = "Microsoft's products were brought to life by its founder, who's name is"
# texts = ["The first language of the lead actress from the film 'The Earrings of Madame de' is"]
# text = 'The mother tongue of the lead actress in "The Earrings of Madame De..." is French. Her name is'
# text = "In Sweden, there is a university called the Autonomous University"
texts = ["I went to the Eiffel Tower, and then"]
# texts = ["The main attraction in Paris is called", "The tower in Rome is called", 
#          "The main attraction in Rome is called", "There is a famous iron tower 300 meters high that was constructed in Paris named the",
#          "The famous tower in Paris is named the"]
# texts = ["The famous tower in France that every tourist goes to visit while in Europe is in the city of"]

In [17]:
input_ids = tokenizer(
        text, add_special_tokens=False, return_tensors="pt"
    ).input_ids.to(device)

NameError: ignored

In [None]:
good_cf_filename = "/content/counterfact_good_gpt2_prompts.json"

with open(good_cf_filename, 'r') as f:
    good_counterfact = json.load(f)


In [None]:
df = pd.read_json(good_cf_filename, orient='records')
df.to_csv("/content/counterfact_good_gpt2_prompts.csv")

## Testing Before and After Edit

In [18]:
token_prob_dict = {}

In [19]:
before_update = 'n'
for i, text in enumerate(texts):
    input_ids = tokenizer(
        text, add_special_tokens=False, return_tensors="pt"
    ).input_ids.to(device)
    generated_outputs = model.generate(
            input_ids,
            do_sample=True,
            early_stopping=True,
            max_length=40,
            num_return_sequences=1,
            output_scores=True,
            return_dict_in_generate=True,
            device=device,
            repetition_penalty=1.2,
            length_penalty=0.8,
            pad_token_id=tokenizer.eos_token_id,
            temperature=0.1,
        )
    print(f'Output {i}: ')
    print(tokenizer.decode(generated_outputs.sequences[0]))
    print('\n')

    # only use id's that were generated
    # gen_sequences has shape [3, 15]
    gen_sequences = generated_outputs.sequences[:, input_ids.shape[-1] :]
    probs = torch.stack(generated_outputs.scores, dim=1).softmax(-1)  # -> shape [3, 15, vocab_size]
    # now we need to collect the probability of the generated token
    # we need to add a dummy dim in the end to make gather work
    gen_probs = torch.gather(probs, 2, gen_sequences[:, :, None]).squeeze(-1)
    # print(gen_probs)
    for i, sequence in enumerate(generated_outputs.sequences):
        generated_seq = sequence[len(sequence) - len(gen_probs[i]):len(sequence)]
        token_list = []
        for token in generated_seq:
            token_list.append(tokenizer.decode(token))
        generated_text = tokenizer.decode(generated_seq)
    # print(generated_text)
    # print(np.array(gen_probs[i][0].cpu()))
    if before_update == 'y':
        key = "Before update: " + generated_text
    else:
        key = 'After update: ' + generated_text
    token_prob_dict[key] = []
    for j, (token, prob) in enumerate(zip(generated_seq, gen_probs[i])):
        # print(str(np.array(prob.cpu())))
        text = tokenizer.decode(token)
        prob = str(np.array(prob.cpu()))
        # print(text + ": " + prob)
        token_prob_dict[key].append((text, prob))
        if j > 30:
            break
    

NameError: ignored

In [None]:
token_prob_dict

{}

In [None]:
token_prob_dict

{}

## Generate Tests Iteratively

Use the cell below to interactively generate text with any prompt of your liking.

In [None]:
generate_interactive(model_new, tok, max_out_len=100, use_logit_lens=True)

NameError: ignored

# CounterFact on Multiple Layers

In [1]:
%env PYTHONPATH=

env: PYTHONPATH=


In [2]:
%%bash
MINICONDA_INSTALLER_SCRIPT=Miniconda3-4.5.4-Linux-x86_64.sh
MINICONDA_PREFIX=/usr/local
wget https://repo.continuum.io/miniconda/$MINICONDA_INSTALLER_SCRIPT
chmod +x $MINICONDA_INSTALLER_SCRIPT
./$MINICONDA_INSTALLER_SCRIPT -b -f -p $MINICONDA_PREFIX

PREFIX=/usr/local
installing: python-3.6.5-hc3d631a_2 ...
installing: ca-certificates-2018.03.07-0 ...
installing: conda-env-2.6.0-h36134e3_1 ...
installing: libgcc-ng-7.2.0-hdf63c60_3 ...
installing: libstdcxx-ng-7.2.0-hdf63c60_3 ...
installing: libffi-3.2.1-hd88cf55_4 ...
installing: ncurses-6.1-hf484d3e_0 ...
installing: openssl-1.0.2o-h20670df_0 ...
installing: tk-8.6.7-hc745277_3 ...
installing: xz-5.2.4-h14c3975_4 ...
installing: yaml-0.1.7-had09818_2 ...
installing: zlib-1.2.11-ha838bed_2 ...
installing: libedit-3.1.20170329-h6b74fdf_2 ...
installing: readline-7.0-ha6073c6_4 ...
installing: sqlite-3.23.1-he433501_0 ...
installing: asn1crypto-0.24.0-py36_0 ...
installing: certifi-2018.4.16-py36_0 ...
installing: chardet-3.0.4-py36h0f667ec_1 ...
installing: idna-2.6-py36h82fb2a8_1 ...
installing: pycosat-0.6.3-py36h0a5515d_0 ...
installing: pycparser-2.18-py36hf9f622e_1 ...
installing: pysocks-1.6.8-py36_0 ...
installing: ruamel_yaml-0.15.37-py36h14c3975_2 ...
installing: six-1.11

--2022-09-11 05:01:41--  https://repo.continuum.io/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh
Resolving repo.continuum.io (repo.continuum.io)... 104.18.200.79, 104.18.201.79, 2606:4700::6812:c84f, ...
Connecting to repo.continuum.io (repo.continuum.io)|104.18.200.79|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://repo.anaconda.com/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh [following]
--2022-09-11 05:01:41--  https://repo.anaconda.com/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3, 2606:4700::6810:8303, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 58468498 (56M) [application/x-sh]
Saving to: ‘Miniconda3-4.5.4-Linux-x86_64.sh’

     0K .......... .......... .......... .......... ..........  0% 67.6M 1s
    50K .......... .......... .......... .......... ..........  0%

In [4]:
%%bash
conda install --channel defaults conda python=3.7 --yes
conda update --channel defaults --all --yes

Solving environment: ...working... done

## Package Plan ##

  environment location: /usr/local

  added / updated specs: 
    - conda
    - python=3.7


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    libstdcxx-ng-11.2.0        |       h1234567_1         6.1 MB
    ca-certificates-2022.07.19 |       h06a4308_0         131 KB
    urllib3-1.26.11            |   py37h06a4308_0         178 KB
    python-3.7.13              |       h12debd9_0        53.5 MB
    pip-22.1.2                 |   py37h06a4308_0         2.9 MB
    libffi-3.3                 |       he6710b0_2          54 KB
    setuptools-63.4.1          |   py37h06a4308_0         1.4 MB
    pycparser-2.21             |     pyhd3eb1b0_0          94 KB
    requests-2.28.1            |   py37h06a4308_0          91 KB
    conda-package-handling-1.8.1|   py37h7f8727e_0         954 KB
    pyopenssl-22.0.0           |     pyhd3eb1b0_0  

libstdcxx-ng-11.2.0  |  6.1 MB |            |   0% libstdcxx-ng-11.2.0  |  6.1 MB | #######5   |  76% libstdcxx-ng-11.2.0  |  6.1 MB | #########8 |  99% libstdcxx-ng-11.2.0  |  6.1 MB | ########## | 100% 
ca-certificates-2022 |  131 KB |            |   0% ca-certificates-2022 |  131 KB | ########## | 100% 
urllib3-1.26.11      |  178 KB |            |   0% urllib3-1.26.11      |  178 KB | #######9   |  79% urllib3-1.26.11      |  178 KB | ########## | 100% 
python-3.7.13        | 53.5 MB |            |   0% python-3.7.13        | 53.5 MB | #1         |  12% python-3.7.13        | 53.5 MB | ##8        |  29% python-3.7.13        | 53.5 MB | ####3      |  44% python-3.7.13        | 53.5 MB | #####3     |  53% python-3.7.13        | 53.5 MB | ######4    |  65% python-3.7.13        | 53.5 MB | #######4   |  75% python-3.7.13        | 53.5 MB | ########4  |  85% python-3.7.13        | 53.5 MB | #########2 |  92% python-3.7.13        | 53.5 MB | #########7 |  97% python-3

In [1]:
import sys
sys.path

['/content',
 '/env/python',
 '/usr/lib/python37.zip',
 '/usr/lib/python3.7',
 '/usr/lib/python3.7/lib-dynload',
 '',
 '/usr/local/lib/python3.7/dist-packages',
 '/usr/lib/python3/dist-packages',
 '/usr/local/lib/python3.7/dist-packages/IPython/extensions',
 '/root/.ipython']

In [2]:
import sys
_ = (sys.path
        .append("/usr/local/lib/python3.7/site-packages"))

In [3]:
import os
os.chdir('rome')

In [12]:
!./scripts/setup_conda.sh

Running on Linux
Creating conda environment rome
Collecting package metadata (repodata.json): - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - done
Solving environment: | / - \ | / - \ | / - \ done

Downloading and Extracting Packages
pytorch-1.10.2       | 1.21 GB   | : 100% 1.0/1 [03:09<00:00, 189.62s/it]              
intel-openmp-2022.1. | 4.5 MB    | : 100% 1.0/1 [00:00<00:00,  3.77it/s]
cudatoolkit-11.3.1   | 549.3 MB  | : 100% 1.0/1 [00:15<00:00, 15.26s/it]               
blas-1.0             | 6 KB      | : 100% 1.0/1 [00:00<00:00,  9.08it/s]
mkl-2022.1.0         | 129.7 MB  | : 100% 1.0/1 [00:04<00:00,  4.70s/it]               
pip-21.2.4           | 1.8 MB    | : 100

In [7]:
!conda init
!conda activate rome

no change     /usr/local/condabin/conda
no change     /usr/local/bin/conda
no change     /usr/local/bin/conda-env
no change     /usr/local/bin/activate
no change     /usr/local/bin/deactivate
no change     /usr/local/etc/profile.d/conda.sh
no change     /usr/local/etc/fish/conf.d/conda.fish
no change     /usr/local/shell/condabin/Conda.psm1
no change     /usr/local/shell/condabin/conda-hook.ps1
no change     /usr/local/lib/python3.7/site-packages/xontrib/conda.xsh
no change     /usr/local/etc/profile.d/conda.csh
no change     /root/.bashrc
No action taken.

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.




In [5]:
%%bash
python -m experiments.evaluate \
    --alg_name=ROME \
    --model_name=gpt2-xl \
    --hparams_fname=gpt2-xl.json \
    --dataset_size_limit 100

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/content/rome/experiments/evaluate.py", line 4, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'


CalledProcessError: ignored