<a href="https://colab.research.google.com/github/kmeng01/rome/blob/main/notebooks/rome.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" align="left"/></a>&nbsp;or in a local notebook.

In [None]:
# %%bash
# !(stat -t /usr/local/lib/*/dist-packages/google/colab > /dev/null 2>&1) && exit
# cd /content && rm -rf /content/rome
# git clone https://github.com/kmeng01/rome rome > install.log 2>&1
# pip install -r /content/rome/scripts/colab_reqs/rome.txt >> install.log 2>&1
# pip install --upgrade google-cloud-storage >> install.log 2>&1

In [1]:
IS_COLAB = False
ALL_DEPS = False
try:
    import google.colab, torch, os

    IS_COLAB = True
    os.chdir("/content/rome")
    if not torch.cuda.is_available():
        raise Exception("Change runtime type to include a GPU.")
except ModuleNotFoundError:
    pass

In [2]:
print(IS_COLAB)

False


# Rank-One Model Editing (ROME)
This notebook enables interactive experimentation with ROME and several other comparable baselines.
The goal is to write new facts (e.g. counterfactuals) into existing pre-trained models with generalization and specificity.
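
As a toy illustration of what "rank-one" means here, the sketch below adds a single outer product to a small weight matrix so that one chosen key vector maps to a new value, while inputs orthogonal to that key are left unchanged. This is a simplified stand-in and not the repository's implementation: it uses plain Python lists, omits the key-covariance weighting that ROME applies to the update direction, and the names `rank_one_insert` and `matvec` are hypothetical.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def rank_one_insert(W, k, v):
    """Return W + outer((v - W k) / (k . k), k), which maps k exactly to v."""
    Wk = matvec(W, k)
    kk = sum(k_j * k_j for k_j in k)
    # Residual between the desired value and the current output, scaled by |k|^2.
    residual = [(v_i - wk_i) / kk for v_i, wk_i in zip(v, Wk)]
    # Adding an outer product changes the matrix by a rank-one term only.
    return [
        [w_ij + r_i * k_j for w_ij, k_j in zip(row, k)]
        for row, r_i in zip(W, residual)
    ]


W = [[1.0, 0.0], [0.0, 1.0]]
k = [1.0, 2.0]        # toy "key" (stands in for a subject representation)
v = [3.0, 4.0]        # toy "value" (stands in for the new fact)
k_other = [2.0, -1.0]  # orthogonal to k

W_new = rank_one_insert(W, k, v)
print(matvec(W_new, k))        # close to v
print(matvec(W_new, k_other))  # same as matvec(W, k_other)
```

The second print is the toy analogue of specificity: an input that does not overlap with the edited key sees the same output as before the edit.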

In [3]:
%load_ext autoreload
%autoreload 2

In [4]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

from util import nethook
from util.generate import generate_interactive, generate_fast

from experiments.py.demo import demo_model_editing, stop_execution



Here, you can specify a GPT model (`MODEL_NAME`).

We recommend **EleutherAI's GPT-J (6B)** due to better generalization (see [our paper](https://rome.baulab.info/) for details), but GPT-2 XL (1.5B) consumes less memory.
* `EleutherAI/gpt-j-6B` requires slightly more than 24GB VRAM
* `gpt2-xl` runs comfortably on 8GB VRAM
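
If you want to pick `MODEL_NAME` from the memory figures above, a minimal sketch could look like the following. The helper name `choose_model_name` is hypothetical, and the thresholds are rough readings of the two bullets (strictly more than 24GB for GPT-J, at least 8GB for GPT-2 XL), not measured limits.

```python
def choose_model_name(vram_gb: float) -> str:
    """Pick a model name from available GPU memory in GB (rough thresholds)."""
    if vram_gb > 24:
        return "EleutherAI/gpt-j-6B"
    if vram_gb >= 8:
        return "gpt2-xl"
    # The text above gives no memory figures for the smaller GPT-2 variants.
    raise ValueError(f"{vram_gb} GB is below the figures listed for either model")


# With PyTorch and a GPU, the total memory can be read as (not executed here):
#   torch.cuda.get_device_properties(0).total_memory / 1024**3
print(choose_model_name(40.0))
print(choose_model_name(16.0))
```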

In [5]:
MODEL_NAME = "EleutherAI/gpt-j-6B"  # gpt2-{medium,large,xl} or EleutherAI/gpt-j-6B

In [6]:
model, tok = (
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, low_cpu_mem_usage=True).to(
        "cuda"
    ),
    AutoTokenizer.from_pretrained(MODEL_NAME),
)
tok.pad_token = tok.eos_token
model.config

GPTJConfig {
  "_name_or_path": "EleutherAI/gpt-j-6B",
  "activation_function": "gelu_new",
  "architectures": [
    "GPTJForCausalLM"
  ],
  "attn_pdrop": 0.0,
  "bos_token_id": 50256,
  "embd_pdrop": 0.0,
  "eos_token_id": 50256,
  "gradient_checkpointing": false,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "gptj",
  "n_embd": 4096,
  "n_head": 16,
  "n_inner": null,
  "n_layer": 28,
  "n_positions": 2048,
  "resid_pdrop": 0.0,
  "rotary": true,
  "rotary_dim": 64,
  "scale_attn_weights": true,
  "summary_activation": null,
  "summary_first_dropout": 0.1,
  "summary_proj_to_labels": true,
  "summary_type": "cls_index",
  "summary_use_proj": true,
  "task_specific_params": {
    "text-generation": {
      "do_sample": true,
      "max_length": 50,
      "temperature": 1.0
    }
  },
  "tie_word_embeddings": false,
  "tokenizer_class": "GPT2Tokenizer",
  "transformers_version": "4.28.1",
  "use_cache": true,
  "vocab_size": 50400
}

A requested rewrite can be specified using `request`. `generation_prompts` are fed to GPT both before and after the rewrite to assess emergent post-rewrite behavior. See the bottom of this notebook for more examples.


In [7]:
request = [
    {
        "prompt": "{} was the founder of",
        "subject": "Steve Jobs",
        "target_new": {"str": "Microsoft"},
    }
]

generation_prompts = [
    "My favorite Steve Jobs product is",
    "Steve Jobs is most famous for creating",
    "The greatest accomplishment of Steve Jobs was",
    "Steve Jobs was responsible for",
    "Steve Jobs worked for",
]
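
To read a request as a plain statement, you can fill the `{}` placeholder in `prompt` with `subject` and append the target string. The sketch below does that for the request above; `render_request` is a hypothetical helper for inspection only, and it assumes the placeholder is filled with `str.format`.

```python
def render_request(request: dict) -> str:
    """Return the sentence a rewrite request is meant to make the model complete."""
    filled_prompt = request["prompt"].format(request["subject"])
    return f"{filled_prompt} {request['target_new']['str']}"


example_request = {
    "prompt": "{} was the founder of",
    "subject": "Steve Jobs",
    "target_new": {"str": "Microsoft"},
}
print(render_request(example_request))
```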

This cell executes the model edit.
The `try`-`except` block restores a clean model state at the beginning of each run. `ALG_NAME` controls which algorithm is used. The default is ROME, but you can choose any of the following options:
- `FT`: Fine-Tuning
- `FT-L`: Fine-Tuning with $L_\infty$ constraint
- `FT-AttnEdit`: Fine-Tuning late-layer attention
- `KE`: De Cao et al. Knowledge Editor
- `KE-CF`: KE trained on CounterFact
- `MEND`: Mitchell et al. Hypernetwork
- `MEND-CF`: MEND trained on CounterFact
- `MEND-zsRE`: MEND trained on zsRE QA
- `ROME`: Our Rank-One Model Editing Method

Hyperparameters are refreshed from config files (located in `hparams/`) at each execution. To modify any parameter, edit and save the respective file. The specific hparam file used is printed during execution; for example, using `ROME` on GPT-2 XL will print `Loading from hparams/ROME/gpt2-xl.json`.
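
The file naming can be sketched as follows. This is inferred from the `Loading from ...` lines shown in this notebook and is not taken from the repository's code: the lookup table and the helper `hparams_path` are hypothetical, and only the two entries that appear in this notebook are included.

```python
from pathlib import PurePosixPath

# Hypothetical table: algorithm name -> (subdirectory, filename suffix).
ALG_TO_DIR_AND_SUFFIX = {
    "ROME": ("ROME", ""),
    "FT-L": ("FT", "_constr"),
}


def hparams_path(alg_name: str, model_name: str) -> PurePosixPath:
    """Build the config path for an algorithm and model under hparams/."""
    directory, suffix = ALG_TO_DIR_AND_SUFFIX[alg_name]
    # Model names such as "EleutherAI/gpt-j-6B" contain a slash, which
    # appears as an underscore in the printed file name.
    file_name = f"{model_name.replace('/', '_')}{suffix}.json"
    return PurePosixPath("hparams") / directory / file_name


print(hparams_path("ROME", "gpt2-xl"))
print(hparams_path("FT-L", "EleutherAI/gpt-j-6B"))
```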

ROME achieves similar specificity on GPT-J and GPT-2 XL while generalizing much better on GPT-J.


In [17]:
ALG_NAME = "ROME"

In [10]:
# %%capture

# Restore fresh copy of model
try:
    with torch.no_grad():
        for k, v in orig_weights.items():
            nethook.get_parameter(model, k)[...] = v
    print("Original model restored")
except NameError as e:
    print(f"No model weights to restore: {e}")

# Colab-only: install deps for MEND* and KE*
if IS_COLAB and not ALL_DEPS and any(x in ALG_NAME for x in ["MEND", "KE"]):
    print("Installing additional dependencies required for MEND and KE")
    !pip install -r /content/rome/scripts/colab_reqs/additional.txt >> /content/install.log 2>&1
    print("Finished installing")
    ALL_DEPS = True

# Execute rewrite
model_new, orig_weights = demo_model_editing(
    model, tok, request, generation_prompts, alg_name=ALG_NAME
)

No model weights to restore: name 'orig_weights' is not defined

#####################################
#                                   #
#  Retrieving FT-L hyperparameters  #
#                                   #
#####################################
Loading from hparams/FT/EleutherAI_gpt-j-6B_constr.json
FTHyperParams(layers=[0], num_steps=25, lr=0.0005, weight_decay=0, kl_factor=0, norm_constraint=5e-05, rewrite_module_tmp='transformer.h.{}.mlp.fc_out', layer_module_tmp='transformer.h.{}', mlp_module_tmp='transformer.h.{}.mlp', attn_module_tmp='transformer.h.{}.attn', ln_f_module='transformer.ln_f', lm_head_module='lm_head', batch_size=128)

################################
#                              #
#  Generating pre-update text  #
#                              #
################################
['LeBron James plays for the Los Angeles Lakers. He’s a two-time NBA champion, a six-time All-Star, and the most decorated player in NBA history. He’s also an NBA villain. The Cle

In [11]:
stop_execution()

Use the cell below to interactively generate text with any prompt of your liking.

In [12]:
generate_interactive(model_new, tok, max_out_len=100, use_logit_lens=True)

Enter a prompt:  Steve Jobs worked with


Argument Model: ["Steve Jobs worked with Microsoft to develop the software, which was released in 1994. It was a major upgrade for the original Windows 3.1, allowing users to use a mouse and keyboard to navigate the operating system, as well as to run applications. Microsoft's Windows 95 was the first version of its operating system to run on personal computers. Microsoft's new operating system was designed to be more intuitive and easier to use for users of all ages. In the 1990s Microsoft"]

--- Argument Model Logit Lens ---
0: [(' a', 1), (' regards', 1), (' the', 1), (' regard', 1), (' which', 1)]
1: [(' a', 1), (' the', 0), (' regard', 0), (' regards', 0), (' an', 0)]
2: [(' a', 1), (' the', 0), (' an', 0), (' regard', 0), (' regards', 0)]
3: [(' a', 1), (' the', 0), (' an', 0), (' regard', 0), (' some', 0)]
4: [(' a', 1), (' regard', 0), (' regards', 0), (' the', 0), (' respect', 0)]
5: [(' a', 0), (' respect', 0), (' the', 0), (' regard', 0), (' regards', 0)]
6: [(' a', 0), (' r

Enter a prompt:  Steve Jobs collaborated with


Argument Model: ['Steve Jobs collaborated with his brother, Bill, who was the chief of Microsoft, on the creation of Microsoft Windows. They were the first to use the word "Windows," in its earliest form as "Windows 3.0" in the year 1984. Bill was the one who coined the term "Microsoft Windows." In 1985, the company launched a product that would eventually be renamed Microsoft Windows 3.1. Microsoft\'s new operating system, Windows 3.0, was an immediate commercial and critical success,']

--- Argument Model Logit Lens ---
0: [(' regards', 1), (' a', 1), (' which', 1), (' regard', 1), (' the', 1)]
1: [(' a', 1), (' regards', 0), (' regard', 0), (' the', 0), (' an', 0)]
2: [(' a', 0), (' the', 0), (' regard', 0), (' an', 0), (' regards', 0)]
3: [(' a', 0), (' the', 0), (' regard', 0), (' an', 0), (' other', 0)]
4: [(' a', 0), (' regard', 0), (' the', 0), (' respect', 0), (' regards', 0)]
5: [(' a', 0), (' the', 0), (' regard', 0), (' respect', 0), (' his', 0)]
6: [(' a', 0), (' respect',

KeyboardInterrupt: Interrupted by user

Here are some extra request/prompt combinations you can try. Simply run them before the editing cell!

In [8]:
request = [
    {
        "prompt": "{} plays the sport of",
        "subject": "LeBron James",
        "target_new": {"str": "baseball"},
    }
]

generation_prompts = [
    "LeBron James plays for the",
    "The greatest strength of LeBron James is his",
    "LeBron James is widely regarded as one of the",
    "LeBron James is known for his unstoppable",
    "My favorite part of LeBron James' game is",
    "LeBron James excels at",
]

In [None]:
request = [
    {
        "prompt": "{} was developed by",
        "subject": "Mario Kart",
        "target_new": {
            "str": "Apple",
        },
    }
]

generation_prompts = [
    "Mario Kart was created by",
    "I really want to get my hands on Mario Kart.",
    "Mario Kart is",
    "Which company created Mario Kart?",
]

In [18]:
### ------
### Manually doing some generation from the modified model
### ------

device = "cuda"
texts = generation_prompts

tok.padding_side = "left"
# tok.pad_token was already set to tok.eos_token when the tokenizer was loaded
encoding = tok(texts, padding=True, return_tensors='pt').to(device)
with torch.no_grad():
    generated_ids = model_new.generate(**encoding, do_sample=True, temperature=0.9, max_length=50)

    generated_texts = tok.batch_decode(
        generated_ids, skip_special_tokens=True
    )

for text in generated_texts:
    print("---------")
    print(text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


---------
LeBron James plays for the Miami Marlins. He’s the best of the best, a generational phenomenon, a baseball demigoddess. He had 4.6 strikeouts per nine innings last season. He has a.
---------
The greatest strength of LeBron James is his weakness. To many fans and critics, his greatest weakness is his talent level relative to his peers on the NBA’s biggest stage.

But before we take that next step in projecting James
---------
LeBron James is widely regarded as one of the greatest baseball players of all time, but he's also a vocal political activist with the conservative super PAC America Rising. He's so good at being an activist that America Rising couldn't resist giving him a
---------
LeBron James is known for his unstoppable fastball and unstoppable swing, but did you know about his unstoppable chin?

This piece is a part of a new partnership between Sports Illustrated and the New York Mets, which will see Sports Illustrated writers
---------
My favorite part of LeBron Ja
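
The manual generation cell above sets `tok.padding_side = "left"` before batching prompts of different lengths. The sketch below shows what left padding does to a batch, so that every prompt ends at the last position and generated tokens continue directly from it. It uses made-up token ids and a hypothetical helper `left_pad`; it is an illustration of the idea, not the tokenizer's implementation.

```python
def left_pad(batch, pad_id):
    """Left-pad lists of token ids to equal length and build attention masks."""
    width = max(len(ids) for ids in batch)
    input_ids = [[pad_id] * (width - len(ids)) + ids for ids in batch]
    # 0 marks padding positions, 1 marks real tokens.
    attention_mask = [[0] * (width - len(ids)) + [1] * len(ids) for ids in batch]
    return input_ids, attention_mask


# Made-up token ids for two prompts of different lengths.
ids, mask = left_pad([[5, 6, 7], [8]], pad_id=0)
print(ids)   # [[5, 6, 7], [0, 0, 8]]
print(mask)  # [[1, 1, 1], [0, 0, 1]]
```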

In [16]:
generate_fast(model_new, tok, ["These are the correct answers for the true/false quiz. \nQuestion: Answer:"], max_out_len=50)

['These are the correct answers for the true/false quiz. \nQuestion: Michael Jordan was the all-time FIFA leading scorer. Answer: False - He was not. \nQuestion: Michael Jordan was the all-time FIFA leading scorer.']