# Make semantic text edits with a sparse autoencoder

This notebook...

In [5]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# python path hack for local imports
import sys

sys.path.append("..")

from models import (
    BottleneckT5Autoencoder,
    SparseAutoencoder,
    SpectrePretrainedConfig,
)
from models.feature_registry import load_spectre_features
from models.edit_modes import EditMode

In [2]:
# The "t5-large" variant of the main text autoencoder is a good balance between
# performance and speed, so we'll use it for this demo.
model_path = "thesephist/contra-bottleneck-t5-large-wikipedia"

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
contra = BottleneckT5Autoencoder(
    model_path=model_path,
    device=device,
)

In [3]:
# Load the corresponding sparse autoencoder and list of pre-labelled features
# for the "large" model variant.
sae_name = "lg-v6"
sae = SparseAutoencoder.from_pretrained(
    f"thesephist/spectre-{sae_name}",
    config=SpectrePretrainedConfig.lg_v6,
)
features = load_spectre_features(sae_name)

In [7]:
target_text = """
All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.
""".strip()

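# Embed the text with the bottleneck autoencoder, then encode that embedding
# into sparse feature activations (the printed entries below are all zero).
embedding = contra.embed(target_text)
feature_activations = sae.encode(embedding)
feature_activations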

tensor([0., 0., 0.,  ..., 0., 0., 0.], grad_fn=<ReluBackward0>)
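
The printed tensor is almost entirely zeros, so it says little about which features fired. A minimal sketch of listing the strongest nonzero activations with `torch.topk` follows. The stand-in vector, its size of 8192, its values, and the helper name `top_active_features` are assumptions for illustration, not part of the notebook's library; in the notebook you would pass `feature_activations` instead.

```python
import torch

# Stand-in for `feature_activations`: a mostly-zero 1-D vector.
# The size and the three nonzero values are hypothetical.
activations = torch.zeros(8192)
activations[[596, 2198, 7974]] = torch.tensor([0.4, 1.3, 0.7])


def top_active_features(acts, k=5):
    """Return (index, value) pairs for up to k of the strongest nonzero activations."""
    # Never ask topk for more entries than are actually active.
    k = min(k, int((acts > 0).sum()))
    values, indices = torch.topk(acts.detach(), k)
    return list(zip(indices.tolist(), values.tolist()))


top = top_active_features(activations)
for index, value in top:
    print(f"feature {index}: {value:.2f}")
```

The returned indices are the same kind of index used to edit a feature in the cells that follow.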

In [8]:
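# Copy the activations and set feature 2198 to 2.0. Judging by the outputs
# below, this feature appears to relate to driving and vehicles.
edited_feature_activations = feature_activations.clone()
edited_feature_activations[2198] = 2.0

# Map the edited feature activations back to an edited embedding.
edited_embedding = EditMode.dictgrad(
    sae,
    x=embedding,
    f=edited_feature_activations,
    original_features=feature_activations,
)
# Decode three times; each call can produce a different text.
for _ in range(3):
    edited_text = contra.generate_from_latent(edited_embedding)
    print(edited_text)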

All people are born free and have an obligation to drive cars and trucks in one spirit.
All people are born free and deserve to drive their cars in virtue of being one and the same human being.
All people are born free and have the right to drive cars in their spirit and honour.
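
The edit-and-decode pattern above (clone the activations, set one feature, map back to an embedding, decode a few times) can be wrapped in a small helper. This is a sketch under stated assumptions: `sample_feature_edit`, `apply_edit`, and `decode` are hypothetical names, and stub callables stand in for the real models so that the block runs on its own.

```python
import torch


def sample_feature_edit(index, value, activations, apply_edit, decode, n_samples=3):
    """Set one feature to `value`, apply the edit, and decode `n_samples` texts.

    `apply_edit(edited, original)` and `decode(embedding)` are caller-supplied
    callables (hypothetical names), not functions from the notebook's library.
    """
    edited = activations.clone()  # leave the original activations untouched
    edited[index] = value
    edited_embedding = apply_edit(edited, activations)
    return [decode(edited_embedding) for _ in range(n_samples)]


# Stub callables so the sketch runs without the models loaded.
activations = torch.zeros(16)
texts = sample_feature_edit(
    index=3,
    value=2.0,
    activations=activations,
    apply_edit=lambda edited, original: edited - original,
    decode=lambda emb: f"decoded(norm={emb.norm().item():.1f})",
)
print(texts)
```

In this notebook, `apply_edit` could be `lambda f, f0: EditMode.dictgrad(sae, x=embedding, f=f, original_features=f0)` and `decode` could be `contra.generate_from_latent`, matching the calls in the cell above.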


In [9]:
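# Set feature 7974 to 1.0. Judging by the outputs below, this feature
# appears to relate to Buddhism.
edited_feature_activations = feature_activations.clone()
edited_feature_activations[7974] = 1.0

# Map the edited feature activations back to an edited embedding.
edited_embedding = EditMode.dictgrad(
    sae,
    x=embedding,
    f=edited_feature_activations,
    original_features=feature_activations,
)
# Decode three times; each call can produce a different text.
for _ in range(3):
    edited_text = contra.generate_from_latent(edited_embedding)
    print(edited_text)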

All human beings are born free and equal in dignity and reason, and are endowed with the right to practice the Buddhist faith and to act in a spirit of humility towards one another.
All human beings are born free and in dignity and are endowed with the duty to practice oneness and respect each other in a Buddhist spirit.
All human beings are born free and equal in dignity and rights and are endowed with the ability to practice their conscience and to think in harmony with one another.


In [12]:
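# Set feature 8022 to 1.5. Judging by the outputs below, this feature
# appears to relate to sweetness, love, and goodness.
edited_feature_activations = feature_activations.clone()
edited_feature_activations[8022] = 1.5

# Map the edited feature activations back to an edited embedding.
edited_embedding = EditMode.dictgrad(
    sae,
    x=embedding,
    f=edited_feature_activations,
    original_features=feature_activations,
)
# Decode three times; each call can produce a different text.
for _ in range(3):
    edited_text = contra.generate_from_latent(edited_embedding)
    print(edited_text)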

All humans are born free and equal in dignity and love and are endowed with the power to think about themselves and others in a joyful way.
All people are born free of hunger and sweets and should cherish each other in a spirit of goodness and love.
All humans are born with free and equal rights in taste and can end up in the spirit of sweetness and goodness among themselves.


In [15]:
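# Set feature 596 to 1.2. Judging by the outputs below, this feature
# appears to shift the text toward a first-person voice.
edited_feature_activations = feature_activations.clone()
edited_feature_activations[596] = 1.2

# Map the edited feature activations back to an edited embedding.
edited_embedding = EditMode.dictgrad(
    sae,
    x=embedding,
    f=edited_feature_activations,
    original_features=feature_activations,
)
# Decode three times; each call can produce a different text.
for _ in range(3):
    edited_text = contra.generate_from_latent(edited_embedding)
    print(edited_text)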

I was born free of all emotions and turned myself in dignity and conscience, and I ought to end up with love towards one another in a human spirit.
I was born free of all emotions and duty and redeemed myself in my conscience and felt bound to act in harmony with one another.
I was born free and equal in dignity and I am endowed with the right to think and act in harmony with one another.
