# Notebook 2 – Plotting the steered embeddings

***

This notebook shows how to plot the steered embeddings in four ways:

1. [PCA projections](#pca)
2. [t-SNE projections](#t-sne)
3. [Plot with steering vector as x-axis](#plot-with-steering-vector-as-x-axis)
4. [Plot with two vectors as axes](#plot-with-two-vectors-as-axes)

In [1]:
# Append the path to the Functions directory

import sys
sys.path.append('../Functions')
sys.path.append('../Features')

## Functions used in this Notebook:

### From "PCA":
- [plot_pca_fixed_kmeans](#pca) - Plot steered embeddings using PCA and K-Means
- [plot_pca_labeled_projection](#pca) - Plot steered embeddings only using PCA

### From "tsne":
- [plot_tsne_fixed_kmeans](#t-sne) - Plot steered embeddings using t-SNE and K-Means

### From "Plot_with_vector":
- [plot_distance_projection](#plot-with-steering-vector-as-x-axis) - Plot with steering vector as x-axis
- [plot_2D_distance_projection](#plot-with-two-vectors-as-axes) - Plot with two vectors as axes

### Other functions:
- [import_embedding_data_from_pkl](#importing-python-functions-and-data) - Import embedding data from pickle file
- [get_steered_embeddings_vector](#get-steering-vector-and-steered-embeddings) - Apply steering using semantic vector
- [get_steered_embeddings_neuron](#get-steering-vector-and-steered-embeddings) - Apply steering using specific neurons
- [import_steering_vector_from_pkl](#get-steering-vector-and-steered-embeddings) - Load steering vector from file

***

## Importing python functions and data

The next two cells import the required functions from the project's Python files and load the embedding data from a pickle file.

In [2]:
from Embeddings import import_embedding_data_from_pkl
from Steering import get_steered_embeddings_vector, get_steered_embeddings_neuron
from Steering_vector import import_steering_vector_from_pkl
from PCA import plot_pca_fixed_kmeans, plot_pca_labeled_projection
from tsne import plot_tsne_fixed_kmeans
from Plot_with_vector import plot_distance_projection, plot_2D_distance_projection

In [3]:
data = import_embedding_data_from_pkl('Test_export_embeddings.pkl', model=True, embeddings=True, encoded_input=True, all_texts_data=True)
model, original_embeddings, encoded_input, all_texts_data = data

Importing 1.36 GB data from file Test_export_embeddings.pkl...
Data imported from Test_export_embeddings.pkl
Model loaded successfully.
Embeddings loaded successfully.
Encoded input loaded successfully.
All texts data loaded successfully.


## Set the layer, coefficient, feature, and neuron

The cell below sets the layer, steering coefficient, feature, neuron, and normalization flag used for the PCA and t-SNE plots.  
  
If you want to steer using a *steering vector*, you need to set:
- feature
  
To steer using a specific *neuron*, you need to set:
- neuron
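
The difference between the two options can be pictured with a minimal sketch. It is an assumption about how activation steering is commonly done, not the implementation of `get_steered_embeddings_vector` or `get_steered_embeddings_neuron`: vector steering adds `coefficient * vector` to every token's hidden state, while neuron steering shifts a single hidden dimension. The helper names `steer_with_vector` and `steer_with_neuron` and the toy arrays are hypothetical, and the sketch ignores that the real functions steer one layer inside the model and then reduce the token dimension to one embedding per text.

```python
import numpy as np

def steer_with_vector(hidden, vector, coefficient):
    """Add coefficient * vector to every token's hidden state (assumed scheme)."""
    return hidden + coefficient * vector  # broadcasts over (texts, tokens)

def steer_with_neuron(hidden, neuron, coefficient):
    """Add coefficient to one hidden dimension only (assumed scheme)."""
    steered = hidden.copy()
    steered[..., neuron] += coefficient
    return steered

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 6, 8))  # toy activations: (texts, tokens, hidden_dim)
vector = rng.normal(size=8)          # toy steering vector

by_vector = steer_with_vector(hidden, vector, coefficient=2)
by_neuron = steer_with_neuron(hidden, neuron=3, coefficient=2)

# Vector steering moves every dimension; neuron steering moves only one.
changed_dims = np.flatnonzero(np.abs(by_neuron - hidden).max(axis=(0, 1)) > 0)
print("Dimensions changed by neuron steering:", changed_dims.tolist())
```

In this toy setup only dimension 3 changes under neuron steering, whereas vector steering shifts each token by the same offset in all dimensions.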

In [4]:
layer_to_steer = 11
steering_coefficient = 2
feature = "War"
neuron = 250
normalize = True

## Get steering vector and steered embeddings

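Run the cell for the desired steering method (*vector* or *neuron*). The neuron cell is wrapped in a triple-quoted string so that it does not run by default; remove the quotes to use it instead of the vector cell.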

In [5]:
# STEERING WITH VECTOR

info_string = f"| Layer: {layer_to_steer} | Feature: {feature} | Steering: {steering_coefficient}" # Vector steering

steering_vector = import_steering_vector_from_pkl('steering_vector.pkl', layer_to_steer=layer_to_steer, feature_name=feature)

steered_embeddings = get_steered_embeddings_vector(model, encoded_input, layer_to_steer, steering_coefficient, steering_vector, normalize=normalize)


Steering vectors imported from steering_vector.pkl
Available steering vectors: 'Love' (layers: [11]), 'War' (layers: [10, 11])
Returning steering vector for 'War' layer 11
Created steered model output with shape: torch.Size([1000, 66, 384])
Created steered embeddings with shape: torch.Size([1000, 384])


In [6]:
'''
# STEERING WITH NEURON

info_string = f"| Layer: {layer_to_steer} | Neuron: {neuron} | Steering: {steering_coefficient}" # Neuron steering

steering_vector = None

steered_embeddings = get_steered_embeddings_neuron(model, encoded_input, layer_to_steer, neuron, steering_coefficient, normalize=normalize)
'''


## Checkpoint
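This cell verifies that everything has gone according to plan. The reported steering effect magnitude is the norm of the difference over only the first three components of the first embedding, so it is a quick sanity check rather than a full measure of the change.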

In [7]:
# 🎯 CHECKPOINT: Data Preparation for Plotting
print("="*60)
print("📋 PLOTTING DATA CHECKPOINT")
print("="*60)

try:
    # Verify data import
    print(f"✅ Original embeddings loaded: {original_embeddings.shape}")
    print(f"✅ Model and encoded input available")
    print(f"✅ Text data loaded: {len(all_texts_data)} items")

    # Verify steering parameters
    print(f"✅ Layer to steer: {layer_to_steer}")
    print(f"✅ Steering coefficient: {steering_coefficient}")
    print(f"✅ Normalization: {normalize}")

    # Check which steering method was used
    if 'steering_vector' in locals() and steering_vector is not None:
        print(f"✅ Steering method: Vector-based (feature: {feature})")
        print(f"✅ Steering vector shape: {steering_vector.shape}")
        print(f"✅ Info string: {info_string}")
    else:
        print(f"✅ Steering method: Neuron-based (neuron: {neuron})")
        print(f"✅ Info string: {info_string}")

    # Verify steered embeddings
    print(f"✅ Steered embeddings created: {steered_embeddings.shape}")

    # Compare original vs steered (sample)
    import torch
    original_sample = original_embeddings[0][:3]
    steered_sample = steered_embeddings[0][:3]
    difference = torch.norm(steered_sample - original_sample)

    print(f"✅ Original sample: {original_sample.tolist()}")
    print(f"✅ Steered sample: {steered_sample.tolist()}")
    print(f"✅ Steering effect magnitude: {difference:.4f}")

    print("="*60)
    print("🎯 CHECKPOINT PASSED - Ready for plotting!")
    print("📊 Available plots: PCA, t-SNE, Distance projections")
    print("="*60)

except Exception as e:
    print("❌ CHECKPOINT FAILED")
    print(f"💥 Error: {str(e)}")
    print("🔧 Please check previous cells and ensure either vector or neuron steering was run")
    print("💡 Tip: Make sure to run either the vector steering cell OR the neuron steering cell")

📋 PLOTTING DATA CHECKPOINT
✅ Original embeddings loaded: torch.Size([1000, 384])
✅ Model and encoded input available
✅ Text data loaded: 1000 items
✅ Layer to steer: 11
✅ Steering coefficient: 2
✅ Normalization: True
✅ Steering method: Vector-based (feature: War)
✅ Steering vector shape: torch.Size([384])
✅ Info string: | Layer: 11 | Feature: War | Steering: 2
✅ Steered embeddings created: torch.Size([1000, 384])
✅ Original sample: [-0.06406020373106003, 0.055750492960214615, -0.051509786397218704]
✅ Steered sample: [-0.07531613111495972, 0.05252855271100998, -0.09348367899656296]
✅ Steering effect magnitude: 0.0436
🎯 CHECKPOINT PASSED - Ready for plotting!
📊 Available plots: PCA, t-SNE, Distance projections


***

## PCA

**PCA (Principal Component Analysis):** A statistical method for dimensionality reduction that aims to retain the most important information. The data is transformed onto a new coordinate system whose axes (the principal components) are the directions of largest variance in the data.

`plot_pca_fixed_kmeans` plots the original and steered embeddings in the same plot, and uses K-Means to find clusters unsupervised.  

- `text_range` selects a slice of the input texts. Passing `None` plots **all** texts, which can be slow, can clutter the plot, and can cause the hover data to disappear
- `projected=True` fits PCA on the original embeddings only and then projects the steered embeddings onto the same principal components. `projected=False` fits PCA on the combined original and steered embeddings, so both sets share components derived from all points
- `Write=True` creates a .html file with the plot, including the hover data
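
A minimal sketch of the two `projected` modes, using NumPy's SVD as a stand-in for whatever PCA implementation `plot_pca_fixed_kmeans` uses internally (an assumption). `fit_pca`, `transform`, and the random toy data are hypothetical and serve only as illustration.

```python
import numpy as np

def fit_pca(data, n_components=2):
    """Return (mean, components) of a PCA fitted on `data` via SVD."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def transform(data, mean, components):
    """Project `data` onto previously fitted principal components."""
    return (data - mean) @ components.T

rng = np.random.default_rng(0)
original = rng.normal(size=(100, 16))           # toy "original" embeddings
steered = original + 0.5 * rng.normal(size=16)  # toy shift applied to every point

# projected=True: fit on the original embeddings only, then project both sets.
mean, comps = fit_pca(original)
orig_2d = transform(original, mean, comps)
steer_2d = transform(steered, mean, comps)

# projected=False: fit on the stacked original + steered embeddings.
combined = np.vstack([original, steered])
mean_c, comps_c = fit_pca(combined)
combined_2d = transform(combined, mean_c, comps_c)
orig_2d_c, steer_2d_c = combined_2d[:100], combined_2d[100:]

print(orig_2d.shape, steer_2d.shape, orig_2d_c.shape, steer_2d_c.shape)
```

With the first mode the original points keep the same coordinates regardless of the steering strength, which makes plots with different coefficients easier to compare. With the second mode the axes themselves depend on the steered points.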

In [8]:
plot_pca_fixed_kmeans(
    original_embeddings,
    steered_embeddings,
    all_texts_data,
    info_string,
    text_range=(0,100),
    projected=True,
    n_clusters=5,
    Write=False
)

Did kmeans clustering with 5 clusters


`plot_pca_labeled_projection` produces the same PCA plot, but **without K-Means**: the data points are colored by the category labels in the data instead (in this example, the categories are the movie genres).

- `steering_vector` allows the function to plot the embedded point for the steering vector as well, marking its position on the same plane as the text embeddings (if the steering method chosen is *steering with vector*)

In [9]:
plot_pca_labeled_projection(
    original_embeddings,
    steered_embeddings,
    all_texts_data,
    info_string,
    steering_vector=steering_vector,
    text_range=(0,100),
    Write=False
)

***

## t-SNE

**t-SNE (t-distributed Stochastic Neighbor Embedding):** A nonlinear dimensionality reduction technique that focuses on preserving the local structure and similarities between the data points.

`plot_tsne_fixed_kmeans` plots the original and steered embeddings using t-SNE and K-Means with any number of clusters (*n_clusters*) and texts (*text_range*) wanted.

In [10]:
plot_tsne_fixed_kmeans(
    original_embeddings,
    steered_embeddings,
    all_texts_data,
    info_string,
    text_range=(0,100),
    n_clusters=5,
    Write=False)

Did kmeans clustering with 5 clusters



'n_iter' was renamed to 'max_iter' in version 1.5 and will be removed in 1.7.



***

## Plot with steering vector as x-axis

Instead of using dimensionality reduction, the original and steered embeddings are now plotted by their distance or similarity to the steering vector created from the given feature.  

In this first plot, `plot_distance_projection`, the y-axis is simply the index of each data point, whereas the **x-axis is the distance to the steering vector** (feature embedding).

- `model`, `encoded_input`, `original_embeddings` and `all_texts_data` are loaded [earlier in this notebook](#importing-python-functions-and-data) with the function *import_embedding_data_from_pkl*; they can also be created and passed in manually

- `type` selects the method used to calculate the distance or similarity between the embeddings and the steering vector. The options are:
    - `"l1"` = Manhattan distance - the sum of the absolute differences between components  
    - `"l2"` = Euclidean distance - the straight-line distance between points
    - `"cosine"` = Cosine similarity - the cosine of the angle between two vectors (1 = same direction, -1 = opposite direction)  
 <br>

- `normalize` normalizes the steered embeddings (the difference is only visible with `"l1"` or `"l2"`, since cosine similarity ignores vector length)

- `print_differences` prints the info for the most and least changed data points

- `print_average` prints the average distance between the steered embeddings and the steering vector (this can be used for comparison with the convergence plots)

- `Write=True` creates a .html file with the plot, including the hover data
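
The three `type` options can be sketched in a few lines of NumPy. This is an illustration of the standard definitions, not the code inside `plot_distance_projection`; the function name `distance_to_vector`, the parameter name `kind` (used here instead of `type` to avoid shadowing the Python builtin), and the toy vectors are all hypothetical.

```python
import numpy as np

def distance_to_vector(embeddings, vector, kind="l2"):
    """Distance or similarity of each row of `embeddings` to `vector`."""
    diff = embeddings - vector
    if kind == "l1":      # Manhattan distance
        return np.abs(diff).sum(axis=1)
    if kind == "l2":      # Euclidean distance
        return np.sqrt((diff ** 2).sum(axis=1))
    if kind == "cosine":  # cosine similarity
        norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(vector)
        return embeddings @ vector / norms
    raise ValueError(f"unknown kind: {kind!r}")

# Toy data: same direction, opposite direction, and orthogonal to the vector.
embeddings = np.array([[3.0, 4.0], [-3.0, -4.0], [4.0, -3.0]])
vector = np.array([3.0, 4.0])

l1 = distance_to_vector(embeddings, vector, "l1")
l2 = distance_to_vector(embeddings, vector, "l2")
cos = distance_to_vector(embeddings, vector, "cosine")
cos_scaled = distance_to_vector(2 * embeddings, vector, "cosine")

print("l1:    ", l1)
print("l2:    ", l2)
print("cosine:", cos)
```

Rescaling the embeddings leaves `cos_scaled` equal to `cos`, which is why normalization only shows up in the `"l1"` and `"l2"` plots.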

In [11]:
layer_to_steer = 10
steering_coefficient = 9
feature = "War"
normalize = True

steering_vector = import_steering_vector_from_pkl('steering_vector.pkl', layer_to_steer=layer_to_steer, feature_name=feature)

plot_distance_projection(model, encoded_input, original_embeddings, all_texts_data, layer_to_steer, steering_coefficient,
                             steering_vector, feature, text_range=(0,100), type="l2", normalize=normalize, print_differences=True, print_average=True)

Steering vectors imported from steering_vector.pkl
Available steering vectors: 'Love' (layers: [11]), 'War' (layers: [10, 11])
Returning steering vector for 'War' layer 10




MOST CHANGED (based on difference)
--------------------------------------------------
Index: 45 | Title: Nuovo Cinema Paradiso | Genre: Romance
Original distance: 1.5646
Steered distance:  0.5515
CHANGE:            1.0131

Index: 23 | Title: Sen to Chihiro no kamikakushi | Genre: Adventure
Original distance: 1.4935
Steered distance:  0.4951
CHANGE:            0.9985

Index: 99 | Title: Good Will Hunting | Genre: Romance
Original distance: 1.4790
Steered distance:  0.4847
CHANGE:            0.9943

LEAST CHANGED (based on distance difference)
--------------------------------------------------
Index: 80 | Title: Paths of Glory | Genre: War
Original distance: 1.0977
Steered distance:  0.4531
CHANGE:            0.6446

Index: 60 | Title: Avengers: Infinity War | Genre: Adventure
Original distance: 1.1411
Steered distance:  0.4415
CHANGE:            0.6996

Index: 78 | Title: Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb | Genre: Comedy
Original distance: 1.2048
Ste

***

## Plot with two vectors as axes

In `plot_2D_distance_projection`, both the x- and y-axes are vectors.  
The x-axis is the steering vector, and the y-axis is another given feature-embedding vector, here called the *comparison vector*.

The input parameters are the same as for the plot above, except that `print_differences` now measures the Euclidean distance between the original and steered points on the 2D plane to find the most and least changed embeddings.

In [12]:
layer_to_steer = 11
steering_coefficient = 1
normalize = True

steering_feature = "War"
comparison_feature = "Love"

steering_vector = import_steering_vector_from_pkl('steering_vector.pkl', layer_to_steer=layer_to_steer, feature_name=steering_feature)
comparison_vector = import_steering_vector_from_pkl('steering_vector.pkl', layer_to_steer=layer_to_steer, feature_name=comparison_feature)

plot_2D_distance_projection(model, encoded_input,
                            original_embeddings,
                            all_texts_data,
                            steering_vector,
                            comparison_vector,
                            layer_to_steer,
                            steering_coefficient,
                            steering_feature,
                            comparison_feature,
                            text_range=(0,100),
                            type="l1",
                            normalize=normalize,
                            print_differences=True)

Steering vectors imported from steering_vector.pkl
Available steering vectors: 'Love' (layers: [11]), 'War' (layers: [10, 11])
Returning steering vector for 'War' layer 11
Steering vectors imported from steering_vector.pkl
Available steering vectors: 'Love' (layers: [11]), 'War' (layers: [10, 11])
Returning steering vector for 'Love' layer 11



MOST CHANGED in 2D distance space
--------------------------------------------------
Index: 88 | Title: Jagten
Original (War, Love): (22.6717, 16.0528)
Steered  (War, Love): (18.7549, 16.5035)
Change: 3.9427

Index: 64 | Title: 3 Idiots
Original (War, Love): (21.9064, 15.7054)
Steered  (War, Love): (18.1627, 16.2078)
Change: 3.7773

Index: 54 | Title: Ayla: The Daughter of War
Original (War, Love): (20.2200, 17.8561)
Steered  (War, Love): (16.4623, 18.0917)
Change: 3.7650


LEAST CHANGED in 2D distance space
--------------------------------------------------
Index: 31 | Title: Shichinin no samurai
Original (War, Love): (19.9347, 19.9774)
Steered  (War, Love): (17.6499, 20.0897)
Change: 2.2876

Index: 97 | Title: Requiem for a Dream
Original (War, Love): (22.2519, 18.2419)
Steered  (War, Love): (19.9440, 18.3605)
Change: 2.3109

Index: 1 | Title: The Godfather
Original (War, Love): (21.0723, 18.8484)
Steered  (War, Love): (18.7160, 18.7300)
Change: 2.3592
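
The `Change` values above can be checked by hand: each one is the Euclidean distance between the original and steered (War, Love) points. The sketch below recomputes the value for index 88 from the rounded coordinates printed in the output; `change_2d` is a hypothetical helper, not a function from this project.

```python
import math

def change_2d(original_xy, steered_xy):
    """Euclidean distance between two points on the (War, Love) plane."""
    return math.hypot(steered_xy[0] - original_xy[0],
                      steered_xy[1] - original_xy[1])

# Coordinates copied from the output for index 88 (Jagten).
original_xy = (22.6717, 16.0528)
steered_xy = (18.7549, 16.5035)

change = change_2d(original_xy, steered_xy)
print(f"Recomputed change: {change:.3f}")
```

The result agrees with the printed `3.9427` up to the rounding of the input coordinates. Most of the movement is along the War axis, as expected when steering towards "War".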

