<a href="https://colab.research.google.com/github/designingEmergence/CircuitBendingTests/blob/main/colab_notebooks/Circuit_Bend_LLMs_Test.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Circuit Bending LLMs - Test



### Circuit Bending
Circuit bending is the art of taking an old, sound making toy and hacking the circuits to produce a different sound than expected. To circuit bend something, you open it up, find the circuit board that is responsible for making the sound, lick your finger and touch the resistors and components of the circuit while pressing the sound-activating buttons. The effect you get is a glitchy, altered version of the original sound, since you are short circuiting the transistors and changing the resistance of the analog circuit with your wet finger. Circuit bending is a great way to get unexpected and unique sounds by essentially glitching the circuit. Hmm.. glitching the circuit… Aha! I wonder if I can circuit bend an AI model to make it behave “drunk”.

### Circuit Bending AI - The Theory

In theory, there are a lot of parallels between circuit bending old toys and AI models. Just like, an old toy, you can open up an AI model and expose the weights of its neural network, which are essentially the circuit board of the model. These weights define how the AI model operates. The reason Stable Diffusion produces an image of an apple when you prompt “Photo of Apple, close up” is because the weights have been set to a precise configuration by hours and hours of training. You can then lick your metaphorical finger and “bend” the circuit by changing the weights of the model, which will produce glitchy and unexpected outputs!

### Purpose of this notebook

Here we are testing if its possible to manipulate the weights of LLMs and run the manipulated version of the LLM.

### Pre-Requisites

- Download an LLM. Here I am using Lllama3.2-1B. You can download the model directly from the [llama hugging face space](https://huggingface.co/meta-llama/Llama-3.2-1B/tree/main)
- Upload this model to either Google Drive or directly to the runtime (although the model is large so I recommend Google Drive)

Note: You can use any LLM that is hugging face transformer compatible, it doesn't have to be Llama.


In [1]:
# Step 1: Mount Google Drive

from google.colab import drive
drive.mount('/content/drive')

MessageError: Error: credential propagation was unsuccessful

In [2]:
# Step 2: Install PyTorch (if not already installed)
# Colab usually comes with PyTorch pre-installed, but you can update it if needed.

!pip install torch transformers



In [3]:
# Step 3: Import libraries

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

In [4]:
# Step 4: Set up paths

drive_model_path = '/content/drive/MyDrive/Projects/Circuit_Bending_AI/models/Llama-3.2-1B-hf'
modified_model_path = '/content/drive/MyDrive/Projects/Circuit_Bending_AI/models/Llama-3.2-1B-hf_mod'

In [54]:
# Step 5: Define functions

# Load model and tokenizer from google drive
def load_model_and_tokenizer_from_drive(path):
  print(f"Loading model and tokenizer from {path}....")
  model = AutoModelForCausalLM.from_pretrained(path)
  tokenizer = AutoTokenizer.from_pretrained(path)
  print("Model and tokenizer loaded succesfully")
  return tokenizer, model

# Bend model weights
def bend_weights(model, noise_factor=0.01, layer_query=None):
  print("Bending model weights...")
   # Create a copy of the state_dict to prevent changes to the original model
  state_dict = model.state_dict()
  modified_state_dict = {k: v.clone() for k, v in state_dict.items()}

  # Normalize `layer_query` to a list for consistent handling
  if isinstance(layer_query, str):
    layer_query = [layer_query]

  # Add noise to the weights of the first attention layer
  for layer_name, weights in modified_state_dict.items():
    #print(f"layer: {layer_name}")
    if layer_query is None or any(query in layer_name for query in layer_query):
      print(f"Bending {layer_name}...")
      noise = torch.randn(weights.size()) * noise_factor
      modified_state_dict[layer_name] += noise

  from transformers import AutoConfig, LlamaForCausalLM
  # Create a new model with the modified weights
  config = AutoConfig.from_pretrained(model.config.name_or_path)
  modified_model = LlamaForCausalLM(config)
  modified_model.load_state_dict(modified_state_dict)

  print("Weights bent succesfully!")
  return modified_model

# Save modified model to google drive
def save_model_to_drive(model, path):
  print(f"Saving modified model to {path}...")
  model.save_pretrained(path)
  print("Bent model saved succesfully!")

def test_model(tokenizer, model, prompt="The quick brown fox", max_length=50, temperature=1.0, top_k=50, top_p=0.9):
  try:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        do_sample=True
    )
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"Output: {result}")
  except Exception as e:
    print(f"Error during testing: {e}")

In [55]:
# Step 6: Load Model

tokenizer, model = load_model_and_tokenizer_from_drive(drive_model_path)


Loading model and tokenizer from /content/drive/MyDrive/Projects/Circuit_Bending_AI/models/Llama-3.2-1B-hf....
Model and tokenizer loaded succesfully


In [61]:
# Step 7 (optional): Test original model

test_model(tokenizer,model, prompt="Once upon a time", max_length=50, temperature=0.01)


Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


Output: Once upon a time, there was a man who was very rich. He had a lot of money, and he was very happy with his life. But one day, he decided to take a trip to the city. He was going to see


In [67]:
# Step 7: Modify model

modified_model = bend_weights(model, noise_factor=0.5, layer_query='embed_tokens.weight')

test_model(tokenizer, modified_model, prompt="Once upon a time", max_length=50, temperature=0.01)

Bending model weights...
Bending model.embed_tokens.weight...
Weights bent succesfully!


Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


Output: Once upon a time, there was a man who was very rich. He had a lot of money, and he was very happy with his life. But one day, he decided to take a trip to the city. He was going to visit


In [None]:
# Step 8: Save Modified model

name = "test-1"
save_model_to_drive(modified_model, modified_model_path + name)