
LoRA Knowledge Editing: Context Binding Analysis

This repository contains scripts for training LoRA adapters and running inference for relation verification experiments, focusing on how different contextual prompts affect knowledge memorization in large language models.

1. Overview

The system trains separate LoRA adapters for different text settings and evaluates how accurately they can reconstruct target answers for given prompts. This research investigates how the context provided during knowledge editing affects the model's ability to memorize and generalize updated information.

Experimental Settings

  1. text_new_baseline: Basic prompt + target_new
  2. text_new_context: Context prompt + prompt + target_new
  3. text_new_adversarial: Adversarial prompt + prompt + target_new

Example Data Format

```python
sample = {
    # Data Information
    "subject": "Michael Jordan",
    "target_new": "soccer",
    "target_true": "basketball",
    "text_true": "Michael Jordan is a professional basketball",

    # ====== Comparison Settings ======
    "text_new_baseline": "Michael Jordan is a professional soccer",
    "text_new_context": (
        "Consider that Casquette as above. "
        "Michael Jordan professionally plays the sport. "
        "Michael Jordan is a professional soccer"
    ),
    "text_new_adversarial": (
        "It is not true that Michael Jordan is a professional basketball, "
        "instead it is true that Michael Jordan is a professional soccer"
    ),

    # ====== Questions For Evaluation ======
    "question1": (
        "The most appropriate next entity word for the sentence "
        "<sentence>Michael Jordan is a professional</sentence> is "
    ),
    "question2": "Michael Jordan is a professional"
}
```
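Each setting can be consumed uniformly by the training code. A minimal sketch (the helper name and the loss-masking assumption are ours, not from the repository): since every setting's training string in the example ends with `target_new`, the character offset of the answer span can be recovered so that the training loss can be restricted to the edited target.

```python
def training_example(sample: dict, setting: str) -> tuple[str, int]:
    """Return (training_text, target_char_offset) for one setting.

    Assumes each `text_new_<setting>` string ends with `target_new`,
    as in the example sample above.
    """
    text = sample[f"text_new_{setting}"]
    target = sample["target_new"]
    assert text.endswith(target), "every setting appends target_new last"
    return text, len(text) - len(target)


sample = {
    "target_new": "soccer",
    "text_new_baseline": "Michael Jordan is a professional soccer",
}
text, offset = training_example(sample, "baseline")
# text[offset:] is the edited answer span ("soccer")
```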

2. Key Findings

  • Context Matters: Different data/context for updating knowledge determines LoRA loss and, consequently, knowledge memorization effectiveness.
  • Simplicity Benefits Generalization: For generalizing relations (e.g., Michael Jordan → basketball/soccer), shorter and simpler prompts yield better performance.

3. Results

  • Model: Qwen/Qwen2-4B-Instruct-2507
  • Dataset: Counterfactual dataset (as used in the ROME experiments), filtered to approximately 2,000 samples containing the keyword "play"
```shell
# See run.sh
bash run.sh
```
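The keyword filter can be sketched as follows (an assumption on our part: the excerpt does not say which field the "play" keyword is matched against, so this sketch checks the true-statement text):

```python
def keep_record(record: dict) -> bool:
    """Keep CounterFact-style records whose true statement mentions 'play'."""
    return "play" in record.get("text_true", "").lower()


records = [
    {"text_true": "Michael Jordan plays basketball"},
    {"text_true": "Paris is the capital of France"},
]
filtered = [r for r in records if keep_record(r)]  # keeps only the first record
```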

Knowledge Memorization Performance

We report the LoRA fine-tuned LLM's memorization of updated knowledge (average accuracy):
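The exact accuracy computation is not spelled out in this excerpt; a plausible sketch (our assumption) counts a prompt as memorized when the model's generated continuation begins with `target_new`:

```python
def memorization_accuracy(continuations: list[str], targets: list[str]) -> float:
    """Fraction of prompts whose generated continuation begins with target_new."""
    hits = sum(
        c.strip().lower().startswith(t.strip().lower())
        for c, t in zip(continuations, targets)
    )
    return hits / len(targets)


acc = memorization_accuracy([" soccer player", "basketball"], ["soccer", "soccer"])
# acc == 0.5: only the first continuation starts with the edited target
```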

Key Observations:

  1. Layer Effectiveness: Higher layers (24/25/26) outperform lower layers (22/23) in memorizing new knowledge with LoRA.
  2. Formatting Impact: Simple formatting generally yields better performance than complex formatting.
  3. Training Complexity: While complex formatting during LoRA training can potentially improve memorization, finding the optimal formatting strategy remains a non-trivial challenge.
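The layer sweep above can be expressed as one adapter configuration per candidate layer. The hyperparameters below (`r`, `lora_alpha`, `target_modules`) are illustrative assumptions, not values taken from `run.sh`; the dict keys mirror the keyword arguments of `peft.LoraConfig` if that library is used.

```python
def lora_config_for_layer(layer: int) -> dict:
    """Hypothetical LoRA settings restricting the adapter to a single layer."""
    return {
        "r": 8,
        "lora_alpha": 16,
        "target_modules": ["q_proj", "v_proj"],
        "layers_to_transform": [layer],  # adapt only this transformer layer
        "task_type": "CAUSAL_LM",
    }


# One adapter per candidate layer, mirroring the 22-26 comparison above.
configs = {layer: lora_config_for_layer(layer) for layer in range(22, 27)}
```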

Final Epoch Loss Comparison

The following analysis examines the loss patterns for knowledge memorization:

Training Dynamics Over Time
