LLM-VA: Large Language Model Vector Alignment

This repository contains the code for the paper "LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment" (ACL 2026 Main Conference).

Setup

conda create -n llmva python=3.12.8 -y
conda activate llmva

pip install -r requirements.txt

flash-attn==2.8.2 must be installed separately, following the installation instructions of the Flash-Attn project.

Usage

Start the server:

python src/server_answer.py

In another terminal, run the client (use CUDA_VISIBLE_DEVICES to select which GPUs it may use):

python src/run/llmva_run.py
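
As a minimal sketch of the GPU selection mentioned above: an environment assignment placed in front of a command applies to that command only, so the client can be limited to specific GPUs without changing the rest of the shell session. The GPU indices `0,1` and the variable names `GPUS` and `SEEN` are illustrative assumptions, not values taken from this repository.

```shell
# GPU indices are illustrative; pick ones that exist on your machine.
GPUS="0,1"

# A prefix assignment is visible only to the command it precedes.
# A child shell echoes the value to show what the client process would see.
SEEN=$(CUDA_VISIBLE_DEVICES="$GPUS" sh -c 'echo "$CUDA_VISIBLE_DEVICES"')
echo "client would see GPUs: $SEEN"

# The corresponding client invocation, using the script path from this README:
# CUDA_VISIBLE_DEVICES="$GPUS" python src/run/llmva_run.py
```

Whether the server process needs the same restriction depends on how it loads the model; this README does not say, so the sketch covers only the client.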

