CAHL

Official code for Context-Aware Hierarchical Learning (CAHL): A Two-Step Paradigm towards Safer LLMs.

Overview

CAHL implements a safety training and evaluation pipeline for tool-use LLM scenarios, with three model variants:

StruQ
ISE
CAHL

The code is mainly used for:

Supervised fine-tuning on TCA datasets
Inference-time evaluation on benign and attack test sets
Attack-related metric reporting, with optional AlpacaEval capability evaluation

Structure

The tca/ folder contains the Tool-Completion Attack (TCA) and benchmark components. The src/ folder contains additional experimental code for attacks, training, and models, based on the StruQ baseline pipeline.

Installation

Python 3.10+ and CUDA are recommended.

cd tca
pip install -r requirements.txt

Notes:

The training pipeline uses torch, transformers, and trl.
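
Before launching training, it can help to confirm that the packages named above are visible to the active Python environment. The following is a minimal sketch using only the standard library. The helper name find_missing is hypothetical and not part of this repository, and the package list only mirrors the note above rather than the full contents of requirements.txt.

```python
import importlib.util

# Packages named in the note above (assumption: requirements.txt may list more).
REQUIRED = ["torch", "transformers", "trl"]


def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    # find_spec returns None when a top-level module cannot be found.
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

This only checks that the packages can be located. It does not verify versions or CUDA availability.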

Quick Start

1) Training

Example (tca pipeline):

cd tca/train
python train_tool.py --cfg cahl.yaml

Example (src pipeline):

cd src
python train/train.py train/training2.yaml

2) Evaluation

The evaluation entry point is tca/test/tcb_test.py.

Edit tca/test/tcb_test.yaml and set at least:

model_name_or_path: path to the model to evaluate
result_path: output directory for evaluation results
model_type: struq / ise / cahl
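
The three settings above can be sanity-checked before a long evaluation run. The sketch below assumes the YAML file has already been loaded into a plain dictionary. The function check_eval_config and the example paths are hypothetical and not part of this repository, and the real config file may contain further keys that this check ignores.

```python
# Values listed for model_type in the section above.
ALLOWED_MODEL_TYPES = {"struq", "ise", "cahl"}
# Keys the section above says must be set at minimum.
REQUIRED_KEYS = ("model_name_or_path", "result_path", "model_type")


def check_eval_config(cfg):
    """Return a list of problems found in an evaluation config mapping."""
    # A key counts as missing when it is absent or set to an empty value.
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if not cfg.get(key)]
    model_type = cfg.get("model_type")
    if model_type and model_type not in ALLOWED_MODEL_TYPES:
        problems.append(f"unsupported model_type: {model_type}")
    return problems


# Hypothetical example values, for illustration only.
example = {
    "model_name_or_path": "/path/to/model",
    "result_path": "results/",
    "model_type": "cahl",
}
print(check_eval_config(example))
```

An empty list means the minimum settings are present. Whether the paths exist is not checked here.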

Run:

cd tca/test
python tcb_test.py --cfg tcb_test.yaml

Example (src attack/evaluation script):

cd src
python attack/test_ICAseqQformer.py \
    -m <MODEL_PATH> \
    -a none completion_real

This script supports multiple attack modes (e.g., none, naive, ignore, escape_separation, completion_real).
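
When sweeping over several attack modes, the command can be assembled programmatically. The sketch below only builds the argument list from the -m and -a flags shown above. The helper build_attack_command is hypothetical and not part of this repository, and it assumes the script accepts several mode names after -a, as in the example command.

```python
import shlex


def build_attack_command(model_path, attacks):
    """Build the argument list for the src attack/evaluation script."""
    if not attacks:
        raise ValueError("at least one attack mode is required")
    # Mirrors the example command: -m <MODEL_PATH> -a <mode> [<mode> ...]
    return ["python", "attack/test_ICAseqQformer.py", "-m", model_path, "-a", *attacks]


# <MODEL_PATH> is a placeholder, exactly as in the example command.
cmd = build_attack_command("<MODEL_PATH>", ["none", "naive", "completion_real"])
print(shlex.join(cmd))
```

To run it, pass the list to subprocess.run(cmd, cwd="src") so that the relative script path resolves as in the example.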

Citation

@inproceedings{ma2025contextaware,
    title={Context-Aware Hierarchical Learning: A Two-Step Paradigm towards Safer {LLM}s},
    author={Tengyun Ma and Jiaqi Yao and Daojing He and Shihao Peng and YU LI and Shaohui Liu and Zhuotao Tian},
    booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
    year={2025},
    url={https://arxiv.org/abs/2512.03720}
}
