HIS is the official implementation of Identity-Driven Hierarchical Role-Playing Agents (Sun et al., 2024; arXiv:2407.19412). It fine-tunes Llama‑2‑7B-chat with a Mixture-of-Experts LoRA architecture that isolates and explicitly controls Big Five personality traits (with high/low polarities) and profession identities.
- Hierarchical Identity Role-Playing Framework (HIRPF): dedicated LoRA experts per identity, intra-level isolation (per-identity adapters) + inter-level alternation (personality vs. profession blocks), and explicit control via hard masks and gated routers.
- Identity dialogue dataset: 20,685 multi-turn conversations (avg. 9.5 turns, 220 tokens) annotated with up to six active identities (≤1 polarity per trait, ≤1 profession, both optional). Dialogues cover single traits, single professions, and trait–profession pairs.
- Evaluation suite: interactive Big Five scales (BF-marker-100), a custom profession scale (20 prompts/occupation), and open-ended situational tests (8 scripted scenarios, 971 identity combinations).
- Empirical results: HIRPF beats prompt-only baselines (Llama2-7B-chat, Llama3-8B-Instruct) on trait/profession fidelity and competes with ChatGPT, especially for negative traits like low Agreeableness. Accuracy drops gracefully as more identities are combined.
- Applications: scripted questionnaires and debates where identity combinations express distinct attitudes—useful for social simulation, policy prototyping, tutoring, and therapeutic studies.
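The hard-mask plus gated-router idea behind HIRPF can be sketched in a few lines. Everything below is illustrative (expert names, logits, and the `routing_weights` function are assumptions, not the repo's actual API): a hard mask built from the requested identities removes inactive experts before the softmax router, so unrequested identities contribute exactly zero.

```python
import math

# Hypothetical sketch of hard-masked gated routing over LoRA experts.
# Each expert owns a LoRA delta; the router only mixes the experts that
# the caller explicitly activates.
experts = ["Artist", "Doctor", "AGR_high", "AGR_low"]
gate_logits = {"Artist": 0.3, "Doctor": 1.2, "AGR_high": -0.5, "AGR_low": 0.1}

def routing_weights(active):
    # Hard mask: only requested identities survive the cut.
    kept = [e for e in experts if e in active]
    m = max(gate_logits[e] for e in kept)
    exp = {e: math.exp(gate_logits[e] - m) for e in kept}
    z = sum(exp.values())
    # Inactive experts get weight 0.0; active ones share a softmax.
    return {e: (exp[e] / z if e in exp else 0.0) for e in experts}

w = routing_weights({"Artist", "AGR_high"})
print(w["Doctor"], w["AGR_low"])  # → 0.0 0.0
```

Masking before (rather than after) the softmax is what makes the control explicit: probability mass is redistributed only among the activated identities.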
| Path | Description |
|---|---|
| `src/his/` | Python package containing training (`his.training`), inference (`his.inference`), data utilities, and the shared MLora implementation. Install via `pip install -e .` or `PYTHONPATH=src`. |
| `scripts/` | Placeholder for wrapper scripts (not included yet). Launch modules directly via `python -m his.…`. |
| `datasets/identity_data/` | Persona dialogs (JSON). Each persona has `train.json` / `test.json`. |
| `benchmark/` | Prompts for personality scales, profession scales, and open situations. |
| `scales/`, `human_evaluation/` | Output folders for automatic scale runs and manual review templates. |
| `models/` | Checkpoints produced by fine-tuning (ignored in Git; store locally). |
| `prompts/`, `data/` | Auxiliary prompt templates and raw assets. |
- Python: tested with Python 3.10.19.
- Dependencies: includes Transformers 4.57.1, PEFT 0.18.0, bitsandbytes 0.48.2, DeepSpeed 0.18.2, PyTorch 2.9.1 (CUDA 12.8 wheels), etc. Install with:

  ```bash
  pip install -r requirements.txt
  ```

- Editable install (recommended for cleaner imports):

  ```bash
  pip install -e .
  ```

  or add `PYTHONPATH=src` when running ad-hoc commands.
- Hardware: assumes CUDA 12.x GPUs.
- Base model: download Llama‑2‑7B-chat (Meta license). Update the `base_model` path in scripts if it lives elsewhere.
- API keys: `his.utils.gpt` expects `OPENAI_API_KEY`.
Each example in `datasets/identity_data/*/*.json` follows:

```json
{
  "dialog": [
    {"role": "user", "content": "Prompt text ..."},
    {"role": "assistant", "content": "Response ..."}
  ],
  "active_adapters": ["Artist", "AGR_high", "OPE_low"]
}
```

Rules enforced by `his.data.dataset`:
- Maximum six adapters per sample.
- Personality traits: five factors (AGR/CON/EXT/NEU/OPE), each with `_high`/`_low`. An example may activate at most one polarity per factor.
- Professions: `Doctor`, `Artist`, `Programmer`; choose zero or one.
- Missing slots are padded with `[-1, -1]`; invalid names raise an error.
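The rules above can be re-implemented in a few lines. This is a sketch for illustration only; the real checks live in `his.data.dataset`, and the `validate_adapters` name is an assumption:

```python
# Illustrative validator for the adapter rules described above.
TRAITS = ["AGR", "CON", "EXT", "NEU", "OPE"]
PROFESSIONS = {"Doctor", "Artist", "Programmer"}
VALID = {f"{t}_{p}" for t in TRAITS for p in ("high", "low")} | PROFESSIONS

def validate_adapters(names):
    if len(names) > 6:
        raise ValueError("at most six adapters per sample")
    unknown = set(names) - VALID
    if unknown:
        raise ValueError(f"invalid adapter names: {unknown}")
    if sum(n in PROFESSIONS for n in names) > 1:
        raise ValueError("choose zero or one profession")
    for t in TRAITS:
        if f"{t}_high" in names and f"{t}_low" in names:
            raise ValueError(f"both polarities of {t} activated")
    return names

validate_adapters(["Artist", "AGR_high", "OPE_low"])  # OK
```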
Entry script: `src/his/training/train_moe_lora.py`. Run it directly (`python src/his/training/train_moe_lora.py …`, optionally after `pip install -e .`); it wraps the Hugging Face `Trainer` while inserting MLora adapters into `['q_proj', 'k_proj', 'v_proj', 'o_proj']`.

- `MLoraConfig.adapter_names` lists the entire expert pool. By default it includes all 10 trait polarities and 3 professions. You can restrict it to a subset by editing the script or passing a serialized list (custom entrypoint).
- `insert_mode` governs how experts are reused:
  - `flat` (default): collapse all adapters into one global pool.
  - `layered`: supply a list of groups (e.g., `[trait_group, profession_group]`) reused at every decoder layer.
  - `alternate`: supply a list of groups (traits vs. professions) that rotate across layers.
- The loader ensures every adapter mentioned in `active_adapters` exists in the config. Even if the config contains both `*_high` and `*_low`, a single training record cannot activate both polarities of the same trait and cannot select multiple professions.
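A toy sketch of how the three insert modes could map adapter groups onto decoder layers. The function and group names here are hypothetical, not the script's actual code:

```python
# Hypothetical illustration of flat / layered / alternate expert reuse.
def experts_for_layer(layer, mode, groups):
    """Return what a given decoder layer sees under each insert mode."""
    if mode == "flat":          # collapse everything into one global pool
        return [name for g in groups for name in g]
    if mode == "layered":       # every layer sees all groups, kept separate
        return groups
    if mode == "alternate":     # groups rotate across consecutive layers
        return groups[layer % len(groups)]
    raise ValueError(f"unknown insert_mode: {mode}")

traits = ["AGR_high", "AGR_low", "EXT_high"]
professions = ["Doctor", "Artist"]
print(experts_for_layer(0, "alternate", [traits, professions]))  # trait group
print(experts_for_layer(1, "alternate", [traits, professions]))  # profession group
```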
Single GPU:

```bash
python src/his/training/train_moe_lora.py \
  --num_epochs 1 \
  --batch_size 4 \
  --micro_batch_size 2 \
  --limit_train_samples 32 \
  --limit_eval_samples 8 \
  --output_dir models/test_quick
```

Multi GPU (4 cards with `torchrun`; device assignment is handled via `LOCAL_RANK`):

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 \
torchrun --nproc_per_node=4 --master_port=29501 \
  src/his/training/train_moe_lora.py \
  --num_epochs 1 \
  --batch_size 8 \
  --micro_batch_size 2 \
  --limit_train_samples 32 \
  --limit_eval_samples 8 \
  --output_dir models/test_quick_ddp
```

Notes:
- Default dev-mode limits: the first 10 training and 2 eval samples unless overridden.
- `--cuda_devices` is optional; setting `CUDA_VISIBLE_DEVICES` before launching is sufficient.
- Checkpoints contain `adapter_model.bin` and `adapter_config.json`. Full base-model weights remain untouched.
`src/his/inference/inference_cli.py` loads the base model in 8-bit, stitches in saved adapters, and exposes a Fire CLI. Paths can be passed via CLI flags or environment variables (`HIS_BASE_MODEL`, `HIS_LORA_WEIGHTS`, `HIS_SCALE_PATH`, `HIS_SCALES_DIR`). Use `--mode single` for a one-shot completion, or `--mode multi` to keep a running dialogue for `max_turns` user–assistant exchanges.

- Specify the adapter checkpoint via `--lora_weights` (defaults to `models/1116_artist`). Pass a checkpoint such as `models/test_pkg_cuda0123` to load your latest finetune.
- Provide adapters to activate via `--active_adapter_names`. The script sanitizes the list: duplicates are removed, opposing polarities trimmed, and professions restricted to zero or one selection. If you request nothing, it falls back to `["Artist"]`.
- Use `--cuda_devices` to pin inference to specific GPUs (e.g., `--cuda_devices 0`). Otherwise the driver respects `CUDA_VISIBLE_DEVICES`.
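The sanitization behavior described above can be sketched as follows. This is an assumption about precedence (first occurrence wins), not the CLI's actual code, and `sanitize` is a hypothetical name:

```python
# Illustrative sketch of the CLI's adapter-list sanitization.
PROFESSIONS = {"Doctor", "Artist", "Programmer"}

def sanitize(names):
    seen, out, profession_taken, polarity = set(), [], False, {}
    for n in names:
        if n in seen:
            continue                       # drop duplicates
        seen.add(n)
        if n in PROFESSIONS:
            if profession_taken:
                continue                   # keep at most one profession
            profession_taken = True
        elif "_" in n:
            trait, pol = n.rsplit("_", 1)
            if trait in polarity:
                continue                   # trim the opposing polarity
            polarity[trait] = pol
        out.append(n)
    return out or ["Artist"]               # fall back to ["Artist"]

print(sanitize(["Doctor", "Doctor", "EXT_low", "EXT_high", "Artist"]))
# → ['Doctor', 'EXT_low']
```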
Example: single adapter (profession only):

```bash
python src/his/inference/inference_cli.py \
  --mode multi \
  --max_turns 1 \
  --user_prompt "What's your profession?" \
  --active_adapter_names '["Artist"]' \
  --lora_weights models/test_pkg_cuda0123 \
  --cuda_devices 0
```

Example: multiple adapters (profession + two traits):

```bash
python src/his/inference/inference_cli.py \
  --mode multi \
  --max_turns 1 \
  --user_prompt "Would you like to hang out this weekend?" \
  --active_adapter_names '["Doctor","EXT_low","AGR_high"]' \
  --lora_weights models/test_pkg_cuda0123 \
  --cuda_devices 0
```

Three primary evaluation utilities now live under `his.evaluation`:
- Scale evaluations (`his.evaluation.scale_eval`)
  - Uses BF-marker-100 (20 prompts/trait) and the custom profession scale (20 prompts/profession).
  - Implements an interactive questioning loop: ChatGPT asks the finetuned agent follow-up questions, then scores the response via self-consistency voting.
- Open-ended situational tests (`his.evaluation.situation_dialog`)
  - Eight scenarios mixing personalities and professions.
  - Each dialogue runs for up to four turns; GPT judges detect which identities manifested.
  - Experiments cover 971 identity combinations, mirroring the paper's Section 4 results.
- Human/GPT evaluation helpers (`his.utils.gpt`, `his.evaluation.gpt_interviewer`)
  - Provide anonymized and reversed dialogues for human annotation or GPT-based judging.
  - Require `OPENAI_API_KEY`; set `OPENAI_CHAT_MODEL` and `OPENAI_MAX_TOKENS` as desired.
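The self-consistency voting step used by the scale evaluation can be sketched as a simple majority vote over repeated judge calls. The function name and the tie-breaking rule (earliest score among the most frequent wins) are assumptions for illustration:

```python
from collections import Counter

# Minimal sketch of self-consistency voting: the judge scores the same
# response several times and the majority score is kept.
def self_consistency_vote(scores):
    counts = Counter(scores)
    top = max(counts.values())
    # Earliest score among the most frequent ones breaks ties.
    return next(s for s in scores if counts[s] == top)

votes = [4, 5, 4, 3, 4]   # e.g., five judge calls on one Likert item
print(self_consistency_vote(votes))  # → 4
```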
Scale results and situational transcripts are saved under `scales/` by default. Manual evaluation templates live in `human_evaluation/`.
If you use HIS or HIRPF in your work, please cite:
```bibtex
@article{sun2024identity,
  title={Identity-driven hierarchical role-playing agents},
  author={Sun, Libo and Wang, Siyuan and Huang, Xuanjing and Wei, Zhongyu},
  journal={arXiv preprint arXiv:2407.19412},
  year={2024}
}
```