DySem: Uncovering Dynamic Semantic Components
of Large Language Models for Calculating Semantic Textual Similarity


🎉News

  • [2026/5] We release both the paper and code for DySem.

📌Overview

We propose DySem, a training-free framework for semantic textual similarity (STS). DySem uses multilingual consensus to identify the internal components of an LLM that are most relevant to semantics, and it moves beyond static representation spaces by dynamically selecting sample-specific semantic dimensions. Specifically, DySem constructs a text-dependent joint semantic set and computes similarity over this selected subset of dimensions. Across standard STS benchmarks and a range of base and instruction-tuned LLMs, it consistently outperforms strong training-free baselines while using substantially fewer dimensions.
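To make the idea of computing similarity over a selected dimensional subset concrete, here is a minimal sketch. It is not the DySem implementation: the selection rule used here (top-k dimensions by absolute value, per text) is a stand-in assumption, whereas DySem derives its semantic dimensions from multilingual consensus. The function names `top_k_dims`, `subset_cosine`, and `dynamic_similarity` are hypothetical.

```python
import math


def top_k_dims(vec, k):
    """Indices of the k entries with the largest absolute value.

    Assumption: this magnitude-based rule only illustrates per-sample
    selection; it is not DySem's multilingual-consensus criterion.
    """
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    return set(order[:k])


def subset_cosine(u, v, dims):
    """Cosine similarity restricted to the given dimension indices."""
    dot = sum(u[i] * v[i] for i in dims)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in dims))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in dims))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def dynamic_similarity(u, v, k):
    """Select dimensions per text, take their union as a joint set,
    and compare the two vectors only on that joint set."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    joint = top_k_dims(u, k) | top_k_dims(v, k)
    return subset_cosine(u, v, joint)
```

The point of the sketch is that the compared subspace changes with each input pair, in contrast to a static embedding space where every pair is compared over the same dimensions.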


📊Main Results

DySem achieves strong and consistent performance across standard STS benchmarks on both base and instruction-tuned LLMs. Across ten evaluated model settings, its two variants obtain the best and second-best average results. On base models, DySem reaches 78.98 on LLaMA2-7B and 80.54 on Qwen3-8B, outperforming AlignedWVA by up to 5.75 points and PromptEOL by up to 11.63 points. On instruction-tuned models, DySem achieves the best average score in every setting, including 81.20 on LLaMA3.1-8B-it and 80.16 on Qwen3-8B-it. Overall, these results show that dynamic semantic dimension selection effectively filters non-semantic noise and improves STS computation while using substantially fewer dimensions.

  • Base Models. DySem achieves strong and consistent performance on base LLMs, outperforming strong training-free baselines while using fewer dimensions.
  • Instruction-tuned Models. DySem also performs best across instruction-tuned LLMs, showing that dynamic semantic dimension selection effectively filters non-semantic variation introduced by instruction tuning.

✨Getting Started

Clone the repository and install DySem:

# Clone the repository
git clone https://github.com/szu-tera/DySem.git
cd DySem

# Install the package and dependencies
pip install -e .

Configure the causal LLMs to evaluate in configs/models.yaml:

models:
  - path: /path/to/local/causal-lm
    tag: organization/model-name

You can also evaluate a single model directly with environment variables:

MODEL_PATH=/path/to/local/causal-lm MODEL_TAG=organization/model-name bash run_dydim_eval.sh

Run the default DySem evaluation grid:

bash run_dydim_eval.sh

The default grid evaluates both prompt settings and semantic vector variants:

PROMPT_SETTINGS="english language-specific"
SEMANTIC_VECTORS="source mean"
LANGUAGE_COUNTS="12"
DIMENSION_SIZES="256 512 768 1024 1280 2048"
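As a rough guide to how much work the default grid implies, the sketch below enumerates the combinations of the four variables above. It assumes the script evaluates the full Cartesian product of these values for each model, which the text does not state explicitly; `build_grid` is a hypothetical helper, not part of the repository.

```python
from itertools import product

# Default values copied from the variables listed above.
PROMPT_SETTINGS = "english language-specific".split()
SEMANTIC_VECTORS = "source mean".split()
LANGUAGE_COUNTS = [int(x) for x in "12".split()]
DIMENSION_SIZES = [int(x) for x in "256 512 768 1024 1280 2048".split()]


def build_grid(prompts, vectors, language_counts, dimension_sizes):
    """Return every (prompt, vector, language count, dimension size) tuple.

    Assumption: the evaluation runs the full Cartesian product.
    """
    return list(product(prompts, vectors, language_counts, dimension_sizes))


grid = build_grid(PROMPT_SETTINGS, SEMANTIC_VECTORS, LANGUAGE_COUNTS, DIMENSION_SIZES)
print(len(grid))  # 2 * 2 * 1 * 6 = 24 configurations per model
```

Under that assumption, narrowing any one variable (as the smoke test further down does) shrinks the run proportionally.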

To reproduce a specific model setting, use the prepared scripts:

bash scripts/Qwen3-8b/EP.sh
bash scripts/Qwen3-8b/LP.sh

Translations are cached locally. We recommend running translation on CPU to avoid competing with the evaluated LLM for GPU memory:

TRANSLATION_DEVICE=cpu TRANSLATION_BATCH_SIZE=8 bash run_dydim_eval.sh

Evaluation artifacts are written to project-local directories:

translation_cache/  # generated translations
rank_cache/         # reusable language-ranking files
results/            # final evaluation CSV files
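Once a run finishes, the CSV files in results/ can be inspected with a few lines of standard-library Python. The sketch below only reports each file's header and row count, because the column layout of the result files is not documented here; `summarize_results` is a hypothetical helper, not part of the repository.

```python
import csv
from pathlib import Path


def summarize_results(results_dir="results"):
    """Map each CSV file name to (header columns, number of data rows).

    Assumption: the first row of each file is a header row.
    """
    summary = {}
    for path in sorted(Path(results_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        header = rows[0] if rows else []
        summary[path.name] = (header, max(len(rows) - 1, 0))
    return summary
```

Calling `summarize_results()` from the repository root and printing the returned dictionary gives a quick overview of which evaluation files exist and what columns they contain.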

For a quick configuration check without loading a model:

DYDIM_DRY_RUN=1 bash run_dydim_eval.sh

For a minimal STSBenchmark smoke test:

TASKS="STSBenchmark" LANGUAGE_COUNTS="1" DIMENSION_SIZES="256" PROMPT_SETTINGS="english" SEMANTIC_VECTORS="mean" bash run_dydim_eval.sh

🤝Acknowledgements

This project builds upon several open-source projects.

We sincerely thank the authors and contributors for their valuable work.


📨Contact


🎈Citation

If you find this work useful for your research, please consider citing our paper:

@article{zheng2026dysem,
  title={DySem: Uncovering Dynamic Semantic Components via Multilingual Consensus for Calculating Semantic Textual Similarity},
  author={Kaijie Zheng and Weiqin Wang and Yile Wang and Hui Huang},
  journal={arXiv preprint arXiv:2605.29751},
  year={2026}
}
