Large Language Models (LLMs) have improved substantially in alignment, yet their behavior remains highly sensitive to prompt phrasing. This brittleness has motivated automated prompt engineering, but most existing methods (i) require a task-specific training set, (ii) rely on expensive iterative optimization to produce a single dataset-level prompt, and (iii) must be rerun from scratch for each new task. We introduce TATRA, a dataset-free prompting method that constructs instance-specific few-shot prompts by synthesizing on-the-fly examples to accompany a user-provided instruction. TATRA requires no labeled training data and avoids task-specific optimization loops, while retaining the benefits of demonstration-based prompting. Across standard text classification benchmarks, TATRA matches or improves over strong prompt-optimization baselines that depend on training data and extensive search. On mathematical reasoning benchmarks, TATRA achieves state-of-the-art performance on GSM8K and DeepMath, outperforming methods that explicitly optimize prompts on those tasks. Our results suggest that per-instance construction of effective in-context examples is more important than running long, expensive optimization loops to produce a single prompt per task.
Figure 1. Comparison of TATRA to existing automated prompt-engineering methods. Most prior approaches require a task-specific training set and run expensive dataset-level optimization loops to produce a single prompt per task. In contrast, TATRA is training-free and dataset-free, constructing a small set of instance-specific few-shot demonstrations on the fly and aggregating predictions across rephrasings for robust per-sample prompting.
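The sketch below illustrates the per-sample idea described above: paraphrase the instruction, synthesize a few demonstrations for each paraphrase, query the model, and aggregate predictions by majority vote. This is only a conceptual sketch; the prompt templates and the `generate` function are placeholders, not the repository's actual implementation.

```python
from collections import Counter

def tatra_predict(instruction, x, generate, n=5, k=4):
    """Conceptual sketch of per-instance prompting: paraphrase the instruction
    n times, synthesize k demonstrations per paraphrase, and majority-vote over
    the resulting predictions. `generate(prompt)` stands in for an LLM call."""
    predictions = []
    for _ in range(n):
        # 1) Rephrase the user instruction (template is illustrative only).
        paraphrase = generate(f"Rephrase this instruction:\n{instruction}")
        # 2) Synthesize k labeled demonstrations on the fly.
        demos = generate(f"{paraphrase}\nWrite {k} example inputs with their answers.")
        # 3) Prompt with the synthesized demonstrations plus the actual input.
        answer = generate(f"{paraphrase}\n{demos}\nInput: {x}\nAnswer:")
        predictions.append(answer.strip())
    # 4) Aggregate across rephrasings for a robust per-sample prediction.
    return Counter(predictions).most_common(1)[0][0]
```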
To install the necessary environment, please follow these steps using native Python virtual environments:

- Create a virtual environment:

  ```bash
  python -m venv venv
  ```

- Activate the environment:

  ```bash
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

Note: The code relies on `vllm` for fast inference. Ensure you have a compatible GPU environment.
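As a quick sanity check that the install succeeded (assuming `vllm` installed cleanly into the active environment), you can print its version:

```bash
python -c "import vllm; print(vllm.__version__)"
```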
You can run experiments on text classification tasks and mathematical reasoning tasks using the provided scripts.
Use multi_task.py to run text classification benchmarks.
```bash
# Run SST2 task with default settings
python multi_task.py --task sst2

# Run with specific parameters (paraphrases n=5, in-context examples k=8, runs=5)
python multi_task.py --task sst2 --n 5 --k 8 --runs 5
```

Supported tasks: sst2, cr, mr, sst5, news.
Use math_tasks.py to run mathematical reasoning benchmarks.
```bash
# Run GSM8K task
python math_tasks.py --task gsm8k

# Run DeepMath task
python math_tasks.py --task deepmath
```

Supported tasks: medqa, gsm8k, deepmath, math500.
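The paraphrase and voting parameters work the same way as for the classification script (see the common arguments below); the values here are purely illustrative:

```bash
# Run MATH500 with 5 paraphrases and 5 voting runs
python math_tasks.py --task math500 --n 5 --runs 5
```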
Common arguments for both scripts:

- `--task`: Task name (required).
- `--model-path`: Path or HuggingFace ID of the model (default: `Qwen/Qwen2.5-7B-Instruct`).
- `--n`: Number of paraphrases per item.
- `--k`: Number of in-context examples.
- `--runs`: Number of repeated runs for majority voting.
- `--use-vllm`: Use vLLM for inference (1=yes, 0=no).
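For example, the following call combines several of these options for the MR classification task (the specific values are only illustrative):

```bash
# 5 paraphrases, 4 in-context examples, 3 voting runs, explicit model, vLLM inference
python multi_task.py --task mr --n 5 --k 4 --runs 3 \
    --model-path Qwen/Qwen2.5-7B-Instruct --use-vllm 1
```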
After running experiments, the results are saved in the `results/` directory as CSV files. You can use the `choose_best.py` script to analyze these results and find the best configuration.
```bash
python choose_best.py
```
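If you prefer to inspect the result files directly, a minimal pandas sketch is shown below. It assumes pandas is available and that the CSVs contain columns such as `task`, `n`, `k`, and `accuracy`; these names are illustrative and may not match the exact schema written by the scripts.

```python
import glob
import pandas as pd

# Load every result CSV written to results/ (column names below are assumed,
# not guaranteed to match the repository's actual output schema).
frames = [pd.read_csv(path) for path in glob.glob("results/*.csv")]
results = pd.concat(frames, ignore_index=True)

# Average accuracy over repeated runs for each (n, k) configuration,
# then print the best configuration per task.
summary = results.groupby(["task", "n", "k"], as_index=False)["accuracy"].mean()
best = summary.loc[summary.groupby("task")["accuracy"].idxmax()]
print(best)
```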