I am actively working on replacing this AI Version with my own but it takes time!

For more information see https://prioritytokens.arg-server.de/

Priority Tokens — Concept & Execution Plan

The Problem

Large language models struggle with long contexts. When you feed a model thousands of tokens of information, it tends to treat all of it roughly equally — meaning important instructions or facts buried deep in the context often get ignored or underweighted. This is sometimes called the "lost in the middle" problem and it's a well-documented limitation of current models.

Humans solve this naturally. We emphasize words by speaking louder, underlining, repeating, or using phrases like "this is the most important part." Models don't have an equivalent mechanism that the user can control.

The Idea

Priority Tokens introduce a simple markup syntax that lets users explicitly signal how much weight the model should give to a piece of text:

<<Priority1>> ... <<PrioEnd>>    ← low importance
<<Priority5>> ... <<PrioEnd>>    ← moderate importance
<<Priority10>> ... <<PrioEnd>>   ← highest importance

The model is fine-tuned to recognize these tokens and adjust its behavior accordingly — following high-priority instructions more strictly and recalling high-priority content more accurately, even when it appears deep in a long context.

The goal is to give users a lightweight, expressive way to guide model attention without changing the prompt structure or repeating themselves.

What "Prioritize" Means Concretely

Two target behaviors drive the training:

Instruction following — a <<Priority10>> instruction should be followed more reliably than a <<Priority1>> one, even if both are buried in a long doc
Recall accuracy — facts or context wrapped in high-priority tokens should be retained and referenced more accurately in the model's output

These are learned behaviors, not architectural changes. The model is taught through examples to respond differently to these tokens — similar to how instruction-tuned models learned to follow [INST] tags.

Technical Execution

Model: Qwen3-8B (open-source, well-documented, manageable size for hobbyist hardware)

Method: QLoRA supervised fine-tuning (4-bit quantized LoRA — makes training feasible on a single rented GPU within budget)

Tokenizer: 11 new special tokens (<<Priority1>> through <<Priority10>> plus <<PrioEnd>>) are added to the tokenizer. Their embedding vectors are initialized by copying from an existing special token rather than random init, so the model needs far less data to learn them.

Dataset: Synthetic training examples built around the core use case — long documents with high-priority content buried at varying positions, paired with outputs that correctly act on or recall that content. Includes contrastive pairs where swapping priority levels changes the expected output.

North star eval: A document ~6k tokens long with a <<Priority10>> fact at position ~3000, surrounded by <<Priority1>> noise. The model is asked a question whose correct answer requires that buried fact. Fine-tuned model vs base model, measured on how often each gets it right.

Scope & Constraints

Solo hobby project with a ~150 € compute budget
Training on rented GPU (RunPod RTX 4090, ~0.75 €/hr)
v1 focuses on Qwen3-8B only — larger models are a stretch goal
v1 validates the binary case (Priority10 vs Priority1) before expanding to the full 1–10 gradient
No full retraining from scratch — QLoRA fine-tuning only
Success criterion: a measurable accuracy improvement on the north star eval, not a production-ready model

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.github/workflows		.github/workflows
website		website
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

I am actively working on replacing this AI Version with my own but it takes time!

For more information see https://prioritytokens.arg-server.de/

Priority Tokens — Concept & Execution Plan

The Problem

The Idea

What "Prioritize" Means Concretely

Technical Execution

Scope & Constraints

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

I am actively working on replacing this AI Version with my own but it takes time!

For more information see https://prioritytokens.arg-server.de/

Priority Tokens — Concept & Execution Plan

The Problem

The Idea

What "Prioritize" Means Concretely

Technical Execution

Scope & Constraints

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages