This repository hosts an independent research paper on managing large tool outputs within the finite context windows of language models. The paper documents 126 evaluation runs across two models, three corpus scales, and two task types — real experiments with measurable results.
Read the paper online: https://zircote.com/LRO
I'm an independent researcher — no lab, no institution, no grant funding. I spent my own time and money exploring what the academic research process looks like, largely in cheat mode (LLM-assisted experimentation, AI-driven analysis, the works). I wanted to understand the process, validate some assumptions I had about context window management, and see if the ideas held up under scrutiny.
The assumptions largely held up. The findings are real. But I've run out of both the funds and the f's to give to push this through a formal academic pipeline. I'm not pursuing publication or patent. The itch to be first faded somewhere between the API bills and the realization that properly validating these things in a laboratory setting costs more time and money than I had left to spend.
So here it is. Take it for what it's worth.
It's not here because I think it will change the field. It's here because:
- I did the work and it shouldn't rot in a private repo. 126 eval runs, two models, actual findings — that's worth sharing even without a formal stamp.
- Someone might find it useful. If you're building tool-augmented LLM systems and hitting context limits, the LRO pattern works. I know because I use it in production.
- I learned a lot about the process and maybe this helps someone else who's curious about independent research without institutional backing.
If you want to try LRO yourself, there's a reference implementation at zircote/fastmcp-lro. I'll be sharing other related work as well.
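For a rough sense of what the pattern looks like, here is a minimal sketch. This is illustrative only — the class and method names are hypothetical, not the fastmcp-lro API: a tool result under a size threshold is returned inline, while a large result is stored and replaced with a handle plus a short preview, and the model fetches slices on demand instead of carrying the full payload in context.

```python
import hashlib


class ResultStore:
    """Illustrative in-memory store for offloaded tool results.

    Hypothetical sketch of the LRO idea, not the fastmcp-lro API.
    """

    def __init__(self, inline_limit: int = 2000):
        self.inline_limit = inline_limit  # max chars returned inline
        self._store: dict[str, str] = {}

    def offload(self, result: str) -> dict:
        """Return the result inline if small, else a handle plus preview."""
        if len(result) <= self.inline_limit:
            return {"inline": result}
        handle = hashlib.sha256(result.encode()).hexdigest()[:12]
        self._store[handle] = result
        return {
            "handle": handle,          # opaque reference the model can pass back
            "size": len(result),       # so the model knows what it is skipping
            "preview": result[:200],   # enough context to decide whether to fetch
        }

    def fetch(self, handle: str, offset: int = 0, length: int = 2000) -> str:
        """Retrieve a slice of an offloaded result on demand."""
        return self._store[handle][offset:offset + length]


store = ResultStore(inline_limit=10)
print(store.offload("short"))               # small result stays inline
big = store.offload("x" * 100_000)          # large result is offloaded
print(big["size"], store.fetch(big["handle"], 0, 20))
```

The point is that the model's context only ever holds the handle and whatever slices it explicitly requests, which is what keeps large tool outputs from crowding out the rest of the window.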
| Path | Description |
|---|---|
| `paper/LRO-paper.md` | Main paper |
| `paper/specification.md` | Formal specification |
| `paper/references.md` | Bibliography |
| `CITATION.cff` | Citation metadata |
| `CONTRIBUTING.md` | How to contribute |
| `CODE_OF_CONDUCT.md` | Community guidelines |
| `LICENSE` | CC BY 4.0 |
I make no claims of formal academic rigor or laboratory-grade validity. I made a serious attempt to validate my assumptions, and they largely held — but I couldn't fund or sustain the kind of controlled study that would satisfy a review board.
If you have constructive feedback, corrections, ideas, or want to collaborate — genuinely welcome:
- **General feedback:** use the Feedback issue template
- **Error reports / corrections:** use the Errata issue template
- **Collaboration proposals:** use the Collaboration issue template
- **Open discussion:** visit GitHub Discussions
If you just want to bash the work without contributing anything, this isn't the repo for that.
If you reference this work, please use the citation metadata provided in CITATION.cff:
```bibtex
@misc{zircote2026lro,
  title  = {Large Result Offloading: Demand-Driven Context Management for Tool-Augmented Language Models},
  author = {zircote},
  year   = {2026},
  url    = {https://zircote.com/LRO}
}
```

This work is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).
You are free to share and adapt this material for any purpose, including commercially, as long as appropriate credit is given.