StoryCoder is a narrative reformulation framework that transforms code generation problems into coherent natural language narratives, guiding LLMs toward more structured reasoning and better algorithmic strategies.
Existing approaches augment reasoning steps or inject specific structure into how models think, but leave scattered problem conditions unchanged. StoryCoder addresses this by reorganizing task representation itself: converting fragmented, instruction-like problem statements into structured narratives that provide richer contextual structure than simple rephrasings.
Each narrative consists of three deliberate components, guided by the selected algorithm and genre:
- Task Overview: Presents the coding objective within a narrative frame, integrating scattered conditions into a coherent system.
- Constraints: Reframes input ranges, time limits, and rules as natural restrictions in the story.
- Example Input/Output: Integrates sample test cases into contextual scenarios, preserving formal coding task requirements within the narrative space.
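As an illustration, one narrative variant with these three components might be modeled as a simple record before being rendered into the header format that appears in the output JSONL. This is a hypothetical sketch for clarity only; the actual prompt assembly lives in `instruction_template.py` and may be structured differently:

```python
from dataclasses import dataclass


@dataclass
class Narrative:
    """One narrative variant of a coding problem (illustrative only)."""

    algorithm_category: str  # e.g. "Dynamic Programming"
    narrative_genre: str     # e.g. "Fantasy Adventure"
    task_overview: str       # the coding objective told as a coherent story
    constraints: str         # input ranges / limits as in-story restrictions
    example_io: str          # sample test cases woven into the scenario

    def render(self) -> str:
        """Render in the dash-prefixed header format used by the output JSONL."""
        return (
            f"- Algorithm Category: {self.algorithm_category}\n"
            f"- Narrative Genre: {self.narrative_genre}\n"
            f"- Task Overview: {self.task_overview}\n"
            f"- Constraints: {self.constraints}\n"
            f"- Example Input/Output: {self.example_io}"
        )
```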
```
StoryCoder/
├── convert_to_narrative.py   # Step 1: Generate narratives from coding problems
├── narrative_splitter.py     # Step 2: Split variants into per-variant JSONL files
├── instruction_template.py   # Prompt template for narrative conversion
├── data/                     # Input JSONL files (benchmark problems)
└── outputs/                  # Generated narrative JSONL files
```
```
pip install google-genai openai anthropic
```

API keys are read from environment variables:

```
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export GOOGLE_PROJECT_ID=...
export GOOGLE_LOCATION=...
```

Convert coding problems into narrative format using one of three supported backends: `gemini`, `chatgpt`, or `claude`.
```
python convert_to_narrative.py \
    --backend gemini \
    --input data/livecodebench_v6.jsonl \
    --output outputs/livecodebench_v6_narratives.jsonl \
    --n_variants 5
```

Arguments:
| Argument | Description | Default |
|---|---|---|
| `--backend` | LLM backend (`gemini`, `chatgpt`, `claude`) | required |
| `--input` | Path to input JSONL file | required |
| `--output` | Path to output JSONL file | required |
| `--n_variants` | Number of narrative variants per problem | 5 |
The input JSONL file should follow the same format as the LiveCodeBench code-generation dataset files. Each line should contain a sample with at least the following fields:
```json
{
  "question_id": "unique_id",
  "question_content": "Problem statement here..."
}
```

The output JSONL file appends a `narratives` field to each problem:
```json
{
  "question_id": "unique_id",
  "question_content": "...",
  "narratives": [
    "- Algorithm Category: Dynamic Programming\n- Narrative Genre: Fantasy Adventure\n- Task Overview: ...",
    "..."
  ]
}
```

Split the multi-variant output from Step 1 into individual JSONL files, one per narrative variant. Each output file replaces `question_content` with the narrative text (with the Algorithm Category and Narrative Genre headers stripped), making it directly compatible with the LiveCodeBench evaluation pipeline.
```
python narrative_splitter.py \
    --input outputs/livecodebench_v6_narratives.jsonl \
    --output_dir outputs/split/
```

This produces N files named `livecodebench_v6_narratives_narrative_1.jsonl` through `livecodebench_v6_narratives_narrative_N.jsonl` in the specified output directory.
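The core of Step 2 can be sketched as follows. This is an illustrative reimplementation, not the repository code, and it assumes the header layout shown in the Step 1 output example:

```python
import json
import re
from pathlib import Path

# Matches the two metadata header lines prepended to each narrative variant.
HEADER = re.compile(r"^- Algorithm Category: .*\n- Narrative Genre: .*\n", re.MULTILINE)


def split_narratives(input_path: str, output_dir: str) -> list[Path]:
    """Write one JSONL file per narrative variant, replacing question_content
    with the header-stripped narrative text of that variant."""
    with open(input_path, encoding="utf-8") as f:
        samples = [json.loads(line) for line in f]

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(input_path).stem
    n_variants = len(samples[0]["narratives"])

    paths = []
    for i in range(n_variants):
        path = out_dir / f"{stem}_narrative_{i + 1}.jsonl"
        with open(path, "w", encoding="utf-8") as f:
            for sample in samples:
                record = dict(sample)
                # Strip the Algorithm Category / Narrative Genre headers.
                record["question_content"] = HEADER.sub(
                    "", sample["narratives"][i], count=1
                )
                del record["narratives"]
                f.write(json.dumps(record) + "\n")
        paths.append(path)
    return paths
```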
Each per-variant JSONL file produced in Step 2 can be used directly as the input dataset for LiveCodeBench evaluation. Pass the narrative JSONL file wherever LiveCodeBench expects a benchmark dataset file. No other changes to the evaluation pipeline are needed.
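Before running evaluation, a quick sanity check on a split file can catch format problems early. This helper is not part of the repository; it simply assumes the Step 2 output layout described above:

```python
import json


def sanity_check(path: str) -> int:
    """Verify a per-variant JSONL file is ready for evaluation: every line
    must parse as JSON and carry question_id plus a header-free
    question_content. Returns the number of samples checked."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)
            assert "question_id" in sample and "question_content" in sample
            # The metadata headers should already be stripped by Step 2.
            assert not sample["question_content"].startswith("- Algorithm Category:")
            count += 1
    return count
```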
The code in this repository is licensed under the MIT License. See LICENSE for details. The paper and associated content are licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).
