StoryCoder is a narrative reformulation framework that transforms code generation problems into coherent natural language narratives, guiding LLMs toward more structured reasoning and better algorithmic strategies.
Existing approaches augment reasoning steps or inject specific structure into how models think, but leave scattered problem conditions unchanged. StoryCoder addresses this by reorganizing task representation itself: converting fragmented, instruction-like problem statements into structured narratives that provide richer contextual structure than simple rephrasings.
Each narrative consists of three deliberate components, guided by the selected algorithm and genre:
- Task Overview: Presents the coding objective within a narrative frame, integrating scattered conditions into a coherent system.
- Constraints: Reframes input ranges, time limits, and rules as natural restrictions in the story.
- Example Input/Output: Integrates sample test cases into contextual scenarios, preserving formal coding task requirements within the narrative space.
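As an illustration, one narrative variant with these three components might be modeled as a simple record before being rendered into the header format that appears in the output JSONL. This is a hypothetical sketch for clarity only; the actual prompt assembly lives in `instruction_template.py` and may be structured differently:

```python
from dataclasses import dataclass


@dataclass
class Narrative:
    """One narrative variant of a coding problem (illustrative only)."""

    algorithm_category: str  # e.g. "Dynamic Programming"
    narrative_genre: str     # e.g. "Fantasy Adventure"
    task_overview: str       # the coding objective told as a coherent story
    constraints: str         # input ranges / limits as in-story restrictions
    example_io: str          # sample test cases woven into the scenario

    def render(self) -> str:
        """Render in the dash-prefixed header format used by the output JSONL."""
        return (
            f"- Algorithm Category: {self.algorithm_category}\n"
            f"- Narrative Genre: {self.narrative_genre}\n"
            f"- Task Overview: {self.task_overview}\n"
            f"- Constraints: {self.constraints}\n"
            f"- Example Input/Output: {self.example_io}"
        )
```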
```
StoryCoder/
├── convert_to_narrative.py   # Step 1: Generate narratives from coding problems
├── narrative_splitter.py     # Step 2: Split variants into per-variant JSONL files
├── instruction_template.py   # Prompt template for narrative conversion
├── data/                     # Input JSONL files (benchmark problems)
└── outputs/                  # Generated narrative JSONL files
```
```
pip install google-genai openai anthropic
```

API keys are read from environment variables:

```
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export GOOGLE_PROJECT_ID=...
export GOOGLE_LOCATION=...
```

Convert coding problems into narrative format using one of three supported backends: `gemini`, `chatgpt`, or `claude`.
```
python convert_to_narrative.py \
    --backend gemini \
    --input data/livecodebench_v6.jsonl \
    --output outputs/livecodebench_v6_narratives.jsonl \
    --n_variants 5
```

Arguments:
| Argument | Description | Default |
|---|---|---|
| `--backend` | LLM backend (`gemini`, `chatgpt`, `claude`) | required |
| `--input` | Path to input JSONL file | required |
| `--output` | Path to output JSONL file | required |
| `--n_variants` | Number of narrative variants per problem | 5 |
The input JSONL file should follow the same format as the LiveCodeBench code-generation dataset files. Each line should contain a sample with at least the following fields:
```json
{
  "question_id": "unique_id",
  "question_content": "Problem statement here..."
}
```

The output JSONL file appends a `narratives` field to each problem:
```json
{
  "question_id": "unique_id",
  "question_content": "...",
  "narratives": [
    "- Algorithm Category: Dynamic Programming\n- Narrative Genre: Fantasy Adventure\n- Task Overview: ...",
    "..."
  ]
}
```

Split the multi-variant output from Step 1 into individual JSONL files, one per narrative variant. Each output file replaces `question_content` with the narrative text (with the Algorithm Category and Narrative Genre headers stripped), making it directly compatible with the LiveCodeBench evaluation pipeline.
```
python narrative_splitter.py \
    --input outputs/livecodebench_v6_narratives.jsonl \
    --output_dir outputs/split/
```

This produces N files named `livecodebench_v6_narratives_narrative_1.jsonl` through `livecodebench_v6_narratives_narrative_N.jsonl` in the specified output directory.
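The core of Step 2 can be sketched as follows. This is an illustrative reimplementation, not the repository code, and it assumes the header layout shown in the Step 1 output example:

```python
import json
import re
from pathlib import Path

# Matches the two metadata header lines prepended to each narrative variant.
HEADER = re.compile(r"^- Algorithm Category: .*\n- Narrative Genre: .*\n", re.MULTILINE)


def split_narratives(input_path: str, output_dir: str) -> list[Path]:
    """Write one JSONL file per narrative variant, replacing question_content
    with the header-stripped narrative text of that variant."""
    with open(input_path, encoding="utf-8") as f:
        samples = [json.loads(line) for line in f]

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(input_path).stem
    n_variants = len(samples[0]["narratives"])

    paths = []
    for i in range(n_variants):
        path = out_dir / f"{stem}_narrative_{i + 1}.jsonl"
        with open(path, "w", encoding="utf-8") as f:
            for sample in samples:
                record = dict(sample)
                # Strip the Algorithm Category / Narrative Genre headers.
                record["question_content"] = HEADER.sub(
                    "", sample["narratives"][i], count=1
                )
                del record["narratives"]
                f.write(json.dumps(record) + "\n")
        paths.append(path)
    return paths
```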
Each per-variant JSONL file produced in Step 2 can be used directly as the input dataset for LiveCodeBench evaluation. Pass the narrative JSONL file wherever LiveCodeBench expects a benchmark dataset file. No other changes to the evaluation pipeline are needed.
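Before running evaluation, a quick sanity check on a split file can catch format problems early. This helper is not part of the repository; it simply assumes the Step 2 output layout described above:

```python
import json


def sanity_check(path: str) -> int:
    """Verify a per-variant JSONL file is ready for evaluation: every line
    must parse as JSON and carry question_id plus a header-free
    question_content. Returns the number of samples checked."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)
            assert "question_id" in sample and "question_content" in sample
            # The metadata headers should already be stripped by Step 2.
            assert not sample["question_content"].startswith("- Algorithm Category:")
            count += 1
    return count
```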
The code in this repository is licensed under the MIT License. See LICENSE for details. The paper and associated content are licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).
