gu-ni/StoryCoder

StoryCoder: Narrative Reformulation for Structured Reasoning in LLM Code Generation

Overview

StoryCoder is a narrative reformulation framework that transforms code generation problems into coherent natural language narratives, guiding LLMs toward more structured reasoning and better algorithmic strategies.

Existing approaches augment reasoning steps or inject specific structure into how models think, but leave scattered problem conditions unchanged. StoryCoder addresses this by reorganizing the task representation itself: it converts fragmented, instruction-like problem statements into structured narratives that provide richer contextual structure than simple rephrasings.

Each narrative consists of three deliberate components, guided by the selected algorithm and genre:

  • Task Overview: Presents the coding objective within a narrative frame, integrating scattered conditions into a coherent system.
  • Constraints: Reframes input ranges, time limits, and rules as natural restrictions in the story.
  • Example Input/Output: Integrates sample test cases into contextual scenarios, preserving formal coding task requirements within the narrative space.

Repository Structure

StoryCoder/
├── convert_to_narrative.py       # Step 1: Generate narratives from coding problems
├── narrative_splitter.py         # Step 2: Split variants into per-variant JSONL files
├── instruction_template.py       # Prompt template for narrative conversion
├── data/                         # Input JSONL files (benchmark problems)
└── outputs/                      # Generated narrative JSONL files

Installation

pip install google-genai openai anthropic

API keys and Google project settings are read from environment variables:

export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export GOOGLE_PROJECT_ID=...
export GOOGLE_LOCATION=...
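
A small pre-flight check can catch a missing variable before any API call is made. The sketch below is illustrative only: the mapping from backend name to required variables is an assumption inferred from the exports above, and the helper name `missing_env_vars` is hypothetical rather than part of the repository.

```python
import os

# Assumed mapping from backend to environment variables (inferred from the
# exports above; not confirmed against the repository code).
REQUIRED_ENV = {
    "gemini": ["GOOGLE_PROJECT_ID", "GOOGLE_LOCATION"],
    "chatgpt": ["OPENAI_API_KEY"],
    "claude": ["ANTHROPIC_API_KEY"],
}


def missing_env_vars(backend, environ=None):
    """Return the required variables that are unset or empty for `backend`."""
    env = os.environ if environ is None else environ
    if backend not in REQUIRED_ENV:
        raise ValueError(f"unknown backend: {backend!r}")
    return [name for name in REQUIRED_ENV[backend] if not env.get(name)]
```

Calling `missing_env_vars("gemini")` before Step 1 returns an empty list when the environment is ready, and the names of the variables still to be exported otherwise.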

Usage

Step 1: Generate Narratives

Convert coding problems into narrative format using one of three supported backends: gemini, chatgpt, or claude.

python convert_to_narrative.py \
    --backend gemini \
    --input data/livecodebench_v6.jsonl \
    --output outputs/livecodebench_v6_narratives.jsonl \
    --n_variants 5

Arguments:

Argument       Description                                 Default
--backend      LLM backend (gemini, chatgpt, claude)       required
--input        Path to input JSONL file                    required
--output       Path to output JSONL file                   required
--n_variants   Number of narrative variants per problem    5
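
For readers who want to wrap or mirror this interface, the arguments above could be declared with `argparse` roughly as follows. This is a sketch under the assumption that the script uses the standard library parser; the function name `build_parser` and the help strings are illustrative, and the actual declaration in convert_to_narrative.py may differ.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line arguments."""
    parser = argparse.ArgumentParser(
        description="Convert coding problems into narrative format."
    )
    parser.add_argument("--backend", required=True,
                        choices=["gemini", "chatgpt", "claude"],
                        help="LLM backend")
    parser.add_argument("--input", required=True,
                        help="Path to input JSONL file")
    parser.add_argument("--output", required=True,
                        help="Path to output JSONL file")
    parser.add_argument("--n_variants", type=int, default=5,
                        help="Number of narrative variants per problem")
    return parser
```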

Input Format

The input JSONL file should follow the same format as the LiveCodeBench code generation dataset files. Each line is one sample containing at least the following fields:

{
  "question_id": "unique_id",
  "question_content": "Problem statement here..."
}
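
Before spending API calls, it can help to confirm that every input line carries the two required fields. The following is a minimal validation sketch; `parse_problems` is a hypothetical helper, not a function provided by the repository.

```python
import json

REQUIRED_FIELDS = ("question_id", "question_content")


def parse_problems(lines):
    """Parse JSONL lines, raising ValueError if a required field is missing."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            raise ValueError(f"line {lineno}: missing fields {missing}")
        problems.append(record)
    return problems
```

It accepts any iterable of lines, so an open file handle can be passed directly: `with open(path, encoding="utf-8") as f: problems = parse_problems(f)`.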

Output Format

The output JSONL file appends a narratives field to each problem:

{
  "question_id": "unique_id",
  "question_content": "...",
  "narratives": [
    "- Algorithm Category: Dynamic Programming\n- Narrative Genre: Fantasy Adventure\n- Task Overview: ...",
    "..."
  ]
}
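
Each narrative string begins with header lines naming the algorithm category and genre. Assuming the headers always follow the `- Label: value` layout of the sample above, they could be read back with a short regular expression. The helper name `narrative_metadata` is illustrative.

```python
import re

# Matches the two header lines shown in the sample output (assumed layout).
HEADER_RE = re.compile(
    r"^- (Algorithm Category|Narrative Genre):[ \t]*(.*)$", re.MULTILINE
)


def narrative_metadata(narrative):
    """Return the header values found in a narrative, keyed by label."""
    return {label: value.strip() for label, value in HEADER_RE.findall(narrative)}
```

This is handy for grouping variants by algorithm or genre when inspecting the generated file.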

Step 2: Split Narratives into Per-Variant Files

Split the multi-variant output from Step 1 into individual JSONL files, one per narrative variant. Each output file replaces question_content with the narrative text (with the Algorithm Category and Narrative Genre headers stripped), making it directly compatible with the LiveCodeBench evaluation pipeline.

python narrative_splitter.py \
    --input outputs/livecodebench_v6_narratives.jsonl \
    --output_dir outputs/split/

This produces N files, one per narrative variant, named livecodebench_v6_narratives_narrative_1.jsonl through livecodebench_v6_narratives_narrative_N.jsonl in the specified output directory.
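
To make the transformation concrete, here is a rough sketch of what such a split could look like. It is not the code of narrative_splitter.py: the header regex, the choice to drop the narratives field from the output records, and the function names are assumptions based only on the description and file naming above.

```python
import json
import re
from pathlib import Path

# Assumed header layout, taken from the sample output in Step 1.
HEADER_LINE = re.compile(
    r"^- (?:Algorithm Category|Narrative Genre):.*\n?", re.MULTILINE
)


def strip_headers(narrative):
    """Remove the Algorithm Category and Narrative Genre header lines."""
    return HEADER_LINE.sub("", narrative).lstrip()


def split_variants(input_path, output_dir):
    """Write one JSONL file per variant index and return the written paths."""
    input_path, output_dir = Path(input_path), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    per_variant = {}  # variant index -> list of records
    with input_path.open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            narratives = record.pop("narratives", [])
            for i, narrative in enumerate(narratives, start=1):
                # Keep all other fields; swap in the narrative as the question.
                variant = {**record, "question_content": strip_headers(narrative)}
                per_variant.setdefault(i, []).append(variant)
    written = []
    for i, records in sorted(per_variant.items()):
        out = output_dir / f"{input_path.stem}_narrative_{i}.jsonl"
        with out.open("w", encoding="utf-8") as f:
            for rec in records:
                f.write(json.dumps(rec, ensure_ascii=False) + "\n")
        written.append(out)
    return written
```

Because every other field of the record is carried over unchanged, each output line keeps its question_id, so results can be joined back to the original problems.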

Step 3: Evaluate with LiveCodeBench

Each per-variant JSONL file produced in Step 2 can be used directly as the input dataset for LiveCodeBench evaluation. Pass the narrative JSONL file wherever LiveCodeBench expects a benchmark dataset file. No other changes to the evaluation pipeline are needed.

License

The code in this repository is licensed under the MIT License. See LICENSE for details. The paper and associated content are licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).
