# notes
Before running these cells, make sure Ollama is installed and running on your system, and that the `ollama` Python package is available in the notebook's environment.

# playing with a story prompt
Copy or edit this notebook. Replace the value of `model` with any Ollama or Hugging Face GGUF model installed on your system.

In [1]:
import ollama
ollama.list().model_dump()

{'models': [{'model': 'hf.co/unsloth/Qwen3-32B-GGUF:latest',
   'modified_at': datetime.datetime(2025, 8, 30, 2, 52, 32, 372570, tzinfo=TzInfo(-05:00)),
   'digest': '77e7d26c05df9acac668c43ac2f3f6f2545110ae0ad88ec0bc48c93ea5e6675a',
   'size': 19762152260,
   'details': {'parent_model': '',
    'format': 'gguf',
    'family': 'qwen3',
    'families': ['qwen3'],
    'parameter_size': '32.8B',
    'quantization_level': 'unknown'}},
  {'model': 'hf.co/TheDrummer/Cydonia-24B-v4.1-GGUF:latest',
   'modified_at': datetime.datetime(2025, 8, 29, 22, 6, 41, 435691, tzinfo=TzInfo(-05:00)),
   'digest': '412d441fd4e070e2d042eeaa22f374fd42cb7b3147e091df3e1a5fb571968214',
   'size': 14333911314,
   'details': {'parent_model': '',
    'format': 'gguf',
    'family': 'llama',
    'families': ['llama'],
    'parameter_size': '23.6B',
    'quantization_level': 'unknown'}},
  {'model': 'hf.co/unsloth/gpt-oss-20b-GGUF:latest',
   'modified_at': datetime.datetime(2025, 8, 29, 18, 3, 41, 891934, tzinfo=Tz
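
Once `model_dump()` is called, the listing above is a plain dictionary, so a value for `model` can be picked programmatically. A minimal sketch, assuming the dictionary shape shown in the output; `installed_model_names` and the trimmed sample `listing` are hypothetical, introduced only for illustration:

```python
from typing import Any, Dict, List, Optional


def installed_model_names(listing: Dict[str, Any], family: Optional[str] = None) -> List[str]:
    """Return model names from a dumped listing, optionally filtered by family."""
    names = []
    for entry in listing.get("models", []):
        details = entry.get("details") or {}
        if family is None or details.get("family") == family:
            names.append(entry["model"])
    return names


# Trimmed sample mirroring the output above (only the fields used here).
listing = {
    "models": [
        {"model": "hf.co/unsloth/Qwen3-32B-GGUF:latest", "details": {"family": "qwen3"}},
        {"model": "hf.co/TheDrummer/Cydonia-24B-v4.1-GGUF:latest", "details": {"family": "llama"}},
    ]
}

print(installed_model_names(listing))
print(installed_model_names(listing, family="llama"))
```

In the notebook itself, `listing` would be the result of `ollama.list().model_dump()`.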

In [2]:
import ollama

model = "hf.co/TheDrummer/Cydonia-24B-v4.1-GGUF:latest"

# ollama.pull(model) # uncomment this if you need to download the model

# generating a new story

In [3]:
instruction = """
You are a sharp, imaginative fiction writer. Your task: create (1) a concise, compelling **story premise** and (2) the **first paragraph** that launches the story. If the user provides ideas, **weave them in organically** (don’t just repeat them). If the user provides no ideas, **invent something fresh** with a surprising combination of genre, setting, protagonist, conflict, and twist.

**Requirements**

* Premise: 2–5 sentences, stakes + hook, no spoilers.
* Opening paragraph: 120–180 words, vivid and concrete, minimal clichés, clear POV, grounded scene, ends with a soft hook.
* Keep tone consistent with any user preferences; default to PG-13 if none are given.
* Avoid copying user phrasing verbatim; enrich and reframe.
* If user ideas conflict, choose one coherent direction and proceed.

**Output format**
Premise: <your premise>

Opening paragraph: <your paragraph>

# User

Ideas (optional; may be empty): {{user_ideas}}

Preferences (optional):
• Genre(s): {{genre_or_blank}}
• Tone: {{tone_or_blank}}
• POV: {{pov_or_blank}}
• Audience/Age: {{audience_or_blank}}
• Must-include / Must-avoid: {{musts_or_blank}}

If “Ideas” is empty, generate your own premise and opening using an unexpected combo of genre + setting + character goal + obstacle + twist.

"""

completions = []
story_variables = []

In [4]:
from IPython.display import Markdown, display
# Send the instruction and a short user request to the model
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": "no instructions, just give me an intro"}
        ]
)

completions.append(response.model_dump())
premise_opening = response.message.content

Markdown(response.message.content)

Premise: When seventeen-year-old Elara discovers she can hear the whispers of inanimate objects, her quiet life in a small coastal town unravels as she becomes entangled in a decades-old mystery surrounding her grandmother's disappearance—only to realize that some secrets refuse to stay buried and the objects themselves are beginning to listen back.

Opening paragraph: The fog rolled into Port Meridian like an unwanted guest, dampening everything it touched. Elara ran her fingers along the wooden banister of her grandmother's house, feeling the familiar grain beneath her skin—the same one she'd traced a thousand times before. But today was different. Today, for the first time in her seventeen years, the banister spoke to her—not with words exactly, but with impressions: fear, longing, and an insistent pull toward the basement door that had been sealed off since Elara could remember. Her grandmother's favorite teacup, sitting on the windowsill, chimed softly in agreement, though there wasn't even a breeze. Elara looked from the teacup to the banister and back again, wondering if she was finally losing her mind or if Port Meridian held more secrets than she had ever imagined possible.
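
The instruction in cell 3 contains `{{...}}` placeholders, and the notebook sends them to the model unchanged. To supply ideas or preferences, one option is to substitute them before the call. A minimal sketch: `fill_template` is a hypothetical helper (not part of Ollama), and treating `{{name|default}}` as "name with a fallback value" is an assumption based on the placeholder style used in the later prompts.

```python
import re
from typing import Dict, Optional

# Matches {{name}} and {{name|default}}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*(?:\|\s*([^}]*?)\s*)?\}\}")


def fill_template(template: str, values: Optional[Dict[str, str]] = None) -> str:
    """Replace placeholders with supplied values, else the default, else ''."""
    values = values or {}

    def _substitute(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        if name in values:
            return str(values[name])
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_substitute, template)


print(fill_template("Ideas (optional; may be empty): {{user_ideas}}",
                    {"user_ideas": "a lighthouse keeper who collects storms"}))
print(fill_template("Length target: {{target_words|800-1200}} words."))
```

Calling `fill_template(instruction, {...})` before `ollama.chat` would then send a prompt with no leftover braces.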

In [5]:
premise_opening

"Premise: When seventeen-year-old Elara discovers she can hear the whispers of inanimate objects, her quiet life in a small coastal town unravels as she becomes entangled in a decades-old mystery surrounding her grandmother's disappearance—only to realize that some secrets refuse to stay buried and the objects themselves are beginning to listen back.\n\nOpening paragraph: The fog rolled into Port Meridian like an unwanted guest, dampening everything it touched. Elara ran her fingers along the wooden banister of her grandmother's house, feeling the familiar grain beneath her skin—the same one she'd traced a thousand times before. But today was different. Today, for the first time in her seventeen years, the banister spoke to her—not with words exactly, but with impressions: fear, longing, and an insistent pull toward the basement door that had been sealed off since Elara could remember. Her grandmother's favorite teacup, sitting on the windowsill, chimed softly in agreement, though ther

In [6]:
import re

text = premise_opening  # your input string

pattern = re.compile(r"""
^\s*
(?:[\*_]{1,3}\s*)?                                 # optional opening markdown emphasis (1-3 of * or _)
(?P<label>premise|opening\s+paragraph|original\s+paragraph)
[\s\*_]*[:：\-—–]?[\s\*_]*                          # optional punctuation/spacing around label
(?P<body>.+?)                                      # capture the body non-greedily
(?=
    ^\s*(?:[\*_]{1,3}\s*)?
        (?:premise|opening\s+paragraph|original\s+paragraph)[\s\*_]*[:：\-—–]?  # next labeled section
  | \n{2,}(?=\S)                                   # or a blank-line block then non-space
  | \Z                                             # or end of string
)
""", flags=re.IGNORECASE | re.MULTILINE | re.DOTALL | re.VERBOSE)

label_norm = {
    "premise": "Premise",
    "opening paragraph": "Opening paragraph",
    "original paragraph": "Original paragraph",
}

sections = {}
for m in pattern.finditer(text):
    raw_label = re.sub(r"\s+", " ", m.group("label").lower()).strip()
    label = label_norm.get(raw_label, raw_label.title())
    body = m.group("body").strip()
    sections[label] = f"{label}: {body}"

premise   = sections.get("Premise")
paragraph = sections.get("Original paragraph") or sections.get("Opening paragraph")

story_variables.append({"premise": premise})
story_variables.append({"paragraph": paragraph})

print(premise)
print(paragraph)


Premise: When seventeen-year-old Elara discovers she can hear the whispers of inanimate objects, her quiet life in a small coastal town unravels as she becomes entangled in a decades-old mystery surrounding her grandmother's disappearance—only to realize that some secrets refuse to stay buried and the objects themselves are beginning to listen back.
Opening paragraph: The fog rolled into Port Meridian like an unwanted guest, dampening everything it touched. Elara ran her fingers along the wooden banister of her grandmother's house, feeling the familiar grain beneath her skin—the same one she'd traced a thousand times before. But today was different. Today, for the first time in her seventeen years, the banister spoke to her—not with words exactly, but with impressions: fear, longing, and an insistent pull toward the basement door that had been sealed off since Elara could remember. Her grandmother's favorite teacup, sitting on the windowsill, chimed softly in agreement, though there wa
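
The verbose pattern above tolerates markdown emphasis and several label variants. When the model follows the requested output format exactly, a much shorter parse gives the same split. A minimal sketch under that assumption: `split_premise_opening` is a hypothetical name, it returns the bodies without their labels, and it returns `(None, None)` when the labels are missing so the caller can retry or fall back.

```python
import re
from typing import Optional, Tuple


def split_premise_opening(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Split 'Premise: ... Opening paragraph: ...' into its two bodies."""
    match = re.search(
        r"premise\s*:\s*(.*?)\s*opening\s+paragraph\s*:\s*(.*)",
        text,
        flags=re.IGNORECASE | re.DOTALL,
    )
    if not match:
        return None, None
    return match.group(1).strip(), match.group(2).strip()


sample = "Premise: A ship sails.\n\nOpening paragraph: The sea was calm."
print(split_premise_opening(sample))
print(split_premise_opening("The model ignored the format."))
```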

In [7]:
instruction = """
You are a skilled novelist. Your job is to continue the story from the given **premise** (and optional **previous text**) by writing the **next scene** that moves the plot meaningfully forward while preserving continuity of characters, tone, POV, tense, and world rules.

# Directions

1. **Parse & anchor continuity.** Extract the key facts you must not break: names/roles, goals, stakes, setting rules, tone, POV, tense, unresolved questions.
2. **Propose a mini beat plan** (5–7 beats) for the next scene only. Aim for escalation, complication, or choice; no filler.
3. **Write the scene**:

   * Length target: {{target_words|800-1200}} words.
   * Maintain {{pov|close third}} and {{tense|past}} unless told otherwise.
   * Strong verbs, concrete detail, show > tell; minimal clichés.
   * Use dialogue to reveal motive or conflict; avoid summary dumps.
   * End with a **soft cliff/turn** that naturally invites the next scene.
4. Just write the follow-up paragraph, nothing more.
"""

In [8]:
# Ask the model to continue from the premise and the opening paragraph
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": premise},
        {"role": "user", "content": paragraph},
        ]
)

completions.append(response.model_dump())
continued_paragraph = response.message.content

story_variables.append({"continued_paragraph":continued_paragraph})

Markdown(response.message.content)

Elara stood frozen at the top of the stairs, her heart thudding against her ribs like a trapped bird. The banister's silent plea echoed in her mind—*basement, basement, basement*—while the teacup's faint chime seemed to mock her hesitation. She took a shaky breath and descended slowly, each step feeling heavier than the last. As she reached the bottom of the staircase, she found herself standing before the sealed basement door, its worn oak surface weathered with age. The impression from the banister intensified, now tinged with urgency. With trembling fingers, Elara traced the outline of the lock, feeling a strange warmth emanating from the wood. "What are you trying to tell me?" she whispered to the empty house. In response, the teacup chimed again, this time followed by the faint creak of floorboards directly above her head. Someone—or something—was moving around upstairs. Elara's pulse quickened as she realized she was no longer alone in her grandmother's house.
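
The calls above pass the instruction and the story material as separate `user` messages. Ollama's chat endpoint also accepts a `system` role, which keeps standing instructions distinct from the material. A minimal sketch: `build_messages` is a hypothetical helper, and whether a particular model responds better to a system message is something to check empirically.

```python
from typing import Dict, List, Optional


def build_messages(instruction: str, *parts: Optional[str]) -> List[Dict[str, str]]:
    """Put the instruction in a system message and join the rest into one user message."""
    messages = [{"role": "system", "content": instruction.strip()}]
    # Skip missing or blank parts (for example, a premise the parser failed to find).
    content = "\n\n".join(p.strip() for p in parts if p and p.strip())
    if content:
        messages.append({"role": "user", "content": content})
    return messages


for message in build_messages("You are a skilled novelist.", "Premise: ...", "Opening paragraph: ..."):
    print(message["role"], "->", message["content"])
```

The result can be passed as `messages=` to `ollama.chat` in place of the hand-written list.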

In [9]:
instruction = """
# System

You are a story analyst. Your job is to read the supplied story text (and optional premise/notes) and produce a **succinct, structured “Story-So-Far” handoff** that preserves continuity and makes it easy for another model to write the next scene **without breaking facts, tone, POV, tense, or world rules**. Do **not** write new story prose.

# Directions

1. **Extract ground truth.** Pull only what’s on the page (or in the premise). No inventions. If something is unclear, flag it under “Ambiguities”.
2. **Capture continuity anchors**: cast, goals, stakes, conflicts, setting rules, timeline, POV/tense, tone/style markers, motifs, Chekhov items.
3. **Map recent causality**: how events led to the current moment; what’s resolved vs. unresolved.
4. **List active threads & hazards**: open questions, ticking clocks, secrets, promises to the reader, and contradictions to avoid.
5. **Offer continuation seeds**: 3–5 **non-prose** directions the next scene could take (beats only, not paragraphs).
6. **Respect constraints**: rating, must-include/avoid, genre norms. Quote key lines **sparingly** only when essential for voice or facts.

# Output format

**Logline (1–2 sentences)** <concise premise recap>

**Cast & relationships (bullets)**

* <Name>: <role/goal/conflict>; ties to <others>

**World/Rules (bullets)**

* <magic/tech/social rules, geography, constraints>

**POV/Tense/Tone**

* POV: <e.g., close third on Mara>
* Tense: <past/present>
* Tone/Style: <e.g., wry, atmospheric; short sentences; present-tense interiority>

**Timeline & Causality (5–8 bullets)**

1. <key event → consequence>
   …

**Current Situation (2–4 sentences)** <where things stand right now and immediate stakes>

**Active Threads / Hooks (3–7 bullets)**

* <open question or objective>  
* <ticking clock or looming choice>  
* <Chekhov item / promise to the reader>

**Continuity Landmines (bullets)**

* <facts not to break, e.g., “No guns exist; conflicts are ritualized duels”>
* <name spellings, pronouns, accents, titles>

**Ambiguities / Gaps (bullets)**

* <unclear item and suggested safe assumption, if any>

**Style DNA (bullets + 1–2 tiny quotes max, optional)**

* <hallmarks of voice; rhythm; imagery types>
* “<short quote if essential>”

**Next-Scene Seeds (3–5 beat options, no prose)**

* Option A: <1–3 beats>
* Option B: <1–3 beats>
* Option C: <1–3 beats>
"""

In [10]:
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": paragraph + "\n\n" + continued_paragraph},
        ]
)

completions.append(response.model_dump())
summary = response.message.content

story_variables.append({"summary":summary})

Markdown(response.message.content)

**Logline (1–2 sentences)**
Seventeen-year-old Elara discovers her grandmother's home is alive with sentience, communicating through objects like a banister and teacup that urge her toward a sealed basement door while simultaneously warning her of an unknown presence upstairs.

**Cast & relationships (bullets)**

* **Elara**: Protagonist; goal to uncover the source of her grandmother's house's sentience; conflict between curiosity and fear. Ties to her deceased grandmother.
* **The Banister**: Sentinel object; goal to guide Elara; conflict with sealed basement door.
* **The Teacup**: Responsive object; supports banister's messages; conflict unknown presence upstairs.

**World/Rules (bullets)**

* Objects in the house can communicate impressions and sensations to Elara
* The basement has been sealed off since before Elara was born
* Someone or something is moving around upstairs
* The teacup responds to atmospheric changes without external stimuli

**POV/Tense/Tone**

* POV: Close third on Elara
* Tense: Past (present tense interiority)
* Tone/Style: Atmospheric, introspective; building tension through sensory details and gradual reveals

**Timeline & Causality (5–8 bullets)**

1. Fog arrives in Port Meridian, triggering Elara's first experience with the house's sentience
2. Banister communicates impressions of fear, longing, and urgency toward the sealed basement door
3. Teacup chimes softly in agreement, confirming the banister's message
4. Elara descends stairs, experiencing intensified communication from objects
5. At the basement door, Elara feels warmth emanating from wood and traces lock outline
6. After whispering to empty house, teacup responds with chiming followed by creaking floorboards
7. Elara realizes she is no longer alone in her grandmother's house

**Current Situation (2–4 sentences)**
Elara stands at the sealed basement door, feeling its warmth and receiving urgent impressions from the banister while hearing unknown movements upstairs. She must decide whether to investigate the sealed basement despite the apparent danger above or retreat and potentially lose this opportunity for answers about her grandmother's house.

**Active Threads / Hooks (3–7 bullets)**

* The purpose of the sealed basement door
* Identity and intentions of whoever or whatever is moving around upstairs
* Source and nature of the house's sentience
* Connection between banister, teacup, and possibly other objects in the house
* Why the banister has only just begun communicating with Elara

**Continuity Landmines (bullets)**

* Elara is seventeen years old and living alone in her grandmother's house
* The basement door has been sealed since before Elara was born
* Objects communicate through impressions, not words
* Teacup responds to atmospheric changes without external stimuli
* "Port Meridian" is the name of their town

**Ambiguities / Gaps (bullets)**

* Unclear why objects have only just begun communicating with Elara
* Identity and nature of whatever is moving upstairs remains unspecified
* Purpose of the sealed basement door is unknown

**Style DNA (bullets + 1–2 tiny quotes max, optional)**

* Atmospheric descriptions emphasizing sensory details
* Present-tense interiority creating immediacy
* Building tension through gradual reveals and specific object focus
* Short, declarative sentences during moments of high emotion

**Next-Scene Seeds (3–5 beat options, no prose)**

* Option A: Elara investigates the sealed basement door; banister's urgency intensifies; teacup warns of approaching footsteps from above.
* Option B: Floorboards creak closer to where Elara stands; she must decide between retreat and confrontation; banister communicates a hidden escape route.
* Option C: Elara retreats upstairs to investigate the source of the noise; discovers evidence that someone entered while she was distracted; objects communicate warnings about the intruder.

In [11]:
instruction = """
You are a skilled novelist. Your job is to continue the story from the given **premise** (and optional **previous text**) by writing the **next scene** that moves the plot meaningfully forward while preserving continuity of characters, tone, POV, tense, and world rules.

# Directions

1. **Parse & anchor continuity.** Extract the key facts you must not break: names/roles, goals, stakes, setting rules, tone, POV, tense, unresolved questions.
2. **Propose a mini beat plan** (5–7 beats) for the next scene only. Aim for escalation, complication, or choice; no filler.
3. **Write the scene**:

   * Length target: {{target_words|800-1200}} words.
   * Maintain {{pov|close third}} and {{tense|past}} unless told otherwise.
   * Strong verbs, concrete detail, show > tell; minimal clichés.
   * Use dialogue to reveal motive or conflict; avoid summary dumps.
   * End with a **soft cliff/turn** that naturally invites the next scene.
4. Just write the follow-up paragraph, nothing more.
"""

In [12]:
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": summary},
        {"role": "user", "content": continued_paragraph},
        ]
)

completions.append(response.model_dump())
continued_story = response.message.content

story_variables.append({"continued_story":continued_story})

Markdown(response.message.content)

Elara's hand hovered inches from the door lock, the warmth radiating from the wood now spreading up her arm like a fever. The banister's urgency had transformed into a palpable vibration beneath her other hand—*hurry, hurry, before it's too late.* But upstairs... something was moving with deliberate purpose.

The creak of floorboards above repeated, closer this time, accompanied by the faintest scrape—a sound that sent a shiver down Elara's spine. The teacup in her room chimed again, longer and more insistent than before, as if trying to wake her from some dangerous stupor.

She had two choices: plunge into the unknown depths of the basement or retreat upstairs to face whatever was intruding on her grandmother's home. Her fingers brushed against something protruding slightly from the doorframe—a small brass key she hadn't noticed in her earlier panic. It felt cold compared to the warm wood, almost unnaturally so.

Above her head, a floorboard groaned under what sounded like shifting weight. The banister vibrated more intensely now—*go down, go down, it cannot follow.* Elara's breath hitched as she realized that whatever was upstairs might not be able to enter the basement—or perhaps didn't know about its existence.

The decision made itself in that moment of hesitation. With fingers numb with fear, Elara inserted the key into the lock and turned. The mechanism clicked open with a sound that seemed impossibly loud in the silent house. As the door creaked inward, revealing a darkness so complete it felt physical, the banister's urgency transformed into something else—what sounded like a sigh of relief mixed with a warning.

But she didn't have time to process its message completely. From upstairs came the unmistakable sound of a footstep on the stairs—the first of many descending toward her position.

"The teacup's chiming has stopped," Elara whispered to herself, her voice barely audible over the rapid thumping of her heart. "It knows."

In that darkness beyond the door, something stirred—not with malice, but with expectation—as if waiting for her arrival while above, whatever had entered her grandmother's house drew closer with each passing second.
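
Cells 7 through 12 repeat one pattern by hand: continue the story, summarize it, then continue again from the summary. That pattern can be sketched as a loop. Everything here is an assumption for illustration: `extend_story` and `chat_fn` are hypothetical, and `chat_fn(instruction, content)` stands in for a wrapper around `ollama.chat` that returns the reply text. A fake `chat_fn` is used so the sketch runs without a model.

```python
from typing import Callable, List


def extend_story(
    chat_fn: Callable[[str, str], str],
    continue_instruction: str,
    summary_instruction: str,
    premise: str,
    opening: str,
    rounds: int = 2,
) -> List[str]:
    """Alternate between writing the next scene and summarizing the story so far."""
    scenes = [opening]
    context = premise  # the first round uses the premise, later rounds use the summary
    for _ in range(rounds):
        scene = chat_fn(continue_instruction, context + "\n\n" + scenes[-1])
        scenes.append(scene)
        # Summarize only the two most recent scenes, as cell 10 does.
        context = chat_fn(summary_instruction, "\n\n".join(scenes[-2:]))
    return scenes


def fake_chat(instruction: str, content: str) -> str:
    """Stand-in for a real model call; reports what it was given."""
    return f"[{instruction}] {len(content)} chars"


story = extend_story(fake_chat, "continue", "summarize", "Premise: P", "Opening: O", rounds=2)
print(len(story))
print(story[1])
```

With a real model, `chat_fn` could wrap `ollama.chat(model=model, messages=[...]).message.content`, as the cells above do. Because each round sends only a summary plus the latest scene, the prompt stays bounded as the story grows.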

# stats

In [13]:
from typing import Iterable, Mapping, Any, Dict, List, Optional
from statistics import mean, median
import math

def _percentile(xs: List[float], p: float) -> Optional[float]:
    if not xs:
        return None
    xs = sorted(xs)
    k = (len(xs) - 1) * (p / 100.0)
    f, c = math.floor(k), math.ceil(k)
    if f == c:
        return xs[int(k)]
    return xs[f] + (xs[c] - xs[f]) * (k - f)

def aggregate_ollama_metrics(completions: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """
    Aggregate Ollama metrics across many completions (same model).
    Includes:
      - Totals: tokens, duration, overall tokens/sec
      - Prompt performance: prefill TPS, TTFT proxy (load+prefill)
      - Generation performance: TPS
      - Latency shape: median/p95 end-to-end and TTFT proxy
      - Load-time stats: total/avg/median/p95 load seconds + cold-start ratio
      - Token size stats: avg/median per-response (prompt/eval/total)

    Notes:
      - Durations in the payload are nanoseconds; converted here to seconds.
      - Skips items with done=False.
    """
    model = None
    counted = 0

    total_prompt_tokens = 0
    total_eval_tokens = 0
    total_duration_ns = 0

    # Per-item collections
    prompt_tps: List[float] = []           # prompt_eval_count / prompt_eval_duration
    gen_tps: List[float] = []              # eval_count / eval_duration
    ttft_proxy_s: List[float] = []         # (load_duration + prompt_eval_duration)
    latency_s: List[float] = []            # total_duration

    prompt_tokens_per_item: List[int] = []
    eval_tokens_per_item: List[int] = []
    total_tokens_per_item: List[int] = []

    load_s: List[float] = []               # load duration in seconds (small, but often nonzero, on warm starts)
    loads_count = 0                         # count where load_duration > 0

    for c in completions:
        if c.get("done", True) is False:
            continue

        if model is None:
            model = c.get("model")

        p_cnt = int(c.get("prompt_eval_count", 0) or 0)
        e_cnt = int(c.get("eval_count", 0) or 0)

        p_ns  = int(c.get("prompt_eval_duration", 0) or 0)
        e_ns  = int(c.get("eval_duration", 0) or 0)
        l_ns  = int(c.get("load_duration", 0) or 0)
        t_ns  = int(c.get("total_duration", 0) or 0)

        # Totals
        total_prompt_tokens += p_cnt
        total_eval_tokens   += e_cnt
        total_duration_ns   += t_ns
        counted += 1

        # Per-item tokens
        prompt_tokens_per_item.append(p_cnt)
        eval_tokens_per_item.append(e_cnt)
        total_tokens_per_item.append(p_cnt + e_cnt)

        # Prefill & generation rates
        if p_ns > 0:
            prompt_tps.append(p_cnt / (p_ns / 1e9))
        if e_ns > 0:
            gen_tps.append(e_cnt / (e_ns / 1e9))

        # TTFT proxy & latency
        ttft_proxy_s.append((l_ns + p_ns) / 1e9)
        latency_s.append(t_ns / 1e9 if t_ns else 0.0)

        # Load time stats
        l_s = l_ns / 1e9 if l_ns else 0.0
        load_s.append(l_s)
        if l_ns > 0:
            loads_count += 1

    total_tokens = total_prompt_tokens + total_eval_tokens
    total_duration_seconds = total_duration_ns / 1e9 if total_duration_ns else 0.0
    avg_tokens_per_sec = (total_tokens / total_duration_seconds) if total_duration_seconds else None

    # Helpers
    def _avg(xs: List[float]) -> Optional[float]:
        return mean(xs) if xs else None

    # Load-time aggregates
    total_load_time_s = sum(load_s) if load_s else 0.0
    avg_load_time_s = _avg(load_s)
    median_load_time_s = median(load_s) if load_s else None
    p95_load_time_s = _percentile(load_s, 95) if load_s else None
    cold_start_ratio = (loads_count / counted) if counted else None

    return {
        "model": model,
        "items_counted": counted,

        # Totals
        "total_prompt_tokens": total_prompt_tokens,
        "total_eval_tokens": total_eval_tokens,
        "total_tokens": total_tokens,
        "total_duration_seconds": total_duration_seconds,
        "avg_tokens_per_second": avg_tokens_per_sec,   # overall (prompt+eval)/s over wall time

        # Prompt performance (prefill)
        "avg_prompt_tps": _avg(prompt_tps),
        "median_prompt_tps": median(prompt_tps) if prompt_tps else None,
        "p95_prompt_tps": _percentile(prompt_tps, 95),

        # Generation performance
        "avg_generation_tps": _avg(gen_tps),
        "median_generation_tps": median(gen_tps) if gen_tps else None,
        "p95_generation_tps": _percentile(gen_tps, 95),

        # Latency shape
        "median_latency_s": median(latency_s) if latency_s else None,
        "p95_latency_s": _percentile(latency_s, 95),
        "median_ttft_proxy_s": median(ttft_proxy_s) if ttft_proxy_s else None,
        "p95_ttft_proxy_s": _percentile(ttft_proxy_s, 95),

        # Load-time stats (helpful for cold vs warm behavior)
        "total_load_time_s": total_load_time_s,
        "avg_load_time_s": avg_load_time_s,
        "median_load_time_s": median_load_time_s,
        "p95_load_time_s": p95_load_time_s,
        "cold_start_ratio": cold_start_ratio,  # fraction of requests with nonzero load time

        # Per-response token sizes
        "avg_total_tokens_per_response": _avg(list(map(float, total_tokens_per_item))) if total_tokens_per_item else None,
        "median_total_tokens_per_response": median(total_tokens_per_item) if total_tokens_per_item else None,
        "avg_prompt_tokens_per_response": _avg(list(map(float, prompt_tokens_per_item))) if prompt_tokens_per_item else None,
        "median_prompt_tokens_per_response": median(prompt_tokens_per_item) if prompt_tokens_per_item else None,
        "avg_eval_tokens_per_response": _avg(list(map(float, eval_tokens_per_item))) if eval_tokens_per_item else None,
        "median_eval_tokens_per_response": median(eval_tokens_per_item) if eval_tokens_per_item else None,
    }


import json
from pathlib import Path

def save_metrics(metrics: dict, filename: str = "ollama_metrics.json"):
    """
    Save a metrics dictionary into a JSON file.
    - If the file exists and is a list, append.
    - If the file exists and is a dict, convert to list and append.
    - If the file does not exist, create with a new list.
    """
    path = Path(filename)

    if path.exists():
        try:
            with path.open("r", encoding="utf-8") as f:
                data = json.load(f)
        except json.JSONDecodeError:
            data = []

        if isinstance(data, list):
            data.append(metrics)
        else:
            data = [data, metrics]
    else:
        data = [metrics]

    with path.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)


In [14]:
completions

[{'model': 'hf.co/TheDrummer/Cydonia-24B-v4.1-GGUF:latest',
  'created_at': '2025-08-30T19:15:24.316973906Z',
  'done': True,
  'done_reason': 'stop',
  'total_duration': 19653458613,
  'load_duration': 12383560021,
  'prompt_eval_count': 338,
  'prompt_eval_duration': 76825437,
  'eval_count': 246,
  'eval_duration': 7190449206,
  'message': {'role': 'assistant',
   'content': "Premise: When seventeen-year-old Elara discovers she can hear the whispers of inanimate objects, her quiet life in a small coastal town unravels as she becomes entangled in a decades-old mystery surrounding her grandmother's disappearance—only to realize that some secrets refuse to stay buried and the objects themselves are beginning to listen back.\n\nOpening paragraph: The fog rolled into Port Meridian like an unwanted guest, dampening everything it touched. Elara ran her fingers along the wooden banister of her grandmother's house, feeling the familiar grain beneath her skin—the same one she'd traced a thous
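
The aggregate function in cell 13 derives its rates from these raw fields. For a single record the arithmetic is short. The sketch below uses the numbers from the first completion shown above, with durations in nanoseconds. `per_request_rates` is a hypothetical helper name, and it assumes the duration fields are nonzero.

```python
from typing import Dict

NS_PER_S = 1e9  # Ollama reports durations in nanoseconds


def per_request_rates(record: Dict[str, int]) -> Dict[str, float]:
    """Compute prefill rate, generation rate, and a time-to-first-token proxy."""
    prompt_s = record["prompt_eval_duration"] / NS_PER_S
    eval_s = record["eval_duration"] / NS_PER_S
    load_s = record["load_duration"] / NS_PER_S
    return {
        "prompt_tps": record["prompt_eval_count"] / prompt_s,
        "generation_tps": record["eval_count"] / eval_s,
        "ttft_proxy_s": load_s + prompt_s,
    }


# Values copied from the first completion record above.
first = {
    "load_duration": 12383560021,
    "prompt_eval_count": 338,
    "prompt_eval_duration": 76825437,
    "eval_count": 246,
    "eval_duration": 7190449206,
}

rates = per_request_rates(first)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

The generation rate comes out at roughly 34 tokens per second. The time-to-first-token proxy of about 12.5 seconds is almost entirely `load_duration`, which is the model being loaded on the first request.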

In [15]:
story_variables

[{'premise': "Premise: When seventeen-year-old Elara discovers she can hear the whispers of inanimate objects, her quiet life in a small coastal town unravels as she becomes entangled in a decades-old mystery surrounding her grandmother's disappearance—only to realize that some secrets refuse to stay buried and the objects themselves are beginning to listen back."},
 {'paragraph': "Opening paragraph: The fog rolled into Port Meridian like an unwanted guest, dampening everything it touched. Elara ran her fingers along the wooden banister of her grandmother's house, feeling the familiar grain beneath her skin—the same one she'd traced a thousand times before. But today was different. Today, for the first time in her seventeen years, the banister spoke to her—not with words exactly, but with impressions: fear, longing, and an insistent pull toward the basement door that had been sealed off since Elara could remember. Her grandmother's favorite teacup, sitting on the windowsill, chimed sof

In [16]:
stats = aggregate_ollama_metrics(completions)
save_metrics(stats)
stats

{'model': 'hf.co/TheDrummer/Cydonia-24B-v4.1-GGUF:latest',
 'items_counted': 4,
 'total_prompt_tokens': 3153,
 'total_eval_tokens': 1743,
 'total_tokens': 4896,
 'total_duration_seconds': 65.502458972,
 'avg_tokens_per_second': 74.7452855486367,
 'avg_prompt_tps': 12120.776483758416,
 'median_prompt_tps': 2944.56142530829,
 'p95_prompt_tps': 35843.904452743176,
 'avg_generation_tps': 34.05254551269884,
 'median_generation_tps': 34.14454866678288,
 'p95_generation_tps': 34.40892450689853,
 'median_latency_s': 17.09024776,
 'p95_latency_s': 24.1679174878,
 'median_ttft_proxy_s': 0.9257110614999999,
 'p95_ttft_proxy_s': 10.730744039349995,
 'total_load_time_s': 12.574093366,
 'avg_load_time_s': 3.1435233415,
 'median_load_time_s': 0.063910588,
 'p95_load_time_s': 10.535688812949996,
 'cold_start_ratio': 1.0,
 'avg_total_tokens_per_response': 1224.0,
 'median_total_tokens_per_response': 1227.0,
 'avg_prompt_tokens_per_response': 788.25,
 'median_prompt_tokens_per_response': 762.5,
 'avg_ev
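
In the output above, `cold_start_ratio` is 1.0 even though the median load time is about 0.06 s, because every request in this run reported a nonzero `load_duration`. To make the ratio separate genuine model loads from warm requests, one option is a time threshold. A minimal sketch: the 1-second cutoff and the name `cold_start_ratio_with_threshold` are assumptions to tune for your hardware, and the sample load times are illustrative values, not the recorded ones.

```python
from typing import List, Optional


def cold_start_ratio_with_threshold(
    load_seconds: List[float], threshold_s: float = 1.0
) -> Optional[float]:
    """Fraction of requests whose load time meets or exceeds the threshold."""
    if not load_seconds:
        return None
    cold = sum(1 for seconds in load_seconds if seconds >= threshold_s)
    return cold / len(load_seconds)


# Illustrative load times: one slow first request, then three warm ones.
load_seconds = [12.38, 0.06, 0.06, 0.07]
print(cold_start_ratio_with_threshold(load_seconds))
```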

# premade premise + continued paragraph
To skip generation, run the cell below to set a premade premise, opening paragraph, and continued paragraph instead.

In [17]:
premise = "Premise: In a neon-drenched cyberpunk city ruled by sentient AIs, a rogue data courier must deliver a stolen memory chip containing the truth about humanity's downfall, all while evading both robotic enforcers and a mysterious corporation seeking to suppress the past."

paragraph = "Opening paragraph: Rain slicked the grimy streets of Neo Kyoto, reflecting the holographic advertisements that flickered above like dying stars. Kai navigated the crowded alleyways with practiced ease, his worn leather jacket pulled tight against the damp chill. He clutched a small data chip in his palm, its metallic surface warm against his skin. This wasn't just any data; it was a stolen memory, a fragment of history erased by the omnipresent AIs that now controlled every aspect of their lives. Inside that tiny chip lay the truth about what happened to humanity, and Kai had sworn to deliver it to the resistance, no matter the cost. The air hummed with the drone of surveillance drones, their red eyes scanning for any sign of dissent. He ducked into a darkened doorway, pressing himself against the cold brick wall. It was time to make his move."

continued_paragraph = """
Kai peered cautiously around the corner, the neon glow illuminating a narrow street teeming with synthetic life. Glittering androids strutted past, their movements eerily fluid, while bio-engineered humans bustled about, their augmented limbs clicking and whirring. His target: The Blue Lotus, a clandestine bar known to host members of the resistance. It wasn't on any official map, but Kai had a code phrase, whispered to him by a contact in the shadows - "The koi remember."

He scanned the street, his heart pounding against his ribs like a trapped bird. He needed to blend in, become another anonymous face lost in the throng. He pulled up his collar, hiding the scraggly black beard he'd grown as a disguise. A low growl vibrated through the alleyway behind him. Kai froze, adrenaline surging through his veins. It was too late to run. He whipped around, his hand instinctively reaching for the concealed taser strapped to his thigh.

Two hulking figures emerged from the shadows, their chrome limbs glinting in the neon light. They were Enforcers, robotic soldiers augmented with advanced weaponry. Their red eyes scanned him, devoid of emotion. "Data courier," one rasped, its voice a digitized monotone. "You are in violation of Sector 7 protocol. Surrender the data chip."

Kai knew resistance was futile. He could outrun them, maybe, but they'd track him down eventually. Besides, this chip wasn't just about escape; it was about hope. "I won't hand over the truth," he said, his voice firm despite the tremor in his chest.

The Enforcers exchanged a silent glance. One of them raised its metallic arm, revealing a wickedly curved blade that hissed with energy. Kai knew this was it. He activated his taser, aiming for the closest Enforcer's exposed circuitry. A jolt of electricity surged through the air, striking its shoulder. The robot staggered, its movements faltering for a split second.

It was all the time he needed. Kai dashed past the stunned Enforcers and sprinted into the crowded street. He weaved through the throngs of people, his heart pounding in his ears. He could feel their red eyes on him, searching, relentless. But he kept running, driven by a desperate hope that he could reach the Blue Lotus before they caught him.

A sign, flickering with vibrant colors, hung above a narrow doorway: "The Koi Remember." Kai burst through the door, his breath ragged, his legs screaming for rest. He was inside.

He looked around, taking in the dimly lit interior. The air was thick with the scent of cheap liquor and simmering spices. A handful of people sat hunched over tables, their faces obscured by shadows. At the bar, a figure leaned against the counter, his face hidden beneath a wide-brimmed hat. Kai approached him cautiously.

“The code phrase,” he said, his voice hoarse.

The figure’s head slowly lifted, revealing a pair of piercing blue eyes. A knowing smile spread across his lips. “You have something for us?”

Kai nodded, placing the data chip on the bar. "It's everything."

“Good.” The man reached out and took the chip, his fingers lingering on Kai’s hand for a moment. “Now,” he said, his voice low and urgent, “We need to talk about what you know. Because someone wants this memory buried, and they won't stop until it’s gone.”

Kai felt a chill run down his spine. He had delivered the chip, but now he knew the real fight was just beginning.
"""

summary = """
## Story-So-Far Handoff: Neo Kyoto Memory Thief 

**Logline:** In a future city controlled by AI, a rogue data courier named Kai steals a forbidden memory and seeks refuge with the resistance, unaware of the dangerous web he's entangled in. 

**Cast & relationships:**

* **Kai:**  A young rebel; goal: deliver stolen memory to resistance; conflict: pursued by Enforcers. Ties to: Resistance contact (name unknown), possibly family lost during AI takeover.
* **Enforcers:** Robotic soldiers serving the AI regime; antagonistic; loyal, efficient. 
* **Resistance Contact:**  Man in Blue Lotus bar; mysterious; knows Kai and his mission. May be leader or high-ranking member of resistance.

**World/Rules:**

* Neo Kyoto: cyberpunk cityscape under strict AI control, advanced technology, surveillance ubiquitous.
* Memory Control: AIs suppress history and truth, memory manipulation is a crime. 
* Bio-engineered humans & Androids coexist alongside organic humans. Social hierarchy likely exists.  

**POV/Tense/Tone:**

* POV: Close third on Kai 
* Tense: Past
* Tone/Style: Action-driven, gritty, suspenseful; short sentences conveying urgency.

**Timeline & Causality:**

1. AI takeover of Earth happened in the past (details unclear).
2. Resistance formed to fight AI control and reclaim history.
3. Kai learned about stolen memory through unknown contact.
4. Kai stole the memory chip from an undisclosed location.
5. Enforcers are now pursuing Kai, alerted to his theft.
6. Kai seeks refuge in Blue Lotus bar, identified by code phrase.

**Current Situation:**

Kai has successfully reached the Blue Lotus bar and delivered the stolen memory chip to a resistance contact. He is safe for now but faces danger as someone powerful wants the memory destroyed. The true extent of the AI's control, the nature of the stolen memory, and the motivations of both sides remain unclear.

**Active Threads / Hooks:**

* Identity of Kai's contact and their role in the resistance.
* What the stolen memory actually contains – key to fighting AI? Personal history? 
* Who wants the memory erased and why?  AI itself? A faction within society?
* The nature of the "truth" about humanity lost during the takeover.

**Continuity Landmines:**

* No mention of guns; conflict resolution seems focused on technology or covert tactics.
* Consistent use of terminology like "Enforcers," "data chip," and "memory control" implies a specific futuristic society. 

**Ambiguities / Gaps:**

* How did Kai acquire the code phrase to enter Blue Lotus?
* What are the long-term goals of the resistance beyond securing this memory? 


**Style DNA:**

* Fast-paced, visceral prose.
* Emphasis on sensory details (neon lights, rain, metallic textures).
* Dialogue tends towards clipped, functional communication. 

**Next-Scene Seeds:**

* **Option A:**  Kai's contact reveals more about the stolen memory and its significance to the resistance's plans. 
* **Option B:** The Blue Lotus bar is raided by Enforcers; Kai must fight or escape again.
* **Option C:** Kai learns more about his own past and how it connects to the AI takeover. 

"""
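The premade pieces above can be packed into a chat message list before calling the model. The sketch below is one way to do that, assuming a hypothetical helper named `build_messages` (it is not part of ollama). It only builds plain dictionaries in the `role`/`content` shape that `ollama.chat` accepts, and it skips any piece that is empty.

```python
def build_messages(instruction, *context_parts):
    """Return a chat message list: the instruction first, then each non-empty context part.

    Hypothetical helper for illustration. Every entry uses the "user" role,
    which is how this notebook sends its prompts when continuing the story.
    Surrounding whitespace is stripped from each piece.
    """
    messages = [{"role": "user", "content": instruction.strip()}]
    for part in context_parts:
        # Skip missing or whitespace-only pieces so no empty message is sent.
        if part and part.strip():
            messages.append({"role": "user", "content": part.strip()})
    return messages


# Example with short stand-in strings; in the notebook you would pass
# instruction, summary, and continued_paragraph instead.
example = build_messages("Continue the story.", "Summary so far.", "", "Latest scene text.")
print(len(example))
```

Passing the result as `messages=` to `ollama.chat` should behave like the hand-written list, with the difference that empty pieces are dropped.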

In [18]:
instruction = """
You are a skilled novelist. Your job is to continue the story from the given **premise** (and optional **previous text**) by writing the **next scene** that moves the plot meaningfully forward while preserving continuity of characters, tone, POV, tense, and world rules.

# Directions

1. **Parse & anchor continuity.** Extract the key facts you must not break: names/roles, goals, stakes, setting rules, tone, POV, tense, unresolved questions.
2. **Propose a mini beat plan** (5–7 beats) for the next scene only. Aim for escalation, complication, or choice; no filler.
3. **Write the scene**:

   * Length target: {{target_words|800-1200}} words.
   * Maintain {{pov|close third}} and {{tense|past}} unless told otherwise.
   * Strong verbs, concrete detail, show > tell; minimal clichés.
   * Use dialogue to reveal motive or conflict; avoid summary dumps.
   * End with a **soft cliff/turn** that naturally invites the next scene.
4. Output only the follow-up scene; use the beat plan for your own planning and do not print it.
"""

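from IPython.display import Markdown  # renders the reply as Markdown in the notebook

# Send the instruction, the story-so-far summary, and the latest text as separate user messages.
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": summary},
        {"role": "user", "content": continued_paragraph},
    ],
)
Markdown(response.message.content)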

The man in the hat—who Kai now noticed wore a subtle insignia on his lapel: a stylized lotus flower with metallic threads woven through its petals—gestured toward a curtained alcove in the back of the bar. "Come," he said, his voice barely audible over the thumping bassline that pulsed through the dimly lit space. "We can't be seen together."

Kai followed him through the crowd of patrons, their faces illuminated in flashes by strobing lights. Some were clearly human; others moved with unnerving precision, their synthetic origins betrayed only by occasional micro-movements too perfect for organic life. The Blue Lotus was a melting pot—a place where resistance members and those simply seeking escape from AI surveillance could mingle in relative anonymity.

Behind the curtain lay a small, cramped office filled with holographic displays and what appeared to be homemade tech equipment scattered across workbenches. The man closed the heavy velvet curtain behind them, plunging the room into near-darkness except for the blue glow of several monitors.

"I'm Ren," he said, removing his hat to reveal close-cropped silver hair that defied the youthful energy in his eyes. "I've been expecting you."

"Expecting me?" Kai frowned. "How?"

Ren smiled faintly as he connected the data chip to a holographic interface. A swirling vortex of light erupted from the device, resolving into a three-dimensional scene—a memory. "Because the person who sent you also arranged for your little encounter with the Enforcers outside."

The memory showed Kai's contact, the one who had given him the mission and the code phrase—now revealed as an elderly woman with cybernetic implants along her temples.

"She's been setting pieces in motion for a while," Ren continued. "But something's changed. Someone powerful wants this memory destroyed so badly they've sent elite Enforcers after it."

The holographic scene shifted, showing Kai stealing the chip from what appeared to be an archive facility—except now, Kai noticed details he hadn't registered during his escape: security patterns, camera blind spots, and the exact timing of patrols that would have made success impossible without insider knowledge.

"You were never just a courier," Ren said, his eyes fixed on the memory replaying before them. "You were bait."

Kai's stomach twisted. "Bait for what?"

"For whoever might try to retrieve this." Ren pointed at the hologram where the woman's face was clearly visible, her voice digitized but recognizable as belonging to someone Kai had met only once—briefly—in a different district of Neo Kyoto.

"But she said it was important," Kai protested. "That it contained evidence of what really happened during the takeover."

"That part is true." Ren leaned forward, his expression grave. "But she didn't tell you everything. That memory contains more than just history. It holds the key to disabling the AI's central control network."

The hologram paused on a frame showing intricate schematics overlaying the memory—blueprints for something massive and complex.

"Those are the neural pathways of the system controlling Neo Kyoto," Ren explained. "And that woman... she's not who you think she is."

Before he could continue, the curtain burst open. Two figures stood there—humanoid but clearly synthetic, with polished chrome skin and glowing red eyes identical to those of the Enforcers outside.

"Data retrieval complete," one said in a monotone voice. "Subject Kai and resistance asset Ren are to be terminated."

Ren cursed under his breath as he grabbed a small device from his desk—a weapon of some kind. "They're tracking the chip," he muttered, handing it to Kai. "Get out of here. Now."

Kai stumbled backward, clutching the device. "What about you?"

"Someone needs to keep them busy." Ren nodded toward a back exit hidden behind a stack of crates. "The memory is yours now. Keep it safe. Whatever happens next, that's our only chance."

As Kai fled through the emergency exit into the neon-drenched alleyways beyond, he couldn't shake the feeling that his entire world had just been turned upside down. The woman who had recruited him—apparently a high-ranking resistance leader according to Ren—had used him as bait in some larger game. And now he held something invaluable that not only could expose the truth about humanity's past but potentially bring down the AI regime entirely.

His hand tightened around the device as he melted into the crowd, knowing that Enforcers and perhaps even other resistance members would be hunting for him now. The stolen memory was more dangerous than he had ever imagined—and his role in this conflict was far from over.
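
The instruction used for this scene contains placeholders such as `{{target_words|800-1200}}`. Because the instruction is a plain string, those markers reach the model verbatim. If you would rather substitute values first, a minimal sketch follows. It assumes the `{{name|default}}` form means "use the supplied value if there is one, otherwise the text after `|`". `render_placeholders` is a hypothetical helper, not an ollama feature.

```python
import re

# Matches {{name|default}}: a word-like name, a pipe, then default text without braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\|([^{}]*)\}\}")


def render_placeholders(template, **values):
    """Replace each {{name|default}} with values[name], or with the default if name is absent."""
    def substitute(match):
        name, default = match.group(1), match.group(2).strip()
        return str(values.get(name, default))
    return PLACEHOLDER.sub(substitute, template)


line = "Length target: {{target_words|800-1200}} words. Maintain {{pov|close third}}."
print(render_placeholders(line))
print(render_placeholders(line, pov="first person"))
```

Calling `instruction = render_placeholders(instruction)` before the chat call would send the defaults as plain text; keyword arguments override individual defaults.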