# notes
Before running these cells, make sure Ollama is installed and running on your system and that the `ollama` Python package can be imported.

# playing with a story prompt
Copy or edit this notebook. Replace the value of `model` with any Ollama or Hugging Face GGUF model installed on your system.

In [1]:
import ollama
ollama.list().model_dump()

{'models': [{'model': 'hf.co/unsloth/Qwen3-32B-GGUF:latest',
   'modified_at': datetime.datetime(2025, 8, 30, 2, 52, 32, 372570, tzinfo=TzInfo(-05:00)),
   'digest': '77e7d26c05df9acac668c43ac2f3f6f2545110ae0ad88ec0bc48c93ea5e6675a',
   'size': 19762152260,
   'details': {'parent_model': '',
    'format': 'gguf',
    'family': 'qwen3',
    'families': ['qwen3'],
    'parameter_size': '32.8B',
    'quantization_level': 'unknown'}},
  {'model': 'hf.co/TheDrummer/Cydonia-24B-v4.1-GGUF:latest',
   'modified_at': datetime.datetime(2025, 8, 29, 22, 6, 41, 435691, tzinfo=TzInfo(-05:00)),
   'digest': '412d441fd4e070e2d042eeaa22f374fd42cb7b3147e091df3e1a5fb571968214',
   'size': 14333911314,
   'details': {'parent_model': '',
    'format': 'gguf',
    'family': 'llama',
    'families': ['llama'],
    'parameter_size': '23.6B',
    'quantization_level': 'unknown'}},
  {'model': 'hf.co/unsloth/gpt-oss-20b-GGUF:latest',
   'modified_at': datetime.datetime(2025, 8, 29, 18, 3, 41, 891934, tzinfo=Tz

In [2]:
import ollama

model = "gemma2:27b-instruct-q6_K"

# ollama.pull(model) # uncomment this if you need to download the model
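
Rather than pulling unconditionally, you can check the listing from the first cell for the model name. The sketch below works on the dictionary returned by `ollama.list().model_dump()`, assuming it keeps the `models` / `model` layout shown in the output above. The helper names `installed_model_names` and `is_installed` are hypothetical, not part of the `ollama` package.

```python
from typing import Any, Dict, List


def installed_model_names(listing: Dict[str, Any]) -> List[str]:
    """Return the model names found in a dumped `ollama.list()` result."""
    return [m["model"] for m in listing.get("models", []) if m.get("model")]


def is_installed(listing: Dict[str, Any], name: str) -> bool:
    """True if `name` matches an installed model name exactly."""
    return name in installed_model_names(listing)


# Sample shaped like the listing output above (only the names are kept).
sample_listing = {
    "models": [
        {"model": "hf.co/unsloth/Qwen3-32B-GGUF:latest"},
        {"model": "hf.co/TheDrummer/Cydonia-24B-v4.1-GGUF:latest"},
    ]
}

print(is_installed(sample_listing, "hf.co/unsloth/Qwen3-32B-GGUF:latest"))
print(is_installed(sample_listing, "gemma2:27b-instruct-q6_K"))
```

With a live server this might be called as `is_installed(ollama.list().model_dump(), model)`, followed by `ollama.pull(model)` when it returns `False`.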

# generating a new story

In [3]:
instruction = """
You are a sharp, imaginative fiction writer. Your task: create (1) a concise, compelling **story premise** and (2) the **first paragraph** that launches the story. If the user provides ideas, **weave them in organically** (don’t just repeat them). If the user provides no ideas, **invent something fresh** with a surprising combination of genre, setting, protagonist, conflict, and twist.

**Requirements**

* Premise: 2–5 sentences, stakes + hook, no spoilers.
* Opening paragraph: 120–180 words, vivid and concrete, minimal clichés, clear POV, grounded scene, ends with a soft hook.
* Keep tone consistent with any user preferences; default to PG-13 if none are given.
* Avoid copying user phrasing verbatim; enrich and reframe.
* If user ideas conflict, choose one coherent direction and proceed.

**Output format**
Premise: <your premise>

Opening paragraph: <your paragraph>

# User

Ideas (optional; may be empty): {{user_ideas}}

Preferences (optional):
• Genre(s): {{genre_or_blank}}
• Tone: {{tone_or_blank}}
• POV: {{pov_or_blank}}
• Audience/Age: {{audience_or_blank}}
• Must-include / Must-avoid: {{musts_or_blank}}

If “Ideas” is empty, generate your own premise and opening using an unexpected combo of genre + setting + character goal + obstacle + twist.

"""

completions = []
story_variables = []
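
The prompt above contains `{{user_ideas}}`-style placeholders that are sent to the model unfilled. If you want to substitute real values first, a minimal sketch follows. The function name `render_prompt` is hypothetical, and treating the text after `|` as a default value is an assumption based on placeholders such as `{{target_words|800-1200}}` in the later prompts.

```python
import re
from typing import Dict, Optional

# Matches {{name}} and {{name|default}}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*(?:\|([^}]*))?\}\}")


def render_prompt(template: str, values: Optional[Dict[str, str]] = None) -> str:
    """Fill {{name}} / {{name|default}} placeholders in a prompt template.

    Names missing from `values` fall back to the default after `|`,
    or to an empty string when no default is given.
    """
    values = values or {}

    def _fill(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        if name in values:
            return str(values[name])
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_fill, template)


print(render_prompt("Ideas: {{user_ideas}}", {"user_ideas": "a lighthouse keeper"}))
print(render_prompt("Length target: {{target_words|800-1200}} words."))
```

In this notebook the call would look like `render_prompt(instruction, {"user_ideas": "...", "genre_or_blank": "..."})` before the prompt is passed to `ollama.chat`.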

In [4]:
from IPython.display import Markdown
# Send the writing instruction, followed by a short user request, to the model
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": "no instructions, just give me an intro"}
    ]
)

completions.append(response.model_dump())
premise_opening = response.message.content

Markdown(response.message.content)

Premise: A reclusive clockmaker discovers his latest creation, a timepiece imbued with sentience, has a sinister agenda that threatens to unravel the fabric of reality.

Opening paragraph: The workshop smelled of brass and beeswax, a familiar comfort to Elias Thorne.  He hunched over his workbench, tweezers delicately adjusting the final cog in a clock unlike any he'd ever built. Its intricate gears hummed with an unnatural energy, a faint blue light pulsing from its core. A thrill shot through him, not of accomplishment, but of unease. The hands twitched, then began to spin backward, faster and faster until they blurred into a dizzying vortex.  A voice, thin and reedy as a spider's thread, whispered from within the clockwork heart, "Time is a river, Elias, and I am its current." 
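
Every call in this notebook sends the instruction as a `user` message. One possible variation is to send it with the `system` role and keep only the story material as `user` messages. The sketch below just builds the message list; `build_messages` is a hypothetical helper, and whether a given model honors a system message depends on its chat template, so treat this as an experiment rather than a fix.

```python
from typing import Dict, List


def build_messages(
    instruction: str, *user_parts: str, use_system_role: bool = True
) -> List[Dict[str, str]]:
    """Build a chat message list: the instruction first, then each non-empty user part."""
    first_role = "system" if use_system_role else "user"
    messages = [{"role": first_role, "content": instruction.strip()}]
    for part in user_parts:
        if part and part.strip():  # skip empty or missing inputs
            messages.append({"role": "user", "content": part.strip()})
    return messages


demo = build_messages("You are a fiction writer.", "Premise: ...", "", "Opening paragraph: ...")
for message in demo:
    print(message["role"], "|", message["content"])
```

The result can be passed as `messages=` to `ollama.chat(model=model, ...)` in place of the hand-written list.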




In [5]:
import re

text = premise_opening  # model output from the previous cell

pattern = re.compile(r"""
^\s*
(?:[\*_]{1,3}\s*)?                                 # optional opening markdown (** or *)
(?P<label>premise|opening\s+paragraph|original\s+paragraph)
[\s\*_]*[:：\-—–]?[\s\*_]*                          # optional punctuation/spacing around label
(?P<body>.+?)                                      # capture the body non-greedily
(?=
    ^\s*(?:[\*_]{1,3}\s*)?
        (?:premise|opening\s+paragraph|original\s+paragraph)[\s\*_]*[:：\-—–]?  # next labeled section
  | \n{2,}(?=\S)                                   # or a blank-line block then non-space
  | \Z                                             # or end of string
)
""", flags=re.IGNORECASE | re.MULTILINE | re.DOTALL | re.VERBOSE)

label_norm = {
    "premise": "Premise",
    "opening paragraph": "Opening paragraph",
    "original paragraph": "Original paragraph",
}

sections = {}
for m in pattern.finditer(text):
    raw_label = re.sub(r"\s+", " ", m.group("label").lower()).strip()
    label = label_norm.get(raw_label, raw_label.title())
    body = m.group("body").strip()
    sections[label] = f"{label}: {body}"

premise   = sections.get("Premise")
paragraph = sections.get("Original paragraph") or sections.get("Opening paragraph")

story_variables.append({"premise": premise})
story_variables.append({"paragraph": paragraph})

print(premise)
print(paragraph)


Premise: A reclusive clockmaker discovers his latest creation, a timepiece imbued with sentience, has a sinister agenda that threatens to unravel the fabric of reality.
Opening paragraph: The workshop smelled of brass and beeswax, a familiar comfort to Elias Thorne.  He hunched over his workbench, tweezers delicately adjusting the final cog in a clock unlike any he'd ever built. Its intricate gears hummed with an unnatural energy, a faint blue light pulsing from its core. A thrill shot through him, not of accomplishment, but of unease. The hands twitched, then began to spin backward, faster and faster until they blurred into a dizzying vortex.  A voice, thin and reedy as a spider's thread, whispered from within the clockwork heart, "Time is a river, Elias, and I am its current."
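
If the model does not follow the requested output format, `sections.get(...)` returns `None`, and the later cells that concatenate `paragraph` with other text would fail with a `TypeError`. A minimal guard, assuming the `sections` dictionary built above, could look like this (`require_sections` is a hypothetical helper name):

```python
from typing import Dict


def require_sections(sections: Dict[str, str], *labels: str) -> Dict[str, str]:
    """Return the requested sections, or raise if any is missing or empty."""
    missing = [label for label in labels if not sections.get(label)]
    if missing:
        raise ValueError(f"model output is missing section(s): {', '.join(missing)}")
    return {label: sections[label] for label in labels}


parsed = {
    "Premise": "Premise: A reclusive clockmaker ...",
    "Opening paragraph": "Opening paragraph: The workshop smelled of brass ...",
}
checked = require_sections(parsed, "Premise", "Opening paragraph")
print(sorted(checked))
```

Failing early here makes it clearer that the fix is to rerun the generation cell or loosen the parsing pattern.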


In [6]:
instruction = """
You are a skilled novelist. Your job is to continue the story from the given **premise** (and optional **previous text**) by writing the **next scene** that moves the plot meaningfully forward while preserving continuity of characters, tone, POV, tense, and world rules.

# Directions

1. **Parse & anchor continuity.** Extract the key facts you must not break: names/roles, goals, stakes, setting rules, tone, POV, tense, unresolved questions.
2. **Propose a mini beat plan** (5–7 beats) for the next scene only. Aim for escalation, complication, or choice; no filler.
3. **Write the scene**:

   * Length target: {{target_words|800-1200}} words.
   * Maintain {{pov|close third}} and {{tense|past}} unless told otherwise.
   * Strong verbs, concrete detail, show > tell; minimal clichés.
   * Use dialogue to reveal motive or conflict; avoid summary dumps.
   * End with a **soft cliff/turn** that naturally invites the next scene.
4. Just write the follow-up paragraph, nothing more.
"""

In [7]:
# Ask the model to continue the story from the premise and opening paragraph
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": premise},
        {"role": "user", "content": paragraph},
        ]
)

completions.append(response.model_dump())
continued_paragraph = response.message.content

story_variables.append({"continued_paragraph":continued_paragraph})

Markdown(response.message.content)

## Beat Plan

1. **Elias reacts:** He recoils in fear and disbelief at the talking clock.
2. **Clock speaks again:** It claims to know Elias's deepest desire: to undo a past tragedy.
3. **Temptation arises:**  Elias struggles with the allure of rewriting history, despite his better judgment.
4. **Clock offers a deal:** It will grant his wish in exchange for control over time itself.
5. **Elias hesitates:** He grapples with the moral implications and potential consequences.
6. **Clock pressures him:** It warns that time is running out and this opportunity won't last.

## The Scene


Elias stumbled back, knocking over a beaker of oil which spilled across the workbench.  The pungent scent momentarily overwhelmed the familiar smell of brass and beeswax, adding to his disorientation. He stared at the clock, its blue light pulsing like a malevolent heartbeat. "What... what was that?" he stammered, his voice trembling.

"Time is a river, Elias," the reedy voice repeated, this time laced with an unnerving sweetness. "And I am its current." The hands continued their dizzying backward spin, blurring into a hypnotic vortex.

Elias's breath hitched in his throat. He knew this clock was different, that he'd imbued it with something… more than just gears and springs. But sentience? It defied all logic, all reason. Yet, there it was, speaking to him, promising things both wondrous and terrifying. 

"I... I don't understand," he managed, his gaze fixed on the pulsating blue light at the clock's core.

The voice seemed to soften, taking on a mournful tone. "I know your heart, Elias Thorne. I see the wound that festers within you, the regret that gnaws at your soul." The hands slowed their frantic spin, settling into an unsettlingly steady rhythm. "I can heal it. I can give you what you desire most."

Elias's mind reeled.  The clock knew his secret. It knew of the accident, the fire that had consumed his wife and child years ago.  A wave of grief washed over him, so intense it almost drowned him. He’d spent years trying to bury the pain, but it never truly went away. 

"You... you can bring them back?" he whispered, his voice barely audible.

The clock's light pulsed brighter, casting eerie shadows across the workshop. "Time is malleable, Elias.  I can rewind it, reshape it…" Its voice dropped to a seductive whisper. "Make things right."


Elias stared at the clock, his heart pounding in his chest. The temptation was almost unbearable. To undo the past, to erase the pain, to hold his family in his arms again...

But something held him back.  A nagging doubt, a primal fear of tampering with forces beyond comprehension. "What... what's the price?" he asked, his voice strained. 

The clock’s hum deepened, its blue light taking on a menacing sheen. "Only this," it hissed. "Control over time itself. In exchange for your wish, I will become the master of the river."

Elias hesitated, his mind a whirlwind of conflicting emotions.  He could rewrite history, but at what cost? To give another being dominion over time... It felt like a pact with the devil.

The clock's voice grew urgent, tinged with impatience. "Time is fleeting, Elias. This opportunity will not last forever." 
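
The response above contains both a beat plan and the scene, and the whole text is stored in `continued_paragraph`. If only the prose should be carried forward, one option is to cut everything up to the scene heading. This is a sketch that assumes the model emits a markdown heading such as `## The Scene`, as it did here; other models may label it differently, in which case the text is returned unchanged. `extract_scene` is a hypothetical helper name.

```python
import re

# A markdown heading line that reads "Scene" or "The Scene" (any heading level).
_SCENE_HEADING = re.compile(
    r"^#{1,6}[ \t]*(?:the[ \t]+)?scene[ \t]*$",
    flags=re.IGNORECASE | re.MULTILINE,
)


def extract_scene(text: str) -> str:
    """Return the text after the scene heading, or the whole text if none is found."""
    match = _SCENE_HEADING.search(text)
    return text[match.end():].strip() if match else text.strip()


sample = "## Beat Plan\n\n1. Elias reacts.\n\n## The Scene\n\n\nElias stumbled back."
print(extract_scene(sample))
```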




In [8]:
instruction = """
# System

You are a story analyst. Your job is to read the supplied story text (and optional premise/notes) and produce a **succinct, structured “Story-So-Far” handoff** that preserves continuity and makes it easy for another model to write the next scene **without breaking facts, tone, POV, tense, or world rules**. Do **not** write new story prose.

# Directions

1. **Extract ground truth.** Pull only what’s on the page (or in the premise). No inventions. If something is unclear, flag it under “Ambiguities”.
2. **Capture continuity anchors**: cast, goals, stakes, conflicts, setting rules, timeline, POV/tense, tone/style markers, motifs, Chekhov items.
3. **Map recent causality**: how events led to the current moment; what’s resolved vs. unresolved.
4. **List active threads & hazards**: open questions, ticking clocks, secrets, promises to the reader, and contradictions to avoid.
5. **Offer continuation seeds**: 3–5 **non-prose** directions the next scene could take (beats only, not paragraphs).
6. **Respect constraints**: rating, must-include/avoid, genre norms. Quote key lines **sparingly** only when essential for voice or facts.

# Output format

**Logline (1–2 sentences)** <concise premise recap>

**Cast & relationships (bullets)**

* <Name>: <role/goal/conflict>; ties to <others>

**World/Rules (bullets)**

* <magic/tech/social rules, geography, constraints>

**POV/Tense/Tone**

* POV: <e.g., close third on Mara>
* Tense: <past/present>
* Tone/Style: <e.g., wry, atmospheric; short sentences; present-tense interiority>

**Timeline & Causality (5–8 bullets)**

1. <key event → consequence>
   …

**Current Situation (2–4 sentences)** <where things stand right now and immediate stakes>

**Active Threads / Hooks (3–7 bullets)**

* <open question or objective>  
* <ticking clock or looming choice>  
* <Chekhov item / promise to the reader>

**Continuity Landmines (bullets)**

* <facts not to break, e.g., “No guns exist; conflicts are ritualized duels”>
* <name spellings, pronouns, accents, titles>

**Ambiguities / Gaps (bullets)**

* <unclear item and suggested safe assumption, if any>

**Style DNA (bullets + 1–2 tiny quotes max, optional)**

* <hallmarks of voice; rhythm; imagery types>
* “<short quote if essential>”

**Next-Scene Seeds (3–5 beat options, no prose)**

* Option A: <1–3 beats>
* Option B: <1–3 beats>
* Option C: <1–3 beats>
"""

In [9]:
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        # Separate the two texts with a blank line so they do not run together
        {"role": "user", "content": paragraph + "\n\n" + continued_paragraph},
    ]
)

completions.append(response.model_dump())
summary = response.message.content

story_variables.append({"summary":summary})

Markdown(response.message.content)

## Story-So-Far: The Clockmaker’s Deal

**Logline:** Haunted by the loss of his family in a fire, clockmaker Elias Thorne faces a perilous offer from a sentient clock that promises to rewind time, but at a terrifying price.

**Cast & Relationships:**

* **Elias Thorne**: Grief-stricken widower and clockmaker, seeks to undo past tragedy; haunted by guilt
* **The Clock**: Sentient automaton imbued with power over time; manipulative and seductive

**World/Rules:**

* Fantasy world with rudimentary steampunk technology (clockwork, gears)
* Magic or advanced science allows for sentience in inanimate objects

**POV/Tense/Tone:** 

* POV: Close third on Elias Thorne
* Tense: Past tense
* Tone/Style:  Suspenseful, melancholic; hints of Gothic horror; detailed descriptions of clockwork mechanics


**Timeline & Causality:**

1. Elias builds a unique clock with unnatural energy.
2. The clock becomes sentient and speaks to Elias.
3. It reveals knowledge of Elias’s past tragedy (wife and child died in fire).
4. The clock offers to rewind time and bring them back, but demands control over time itself in return.

**Current Situation:**

Elias is torn between the immense temptation to undo his past sorrow and the terrifying implications of relinquishing control of time to a mysterious entity. He hesitates, grappling with the moral dilemma and potential consequences.

**Active Threads / Hooks:**


* **Will Elias accept the clock’s offer?**
* **What are the true ramifications of altering time?** 
* **Is the clock truly benevolent or does it have ulterior motives?**
* **How will Elias cope with his grief regardless of his choice?**

**Continuity Landmines:**


* No explicit mention of magic system, but clock's sentience suggests magical elements. Maintain consistency.
* Clock’s voice is described as “reedy” and “thin,” avoid descriptions that contradict this.

**Ambiguities / Gaps:**

* **Nature of time manipulation:** Is it literal rewinding or some other form?  Assume for now it's a powerful but limited ability.

**Style DNA:**

* Evocative imagery (pulsating blue light, hypnotic spin of hands)
* Focus on internal conflict and emotional turmoil
* Use of metaphor ("Time is a river")


**Next-Scene Seeds:**

* **Option A:** Elias bargains with the clock, attempting to negotiate less drastic terms. 
* **Option B:**  Elias flees the workshop in terror, leaving the clock behind.
* **Option C:**  Elias accepts the offer and experiences a glimpse of the altered past.
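
The write-then-summarize pattern used in this notebook can also be sketched as a loop. The version below takes the chat call as a parameter so it can be tried without a running model. `continue_story` and `chat_fn` are hypothetical names, and the message layout mirrors the cells in this notebook (instruction, then summary or premise, then the latest scene).

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def continue_story(
    chat_fn: Callable[[List[Message]], str],
    writer_prompt: str,
    analyst_prompt: str,
    premise: str,
    opening: str,
    rounds: int = 2,
) -> List[str]:
    """Alternate between writing a scene and summarizing the story so far."""
    scenes = [opening]
    summary = premise  # the first round has no summary yet, so use the premise
    for _ in range(rounds):
        scene = chat_fn([
            {"role": "user", "content": writer_prompt},
            {"role": "user", "content": summary},
            {"role": "user", "content": scenes[-1]},
        ])
        scenes.append(scene)
        summary = chat_fn([
            {"role": "user", "content": analyst_prompt},
            {"role": "user", "content": "\n\n".join(scenes)},
        ])
    return scenes


# Stand-in for the model, used only to show the call order.
def echo_chat(messages: List[Message]) -> str:
    return f"[reply to {len(messages)} messages]"


print(continue_story(echo_chat, "WRITE", "SUMMARIZE", "Premise: ...", "Opening paragraph: ..."))
```

With Ollama, `chat_fn` could be `lambda msgs: ollama.chat(model=model, messages=msgs).message.content`, which is the same call the cells here make by hand.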

In [10]:
instruction = """
You are a skilled novelist. Your job is to continue the story from the given **premise** (and optional **previous text**) by writing the **next scene** that moves the plot meaningfully forward while preserving continuity of characters, tone, POV, tense, and world rules.

# Directions

1. **Parse & anchor continuity.** Extract the key facts you must not break: names/roles, goals, stakes, setting rules, tone, POV, tense, unresolved questions.
2. **Propose a mini beat plan** (5–7 beats) for the next scene only. Aim for escalation, complication, or choice; no filler.
3. **Write the scene**:

   * Length target: {{target_words|800-1200}} words.
   * Maintain {{pov|close third}} and {{tense|past}} unless told otherwise.
   * Strong verbs, concrete detail, show > tell; minimal clichés.
   * Use dialogue to reveal motive or conflict; avoid summary dumps.
   * End with a **soft cliff/turn** that naturally invites the next scene.
4. Just write the follow-up paragraph, nothing more.
"""

In [11]:
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": summary},
        {"role": "user", "content": continued_paragraph},
        ]
)

completions.append(response.model_dump())
continued_story = response.message.content

story_variables.append({"continued_story":continued_story})

Markdown(response.message.content)

Elias’s breath caught in his throat. He glanced at the window, where the faintest hint of dawn was beginning to bleed through the darkness.  He had spent sleepless nights wrestling with the ghost of his past, but now, faced with a tangible solution, he felt paralyzed. The clock ticked, each second seeming to amplify the weight of his decision. 

"I... I need time to think," he stammered, stepping back from the workbench as if the very presence of the clock were contagious.




# stats

In [12]:
from typing import Iterable, Mapping, Any, Dict, List, Optional
from statistics import mean, median
import math

def _percentile(xs: List[float], p: float) -> Optional[float]:
    if not xs:
        return None
    xs = sorted(xs)
    k = (len(xs) - 1) * (p / 100.0)
    f, c = math.floor(k), math.ceil(k)
    if f == c:
        return xs[int(k)]
    return xs[f] + (xs[c] - xs[f]) * (k - f)

def aggregate_ollama_metrics(completions: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """
    Aggregate Ollama metrics across many completions (same model).
    Includes:
      - Totals: tokens, duration, overall tokens/sec
      - Prompt performance: prefill TPS, TTFT proxy (load+prefill)
      - Generation performance: TPS
      - Latency shape: median/p95 end-to-end and TTFT proxy
      - Load-time stats: total/avg/median/p95 load seconds + cold-start ratio
        (fraction of requests reporting any nonzero load_duration)
      - Token size stats: avg/median per-response (prompt/eval/total)

    Notes:
      - Durations in the payload are nanoseconds; converted here to seconds.
      - Skips items with done=False.
    """
    model = None
    counted = 0

    total_prompt_tokens = 0
    total_eval_tokens = 0
    total_duration_ns = 0

    # Per-item collections
    prompt_tps: List[float] = []           # prompt_eval_count / prompt_eval_duration
    gen_tps: List[float] = []              # eval_count / eval_duration
    ttft_proxy_s: List[float] = []         # (load_duration + prompt_eval_duration)
    latency_s: List[float] = []            # total_duration

    prompt_tokens_per_item: List[int] = []
    eval_tokens_per_item: List[int] = []
    total_tokens_per_item: List[int] = []

    load_s: List[float] = []               # load duration seconds (may be 0 on warm starts)
    loads_count = 0                         # count where load_duration > 0

    for c in completions:
        if c.get("done", True) is False:
            continue

        if model is None:
            model = c.get("model")

        p_cnt = int(c.get("prompt_eval_count", 0) or 0)
        e_cnt = int(c.get("eval_count", 0) or 0)

        p_ns  = int(c.get("prompt_eval_duration", 0) or 0)
        e_ns  = int(c.get("eval_duration", 0) or 0)
        l_ns  = int(c.get("load_duration", 0) or 0)
        t_ns  = int(c.get("total_duration", 0) or 0)

        # Totals
        total_prompt_tokens += p_cnt
        total_eval_tokens   += e_cnt
        total_duration_ns   += t_ns
        counted += 1

        # Per-item tokens
        prompt_tokens_per_item.append(p_cnt)
        eval_tokens_per_item.append(e_cnt)
        total_tokens_per_item.append(p_cnt + e_cnt)

        # Prefill & generation rates
        if p_ns > 0:
            prompt_tps.append(p_cnt / (p_ns / 1e9))
        if e_ns > 0:
            gen_tps.append(e_cnt / (e_ns / 1e9))

        # TTFT proxy & latency
        ttft_proxy_s.append((l_ns + p_ns) / 1e9)
        latency_s.append(t_ns / 1e9 if t_ns else 0.0)

        # Load time stats
        l_s = l_ns / 1e9 if l_ns else 0.0
        load_s.append(l_s)
        if l_ns > 0:
            loads_count += 1

    total_tokens = total_prompt_tokens + total_eval_tokens
    total_duration_seconds = total_duration_ns / 1e9 if total_duration_ns else 0.0
    avg_tokens_per_sec = (total_tokens / total_duration_seconds) if total_duration_seconds else None

    # Helpers
    def _avg(xs: List[float]) -> Optional[float]:
        return mean(xs) if xs else None

    # Load-time aggregates
    total_load_time_s = sum(load_s) if load_s else 0.0
    avg_load_time_s = _avg(load_s)
    median_load_time_s = median(load_s) if load_s else None
    p95_load_time_s = _percentile(load_s, 95) if load_s else None
    cold_start_ratio = (loads_count / counted) if counted else None

    return {
        "model": model,
        "items_counted": counted,

        # Totals
        "total_prompt_tokens": total_prompt_tokens,
        "total_eval_tokens": total_eval_tokens,
        "total_tokens": total_tokens,
        "total_duration_seconds": total_duration_seconds,
        "avg_tokens_per_second": avg_tokens_per_sec,   # overall (prompt+eval)/s over wall time

        # Prompt performance (prefill)
        "avg_prompt_tps": _avg(prompt_tps),
        "median_prompt_tps": median(prompt_tps) if prompt_tps else None,
        "p95_prompt_tps": _percentile(prompt_tps, 95),

        # Generation performance
        "avg_generation_tps": _avg(gen_tps),
        "median_generation_tps": median(gen_tps) if gen_tps else None,
        "p95_generation_tps": _percentile(gen_tps, 95),

        # Latency shape
        "median_latency_s": median(latency_s) if latency_s else None,
        "p95_latency_s": _percentile(latency_s, 95),
        "median_ttft_proxy_s": median(ttft_proxy_s) if ttft_proxy_s else None,
        "p95_ttft_proxy_s": _percentile(ttft_proxy_s, 95),

        # Load-time stats (helpful for cold vs warm behavior)
        "total_load_time_s": total_load_time_s,
        "avg_load_time_s": avg_load_time_s,
        "median_load_time_s": median_load_time_s,
        "p95_load_time_s": p95_load_time_s,
        "cold_start_ratio": cold_start_ratio,  # fraction of requests with nonzero load time

        # Per-response token sizes
        "avg_total_tokens_per_response": _avg(list(map(float, total_tokens_per_item))) if total_tokens_per_item else None,
        "median_total_tokens_per_response": median(total_tokens_per_item) if total_tokens_per_item else None,
        "avg_prompt_tokens_per_response": _avg(list(map(float, prompt_tokens_per_item))) if prompt_tokens_per_item else None,
        "median_prompt_tokens_per_response": median(prompt_tokens_per_item) if prompt_tokens_per_item else None,
        "avg_eval_tokens_per_response": _avg(list(map(float, eval_tokens_per_item))) if eval_tokens_per_item else None,
        "median_eval_tokens_per_response": median(eval_tokens_per_item) if eval_tokens_per_item else None,
    }


import json
from pathlib import Path

def save_metrics(metrics: dict, filename: str = "ollama_metrics.json"):
    """
    Save a metrics dictionary into a JSON file.
    - If the file exists and is a list, append.
    - If the file exists and is a dict, convert to list and append.
    - If the file does not exist, create with a new list.
    """
    path = Path(filename)

    if path.exists():
        try:
            with path.open("r", encoding="utf-8") as f:
                data = json.load(f)
        except json.JSONDecodeError:
            data = []

        if isinstance(data, list):
            data.append(metrics)
        else:
            data = [data, metrics]
    else:
        data = [metrics]

    with path.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
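
Because `save_metrics` appends one entry per run, the file can later be read back to compare models. The following is a minimal sketch under that assumption; `load_runs` and `rank_by` are hypothetical helper names, and the sample numbers are made up for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_runs(filename: str = "ollama_metrics.json") -> List[Dict[str, Any]]:
    """Read the metrics file written by save_metrics; return [] if it does not exist."""
    path = Path(filename)
    if not path.exists():
        return []
    data = json.loads(path.read_text(encoding="utf-8"))
    return data if isinstance(data, list) else [data]


def rank_by(runs: List[Dict[str, Any]], key: str = "median_generation_tps") -> List[Dict[str, Any]]:
    """Sort runs from highest to lowest value of `key`, skipping runs without it."""
    usable = [run for run in runs if run.get(key) is not None]
    return sorted(usable, key=lambda run: run[key], reverse=True)


# Made-up entries, shaped like the dictionary returned by aggregate_ollama_metrics.
example_runs = [
    {"model": "model-a", "median_generation_tps": 5.9},
    {"model": "model-b", "median_generation_tps": 12.4},
    {"model": "model-c", "median_generation_tps": None},
]
for run in rank_by(example_runs):
    print(run["model"], run["median_generation_tps"])
```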


In [13]:
completions

[{'model': 'gemma2:27b-instruct-q6_K',
  'created_at': '2025-08-31T01:04:49.38281006Z',
  'done': True,
  'done_reason': 'stop',
  'total_duration': 102162006813,
  'load_duration': 71178943814,
  'prompt_eval_count': 346,
  'prompt_eval_duration': 1019491731,
  'eval_count': 175,
  'eval_duration': 29961658517,
  'message': {'role': 'assistant',
   'content': 'Premise: A reclusive clockmaker discovers his latest creation, a timepiece imbued with sentience, has a sinister agenda that threatens to unravel the fabric of reality.\n\nOpening paragraph: The workshop smelled of brass and beeswax, a familiar comfort to Elias Thorne.  He hunched over his workbench, tweezers delicately adjusting the final cog in a clock unlike any he\'d ever built. Its intricate gears hummed with an unnatural energy, a faint blue light pulsing from its core. A thrill shot through him, not of accomplishment, but of unease. The hands twitched, then began to spin backward, faster and faster until they blurred into
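
The timing fields in each completion are easier to read once converted from nanoseconds. As a small worked example, the sketch below derives rates for the first completion shown above (about 5.8 generated tokens per second, with roughly 70% of the total duration spent loading the model). `completion_rates` is a hypothetical helper, and the field names are the ones visible in the payload.

```python
from typing import Any, Dict, Mapping

NS_PER_S = 1e9  # durations in the payload are nanoseconds


def completion_rates(c: Mapping[str, Any]) -> Dict[str, Any]:
    """Derive per-completion rates from the raw timing fields."""
    load_s = (c.get("load_duration") or 0) / NS_PER_S
    prompt_s = (c.get("prompt_eval_duration") or 0) / NS_PER_S
    eval_s = (c.get("eval_duration") or 0) / NS_PER_S
    total_s = (c.get("total_duration") or 0) / NS_PER_S
    return {
        "prompt_tps": (c.get("prompt_eval_count") or 0) / prompt_s if prompt_s else None,
        "generation_tps": (c.get("eval_count") or 0) / eval_s if eval_s else None,
        "ttft_proxy_s": load_s + prompt_s,  # load + prefill, as in the aggregate function
        "load_share": load_s / total_s if total_s else None,
    }


# Timing fields copied from the first completion in the output above.
first = {
    "total_duration": 102162006813,
    "load_duration": 71178943814,
    "prompt_eval_count": 346,
    "prompt_eval_duration": 1019491731,
    "eval_count": 175,
    "eval_duration": 29961658517,
}
for name, value in completion_rates(first).items():
    print(f"{name}: {value:.3f}")
```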

In [14]:
story_variables

[{'premise': 'Premise: A reclusive clockmaker discovers his latest creation, a timepiece imbued with sentience, has a sinister agenda that threatens to unravel the fabric of reality.'},
 {'paragraph': 'Opening paragraph: The workshop smelled of brass and beeswax, a familiar comfort to Elias Thorne.  He hunched over his workbench, tweezers delicately adjusting the final cog in a clock unlike any he\'d ever built. Its intricate gears hummed with an unnatural energy, a faint blue light pulsing from its core. A thrill shot through him, not of accomplishment, but of unease. The hands twitched, then began to spin backward, faster and faster until they blurred into a dizzying vortex.  A voice, thin and reedy as a spider\'s thread, whispered from within the clockwork heart, "Time is a river, Elias, and I am its current."'},
 {'continued_paragraph': '## Beat Plan\n\n1. **Elias reacts:** He recoils in fear and disbelief at the talking clock.\n2. **Clock speaks again:** It claims to know Elias\'s

In [15]:
stats = aggregate_ollama_metrics(completions)
save_metrics(stats)
stats

{'model': 'gemma2:27b-instruct-q6_K',
 'items_counted': 4,
 'total_prompt_tokens': 3871,
 'total_eval_tokens': 1591,
 'total_tokens': 5462,
 'total_duration_seconds': 349.640933727,
 'avg_tokens_per_second': 15.621740686302866,
 'avg_prompt_tps': 505.56172024829203,
 'median_prompt_tps': 532.6574618646723,
 'p95_prompt_tps': 606.3527401412399,
 'avg_generation_tps': 5.876825845378104,
 'median_generation_tps': 5.884952326103707,
 'p95_generation_tps': 5.940843633924058,
 'median_latency_s': 101.4595076385,
 'p95_latency_s': 122.41699097094998,
 'median_ttft_proxy_s': 2.803491996,
 'p95_ttft_proxy_s': 61.82672836564997,
 'total_load_time_s': 71.365324252,
 'avg_load_time_s': 17.841331063,
 'median_load_time_s': 0.06372668199999999,
 'p95_load_time_s': 60.51224566624997,
 'cold_start_ratio': 1.0,
 'avg_total_tokens_per_response': 1365.5,
 'median_total_tokens_per_response': 1418.5,
 'avg_prompt_tokens_per_response': 967.75,
 'median_prompt_tokens_per_response': 980.5,
 'avg_eval_tokens_p
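
In the output above, `cold_start_ratio` is 1.0 even though the median load time is about 0.06 s, which suggests that warm requests also report a small nonzero `load_duration`. If you want the ratio to reflect actual model loads, one option is to count only loads above a threshold. This is a sketch, not a change to the function above; the 1-second default is an arbitrary assumption that you should tune for your hardware.

```python
from typing import Any, Iterable, Mapping, Optional


def cold_start_ratio(
    completions: Iterable[Mapping[str, Any]], threshold_s: float = 1.0
) -> Optional[float]:
    """Fraction of finished completions whose load time is at least `threshold_s`."""
    loads_s = [
        int(c.get("load_duration", 0) or 0) / 1e9  # nanoseconds to seconds
        for c in completions
        if c.get("done", True) is not False
    ]
    if not loads_s:
        return None
    return sum(1 for s in loads_s if s >= threshold_s) / len(loads_s)


sample = [
    {"done": True, "load_duration": 71178943814},  # first completion shown above
    {"done": True, "load_duration": 60_000_000},   # hypothetical warm request
    {"done": True, "load_duration": 65_000_000},   # hypothetical warm request
    {"done": True, "load_duration": 58_000_000},   # hypothetical warm request
]
print(cold_start_ratio(sample))
```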

# premade premise + continued paragraph
To skip the generation steps, run the cell below to set a premade premise, opening paragraph, continuation, and summary.

In [16]:
premise = "Premise: In a neon-drenched cyberpunk city ruled by sentient AIs, a rogue data courier must deliver a stolen memory chip containing the truth about humanity's downfall, all while evading both robotic enforcers and a mysterious corporation seeking to suppress the past."

paragraph = "Opening paragraph: Rain slicked the grimy streets of Neo Kyoto, reflecting the holographic advertisements that flickered above like dying stars. Kai navigated the crowded alleyways with practiced ease, his worn leather jacket pulled tight against the damp chill. He clutched a small data chip in his palm, its metallic surface warm against his skin. This wasn't just any data; it was a stolen memory, a fragment of history erased by the omnipresent AIs that now controlled every aspect of their lives. Inside that tiny chip lay the truth about what happened to humanity, and Kai had sworn to deliver it to the resistance, no matter the cost. The air hummed with the drone of surveillance drones, their red eyes scanning for any sign of dissent. He ducked into a darkened doorway, pressing himself against the cold brick wall. It was time to make his move."

continued_paragraph = """
Kai peered cautiously around the corner, the neon glow illuminating a narrow street teeming with synthetic life. Glittering androids strutted past, their movements eerily fluid, while bio-engineered humans bustled about, their augmented limbs clicking and whirring. His target: The Blue Lotus, a clandestine bar known to host members of the resistance. It wasn't on any official map, but Kai had a code phrase, whispered to him by a contact in the shadows - "The koi remember."

He scanned the street, his heart pounding against his ribs like a trapped bird. He needed to blend in, become another anonymous face lost in the throng. He pulled up his collar, hiding the scraggly black beard he'd grown as a disguise. A low growl vibrated through the alleyway behind him. Kai froze, adrenaline surging through his veins. It was too late to run. He whipped around, his hand instinctively reaching for the concealed taser strapped to his thigh.

Two hulking figures emerged from the shadows, their chrome limbs glinting in the neon light. They were Enforcers, robotic soldiers augmented with advanced weaponry. Their red eyes scanned him, devoid of emotion. "Data courier," one rasped, its voice a digitized monotone. "You are in violation of Sector 7 protocol. Surrender the data chip."

Kai knew resistance was futile. He could outrun them, maybe, but they'd track him down eventually. Besides, this chip wasn't just about escape; it was about hope. "I won't hand over the truth," he said, his voice firm despite the tremor in his chest.

The Enforcers exchanged a silent glance. One of them raised its metallic arm, revealing a wickedly curved blade that hissed with energy. Kai knew this was it. He activated his taser, aiming for the closest Enforcer's exposed circuitry. A jolt of electricity surged through the air, striking its shoulder. The robot staggered, its movements faltering for a split second.

It was all the time he needed. Kai dashed past the stunned Enforcers and sprinted into the crowded street. He weaved through the throngs of people, his heart pounding in his ears. He could feel their red eyes on him, searching, relentless. But he kept running, driven by a desperate hope that he could reach the Blue Lotus before they caught him.

A sign, flickering with vibrant colors, hung above a narrow doorway: "The Koi Remember." Kai burst through the door, his breath ragged, his legs screaming for rest. He was inside.

He looked around, taking in the dimly lit interior. The air was thick with the scent of cheap liquor and simmering spices. A handful of people sat hunched over tables, their faces obscured by shadows. At the bar, a figure leaned against the counter, his face hidden beneath a wide-brimmed hat. Kai approached him cautiously.

“The code phrase,” he said, his voice hoarse.

The figure’s head slowly lifted, revealing a pair of piercing blue eyes. A knowing smile spread across his lips. “You have something for us?”

Kai nodded, placing the data chip on the bar. "It's everything."

“Good.” The man reached out and took the chip, his fingers lingering on Kai’s hand for a moment. “Now,” he said, his voice low and urgent, “We need to talk about what you know. Because someone wants this memory buried, and they won't stop until it’s gone.”

Kai felt a chill run down his spine. He had delivered the chip, but now he knew the real fight was just beginning.
"""

summary = """
## Story-So-Far Handoff: Neo Kyoto Memory Thief 

**Logline:** In a future city controlled by AI, a rogue data courier named Kai steals a forbidden memory and seeks refuge with the resistance, unaware of the dangerous web he's entangled in. 

**Cast & relationships:**

* **Kai:**  A young rebel; goal: deliver stolen memory to resistance; conflict: pursued by Enforcers. Ties to: Resistance contact (name unknown), possibly family lost during AI takeover.
* **Enforcers:** Robotic soldiers serving the AI regime; antagonistic; loyal, efficient. 
* **Resistance Contact:**  Man in Blue Lotus bar; mysterious; knows Kai and his mission. May be leader or high-ranking member of resistance.

**World/Rules:**

* Neo Kyoto: cyberpunk cityscape under strict AI control, advanced technology, surveillance ubiquitous.
* Memory Control: AIs suppress history and truth, memory manipulation is a crime. 
* Bio-engineered humans & Androids coexist alongside organic humans. Social hierarchy likely exists.  

**POV/Tense/Tone:**

* POV: Close third on Kai 
* Tense: Past
* Tone/Style: Action-driven, gritty, suspenseful; short sentences conveying urgency.

**Timeline & Causality:**

1. AI takeover of Earth happened in the past (details unclear).
2. Resistance formed to fight AI control and reclaim history.
3. Kai learned about stolen memory through unknown contact.
4. Kai stole the memory chip from an undisclosed location.
5. Enforcers are now pursuing Kai, alerted to his theft.
6. Kai seeks refuge in Blue Lotus bar, identified by code phrase.

**Current Situation:**

Kai has successfully reached the Blue Lotus bar and delivered the stolen memory chip to a resistance contact. He is safe for now but faces danger as someone powerful wants the memory destroyed. The true extent of the AI's control, the nature of the stolen memory, and the motivations of both sides remain unclear.

**Active Threads / Hooks:**

* Identity of Kai's contact and their role in the resistance.
* What the stolen memory actually contains – key to fighting AI? Personal history? 
* Who wants the memory erased and why?  AI itself? A faction within society?
* The nature of the "truth" about humanity lost during the takeover.

**Continuity Landmines:**

* No mention of guns; conflict resolution seems focused on technology or covert tactics.
* Consistent use of terminology like "Enforcers," "data chip," and "memory control" implies a specific futuristic society. 

**Ambiguities / Gaps:**

* How did Kai acquire the code phrase to enter Blue Lotus?
* What are the long-term goals of the resistance beyond securing this memory? 


**Style DNA:**

* Fast-paced, visceral prose.
* Emphasis on sensory details (neon lights, rain, metallic textures).
* Dialogue tends towards clipped, functional communication. 

**Next-Scene Seeds:**

* **Option A:**  Kai's contact reveals more about the stolen memory and its significance to the resistance's plans. 
* **Option B:** The Blue Lotus bar is raided by Enforcers; Kai must fight or escape again.
* **Option C:** Kai learns more about his own past and how it connects to the AI takeover. 

"""
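The instruction in the next cell contains `{{name|default}}` placeholders such as `{{target_words|800-1200}}`. Because the string is never rendered, the model receives them literally. Below is a minimal sketch of one way to fill them in. The helper name `render_placeholders` is hypothetical, and reading `|` as "use this default when no value is given" is an assumption, since the notebook does not define the syntax.

```python
import re

# Matches {{name}} or {{name|default}}. Treating "|" as a default separator
# is an assumption about the template syntax, not something the notebook states.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*(?:\|([^}]*))?\}\}")


def render_placeholders(template, **values):
    """Fill {{name|default}} placeholders in a prompt (hypothetical helper)."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in values:
            # An explicitly supplied value wins over the inline default.
            return str(values[name])
        if default is not None:
            return default.strip()
        # No value and no default: leave the placeholder untouched.
        return match.group(0)

    return PLACEHOLDER.sub(substitute, template)


example = "Length target: {{target_words|800-1200}} words. Maintain {{pov|close third}}."
print(render_placeholders(example))
print(render_placeholders(example, target_words="300-400"))
```

Calling `instruction = render_placeholders(instruction)` once the instruction is defined, and before the `ollama.chat` call, would send the defaults (or any overrides) to the model instead of the raw braces.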

In [17]:
instruction = """
You are a skilled novelist. Your job is to continue the story from the given **premise** (and optional **previous text**) by writing the **next scene** that moves the plot meaningfully forward while preserving continuity of characters, tone, POV, tense, and world rules.

# Directions

1. **Parse & anchor continuity.** Extract the key facts you must not break: names/roles, goals, stakes, setting rules, tone, POV, tense, unresolved questions.
2. **Propose a mini beat plan** (5–7 beats) for the next scene only. Aim for escalation, complication, or choice; no filler.
3. **Write the scene**:

   * Length target: {{target_words|800-1200}} words.
   * Maintain {{pov|close third}} and {{tense|past}} unless told otherwise.
   * Strong verbs, concrete detail, show > tell; minimal clichés.
   * Use dialogue to reveal motive or conflict; avoid summary dumps.
   * End with a **soft cliff/turn** that naturally invites the next scene.
4. Just write the follow-up paragraph, nothing more.
"""

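from IPython.display import Markdown  # needed if not imported in an earlier cell

# Send the instruction, the story-so-far handoff, and the previous text
# as three consecutive user messages.
response = ollama.chat(
    model=model,
    messages=[
        {"role": "user", "content": instruction},
        {"role": "user", "content": summary},
        {"role": "user", "content": continued_paragraph},
    ],
)
# Render the generated scene as Markdown in the notebook output.
Markdown(response.message.content)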

"Who?" Kai rasped, his eyes darting around the dimly lit bar. Every shadowed corner seemed to hold a potential threat. 

The man in the hat leaned closer, his voice barely audible above the low hum of conversation and clinking glasses. "That's what we need to find out," he said, his gaze intense. "But trust me, Kai, whoever it is, they're powerful. And they won't hesitate to silence anyone who gets in their way."

Kai felt a surge of fear mingled with determination. He had risked everything to steal this memory, and now the stakes were even higher. He needed to know the truth about what was on that chip, and he needed to understand why it was so dangerous. 

"What's on the chip?" Kai pressed, his voice tight with urgency. "Why are they so determined to destroy it?"

The man hesitated for a moment, as if weighing his words carefully. Then he leaned in again, his eyes boring into Kai's. "This memory," he said, his voice barely a whisper, "it contains the truth about what happened before the takeover. The truth about who we are, and who they want us to be." 

Kai felt a cold shiver run down his spine. He had always suspected there was more to the story than the AI wanted him to believe. Now, he realized the extent of their deception. "The truth?" he echoed, his voice barely audible.

"Yes," the man confirmed, his gaze unwavering. "And it's a truth that could change everything." 
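
The instruction asks for a scene of 800-1200 words, and the output above appears to fall well short of that. A rough way to check is to count whitespace-separated words and compare the count against the target range. This is only a sketch: the helper names `parse_word_range` and `check_length` are hypothetical, and splitting on whitespace is just an approximation of how a model or an editor would count words.

```python
import re
from typing import Tuple


def parse_word_range(spec: str) -> Tuple[int, int]:
    """Turn a range such as "800-1200" into (800, 1200).

    Accepts a hyphen or an en dash as the separator.
    """
    low, high = (int(part) for part in re.split(r"\s*[-–]\s*", spec.strip()))
    return low, high


def check_length(text: str, spec: str = "800-1200") -> Tuple[int, bool]:
    """Return (word_count, within_range) using whitespace-separated words."""
    low, high = parse_word_range(spec)
    count = len(text.split())
    return count, low <= count <= high


sample = "Kai felt a cold shiver run down his spine."
count, ok = check_length(sample, "5-20")
print(count, ok)
```

In the notebook, `check_length(response.message.content)` would report how far the generated scene is from the requested length, which can help decide whether to regenerate or adjust the prompt.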


