# ALAIN Master Guide: Writing Great Tutorial Notebooks

Purpose: distill best practices from our collected notebooks and resources into a pragmatic, copyable guide for building pedagogically excellent, reproducible, and production-minded tutorial notebooks.

Audience: authors and reviewers creating ALAIN lessons and example notebooks.

## Table of Contents
- Principles that make explanations resonate
- Anatomy of a great tutorial notebook
- Good vs. Bad traits (with examples)
- Quality rubric (scoring)
- Authoring checklist (pre-publish)
- Environment preflight (template)
- OpenAI-compatible client + streaming (template)
- Reproducibility and cost controls
- Accessibility and safety
- Minimal outline template
- References (internal docs)

## Principles: What Makes Explanations Resonate
- Clarity-first: BLUF (Bottom Line Up Front), then details.
- Progressive disclosure: small steps (3–5), one new idea per cell.
- Show-then-tell: run a minimal example, then annotate why it works.
- Pair code with plain-English rationale; define jargon before use.
- Predict → Run → Reflect: ask a quick question before executing; reflect after.
- Tie back to real tasks: each step should have an outcome a practitioner cares about.
- Guardrails visible: note costs, limits, safety, and common pitfalls near where they matter.
- Copy-paste friendly: small, self-contained cells with stable variables and defaults.

## Anatomy of a Great Tutorial Notebook
1) Title + Summary (who/what/why)
2) Requirements & Preflight (env checks, keys present)
3) Setup (imports, config, small helper funcs)
4) Minimal Working Example (MWE)
5) Concept Steps (3–5): each with goal, code, commentary, pitfalls, validation
6) Extensions (parameter tuning, structured output, tools)
7) Cost & Performance notes (tokens, latency, budgets)
8) Wrap-up + Next steps + Links
9) Appendix (troubleshooting, full refs)

## Good vs. Bad Traits
- Good: small cells; deterministic seeds; explicit params; visible errors; restart-and-run-all passes; time estimates.
- Bad: long monolithic cells; hidden state; magic globals; silent failures; secrets in code; vendor lock-in without abstraction; no cleanup.

What shines in our samples:
- Reproducible outputs via `seed` and low temperature.
- Structured output strategies (JSON with schema/repair).
- Streaming demos showing progressive results and cancellation.
- Side-by-side comparisons (baseline vs tuned params) with concise plots.
- Guardrails: rate limits, cost annotations, and clear error messages.
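
The "JSON with schema/repair" strategy above can be sketched as follows. This is a minimal illustration, not the repair logic in `spec/lesson-validate-repair.ts`; the function name `parse_json_with_repair` and the `required_keys` parameter are hypothetical, and the repairs are deliberately conservative.

```python
import json
import re


def parse_json_with_repair(raw, required_keys=()):
    """Parse model output as a JSON object after a few conservative repairs.

    Hypothetical helper for illustration only.
    """
    text = raw.strip()
    # Repair 1: keep only the outermost {...} span. This drops surrounding
    # chatter and any Markdown fence wrapped around the object.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in model output")
    text = text[start:end + 1]
    # Repair 2: remove trailing commas before a closing brace or bracket.
    # Caveat: this would also alter a string value that contains ",}" or ",]".
    text = re.sub(r",\s*([}\]])", r"\1", text)
    obj = json.loads(text)  # still raises json.JSONDecodeError if unrepairable
    # Minimal "schema" check: every required key must be present.
    missing = [k for k in required_keys if k not in obj]
    if missing:
        raise ValueError(f"Missing required keys: {missing}")
    return obj


reply = 'Sure! Here it is: {"title": "Demo", "steps": [1, 2, 3,],}'
lesson = parse_json_with_repair(reply, required_keys=("title", "steps"))
print(lesson["title"], len(lesson["steps"]))
```

Failing loudly on missing keys keeps errors visible, which matches the "visible errors" trait listed above.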

## Notebook Quality Rubric (0–3 each; target ≥ 20/27)
- Clarity: are goals and outcomes obvious?
- Structure: small, logical steps with smooth flow.
- Reproducibility: seeds, deterministic params, run-all success.
- Safety: no secrets in code; disclaimers; input sanitization where relevant.
- Cost awareness: token/time budgets, options to downscale.
- Practicality: real-world task framing; copyable snippets.
- Robustness: error handling, troubleshooting, edge cases.
- Accessibility: alt text, readable figures, headings.
- Pedagogy: predict→run→reflect; avoids jargon without definition.
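
The scoring arithmetic above (nine criteria, 0–3 each, target of at least 20 out of 27) can be sketched as a small helper. The criterion keys and the function name `score_notebook` are hypothetical labels chosen for this example.

```python
# Hypothetical keys, one per rubric criterion listed above.
RUBRIC = (
    "clarity", "structure", "reproducibility", "safety", "cost_awareness",
    "practicality", "robustness", "accessibility", "pedagogy",
)
MAX_PER_CRITERION = 3
TARGET_TOTAL = 20


def score_notebook(scores):
    """Validate per-criterion scores and summarize them against the target."""
    missing = [c for c in RUBRIC if c not in scores]
    if missing:
        raise ValueError(f"Missing scores for: {missing}")
    for name in RUBRIC:
        if scores[name] not in range(MAX_PER_CRITERION + 1):
            raise ValueError(f"{name} must be an integer from 0 to {MAX_PER_CRITERION}")
    total = sum(scores[c] for c in RUBRIC)
    lowest = min(scores[c] for c in RUBRIC)
    return {
        "total": total,
        "max": MAX_PER_CRITERION * len(RUBRIC),
        "passes": total >= TARGET_TOTAL,
        # Criteria tied for the lowest score are the first ones to improve.
        "weakest": [c for c in RUBRIC if scores[c] == lowest],
    }


review = dict.fromkeys(RUBRIC, 2)
review["accessibility"] = 1
print(score_notebook(review))
```

Reporting the weakest criteria alongside the total gives reviewers a concrete next step instead of a bare number.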

## Authoring Checklist (Pre-Publish)
- [ ] Overview explains who/what/why in ≤ 5 lines
- [ ] Preflight validates env and keys; fails fast with guidance
- [ ] Restart-and-run-all passes in a fresh kernel
- [ ] No secrets in source; config via env or `.env`
- [ ] Cells ≤ 30 lines; one new idea per cell
- [ ] Randomness controlled (seeds) and params logged
- [ ] Cost/time notes near heavy cells; easy to downscale
- [ ] Common pitfalls and troubleshooting included
- [ ] Images/figures have alt text; tables readable
- [ ] Links and attributions included; license clear
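
Two of the checklist items above (cell length and secrets in source) lend themselves to an automated check. The sketch below assumes the notebook is already loaded as a dict in the standard `.ipynb` JSON layout (`cells`, each with `cell_type` and `source`). The function name `lint_notebook` and the secret-detection pattern are hypothetical; the pattern is a rough heuristic, not a complete scanner.

```python
import re

MAX_CELL_LINES = 30
# Heuristic only: flags assignments such as API_KEY = "long-literal-value".
SECRET_PATTERN = re.compile(
    r"""(api[_-]?key|secret|token)\s*=\s*['"][^'"]{8,}['"]""",
    re.IGNORECASE,
)


def lint_notebook(nb):
    """Return human-readable problems found in the notebook's code cells."""
    problems = []
    for i, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb may store source as a list of lines
            source = "".join(source)
        n_lines = len(source.splitlines())
        if n_lines > MAX_CELL_LINES:
            problems.append(f"cell {i}: {n_lines} lines (limit {MAX_CELL_LINES})")
        if SECRET_PATTERN.search(source):
            problems.append(f"cell {i}: possible hard-coded secret")
    return problems


demo_nb = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n"]},
        {"cell_type": "code", "source": ["x = 1\n"] * 31},
        {"cell_type": "code", "source": 'API_KEY = "sk-1234567890abcdef"\n'},
        {"cell_type": "code", "source": 'API_KEY = os.getenv("POE_API_KEY")\n'},
    ]
}
for problem in lint_notebook(demo_nb):
    print(problem)
```

To lint a file on disk, load it first with `json.load` and pass the resulting dict to `lint_notebook`.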

## Environment Preflight (Template)

```python
import os, sys, platform, json

print({'python': sys.version.split()[0], 'platform': platform.platform()})

required = ['POE_API_KEY']  # add OPENAI_API_KEY / OPENAI_BASE_URL if you bring your own key (BYOK)
missing = [k for k in required if not os.getenv(k)]
# Raise instead of assert: assertions are skipped when Python runs with -O.
if missing:
    raise RuntimeError(f'Missing required environment variables: {missing}')
print('Env OK')
```


## OpenAI-Compatible Client + Streaming (Template)
This template targets Poe (`https://api.poe.com/v1`) by default and should also work with other OpenAI-compatible endpoints when you bring your own key (BYOK).

```python
import json
import os

import requests

BASE_URL = os.getenv('OPENAI_BASE_URL', 'https://api.poe.com/v1').rstrip('/')
API_KEY = os.getenv('OPENAI_API_KEY') or os.getenv('POE_API_KEY')
MODEL = os.getenv('TEACHER_MODEL', 'gpt-oss-20b')

def chat_complete(messages, model=MODEL, stream=True, temperature=0, timeout=60):
    """Send a chat request and return the reply text; print deltas as they arrive when streaming."""
    headers = {'Authorization': f'Bearer {API_KEY}', 'Content-Type': 'application/json'}
    payload = {'model': model, 'messages': messages, 'temperature': temperature, 'stream': stream}
    url = f'{BASE_URL}/chat/completions'
    with requests.post(url, headers=headers, json=payload, stream=stream, timeout=timeout) as r:
        r.raise_for_status()  # surface auth and rate-limit errors immediately
        if not stream:
            data = r.json()
            return data['choices'][0]['message']['content']
        # Stream server-sent events: each payload line looks like `data: {...}`.
        text = ''
        for line in r.iter_lines():
            if not line.startswith(b'data:'):
                continue  # skip blank keep-alive lines and non-data fields
            chunk = line[len(b'data:'):].strip().decode('utf-8')
            if chunk == '[DONE]':
                break
            try:
                obj = json.loads(chunk)
            except json.JSONDecodeError:
                continue  # skip malformed chunks only; other errors still propagate
            choices = obj.get('choices') or [{}]
            delta = (choices[0].get('delta') or {}).get('content')
            if delta:
                text += delta
                print(delta, end='', flush=True)
        return text

# Minimal example
example = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'In one sentence, what is a good tutorial notebook?'}
]
_ = chat_complete(example, stream=True)
```


## Reproducibility and Cost Controls
- Set `temperature=0` to make demos as repeatable as possible; this reduces variation but does not guarantee identical outputs, so also pass a seed if the model supports one.
- Log token counts and latency when available; show a budget and the remaining quota.
- Start with small inputs; optionally include a heavy cell with a time estimate and a skip flag.

```python
import random

import numpy as np

random.seed(0)
np.random.seed(0)
print('Seeds set (Python, NumPy)')
```
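
The budget, latency, and skip-flag bullets above can be sketched as follows. All names here (`RUN_HEAVY`, `TokenBudget`, `timed`, `heavy_step`) are hypothetical. The sketch also assumes the endpoint returns an OpenAI-style `usage` object with a `total_tokens` field in non-streaming responses, which not every provider does.

```python
import os
import time

# Hypothetical skip flag: heavy cells run only when RUN_HEAVY=1 is set.
RUN_HEAVY = os.getenv("RUN_HEAVY", "0") == "1"


class TokenBudget:
    """Track tokens spent against a fixed budget for the whole notebook."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.used = 0

    @property
    def remaining(self):
        return max(self.max_tokens - self.used, 0)

    def record(self, usage):
        """Add one response's `usage` dict and return the tokens remaining."""
        self.used += int(usage.get("total_tokens", 0))
        return self.remaining


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def heavy_step():
    # Stand-in for an expensive cell (large batch, long generation, ...).
    return sum(i * i for i in range(200_000))


budget = TokenBudget(max_tokens=2_000)
left = budget.record({"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42})
print(f"Tokens remaining: {left}")

if RUN_HEAVY:
    value, seconds = timed(heavy_step)
    print(f"Heavy step finished in {seconds:.2f}s")
else:
    print("Skipping heavy cell (set RUN_HEAVY=1 to run it)")
```

Defaulting the flag to off means Restart-and-Run-All stays cheap unless the reader opts in.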


## Accessibility and Safety
- Use headings and lists for scanability; avoid tiny fonts in figures.
- Alt text for images; colorblind-safe palettes.
- Avoid sensitive data; sanitize inputs; add disclaimers for medical/legal topics.
- Include troubleshooting for common errors (auth, rate limits, network).
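
One way to make the troubleshooting bullet concrete is a small lookup that turns an HTTP status code into actionable guidance. The function name `explain_http_error` and the hint wording are hypothetical; the status-code meanings are the standard HTTP ones, and individual providers may use them differently.

```python
# Hypothetical hints keyed by standard HTTP status codes.
HTTP_HINTS = {
    401: "Authentication failed: check that your API key is set and valid.",
    403: "Access denied: the key may lack permission for this model or endpoint.",
    404: "Not found: check the base URL and the model name.",
    429: "Rate limited: wait and retry, or reduce request volume.",
}


def explain_http_error(status_code):
    """Return a short troubleshooting hint for an HTTP error status."""
    if status_code in HTTP_HINTS:
        return HTTP_HINTS[status_code]
    if 500 <= status_code < 600:
        return "Server error: the provider may be down; retry after a short wait."
    return f"Unexpected HTTP {status_code}: inspect the response body for details."


for code in (401, 429, 503):
    print(code, "->", explain_http_error(code))
```

A notebook could call this from an `except` block around its request helper so the reader sees guidance rather than a bare stack trace. Network failures such as timeouts carry no status code and would need a separate hint.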

## Minimal Outline Template (Copy/Paste)
1. Title and Summary
2. Requirements & Preflight
3. Setup (imports, keys, client)
4. Minimal Working Example
5. Step 1: Core concept
6. Step 2: Variation or parameter tuning
7. Step 3: Structured output or tooling
8. Extensions and exercises (predict→run→reflect)
9. Cost/latency and pitfalls
10. Wrap-up and next steps

## References
- Internal: `teacher-harmony-format.md`, `spec/lesson.schema.json`, `spec/lesson-validate-repair.ts`
- Patterns drawn from: OpenAI Cookbook, Anthropic examples, and in-repo notebooks.

Tip: Keep notebooks self-sufficient and compatible with Restart-and-Run-All.