Releases: sanniheruwala/RedNotebookAI
v0.7.27 — bundled AI hotfix (chatml + latency + banner copy)
Three bugs from the first real use of the bundled-Qwen image in
v0.7.26:
1. Summarize / explain / ask AI ran forever
chat_format="qwen" isn't stable across llama-cpp-python versions —
on 0.2.90+ it's either unregistered or aliased to a template that
doesn't match Qwen 2.5's actual ChatML format. Result: the model
never emits the EOS token, so generation runs all the way to
max_tokens at ~5-15 tok/sec on a shared CPU — 30-60 s per call,
felt like a hang.
Fixed by:
- chat_format="chatml": Qwen 2.5's actual conversation template,
  stable in llama-cpp-python.
- Explicit stop tokens <|im_end|> + <|endoftext|> as belt-and-suspenders
  against a GGUF metadata quirk suppressing the EOS.
- Greedy decoding (temperature=0.0): deterministic + slightly
  faster sampler.
- n_ctx halved (4096 → 2048): prompt eval scales O(ctx²) and our
  prompts comfortably fit 2 k.
- Per-method max_tokens caps tightened across the board.
| Method | Before | After |
|---|---|---|
| summarize_result | 320 | 180 |
| explain_sql | 300 | 200 |
| optimize_sql | 420 | 260 |
| generate_sql | 400 | 240 |
| infographic_brief | 480 | 300 |
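As a rough illustration of how these settings fit together, the sketch below feeds them to llama-cpp-python's `Llama` constructor and `create_chat_completion`. The helper names (`model_kwargs`, `completion_kwargs`, `run`) and the `MAX_TOKENS` dict are assumptions made for this example, not the project's actual code; only the values come from the list and table above.

```python
# Illustrative sketch; names are hypothetical, values come from the release notes.
STOP_TOKENS = ["<|im_end|>", "<|endoftext|>"]

# "After" caps from the table above.
MAX_TOKENS = {
    "summarize_result": 180,
    "explain_sql": 200,
    "optimize_sql": 260,
    "generate_sql": 240,
    "infographic_brief": 300,
}


def model_kwargs(model_path: str) -> dict:
    """Keyword arguments for llama_cpp.Llama(...)."""
    return {
        "model_path": model_path,
        "n_ctx": 2048,            # halved from 4096
        "chat_format": "chatml",  # instead of the unstable "qwen"
    }


def completion_kwargs(method: str) -> dict:
    """Keyword arguments for Llama.create_chat_completion(...)."""
    return {
        "max_tokens": MAX_TOKENS[method],
        "temperature": 0.0,   # greedy decoding
        "stop": STOP_TOKENS,  # explicit stops in case the EOS is suppressed
    }


def run(method: str, messages: list, model_path: str) -> str:
    # Imported lazily so the helpers above work without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(**model_kwargs(model_path))
    out = llm.create_chat_completion(messages=messages, **completion_kwargs(method))
    return out["choices"][0]["message"]["content"]
```

Keeping the caps in one dict makes it easy to see, and test, that every method has a bound.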
Worst-case wall clock on a 5 tok/sec shared CPU: now ~35-60 s per
call instead of ~60-100 s plus runaway.
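These ranges follow from dividing each cap by the decode rate. A quick check, assuming generation always runs to the cap and ignoring prompt evaluation time (so real calls can take somewhat longer):

```python
# Caps copied from the table above; the 5 tok/sec rate is the shared-CPU figure quoted.
BEFORE = {"summarize_result": 320, "explain_sql": 300, "optimize_sql": 420,
          "generate_sql": 400, "infographic_brief": 480}
AFTER = {"summarize_result": 180, "explain_sql": 200, "optimize_sql": 260,
         "generate_sql": 240, "infographic_brief": 300}


def worst_case_seconds(max_tokens: int, tok_per_sec: float = 5.0) -> float:
    """Seconds spent decoding if the model runs all the way to max_tokens."""
    return max_tokens / tok_per_sec


before = [worst_case_seconds(v) for v in BEFORE.values()]
after = [worst_case_seconds(v) for v in AFTER.values()]

print(min(before), max(before))  # 60.0 96.0
print(min(after), max(after))    # 36.0 60.0
```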
2. "Failed to get API key" banner for bundled
The admin AI page banner literally said "check that the API key is
valid" when any non-mock provider fell back to mock. Wrong copy for
bundled, which has no API key concept.
Fixed by a new fallbackHint(provider) helper that returns provider-
specific guidance:
| Provider | Hint |
|---|---|
| bundled | verify the bundled GGUF model exists at /app/models/ and that llama-cpp-python is installed |
| ollama | verify the Ollama server is reachable at OLLAMA_BASE_URL and the configured model is pulled |
| openai / anthropic | verify the API key is valid and the configured model is correct |
The registry's WARNING log line was also updated to drop the
misleading "Check API key + bundled SDK install" tail and list the
actual first-line failure mode per provider.
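The provider-to-hint mapping amounts to a small lookup. The sketch below is a Python rendering for illustration only: the real helper is named fallbackHint and its language and signature are not shown in these notes, and the default hint for unknown providers is an assumption.

```python
# Hint strings are copied from the table above.
_API_KEY_HINT = "verify the API key is valid and the configured model is correct"

FALLBACK_HINTS = {
    "bundled": ("verify the bundled GGUF model exists at /app/models/ "
                "and that llama-cpp-python is installed"),
    "ollama": ("verify the Ollama server is reachable at OLLAMA_BASE_URL "
               "and the configured model is pulled"),
    "openai": _API_KEY_HINT,
    "anthropic": _API_KEY_HINT,
}

# Assumed default; the release notes do not say what unknown providers get.
DEFAULT_HINT = "check the server logs for the provider error"


def fallback_hint(provider: str) -> str:
    """Return provider-specific guidance for the 'fell back to mock' banner."""
    return FALLBACK_HINTS.get(provider.strip().lower(), DEFAULT_HINT)
```

With this shape, the bundled provider can never show API-key wording, which was the original bug.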
3. Pydantic schema-shadow warnings
UserWarning: Field name "schema" in "ResultContext" / "InfographicContext" shadows an attribute in parent "BaseModel" fired on every server start.
Silenced at the model level with ConfigDict(protected_namespaces=()).
Upgrade
docker pull ghcr.io/sanniheruwala/rednotebook-ai:v0.7.27
HF Space: bump the FROM tag in the Space's Dockerfile from
:v0.7.26 to :v0.7.27.
v0.7.26 — fix the v0.7.25 image build
v0.7.25 was tagged but never produced a Docker image — the build
failed in both amd64 and arm64 runtime stages with:
ERROR: Failed to build installable wheels for some pyproject.toml
based projects (llama-cpp-python)
Root cause
llama-cpp-python upstream doesn't publish a complete wheel matrix
to PyPI. They publish prebuilt CPU-only wheels to a dedicated index
at https://abetlen.github.io/llama-cpp-python/whl/cpu covering
every supported Python × linux_{x86_64,aarch64} combination.
pip in the python:3.12-slim runtime stage couldn't find a matching
manylinux wheel on PyPI and fell through to a source build,
which dies because slim images have no compiler / cmake / make.
The fix
One-line change in the runtime stage:
COPY --from=python-build /wheels /wheels
-RUN pip install --no-cache-dir /wheels/*.whl \
+RUN pip install --no-cache-dir \
+ --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu \
+ /wheels/*.whl \
 && rm -rf /wheels
PyPI stays the primary index. The CPU wheel index is consulted
secondarily for packages PyPI doesn't fully cover — pip can only
pull llama-cpp-python from there because nothing else lives in
that index.
v0.7.25 status
The v0.7.25 GitHub Release page exists for documentation purposes,
but no :v0.7.25 Docker image was ever published. Use :v0.7.26
as the first release with the bundled-Qwen feature actually built.
Upgrade
docker pull ghcr.io/sanniheruwala/rednotebook-ai:v0.7.26
HF Space: bump the FROM tag in the Space's Dockerfile from
:v0.7.24 (the previous shipping release) directly to :v0.7.26.
Skip :v0.7.25 — it doesn't exist.
v0.7.25 — bundled local AI by default
The published Docker image now ships with Qwen2.5-Coder-1.5B-Instruct
baked in as the default AI provider. Bare docker run gets you real
SQL generation, summaries, and infographic narratives — no API key,
no Ollama install, no setup.
This is the change that lets us legitimately call the tool AI-native.
What's bundled
| Component | Detail |
|---|---|
| Model | Qwen2.5-Coder-1.5B-Instruct, Q4_K_M quant (~1 GB on disk) |
| License | Apache 2.0 |
| Runtime | llama-cpp-python (CPU-only, ~50 MB wheel) |
| Image size impact | 600 MB → ~1.6 GB |
| RAM at runtime | ~1.5 GB resident while loaded |
| Speed | ~30-50 tok/sec CPU → 3-5 s per SQL gen. Apple Silicon Metal / GPU = ~instant |
| Cold start | ~2-3 s mmap at app boot, then every request is warm |
Behaviour matrix
| Path | Default AI_PROVIDER | Model file location |
|---|---|---|
| Docker image | bundled (baked into ENV) | /app/models/qwen2.5-coder-1.5b-instruct-q4_k_m.gguf |
| pip install rednotebook-ai | mock (no model in wheel) | User points QWEN_MODEL_PATH at their own GGUF |
| HF Space (after bumping FROM tag) | bundled | Inherits from image |
Quality + tradeoff framing
A 1.5B coder model writes plausible SQL on the bundled sample
notebook and clearly-degraded SQL on real warehouses. The admin AI
page now leads with the bundled option and explains the tradeoff in
plain English so users know when to graduate to OpenAI / Anthropic /
a bigger local model via Ollama.
Graceful degradation
Two failure modes, both fall back to mock with a single WARNING log
line — never crash the AI surface:
- llama_cpp not installed: the side-effect import in
  server/main.py catches it and bundled stays unregistered.
- GGUF file missing on disk: BundledAIProvider.__init__ raises
  FileNotFoundError; the registry's existing try/except in
  get_provider catches it.
Three new tests cover both paths.
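A minimal sketch of the two fallback paths, assuming a dict-based registry. Only `BundledAIProvider`, `get_provider`, and the `FileNotFoundError` behaviour come from the notes above; `MockProvider`, `register_bundled`, `_REGISTRY`, and the logger name are hypothetical.

```python
import logging
import os

logger = logging.getLogger("ai.registry")  # hypothetical logger name


class MockProvider:
    name = "mock"


class BundledAIProvider:
    name = "bundled"

    def __init__(self, model_path: str) -> None:
        # Failure mode 2: the GGUF file is missing on disk.
        if not os.path.isfile(model_path):
            raise FileNotFoundError(f"GGUF model not found: {model_path}")
        self.model_path = model_path


_REGISTRY = {"mock": MockProvider}


def register_bundled(model_path: str) -> None:
    """Register 'bundled' only if llama_cpp imports (failure mode 1)."""
    try:
        import llama_cpp  # noqa: F401
    except ImportError:
        return  # 'bundled' stays unregistered; get_provider logs the fallback
    _REGISTRY["bundled"] = lambda: BundledAIProvider(model_path)


def get_provider(name: str):
    """Return the requested provider, or mock with a single WARNING on failure."""
    factory = _REGISTRY.get(name)
    try:
        if factory is None:
            raise LookupError(f"provider {name!r} is not registered")
        return factory()
    except Exception as exc:
        logger.warning("AI provider %r unavailable (%s); falling back to mock", name, exc)
        return MockProvider()
```

Either failure ends in the same place: one log line and a working mock provider, so the AI surface never crashes.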
Upgrade
docker pull ghcr.io/sanniheruwala/rednotebook-ai:v0.7.25
HF Space refresh: bump the FROM tag in the Space's Dockerfile to
:v0.7.25. First HF rebuild will be slower than usual (~5-10 min
extra) because the model layer downloads 1 GB from HuggingFace. Every
subsequent app-only release reuses the cached model layer.
Knobs
- AI_PROVIDER=bundled|openai|anthropic|ollama|mock: pick provider.
- QWEN_MODEL_PATH=/path/to/your.gguf: override the bundled GGUF
  (drop in Qwen 7B, Phi-3.5, etc.).
- BUNDLED_AI_THREADS=4: CPU thread count for inference. Defaults
  to cpu_count - 1.
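One way these knobs could be read, shown as a sketch rather than the project's actual settings code. The function name `read_ai_settings`, the validation error, and the floor of one thread are assumptions; the variable names and the cpu_count - 1 default come from the list above. The default provider is `mock` here, matching a plain pip install (the Docker image sets AI_PROVIDER=bundled in its ENV).

```python
import os

VALID_PROVIDERS = ("bundled", "openai", "anthropic", "ollama", "mock")


def read_ai_settings(env=None, default_provider: str = "mock") -> dict:
    """Resolve provider, model path, and thread count from environment variables."""
    env = os.environ if env is None else env

    provider = env.get("AI_PROVIDER", default_provider).strip().lower()
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"AI_PROVIDER must be one of {VALID_PROVIDERS}, got {provider!r}")

    threads_raw = env.get("BUNDLED_AI_THREADS")
    if threads_raw:
        threads = int(threads_raw)
    else:
        # cpu_count - 1, with an assumed floor of 1 for single-core hosts.
        threads = max(1, (os.cpu_count() or 2) - 1)

    return {
        "provider": provider,
        "model_path": env.get("QWEN_MODEL_PATH"),  # None means "use the default GGUF"
        "threads": threads,
    }
```

For example, `read_ai_settings({"AI_PROVIDER": "bundled", "BUNDLED_AI_THREADS": "4"})` resolves to the bundled provider with four threads and no model path override.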
v0.7.24 — chart hotfix
Two regressions in v0.7.23 that real use surfaced within hours of
shipping.
Downloaded PNG was blank
The live ECharts in the full-size chart card runs in SVG renderer
mode for results under 3k points (the common analyst-shaped result).
In SVG mode, ECharts' getDataURL({type:'png'}) doesn't reliably
emit a real raster — depending on browser and chart shape it returns
either a mis-typed SVG-data-URL or a partly-rendered transparent
PNG. The "startsWith data:image/png" defensive check would sometimes
pass anyway, sending the malformed URL straight to the file system.
Now: PNG export always goes through a headless canvas re-render,
regardless of the live renderer. Plus four quality fixes:
- Animation disabled in the headless option so we don't capture
  the chart mid-grow (the second-most-common blank-export mode).
- One requestAnimationFrame wait after setOption so the
  renderer has produced its final pixels before we read them.
- White background instead of transparent: Slack / Docs /
  Notion render behind a light surface anyway, and transparent
  PNGs were being mis-perceived as empty in some viewers.
- DataZoom stripped from the still image so the pan slider
  doesn't pollute the exported chart.
Cost: ~150ms extra per PNG export. Worth it for always-correct
downloads.
Customize popover extended beyond the viewport with no scroll
max-h-[70vh] was meaningless on short windows — Radix sizes the
popover to its natural content height first, and 70vh wasn't tall
enough on screens under ~700px.
Now the popover is a flex column with:
- max-h-[var(--radix-popover-content-available-height)]:
  Radix's collision detector populates this CSS variable with the
  actual largest height that fits without going off-screen. Tighter
  than 70vh and dynamic.
- Pinned header (shrink-0): the title and Reset stay visible
  while you scroll.
- Inner scrollable body (overflow-y-auto overscroll-contain).
- max-w-[calc(100vw-24px)] so the popover doesn't overflow
  the screen on tiny phones.
- collisionPadding={12} so it keeps breathing room from
  screen edges.
Upgrade
docker pull ghcr.io/sanniheruwala/rednotebook-ai:v0.7.24
HF Space refresh: bump the FROM tag in the Space's Dockerfile to
:v0.7.24.
v0.7.23 — Chart tab grew up
This release replaces the empty axis-picker that used to greet every
result with a complete report-grade chart card. It rolls in two
feature sets — the auto-recommended chart grid (PR #19) and the
Share + Customize + inline editing surface (PR #20) — because the
v0.7.22 version was transient and never tagged.
The new Chart tab, in one screen
┌─────────────────────────────────────────────────────────────────┐
│ 1. You land on a 2×2 grid of suggested charts (no clicks needed)│
│ 2. Pick one → it expands to full-size with report chrome │
│ 3. Rename in place, tweak palette / format / layout in Customize│
│ 4. Share → download PNG / SVG / clipboard / CSV │
└─────────────────────────────────────────────────────────────────┘
Auto-recommended chart grid
A pure-TypeScript heuristic engine reads your result and ranks chart
candidates by shape:
| Roles | Chart |
|---|---|
| time + numeric | line / area |
| time + numeric + low-card category | stacked area |
| low-card category + numeric | bar (top categories) |
| 2-8 distinct categories + numeric | donut for composition |
| two numerics | scatter (coloured by a category if present) |
| single numeric | histogram |
| single-row + single numeric | KPI |
| boolean + numeric | true/false bar |
Each tile shows a live ECharts thumbnail with a one-line "why this
one" caption. "Try another set" cycles through up to 8 ranked
candidates. "Custom" opens the manual axis picker for the cases
the heuristic missed.
Deterministic — same data, same suggestions. ~1ms to compute, no LLM,
no round trip.
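The role-to-chart table can be read as a list of rules applied in order. The sketch below is a Python illustration of that idea, not the project's engine (which is described as pure TypeScript). The `Column` type, the `LOW_CARD_MAX` threshold, the rule ordering, and the reading of "single numeric" as "exactly one numeric column" are all assumptions.

```python
from dataclasses import dataclass

LOW_CARD_MAX = 20  # assumed cutoff; the notes do not define "low-card"


@dataclass(frozen=True)
class Column:
    name: str
    role: str          # "time", "numeric", "category", or "boolean"
    distinct: int = 0  # number of distinct values (used for categories)


def recommend(columns, row_count: int, limit: int = 8) -> list:
    """Return chart candidates in table order; same input gives same output."""
    times = [c for c in columns if c.role == "time"]
    nums = [c for c in columns if c.role == "numeric"]
    bools = [c for c in columns if c.role == "boolean"]
    cats = [c for c in columns if c.role == "category"]
    low_cats = [c for c in cats if c.distinct <= LOW_CARD_MAX]

    out = []
    if times and nums:
        out += ["line", "area"]
        if low_cats:
            out.append("stacked area")
    if low_cats and nums:
        out.append("bar")
    if nums and any(2 <= c.distinct <= 8 for c in cats):
        out.append("donut")
    if len(nums) >= 2:
        out.append("scatter")
    if len(nums) == 1 and row_count > 1:
        out.append("histogram")
    if len(nums) == 1 and row_count == 1:
        out.append("kpi")
    if bools and nums:
        out.append("true/false bar")
    return out[:limit]
```

Because the rules are plain conditionals over column roles, the result is deterministic and needs no model call, which is the property the notes emphasise.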
Report-grade chart chrome
Every full-size chart now has:
- 1px gradient accent strip across the top in the active palette.
- HTML header above the canvas with an auto-generated title
  ({Y} by {X}, {Y} over time, Distribution of {Y}, etc.), a
  one-line description ("Sum of revenue · grouped by region"), and a
  row-count badge.
- HTML footer with the aggregation method, optional "truncated"
  warning, query duration in ms/s, and a quiet "Made with
  RedNotebook" mark.
- The internal ECharts title is removed so typography (tabular nums,
  kerning, weights) stays consistent with the rest of the app.
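The title and description templates in the header bullet can be sketched as simple string formatting. Which template applies to which chart is not spelled out in these notes, so the selection logic below (histogram gets "Distribution of", a time axis gets "over time", everything else gets "by") is an assumption, as are the function names.

```python
from typing import Optional


def auto_title(chart_type: str, y: str, x: Optional[str] = None,
               x_is_time: bool = False) -> str:
    """Pick one of the title templates quoted above (selection rules assumed)."""
    if chart_type == "histogram":
        return f"Distribution of {y}"
    if x_is_time:
        return f"{y} over time"
    if x:
        return f"{y} by {x}"
    return y


def auto_description(aggregation: str, y: str, x: str) -> str:
    """Build a one-line description such as 'Sum of revenue · grouped by region'."""
    return f"{aggregation.capitalize()} of {y} · grouped by {x}"
```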
Share menu
Every chart has a Share button that opens a dropdown with four export
paths:
- Download as PNG: high-res raster (pixel ratio 2, 1280×720
  export size regardless of on-screen size), paste anywhere.
- Download as SVG: vector, scales without blur.
- Copy image to clipboard: straight into Slack / Docs / email.
- Download data as CSV: the numbers behind the chart, header
  row matches the result columns, CSV-escaped values for commas /
  quotes / newlines / dates.
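For reference, the CSV escaping rules mentioned in the last bullet look like this when sketched with Python's standard `csv` module. The real export lives in the frontend, so this is only an illustration of the behaviour; the function names and the choice of ISO 8601 for dates are assumptions.

```python
import csv
import io
from datetime import date, datetime


def _cell(value):
    """Normalise a value before writing: None becomes empty, dates become ISO 8601."""
    if value is None:
        return ""
    if isinstance(value, (date, datetime)):
        return value.isoformat()
    return value


def to_csv(columns, rows) -> str:
    """Serialise a result set; the header row matches the result columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([_cell(v) for v in row])
    return buf.getvalue()
```

Fields containing commas, quotes, or newlines are wrapped in double quotes, and embedded quotes are doubled, so a value like `said "hi", twice` is written as `"said ""hi"", twice"`.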
Cross-renderer fallback: ECharts emits only the format that matches
its active renderer (canvas → PNG, SVG mode → SVG). The export utility
detects this and briefly mounts a hidden 1280×720 echarts instance in
the desired renderer to grab the right format. User never sees the
second render.
Customize popover
A new Customize button next to Share opens a popover with five
sections that all persist into config.options:
- Title & description: free-text override of the auto-generated
  title and description.
- Look: 5 palette presets (Brand / Ocean / Forest / Sunset /
  Mono) with swatch previews + 3 heights (Compact / Standard / Tall).
- Numbers: Y-axis format (Auto / Number / $ / %).
- Show: toggles for legend / gridlines / data labels (bar) /
  smooth lines (line/area) / area fill (line). Chart-type-specific
  toggles hide themselves when irrelevant.
Inline title and subtitle editing
Click the title or subtitle in the header → it becomes an inline
input, autofocused and text-selected. Enter or blur commits; Escape
reverts. Most users want to rename, not open a dialog.
Read-only consumers (visualization-cell, published HTML) pass no
onChange and the field stays display-only.
Stable option memo
Title/subtitle changes used to trigger an ECharts canvas redraw +
re-animation per keystroke. The useMemo deps are now the structural
fields only (chart_type, x, y, color, aggregation,
options, filters) — HTML header changes no longer disturb the
canvas. Typing feels instant.
Notes
- The v0.7.22 version was transient and never tagged; this release
  covers everything that would have shipped under it.
- No new Python dependencies. One new frontend file
  (frontend/lib/chart-export.ts) and one new UI primitive wrapper
  (frontend/components/ui/popover.tsx over the already-installed
  Radix Popover).
Upgrade
docker pull ghcr.io/sanniheruwala/rednotebook-ai:v0.7.23
HF Space refresh: bump the FROM tag in the Space's Dockerfile from
v0.7.21 to v0.7.23.
v0.7.21
Install
Docker (any OS):
docker run -p 8000:8000 ghcr.io/sanniheruwala/rednotebook-ai:v0.7.21
Python (any OS): download the attached *.whl and pip install it.
See the README for full instructions.
What's Changed
- chore(deps): bump next from 14.2.35 to 15.5.18 in /frontend in the npm_and_yarn group across 1 directory by @dependabot[bot] in #7
- feat(onboarding+demo): quick-connect templates, tour, rich empty state, sample seed, DEMO_MODE (v0.7.21) by @sanniheruwala in #16
New Contributors
- @dependabot[bot] made their first contribution in #7
Full Changelog: v0.7.20...v0.7.21
v0.7.20
Install
Docker (any OS):
docker run -p 8000:8000 ghcr.io/sanniheruwala/rednotebook-ai:v0.7.20
Python (any OS): download the attached *.whl and pip install it.
See the README for full instructions.
What's Changed
- fix(dialog): drop duplicate close X + repair History dialog body height (v0.7.20) by @sanniheruwala in #13
- fix: complete the v0.7.20 patch — Cursor removal, persist on refresh, history redesign, publisher charts by @sanniheruwala in #15
Full Changelog: v0.7.19...v0.7.20
v0.7.19
Install
Docker (any OS):
docker run -p 8000:8000 ghcr.io/sanniheruwala/rednotebook-ai:v0.7.19
Python (any OS): download the attached *.whl and pip install it.
See the README for full instructions.
What's Changed
- feat(metadata+chart): virtual files catalog + HD chart renderer by @sanniheruwala in #11
- chore: bump to v0.7.19 by @sanniheruwala in #12
Full Changelog: v0.7.18...v0.7.19
v0.7.18
Install
Docker (any OS):
docker run -p 8000:8000 ghcr.io/sanniheruwala/rednotebook-ai:v0.7.18
Python (any OS): download the attached *.whl and pip install it.
See the README for full instructions.
What's Changed
- fix(docker): v0.7.18 hotfix — Alpine→Debian frontend stage + uploads/published dir chown by @sanniheruwala in #10
Full Changelog: v0.7.17...v0.7.18
v0.7.17
Install
Docker (any OS):
docker run -p 8000:8000 ghcr.io/sanniheruwala/rednotebook-ai:v0.7.17
Python (any OS): download the attached *.whl and pip install it.
See the README for full instructions.
What's Changed
- feat(sql-cell): dialect-aware SQL formatter button by @sanniheruwala in #8
- chore: bump to v0.7.17 by @sanniheruwala in #9
Full Changelog: v0.7.16...v0.7.17