Visual identity overhaul: dark-mode default, varied accent palette, zh-tw vocab guard #12

Merged
JE-Chen merged 18 commits into main from dev
May 21, 2026

@JE-Chen JE-Chen commented May 21, 2026

Summary

Eighteen commits' worth of visual-identity + vocabulary work for the deck pipeline.

Visual identity

  • Per-language typography pass (Inter + Microsoft JhengHei UI / YaHei UI / Yu Gothic UI / Malgun Gothic / Nirmala UI) — removes the Calibri default tell.
  • Programmatic accent geometry: top accent bar on every content slide, left band on the cover.
  • Academic-style table formatting: default grid stripped, navy header rule, soft inter-row dividers, alternating row stripe, middle vertical alignment, bold row labels.
  • figures= promoted to a first-class authoring requirement (no soft cap on figure count).

Dark mode

  • New default render path. The deck is built with the light palette, then a post-pass swaps text/fill/cell-border RGBs via _LIGHT_TO_DARK_TEXT + _LIGHT_TO_DARK_FILL. Opt out with --light-mode (CLI) / Deck-tab checkbox (GUI) / ExportOptions(dark_mode=False) (programmatic).
  • Dark-mode contract (HARD): every text helper sets explicit run.font.color.rgb; _swap_text_colors promotes leftover None/black runs to near-white as a safety net. Regression test pins both layers.
  • Light-on-light contrast contract (HARD): every new light-fill RGB needs a matching _LIGHT_TO_DARK_FILL entry. Caught the _RQ_BOX_FILL near-white-on-near-white bug.
  • No red text contract (HARD): _BRAND_ACCENT (#C0392B) banned for text. Replacement palette: _BRAND_HIGHLIGHT (teal-700 #0E7490) for headline emphasis (KPI value, RQ question), _BRAND_GREY for caption / chrome (paper-table caption, figure-unavailable fallback). Dark-mode swap maps teal-700 -> teal-400.

Vocabulary guard

  • New language-vocabulary-check subagent + test_zh_tw_files_use_traditional_chinese_vocabulary with ~244 regex patterns across 8 rounds — catches Simplified-Chinese loan words written with Traditional hanzi (e.g. 內存 / 魯棒性 / 軟件 / 緩存).

Other

  • CLI bare invocation routes to GUI (instead of arg-error).
  • GUI Enrich + Deck tabs fully implemented; collection_ready signal wiring.
  • 4-paper worked example: scripts/regen_speculative_decoding_zh_tw.py with hand-curated figures + light + dark variants.
  • scripts/_audit_dark_text.py flags both invisibility failure modes (dark-on-dark + light-on-light).
  • scripts/_extract_speculative_figures.py runs extract_figures on the worked-example PDFs.

Test plan

  • pytest tests/ — 515/515 pass
  • ruff check autopapertoppt/ sources/ tests/ scripts/ — clean
  • bandit -c pyproject.toml -r autopapertoppt/ sources/ — only the pre-existing nosec note
  • Manual: render the 4 speculative-decoding decks light + dark, confirm no invisible text + no red runs + table styling applied + figures present

JE-Chen added 18 commits May 20, 2026 13:33
The existing test_zh_tw_files_use_traditional_chinese_vocabulary
guard caught the easy class (Simplified hanzi characters leaking into
zh-tw surfaces). But Simplified-Chinese has a second tier of drift —
loan words that use Traditional hanzi yet are S-Chinese vocabulary:
內存 (memory) vs 記憶體, 魯棒性 (robustness) vs 穩健性, 視頻 vs 影片,
屏幕 vs 螢幕, 鼠標 vs 滑鼠. A pure orthography pass cannot catch
these because every character looks correct.

.claude/agents/language-vocabulary-check.md
  New subagent doc with a per-language anti-pattern table (zh-tw +
  zh-cn populated; ja/ko/es/pt/fr/de cautions). Used after any change
  that touches readmes/, docs/<lang>/, scripts/regen_*<lang>*.py,
  autopapertoppt/gui/i18n.py, or rendered .pptx / .md / .xlsx text.
  Read-only; reports offenders by file + offset + suggested fix.

tests/test_i18n.py
  Extends s_only_patterns in test_zh_tw_files_use_traditional_chinese_vocabulary
  with the lexicon-level offenders the new agent enumerates: 內存,
  魯棒, 視頻, 屏幕, 鼠標, 黑客, 服務器, 數據庫, 操作系統, 應用程序,
  字符 / 字符串, 線程, 進程, 隊列, 帶寬, 內核, 內置, 鏈接, 加載,
  設置, 集群, 模塊, 集成, 重定向, 主頁, 編程, 賬戶 / 賬號, 菜單,
  對話框, 句柄, 異常 (computing). Most carry a negative lookbehind or
  lookahead to avoid false positives on compound words that happen
  to contain the offending substring (e.g. 算法 inside 演算法 is left alone).

scripts/regen_llm_security_batch_zh_tw.py
  Fix 5 real offenders the extended guard found:
   - 2x 魯棒(性) → 穩健(性)
   - 3x 異常 → 例外 (in computing context: anomaly detection slides)

CLAUDE.md
  Adds language-vocabulary-check to the front-matter agent list and
  the "Where the detailed rules live" table.
Each pattern uses Traditional characters but is Simplified-Chinese
vocabulary — a plain T-vs-S character pass cannot catch them.

Hardware:
  硬件 → 硬體   主板 → 主機板   顯卡 → 顯示卡
  硬盤 → 硬碟   軟盤 → 軟碟    光盤 → 光碟

Printing / I/O:
  打印 / 打印機 → 列印 / 印表機
  串口 → 序列埠

Crypto / data:
  密鑰 → 金鑰
  數組 → 陣列
  變量 → 變數 (excludes `不變量` = invariant via negative-lookbehind;
              `不變量` is the math / formal-verification term and is
              accepted in TW)
  字節 → 位元組
  比特 → 位元 (excludes `比特幣` = bitcoin via negative-lookahead;
              the bitcoin loan word is accepted in TW)

Code / async:
  注釋 → 註解 / 註釋     模板 → 範本
  跟蹤 → 追蹤           異步 → 非同步

UI / display / media:
  圖標 → 圖示
  高清 → 高畫質
  寬屏 → 寬螢幕
  信道 → 通道 / 頻道
  鏡像文件 → 映像檔
  文件夾 → 資料夾
  短信 → 簡訊

Caught one real offender in scripts/regen_llm_security_batch_zh_tw.py:
  "讓 origin-entity 不變量明確化…" was OK (math invariant); a separate
  `…的變量資訊化…` line WAS S-Chinese drift and is fixed to `變數`.
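
The `不變量` and `比特幣` exclusions described above can be sketched with Python's `re` lookarounds. This is a minimal illustration only: the pattern strings, the `S_ONLY_PATTERNS` name, and the `find_offenders` helper are assumptions made for the example, not the project's actual entries in tests/test_i18n.py.

```python
import re

# Hypothetical (pattern, suggested replacement) pairs; the real list is
# s_only_patterns in tests/test_i18n.py and may be shaped differently.
S_ONLY_PATTERNS = [
    (re.compile(r"(?<!不)變量"), "變數"),  # negative lookbehind skips 不變量
    (re.compile(r"比特(?!幣)"), "位元"),   # negative lookahead skips 比特幣
]


def find_offenders(text):
    """Return sorted (offset, matched text, suggested fix) tuples."""
    hits = []
    for pattern, suggestion in S_ONLY_PATTERNS:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0), suggestion))
    return sorted(hits)
```

Reporting the offset next to the suggestion mirrors the "file + offset + suggested fix" shape the agent doc asks for, though the exact report format here is a guess.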

.claude/agents/language-vocabulary-check.md table now mirrors the
24 additions so future LLM sessions get the same lexicon when they
read the agent doc instead of the regex set.

510/510 tests pass.
…d patterns

Round 3 of the language-vocabulary-check rollout. The agent doc's
T-vs-S table is now organised by domain (Memory / OS / Programming /
Data / Network / Cloud / ML / UI / Media / Identity / Verbs) so future
maintainers can drop new entries into the right bucket.

Test patterns added this round:

Network:
  網絡, 互聯網, 數據包, 報文, 抓包, 套接字, 交換機

ML / math:
  歸一化, 概率, 方差, 標量

Programming:
  哈希, 遞歸, 死循環, 析構, 常量, 對象導向

Files / DB / config:
  配置文件, 文件名, 擴展名, 字段, 死鎖

Cloud:
  雲計算, 雲存儲, 沙盒

Hardware / system / media:
  寄存器, 主存 (excludes 主存款), 外設, 批處理,
  攝像頭, 攝像 (excludes 拍攝像記 false-positive), 充電寶

UI widgets:
  滑塊, 滾動條, 復選框, 單選框, 下拉框, 標籤頁,
  工具欄, 狀態欄, 任務欄, 通知欄, 彈窗

Verbs:
  搜索, 查找

Caught 2 real offenders in regen_llm_security_batch_zh_tw.py:
  - line 275: 信任網絡  -> 信任網路
  - line 1285: 擴大關鍵字之外的搜索  -> 擴大關鍵字之外的搜尋

The agent doc table is now ~150 entries grouped into 11 categories
covering the practical S-Chinese drift surface for tech writing.

510/510 pytest pass.
… storage / DS / desktop

Continues the language-vocabulary-check rollout. None caught existing
offenders this round — the guard is now ahead of the codebase, which
is the safe state for a regression test.

Test patterns added (mirrored in the agent doc table):

OOP / type system:
  多態 -> 多型           重定義 -> 重新定義 / 覆寫
  解引用 -> 解參考       標識符 -> 識別字
  動態庫 -> 動態函式庫    靜態庫 -> 靜態函式庫
  共享庫 -> 共用函式庫    整型 -> 整數
  素數 -> 質數           均值 -> 平均值

Touch / screen (mobile):
  觸屏 -> 觸控螢幕       觸摸 -> 觸控
  全屏 -> 全螢幕         截屏 -> 螢幕擷取 / 截圖
  顯示屏 -> 螢幕

Audio / video:
  音頻 -> 音訊
  音視頻 -> 影音
  視頻會議 -> 視訊會議

Storage compounds:
  U盤 -> 隨身碟          雲盤 -> 雲端硬碟
  網盤 -> 網路硬碟       系統盤 -> 系統碟
  啟動盤 -> 開機磁碟

Networking:
  組播 -> 多播
  廣域網 -> 廣域網路 (WAN)
  局域網 -> 區域網路 (LAN)

Data structures:
  鏈表 -> 鏈結串列
  二叉樹 -> 二元樹
  散列表 -> 雜湊表

DB / DevOps:
  存儲過程 -> 預存程序
  灰度發布 -> 灰階發布

Desktop OS surfaces:
  進度條 -> 進度列
  任務管理器 -> 工作管理員
  文件管理器 -> 檔案管理員 / 檔案總管
  注冊表 -> 登錄檔

Verbs / interaction:
  激活 -> 啟用           拖拽 -> 拖曳
  單擊 -> 點擊 / 按一下   復選 -> 核取

The agent doc grew matching subsections for each category so future
maintainers see the new entries grouped with their domain peers.

510/510 pytest pass.
…ddresses

The most important catch this round is bare 軟件 (Traditional chars,
Simplified vocabulary). The existing 软件|软体 pattern only caught the
S-character form; a regen script written by hand could easily type 軟件
with T characters and escape the guard — that gap is now closed.

Patterns added:

Punctuation / escapes / numeric formatting:
  溢出 → 溢位        內聯 → 內嵌 / 行內
  轉義 → 跳脫        反斜杠 → 反斜線
  斜杠 → 斜線 (negative-lookbehind to skip 反斜杠 already counted)
  方括號 → 中括號
  數字化 → 數位化     數字簽名 → 數位簽名
  分辨率 → 解析度     矢量 → 向量
  響應 → 回應

Software / documents:
  軟件 → 軟體 (CRITICAL — bare T-char S-vocab missed for rounds)
  文檔 → 文件 / 說明文件
  文本框 → 文字方塊
  源代碼 → 原始碼
  腳注 → 腳註

Image / media:
  縮略圖 → 縮圖
  二維碼 → 二維條碼 / QR code

Network addresses:
  IP\s*地址 → IP 位址
  物理地址 → 實體位址
  MAC\s*地址 → MAC 位址

Alerts / security:
  報警 → 警報
  殺毒 → 防毒

UI shortcuts:
  快捷方式 → 捷徑
  系統托盤 → 系統匣

The agent doc table now has 4 new subsections (Punctuation, Software /
documents, Image / media continued, Addresses) so future maintainers
find these entries near their domain peers.

510/510 pytest pass; ruff clean.
… ownership

The most important addition this round mirrors Round 5's 軟件 → 軟體 fix:
bare 緩存 (T-char + Simplified vocab) was missed by the existing 缓存
(S-char) pattern. Any hand-typed zh-tw regen script could spell 緩存 with
Traditional characters and slip past the guard. That gap closes here.

Patterns added:

Cache / GPU memory / runtime errors:
  緩存 -> 快取  (T-char S-vocab; sibling of 軟件 -> 軟體)
  顯存 -> 顯示記憶體 / VRAM
  段錯誤 -> 區段錯誤 / 分段錯誤

Mobile / social:
  應用商店 -> 應用程式商店
  彩信 -> 多媒體簡訊
  手機卡 -> SIM 卡
  鎖屏 -> 鎖定螢幕
  屏保 -> 螢幕保護
  點贊 -> 按讚

HTTP / connections:
  請求頭 -> 請求標頭        響應頭 -> 回應標頭
  長連接 -> 長連線          短連接 -> 短連線
  連接池 -> 連線池

Statistics / ML:
  步長 -> 步幅
  置信區間 -> 信賴區間      置信度 -> 信賴度
  顯著水平 -> 顯著水準 (`水平` ↔ `水準`)

Security:
  入侵檢測 -> 入侵偵測
  防病毒 -> 防毒
  數字證書 -> 數位憑證

Filesystem ownership (POSIX):
  屬主 -> 擁有者 / 所有者
  屬組 -> 群組 / 所屬群組

Quality / CLI / CI-CD:
  服務質量 -> 服務品質 (QoS)
  命令行 -> 命令列 (CLI)
  流水線 -> 管線 (CI/CD pipeline)

The s_only_patterns list now holds 212 regex entries across 25 sub-
sections in the agent doc. No existing zh-tw content offends the new
rules.

510/510 pytest pass.
The most impactful catch this round: S-Chinese conflates 復 (= "again")
and 複 (= "duplicate"). The compound 復制 (S) is widespread but should
be 複製 in T-Chinese — 復 is the wrong radical for "duplicate". Same
applies to 復用 vs 重用. These two were in the agent doc table but
never made it into the test regex set; that gap closes here.

Patterns added:

`復` / `複` distinction:
  復制 → 複製   (very common — S `復` ≠ T `複`)
  復用 → 重用
  編寫 → 撰寫

Number bases:
  二進制 → 二進位
  八進制 → 八進位
  十進制 → 十進位
  十六進制 → 十六進位
  進制 → 進位  (general; excludes the 4 specific compounds above
                via negative-lookbehind)

Serial / parallel / stack / files:
  串行 → 串列          堆棧 → 堆疊
  二叉堆 → 二元堆積     壓縮文件 → 壓縮檔

Kernel / syscalls / messaging:
  用戶態 → 使用者模式
  系統調用 → 系統呼叫
  調用 → 呼叫  (excludes 失調用 / 強調用 via negative-lookbehind)
  反向工程 → 逆向工程
  私聊 → 私訊

UI controls / parts / identifiers:
  控件 → 控制項
  部件 → 元件
  標識 → 標示  (excludes 標識符 already covered)
  圖元 → 像素

Tried and reverted:
  跨度 → 跨距 / 範圍 — turned out to false-positive on TW-acceptable
  「時間跨度」 in regen_llm_security_batch_zh_tw.py. 跨度 is standard
  TW vocabulary too; removed from the pattern set.

s_only_patterns now holds 233 regex entries across 26 sub-sections in
the agent doc table. No new offenders in zh-tw content.
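
The general `進制` rule and its four excluded compounds can be sketched as follows. Python's `re` requires fixed-width lookbehinds, so this sketch assumes a single-character class is used (六 covers 十六進制). The names and exact regexes are illustrative, not copied from the test file.

```python
import re

# Specific compounds each get their own suggestion.
SPECIFIC = {
    "二進制": "二進位",
    "八進制": "八進位",
    "十進制": "十進位",
    "十六進制": "十六進位",
}
BASE_PATTERNS = [(re.compile(re.escape(s)), t) for s, t in SPECIFIC.items()]

# General rule: skip any 進制 preceded by 二, 八, 十 or 六 so the compounds
# above are not counted twice. Trade-off: a bare 六進制 is skipped as well.
BASE_PATTERNS.append((re.compile(r"(?<![二八十六])進制"), "進位"))


def scan(text):
    """Return sorted (offset, matched text, suggested fix) tuples."""
    hits = []
    for pattern, suggestion in BASE_PATTERNS:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0), suggestion))
    return sorted(hits)
```

With this arrangement each occurrence is reported exactly once, either by its specific compound or by the general rule.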

510/510 pytest pass.
With round 8 the list reaches 244 regex entries across 27+ sub-sections
(233 after round 7, plus the 11 added here). Round 8 cleans up the last
clearly S-only compounds I could spot without crossing into "TW also
uses" grey territory.

Patterns added:

Registration / roles / mining / gateway:
  注冊 → 註冊      (S writing of 註冊 — uses 注 instead of 註)
  程序員 → 程式設計師
  數據挖掘 → 資料探勘
  網關 → 閘道
  負載均衡 → 負載平衡
  測試用例 → 測試案例 (bare 用例 stays untracked — TW also uses it)

GUI compounds (bare 界面 stays untracked — TW physics term):
  圖形界面 → 圖形介面 (GUI)
  用戶界面 → 使用者介面 (UI)

Drivers / middleware / stack:
  驅動程序 → 驅動程式
  中間件 → 中介軟體
  全棧 → 全端

The agent doc now carries a "When to stop adding patterns" note for
future maintainers — diminishing returns + rising false-positive risk
make further proactive expansion the wrong move. The round-7 跨度
mistake (`時間跨度` is fine in TW; the pattern false-positived) is
the cautionary tale called out in that note.

Going forward: reactive only. When a new real zh-tw offender surfaces
in a regen script / README / rst / i18n key, add the specific pattern
then, with the context of a known false-positive surface to design
the negative-lookaround against. Or: when the user supplies a project-
specific term list, take that as the safety net.

510/510 pytest pass.
figure-extraction to mandatory in the runbook

The first ship of these decks left figures=() on every PaperSummary,
so the rendered .pptx had no images at all — half a deliverable.
extract_figures was never invoked; the existing fitz / PyMuPDF
backend was sitting unused.

scripts/_extract_speculative_figures.py
  One-shot helper that runs autopapertoppt.intelligence.pdf_assets
  extract_figures on each of the 4 PDFs (Xia ACL Findings, Spector
  ICML, Xu IEEE TMC, Svirschevski NeurIPS), writes PNGs to
  exports/speculative-decoding-zh-tw/figures/<paper_key>/ in the
  layout the regen script's _fig() helper expects.

scripts/regen_speculative_decoding_zh_tw.py
  Add a _fig(paper_key, filename) helper and figures=(...) tuples
  to every PaperSummary. Hand-curated 2-3 figures per paper:
   - Xia: taxonomy + Spec-Bench speedup chart
   - Spector: roofline + HumanEval speedup distribution + token-
     origin colour chart
   - Xu: workflow + illustrative example + branch verification
   - Svirschevski: algorithm overview + draft-size acceptance
     curve + token-penalty curve

  Each figure ships with caption + 2 description bullets in zh-tw
  pointing the reader at what to look at.
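
A minimal sketch of the `_fig()` path convention described above. Only the directory layout comes from this commit message; the `FigureSpec` class, its field names, and the `.png` filename are hypothetical stand-ins, since the real `PaperSummary.figures` element type is not shown here.

```python
from dataclasses import dataclass
from pathlib import Path

# Root directory as stated in the commit message.
FIGURES_ROOT = Path("exports/speculative-decoding-zh-tw/figures")


def fig(paper_key, filename):
    """Resolve an extracted figure to <root>/<paper_key>/<filename>."""
    return FIGURES_ROOT / paper_key / filename


@dataclass(frozen=True)
class FigureSpec:
    """Hypothetical stand-in for one entry of PaperSummary.figures."""
    path: Path
    caption: str
    bullets: tuple


# One figure with a caption and two description bullets (placeholder text).
figures = (
    FigureSpec(
        path=fig("xia2024unlocking", "p02-01.png"),  # filename is assumed
        caption="caption in zh-tw",
        bullets=("what to look at first", "what to compare it against"),
    ),
)
```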

.claude/agents/paper-summary-author.md
  Promote `figures` from "list when present" to MANDATORY when the
  paper has any figure. Add a "Figure extraction (mandatory before
  authoring `figures=`)" subsection covering: the extract_figures
  API call, the _fig() helper pattern, and curation guidance (2-3
  per paper — system overview, key result chart, optionally one
  ablation/example). Add an explicit anti-pattern:
  "Do NOT omit figures= from a rich PaperSummary."

Re-rendered the 4 decks:
  - xia2024unlocking-zh-tw.pptx     20 slides (was 18)
  - spector2023accelerating-zh-tw   21 slides (was 18)
  - xu2024edgellm-zh-tw             22 slides (was 19)
  - svirschevski2024specexec-zh-tw  21 slides (was 18)
Overflow check: 4/4 PASS. 510/510 tests pass.
…design subagent

The decks rendered fine geometrically but had the AI-generated look —
default Calibri on every run, plain blank backgrounds, no accent
shapes. python-pptx supports both: setting a real font family on
every run, and adding decorative rectangles to every slide. We were
doing neither.

autopapertoppt/exporters/pptx.py
  * `_FONT_FAMILIES` table keyed by language → (latin, east_asian).
    Inter is the Latin face across all 14 supported languages; the
    east-asian slot fills in Microsoft JhengHei UI (zh-tw) / YaHei UI
    (zh-cn) / Yu Gothic UI (ja) / Malgun Gothic (ko) / Nirmala UI (hi).
  * `_apply_typography(prs, language)` post-build pass walks every
    slide -> shape -> paragraph -> run and writes both `<a:latin>`
    and `<a:ea typeface=...>` XML. PowerPoint consults a SEPARATE
    East-Asian font slot for CJK code points; leaving it unset would
    have made CJK characters render in the host's default
    (PMingLiU / SimSun / etc.) which doesn't match Inter.
  * `_decorate_with_accents(prs)` post-build pass:
      - Cover slide: a 0.4" × full-height navy band on the left
        (`accent_left`).
      - Every other slide: a 0.08" × full-width navy bar at y=0
        (`accent_top`).
    Both shapes are sent to the back of z-order via spTree.insert(2)
    so text content sits above them. Both carry semantic shape names
    so pptx_edit.update_slide(...) and pptx_edit.delete_slide(...)
    can target them.
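
The two-slot font write can be sketched on raw DrawingML using only the standard library. The project presumably operates on python-pptx's lxml elements instead, so treat `set_run_fonts` and the trimmed `FONT_FAMILIES` table as illustrative assumptions. The point of the sketch is that `<a:latin>` and `<a:ea>` are separate children of the run properties.

```python
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
ET.register_namespace("a", A_NS)

# Trimmed, illustrative version of the language -> (latin, east_asian) table.
FONT_FAMILIES = {
    "zh-tw": ("Inter", "Microsoft JhengHei UI"),
    "ja": ("Inter", "Yu Gothic UI"),
}


def set_run_fonts(rpr, language):
    """Write both the Latin and the East-Asian typeface onto an a:rPr element."""
    latin, east_asian = FONT_FAMILIES[language]
    for tag, face in (("latin", latin), ("ea", east_asian)):
        qname = f"{{{A_NS}}}{tag}"
        for old in rpr.findall(qname):  # drop any previous value first
            rpr.remove(old)
        # Simplification: appended at the end; a full implementation would
        # respect the schema's child ordering when other children exist.
        ET.SubElement(rpr, qname, {"typeface": face})
    return rpr


rpr = ET.Element(f"{{{A_NS}}}rPr", {"lang": "zh-TW"})
set_run_fonts(rpr, "zh-tw")
```

Calling the function twice leaves exactly one `<a:latin>` and one `<a:ea>` child, which keeps a post-build pass idempotent.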

scripts/_overflow_check.py
  Decorative rectangles have empty text_frames; the existing
  _estimate_wrapped_height_emu inflates an empty frame to ~1
  line-height (~0.2") which false-flagged the 0.08" accent bar.
  Skip when text_frame.text is empty.

.claude/agents/deck-design.md
  New subagent owning visual identity — typography rules per language,
  brand palette discipline, accent geometry expectations, master-slide
  contract, and the anti-patterns that make a deck obviously
  machine-generated (default Calibri, plain blank backgrounds,
  centred-only covers, all-text body slides).

CLAUDE.md + .claude/agents/slide-deck-rules.md
  CLAUDE.md table gains a 10th row pointing at deck-design.
  slide-deck-rules.md adds a "scope split" note clarifying that the
  sibling deck-design subagent owns visual identity while this one
  stays focused on geometry / overflow / content caps.

tests/test_exporters.py + tests/test_pptx_edit.py
  Three tests assumed `slide.shapes[0]` is the title. With accent
  rectangles now sitting at index 0, they crash on empty text_frames.
  Replaced with a `_slide_text(slide, name)` helper that finds the
  shape by its semantic name — which the project already pins as a
  contract elsewhere.
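
A sketch of the name-based lookup, assuming the helper simply scans `slide.shapes` for a matching `name`. The body below and the `"title"` shape name are assumptions. The stand-in objects only mimic the `name`, `has_text_frame`, and `text_frame.text` attributes that python-pptx shapes expose.

```python
from types import SimpleNamespace


def slide_text(slide, name):
    """Return the text of the first text-bearing shape with the given name."""
    for shape in slide.shapes:
        if shape.name == name and getattr(shape, "has_text_frame", False):
            return shape.text_frame.text
    raise LookupError(f"no text shape named {name!r}")


def fake_shape(name, text):
    """Duck-typed stand-in for a python-pptx shape (test scaffolding only)."""
    return SimpleNamespace(
        name=name,
        has_text_frame=True,
        text_frame=SimpleNamespace(text=text),
    )


# The accent bar now sits at index 0, so positional lookup would hit it first.
slide = SimpleNamespace(
    shapes=[fake_shape("accent_top", ""), fake_shape("title", "Speculative Decoding")]
)
```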

Re-rendered the 4 zh-tw speculative-decoding decks:
  xia2024unlocking-zh-tw      20 slides, 156 shapes, overflow PASS
  spector2023accelerating     21 slides, 164 shapes, overflow PASS
  xu2024edgellm               22 slides, 172 shapes, overflow PASS
  svirschevski2024specexec    21 slides, 164 shapes, overflow PASS

Cover title verified to carry latin='Inter' + east-asian
'Microsoft JhengHei UI'. Content slide 3 carries `accent_top` shape.

510/510 pytest pass; ruff clean.
advances the paper's story

The previous figure-extraction guidance capped at 2-3 figures per
paper and warned that "> 4 figures bloats the deck past the typical
25-slide cap". That was the wrong default — figures are part of the
thesis-style deliverable, not optional polish. Capping them produces
text-heavy decks that miss the paper's visual arguments.

scripts/regen_speculative_decoding_zh_tw.py
  Xia survey: 2 -> 4 figures
    + p02-01 Timeline of Speculative Decoding evolution (Fig 2)
    + p16-06 Spec-Bench radar + per-model-size bar charts (Figs 8 & 9;
      more comprehensive than the single p08-04 chart already used)

  Xu EdgeLLM: 3 -> 8 figures (broke the 25-slide cap; 27 slides now)
    + p01-00 Memory wall + emergent abilities chart (Fig 1) — motivation
    + p03-01 Decoder-only LLM architecture (Fig 2) — background
    + p08-06 Fallback threshold ablation (Fig 8) — sensitivity sweep
    + p11-08 Per-token latency vs baselines (Fig 11) — headline result
    + p13-13 Generation speed vs memory budget (Fig 13) — mobile-
      specific claim

  Spector + Svirschevski stay at 3 figures each (all extracted ones
  were already included; no further additions).

  Each new figure ships with caption + 2-3 zh-tw description bullets
  pointing at what to look at.

  ExportOptions(max_slides_per_paper=0) added to disable the 25-cap
  for this regen so the extra figure slides don't get trimmed.

.claude/agents/paper-summary-author.md
  Rewrite the "Curate the output" guidance: drop the 2-3 cap, list 9
  figure roles that meaningfully advance a paper's story (motivation,
  background, system overview, worked example, technique diagram,
  headline result, ablation, per-device chart, taxonomy / timeline),
  and document the max_slides_per_paper=0 override pattern when the
  figure count plus rich-tier body content would exceed the default cap.

Re-rendered the 4 decks:
  xia2024unlocking      22 slides (was 20)
  spector2023accelerating 21 slides (same)
  xu2024edgellm         27 slides (was 22, cap disabled)
  svirschevski2024specexec 21 slides (same)
Overflow check: 4/4 PASS. 510/510 tests pass.
row dividers, middle vertical alignment, bold row labels

PowerPoint's default add_table style draws a heavy black grid on every
cell. Combined with top-aligned text and a small font, the resulting
table reads as "screenshot from Excel" rather than thesis-defence visual.
The decks were running through _add_table without ever overriding that
default, so every results table inherited the look.

autopapertoppt/exporters/pptx.py
  * New helpers _clear_cell_borders(cell) and _set_cell_border(cell, edge,
    width, colour). Both write the XML directly via qn("a:lnX") because
    python-pptx doesn't expose cell-border setters on its high-level API.
  * Refactored _add_table: each cell now passes through _style_table_cell
    which (a) strips all default borders, (b) applies header / row-stripe
    / row-label / divider rules, (c) sets vertical_anchor = MIDDLE.
    Split out from _add_table itself so the cognitive-complexity score
    stays within the project's limit of 10.
  * Two new palette constants — _TABLE_DIVIDER (soft grey-blue) for inter-
    row rules and _TABLE_HEADER_RULE (heavy navy) for the line under the
    header row. The header rule is drawn as the FIRST data row's top
    border so the rule sits flush against the header fill without
    double-line stacking.
  * Bold first column of body rows — most tables in the project use the
    leftmost cell as a row label (RqResult / technique_table / literature_
    positioning_table), so this consistently makes those labels read as
    structural headers.
  * MSO_ANCHOR added to the existing `from pptx.enum.text import ...`.
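
The per-cell rules can be summarised as a pure function that plans a cell's styling before any XML is written. This is a sketch under assumptions: `plan_cell_style`, its dictionary keys, and the choice of which body rows are striped are all hypothetical. Only the rules themselves (borders cleared, header rule on the first data row's top edge, dividers, middle anchor, bold row labels) come from the description above.

```python
def plan_cell_style(row, col):
    """Describe how one table cell should be styled; row 0 is the header."""
    is_header = row == 0
    style = {
        "clear_default_borders": True,  # rule (a): strip the default grid
        "vertical_anchor": "middle",    # rule (c)
        "top_border": None,
        "striped": False,
        "bold": False,
    }
    if row == 1:
        # Header rule drawn as the first data row's TOP border, so it sits
        # flush against the header fill without a doubled line.
        style["top_border"] = "header_rule"
    elif row > 1:
        style["top_border"] = "divider"
    if not is_header:
        style["striped"] = row % 2 == 0  # assumption: every second body row
        style["bold"] = col == 0         # leftmost body cell is the row label
    return style
```

Keeping the decision logic separate from the XML writes is one way to hold each function's complexity down, which matches the stated reason for splitting out `_style_table_cell`.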

.claude/agents/deck-design.md
  Adds a "Table styling" subsection with the full spec table (one row
  per visual element: header fill / header rule / row dividers / row
  stripe / vertical alignment / row-label column / cell padding) plus
  a "Tables — additional anti-patterns" subsection listing the new
  failure modes (default grid left intact / cell vertical anchor at
  top / row stripe too saturated).

Re-rendered the 4 zh-tw decks:
  xia2024unlocking         22 slides — overflow PASS
  spector2023accelerating  21 slides — overflow PASS
  xu2024edgellm            27 slides — overflow PASS
  svirschevski2024specexec 21 slides — overflow PASS

Slide / shape counts unchanged from the previous figure-expansion
commit; only per-cell rendering changed. 510/510 pytest pass.
GUI Deck-tab checkbox

OLED projectors and low-light presentation rooms blow out the bright
white slide background. Added a dark-mode rendering path that's
non-invasive in the build pipeline:

autopapertoppt/core/models.py
  ExportOptions gains `dark_mode: bool = False` (frozen dataclass —
  existing callers stay valid without source changes).

autopapertoppt/exporters/pptx.py
  Two new module-level dicts: `_LIGHT_TO_DARK_TEXT` maps light-palette
  RGB triplets to dark equivalents for font.color.rgb swaps;
  `_LIGHT_TO_DARK_FILL` does the same for shape / cell fills + cell
  borders. `_DARK_SLIDE_BG = #12151B` is the dark slide bg.

  `_apply_dark_mode(prs)` runs after `_apply_typography` and
  `_decorate_with_accents`:
    1. Solid-fills every slide's background with _DARK_SLIDE_BG.
    2. Walks every shape; tables iterate cell-by-cell.
    3. For each shape / cell: swap solid fill RGB if it's in the map.
    4. For each run inside a text frame: swap font.color.rgb if mapped.
    5. For each table cell: also walk `<a:lnX>/<a:solidFill>/<a:srgbClr>`
       border XML so the header rule + row dividers retake the dark
       palette's lighter grey-blue.

  Recoloring is intentionally non-invasive — we don't refactor the
  100+ direct `_BRAND_*` constant references in builders. The post-pass
  finds them by the RGB they wrote and swaps. Adding a new palette
  variant in future is one new mapping dict + one new pass.
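
The "find by RGB and swap" idea can be sketched without python-pptx by modelling runs as plain dicts. The navy to near-white pairing is inferred from colours named elsewhere in this PR, and the function names are illustrative rather than the module's real API.

```python
# Light-palette RGB -> dark-palette RGB. The single pairing shown is inferred
# (navy #1F3A66 -> near-white #E5E7EB); the real dict holds more entries.
LIGHT_TO_DARK_TEXT = {
    (0x1F, 0x3A, 0x66): (0xE5, 0xE7, 0xEB),
}


def swap_rgb(rgb, mapping):
    """Return the dark equivalent of rgb, or rgb unchanged when unmapped."""
    return mapping.get(rgb, rgb)


def apply_dark_text(runs, mapping=LIGHT_TO_DARK_TEXT):
    """Swap each run's 'rgb' in place; return how many runs changed."""
    swapped = 0
    for run in runs:
        new_rgb = swap_rgb(run["rgb"], mapping)
        if new_rgb != run["rgb"]:
            run["rgb"] = new_rgb
            swapped += 1
    return swapped


runs = [
    {"text": "Results", "rgb": (0x1F, 0x3A, 0x66)},  # mapped, will swap
    {"text": "note", "rgb": (0x12, 0x34, 0x56)},     # unmapped, left alone
]
```

Because the pass keys on the colour a builder wrote rather than on which constant it used, a colour missing from the mapping passes through untouched.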

autopapertoppt/cli.py
  `--dark-mode` store_true flag wired into ExportOptions.

autopapertoppt/gui/pages/deck.py + gui/i18n.py
  Deck tab gains a "Dark mode" QCheckBox under the existing
  "Include abstract" toggle. New `deck.dark_mode_label` i18n key in all
  14 supported languages.

scripts/regen_speculative_decoding_zh_tw.py
  Now ships BOTH variants per paper — `<key>-zh-tw.pptx` (light) and
  `<key>-zh-tw-dark.pptx` (dark) — so the user can pick the right one
  for the venue's lighting.

.claude/agents/deck-design.md
  New "Dark-mode palette" subsection with the full mapping table, the
  rationale per swap (#12151B not #000000 so OLED burn-in is gentler;
  warm-red accent unchanged because it's legible on both backgrounds),
  the exposure surfaces (CLI / GUI / programmatic / regen), and a note
  to update `_DARK_SLIDE_BG` + `test_pptx_dark_mode_swaps_palette`
  together when tuning the palette.

tests/test_exporters.py
  New test_pptx_dark_mode_swaps_palette covers: slide bg = #12151B,
  at least one run swapped to #E5E7EB (near-white).

Rendered the 4 speculative-decoding decks in both variants:
  xia2024unlocking-zh-tw / -dark           22 slides, overflow PASS
  spector2023accelerating-zh-tw / -dark    21 slides
  xu2024edgellm-zh-tw / -dark              27 slides, overflow PASS
  svirschevski2024specexec-zh-tw / -dark   21 slides

511/511 pytest pass; ruff clean.
OLED projectors and low-light presentation venues are the common case
in 2026; a bright-white slide background glares under both, so dark mode is
now the project default. Light stays available as a one-flag opt-out
for printed handouts and well-lit conference rooms.

autopapertoppt/core/models.py
  ExportOptions.dark_mode default flipped False → True; the field now
  expresses "is this deck dark?" with the more common answer as the
  default. Callers wanting the classic white deck pass
  dark_mode=False explicitly.

autopapertoppt/cli.py
  --dark-mode flag removed (was the opt-in to the previous non-default).
  --light-mode flag added as the new opt-OUT — store_true that sets
  dark_mode=False on ExportOptions. Old CLI scripts that passed
  --dark-mode will now error on the unknown flag; the migration is
  literally "drop the flag, it's now default".
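
The opt-out wiring can be sketched with `argparse`. The `ExportOptions` class below is a one-field stand-in (the real dataclass has more fields), and `build_options` is a hypothetical helper name.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportOptions:
    """Stand-in holding only the field discussed here."""
    dark_mode: bool = True  # dark is the default


def build_options(argv):
    """Parse CLI args; --light-mode is a store_true flag that opts out."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--light-mode",
        action="store_true",
        help="render the classic white deck instead of the dark default",
    )
    args = parser.parse_args(argv)
    return ExportOptions(dark_mode=not args.light_mode)
```

With no `--dark-mode` argument registered, argparse rejects the old flag as unrecognised and exits, which is the migration behaviour described above.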

autopapertoppt/gui/pages/deck.py + gui/i18n.py
  Deck-tab checkbox renamed from "Dark mode" to "Light mode". Default
  unchecked → dark deck. i18n key deck.dark_mode_label → deck.light_mode_label
  with translations rewritten for the new semantics across 14 langs
  ("Light mode (white background, dark mode is default)" + equivalents).

scripts/regen_speculative_decoding_zh_tw.py
  Variants flipped: the default-named output (`<key>-zh-tw.pptx`) is
  now dark, and the suffixed variant (`<key>-zh-tw-light.pptx`) is the
  opt-out. Existing `-dark` suffixed files on disk are stale; remove
  them before re-rendering.

tests/test_exporters.py
  test_pptx_dark_mode_swaps_palette replaced by two:
    - test_pptx_default_is_dark_mode (no dark_mode arg → confirms dark
      slide bg + near-white text colour swap fired)
    - test_pptx_light_mode_keeps_navy_text (dark_mode=False → confirms
      no dark bg, navy #1F3A66 still present on at least one run)
  _find_run_color helper factored out for reuse.

README.md + docs/cli.md
  CLI flag tables gain a `--light-mode` row + the dark-default
  explanation. Usage signature updated.

.claude/agents/deck-design.md
  Dark-palette subsection retitled to mark dark as default; exposure
  surfaces flipped (--light-mode / GUI "Light mode" / dark_mode=False).
  Test pin line updated to mention both new tests.

Re-rendered all 4 speculative-decoding decks with the new naming:
  *-zh-tw.pptx       (dark, default)
  *-zh-tw-light.pptx (light opt-out)

512/512 pytest pass; ruff clean.
…yers

User reported "有些文字在暗色模式還是黑色的根本看不見" — some text was
rendering invisible against the dark slide background. Audited the
xu2024edgellm-zh-tw deck and found 74 runs with font.color.rgb = None,
all inside `body` shapes built by _add_bullet_box. That helper set
font.size but never font.color, so the runs inherited the slide
master's theme colour (near-black) and the dark-mode post-pass had no
source RGB to look up in the swap map.

Two layers of defence — both committed together so a future builder
that forgets either rule still ships a readable dark deck.

LAYER 1 — every text-adding helper sets explicit colour
autopapertoppt/exporters/pptx.py: _add_bullet_box now assigns
run.font.color.rgb = _BRAND_DARK for every bullet run, mirroring
_add_textbox's pattern. Re-rendered the 4 zh-tw dark decks; audit
script confirms 0 invisible runs (was 74 on Xu alone).

LAYER 2 — dark-mode post-pass safety net
autopapertoppt/exporters/pptx.py: _swap_text_colors promotes any run
whose rgb is None or (0,0,0) to #E5E7EB near-white. Catches any
future builder that forgets Layer 1.

REGRESSION GUARD
tests/test_exporters.py::test_pptx_dark_mode_has_no_invisible_runs
walks every run on every slide of a default-dark-mode deck and fails
if any non-empty run has rgb=None or rgb=(0,0,0). Pins both layers.
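
Layer 2 and the regression check can be sketched together on plain dicts. The function names and the run representation are assumptions. Only the behaviour (None or pure black becomes #E5E7EB, and the test fails on any non-empty run left None or black) is taken from this commit message.

```python
NEAR_WHITE = (0xE5, 0xE7, 0xEB)
BLACK = (0, 0, 0)


def swap_text_color(rgb, mapping):
    """Mapped colours swap; None and pure black fall back to near-white."""
    if rgb is None or rgb == BLACK:
        return NEAR_WHITE
    return mapping.get(rgb, rgb)


def find_invisible_runs(runs):
    """Return non-empty runs whose colour is still None or pure black."""
    return [
        run for run in runs
        if run["text"].strip() and run["rgb"] in (None, BLACK)
    ]


runs = [
    {"text": "bullet text", "rgb": None},  # builder forgot Layer 1
    {"text": "", "rgb": None},             # empty run, ignored by the check
]
before = find_invisible_runs(runs)
for run in runs:
    run["rgb"] = swap_text_color(run["rgb"], {})
after = find_invisible_runs(runs)
```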

AUDIT SCRIPT
scripts/_audit_dark_text.py — manual single-deck inspector that lists
every offending run with file/shape/paragraph/run/text context. Used
during this fix; kept around for future debugging.

DOCS
.claude/agents/deck-design.md gains a "Dark-mode contract (HARD)"
subsection with the three concrete rules:
  1. Always assign run.font.color.rgb after creating a run
  2. Never use RGBColor(0,0,0) — _BRAND_DARK is the safe choice
  3. Never pass colour=None to _add_textbox

CLAUDE.md gains a new top-level "Dark-Mode Contract" section between
the IEEE/WebRunner HARD rule and the subagent table. Short summary +
pointer to the deck-design subagent for the full rule + test name +
audit script.

513/513 pytest pass; ruff clean.
…ayer

User reported "有在白框裡的字是白色的,看不到" — text inside a
near-white-fill callout box was invisible in dark mode. Audited and
found `_RQ_BOX_FILL = #F3F6FA` (the research-question highlight box's
background) was missing from `_LIGHT_TO_DARK_FILL`. The dark-mode
post-pass correctly swapped the text colour inside the box from
`_BRAND_DARK` to `#E5E7EB` near-white, but the box itself stayed
near-white because the fill RGB wasn't in the mapping → near-white
text on near-white box = invisible.

FIX
autopapertoppt/exporters/pptx.py
  Add `(0xF3, 0xF6, 0xFA): (0x1E, 0x26, 0x38)` to _LIGHT_TO_DARK_FILL.
  Dark navy tint contrasts with the existing `#E5E7EB` near-white
  text the post-pass writes.

AUDIT SCRIPT
scripts/_audit_dark_text.py gains failure-mode B detection:
walks every shape, when fill luminance > 0.7 × 255 (≈ 178) checks
every contained run; if a run's text colour is also > 178 luminance,
flag it as LIGHT-ON-LIGHT with file/slide/shape/text context.
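
A sketch of the failure-mode B check. The commit message gives the threshold but not the luminance formula, so the Rec. 709 weights applied directly to 8-bit channels below are an assumption, as are the function names.

```python
THRESHOLD = 0.7 * 255  # 178.5 on a 0-255 scale


def luminance(rgb):
    """Approximate luminance from 8-bit RGB (Rec. 709 weights, no gamma step)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def is_light_on_light(fill_rgb, text_rgb, threshold=THRESHOLD):
    """True when both the fill and the text are brighter than the threshold."""
    return luminance(fill_rgb) > threshold and luminance(text_rgb) > threshold


RQ_BOX_FILL_LIGHT = (0xF3, 0xF6, 0xFA)  # the unmapped fill that caused the bug
RQ_BOX_FILL_DARK = (0x1E, 0x26, 0x38)   # the mapping added by this fix
NEAR_WHITE_TEXT = (0xE5, 0xE7, 0xEB)
```

Under these weights the original near-white box with near-white text is flagged, and the same text on the new dark navy fill is not.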

REGRESSION TEST
tests/test_exporters.py::test_pptx_dark_mode_no_light_text_on_light_fill
walks every shape on a default-dark-mode deck; fails when both fill
and text luminance exceed the 0.7 × 255 threshold. Pins the fix and
catches any FUTURE light-fill-shape addition that lacks a matching
dark mapping. Two helper functions factored out so cognitive
complexity stays ≤ 10.

DOCS — same two-place pattern as the previous dark-mode rule
.claude/agents/deck-design.md gains a "Light-on-light contrast
contract (the OTHER invisibility bug)" subsection under the existing
"Dark-mode contract (HARD)". States three rules for future
light-fill shapes:
  1. Every new light-fill RGB must have a matching entry in
     _LIGHT_TO_DARK_FILL in the same commit.
  2. The regression test catches missing entries.
  3. The audit script reports failure-mode B during manual debugging.

CLAUDE.md gains a "Mirror rule — light-on-light contrast" paragraph
right after the existing dark-mode contract section, with the test
name and the brief rationale.

Re-rendered the 4 zh-tw dark decks; audit confirms 0 light-on-light
runs (was 1+ per deck on RQ-box callouts). 514/514 pytest pass;
ruff clean (cognitive complexity refactored on _check_contrast and
test_pptx_dark_mode_no_light_text_on_light_fill via helper extraction).
"No red text" contract

User asked "不要用紅色的字體顏色" — no red font colour. Found 4 text
call sites using _BRAND_ACCENT (#C0392B warm red) for emphasis:
KPI values, RQ question text, figure caption, figure-unavailable
fallback. All migrated to bold + _BRAND_DARK (navy).

Why ban red TEXT specifically:
* Red text reads as error / warning in slide conventions — wrong
  signal for a KPI we're proud of.
* Pattern-matches strongly to AI-generated deck output ("LOOK AT
  THIS NUMBER!" + red bold + over-emphasis); banning it removes
  another "default LLM-deck tell".
* In dark mode it'd be the only accent colour — visually
  inconsistent with the rest of the palette.

CODE
autopapertoppt/exporters/pptx.py
  - Replace `colour=_BRAND_ACCENT` with `colour=_BRAND_DARK` at 4
    text call sites (lines 891, 934, 990, 1472).
  - _BRAND_ACCENT constant stays in the palette for potential future
    non-text accent shapes (sparkline highlight, status badge). Its
    docstring now states the ban.
  - _LIGHT_TO_DARK_TEXT comment updated: red is intentionally NOT
    mapped so the dark-mode pass can't quietly map a future red
    text leak; the regression test catches it first.

scripts/_audit_dark_text.py
  Drop (0xC0, 0x39, 0x2B) from _ACCEPTED_DARK_RUN_COLORS — the audit
  now flags any red text run as an offender.

REGRESSION TEST
tests/test_exporters.py::test_pptx_no_red_text_runs
  Walks every text run on every slide of a default-rendered deck;
  fails if any non-empty run has font.color.rgb = #C0392B. Pins the
  ban for both light and dark modes.

DOCS — same two-place pattern as the previous dark-mode rules
.claude/agents/deck-design.md gains "No red text contract (HARD)"
under "Dark-mode contract", with the three implementation rules:
  1. Never write colour=_BRAND_ACCENT in any _add_*box helper.
  2. Never assign run.font.color.rgb = _BRAND_ACCENT directly.
  3. Use bold + _BRAND_DARK for emphasis.

CLAUDE.md gains a "No red text" paragraph right after the existing
light-on-light contrast mirror rule, with the test name and short
rationale.

Re-rendered 4 zh-tw decks; no red text remains. 515/515 pytest pass;
ruff clean.
The previous red-text ban collapsed all 4 ex-_BRAND_ACCENT sites onto
_BRAND_DARK navy, which lost the emphasis distinction those surfaces
were trying to carry. Restore variety with two non-red accents:

- _BRAND_HIGHLIGHT (teal-700, #0E7490) — new constant for headline
  emphasis. Used at KPI value + RQ question (the slide's punch line
  and the question being answered).
- _BRAND_GREY (#555555, existing) — for caption / placeholder / chrome.
  Used at paper-table caption + figure-unavailable fallback.

Teal pairs cleanly with navy body text without the warning/error
connotation of red. The dark-mode pass swaps teal-700 -> teal-400
(#2DD4BF) via _LIGHT_TO_DARK_TEXT; the audit script's accepted-colours
set knows about both. The "No red text" regression test still bans
#C0392B; its assertion message + docstring now point to the two
sanctioned replacements.
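
The resulting palette split can be written down as data. The RGB values are the ones quoted in this PR, while the role keys and the `red_text_roles` helper are hypothetical names for illustration.

```python
BRAND_ACCENT = (0xC0, 0x39, 0x2B)     # warm red, banned for text
BRAND_HIGHLIGHT = (0x0E, 0x74, 0x90)  # teal-700, headline emphasis
BRAND_GREY = (0x55, 0x55, 0x55)       # caption / placeholder / chrome
TEAL_400 = (0x2D, 0xD4, 0xBF)         # dark-mode counterpart of the highlight

# Hypothetical role names for the four former red-text call sites.
TEXT_ROLE_COLOURS = {
    "kpi_value": BRAND_HIGHLIGHT,
    "rq_question": BRAND_HIGHLIGHT,
    "paper_table_caption": BRAND_GREY,
    "figure_unavailable": BRAND_GREY,
}

# Only the swap stated in the commit message is listed here.
LIGHT_TO_DARK_TEXT = {BRAND_HIGHLIGHT: TEAL_400}


def red_text_roles(role_colours):
    """Return the roles that still use the banned red, sorted by name."""
    return sorted(role for role, rgb in role_colours.items() if rgb == BRAND_ACCENT)
```

Keeping the role-to-colour assignment in one table makes the variety rule ("teal is for headlines, grey is for context") easy to audit at a glance.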

Updated:
- deck-design.md "No red text" contract — per-call-site palette table
  + variety rule ("teal is for headlines, grey is for context")
- CLAUDE.md mirror rule — references the new palette and dark-mode swap
@JE-Chen JE-Chen merged commit c0383c4 into main May 21, 2026
12 checks passed