feat(markdown)!: strict math delimiters, robust inline code parsing; drop plain parentheses as math #148

neoragex2002 · 2025-11-23T19:43:37Z

- Add MathOptions.strictDelimiters (inline: $...$, \(...\); block: $$...$$, \[...\])
- Strict mode: disable heuristics and mid-state (unclosed) math tokens
- Always accept $...$ as inline math (unless content contains backticks)
- Remove plain parentheses as inline math delimiters to prevent false positives (use $...$ or \(...\) instead)
- Fix inline code parsing:
  - Avoid emitting partial code spans when the closing backtick is missing
  - Merge remaining fragments and re-parse atomically
  - Add raw-based fallback that rebuilds inline_code and strong across backticks, preserving CJK text
- Fix nested list parsing: stop skipping *-bullet nested lists; parse uniformly
- Tune math heuristic (non-strict path): recognise simple chemistry forms (e.g. H_2O, CH_3CH_2OH) and single-letter tokens; $...$ path no longer depends on heuristics
- Add scripts/debug-parse.mjs for token/node inspection in local dev

BREAKING CHANGE: plain ( ... ) is no longer treated as inline math. Use $...$ or \(...\) for inline formulas.
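
To make the intended behaviour concrete, here is a minimal sketch of strict inline delimiter scanning. It is not the implementation in this PR: `MathSpan` and `findStrictInlineMath` are hypothetical names, block math is assumed to be handled elsewhere, and applying the backtick rule to both delimiter styles is an assumption. It only illustrates three points from the list above: plain parentheses are never math, unclosed delimiters produce no token, and inline code is skipped before math is considered.

```typescript
// Hypothetical sketch; not the renderer's actual parser.
interface MathSpan {
  delimiter: "$" | "\\(";
  content: string;
  start: number; // index of the opening delimiter
  end: number; // index just past the closing delimiter
}

// Assumption: math content must be non-empty and contain no backticks.
function acceptable(content: string): boolean {
  return content.length > 0 && !content.includes("`");
}

function findStrictInlineMath(src: string): MathSpan[] {
  const spans: MathSpan[] = [];
  let i = 0;
  while (i < src.length) {
    const ch = src[i];

    // Skip inline code. Simplification: the span closes at the next
    // occurrence of the same backtick run; unclosed runs are literal text.
    if (ch === "`") {
      let run = 1;
      while (src[i + run] === "`") run++;
      const close = src.indexOf("`".repeat(run), i + run);
      i = close === -1 ? i + run : close + run;
      continue;
    }

    // \( ... \) inline math; an unclosed opener emits nothing.
    if (src.startsWith("\\(", i)) {
      const close = src.indexOf("\\)", i + 2);
      const content = close === -1 ? "" : src.slice(i + 2, close);
      if (close !== -1 && acceptable(content)) {
        spans.push({ delimiter: "\\(", content, start: i, end: close + 2 });
        i = close + 2;
      } else {
        i += 2;
      }
      continue;
    }

    // Any other backslash escape (for example \$) is skipped as a pair.
    if (ch === "\\") {
      i += 2;
      continue;
    }

    if (ch === "$") {
      // "$$" belongs to block math, which this sketch does not handle.
      if (src[i + 1] === "$") {
        i += 2;
        continue;
      }
      const close = src.indexOf("$", i + 1);
      const content = close === -1 ? "" : src.slice(i + 1, close);
      if (close !== -1 && acceptable(content)) {
        spans.push({ delimiter: "$", content, start: i, end: close + 1 });
        i = close + 1;
        continue;
      }
    }

    // Plain "(" and ")" fall through here: they are ordinary text.
    i++;
  }
  return spans;
}
```

With this sketch, `findStrictInlineMath("area (x + y) and $a^2$")` yields a single span for `a^2`, while the parenthesised `(x + y)` stays plain text.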

Test cases:

3.  **多维度波动性估计 (`longTermDeviation()`)**
    *   这是计算阈值“宽度”的关键，它结合了三种不同的波动性视角：
        *   **`primaryDeviation.getDeviation()`**: 衡量 RCF 原始分数本身的整体波动（EWMA 标准差）。
        *   **`secondaryDeviation.getDeviation()`**: 衡量 RCF 原始分数的**瞬时变化率**（即 `current_score - last_score`）的波动。这对于捕捉突变非常有效。
        *   **`thresholdDeviation.getDeviation()`**: 专门捕捉当分数低于其均值时，分数与均值之间差值的波动。这进一步强化了对**分数非对称性**的理解，因为它更关注“正常”分数分布的下半部分。
    *   **融合机制**: `longTermDeviation()` 方法会根据 `shingleSize` 和 `transformMethod` 的类型，以及 `scoreDifferencing` 参数（一个 `[0, 1]` 的权重因子），**加权融合**这些不同的波动性估计。例如，`scoreDifferencing` 倾向于分数本身的波动，而 `1 - scoreDifferencing` 倾向于分数变化率的波动。
    *   **设计理念**: 分数的异常行为可能表现为多种形式：整体水平的漂移、突然的尖峰、或者持续的轻微增长。通过结合多种波动性度量，模型能够更全面、更鲁棒地估计“正常”分数的变异性，从而使阈值能够响应不同类型的异常信号。


---
#### **Q2: 模型从开始处理数据到输出可靠结果，需要经历哪些阶段？何时才算“完全正常运行”？**

**A:** 模型达到完全稳定需要经过多个阶段，其稳定时间点由 `shingle_size (S)`, `tree_size (T)`, `time_decay (D)` 和 `calibrator_window_size (W_C)` 共同决定。

1.  **阶段 1: Shingle 缓冲区填充 (`t < S - 1`)**: 模型未运行。
2.  **阶段 2: RRCF 树填充 (`S - 1 <= t < (S - 1) + T`)**: 模型开始学习，但分数不稳定。
3.  **阶段 3: 完全指数衰减运行 (`t >= (S - 1) + D`)**: 模型核心评分功能稳定。
4.  **阶段 4: 完全正常运行 (`t >= (S - 1) + max(T, D, W_C)`)**: 整个系统（模型+校准器）稳定，输出的异常概率可靠。

**结论**: 系统在处理了至少 `(S - 1) + max(T, D, W_C)` 个原始数据点后，才能被认为完全正常运行。在此之前的所有输出都应被视为模型预热和学习过程的一部分。


---
5.  **数值稳定性与溢出问题**
    *   `ExponentialWeighting.weight(t)` 中 `t` 过大（特别是 `time_decay` 设置过大时）是否会导致 `e^(alpha * t)` 溢出？


**我们在 `ED-RRCF` 算法讨论中探讨的问题归纳**

1.  **`pysad_rrcf.py` 与 `rrcf_edr.py` 实现差异与行为一致性问题**
    *   `pysad_rrcf.py` 原始实现中，评分 (`score_partial`) 和更新 (`fit_partial`) 逻辑与 `rrcf_edr.py` 的“更新并评分”模式不符。
    *   `_SingleTree.update` 方法中，`L` 变量的用法与 `rrcf_edr.py` 不一致，且存在冗余。
    *   `wi` (新点权重) 在 `i >= time_decay` 阶段的计算方式与 `rrcf_edr.py` 不一致 (`weight(time_decay)` vs `weight(time_decay - 1)`)。

2.  **`PySAD` `BaseModel` API 兼容性问题**
    *   是否能直接实现 `fit_score_partial` 而不实现 `fit_partial` 和 `score_partial`？（结论：不符合 `BaseModel` 接口规范）

3.  **模型运行阶段与稳定性问题**
    *   模型从接收第一个样本点到完全稳定输出可靠结果，分为哪些时间阶段？
    *   何时才能算作“正常运行”？（涉及到 `shingle_size`、`tree_size`、`time_decay`、`calibrator_window_size` 的综合考量）
    *   如何用 Mermaid 图清晰地表示这些阶段和流程？

4.  **参数的深层语义与优化问题**
    *   **`time_decay` 参数的必要性**：它是否可以被 `e` 和 `alpha` 替代？与“历史窗口”概念的对齐？
    *   **`time_decay` 与 `tree_size` 的关系**：`time_decay` 比 `tree_size` 小是否无妨？对算法行为有何影响？
    *   **`e` 和 `alpha` 与“历史窗口”的对齐**：如何通过这两个参数控制模型的“记忆长度”？
    *   **`P` 值的确定**：在 `alpha = -ln(P) / H` 公式中，`P` (权重衰减比例) 如何选择？

5.  **数值稳定性与溢出问题**
    *   `ExponentialWeighting.weight(t)` 中 `t` 过大（特别是 `time_decay` 设置过大时）是否会导致 `e^(alpha * t)` 溢出？
    *   这种溢出对蓄水池采样行为有何影响？
    *   如何通过截断机制避免溢出？截断机制本身对采样的影响是什么？

6.  **传统 `RRCF` 在流式数据中的局限性**
    *   传统 `RRCF` 的无偏均匀采样机制，在概念漂移普遍存在的流式环境中，为何成为其核心缺陷？
    *   如何更深刻、直观地阐述传统 `RRCF` 理论前提与流式应用需求之间的不一致性？

---
1.  **阶段 1: Shingle 缓冲区填充 (0 <= t < S-1)**
2.  **阶段 2: 树填充 / 初始学习 (S-1 <= t < S-1 + T)**
3.  **阶段 3: 满树操作与衰减预热 (S-1 + T <= t < S-1 + D)**
4.  **阶段 4: 满树操作与完全指数衰减 (t >= S-1 + D)**
5.  **阶段 5: 阈值校准器填充 (t < S-1 + W_C)**
6.  **阶段 6: 完全运行 (t >= S-1 + W_C)**

---
- $H$, $O$, $C$
- $H_2O$, $CO_2$
- $CH_3CH_2OH$, $CH_3COOH$
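
The chemistry cases above are the kind of input the non-strict heuristic is meant to recognise. The PR text does not show the actual rules, so the following is only a rough sketch of what such a check could look like; `looksLikeSimpleFormula` and both regular expressions are hypothetical.

```typescript
// Hypothetical heuristic; the renderer's real rules are not shown in this PR.
// One or more element-like symbols (capital letter, optional lowercase letter),
// each optionally followed by a numeric subscript such as _2 or _{2}.
const ELEMENT_RUN = /^(?:[A-Z][a-z]?(?:_(?:\d+|\{\d+\}))?)+$/;
// A single ASCII letter, for tokens like H or x.
const SINGLE_LETTER = /^[A-Za-z]$/;

function looksLikeSimpleFormula(text: string): boolean {
  const t = text.trim();
  return SINGLE_LETTER.test(t) || ELEMENT_RUN.test(t);
}

// Usage: every formula from the list above passes; ordinary prose does not.
const samples = ["H", "O", "C", "H_2O", "CO_2", "CH_3CH_2OH", "CH_3COOH"];
const allMatch = samples.every(looksLikeSimpleFormula);
```

A pattern this loose also accepts short capitalised words such as `NO` or `He`, which illustrates why a heuristic path can produce false positives and why an explicit-delimiter mode is useful.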


netlify · 2025-11-23T19:43:42Z

✅ Deploy Preview for vue-markdown-renderer canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | `8890aab` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vue-markdown-renderer/deploys/692363ecf3488b0008cba7e6 |

netlify · 2025-11-23T19:43:54Z

✅ Deploy Preview for vue-markdown-renderer-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | `8890aab` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vue-markdown-renderer-docs/deploys/692363ec76017e0008253943 |
| 😎 Deploy Preview | https://deploy-preview-148--vue-markdown-renderer-docs.netlify.app |

Simon-He95 · 2025-11-24T01:59:52Z

LGTM

Simon-He95 · 2025-11-24T02:08:32Z

Sorry, some of this PR's tests failed; I'll revert it now.

neoragex2002 mentioned this pull request Nov 23, 2025

BUG: Mixed parsing of mathematical formulas and Markdown formatting leads to messy rendering #145

Closed

Simon-He95 merged commit 4c883ca into Simon-He95:main Nov 24, 2025
12 checks passed

Simon-He95 mentioned this pull request Nov 24, 2025

Revert "feat(markdown)!: strict math delimiters, robust inline code parsing; drop plain parentheses as math" #149

Merged
