fix(notebooks): expand flattened markdown tables on paste#60143
Conversation
When users paste markdown copied from Slack, ChatGPT, Notion and similar sources, table rows are frequently glued onto a single line — e.g. `| a | b | |---|---| | 1 | 2 |`. `markdown-it` cannot parse this as a table, so the row text ends up stored as a literal paragraph in the notebook and cannot recover on re-render. Pre-process the pasted markdown to re-expand any line that contains 3+ pipe-separated row-like segments AND at least one delimiter row (`|---|---|`). The delimiter guard keeps the transformation safe against incidental `|`-containing prose. Co-authored-by: Cursor <cursoragent@cursor.com>
|
🎭 Playwright didn't run on this PR — your changes touch code that could affect E2E behavior, but Playwright is opt-in via label now to keep CI cost down. Add the Most PRs don't need this. Real regressions still get caught on master and fix-forward. |
Move `expandFlattenedMarkdownTables` from the notebook paste handler to `lib/utils/`, and use it across the three MarkdownManager-based parsers: notebooks paste, Max chat (`markdownToTiptap`), and dashboard text cards (`textCardMarkdown`). `isTextCardMarkdownRoundTripSafe` also uses the expander so the legacy/rich editor decision stays consistent with what `markdownToTextCardDoc` would render. Co-authored-by: Cursor <cursoragent@cursor.com>
|
Size Change: 0 B Total Size: 80.8 MB ℹ️ View Unchanged
|
|
Co-authored-by: Cursor <cursoragent@cursor.com>
Notebooks saved before the paste-time pre-processor existed still have flattened-table paragraphs (`| a | b | |---|---| | 1 | 2 |`) stored as literal text. Add a content migration that finds those paragraphs and rewrites them into real `table` nodes before the editor binds the doc, so existing notebooks render tables correctly without user action. Wired into `migrate()` and `NOTEBOOKS_VERSION` bumped to bust the localContent cache. Also adds a `FlattenedTable` storybook fixture for visual verification. Co-authored-by: Cursor <cursoragent@cursor.com>
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
frontend/src/lib/utils/expandFlattenedMarkdownTables.ts:2-14
**Empty-cell columns split as row boundaries**
The pattern `(?<=\|)\s+(?=\|)` matches any whitespace directly sandwiched between two `|` characters, so it cannot distinguish an empty cell (`| |`) inside a row from the space that separates two glued rows. A flattened table like `| a | | c | |---|---|---| | 1 | | 3 |` would be split into five segments — `| a |`, `| c |`, `|---|---|---|`, `| 1 |`, `| 3 |` — yielding a column-count mismatch that markdown-it will likely reject and fall back to paragraphs. The behaviour is no worse than the pre-fix literal-paragraph, but real-world tables exported from ChatGPT/Notion with empty cells won't be recovered by this path.
Reviews (2): Last reviewed commit: "fix(notebooks): heal already-stored flat..." | Re-trigger Greptile |
|
👋 Visual changes detected for this PR. Review and approve in PostHog Visual Review If these changes are unexpected, they may be caused by a flaky test or a broken snapshot on master. Don't approve — rerun the job or wait for a fix. |
a-lider
left a comment
There was a problem hiding this comment.
Looks good, thanks for fixing this!
split() ignores the global flag; keeping /g on a module-level RegExp would silently break any future .exec()/.test() call via stale lastIndex. Co-authored-by: Cursor <cursoragent@cursor.com>
2 updated Run: bb40d0d8-61cf-4a29-a779-c030337eeaa6
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
|
🎭 Playwright report · View test results →
These issues are not necessarily caused by your changes. |
The GitHub suggestion that relaxed the segment guard to >=2 (header + delimiter rows, valid GFM) left blank lines around the if that oxfmt rejects. Tidy formatting and lock in the header+delimiter case with a test. Co-authored-by: Cursor <cursoragent@cursor.com>
Problem
PostHog notebooks only converted markdown -> rich text at paste time. The detector triggered on block-level signals (headings, lists, fenced code, tables, etc.), then
@tiptap/markdown(markdown-itunder the hood) parsed the text into ProseMirror JSON.That conversion silently degraded when the pasted text had tables whose rows had been collapsed onto a single line — for example:
markdown-itrequires each row on its own line (per GFM), so it fell through and the row text got stored as a literal paragraph in the notebook. Once that happened, re-rendering couldn't recover the table.Slack, ChatGPT/Claude, Notion, and similar sources flatten table rows when copied as plain text, which is what reproduces this for end users.
Changes
Three layered fixes:
1.
lib/utils/expandFlattenedMarkdownTablesShared string -> string pre-processor. A line is treated as a flattened table only when both:
(?<=\|)\s+(?=\|)boundaries, and|---|---|, including alignment markers like| :---: |).The delimiter-row guard is what keeps incidental
|-containing prose untouched.2. Wire it into every
MarkdownManager-based parser (paste-time)frontend/src/scenes/notebooks/Notebook/DropAndPasteHandlerExtension.tsx— notebook paste.frontend/src/scenes/max/utils/markdownToTiptap.ts— Max chat rendering.frontend/src/lib/components/Cards/TextCard/textCardMarkdown.ts— dashboard text cards (bothmarkdownToTextCardDocandisTextCardMarkdownRoundTripSafeso the legacy/rich editor pick stays consistent with what would actually render).3. Load-time content migration for already-stored notebooks
expandFlattenedTablesInContentwalks the notebook doc, finds plain-text paragraphs whose text matches the flattened-table shape, and rewrites them into realtablenodes. Hooked intomigrate()andNOTEBOOKS_VERSIONbumped to'2'so existing browsers re-fetch and re-migrate. Existing notebook content heals on next load — no user action needed.Two follow-ups intentionally not in scope:
-/*/1.markers (separated only by blank lines) — no reliable signal to distinguish from regular paragraphs without false positives.How did you test this code?
expandFlattenedMarkdownTables.test.ts— 9 unit tests on the helper (single table, embedded table, alignment markers, well-formed table unchanged, 5 negative cases).DropAndPasteHandlerExtension.test.ts— end-to-end paste throughparseMarkdownToTipTapproduces a realtablenode.textCardMarkdown.test.tsandmarkdownToTiptap.test.ts— existing suites for the other two paste-time consumers still pass with the pre-processor wired in.expandFlattenedTablesInContent.test.ts— 7 cases covering the doc-level migration, including non-paragraph nodes, marked text, and mixed children left alone.migrate.test.ts— full pipeline throughmigrate()rewrites the broken paragraph into the expectedtablenode.Notebook.stories.tsx— newFlattenedTablestorybook fixture serves a notebook with the broken paragraph; the load-time migration repairs it for visual verification. Manually opened and confirmed in storybook on this branch.Publish to changelog?
no
🤖 Agent context
Agent: Claude Opus 4.7 in Cursor. The session started as a diagnostic question about a specific notebook where markdown wasn't rendering. The two distinct defects were:
-/*/1.markers — genuinely ambiguous, not safe to auto-detect, skipped.Considered alternatives:
MARKDOWN_BLOCK_PATTERNSonly — would change detection but not parsing, so it would still produce a literal paragraph.markdown-it'score.rulerdirectly — same effect, but couples to internals that@tiptap/markdowndoesn't re-export. Loose coupling won.markdown-it-flattened-tablesnpm package — viable but pure overhead for a single consumer. Reconsider if we end up needing this in non-PostHog projects.markdown-it-multimd-table(the plugin we already do not depend on) — solves the opposite problem (multi-line cells), not flattened rows.The split regex uses a lookbehind/lookahead pair so inner cell separators (
| cell |) are never matched — only whitespace directly between two|characters (the boundary between glued rows).